Corona SEIR Workbench

24 Apr 2022 · CPOL
Pandemic SEIR and SEIRV modelling software and infrastructure for the Corona SARS-COV-2 COVID-19 disease with data from Johns-Hopkins-University CSSE, Robert Koch-Institute and vaccination data from Our World In Data.
The SARS-COV-2 pandemic has been affecting our lives for months. The effectiveness of measures against the pandemic can be tested and predicted by using epidemiological models. The Corona SEIR Workbench uses a SEIR model and combines a graphical output of the results with a simple parameter input for the model. Modelled data can be compared country by country with the SARS-COV-2 infection data of the Johns Hopkins University. Additionally, the R₀ values of the Robert Koch Institute can be displayed for Germany. Vaccination data is used from Our World In Data.

Introduction

Since December 2019, the coronavirus SARS-COV-2 and the disease it causes, COVID-19, have been keeping the world in suspense with a pandemic. However, the occurrence of such a pandemic was only a matter of time: it is estimated that similar pandemics occur every 20-30 years[1].

As long as no vaccine against SARS-COV-2 is widely available, many different non-medical measures are being used to contain the pandemic. These range from everyday masks, physical distancing and the closure of public institutions, sports facilities, bars and restaurants, up to contact restrictions, all with the aim of reducing the spread of virus particles. Testing strategies and contact tracing are meant to detect infected persons as early as possible so that they can be isolated in quarantine. All these measures are designed to control the spread of infection while imposing as few restrictions on the population and as little economic damage as possible.

A prerequisite for controlling the pandemic is the best possible data basis. In Germany, this data is collected by the Robert Koch-Institute via the local health authorities. Internationally, the Center for Systems Science and Engineering (CSSE) at Johns-Hopkins-University collects and publishes SARS-COV-2 infection data.

The central goal of pandemic control is to avoid overburdening the health care system so that all those suffering from COVID-19 receive the best possible medical care. Moreover, the lowest possible infection rate is desirable to minimize the health consequences for the population.

On the basis of the data obtained, epidemic models can make predictions about the infection process in the future. The number of infections grows exponentially. As a result, the rise in the number of patients can get out of hand quickly and overburden the healthcare system. Because of this, early intervention and a review of the measures taken are essential to control the pandemic.

Background

The SEIR model belongs to the class of compartment models that sort people into different compartments. The SEIR model divides the population into four compartments: “Susceptible” (people who are susceptible to an infection), “Exposed” (infected people during the incubation period), “Infectious” (infectious people) and “Recovered” (people who recovered from the disease or died):

Compartments of the SEIR model

During modeling, people move from one compartment to the next. The calculation takes place per time unit. Usually, a single day is used as time unit.

At the beginning of the epidemic, almost all persons are healthy and in the “Susceptible” compartment. All persons in this compartment can be infected with the corona virus. After an infection, a person moves to the “Exposed” compartment and remains there for the mean incubation period of the disease. During this time, the person is not yet contagious to other people. A study[2] from January 2020 reported a mean incubation time of 5.2 days. A detailed discussion can be found in Byrne et al.[3].

As soon as the ill person sheds virus particles and can infect other people, he or she moves from the “Exposed” compartment to the “Infectious” compartment. This “bucket” contains the people who are driving the pandemic. The infectious person remains in this compartment for the average infection time of 2.9 days[4].

The average infection time should not be confused with the maximum infection time, which plays a role in quarantining a sick person and can be up to 14 days. The mean duration of infection is also influenced by diagnosis, as a symptomatic person is often tested at the onset of symptoms and then isolated.

After a person has survived the disease and is no longer infectious, the person moves from the “Infectious” to the “Recovered” compartment. From then on, the person is assumed to be immune to re-infection with SARS-COV-2. Alternatively, the person may have died from COVID-19. The SEIR model makes no distinction here and puts the person into the “Recovered” compartment in both cases.

Since December 2020, more and more vaccination programs against SARS-COV-2 have been starting. In Europe and the USA, the mRNA vaccines from Biontech/Pfizer and Moderna are currently being used, with a high efficacy of 94% to 95%.

There are different approaches to include vaccinated persons in a SEIR model. One approach adds an additional compartment called “Vaccinated” to the SEIR model.[5] After a vaccination, individuals are moved from the “Susceptible” into the “Vaccinated” compartment in which they can no longer be infected.

This approach is simplified, since for an individual two successive vaccinations take place at intervals of 3 or 4 weeks and the vaccination protection builds up gradually over several weeks. Protection against a Corona infection exists 12 days[6] after the first vaccination at the earliest. Therefore, vaccinated individuals cannot be removed from the compartment of susceptible persons completely.

To account for this, the “Vaccinated” compartment can be considered as a subset of the “Susceptible” compartment:

Vaccinated compartment

With this approach, vaccinated individuals can still be infected, at a rate that depends on the effectiveness of the vaccine. The extended SEIRV model contains the following compartments:

Compartments of the SEIRV model

Maths of the SEIR Model

In the SEIR model, changes of all compartments are calculated per time unit. Additionally, in the SEIRV model, the change between sub-compartment V and compartment S is determined before calculating the change in the SEIR compartments.

In this first step, all vaccinated persons within compartment S are moved to sub-compartment V. A constant number of vaccinated individuals per day is assumed. For this reason, the population number N is used as a reference value and multiplied by a constant factor:

ΔVaccinated

Once the ΔSv vaccinated individuals have been determined, they can be taken into account in the next step when calculating the ΔSi susceptible individuals. In this step, persons in compartment S are infected by infectious persons in compartment I. Important factors are the number of infectious persons and the basic reproduction number R0:

ΔSusceptible

For SARS-COV-2, an R0 value of 3.4[7] was determined for an unrestricted virus spreading. This means that on average, one infected person infects 3.4 other susceptible persons. R0 values greater than 1 result in an exponential growth of new infections, while an R0 value less than 1 leads to a decreasing number of newly infected persons.

The vaccinated persons, weighted by a factor for vaccination effectiveness, are subtracted from the total number of susceptible persons, because effectively vaccinated persons cannot be infected. In contrast, in a SEIR model without vaccinations, all persons from compartment S would be used here.
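The effect of this subtraction can be illustrated with a small Python sketch (the Workbench itself is written in C#). It assumes, as a simplification that the article does not spell out, that the effective reproduction number scales with the unprotected share of the population. The function name and the example figures are hypothetical.

```python
def effective_reproduction_number(r0, susceptible, vaccinated, population, efficacy=0.95):
    """Scale R0 by the share of the population that can still be infected.

    Assumption: the vaccinated are a subset of the susceptible compartment,
    and a fraction `efficacy` of them is fully protected.
    """
    unprotected = max(susceptible - efficacy * vaccinated, 0.0)
    return r0 * unprotected / population


# Hypothetical figures: 83 million inhabitants, 2 million no longer susceptible.
population = 83_000_000
susceptible = 81_000_000
for vaccinated in (0, 5_000_000, 10_000_000):
    r_eff = effective_reproduction_number(1.1, susceptible, vaccinated, population)
    print(f"{vaccinated:>10,} vaccinated -> R_eff = {r_eff:.3f}")
```

Under these assumptions, the effective value drops as the vaccinated sub-compartment grows, even though R0 itself stays constant.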

The change of exposed persons ΔE depends on the incubation period:

ΔExposed

Therefore, the number of infectious persons increases with a certain time delay after they got infected.

After the disease has been overcome or the infectious person has been isolated, he or she moves to compartment R. The number ΔI depends on the average duration of infectiousness:

ΔInfectious

The symbols used in the equations represent the following values:

SEIRV Legend

The new numbers of persons S', E', I', R' and V' in the compartments S, E, I, R and V after a time unit can be calculated with the following formulas:

Image 9
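The update step described in this section can be sketched in LaTeX as follows. This is a reconstruction from the prose, not the article's original formulas, and the symbols are assumptions: N is the population, v the constant daily vaccination fraction, ε the vaccination effectiveness, T_inc the mean incubation period and T_inf the mean duration of infectiousness. The original legend may use different names.

```latex
\begin{aligned}
\Delta S_v &= v \cdot N \\
\Delta S_i &= \frac{R_0}{T_{inf}} \cdot I \cdot \frac{S - \varepsilon V}{N} \\
\Delta E   &= \frac{E}{T_{inc}} \\
\Delta I   &= \frac{I}{T_{inf}} \\[4pt]
V' &= V + \Delta S_v \\
S' &= S - \Delta S_i \\
E' &= E + \Delta S_i - \Delta E \\
I' &= I + \Delta E - \Delta I \\
R' &= R + \Delta I
\end{aligned}
```

In this sketch, V is tracked as a subset of S, so vaccinating does not reduce S itself, and the sum S + E + I + R stays constant from one time unit to the next.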

The Workbench

The goal of the Corona Workbench was to create an infrastructure for modelling an epidemic. In addition to the modelling, this includes the graphical output of results per time unit and an easy modification of modelling parameters. To compare the modelling with historical data, it should be possible to display population sizes, infection data and R0 values in the same chart. In addition, the software should be easily extendable so that other models can be implemented without much effort.

Architecture of the Workbench

We used a class-based approach with C# for the Corona SEIR Workbench, because individual components can be isolated easily and maintained much better this way. Furthermore, .NET offers very good performance, and C# has all the features of a high-level language with object-oriented and functional language elements. With LINQ, operations on sets of data can be programmed very easily.

For a graphical data output, we use the Microsoft Windows Forms Charting[8] Library, which comes with Visual Studio. The user interface was programmed in XAML with the Windows Presentation Foundation. Both were combined in a .NET Core 3.1 project.

In order to present data from as many countries as possible, we use the infection data from the CSSE of the Johns-Hopkins-University. This data is aggregated by Datopian on a daily basis by country. For historical R0 values of Germany, we use the daily nowcasting values published by the Robert Koch-Institute. We determine population data from the web API of the World Bank. Daily vaccination data is retrieved from Our World in Data (OWID) of the Oxford Martin School and University of Oxford. Data is downloaded from the web only once a day and then temporarily stored for further processing.

A Python-based approach to SEIR modelling for SARS-COV-2 has been published by Pina Merkert[9] in Heise c't magazine. The Python code can be found on GitHub.

Using the Workbench

The user interface is designed to be as simple as possible and can be easily extended by C# programmers using Visual Studio. The main window is divided into three blocks:

Corona SEIR Workbench UI

In the upper area, a diagram is shown. Data points are displayed per date on the horizontal axis. The left axis refers to case numbers while the right axis shows R0 values. Data points for daily incidence are drawn as bars in the graph and boxes in the legend. Additionally, for modelled values of total cases, daily cases and mean 7-day case numbers, a rhombus is displayed when the values have doubled compared to the value of the current date. In the tooltip, the data value and date are shown for the data point under the mouse pointer.
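How such a doubling marker could be located is sketched below in Python. This is not the Workbench's C# code; the function name and the example series are hypothetical.

```python
from datetime import date, timedelta


def first_doubling(series, current_date):
    """Return the first (date, value) after current_date whose value is at
    least twice the value at current_date, or None if it never doubles."""
    reference = series[current_date]
    if reference <= 0:
        return None
    for day in sorted(d for d in series if d > current_date):
        if series[day] >= 2 * reference:
            return day, series[day]
    return None


# Hypothetical modelled series growing by 3 % per day from 1,000 cases.
start = date(2020, 11, 13)
modelled = {start + timedelta(days=i): 1000 * 1.03 ** i for i in range(60)}
print(first_doubling(modelled, start))
```

The returned date is where a chart would place the rhombus for that series.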

Below the diagram, individual curves can be hidden or shown in the “Series” area. The scaling of the axes is automatic, so that it usually makes sense to hide the SEIR curve of susceptible persons (“Susceptible”) because of its large values.

The next parameters determine the country and the modelling period. Moreover, SEIR parameters can be set for the average incubation period, the mean infection period and the base reproduction number R0. Parameter changes are applied immediately: the model is recalculated and the results are displayed in the diagram.

Besides a constant R0 value, the base reproduction number can be calculated from historical data. The calculation of the R0 value is done by a “solver” component, which solves the SEIR model with a Levenberg-Marquardt[10] approach. The first “solver” calculates the R0 value for each day considering a window of n days into the future. The larger this window, the smoother the R0 curve. In the second method, the R0 value with the smallest deviation in the case numbers is calculated for a given interval of days. If several R0 values yield the same smallest error, both “solver” components use the largest of them.

Finally, from the last 5 days with historical data, the average of the R0 value is calculated and used for future calculations in the SEIR model. You can adjust this mean R0 value manually to allow for infection control measures in the model.

Below the settings for the basic reproduction number, settings for vaccinated persons can be adjusted. From the OWID data on vaccinations, the start of the vaccination campaign and the daily average number of persons vaccinated are determined. For the future, the average value of vaccinated persons of the last 7 days is calculated and then added to the number of vaccinated individuals each day. The average number of vaccinated persons per day can be changed for the future with a slider. If vaccinations have not been carried out in the country or if data about vaccination does not exist, default values are used.

Additionally, the vaccination effectiveness can be changed. The vaccination effectiveness represents the proportion of vaccinated persons who can no longer be infected. Based on the approval data of Biontech and Moderna, an effectiveness of 95% is used by default. This value can also be changed with a slider. Finally, the start of the vaccination campaign can be entered or changed manually.

Besides the start of the vaccinations, you can set the number of days after which the protection of a vaccination becomes effective. Although this is still a gross simplification, it gives a better estimate of when vaccinations will have an influence on the R0 value. By default, the period is set to 14 days, which is the lower limit for a vaccine protection period. The value can be manually adjusted to show the impact on the development of the epidemic.
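One simple way to model such a delay is to count a vaccination as protective only after the configured number of days has passed. The Python sketch below illustrates this idea; it is an assumption about how the setting could work, not the Workbench's actual C# implementation, and all names are hypothetical.

```python
def protected_vaccinated(cumulative_vaccinated, day, delay_days=14, efficacy=0.95):
    """Number of vaccinated persons counted as protected on `day`.

    cumulative_vaccinated: list with the cumulative number of vaccinated
    persons per day (index 0 = first day of the campaign).
    A vaccination only counts once `delay_days` have passed.
    """
    effective_day = day - delay_days
    if effective_day < 0:
        return 0.0
    return efficacy * cumulative_vaccinated[effective_day]


# Hypothetical campaign: a constant 52,000 vaccinations per day.
cumulative = [52_000 * (i + 1) for i in range(60)]
print(protected_vaccinated(cumulative, day=10))  # still inside the delay window
print(protected_vaccinated(cumulative, day=30))
```

Increasing `delay_days` shifts the point at which vaccinations start to reduce the number of persons who can be infected.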

If the start of vaccinations differs from the OWID data, the historical OWID data is no longer applied and the manually specified average number of vaccinated persons is used instead. With this setting, scenarios can be simulated with more or fewer vaccinations.

In the lower part of the main window, functions can be executed via buttons. “Show Data” displays a table with all data points from the chart. You can transfer them from the table to other applications like Microsoft Excel via the clipboard. With “Export Chart”, you can export the chart in various formats or as a CSV file.

Your chart settings and SEIR model parameters can be saved with the “Save Settings” button. Saved settings are applied automatically during application start. The button “Reset Settings” resets all settings to their default values. Like the “Reset Settings”, the “Clear Settings” button resets all settings, but hides all curves in the chart additionally. This function is convenient for creating your own charts because you don’t have to hide series first.

Results of the Workbench

The simplest SEIR model with an unrestricted virus spread uses a constant base reproduction number of 3.4 for Germany.

SEIR model for Germany

In this case, almost the entire population would have been infected in April and May 2020. However, the actual case numbers of the Johns-Hopkins-University for Germany show a different course:

Covid-19 cases of the Johns-Hopkins-University from 13th November 2020

The different trend can be explained by the R0 values of the Robert Koch-Institute, which decreased from 3.4 to below 1.0 in April:

Covid-19 cases of the Johns-Hopkins-University from 13th November 2020 with R0 values of the Robert Koch-Institute

The basic reproduction number was significantly reduced by the nation-wide lockdown and contact bans in Germany. The peak in mid-June was probably caused by the high number of SARS-COV-2 infections in the Gütersloh district linked to meat-processing companies. This was a local outbreak and was contained quite quickly.

If R0 values are calculated from historical data using the Levenberg-Marquardt[10] method, the SEIR model shows the same trend. The case numbers (yellow line) correspond to the historical infection series (red line) up to 13th November 2020, which is the current date. If the current R0 value of 1.2 is assumed for the future, however, a further sharp increase is likely and a doubling of case numbers will occur in less than a month (yellow rhombus).

SEIR prediction with calculated R0 base reproduction number

The calculated R0 value in March is significantly higher than the value determined by the RKI. This is due to the low number of cases in February and early March. It is a fundamental weakness of the R0 that the value shows high fluctuations when case numbers are very low. Thus, the average 7-day incidence number per 100,000 people was used as an assessment criterion during the summer.

SEIR 7-day incidence prediction from calculated R0 base reproduction number

Again, the series of the Johns-Hopkins data corresponds to the SEIR model. However, in this simulation, an R0 value of 0.9 was used for future values. This R0 value is close to the 7-day R0 value of 8th November 2020 from the RKI nowcasting table. In this situation, the 7-day incidence per 100,000 individuals would fall below 100 cases at the beginning of December.

The different simulations show how critical the basic reproduction number is for the further development of the pandemic. An R0 value above 1 is the driver of the epidemic.

Since 27th December 2020, vaccinations in Germany have been carried out with the Biontech vaccine. The Workbench uses a simplified model based on the average number of vaccinations per day:

SEIRV with vaccinations since 27th December 2020 and an assumed R0 of 1.1

As of 17th January 2021, the average number of vaccinations in Germany is about 52,000 individuals per day. With an assumed R0 value of 1.1, the vaccination program would contribute to falling case numbers from mid-April onwards and reduce the number of newly infected persons to approx. 12,000 per day by the end of June. This corresponds to an incidence of about 100 cases per 100,000 persons in 7 days. Interestingly, a positive effect of the vaccinations occurs long before the herd immunity threshold of 60%-70% of the population that has been reported in the press.[10] Thus, the vaccination program already compensates for an R0 value of 1.1 once 4-5 million people are vaccinated.
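The conversion from daily cases to a 7-day incidence can be checked with a few lines of Python. The population of roughly 83 million for Germany is an assumption made for this example, and the function name is hypothetical.

```python
def seven_day_incidence(daily_cases, population, per=100_000):
    """7-day incidence: cases of the last seven days per `per` inhabitants."""
    if len(daily_cases) < 7:
        raise ValueError("need at least seven daily values")
    return sum(daily_cases[-7:]) * per / population


# Assumption: roughly 83 million inhabitants in Germany.
incidence = seven_day_incidence([12_000] * 7, population=83_000_000)
print(round(incidence, 1))
```

With 12,000 new cases on each of seven days, the result is just above 100 cases per 100,000 persons, which matches the figure quoted above.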

The key factor of the vaccination program is the number of people vaccinated daily. The low average of 52,000 vaccinations per day is certainly due to the low availability of the vaccine and the complex management of vaccinations in retirement and nursing homes, whose residents are vaccinated first in Germany due to their vulnerability. Nevertheless, the vaccination numbers are increasing every day and reached a maximum of about 98,000 on 14th January 2021. Therefore, a much higher vaccination rate can be expected in the coming weeks. For example, if the average vaccination rate is raised to 400,000 vaccinations per day, the effect on the pandemic is much stronger:

SEIRV with 400,000 vaccinations per day and an assumed R0 of 1.3

In this case, the level of 12,000 newly infected persons per day is already reached by mid-March, despite the higher R0 value of 1.3. By the end of April, the 7-day incidence per 100,000 individuals is below 10 cases.

In summary, vaccination programs are an essential factor in controlling the Corona pandemic. It is crucial to vaccinate as many people as possible in the shortest possible time. A vaccination program adds to the effect of other, non-medical measures that reduce the R0 value. The effect is already visible with a vaccination coverage of 5% of the population. However, with the current average of approx. 52,000 vaccinations per day, this will not be achieved until mid-March in Germany.

However, the modelled infection numbers in the SEIRV model are still very simplified and vaccinations could be taken into account more precisely. In any case, the offset of 12 days for a beginning immunization and 28 days for full immunization should be considered. Furthermore, the efficacies of different vaccine classes may differ. The high effectiveness of mRNA vaccines is probably not achieved by vector-based vaccines.

Class Model of the Workbench

The Corona Workbench implements an interface-based class model for the SEIR and SEIRV models. Each SEIR model has an ISEIR interface which provides SEIR parameters and the number of people for the S, E, I and R compartments. The interface defines a Calc method which calculates model values for a certain number of days. The calculation of the compartments S', E', I' and R' is done by static functions and can easily be changed by replacing these functions. The setup of the ISEIR model takes place in the constructor.
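To make the idea of a per-day calculation concrete, here is a minimal Python sketch of a SEIRV step function and a `calc` loop. It is not a port of the Workbench's C# classes: the names `SeirvState`, `step` and `calc`, the explicit one-day Euler step and the treatment of the vaccinated as a capped subset of S are all assumptions made for illustration. The default periods of 5.2 and 2.9 days are the values quoted in the Background section.

```python
from dataclasses import dataclass


@dataclass
class SeirvState:
    s: float  # susceptible (includes the vaccinated sub-compartment)
    e: float  # exposed
    i: float  # infectious
    r: float  # recovered or deceased
    v: float  # vaccinated, a subset of s


def step(state, population, r0, incubation_days=5.2, infectious_days=2.9,
         vaccinations_per_day=0.0, efficacy=0.95):
    """Advance the model by one day (explicit Euler step)."""
    # Vaccinate first; the sub-compartment cannot exceed the susceptible pool.
    v = min(state.v + vaccinations_per_day, state.s)
    # Only the unprotected part of S can be infected.
    unprotected = max(state.s - efficacy * v, 0.0)
    new_exposed = min(r0 / infectious_days * state.i * unprotected / population,
                      unprotected)
    new_infectious = state.e / incubation_days
    new_removed = state.i / infectious_days
    return SeirvState(
        s=state.s - new_exposed,
        e=state.e + new_exposed - new_infectious,
        i=state.i + new_infectious - new_removed,
        r=state.r + new_removed,
        v=v,
    )


def calc(initial, population, r0, days, **params):
    """Return the list of states for day 0 to day `days`."""
    states = [initial]
    for _ in range(days):
        states.append(step(states[-1], population, r0, **params))
    return states


# Hypothetical scenario: 83 million inhabitants, 100 infectious persons.
population = 83_000_000
start = SeirvState(s=population - 100, e=0, i=100, r=0, v=0)
states = calc(start, population, r0=3.4, days=120)
peak = max(states, key=lambda st: st.i)
print(f"peak infectious: {peak.i:,.0f}")
```

Keeping the step as a plain function that maps one state to the next mirrors the idea of exchangeable static functions: a different model variant only needs a different `step`.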

SEIR and SEIRV class model

A view takes care of the display of ISEIR data. The view can either implement an ISeriesView interface for discrete time intervals or an IDateSeriesView interface for a time range with start and end date. The CalcAsync method of the view calculates the model and stores individual data points in a chart data series. The SEIRR0DateSeriesView view allows the SEIR calculation with varying R0 values for different dates.

Historical R0 values are calculated by solver objects with an IR0Solver interface. The interface defines a Solve method, which returns a sequence of R0 values. The IR0Solver interface is implemented by a SEIRR0Solver class, which compares for each day the calculated case numbers of the SEIR model with the actual case numbers and returns the optimal R0 value for the day by using the Levenberg-Marquardt[10] method. As a simple approach, all R0 values between 0 and 10 are calculated in steps of 0.1 and the largest R0 value with the smallest deviation from the actual case numbers is returned.
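The simple scan over R0 values can be sketched in Python as follows. This is an illustration of the grid search described above, not the SEIRR0Solver class itself: the plain SEIR simulation, the use of cumulative new infections as the quantity to match, and all function names are assumptions.

```python
def simulate_new_cases(r0, days, population, exposed, infectious,
                       incubation_days=5.2, infectious_days=2.9):
    """Cumulative new infections over `days` for a constant R0 (plain SEIR)."""
    s = population - exposed - infectious
    e, i, total = exposed, infectious, 0.0
    for _ in range(days):
        new_exposed = r0 / infectious_days * i * s / population
        new_infectious = e / incubation_days
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - i / infectious_days
        total += new_exposed
    return total


def solve_r0(observed_cases, days, population, exposed, infectious,
             step=0.1, r0_max=10.0):
    """Grid search: return the largest R0 with the smallest absolute deviation."""
    best_r0, best_error = 0.0, float("inf")
    for k in range(int(round(r0_max / step)) + 1):
        r0 = round(k * step, 10)
        simulated = simulate_new_cases(r0, days, population, exposed, infectious)
        error = abs(simulated - observed_cases)
        if error <= best_error:  # "<=" prefers the larger R0 on ties
            best_r0, best_error = r0, error
    return best_r0


# Hypothetical check: recover the R0 that generated a synthetic target.
target = simulate_new_cases(1.2, 7, 83_000_000, exposed=20_000, infectious=10_000)
print(solve_r0(target, 7, 83_000_000, exposed=20_000, infectious=10_000))
```

The `<=` comparison is what implements the tie-breaking rule: when two grid values give the same error, the later and therefore larger R0 wins.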

Class model of the R0 solver

Historical infection data from the Johns-Hopkins Center for Systems Science and Engineering is retrieved by a JHU object and returned as data series via the JHUDateSeiriesView view. Since loading the Johns-Hopkins-University data is a one-off algorithm, no interface was defined for the JHU class:

Classes for the loading of Johns-Hopkins CSSE data

The JHU object caches the retrieved data in a temporary CSV file and parses it as fast as possible.
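A once-a-day cache of this kind can be sketched in Python as shown below. The function name, the file naming scheme and the injected `fetch` callable are assumptions for illustration; the Workbench's C# classes may organize this differently.

```python
from datetime import date
from pathlib import Path
import tempfile


def load_cached(name, fetch, cache_dir=None, today=None):
    """Return the text for `name`, calling `fetch()` at most once per day.

    `fetch` is any zero-argument callable that returns the file content as a
    string, for example a function that performs the HTTP download.
    """
    cache_dir = Path(cache_dir or tempfile.gettempdir())
    today = today or date.today()
    # One cache file per data source and calendar day.
    cache_file = cache_dir / f"{name}-{today.isoformat()}.csv"
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    content = fetch()
    cache_file.write_text(content, encoding="utf-8")
    return content
```

A call such as `load_cached("jhu", download_jhu_csv)`, where `download_jhu_csv` is a hypothetical download function, would hit the network on the first call of the day and read the temporary file afterwards.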

The same approach is used by the RKINowcasting class, which downloads the Nowcasting CSV file of the Robert Koch-Institute once a day and returns the columns “Point estimator of the reproduction number R” and “Point estimator of the 7-day R value” for the requested dates.

Classes for the loading of Robert Koch-Institute Nowcasting data

Our World in Data of the Oxford Martin School and University of Oxford collects data of vaccinated persons per country and makes it available on GitHub.

Classes for getting vaccination data from OWID

The current population of a country is determined with the WPPopulation class via the World-Bank API.

Discussion

The Corona SEIR Workbench provides a simple infrastructure for modelling epidemics. It uses standard components such as Windows Forms Charting for displaying charts and the Windows Presentation Foundation for the user interface. Current epidemiological data for COVID-19 can be downloaded from the web and compared with calculated model values.

This allows the application to be extended with little effort. For example, regional infection data from the RKI would provide a higher spatial resolution for Germany. Furthermore, future R0 values could be specified as a function or as multiple values in a table.

The calculation of R0 values depends on the Corona test strategy. For example, a high number of unreported cases leads to wrong R0 values for historical infection data. Since young infected persons in particular can be asymptomatic, the age distribution of infected persons is important for the estimation of unreported cases. Moreover, there are feedback loops with epidemic measures, such as the quarantine of travel returnees from risk areas. A number of these people will never seek tests despite symptoms because they want to avoid the quarantine period.

With the launch of vaccination programs, the SEIR model was extended to include a compartment for vaccinated persons. Since vaccinations have a significant impact on further pandemic development, the expanded SEIRV model provides more accurate predictions. The use of a separate compartment for vaccinated individuals allows a clearer separation between the effect of vaccinations and non-medical measures against the Corona pandemic.

However, for the SEIRV model, it must be taken into account that future R0 values of the RKI for Germany will be lower than those of the Corona SEIR Workbench, since the RKI R0 values include vaccination effects. In contrast, the SEIRV model adjusts the vaccination effect before solving historical R0 values which leads to slightly higher R0 values.

A further development of the SEIR model would be the calculation of superspreading events and percolation resulting from overdispersion of corona infections. A cluster diary could be used to track superspreading events.

In order to calculate the effort of medical care, it is necessary to divide the infected persons into mild cases and severe cases. However, for this calculation, data on the age structure of the infected persons are necessary.

Much more advanced modelling would be, for example, Prof. Priesemann's approaches to the infection fatality rate (IFR)[12] or the effects of a testing, tracking and isolation strategy (TTI, Test-Trace-Isolates)[13].

In the end, the Corona SEIR Workbench is just an infrastructure, waiting for its next adaptation. We are curious to see what applications you will find.

History

  • 27th November 2020, Initial version
  • 18th December 2020, Version 1.0.1, typos & bug fixes
  • 24th January 2021, Version 2.0.0 with vaccination
  • 4th April 2021, Version 2.1.0 improvements in modelling vaccinations
  • 13th June 2021, Version 2.2.0 with bugfixes for vaccination data
  • 9th July 2021, Version 2.3.0 with a new data source for German RKI data
  • 25th July 2021, Version 2.3.1 with changes for the new RKI data source format
  • 24th April 2022, Version 2.4.0 with a new data source for JHU data

References


  • [9] Pina Merkert, ODE an Corona – Covid-19-Vorhersagen mit dem SEIR Modell, c’t 2020, issue 11, pp. 124-127

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)