Data on COVID-19 were obtained from the data-sharing portal covid19india.org. Daily confirmed cases and daily testing numbers were collected at the state level from 14th March to 17th May 2020. To calculate tests per million population, the state-level testing numbers from the same portal were used. Population figures for the selected states as on 1st March 2020 were taken from the report “Population Projection for India and States (2011–2036)” provided by the National Commission on Population (NCP). The state-level HDI for 2018 is taken from the Global Data Lab, which provides subnational HDI values for all countries from 1990 to 2018. Per Capita Health Expenditure (PCHE) data from the National Health Profile (NHP 2019) have also been used. All calculations for estimating Rt were done in a Jupyter notebook with Python 3.
Effective Reproduction Number (Rt)
The effective reproduction number (Rt) is the mean number of infections generated by a single infected person during the infectious period at time t. It may vary across locations because contact rates differ with population density, cultural practices, level of immunity and restrictions imposed on the movement of people. When Rt>1, the pandemic will spread through a large part of the population. If Rt<1, the pandemic will slow quickly before it has a chance to infect many people. The lower the value of Rt, the more controllable the situation. In general, bringing Rt below 1 is the main goal of policy planners working in the field of epidemiology. Epidemiologists argue that tracking Rt is the only way to manage the transmission of a communicable disease. More importantly, understanding Rt at the sub-national level helps to manage transmission effectively.
Bayesian estimation of Rt with quantified uncertainty
Parameter estimation with quantified uncertainty can be achieved using the Bayesian approach in the context of probabilistic epidemiological models. Bayes’ theorem expresses the full probability distribution for model parameters, such as the effective reproduction number Rt, in terms of the probabilistic epidemiological model, given the time series of new cases (6).
Bettencourt & Ribeiro’s approach has been used to calculate Rt (7), as also described in (8). Daily counts of new cases show how many people have newly contracted COVID-19, and this count carries information about the current value of Rt. Further, today’s value Rt is related to yesterday’s value Rt-1 and to every previous value Rt-m. Bayes’ rule updates the belief about the true value of Rt based on how many new cases have been reported each day.
Bayes' theorem gives

$$P(R_t \mid k) = \frac{P(k \mid R_t)\, P(R_t)}{P(k)}$$

where
P(Rt | k): the posterior belief about the value of Rt after observing k new cases
P(k | Rt): the likelihood of observing k new cases given Rt
P(Rt): the prior belief about the value of Rt at the beginning of the study period
P(k): the probability of observing k new cases on a given day
Choosing a Likelihood Function P(kt | Rt)
Given an average arrival rate of λ new cases per day, the probability of observing k new cases follows the Poisson distribution:

$$P(k \mid \lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$

Rt and λ are related by

$$\lambda = k_{t-1}\, e^{\gamma (R_t - 1)}$$

where γ is the reciprocal of the serial interval, and the serial interval is taken to be four days based on the most reliable findings (9). Because the new case counts are known, the likelihood can be reformulated as a Poisson distribution in which k is fixed and Rt varies (7).
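A minimal sketch of this likelihood, evaluated over a grid of candidate Rt values, is given below. It is an illustration under stated assumptions rather than the study's notebook: the function names, the grid range of 0 to 6 and the case counts are hypothetical. Log-probabilities are used to avoid overflow in the factorial.

```python
import math

SERIAL_INTERVAL_DAYS = 4.0            # value adopted in the text
GAMMA = 1.0 / SERIAL_INTERVAL_DAYS    # reciprocal of the serial interval


def poisson_log_pmf(k: int, lam: float) -> float:
    """Log of the Poisson probability of k events for a rate lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def rt_log_likelihood(k_today: int, k_yesterday: int, rt_values):
    """Log-likelihood of each candidate Rt, with k fixed and Rt varying.

    Assumes k_yesterday > 0, otherwise the rate lam would be zero.
    """
    result = []
    for rt in rt_values:
        lam = k_yesterday * math.exp(GAMMA * (rt - 1.0))
        result.append(poisson_log_pmf(k_today, lam))
    return result


# Hypothetical grid of candidate Rt values: 0.00, 0.01, ..., 6.00.
rt_grid = [i / 100 for i in range(0, 601)]

# Hypothetical counts: 100 new cases yesterday, 120 today.
log_lik = rt_log_likelihood(k_today=120, k_yesterday=100, rt_values=rt_grid)
rt_most_likely = rt_grid[log_lik.index(max(log_lik))]
print(rt_most_likely)
```

For these illustrative counts the likelihood peaks where λ equals the observed count, that is near Rt = 1 + 4 ln(1.2), or about 1.73.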
The input variables required for the calculation are:
- Daily number of confirmed cases at the state and national level, taken from http://api.covid19india.org/states_daily_csv/confirmed.csv
- Serial interval for COVID-19
- Basic reproduction number (R0) at the initial time (14th March)
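The three inputs listed above can be combined in a day-by-day Bayesian update, sketched below. This is a simplified illustration, not the authors' implementation: each day's posterior is reused as the next day's prior, the bell-shaped initial prior centred on R0 is an assumption (the text does not state its form), and the case counts, grid and function names are hypothetical.

```python
import math

GAMMA = 1.0 / 4.0    # reciprocal of the four-day serial interval
R0_INITIAL = 3.0     # starting reproduction number used in the text
RT_GRID = [i / 100 for i in range(0, 801)]   # candidate Rt: 0.00 .. 8.00


def normalise(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def initial_prior(center=R0_INITIAL, spread=1.0):
    """Assumed bell-shaped prior over RT_GRID centred on R0."""
    return normalise(
        [math.exp(-0.5 * ((r - center) / spread) ** 2) for r in RT_GRID]
    )


def update(prior, k_today, k_yesterday):
    """One application of Bayes' rule; requires k_yesterday > 0."""
    log_post = []
    for r, p in zip(RT_GRID, prior):
        lam = k_yesterday * math.exp(GAMMA * (r - 1.0))
        log_lik = k_today * math.log(lam) - lam - math.lgamma(k_today + 1)
        log_post.append(log_lik + math.log(p) if p > 0 else -math.inf)
    peak = max(log_post)   # subtract the peak before exponentiating
    return normalise([math.exp(v - peak) for v in log_post])


def summarise(posterior, mass=0.95):
    """Return (most likely Rt, lower, upper) of a central credible interval."""
    most_likely = RT_GRID[posterior.index(max(posterior))]
    tail = (1.0 - mass) / 2.0
    cumulative, lower, upper = 0.0, None, None
    for r, p in zip(RT_GRID, posterior):
        cumulative += p
        if lower is None and cumulative >= tail:
            lower = r
        if upper is None and cumulative >= 1.0 - tail:
            upper = r
    return most_likely, lower, upper


# Hypothetical daily confirmed cases for one state (not real data).
daily_cases = [100, 110, 121, 133, 146, 161]

posterior = initial_prior()
for k_yesterday, k_today in zip(daily_cases, daily_cases[1:]):
    posterior = update(posterior, k_today, k_yesterday)

rt_mode, rt_low, rt_high = summarise(posterior)
print(rt_mode, rt_low, rt_high)
```

The interval returned by `summarise` is one way to express the quantified uncertainty around Rt. Days with zero reported cases would need separate handling, which this sketch omits.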
Serial Interval and Incubation Period for COVID-19
Literature suggests the mean serial interval for COVID-19 ranges from 4 to 8 days (9-12). A recent analysis (9) used a much larger sample of up to 468 pairs, which makes its estimate of between 4 and 5 days more statistically reliable. The estimated mean serial interval is shorter than the preliminary estimates of the mean incubation period (approximately 5 days) (11,12). When the serial interval of an infectious disease is shorter than its incubation period, pre-symptomatic transmission is likely to have taken place and may occur even more frequently than symptomatic transmission (13). The Indian Council of Medical Research (ICMR) also stated that as much as 80% of all cases could be asymptomatic; among COVID-19 tests that returned positive results in India, 69% of cases were asymptomatic and 31% symptomatic, a ratio of roughly 2:1 (14). In the present study, the mean serial interval is taken as four days.
Basic Reproduction Number (R0)
The reproduction number for COVID-19 at the initial stage is estimated to be between 2 and 3 (15). Using the raw CDC data, the estimated value of the basic reproduction number is between 2.2 and 2.3. Another study (16) estimated that the median daily reproduction number (Rt) in Wuhan declined from 2.35 (95% CI 1.15–4.77) one week before travel restrictions were introduced on 23rd January 2020 to 1.05 (95% CI 0.41–2.39) one week after. A basic reproduction number (R0) of 3 is therefore adopted as the starting value at the initial stage of infection (14th March in our case). However, the estimate of the effective reproduction number under the Bayesian approach adopted here is independent of the initially assumed basic reproduction number.
Multiple Linear Regression
Multiple linear regression analysis was carried out to quantify the contribution of state-level factors to the decline in Rt. The dependent variable is the change in each state's Rt during the lockdown phase (between 2nd April and 9th May). The independent variables are tests conducted per million (between 2nd April and 9th May), HDI, PCHE and the Good Governance Index. All independent variables were normalized to the range 0 to 1 to bring them to a common scale (15).
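The normalization and regression steps can be sketched as follows. This is an illustration only: the eight rows of state-level values are hypothetical, the response is built from known coefficients so that the fit can be checked, and ordinary least squares via NumPy is an assumed estimation method that the text does not specify.

```python
import numpy as np


def min_max_scale(column):
    """Rescale a 1-D sequence linearly so its values lie in [0, 1]."""
    column = np.asarray(column, dtype=float)
    span = column.max() - column.min()
    if span == 0:
        return np.zeros_like(column)   # constant column carries no signal
    return (column - column.min()) / span


# Hypothetical values for eight states (not the study's data).
tests_per_million = [900, 1500, 2200, 3100, 4000, 5200, 2800, 1200]
hdi = [0.58, 0.66, 0.61, 0.70, 0.64, 0.78, 0.72, 0.60]
pche = [800, 1900, 1200, 1500, 2600, 3000, 1100, 2100]
governance_index = [4.1, 4.6, 5.3, 4.4, 5.0, 5.6, 4.9, 4.3]

# Normalize every independent variable to [0, 1], then add an intercept.
features = np.column_stack(
    [min_max_scale(c) for c in (tests_per_million, hdi, pche, governance_index)]
)
design = np.column_stack([np.ones(len(features)), features])

# Synthetic response: change in Rt built from known coefficients
# (intercept, tests, HDI, PCHE, governance) purely to verify the fit.
true_coef = np.array([0.2, 0.6, 0.3, 0.1, 0.2])
rt_change = design @ true_coef

# Ordinary least squares fit of the multiple linear regression.
coef, _, rank, _ = np.linalg.lstsq(design, rt_change, rcond=None)
print(coef)
```

Because the predictors share a common 0 to 1 scale, the fitted coefficients can be compared directly as the change in the response across each variable's observed range.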