Data
Since January 20, the NHC and the Hubei Provincial Health Commission (HHC) have issued the numbers of new laboratory-confirmed patients nationwide and in Hubei, respectively, on a daily basis [10,11]. Following the Diagnosis and Treatment Plan for Novel Coronavirus Pneumonia (5th Edition) [12], the HHC stopped issuing the number of laboratory-confirmed patients in Hubei separately. Although the 6th edition of the plan, issued on February 18, requires laboratory-confirmed patients to be released separately [13], this number is unavailable for Hubei from February 16 to February 18. We therefore used the data before February 15 as the training data for Hubei and other provinces. Basic parameters were obtained from the literature published by the CDC [1], as shown in Table 1.
Model establishment
1. Assumptions for model establishment
We set four preconditions for the model. First, patients are infectious only after disease onset, and asymptomatic infectors are not considered infection sources. This follows the definition of infection source in the 6th edition of the diagnosis and treatment plan [13] and the infectivity characteristics of patients with SARS, which is also caused by a coronavirus [14]. Moreover, only patients who seek medical attention can be diagnosed, so asymptomatic infectors who do not seek medical attention are excluded from the laboratory-confirmed patients in the model. Second, since novel coronavirus pneumonia is a new infectious disease to which people have no immunity, all close contacts are considered susceptible. Third, the number of susceptible persons infected by one infector follows a Poisson distribution with R0 as the mean. Fourth, Hubei and other provinces have similar R0(t), because the national prevention and control measures have been implemented under unified leadership and the intervention measures in all regions have been similar and synchronous [7].
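The third assumption can be illustrated with a minimal sketch. It is not the program used in this study: the function names `sample_poisson` and `secondary_cases`, the random seed, and the value R0 = 2.5 are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's multiplication method).

    Suitable for small means such as R0; not intended for very large means.
    """
    threshold = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def secondary_cases(r0, n_infectors, rng):
    """Number of susceptible contacts infected by each of n_infectors."""
    return [sample_poisson(r0, rng) for _ in range(n_infectors)]


rng = random.Random(0)                    # fixed seed, for reproducibility only
draws = secondary_cases(2.5, 20000, rng)  # R0 = 2.5 is an illustrative value
mean_draw = sum(draws) / len(draws)       # sample mean, close to R0 for many draws
```

Individual infectors may infect zero, one, or several contacts, while the average over many infectors approaches R0.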
2. Establishment of the R0(t)
Based on the temporal distribution of new laboratory-confirmed patients, we expected that before the intervention measures were initiated, the virus would spread continuously through daily contact among people, and the R0 would remain high during this period. After the intervention measures were initiated, the effective contact frequency among people would be significantly reduced and the infection period of patients would be significantly shortened by active screening; therefore, the R0 in this stage would show a downward trend. Under these assumptions, we first listed various candidate functions for the R0(t). We then substituted each candidate into a computer program to fit the training data and, after several tests, identified the function that best fit the training data.
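One candidate shape consistent with this description is a constant value followed by a decline. The sketch below is only an assumption made for illustration: the exponential form, the function name `r0_piecewise`, the parameter name `t_c`, and all numerical values are hypothetical, and the function actually selected may differ.

```python
import math


def r0_piecewise(day, a, b, t_c):
    """Hypothetical R0(t): constant at a until day t_c, then exponential decline at rate b."""
    if day <= t_c:
        return a
    return a * math.exp(-b * (day - t_c))


# Illustrative values only; these are not fitted parameters.
curve = [r0_piecewise(day, a=3.0, b=0.2, t_c=25) for day in range(50)]
```

Other candidates (for example, a linear or logistic decline) could be written in the same way and compared by their fit to the training data.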
3. Establishment of the model according to different developmental stages of the epidemic
In the process of model establishment, we divided the epidemic into three stages according to its occurrence, development, and control processes and designed the computer program according to the characteristics of different stages:
The first stage is the emergence period of the epidemic, from early December 2019 to January 1, 2020, when the Huanan Seafood Market was closed. The main epidemic features at this stage are as follows: First, animal infection sources in the market continued to spread the virus to humans, leading to the successive appearance of patients with pneumonia [15]; second, these patients were also new infection sources, spreading the virus to their close contacts. In the model, the human infection sources at the early stage of the epidemic were 50 patients with an exposure history to the market and 27 patients with unknown causes before the closure of this market, based on the findings of the CDC's investigation [1]. The time of infection, time of seeking medical attention, time of transmission to other susceptible persons, R0, and other information for each patient were calculated and stored in a matrix.
The second stage is the development period of the epidemic, from January 1, 2020 to January 25, 2020, when the Chinese government created a leading group to respond to the epidemic and coordinate national epidemic prevention and control. The first characteristic of this stage is that people did not adopt effective protection, so the virus was transmitted among people and the epidemic began to spread. The second characteristic is that this stage coincided with the Spring Festival travel rush in China, and some infectors left Hubei and traveled to all regions of the country and even abroad. Therefore, we randomly selected some of the infectors and patients as the infection sources who arrived in other provinces before Hubei was locked down entirely on January 24. From then on, all new infectors throughout the country, except those in Hubei, were infected by these infection sources.
The third stage is the control period of the epidemic, starting from January 25, 2020. The government has strictly implemented a series of powerful measures that have gradually curbed the spread of the epidemic [5]. Across the different developmental stages of the epidemic, we assigned R0 values to patients according to their time of disease onset; hence, the developmental trend of the epidemic changes with the R0(t).
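The idea of assigning R0 by onset time can be sketched as a simple branching process. This is a simplified illustration, not the program used in this study: it assumes a fixed interval `generation_days` between a case's onset and the onset of the cases it causes, counts days from a hypothetical day 0 of December 1, 2019 (so that January 25, 2020 is day 55), and uses made-up R0 values and seed cases. All function and variable names are hypothetical.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    threshold = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def simulate_daily_onsets(seed_onset_days, r0_of_day, generation_days, horizon, rng):
    """Daily onset counts when each case infects Poisson(R0(onset day)) contacts.

    Secondary cases are assumed to have onset exactly generation_days later.
    """
    daily = [0] * horizon
    for day in seed_onset_days:
        if 0 <= day < horizon:
            daily[day] += 1
    for day in range(horizon):
        child_day = day + generation_days
        if child_day >= horizon:
            continue
        # daily[day] is final here because children always fall on later days.
        for _ in range(daily[day]):
            daily[child_day] += sample_poisson(r0_of_day(day), rng)
    return daily


def staged_r0(day):
    """Hypothetical R0(t): high before the control stage (day 55), low afterwards."""
    return 2.5 if day < 55 else 0.5


rng = random.Random(42)  # fixed seed, for reproducibility only
daily = simulate_daily_onsets([0] * 10, staged_r0, generation_days=7, horizon=80, rng=rng)
```

In this sketch the daily counts grow while the assigned R0 is above 1 and shrink once cases with onset in the control stage are assigned an R0 below 1.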
4. Coordinate descent algorithm
We used the coordinate descent algorithm, an efficient optimization method for finding extrema in machine learning [16], to obtain the parameters. The objective function was the sum of squared differences between the daily new laboratory-confirmed patients estimated by the model and the corresponding data issued by the government. The four parameters to be estimated were a, b, and t of the R0 and m, the number of patients and infectors who left Hubei before it was locked down (Table 1). We then performed a numerical calculation using the coordinate descent algorithm to obtain the parameter values at which the objective function reaches its minimum.
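The procedure can be sketched on a toy problem. The sketch below is an illustration under stated assumptions and not the implementation used in this study: the toy model `toy_daily_cases`, its two parameters `scale` and `decay`, the synthetic "observed" data, the bounds, and the golden-section search along each coordinate are all hypothetical choices.

```python
import math


def golden_section(f, lo, hi, tol=1e-7):
    """Minimize a one-dimensional function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


def coordinate_descent(objective, start, bounds, sweeps=100, tol=1e-10):
    """Minimize objective by optimizing one parameter at a time, others held fixed."""
    params = list(start)
    best = objective(params)
    for _ in range(sweeps):
        previous = best
        for i, (lo, hi) in enumerate(bounds):
            def along_axis(value, i=i):
                return objective(params[:i] + [value] + params[i + 1:])
            candidate = golden_section(along_axis, lo, hi)
            value = along_axis(candidate)
            if value < best:  # accept a move only if it lowers the objective
                params[i], best = candidate, value
        if previous - best < tol:
            break
    return params, best


def toy_daily_cases(params, n_days=15):
    """Hypothetical stand-in for the epidemic model: scale * exp(-decay * day)."""
    scale, decay = params
    return [scale * math.exp(-decay * day) for day in range(n_days)]


observed = toy_daily_cases([100.0, 0.3])  # synthetic data, not reported case counts


def sum_of_squares(params):
    """Sum of squared differences between modelled and observed daily values."""
    return sum((p - o) ** 2 for p, o in zip(toy_daily_cases(params), observed))


start = [50.0, 0.1]
initial_sse = sum_of_squares(start)
best_params, final_sse = coordinate_descent(
    sum_of_squares, start, bounds=[(1.0, 500.0), (0.01, 1.0)]
)
```

For the model in this study, `sum_of_squares` would instead compare the simulated and reported daily new laboratory-confirmed patients, and the coordinates would be a, b, t, and m.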
Sensitivity analysis
The partial rank correlation coefficient (PRCC), combined with Latin hypercube sampling, was used for the sensitivity analysis to evaluate the influence of the three parameters of R0 infection period on the model output (the cumulative number of laboratory-confirmed patients nationwide until March 10, 2020). A standard correlation coefficient, ρ, between each parameter and the model output was calculated [17,18]. Details of the coordinate descent algorithm and sensitivity analysis are shown in the supplementary material.
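A generic PRCC calculation with Latin hypercube sampling can be sketched as follows. This is not the procedure documented in the supplementary material: the toy output function `toy_output`, its three parameter names, the parameter ranges, and the sample size are hypothetical, and the partial correlations are obtained here with the standard recursive formula applied to rank-transformed data.

```python
import math
import random


def latin_hypercube(n_samples, bounds, rng):
    """One draw from each of n_samples equal-width strata per parameter."""
    columns = []
    for lo, hi in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns.append([lo + (hi - lo) * (s + rng.random()) / n_samples for s in strata])
    return [list(row) for row in zip(*columns)]


def ranks(values):
    """Rank of each value (0 = smallest); ties are not expected for continuous draws."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def partial_corr(corr, i, j, controls):
    """Correlation of variables i and j after removing the linear effect of controls."""
    if not controls:
        return corr[i][j]
    k, rest = controls[0], controls[1:]
    r_ij = partial_corr(corr, i, j, rest)
    r_ik = partial_corr(corr, i, k, rest)
    r_jk = partial_corr(corr, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def prcc(samples, outputs):
    """PRCC of each parameter with the output, controlling for the other parameters."""
    n_params = len(samples[0])
    ranked = [ranks([row[k] for row in samples]) for k in range(n_params)]
    ranked.append(ranks(outputs))
    size = n_params + 1
    corr = [[pearson(ranked[i], ranked[j]) for j in range(size)] for i in range(size)]
    return [
        partial_corr(corr, k, n_params, [c for c in range(n_params) if c != k])
        for k in range(n_params)
    ]


def toy_output(r0_peak, decline_rate, infectious_days):
    """Hypothetical model output: rises with r0_peak and infectious_days, falls with decline_rate."""
    return r0_peak * infectious_days / decline_rate


rng = random.Random(7)  # fixed seed, for reproducibility only
samples = latin_hypercube(200, [(2.0, 4.0), (0.1, 0.3), (3.0, 9.0)], rng)
outputs = [toy_output(*row) for row in samples]
coefficients = prcc(samples, outputs)
```

A coefficient near +1 or -1 indicates that the output is strongly and monotonically related to that parameter once the others are accounted for; the sign gives the direction of the effect.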