Demographic Characteristics
Data were collected from 45,450 patients. The study population had a mean (standard deviation [SD]) age of 53·44 (16·38) years, and 21,689 (47·7%) patients were men. The mean (SD) illness duration was 10·40 (7·90) days (see Figure E1 in the supplementary online data for the distribution). Of all patients, 7,798 (17·2%) were considered to have severe disease and 37,652 (82·8%) to have nonsevere disease. Overall, 37,654 (82·9%) patients were quarantined at home.
Figure 1 shows the distribution of disease severity by age and gender. Both age (r > 0·91, p < 0·0001) and illness duration (r > 0·69, p < 0·0001) correlated positively with disease severity (Fig. 2), and this relationship was unaffected by gender: men and women showed similar trends with age and illness duration (Figure E2). Patients with severe disease had a mean (SD) age of 60·85 (15·28) years and an illness duration of 12·55 (7·93) days after symptom onset, whereas patients with nonsevere disease had a mean (SD) age of 51·90 (16·17) years and an illness duration of 9·95 (7·82) days (t = 44·85, p < 0·0001 for age; t = 26·62, p < 0·0001 for illness duration). The quarantine rate did not differ significantly by illness severity (χ2 = 0·17, p = 0·682) (Table 1).
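As a rough consistency check, the reported t statistics can be approximated from the group means, SDs, and sample sizes alone. The sketch below assumes a pooled-variance (Student's) two-sample t test; the text does not state which variant was used, and the function name `pooled_t` is illustrative only.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic (equal-variance form) from summary statistics."""
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se

# Summary statistics as reported for severe vs nonsevere patients.
t_age = pooled_t(60.85, 15.28, 7798, 51.90, 16.17, 37652)
t_duration = pooled_t(12.55, 7.93, 7798, 9.95, 7.82, 37652)
print(f"age: t = {t_age:.2f}; illness duration: t = {t_duration:.2f}")
```

Because the inputs are rounded to two decimals, the results should land close to, but not exactly on, the reported 44·85 and 26·62.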
Table 1
Demographic Characteristics of the Sample
Characteristic | Severe, mean (sd) / n (%) (n = 7798) | Nonsevere, mean (sd) / n (%) (n = 37652) | t / χ2 | P value
Age (years) | 60·85 (15·28) | 51·90 (16·17) | t = 44·85 | < 0·0001**
Gender (male) | 3908 (50·1%) | 17781 (47·2%) | χ2 = 21·64 | < 0·0001**
Illness Duration (days) | 12·55 (7·93) | 9·95 (7·82) | t = 26·62 | < 0·0001**
Quarantine (yes) | 6448 (82·7%) | 31206 (82·9%) | χ2 = 0·17 | 0·682
Abbreviation: sd, standard deviation
**: p < 0·0001.
Clinical Symptoms and Comorbidities
Clinical manifestations were recorded for 4,984 patients and yielded 20 clinical symptoms (Table 2); among these, the incidence of dyspnea (χ2 = 24·56, p < 0·0001) and shortness of breath (defined as clinical evidence of altered breathing) (χ2 = 62·67, p < 0·0001) differed significantly between severe and nonsevere patients. Among 1,326 patients with severe disease, 225 (17·0%) had dyspnea and 296 (22·3%) had shortness of breath; conversely, among the 3,658 patients with nonsevere disease, 425 (11·6%) had dyspnea and 480 (13·1%) had shortness of breath.
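The χ2 values for individual symptoms can be reproduced from the reported counts. The sketch below assumes a Pearson χ2 test on a 2×2 table without continuity correction, which is the form that appears to match the reported figures; the helper names are illustrative.

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def symptom_chi2(severe_yes, severe_n, nonsevere_yes, nonsevere_n):
    """Build the 2x2 table (symptom present/absent by severity) from counts."""
    return chi2_2x2(
        severe_yes, severe_n - severe_yes,
        nonsevere_yes, nonsevere_n - nonsevere_yes,
    )

# Counts as reported: 1,326 severe and 3,658 nonsevere patients.
chi2_dyspnea = symptom_chi2(225, 1326, 425, 3658)
chi2_sob = symptom_chi2(296, 1326, 480, 3658)
print(f"dyspnea: {chi2_dyspnea:.2f}; shortness of breath: {chi2_sob:.2f}")
```

Both values come out within rounding of the reported 24·56 and 62·67.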
Comorbid conditions were recorded for 5,062 patients and were found in a higher proportion of patients with severe versus nonsevere disease (Table 3). Additionally, the number of comorbidities was significantly associated with the severity of COVID-19 (t = 7·96, p < 0·0001). Patients with hypertension, pulmonary disease, diabetes mellitus, and cardio/cerebrovascular disease were more likely to develop severe disease (χ2 ≥ 15·14, p < 0·0001). Notably, there was no significant difference in the prevalence of chronic liver disease (χ2 = 0·38, p = 0·538) or chronic kidney disease (χ2 = 2·00, p = 0·157) between patients with severe and nonsevere disease.
Table 2
Reported Symptoms of the Participants
Symptoms | Severe, n (%) (n = 1326) | Nonsevere, n (%) (n = 3658) | χ2 | P value
Fever | 1110 (83·7) | 2951 (80·7) | 5·95 | 0·015
Vomiting | 72 (5·4) | 160 (4·4) | 2·44 | 0·118
Dyspnea | 225 (17·0) | 425 (11·6) | 24·56 | < 0·0001**
Shortness of Breath† | 296 (22·3) | 480 (13·1) | 62·67 | < 0·0001**
Expectoration | 289 (21·8) | 715 (19·5) | 3·06 | 0·080
Sore Throat | 55 (4·1) | 229 (6·3) | 8·08 | 0·004*
Headache | 129 (9·7) | 463 (12·7) | 7·97 | 0·005*
Chills | 126 (9·5) | 372 (10·2) | 0·48 | 0·488
Dry Cough | 580 (43·7) | 1661 (45·4) | 1·09 | 0·296
Nausea | 36 (2·7) | 100 (2·7) | 0·00 | 0·971
Runny Nose | 19 (1·4) | 101 (2·8) | 7·31 | 0·007*
Conjunctival Hyperemia | 3 (0·2) | 7 (0·2) | 0·06 | 0·808
Muscle Soreness | 237 (17·9) | 658 (18·0) | 0·01 | 0·926
Chest Pain | 22 (1·7) | 99 (2·7) | 4·51 | 0·034
Chest Tightness | 229 (17·3) | 551 (15·1) | 3·59 | 0·058
Diarrhea | 127 (9·6) | 362 (9·9) | 0·11 | 0·738
Abdominal Pain | 3 (0·2) | 14 (0·4) | 0·70 | 0·402
Nasal Congestion | 23 (1·7) | 74 (2·0) | 0·42 | 0·515
Fatigue | 537 (40·5) | 1335 (36·5) | 6·65 | 0·010*
Joint Soreness | 60 (4·5) | 224 (6·1) | 4·63 | 0·031
†clinical evidence of altered breathing
*: p < 0·01; **: p < 0·0001.
Table 3
Reported Comorbidities of the Participants
Comorbidities | Severe, n (%) / mean (sd) (n = 1339) | Nonsevere, n (%) / mean (sd) (n = 3723) | t / χ2 | P value
Hypertension | 306 (22·9%) | 599 (16·1%) | χ2 = 30·69 | < 0·0001**
Pulmonary Disease | 51 (3·8%) | 71 (1·9%) | χ2 = 15·14 | < 0·0001**
Diabetes Mellitus | 147 (11·0%) | 239 (6·4%) | χ2 = 29·06 | < 0·0001**
Cardio-cerebrovascular Disease | 126 (9·4%) | 214 (5·7%) | χ2 = 21·08 | < 0·0001**
Chronic Liver Disease | 14 (1·0%) | 32 (0·9%) | χ2 = 0·38 | 0·538
Chronic Kidney Disease | 17 (1·3%) | 31 (0·8%) | χ2 = 2·00 | 0·157
No. of Comorbidities† | 0·50 (0·80) | 0·32 (0·66) | t = 7·96 | < 0·0001**
†total number of comorbid conditions, including hypertension, pulmonary disease, diabetes mellitus, cardio/cerebrovascular disease, chronic liver disease, and chronic kidney disease, for each subject
**: p < 0·0001.
Laboratory and Imaging Results
Laboratory results were recorded for 2,471 patients. The percentages of neutrophils and lymphocytes differed significantly between patients with severe and nonsevere disease: in patients with severe disease, neutrophils were higher (t = 4·67, p < 0·0001) and lymphocytes were lower (t = -7·53, p < 0·0001; Table 4).
A total of 3,438 patients underwent computed tomography examination, revealing abnormalities in 90·5% of patients with severe disease and 92·2% of patients with nonsevere disease—a nonsignificant difference (χ2 = 2·16, p = 0·142).
Table 4
Reported Laboratory Results of the Participants
Indexes | Severe, mean (sd) (n = 605) | Nonsevere, mean (sd) (n = 1866) | t | P value
WBC (White Blood Cell Count) | 5·69 (3·07) | 5·37 (2·87) | 2·38 | 0·018
Lx (Lymphocyte Count) | 1·33 (3·47) | 1·96 (6·19) | -2·39 | 0·017
L (Lymphocyte Percentage, %) | 20·21 (12·74) | 24·90 (13·50) | -7·53 | < 0·0001**
N (Neutrophil Percentage, %) | 67·17 (21·24) | 62·90 (18·94) | 4·67 | < 0·0001**
**: p < 0·0001.
Predictor Selection
Among the 36 variables analyzed, 12 differed significantly between patients with severe and nonsevere disease, indicating their potential predictive value: gender, age, illness duration, dyspnea, shortness of breath, hypertension, pulmonary disease, diabetes, cardio/cerebrovascular disease, number of comorbidities, neutrophil percentage, and lymphocyte percentage. Of these, four were strong predictors of severe disease (Table 5): age (odds ratio [OR] = 1·03; 95% CI: 1·02–1·04; p < 0·0001), illness duration (OR = 1·08; 95% CI: 1·06–1·10; p < 0·0001), shortness of breath (OR = 1·64; 95% CI: 1·26–2·13; p = 0·0002), and lymphocyte percentage (OR = 0·98; 95% CI: 0·97–0·99; p < 0·0001).
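To make the odds ratios easier to interpret, the sketch below recovers the log-odds coefficient, its standard error, and the Wald statistic from a reported OR and 95% CI, and compounds the per-year OR for age over a decade. It assumes a Wald-type interval that is symmetric on the log-odds scale and a log-linear effect of age; neither is stated in the text, and the function name `wald_from_or` is illustrative.

```python
import math

def wald_from_or(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover (coefficient, standard error, Wald z) from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    # The CI spans 2 * z_crit standard errors on the log-odds scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return beta, se, beta / se

# Shortness of breath: OR = 1.64, 95% CI 1.26 to 2.13 (as reported).
beta, se, z = wald_from_or(1.64, 1.26, 2.13)

# Age: OR = 1.03 per year compounds multiplicatively over 10 years.
or_per_decade = 1.03 ** 10

print(f"beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}")
print(f"OR per 10 years of age = {or_per_decade:.2f}")
```

Under these assumptions, an OR of 1·03 per year corresponds to roughly a one-third increase in the odds of severe disease per decade of age, and the recovered statistic for shortness of breath agrees with the reported value to within rounding.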
Table 5
Logistic Regression Model for the Prediction of Severe Disease
Variables | Odds Ratio | 95% CI | t | P value
Age | 1·03 | 1·02, 1·04 | 7·28 | < 0·0001**
Gender | 1·30 | 1·06, 1·58 | 2·55 | 0·011
Illness Duration | 1·08 | 1·06, 1·10 | 9·12 | < 0·0001**
Dyspnea | 1·18 | 0·87, 1·61 | 1·07 | 0·287
Shortness of Breath† | 1·64 | 1·26, 2·13 | 3·69 | 0·0002*
Hypertension | 0·86 | 0·45, 1·67 | -0·43 | 0·664
Pulmonary Disease | 1·67 | 0·73, 3·81 | 1·23 | 0·220
Diabetes Mellitus | 1·29 | 0·64, 2·58 | 0·71 | 0·480
Cardio-cerebrovascular Disease | 0·86 | 0·42, 1·77 | -0·41 | 0·684
No. of Comorbidities | 1·03 | 0·57, 1·85 | 0·10 | 0·921
L (Lymphocyte Percentage) | 0·98 | 0·97, 0·99 | -4·79 | < 0·0001**
N (Neutrophil Percentage) | 1·00 | 0·99, 1·00 | -1·17 | 0·242
†clinical evidence of altered breathing
*: p < 0·01; **: p < 0·0001.
Effect Size of Strong Predictors at Different Disease Stages
To determine whether these strong predictors have the same effect at different disease stages, we analyzed the effect sizes of three strong severity predictors (age, shortness of breath, and lymphocyte percentage) across illness durations. Because sample size differed between stages, we used Cohen’s d for continuous variables and the odds ratio for categorical variables as effect-size measures. As shown in Fig. 3, all three variables remained useful predictors of severe disease as illness duration increased, indicating that the risk factors did not change across disease stages.
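For orientation, the two effect-size measures can be computed from summary data as sketched below. The inputs are the whole-sample figures reported earlier, not the stage-specific values plotted in Fig. 3, and the odds ratio here is unadjusted, so it is expected to differ from the adjusted OR in the regression model. Function names are illustrative.

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    )
    return (mean1 - mean2) / pooled_sd

def odds_ratio(exposed_cases, cases, exposed_controls, controls):
    """Unadjusted odds ratio from counts of exposed subjects in each group."""
    odds_cases = exposed_cases / (cases - exposed_cases)
    odds_controls = exposed_controls / (controls - exposed_controls)
    return odds_cases / odds_controls

# Age, severe vs nonsevere (whole sample).
d_age = cohens_d(60.85, 15.28, 7798, 51.90, 16.17, 37652)
# Shortness of breath, severe vs nonsevere (whole sample).
or_sob = odds_ratio(296, 1326, 480, 3658)

print(f"Cohen's d (age) = {d_age:.2f}; unadjusted OR (shortness of breath) = {or_sob:.2f}")
```

Unlike t and χ2, neither measure grows with sample size, which is why they suit comparisons across stages with unequal numbers of patients.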
Construction and Performance of the COVID-19 Severity Self-Assessment Scale
As described, regression modelling identified 12 variables for inclusion in the prediction scale. However, because patients cannot perform blood tests during self-assessment, the two blood test indicators (neutrophil percentage and lymphocyte percentage) were excluded, leaving a 10-item final scale.
Figure 4 shows the results of the ROC analysis evaluating the accuracy of the different models. The AUC of the full logistic regression model, which included all 12 variables associated with disease severity, was 0·72 (p < 0·0001). After removal of neutrophil percentage and lymphocyte percentage, the remaining model (the final “COVID-19 Severity Self-Assessment Scale”) showed a similar AUC of 0·71 (p < 0·0001), with higher accuracy in older patients (≥ 65 years; AUC = 0·75, p < 0·0001). LASSO regression yielded similar results, indicating no confounding collinearity between the variables and cross-validating the prediction accuracy.
The final 10-item scale is scored out of a total of 100 points, with a higher score indicating a higher risk of severe illness. ROC analysis identified a cutoff value of 49·65, with scores above 49·65 predicting high risk. At this cutoff, the final scale correctly identified 87% of patients.
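The sketch below illustrates how an AUC and a score cutoff can be derived from scale scores. The text does not state which criterion produced the 49·65 cutoff; Youden's J (sensitivity + specificity − 1) is used here only as one common choice. The scores and labels are hypothetical toy values, not study data.

```python
def split_by_label(scores, labels):
    """Separate scores into severe (label 1) and nonsevere (label 0) groups."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return pos, neg

def auc(scores, labels):
    """AUC as P(random severe score > random nonsevere score); ties count 0.5."""
    pos, neg = split_by_label(scores, labels)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, labels):
    """Threshold t maximizing Youden's J, predicting severe when score > t."""
    pos, neg = split_by_label(scores, labels)
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        sensitivity = sum(p > t for p in pos) / len(pos)
        specificity = sum(n <= t for n in neg) / len(neg)
        j = sensitivity + specificity - 1
        if j > best_j:  # keep the lowest threshold among ties
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical toy data: scale scores (0-100) and labels (1 = severe).
toy_scores = [20, 35, 40, 45, 50, 55, 60, 70]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]

toy_auc = auc(toy_scores, toy_labels)
toy_cutoff, toy_j = youden_cutoff(toy_scores, toy_labels)
print(toy_auc, toy_cutoff, toy_j)
```

Other criteria, such as fixing a minimum sensitivity, would generally give a different cutoff from the same ROC curve.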
Once the predictive variables were determined and the self-assessment scale developed, an online calculator tool was constructed to give patients rapid access to their results (http://180.167.250.222:10080/COVID-19-Severity-Self-Assessment-Scale.html; Fig. 5).