Epidemiological analysis
A total of 9,065,910 ILI cases reported in the United States from the 1st week of 2011 to the 29th week of 2020 were included in this study. The annual ILI infection rate fluctuated between 5.92 and 15.84 per 100,000 population. ILI occurs throughout the year; activity most often peaks between December and February and can last until May.
By age, the 5–24 years age group had the most ILI cases, accounting for about 35 percent of the total, while the group aged over 65 years had the fewest, accounting for about 7 percent (Fig. 1). The difference among age groups was statistically significant (P < 0.001).
The populations of 49 states were collected and visualized on a map (Fig. 2). Comparing this map with the results of the spatio-temporal analysis could reveal an association between population density and influenza incidence.
Spatio-Temporal Analysis
Overall, the highest cumulative incidence of ILI (per 100,000 population) during the study period was seen in Louisiana, the District of Columbia and Virginia, with 12,200, 9,563 and 9,554 reported cases, respectively. The lowest cumulative incidence of ILI was reported from Ohio, Washington and Iowa (Fig. 3).
Global Spatial Autocorrelation
The global spatial autocorrelation analysis suggested that ILI was spatially clustered at the state level from 2012 to 2017, when the global Moran's I was significant at the 0.05 level. In contrast, the global Moran's I for 2011, 2018 and 2019 showed no significant spatial autocorrelation, although Moran's I was greater than 0 in each of these years (Table 1).
Table 1
Global spatial autocorrelation analysis
Year | Moran's I | E(I) | Mean | s | Z-Value | P-Value |
2011 | 0.099 | -0.021 | -0.023 | 0.090 | 1.358 | 0.093 |
2012 | 0.185 | -0.021 | -0.019 | 0.095 | 2.151 | 0.028 |
2013 | 0.181 | -0.021 | -0.019 | 0.094 | 2.121 | 0.031 |
2014 | 0.200 | -0.021 | -0.021 | 0.092 | 2.401 | 0.022 |
2015 | 0.166 | -0.021 | -0.222 | 0.091 | 2.073 | 0.037 |
2016 | 0.177 | -0.021 | -0.024 | 0.093 | 2.150 | 0.029 |
2017 | 0.146 | -0.021 | -0.025 | 0.087 | 1.981 | 0.039 |
2018 | 0.074 | -0.021 | -0.023 | 0.092 | 1.053 | 0.145 |
2019 | 0.074 | -0.021 | -0.021 | 0.090 | 1.053 | 0.139 |
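For readers who want to check the figures in Table 1, the sketch below shows one way the global Moran's I and its standardized value can be computed. It assumes a binary contiguity weight matrix and assumes that the Z-value is the observed I standardized against the permutation mean and standard deviation (the "Mean" and "s" columns); the study does not state either choice explicitly. The function names and the four-area toy data are hypothetical.

```python
from typing import Sequence


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I for values on n areas with an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def expected_i(n: int) -> float:
    """Expected Moran's I when there is no spatial autocorrelation."""
    return -1.0 / (n - 1)


def pseudo_z(i_obs: float, perm_mean: float, perm_sd: float) -> float:
    """Standardize an observed I against a permutation reference distribution."""
    return (i_obs - perm_mean) / perm_sd


# Toy example (hypothetical data): four areas in a row, neighbours share a border.
toy_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_i = morans_i([1.0, 2.0, 3.0, 4.0], toy_weights)

# Inputs taken from the 2012 row of Table 1 (I, Mean, s).
z_2012 = pseudo_z(0.185, -0.019, 0.095)
```

With n = 49 areas, `expected_i(49)` rounds to -0.021, which matches the E(I) column, and `z_2012` comes out close to the tabulated 2.151 (the small gap is attributable to rounding of the table inputs).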
Local Spatial Autocorrelation
Local spatial autocorrelation analysis reveals only relative, rather than absolute, correlations among states. Only states whose local Moran's I was significant at the 0.05 level are shown on the map. From 2011 to 2019, the analysis identified 3 HH clusters, 2 HL clusters, 4 LH clusters and 3 LL clusters in total. HH clusters were observed in Louisiana (5 years), Mississippi (4 years) and the District of Columbia (1 year); in Louisiana and Mississippi they persisted for long periods. HL clusters were observed in Illinois (4 years) and Oregon (2 years). LH clusters were observed in Tennessee (4 years), Maryland (6 years), Arkansas (5 years) and Texas (3 years). LL clusters appeared in the northeastern United States only in 2018 and 2019 (Fig. 4).
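The HH, HL, LH and LL labels can be illustrated with a minimal sketch of the quadrant step of a local Moran analysis. It compares each area's deviation from the overall mean with the average deviation of its neighbours. The significance filter applied in the study (local Moran's I at the 0.05 level) is omitted, every area is assumed to have at least one neighbour, and the function name and six-area chain are hypothetical.

```python
from typing import List, Sequence


def lisa_quadrants(values: Sequence[float],
                   weights: Sequence[Sequence[float]]) -> List[str]:
    """Label each area HH, HL, LH or LL (own value first, neighbours second)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    labels = []
    for i in range(n):
        row_sum = sum(weights[i])  # assumed > 0: no isolated areas
        # Spatial lag: weighted average deviation of the neighbours of area i.
        lag = sum(weights[i][j] * dev[j] for j in range(n)) / row_sum
        own = "H" if dev[i] > 0 else "L"
        neighbours = "H" if lag > 0 else "L"
        labels.append(own + neighbours)
    return labels


# Toy example (hypothetical data): six areas in a chain, adjacent areas are neighbours.
chain = [[1 if abs(i - j) == 1 else 0 for j in range(6)] for i in range(6)]
toy_labels = lisa_quadrants([10, 9, 1, 2, 1, 9], chain)
```

In this toy chain the first two areas are labelled HH, the last area is HL (a high value next to a low neighbour), and the low areas in between are LH or LL depending on their neighbours.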
Spatio-Temporal Cluster Analysis
The spatio-temporal cluster analysis detected 23 clusters of ILI during the study period. The clusters were particularly pronounced in spring and winter. For example, the risk ratio (RR) was highest in 2015, when three levels of clustering were detected. The level 1 cluster was centered on Louisiana, with 2 surrounding states; the risk of ILI in this area was 11.66 times that in other areas (LLR = 69,009, P < 0.001). The level 2 cluster was centered on Virginia, with 3 surrounding states; the risk of ILI in this area was 9.79 times that in other areas (LLR = 73,277, P < 0.001). The level 3 cluster was centered on New Mexico, with 3 surrounding states; the risk of ILI in this area was 3.38 times that in other areas (LLR = 26,518, P < 0.001). The states identified as high clusters in the local spatial autocorrelation analysis were all located within these high-incidence cluster areas, so the results of the two analyses were consistent. In terms of timing, high incidence occurred mainly between January and March (Table 2).
Table 2. Spatio-temporal scan of ILI in the United States from 2011 to 2019.
Year | Level | Center | N | Cluster period | Coordinates/Radius (km) | Observed cases | Expected cases | RR | LLR | P-value |
2011 | 1 | Kentucky | 15 | 2011-01-01 to 2011-02-28 | (37.5N,85.3W)/738.2 | 121829 | 24485 | 6.22 | 108601 | <0.001 |
2011 | 2 | Colorado | 13 | 2011-01-01 to 2011-02-28 | (39.0N,105.5W)/1005.2 | 62527 | 15984 | 3.91 | 4.32 | <0.001 |
2012 | 1 | Mississippi | 3 | 2012-10-01 to 2012-12-31 | (32.8N,89.7W)/289.2 | 37831 | 4698 | 8.65 | 46953 | <0.001 |
2012 | 2 | Virginia | 1 | 2012-09-01 to 2012-12-31 | (37.5N,78.8W)/0 | 29130 | 4234 | 7.26 | 31944 | <0.001 |
2012 | 3 | Nebraska | 17 | 2012-01-01 to 2012-03-31 | (41.5N,99.8W)/1116.02 | 65301 | 32693 | 2.15 | 13777 | <0.001 |
2013 | 1 | Virginia | 3 | 2013-01-01 to 2013-03-31 | (37.5N,78.8W)/222.1 | 46638 | 4750 | 10.61 | 66247 | <0.001 |
2013 | 2 | Texas | 15 | 2013-01-01 to 2013-02-28 | (31.5N,99.4W)/1327.5 | 101773 | 27333 | 4.32 | 64734 | <0.001 |
2014 | 1 | Mississippi | 3 | 2014-10-01 to 2014-12-31 | (32.8N,89.7W)/289.2 | 45125 | 5686 | 8.52 | 55411 | <0.001 |
2014 | 2 | Virginia | 3 | 2014-01-01 to 2014-04-30 | (37.5N,78.8W)/222.1 | 43127 | 6524 | 7.06 | 46035 | <0.001 |
2014 | 3 | New Mexico | 12 | 2014-01-01 to 2014-02-28 | (34.4N,106.1W)/1249.5 | 54639 | 18361 | 3.18 | 24494 | <0.001 |
2015 | 1 | Louisiana | 2 | 2015-01-01 to 2015-04-30 | (31.1N,92.0W)/289.2 | 45837 | 4254 | 11.66 | 69009 | <0.001 |
2015 | 2 | Virginia | 3 | 2015-01-01 to 2015-04-30 | (37.5N,78.8W)/222.1 | 54666 | 6134 | 9.79 | 73277 | <0.001 |
2015 | 3 | New Mexico | 12 | 2015-01-01 to 2015-02-28 | (34.4N,106.1W)/1249.5 | 54311 | 17264 | 3.38 | 26518 | <0.001 |
2016 | 1 | Virginia | 3 | 2016-01-01 to 2016-05-31 | (37.5N,78.8W)/222.1 | 63287 | 7974 | 8.81 | 78624 | <0.001 |
2016 | 2 | Arizona | 3 | 2016-01-01 to 2016-04-30 | (34.3N,111.7W)/559.1 | 32319 | 7155 | 4.73 | 24144 | <0.001 |
2016 | 3 | Mississippi | 10 | 2016-01-01 to 2016-04-30 | (32.8N,89.7W)/820.6 | 104314 | 34608 | 3.47 | 50171 | <0.001 |
2017 | 1 | Florida | 13 | 2017-01-01 to 2017-03-31 | (28.6N,82.5W)/1247.2 | 192625 | 51339 | 4.63 | 127775 | <0.001 |
2017 | 2 | Oregon | 1 | 2017-11-01 to 2017-12-31 | (43.9N,120.6W)/0 | 6819 | 1714 | 4.01 | 4329 | <0.001 |
2017 | 3 | Colorado | 14 | 2017-01-01 to 2017-02-28 | (39.0N,105.5W)/1024.3 | 54139 | 25185 | 2.23 | 13031 | <0.001 |
2018 | 1 | Florida | 13 | 2018-01-01 to 2018-02-28 | (28.6N,82.5W)/1247.2 | 259855 | 47137 | 6.89 | 253645 | <0.001 |
2018 | 2 | Wyoming | 13 | 2018-01-01 to 2018-02-28 | (43.0N,107.6W)/1052.7 | 74812 | 19512 | 4.04 | 46665 | <0.001 |
2019 | 1 | Colorado | 7 | 2019-01-01 to 2019-03-31 | (39.0N,105.5W)/747.6 | 82746 | 19164 | 4.52 | 58874 | <0.001 |
2019 | 2 | Florida | 13 | 2019-01-01 to 2019-03-31 | (28.6N,85.5W)/1247.2 | 326625 | 94525 | 4.16 | 193772 | <0.001 |
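The RR and LLR columns of Table 2 can be read with the help of the following sketch, which assumes a Poisson-based space-time scan statistic of the Kulldorff type; the study does not state which probability model was used, so this is an illustration rather than a reproduction of its computation. The `total` argument (all cases in the scanned data) is not reported in Table 2, and the toy numbers are hypothetical.

```python
import math


def relative_risk(observed: float, expected: float, total: float) -> float:
    """Risk inside the candidate cluster divided by risk outside it."""
    inside = observed / expected
    outside = (total - observed) / (total - expected)
    return inside / outside


def poisson_llr(observed: float, expected: float, total: float) -> float:
    """Log likelihood ratio of a high-rate cluster under a Poisson model."""
    if observed <= expected:
        return 0.0  # not an excess of cases, so no evidence for a high-rate cluster
    inside = observed * math.log(observed / expected)
    outside = (total - observed) * math.log((total - observed) / (total - expected))
    return inside + outside


# Toy example (hypothetical counts): 30 observed vs 10 expected out of 100 cases.
toy_rr = relative_risk(30, 10, 100)
toy_llr = poisson_llr(30, 10, 100)

# Observed-to-expected ratio for the 2015 Louisiana cluster in Table 2.
louisiana_ratio = 45837 / 4254
```

The observed-to-expected ratio for the 2015 Louisiana cluster is about 10.78, slightly below the reported RR of 11.66. This is the direction expected under the definition above, because the RR also divides by the below-average risk outside the cluster.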
Correlation analysis of coastal states
As shown in Fig. 5, a high correlation was detected among the 19 coastal states, although Delaware showed a negative correlation. Interestingly, Mississippi and Texas, two high-risk states, showed only a weak correlation. A possible reason is that the incidence of ILI in these two states has been high and stable since 2011, while the incidence in other states has been on a gradual upward trend.
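The explanation offered for the weak correlations can be illustrated with a small sketch. It assumes the Pearson coefficient, which the study does not name, and uses hypothetical incidence series: two that rise over time and one that stays high and stable.

```python
import math
from typing import Dict, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(
    series: Dict[str, Sequence[float]]
) -> Dict[Tuple[str, str], float]:
    """Pairwise correlations, keyed by (name, name)."""
    names = sorted(series)
    return {(a, b): pearson_r(series[a], series[b]) for a in names for b in names}


# Hypothetical annual incidence values, for illustration only.
toy = {
    "rising_a": [5.0, 6.0, 7.5, 9.0, 11.0],
    "rising_b": [4.0, 5.5, 6.0, 8.0, 9.5],
    "stable": [12.0, 11.8, 12.1, 11.9, 12.0],
}
toy_corr = correlation_matrix(toy)
```

In this toy data the two rising series are strongly correlated with each other, while the stable series is only weakly correlated with either, even though its level is the highest of the three.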
Time-series Analysis
The spatio-temporal analysis identified HH clusters mainly in Mississippi and Louisiana. In particular, Mississippi has been an HH cluster in recent years. The incidence of ILI in Mississippi was therefore predicted by time-series analysis.
SARIMA Model
Using the raw training data from the 1st week of 2011 to the 52nd week of 2018, the series was differenced with a trend difference of d = 0 and a seasonal difference of D = 1. The Augmented Dickey-Fuller test indicated that the resulting sequence was stationary (t = -3.98, P = 0.01). The ACF and PACF plots were used to estimate the ranges of the parameters p, P, q and Q [19]. After checking the ACF and PACF plots (Fig. 6), SARIMA(1,0,0)(1,1,0)52 was the best-fitting model, with the lowest AIC and BIC values. This model passed the Ljung-Box Q test (Q = 21.822, P = 0.149), indicating that its residuals are a white noise sequence. All of its parameter estimates were significant (Table 3).
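Two of the steps described above, seasonal differencing with D = 1 and the Ljung-Box Q statistic, can be sketched in a few lines. This is a minimal illustration with hypothetical function names and toy data, not the software used in the study; the P-value step (comparison of Q with a chi-square distribution) is left out.

```python
from typing import List, Sequence


def seasonal_difference(series: Sequence[float], period: int = 52) -> List[float]:
    """Seasonal difference with D = 1: y[t] - y[t - period]."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


def ljung_box_q(residuals: Sequence[float], max_lag: int) -> float:
    """Ljung-Box Q = n(n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Toy series (hypothetical): a period-4 pattern that shifts up by 1 each cycle.
toy_series = [1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6]
toy_diff = seasonal_difference(toy_series, period=4)

# Strongly alternating "residuals" give a large Q, i.e. they are not white noise.
toy_q = ljung_box_q([1, -1, 1, -1, 1, -1], max_lag=1)
```

After seasonal differencing the toy series becomes constant, which is the sense in which differencing removes a repeating seasonal pattern. A small Q relative to its chi-square reference, as reported for the selected model, indicates no remaining autocorrelation in the residuals.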
Table 3. Comparison of candidate SARIMA models.
Model | Estimate | t | P | Ljung-Box Q statistic | Ljung-Box Q P | AIC | BIC | RMSE | MAPE |
SARIMA (1,0,0)(1,1,0)52 | - | - | - | 21.822 | 0.149 | 2235.530 | 2247.220 | 4.673 | 14.290 |
AR1 | 0.886 | 36.768 | <0.001 | - | - | - | - | - | - |
SAR1 | -0.607 | 14.350 | <0.001 | - | - | - | - | - | - |
SARIMA (1,0,1)(1,1,0)52 | - | - | - | 20.962 | 0.138 | 2235.110 | 2250.700 | 4.655 | 14.368 |
AR1 | 0.865 | 28.837 | <0.001 | - | - | - | - | - | - |
MA2 | 0.097 | 1.5410 | 0.065 | - | - | - | - | - | - |
SAR1 | -0.612 | -14.495 | <0.001 | - | - | - | - | - | - |
SARIMA (2,0,0)(1,1,0)52 | - | - | - | 20.734 | 0.146 | 2201.100 | 2216.700 | 14.390 | 0.970 |
AR1 | 0.957 | 18.254 | <0.001 | - | - | - | - | - | - |
AR2 | -0.080 | -1.511 | 0.067 | - | - | - | - | - | - |
SAR1 | -0.612 | -14.495 | <0.001 | - | - | - | - | - | - |
SARIMA (2,0,1)(1,1,0)52 | - | - | - | 18.552 | 0.183 | 2233.530 | 2253.012 | 4.636 | 14.602 |
AR1 | 0.131 | 0.695 | 0.245 | - | - | - | - | - | - |
AR2 | 0.653 | 3.744 | <0.001 | - | - | - | - | - | - |
MA1 | 0.835 | 5.088 | <0.001 | - | - | - | - | - | - |
SAR1 | -0.605 | 14.157 | <0.001 | - | - | - | - | - | - |
SARIMA (2,0,2)(1,1,0)52 | - | - | - | 18.405 | 0.143 | 2233.480 | 2256.87 | 4.599 | 14.906 |
AR1 | -0.080 | -2.161 | 0.018 | - | - | - | - | - | - |
AR2 | 0.825 | 26.101 | <0.001 | - | - | - | - | - | - |
MA1 | 1.064 | 16.272 | <0.001 | - | - | - | - | - | - |
MA2 | 0.064 | 0.994 | 0.163 | - | - | - | - | - | - |
SAR1 | -0.611 | -14.45 | <0.001 | - | - | - | - | - | - |
Forecasting
The forecasting performance of the SARIMA(1,0,0)(1,1,0)52 model was tested by comparing predicted values with observed values from the 1st week of 2019 to the 29th week of 2020. All observed values fell within the 95% CI of the predicted values, and the predicted trend was broadly consistent with the observed trend (Fig. 7). The model was then used to forecast ILI incidence from the 30th week of 2020 to the 52nd week of 2021. The forecast shows high incidence in winter and spring and low incidence in summer and autumn, with the incidence of influenza reaching its peak in the 7th week of 2021 (Table 4).
Table 4
Predictive value of ILI incidence (per 100,000)
Year/week | Incidence | 95% CI | | Year/week | Incidence | 95% CI |
2020/30 | 5.504 | -4.248 to 15.256 | | 2021/16 | 11.536 | -8.684 to 31.755 |
2020/31 | 5.163 | -7.801 to 18.128 | | 2021/17 | 10.344 | -9.876 to 30.563 |
2020/32 | 4.681 | -10.288 to 19.651 | | 2021/18 | 8.225 | -11.995 to 28.444 |
2020/33 | 6.943 | -9.399 to 23.284 | | 2021/19 | 7.892 | -12.327 to 28.112 |
2020/34 | 8.422 | -8.900 to 25.743 | | 2021/20 | 7.633 | -12.587 to 27.853 |
2020/35 | 9.549 | -8.488 to 27.587 | | 2021/21 | 6.467 | -13.753 to 26.687 |
2020/36 | 11.084 | -7.484 to 29.652 | | 2021/22 | 7.606 | -12.613 to 27.826 |
2020/37 | 9.433 | -9.532 to 28.398 | | 2021/23 | 5.670 | -14.550 to 25.890 |
2020/38 | 9.993 | -9.271 to 29.258 | | 2021/24 | 5.411 | -14.809 to 25.63 |
2020/39 | 11.696 | -7.795 to 31.187 | | 2021/25 | 5.784 | -14.436 to 26.004 |
2020/40 | 11.710 | -7.953 to 31.373 | | 2021/26 | 5.468 | -14.751 to 25.688 |
2020/41 | 11.726 | -8.068 to 31.520 | | 2021/27 | 4.498 | -15.721 to 24.718 |
2020/42 | 12.617 | -7.276 to 32.511 | | 2021/28 | 4.990 | -15.230 to 25.21 |
2020/43 | 15.611 | -4.359 to 35.581 | | 2021/29 | 4.672 | -15.548 to 24.892 |
2020/44 | 15.406 | -4.623 to 35.434 | | 2021/30 | 4.823 | -15.851 to 25.496 |
2020/45 | 20.754 | 0.681 to 40.827 | | 2021/31 | 5.010 | -16.006 to 26.025 |
2020/46 | 23.021 | 2.913 to 43.128 | | 2021/32 | 4.910 | -16.364 to 26.184 |
2020/47 | 26.565 | 6.431 to 46.698 | | 2021/33 | 6.056 | -15.414 to 27.526 |
2020/48 | 30.270 | 10.116 to 50.423 | | 2021/34 | 8.693 | -12.926 to 30.313 |
2020/49 | 24.789 | 4.619 to 44.958 | | 2021/35 | 9.995 | -11.739 to 31.728 |
2020/50 | 26.030 | 5.849 to 46.211 | | 2021/36 | 11.233 | -10.587 to 33.054 |
2020/51 | 31.144 | 10.954 to 51.334 | | 2021/37 | 9.925 | -11.962 to 31.812 |
2020/52 | 34.819 | 14.622 to 55.016 | | 2021/38 | 10.378 | -11.560 to 32.316 |
2021/01 | 29.301 | 9.099 to 49.503 | | 2021/39 | 11.596 | -10.380 to 33.573 |
2021/02 | 23.157 | 2.951 to 43.363 | | 2021/40 | 11.996 | -10.010 to 34.003 |
2021/03 | 25.995 | 5.785 to 46.204 | | 2021/41 | 11.853 | -10.176 to 33.883 |
2021/04 | 32.197 | 11.985 to 52.409 | | 2021/42 | 12.471 | -9.577 to 34.518 |
2021/05 | 40.479 | 20.266 to 60.693 | | 2021/43 | 17.086 | -4.975 to 39.147 |
2021/06 | 54.811 | 34.596 to 75.026 | | 2021/44 | 15.498 | -6.574 to 37.569 |
2021/07 | 54.064 | 33.848 to 74.28 | | 2021/45 | 23.641 | 1.562 to 45.720 |
2021/08 | 43.281 | 23.064 to 63.498 | | 2021/46 | 26.943 | 4.858 to 49.028 |
2021/09 | 37.155 | 16.938 to 57.373 | | 2021/47 | 33.262 | 11.172 to 55.352 |
2021/10 | 31.506 | 11.288 to 51.724 | | 2021/48 | 36.767 | 14.674 to 58.86 |
2021/11 | 24.705 | 4.487 to 44.924 | | 2021/49 | 29.557 | 7.461 to 51.654 |
2021/12 | 20.839 | 0.620 to 41.058 | | 2021/50 | 30.704 | 8.606 to 52.802 |
2021/13 | 17.175 | -3.044 to 37.394 | | 2021/51 | 36.928 | 14.828 to 59.027 |
2021/14 | 14.670 | -5.549 to 34.889 | | 2021/52 | 40.739 | 18.638 to 62.84 |
2021/15 | 11.900 | -8.319 to 32.119 | | | | |
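To make the forecasting step concrete, the sketch below writes out the one-step-ahead recursion implied by a SARIMA(1,0,0)(1,1,0)52 model, using the AR1 and SAR1 estimates from Table 3 as default coefficients. It assumes the usual multiplicative form with no constant term, (1 - phi B)(1 - Phi B^52)(1 - B^52) y_t = e_t, which may differ from the parameterization of the software used in the study. The function name and toy histories are hypothetical, and the toy examples use a period of 4 to stay short.

```python
from typing import Sequence


def one_step_forecast(history: Sequence[float],
                      phi: float = 0.886,           # AR1 estimate from Table 3
                      seasonal_phi: float = -0.607,  # SAR1 estimate from Table 3
                      period: int = 52) -> float:
    """One-step-ahead point forecast for SARIMA(1,0,0)(1,1,0) without a constant."""
    if len(history) < 2 * period + 1:
        raise ValueError("need at least 2 * period + 1 past observations")

    def w(k: int) -> float:
        # Seasonally differenced value k steps before the forecast time.
        return history[-k] - history[-k - period]

    # Expanding the AR polynomials gives
    # w[t] = phi * w[t-1] + Phi * w[t-s] - phi * Phi * w[t-s-1].
    w_hat = (phi * w(1)
             + seasonal_phi * w(period)
             - phi * seasonal_phi * w(period + 1))
    # Undo the seasonal difference to return to the original scale.
    return history[-period] + w_hat


# Toy histories (hypothetical data) with a period of 4.
periodic = [1, 2, 3, 4] * 3
flat_forecast = one_step_forecast(periodic, period=4)

shifted = [1, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4, 5]
shifted_forecast = one_step_forecast(shifted, phi=0.5, seasonal_phi=-0.25, period=4)
```

For a perfectly periodic history all seasonal differences are zero, so the forecast simply repeats the value from one season earlier. Multi-step forecasts such as those in Table 4 would be obtained by feeding each forecast back into the history, with interval widths coming from the accumulated forecast error variance, which this sketch does not compute.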