The distributions of air pollutants and meteorological factors are described in S1 Table. PM2.5 was 45.0 (71.0) μg/m3 (range: 2.0 to 1004.0 μg/m3), CO was 0.8 (0.8) mg/m3 (range: 0.1 to 16.7 mg/m3), NO2 was 41.0 (47.0) μg/m3 (range: 1.0 to 300.0 μg/m3), O3 was 40.0 (69.0) μg/m3 (range: 1.0 to 504.0 μg/m3), and SO2 was 4.0 (8.0) μg/m3 (range: 1.0 to 307.0 μg/m3).
The spatial distribution of PM2.5 is shown in S2 Fig. PM2.5 concentration varied across districts and counties and gradually decreased from south to north. We therefore predicted PM2.5 concentration separately for each of the 16 districts and counties of Beijing.
The time series of daily PM2.5 concentration in the districts with the highest (Fangshan) and lowest (Miyun) concentrations during the study period are shown in S3 Fig. Of the 529 days in total, daily PM2.5 concentration exceeded the second standard in China on 191 and 115 days, respectively. Daily PM2.5 concentration appeared to fluctuate randomly, although there was a declining trend over the study period.
The associations between PM2.5 and meteorological factors are shown in S2 Table. All correlation coefficients were statistically significant. When the Spearman rank correlation coefficients (rs) among several correlated meteorological factors were above 0.60, only the factor with the highest Spearman rank correlation coefficient with PM2.5 was selected. As a result, minimum relative humidity (rs = 0.29, P < 0.001), sea level pressure (rs = –0.09, P < 0.001), maximum wind speed (rs = –0.16, P < 0.001), and temperature (rs = 0.01, P < 0.001) were considered for inclusion in the multi-level AM.
The results of the multi-level AM are shown in Table 1. Every variable was significantly associated with PM2.5 (β ranging from –15.05 to 33.73, P < 0.05). Finally, minimum relative humidity, sea level pressure, maximum wind speed, wind direction, temperature, rainfall, CO, NO2, SO2, and O3 were included in the multi-level AM.
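As a minimal sketch of how such exceedance counts can be obtained: the threshold is not stated in this section, so the code assumes that the "second standard" refers to the Grade II 24-hour PM2.5 limit of 75 μg/m3 in GB 3095-2012. The function name and the input list are hypothetical.

```python
# Assumed threshold: Grade II 24-hour PM2.5 limit (GB 3095-2012), in μg/m3.
GRADE_II_DAILY_LIMIT = 75.0


def count_exceedance_days(daily_means, limit=GRADE_II_DAILY_LIMIT):
    """Count the days whose mean PM2.5 is strictly above the limit."""
    return sum(1 for value in daily_means if value > limit)


# Hypothetical daily means (μg/m3) for four days in one district.
example_days = [30.0, 80.0, 75.0, 120.0]
n_exceeding = count_exceedance_days(example_days)
share = n_exceeding / len(example_days)
```

With the counts reported here, 191 and 115 of 529 days correspond to roughly 36% and 22% of the study period.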
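The selection rule can be illustrated with a short sketch. It is not the authors' code: it assumes a greedy reading of the rule (factors are ranked by the absolute value of their correlation with PM2.5, and a factor is dropped when its absolute correlation with an already kept factor exceeds 0.60), and all function names and toy values are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; constant series are not handled in this sketch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_factors(factors, target, threshold=0.60):
    """Keep the factor most correlated with the target from each correlated group."""
    strength = {name: abs(spearman(vals, target)) for name, vals in factors.items()}
    kept = []
    for name in sorted(strength, key=strength.get, reverse=True):
        # Assumption: absolute correlations are compared against the threshold.
        if all(abs(spearman(factors[name], factors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (hypothetical values, not taken from the study).
pm25 = [10, 20, 30, 40, 50, 60]
toy_factors = {
    "min_rh": [1, 2, 3, 4, 5, 6],
    "mean_rh": [1, 2, 3, 4, 6, 5],
    "wind_speed": [4, 6, 1, 5, 2, 3],
}
selected = select_factors(toy_factors, pm25)
```

In the toy data the two humidity series are strongly correlated with each other, so only the one more closely related to PM2.5 is kept, while the weakly related wind speed series is retained.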
Table 1. Factors associated with PM2.5.
| Variable | β | SE | t | P | Variable | β | SE | t | P |
|---|---|---|---|---|---|---|---|---|---|
| warm season | -12.22 | 3.84 | -3.19 | 0.001 | CO | 33.73 | 0.69 | 48.80 | <0.001 |
| weekend | -7.72 | 1.28 | -6.03 | <0.001 | NO2 | 1.36 | 0.03 | 50.18 | <0.001 |
| holiday | 16.75 | 2.47 | 6.79 | <0.001 | SO2 | 2.43 | 0.08 | 28.72 | <0.001 |
| rainfall | -12.69 | 1.35 | -9.37 | <0.001 | O3 | -0.07 | 0.03 | -2.57 | 0.010 |
| wind direction (north as control) | | | | | min relative humidity | 1.43 | 0.04 | 37.72 | <0.001 |
| east | 28.10 | 1.75 | 16.10 | <0.001 | sea level pressure | -0.79 | 0.12 | -6.78 | <0.001 |
| south | 31.07 | 1.56 | 19.96 | <0.001 | temperature | 0.37 | 0.17 | 2.22 | 0.026 |
| west | 8.08 | 1.78 | 4.53 | <0.001 | wind speed | -15.05 | 0.36 | -42.33 | <0.001 |
Note: All estimates are from the multi-level additive model.
In the multi-level AM, the degrees of freedom of each independent variable were determined one at a time by minimizing the partial autocorrelation function (PACF). An example is given in S4 Fig: the PACF was minimal when k was 20, so the degrees of freedom of this variable in the model were set to 20.
For the LSTM, the model was considered well trained once the training and testing losses became stable (S5 Fig). The number of epochs, batch size, number of cells, and number of network layers were tuned to optimize the models. The final LSTM models, which performed well, consisted of one LSTM layer with 20 cells followed by a fully connected layer with 1 cell, and were trained for 40 epochs with a batch size of 20.
Hourly PM2.5 concentration for the next hour was predicted from the current hourly data on meteorological factors and air pollutants; the prediction results are shown in Table 2. The R2 of LSTM ranged from 0.70 to 0.92, generally higher than that of multi-level AM (0.59 to 0.80). The RMSE of LSTM ranged from 6.20 to 17.58 μg/m3, lower than that of multi-level AM (19.19 to 30.81 μg/m3); the MAE ranged from 4.50 to 13.42 μg/m3, lower than that of multi-level AM (13.55 to 22.35 μg/m3); and the MAPE ranged from 0.18% to 0.55%, lower than that of multi-level AM (0.50% to 0.87%). These results suggest that LSTM performs better than multi-level AM in hourly PM2.5 concentration prediction. The observed and predicted hourly PM2.5 concentrations from multi-level AM and LSTM in Daxing (best LSTM prediction) and Fangshan (worst LSTM prediction) are compared in Fig 1. The hourly PM2.5 concentrations predicted by LSTM are more consistent with the observations than those predicted by multi-level AM, which further suggests that LSTM performs better than multi-level AM in hourly PM2.5 concentration prediction.
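The selection of k can be sketched as follows. The sketch assumes that the model is refitted for each candidate k, that the PACF is computed on the resulting residual series, and that "minimum PACF" is summarized as the sum of absolute partial autocorrelations over the first few lags; none of these details are stated in this section. The function names and the `residuals_by_k` mapping are hypothetical, and the PACF is computed with the standard Durbin-Levinson recursion.

```python
def autocorr(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf(x, max_lag):
    """Partial autocorrelations for lags 1..max_lag (Durbin-Levinson)."""
    rho = autocorr(x, max_lag)
    phi_prev = []  # AR coefficients of the order k-1 fit
    out = []
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out


def pacf_score(residuals, max_lag=10):
    """Assumed summary: sum of absolute partial autocorrelations."""
    return sum(abs(p) for p in pacf(residuals, max_lag))


def choose_k(residuals_by_k, max_lag=10):
    """Return the candidate k whose residuals have the smallest PACF score."""
    return min(residuals_by_k,
               key=lambda k: pacf_score(residuals_by_k[k], max_lag))
```

Here `residuals_by_k` would map each candidate k to the residuals of the model fitted with that k; the fitting step itself is outside the scope of this sketch.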
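To give a sense of the size of this architecture, the following sketch counts trainable parameters for a standard LSTM layer with biases followed by a fully connected layer, and the number of gradient updates in a training run. The number of input features and the number of training samples are not given in this section, so the values used below are purely illustrative, as are the function names.

```python
import math


def lstm_param_count(n_features, n_units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (n_units * (n_features + n_units) + n_units)


def dense_param_count(n_inputs, n_outputs):
    """Weights plus one bias per output."""
    return n_inputs * n_outputs + n_outputs


def updates_per_run(n_samples, batch_size, epochs):
    """Mini-batch updates over the whole run."""
    return math.ceil(n_samples / batch_size) * epochs


# Reported settings: 20 LSTM cells, 1 output cell, 40 epochs, batch size 20.
# Assumed for illustration only: 10 input features and 1000 training samples.
n_lstm = lstm_param_count(n_features=10, n_units=20)
n_dense = dense_param_count(n_inputs=20, n_outputs=1)
n_updates = updates_per_run(n_samples=1000, batch_size=20, epochs=40)
```

Under these illustrative inputs the network has a few thousand parameters, which is small by deep learning standards.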
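The four scores compared here can be computed as in the following sketch. It assumes that R2 is the coefficient of determination (the section does not say whether R2 is this quantity or a squared correlation) and it returns MAPE as a fraction rather than a percentage. The function name and the toy values are illustrative.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, RMSE, MAE and MAPE for paired observed/predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # Observed values are assumed to be nonzero.
        "mape": sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n,
    }


# Toy values in μg/m3 (hypothetical, not taken from the study).
scores = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```

RMSE and MAE are in the units of the data (μg/m3 here), whereas R2 and MAPE are unitless.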
Table 2. Comparison of AM and LSTM on hourly PM2.5 concentration prediction.
| District/county | AM R2 | AM RMSE (μg/m3) | AM MAE (μg/m3) | AM MAPE (%) | LSTM R2 | LSTM RMSE (μg/m3) | LSTM MAE (μg/m3) | LSTM MAPE (%) |
|---|---|---|---|---|---|---|---|---|
| Dongcheng | 0.78 | 20.88 | 15.56 | 0.55 | 0.86 | 7.36 | 5.86 | 0.24 |
| Xicheng | 0.75 | 22.41 | 16.21 | 0.54 | 0.88 | 6.20 | 5.10 | 0.27 |
| Chaoyang | 0.67 | 26.37 | 19.19 | 0.65 | 0.78 | 9.99 | 7.40 | 0.21 |
| Haidian | 0.70 | 22.18 | 15.99 | 0.62 | 0.77 | 9.62 | 6.97 | 0.20 |
| Fengtai | 0.59 | 27.92 | 20.26 | 0.66 | 0.85 | 6.71 | 5.08 | 0.18 |
| Shijingshan | 0.74 | 22.65 | 15.66 | 0.60 | 0.78 | 11.62 | 9.58 | 0.48 |
| Fangshan | 0.61 | 29.34 | 21.33 | 0.64 | 0.70 | 17.58 | 13.42 | 0.31 |
| Daxing | 0.76 | 21.39 | 15.40 | 0.57 | 0.89 | 5.72 | 4.50 | 0.18 |
| Tongzhou | 0.59 | 30.81 | 22.35 | 0.73 | 0.83 | 7.47 | 5.47 | 0.18 |
| Shunyi | 0.80 | 20.44 | 14.15 | 0.67 | 0.86 | 8.04 | 6.22 | 0.39 |
| Changping | 0.71 | 21.68 | 15.46 | 0.72 | 0.89 | 7.54 | 6.77 | 0.55 |
| Mentougou | 0.65 | 26.31 | 17.62 | 0.80 | 0.81 | 9.65 | 7.61 | 0.47 |
| Pinggu | 0.63 | 29.12 | 17.87 | 0.70 | 0.85 | 7.39 | 5.87 | 0.34 |
| Huairou | 0.65 | 25.66 | 16.98 | 0.87 | 0.81 | 10.61 | 7.46 | 0.39 |
| Miyun | 0.71 | 20.93 | 14.45 | 0.70 | 0.92 | 9.15 | 7.04 | 0.39 |
| Yanqing | 0.74 | 19.19 | 13.55 | 0.50 | 0.76 | 10.86 | 8.94 | 0.40 |
Note: Predictive performance of AM and LSTM on hourly PM2.5 concentration.
Daily PM2.5 concentration for the next day was predicted from the current data on meteorological factors and air pollutants; the prediction results are shown in Table 3. The R2 of LSTM ranged from 0.43 to 0.93, generally lower than that of multi-level AM (0.67 to 0.98). The RMSE of LSTM ranged from 32.46 to 46.82 μg/m3, higher than that of multi-level AM (4.83 to 20.98 μg/m3); the MAE ranged from 24.32 to 34.89 μg/m3, higher than that of multi-level AM (3.67 to 16.33 μg/m3); and the MAPE ranged from 0.92% to 1.74%, higher than that of multi-level AM (0.11% to 0.45%). These results suggest that multi-level AM performs better than LSTM in predicting daily PM2.5 concentration. The observed and predicted daily PM2.5 concentrations from multi-level AM and LSTM in Fengtai (best multi-level AM prediction) and Fangshan (worst multi-level AM prediction) are compared in Fig 2. The daily PM2.5 concentrations predicted by multi-level AM are more consistent with the observations than those predicted by LSTM, which further suggests that multi-level AM performs better than LSTM in predicting daily PM2.5 concentration.
Table 3. Comparison of AM and LSTM on daily mean PM2.5 concentration prediction.
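Both the hourly and the daily experiments are one-step-ahead forecasts: the inputs at one time step are paired with the PM2.5 concentration at the next step. A minimal sketch of that pairing is given below, assuming the records are stored as a time-ordered list of dictionaries. The function name, the `pm25` key and the toy records are hypothetical.

```python
def make_one_step_pairs(rows, target_key="pm25"):
    """Pair each time step's record with the next step's target value."""
    pairs = []
    for current, following in zip(rows, rows[1:]):
        pairs.append((current, following[target_key]))
    return pairs


# Toy daily records (hypothetical values, not taken from the study).
toy_rows = [
    {"pm25": 50.0, "co": 0.8, "temperature": 12.0},
    {"pm25": 60.0, "co": 0.9, "temperature": 14.0},
    {"pm25": 40.0, "co": 0.5, "temperature": 11.0},
]
toy_pairs = make_one_step_pairs(toy_rows)
```

A series of n records yields n - 1 training pairs, because the last record has no following target.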
| District/county | AM R2 | AM RMSE (μg/m3) | AM MAE (μg/m3) | AM MAPE (%) | LSTM R2 | LSTM RMSE (μg/m3) | LSTM MAE (μg/m3) | LSTM MAPE (%) |
|---|---|---|---|---|---|---|---|---|
| Dongcheng | 0.97 | 6.60 | 5.21 | 0.17 | 0.70 | 44.66 | 33.69 | 1.34 |
| Xicheng | 0.98 | 5.78 | 4.72 | 0.15 | 0.83 | 46.03 | 34.89 | 1.45 |
| Chaoyang | 0.94 | 8.95 | 6.87 | 0.20 | 0.86 | 45.23 | 33.82 | 1.33 |
| Haidian | 0.86 | 12.34 | 9.66 | 0.32 | 0.64 | 37.31 | 28.71 | 1.19 |
| Fengtai | 0.98 | 4.83 | 3.84 | 0.11 | 0.93 | 46.82 | 33.56 | 1.26 |
| Shijingshan | 0.98 | 5.54 | 4.36 | 0.17 | 0.73 | 40.12 | 32.12 | 1.35 |
| Fangshan | 0.67 | 20.98 | 16.33 | 0.38 | 0.93 | 44.09 | 33.25 | 0.98 |
| Daxing | 0.97 | 6.45 | 5.10 | 0.15 | 0.68 | 45.61 | 32.97 | 1.44 |
| Tongzhou | 0.95 | 8.38 | 6.46 | 0.18 | 0.63 | 42.96 | 31.45 | 0.92 |
| Shunyi | 0.97 | 6.86 | 5.20 | 0.20 | 0.65 | 42.86 | 33.08 | 1.67 |
| Changping | 0.97 | 5.84 | 4.61 | 0.20 | 0.71 | 38.46 | 29.65 | 1.65 |
| Mentougou | 0.92 | 9.53 | 7.44 | 0.28 | 0.79 | 41.41 | 32.07 | 1.74 |
| Pinggu | 0.98 | 4.83 | 3.67 | 0.13 | 0.50 | 41.72 | 31.89 | 1.21 |
| Huairou | 0.79 | 15.16 | 11.34 | 0.44 | 0.59 | 36.31 | 27.55 | 1.46 |
| Miyun | 0.95 | 6.83 | 5.53 | 0.22 | 0.43 | 32.46 | 24.32 | 1.14 |
| Yanqing | 0.74 | 15.79 | 12.80 | 0.45 | 0.62 | 33.20 | 26.79 | 1.16 |
Note: Predictive performance of AM and LSTM on daily PM2.5 concentration.