This section has two parts: the first presents a spatiotemporal analysis of AD within TMA-Manaus, and the second evaluates the performance of a set of ML algorithms in forecasting CS in TMA-Manaus.
4.1 Atmospheric Discharges
To characterize the distribution of AD counts and establish the severity thresholds of CS, Figure 1 presents the boxplot and histogram of DR-AD within TMA-Manaus. The quartiles were as follows: the 1st Quartile (Q1) at 69.0, the median (Q2) at 283.5, the 3rd Quartile (Q3) at 1063.5, and the Upper Fence [Q3 + (1.5 * IQR)] at 2555.75. These quantities define the severity thresholds. From 2012 to 2017, only 21 days had DR-AD = 0, while 248 days exceeded the upper fence, indicating significant events. We determined that a rate of 5000 DR-AD is associated with more severe CS, with considerable potential impact on air traffic within TMA-Manaus. There were 109 days exceeding 5000 DR-AD.
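The thresholds above follow the usual boxplot construction. The sketch below illustrates it, assuming the daily totals are available as a plain list and that quartiles are taken with linear interpolation between observations (the quantile method is not stated in the text). The function name and the sample values are hypothetical and are not data from the study.

```python
from statistics import quantiles


def severity_thresholds(daily_counts):
    """Return Q1, median, Q3 and the upper fence Q3 + 1.5 * IQR."""
    # method="inclusive" interpolates linearly between observed values
    # (an assumption; other quantile conventions give slightly different cuts).
    q1, q2, q3 = quantiles(daily_counts, n=4, method="inclusive")
    iqr = q3 - q1
    return {"q1": q1, "median": q2, "q3": q3, "upper_fence": q3 + 1.5 * iqr}


# Hypothetical daily discharge totals, for illustration only.
example = [0, 12, 69, 150, 284, 900, 1064, 2600, 5200]
thresholds = severity_thresholds(example)

# Count the days whose total exceeds the upper fence.
days_above_fence = sum(1 for count in example if count > thresholds["upper_fence"])
```

Each returned value can then serve as one cut-off k in the rule DR-AD > k.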
Figure 2 displays the percentage distribution of AD events from two datasets. The first is the complete set of AD from 2012 to 2017, consisting of 2,422,173 events (labeled as "All"). The second is the dataset obtained after step 3.3c of the methodology, which aligns AD data with radiosonde data; it consists of 1,692,333 events (labeled as "Used") that were used to train and test the ML algorithms. Figure 2 includes (a) the hourly distribution of AD, (b) the monthly distribution, and (c) the yearly distribution. We include the "Used" distribution to demonstrate its similarity to the "All" distribution.
From Figure 2a, 73.68% of AD events occur between 15Z and 21Z, and 51.2% occur between 16Z (12:00 Local Time) and 19Z (15:00 Local Time), indicating that the afternoon is, as expected, the period with the greatest impact on aviation. Since the input data are acquired at 12Z and the peak window begins at 16Z, the forecast lead time is set to 4 hours. Figure 2b shows that AD occurrences are highest during the spring months, from August to November. Figure 2c indicates that 56.1% of AD events occurred in 2013 and 2014. No correlation with the El Niño Southern Oscillation phenomenon was observed; other climate variability parameters, such as Sea Surface Temperature and variation in the position of the ITCZ, were not evaluated.
Figure 3 illustrates the monthly spatial distribution of AD density in the studied area in the period of 2012 to 2016. Months with higher AD counts (Figure 2b) show a more homogeneous spatial distribution, whereas the remaining months show isolated cores of AD. Notably, AD occurs in every month of the year.
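The hourly shares quoted from Figure 2a amount to counting events per UTC hour and summing over a window. The sketch below shows that calculation, assuming each event is reduced to its integer UTC hour and that the window includes both end hours (the text does not say whether the bounds are inclusive). The function names are hypothetical, and the UTC-4 offset is inferred from the 16Z = 12:00 Local Time equivalence given above.

```python
from collections import Counter


def share_in_window(event_hours_utc, start_hour, end_hour):
    """Percentage of events whose UTC hour lies in [start_hour, end_hour]."""
    counts = Counter(event_hours_utc)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Assumption: both bounds of the window are included.
    in_window = sum(counts[hour] for hour in range(start_hour, end_hour + 1))
    return 100.0 * in_window / total


def utc_to_local_hour(hour_utc, offset_hours=-4):
    """Convert a UTC hour to local time (offset inferred from 16Z = 12:00)."""
    return (hour_utc + offset_hours) % 24


# Hypothetical event hours, for illustration only.
sample_hours = [15, 16, 16, 22]
afternoon_share = share_in_window(sample_hours, 15, 21)
```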
4.2 ML Algorithms
Each of the 24 thermodynamic indices derived from the SBMN 12Z radiosonde encapsulates critical aspects of atmospheric conditions, offering valuable insights for classifying convective activity. To improve the efficiency of the ML models while minimizing redundancy and computational complexity, we analyzed inter-feature correlations (methodology step 3.3e). This procedure improves the interpretability of the trained model and enhances its predictive performance by mitigating multicollinearity and overfitting.
By identifying and eliminating highly correlated features, the dataset was streamlined to focus on the most informative and discriminative predictors. The resulting subset of features represents a balanced selection that captures essential atmospheric thermodynamics relevant to CS prediction. After this processing, 16 features remained out of the initial 24: SWET, BRCH, CAPE, LIFT, SHOW, KINX, EQLV, CINS, CINV, LFCT, VTOT, LFCV, MLTH, PWAT, THTK, and LCLT.
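One common way to carry out this kind of correlation-based elimination is a greedy filter on the absolute Pearson correlation. The sketch below is an illustration under stated assumptions, not the procedure of methodology step 3.3e: the cut-off of 0.9, the rule of keeping the first feature of each correlated pair, and all names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def drop_correlated(features, max_abs_corr=0.9):
    """Keep a feature only if it is not highly correlated with any kept one.

    features: dict mapping feature name -> list of values (insertion order
    decides which member of a correlated pair survives; an assumption).
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= max_abs_corr for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns, for illustration only.
demo = {"f1": [1, 2, 3, 4], "f2": [2, 4, 6, 8], "f3": [4, 1, 3, 2]}
selected = drop_correlated(demo)
```

Here `f2` is an exact multiple of `f1`, so it is dropped, while the weakly correlated `f3` is retained.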
Table 2 presents the percentage discrepancy of each of the 16 features between CS and non-CS events according to the defined severity threshold, with days having DR-AD > k identified as YES, where k = 69, 283.5, 1063.5, 2555.75, or 5000, and NO otherwise. Features with the greatest discrepancy between YES and NO events are expected to be particularly influential. The highest discrepancy values are linked to the following 5 features: SHOW, BRCH, CAPE, LIFT, and EQLV. These 5 indices are pivotal indicators of CS occurrence over TMA-Manaus, given their capacity to assess atmospheric stability and wind shear.
For example, the SHOW index evaluates thunderstorm potential in weather forecasting, with positive values indicating stable atmospheric conditions less conducive to thunderstorm formation, while negative values suggest increased instability, raising the likelihood of convective activity and thunderstorms (Doswell III, Davies-Jones & Keller, 1993). Similarly, BRCH values denote heightened instability, promoting storm development through convective cloud formation and heavy precipitation (Brooks & Craven, 2002). Moreover, BRCH considers vertical wind shear (Holton, 2004), prevalent in tropical regions like Manaus, influencing storm structure and intensity.
CAPE represents the available energy driving convective cloud development, with elevated values indicating heightened convective potential due to factors such as humidity, warm temperatures, and solar heating, facilitating atmospheric instability and convective cloud formation (Moncrieff, 2010). LIFT assesses atmospheric stability by comparing the temperature of a lifted air parcel to that of its environment at a specified pressure level, typically 500 hPa. Negative LIFT values suggest warmer, less dense parcels, indicating instability favorable to convective activity, while positive values imply stability, inhibiting convective development (Emanuel, 1994). The EQLV marks the altitude where a lifted air parcel becomes cooler than its surrounding environment, ceasing to ascend freely (Wallace & Hobbs, 2006).
Together, these features offer crucial insights into atmospheric thermodynamics, essential for comprehending and forecasting CS in TMA-Manaus. The following discussion specifically addresses the relevance of these features during ML training.
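The YES/NO labelling and a per-feature discrepancy can be sketched as follows. The labelling rule DR-AD > k comes from the text. The discrepancy formula is an assumption, because this section does not define it: here it is taken as the absolute difference between the YES and NO class means, relative to the NO mean. All names and sample values are hypothetical.

```python
from statistics import mean


def label_days(daily_counts, k):
    """Label each day YES (True) when DR-AD > k, NO (False) otherwise."""
    return [count > k for count in daily_counts]


def percentage_discrepancy(values, labels):
    """Relative difference (%) between class means of one feature.

    Assumed definition: 100 * |mean_YES - mean_NO| / |mean_NO|.
    """
    yes = [v for v, is_yes in zip(values, labels) if is_yes]
    no = [v for v, is_yes in zip(values, labels) if not is_yes]
    if not yes or not no:
        raise ValueError("both classes need at least one day")
    mean_yes, mean_no = mean(yes), mean(no)
    if mean_no == 0:
        raise ValueError("NO-class mean is zero; relative difference undefined")
    return 100.0 * abs(mean_yes - mean_no) / abs(mean_no)


# Hypothetical daily totals and index values, for illustration only.
labels = label_days([10, 500, 80, 20], k=69)
discrepancy = percentage_discrepancy([2.0, 6.0, 4.0, 2.0], labels)
```

Under this assumed definition, a feature whose NO-class mean is close to zero produces very large percentages, so the values are best compared within a feature across thresholds rather than read as absolute effect sizes.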
Table 2 Intra-feature percentage discrepancy for DR-AD > k, identified as YES where k = 69, 283.5, 1063.5, 2555.75, 5000, and NO otherwise
Percentage discrepancy (%) according to DR-AD threshold

| # | Feature | 69 | 283.5 | 1063.5 | 2555.75 | 5000 |
|---|---------|-----|-------|--------|---------|------|
| 1 | SHOW | 114.59 | 139.03 | 171.85 | 173.26 | 258.96 |
| 2 | BRCH | 42.90 | 4.98 | 39.08 | 39.44 | 50.11 |
| 3 | CAPE | 34.89 | 33.29 | 26.37 | 26.59 | 30.60 |
| 4 | LIFT | 47.18 | 37.84 | 27.64 | 19.63 | 20.90 |
| 5 | EQLV | 18.85 | 15.49 | 12.35 | 11.74 | 14.12 |
| 6 | CINS | 27.24 | 21.95 | 7.89 | 9.34 | 5.79 |
| 7 | KINX | 8.75 | 7.00 | 6.20 | 5.45 | 5.91 |
| 8 | CINV | 20.61 | 14.17 | 0.53 | 0.11 | 2.95 |
| 9 | SWET | 2.09 | 2.31 | 1.21 | 1.16 | 1.69 |
| 10 | LFCT | 3.57 | 2.79 | 1.65 | 1.89 | 1.38 |
| 11 | VTOT | 0.80 | 0.81 | 1.32 | 0.71 | 0.90 |
| 12 | LFCV | 2.38 | 1.94 | 0.81 | 0.96 | 0.55 |
| 13 | MLTH | 0.12 | 0.05 | 0.00 | 0.05 | 0.12 |
| 14 | PWAT | 6.12 | 4.33 | 1.90 | 0.24 | 0.07 |
| 15 | THTK | 0.06 | 0.04 | 0.08 | 0.11 | 0.07 |
| 16 | LCLT | 0.28 | 0.21 | 0.08 | 0.02 | 0.03 |
Each ML algorithm (methodology step 3.3i) was executed three times, using 5, 16, and 24 features as input, according to the order of relevance established in the index of Table 2. Figure 4 presents five scatterplots of POD vs. FAR for the ML algorithms for the 5 DR-AD thresholds. Figure 5 summarizes the mean behavior of POD and FAR results using the 16 selected features. To show the CS prediction performance for the lowest DR-AD threshold (69), Table 3 presents the results of POD and FAR, for each model and the number of input features used in the run, ranked by POD value.
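POD and FAR can be computed from the contingency counts of a YES/NO forecast. The sketch below assumes the standard forecast-verification definitions, POD = hits / (hits + misses) and FAR = false alarms / (hits + false alarms), with FAR read as the false alarm ratio. The methodology section remains the authoritative definition, and the function name is hypothetical.

```python
def pod_far(observed, predicted):
    """Return (POD, FAR) for paired boolean observations and forecasts."""
    pairs = list(zip(observed, predicted))
    hits = sum(1 for obs, pred in pairs if obs and pred)
    misses = sum(1 for obs, pred in pairs if obs and not pred)
    false_alarms = sum(1 for obs, pred in pairs if not obs and pred)

    # POD: fraction of observed YES events that were forecast.
    pod = hits / (hits + misses) if (hits + misses) else float("nan")
    # FAR (assumed ratio form): fraction of YES forecasts that did not verify.
    far = (
        false_alarms / (hits + false_alarms)
        if (hits + false_alarms)
        else float("nan")
    )
    return pod, far


# Hypothetical observations and forecasts, for illustration only.
observed = [True, True, True, False, False]
predicted = [True, True, False, True, False]
pod, far = pod_far(observed, predicted)
```

With these definitions a model that always forecasts YES reaches a POD of 1 while its FAR equals the fraction of NO days, which is why the two scores are read together in the scatterplots.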
Figure 4 shows that model performance decreases and the dispersion among models increases as the DR-AD threshold increases. Lower thresholds give the best results, probably because they yield a greater number of YES events. In addition, the false positive rate is high for low thresholds, where YES events predominate, and the false negative rate is high for high thresholds, where NO events predominate (Figure 1b).
For DR-AD = 69, the ML models show similar and acceptable performance, with mean POD and FAR values of 0.92±0.06 and 0.19±0.01, respectively, as illustrated in Figures 4a and 5. According to Table 3, the best performance was achieved by the QDA algorithm with 24 features, with a POD of 0.99 and a FAR of 0.19. LDA and Logistic Regression achieved comparable results (POD of 0.99 and FAR of 0.20) with only 5 features. The Decision Tree presented the worst results, with POD between 0.76 and 0.80.
Table 3 Performance of the 10 ML algorithms for DR-AD=69 using 5, 16, and 24 features. The results are ranked by POD value
| Model | # Features | POD | FAR |
|-------|------------|-----|-----|
| QDA | 24 | 0.99 | 0.19 |
| LDA | 5 | 0.99 | 0.20 |
| Logistic | 5 | 0.99 | 0.20 |
| ExtraTrees | 24 | 0.96 | 0.18 |
| ExtraTrees | 16 | 0.96 | 0.18 |
| Logistic | 24 | 0.96 | 0.18 |
| QDA | 5 | 0.96 | 0.19 |
| Logistic | 16 | 0.96 | 0.19 |
| RF | 24 | 0.95 | 0.18 |
| LDA | 24 | 0.95 | 0.18 |
| LDA | 16 | 0.95 | 0.18 |
| RF | 16 | 0.94 | 0.18 |
| KNN | 24 | 0.94 | 0.19 |
| GBoosting | 24 | 0.93 | 0.17 |
| ExtraTrees | 5 | 0.93 | 0.19 |
| KNN | 5 | 0.93 | 0.19 |
| KNN | 16 | 0.93 | 0.19 |
| GBoosting | 5 | 0.93 | 0.20 |
| GBoosting | 16 | 0.92 | 0.18 |
| AdaBoost | 16 | 0.92 | 0.19 |
| RF | 5 | 0.92 | 0.19 |
| AdaBoost | 5 | 0.91 | 0.18 |
| QDA | 16 | 0.90 | 0.18 |
| AdaBoost | 24 | 0.89 | 0.18 |
| Bagging | 16 | 0.87 | 0.17 |
| Bagging | 5 | 0.86 | 0.18 |
| Bagging | 24 | 0.85 | 0.17 |
| DecisionTree | 16 | 0.80 | 0.19 |
| DecisionTree | 5 | 0.80 | 0.20 |
| DecisionTree | 24 | 0.76 | 0.18 |