For the locations situated in the Triângulo Mineiro region, the meteorological data used for estimating leaf wetness included average relative humidity, maximum relative humidity, minimum relative humidity, average air temperature, maximum air temperature, dew point temperature, precipitation, and wind speed. These data were selected because they showed the highest correlation with the observed LWD for the three study locations (Fig. 3).
For the municipality of Araguari, the variables most correlated with observed LWD were maximum and average relative humidity (Spearman's ρ = 0.93), followed by minimum relative humidity (Spearman's ρ = 0.86), and dew point temperature (Spearman's ρ = 0.68) (Fig. 3a). Both maximum air temperature and global solar radiation were negatively correlated with observed LWD, as higher solar radiation readings are associated with clear and warmer days, increasing the evaporation of water from leaf surfaces.
A similar pattern was observed for Araxá (Fig. 3b) and Patrocínio (Fig. 3c), where average, minimum, and maximum relative humidity were the parameters most correlated with observed leaf wetness duration. Relative humidity was positively correlated with LWD and, due to its importance, is often used as a single variable for estimating leaf wetness duration.
Regarding locations in the southern region of Minas Gerais, LWD in Boa Esperança (Fig. 4a) and Muzambinho (Fig. 4c) showed the highest correlation with average (Spearman's ρ = 0.91 and 0.93), maximum (Spearman's ρ = 0.88), and minimum (Spearman's ρ = 0.82 and 0.87) relative humidity. The same variables also showed high correlation with leaf wetness duration for Carmo de Minas and Varginha, albeit in a different order (Fig. 4b and 4d).
For all locations in the southern region of Minas Gerais, maximum air temperature and global solar radiation were negatively correlated with LWD, the study's dependent variable. This is expected because relative humidity, a variable highly correlated with LWD, is the ratio between the partial pressure of water vapor in the air and the saturation vapor pressure of water, and the latter increases with air temperature.
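The temperature dependence of relative humidity can be illustrated with a short sketch. It assumes the Tetens approximation for saturation vapor pressure, which is a common choice but is not stated in the text; the function names are hypothetical.

```python
from math import exp

def saturation_vapor_pressure_kpa(temp_c):
    """Tetens approximation of saturation vapor pressure over water (kPa)."""
    return 0.6108 * exp(17.27 * temp_c / (temp_c + 237.3))

def relative_humidity_pct(vapor_pressure_kpa, temp_c):
    """Relative humidity as the ratio of actual to saturation vapor pressure."""
    return 100.0 * vapor_pressure_kpa / saturation_vapor_pressure_kpa(temp_c)

# Keep the actual vapor pressure fixed (air saturated at 18 °C) and warm the air:
# the saturation vapor pressure rises, so relative humidity falls.
e_actual = saturation_vapor_pressure_kpa(18.0)
rh_by_temp = {t: relative_humidity_pct(e_actual, t) for t in (18.0, 24.0, 30.0)}

for temp_c, rh in rh_by_temp.items():
    print(f"T = {temp_c:.0f} °C -> RH = {rh:.1f} %")
```

With the water vapor content unchanged, warmer air yields lower relative humidity, which is consistent with the negative correlation between maximum air temperature and LWD.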
Of the 13 parameters subjected to Spearman correlation, the meteorological variables used for leaf wetness estimation were average air temperature (Fig. 5a), maximum air temperature (Fig. 5b), dew point temperature (Fig. 5c), average relative humidity (Fig. 5d), maximum relative humidity (Fig. 5e), minimum relative humidity (Fig. 5f), wind speed (Fig. 5g), and precipitation (Fig. 5h), because they showed the highest correlation with observed LWD across the study locations.
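The screening step can be sketched as follows: compute Spearman's ρ between each candidate variable and observed LWD, then rank the candidates by absolute correlation. This is a minimal illustration, not the procedure used in the study; the data values and variable names (`lwd_hours`, `rh_mean`, `t_max`) are invented for the example.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical daily values, invented for illustration only.
lwd_hours = [2.0, 5.5, 8.0, 11.0, 0.5, 6.5]
candidates = {
    "rh_mean": [55.0, 70.0, 82.0, 91.0, 48.0, 76.0],
    "t_max": [31.0, 28.5, 26.0, 23.5, 33.0, 27.0],
}

rho = {name: spearman_rho(series, lwd_hours) for name, series in candidates.items()}
# Rank candidates by the strength of the monotonic association, ignoring sign.
ranked = sorted(rho, key=lambda name: abs(rho[name]), reverse=True)
```

Because Spearman's ρ works on ranks, it captures monotonic but nonlinear relationships, such as the one between relative humidity and LWD.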
Although meteorological conditions show similar patterns among different locations in Minas Gerais, sites in the Triângulo Mineiro region (Araxá, Araguari, and Patrocínio) have higher air temperatures and wind speeds than locations in the South of Minas Gerais (Boa Esperança, Carmo de Minas, Muzambinho, and Varginha), and therefore also have lower relative humidity. Low dew point depression values, for example, mean that the air is very humid and there is a greater likelihood of condensation. According to Sentelhas et al. (2008), dew point depression values below 2°C can be used as an indicator of dew presence on plant leaves.
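The dew point depression criterion attributed to Sentelhas et al. (2008) can be sketched as a simple hourly count. The 2 °C threshold comes from the text; the hourly readings and function names are hypothetical.

```python
def dew_point_depression(temp_c, dew_point_c):
    """Difference between air temperature and dew point temperature (°C)."""
    return temp_c - dew_point_c

def hours_with_likely_dew(temps_c, dew_points_c, threshold_c=2.0):
    """Count hourly readings whose dew point depression is below the threshold."""
    return sum(
        1
        for t, td in zip(temps_c, dew_points_c)
        if dew_point_depression(t, td) < threshold_c
    )

# Hypothetical hourly readings (°C), invented for illustration only.
air_temp = [18.0, 17.2, 16.5, 16.0, 19.5, 24.0]
dew_point = [15.5, 15.6, 15.4, 15.2, 15.8, 16.0]

dew_hours = hours_with_likely_dew(air_temp, dew_point)
```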
The LWD obtained from the NHUR > 90% model (number of hours with relative humidity above 90%), used as observed LWD, showed greater variability for locations in the Triângulo Mineiro than for locations in the southern region of Minas, with a seasonal pattern similar to the relative humidity distribution in these locations (Fig. 6). The highest LWDs were recorded from November to June, with values above 6 hours in most assessed locations, as also observed by Alvares et al. (2015). Araguari, Araxá, and Patrocínio had the lowest LWD values because of their lower relative humidity. Regions characterized by high relative humidity, high precipitation, milder temperatures, and lower wind speeds correspond to locations with higher LWD. Carmo de Minas showed the highest LWDs, reaching values above 10 hours during almost the entire spring and summer, and above 7 hours even in drier months such as August. In contrast, locations in the Triângulo Mineiro (Araguari, Araxá, and Patrocínio) had values below 1 hour, mainly between August and September. Urashima et al. (2018) reported that the area of orange rust lesions on sugarcane varied with LWD.
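The observed LWD in this study comes from a relative humidity threshold (NHUR > 90%). A minimal sketch of such a criterion follows, assuming hourly readings and assuming that an hour counts as wet when relative humidity is strictly above 90%; the function name and the sample day are hypothetical.

```python
def lwd_from_rh(hourly_rh_pct, threshold_pct=90.0, hours_per_reading=1.0):
    """Estimate daily leaf wetness duration (hours) as the time with RH above a threshold."""
    return sum(hours_per_reading for rh in hourly_rh_pct if rh > threshold_pct)

# Hypothetical 24 hourly relative humidity readings (%), starting at midnight.
rh_day = [
    95, 96, 97, 97, 96, 94, 91, 85, 76, 68, 60, 55,
    52, 50, 51, 56, 63, 72, 80, 86, 89, 90, 92, 93,
]

lwd_hours = lwd_from_rh(rh_day)
```

In this invented example the wet hours fall at night and in the early morning, when relative humidity peaks, so the daily LWD follows the relative humidity cycle.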
With the hyperparameters listed in Table 2, all models performed satisfactorily during the training stage (Table 3), with the MLP model outperforming the RF and SVM models. For the locations in the Triângulo Mineiro region, the MLP model achieved a high coefficient of determination (R² = 0.99), with a root mean square error (RMSE) of 28.2 minutes day⁻¹ for leaf wetness duration. For the southern region of Minas, the Carmo de Minas and Varginha sites showed an R² of 0.98 and even lower RMSEs, of 22.2 and 24.6 minutes day⁻¹, respectively, for the same model.
Table 2
Hyperparameters used in the Random Forest (RF), Multilayer Perceptron (MLP), and Support Vector Machine (SVM) models for estimating leaf wetness duration (LWD).
Model | Hyperparameter | Value |
RF | max_depth | 13 |
RF | n_estimators | 8 |
RF | min_impurity_decrease | 0 |
MLP | hidden_layer_sizes | (4, 4, 2) |
MLP | learning_rate_init | 0.05 |
MLP | learning_rate | adaptive |
MLP | activation | tanh |
MLP | solver | adam |
MLP | alpha | 0.05 |
MLP | random_state | 20 |
SVM | kernel | rbf |
SVM | C | 100 |
SVM | gamma | scale |
Legend: max_depth: maximum depth of the tree; n_estimators: number of trees in the forest; min_impurity_decrease: a node will be split if this split induces a decrease of the impurity greater than or equal to this value; hidden_layer_sizes: the nth element represents the number of neurons in the nth hidden layer; learning_rate_init: initial learning rate used; learning_rate: learning rate schedule for weight updates; activation: activation function for the hidden layer; solver: solver for weight optimization; alpha: L2 penalty parameter; random_state: determines random number generation for weights and bias initialization, test/train split if early stopping is used, and batch sampling when solver = 'sgd' or 'adam'; kernel: specifies the kernel type used in the algorithm; C: regularization parameter; gamma: kernel coefficient. Source: Pedregosa et al. (2011).
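As a minimal sketch, the settings in Table 2 map onto scikit-learn estimators as shown below. The regression variants (`RandomForestRegressor`, `MLPRegressor`, `SVR`) are an assumption, chosen because LWD is a continuous target evaluated with R², RMSE, and MAE; the text does not name the classes used. The `build_models` helper is hypothetical, and all parameters not listed in Table 2 are left at library defaults.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

def build_models():
    """Instantiate the three estimators with the hyperparameters from Table 2."""
    return {
        "RF": RandomForestRegressor(
            max_depth=13,
            n_estimators=8,
            min_impurity_decrease=0.0,
        ),
        "MLP": MLPRegressor(
            hidden_layer_sizes=(4, 4, 2),
            learning_rate_init=0.05,
            # In scikit-learn this schedule only takes effect with solver="sgd".
            learning_rate="adaptive",
            activation="tanh",
            solver="adam",
            alpha=0.05,
            random_state=20,
        ),
        "SVM": SVR(kernel="rbf", C=100, gamma="scale"),
    }

models = build_models()
for name, model in models.items():
    print(name, type(model).__name__)
```

Each estimator would then be trained with `model.fit(X_train, y_train)`, where `X_train` holds the selected meteorological variables and `y_train` the observed LWD; whether inputs were scaled beforehand is not stated in the text.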
Francl and Panigrahi (1997), using artificial neural networks, correctly classified 90% of cases when predicting leaf wetness in wheat. Dalla Marta et al. (2005) used precipitation, air temperature, relative humidity, wind speed and direction, atmospheric pressure, and solar radiation as inputs to an artificial neural network model for predicting LWD and achieved an acceptable error of 6 to 7 minutes on average.
On average, the SVM model showed higher MAE and RMSE than the RF model. Asadi and Tian (2021) found similar results when estimating LWD with RF using meteorological data from the ERA5 reanalysis, with an accuracy of 78.2% and an MAE of 2.8 h. Gillespie et al. (2021), in a comparative study of empirical and machine learning models, concluded that machine learning models were significantly more accurate than empirical models in predicting leaf wetness.
Table 3
Training performance of the Multilayer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM) models for different locations in Minas Gerais.
Region | | Triângulo Mineiro | Triângulo Mineiro | Triângulo Mineiro | Sul de Minas Gerais | Sul de Minas Gerais | Sul de Minas Gerais | Sul de Minas Gerais |
Model | Metric | Araguari | Araxá | Patrocínio | Boa Esperança | Carmo de Minas | Muzambinho | Varginha |
MLP | R² | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 | 0.98 |
MLP | RMSE | 0.46 | 0.49 | 0.47 | 0.44 | 0.37 | 0.47 | 0.41 |
MLP | MAE | 0.33 | 0.35 | 0.33 | 0.28 | 0.19 | 0.31 | 0.28 |
RF | R² | 0.98 | 0.98 | 0.98 | 0.97 | 0.94 | 0.98 | 0.96 |
RF | RMSE | 0.54 | 0.65 | 0.60 | 0.66 | 0.67 | 0.63 | 0.66 |
RF | MAE | 0.30 | 0.38 | 0.33 | 0.44 | 0.43 | 0.37 | 0.38 |
SVM | R² | 0.98 | 0.98 | 0.98 | 0.98 | 0.96 | 0.98 | 0.97 |
SVM | RMSE | 0.57 | 0.58 | 0.56 | 0.57 | 0.56 | 0.56 | 0.56 |
SVM | MAE | 0.42 | 0.44 | 0.44 | 0.45 | 0.42 | 0.44 | 0.44 |
Legend: R²: coefficient of determination (0 to 1); RMSE: root mean square error (hours); MAE: mean absolute error (hours).
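The three metrics in Table 3 can be computed as in the sketch below, which also converts errors from hours to minutes, the unit used in the running text (for example, an RMSE of 0.37 h corresponds to 22.2 minutes). The observed and predicted values are invented for illustration, and the function names are hypothetical.

```python
from math import sqrt

def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error, in the units of the inputs."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def hours_to_minutes(hours):
    """Convert an error expressed in hours to minutes."""
    return hours * 60.0

# Hypothetical observed and predicted daily LWD values (hours).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 4.0, 5.5, 8.0]

print(f"R2   = {r2_score(observed, predicted):.3f}")
print(f"RMSE = {hours_to_minutes(rmse(observed, predicted)):.1f} min")
print(f"MAE  = {hours_to_minutes(mae(observed, predicted)):.1f} min")
```

RMSE penalizes large errors more heavily than MAE, so a model can have a lower RMSE than another while having a higher MAE.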
The MLP model remained superior to the other models in the testing phase for the Triângulo Mineiro region (Fig. 7). All locations showed high R² and low RMSE for estimating LWD with the MLP model; Araguari yielded the best results, with an R² of 0.98 and a mean absolute error of 20.4 minutes (Fig. 7a).
Mashonjowa et al. (2013) obtained higher root mean square errors than those of the present study when estimating leaf wetness duration in roses using empirical models with relative humidity data (RMSE = 2.3 h day⁻¹). Typically, machine learning models reduce RMSE and MAE by about 2 hours compared to empirical models (Asadi and Tian, 2021).
Despite the good performance, there is a tendency for the models to overestimate lower values and underestimate higher values of leaf wetness duration. Dalla Marta et al. (2005) highlight a tendency of the Recurrent Neural Network to underestimate leaf wetness prediction values in northern Italy.
Similar performance was observed for areas located in the southern region of Minas Gerais, where the MLP model showed the best performance for all locations (Fig. 8). Among the locations, the MLP estimate of LWD for Carmo de Minas was the best (Fig. 8d, e, and f), with a high coefficient of determination (R² = 0.98) and low errors (RMSE = 24.6 minutes and MAE = 12 minutes). This result is possibly related to the lower variability of meteorological variables in Carmo de Minas.
Compared with the Triângulo Mineiro, the results for the southern region of Minas were better because the model overestimated the lowest values of leaf wetness duration. As Araguari, Araxá, and Patrocínio have lower LWD than Boa Esperança, Carmo de Minas, Muzambinho, and Varginha in the driest months, their values were overestimated by the model.
Kim et al. (2005), estimating LWD using two different models, concluded that the Fuzzy model overestimated LWD during the rainy season, whereas on days when no leaf wetness was observed, the CART model identified many hours as wet. Besides model selection, the choice of input data is also crucial, and models based on satellite data may often be superior to models based on observed relative humidity, for example (Asadi and Tian, 2021).