Soil salinity estimation by UAV multispectral remote sensing data plays a positive role in salinity monitoring and management. In this study, the spectral data and salinity data acquired during the vegetation cover period were used to construct and validate the estimation model for soil salinity at different depths. In addition, the model performances were validated under different vegetation cover ratios and the impact of cover on salt estimation in different soil layers was also explored.
Feature selection is an important part of constructing the soil salinity estimation model. Using the BDT method, 25 key features were identified, including band reflectance (e.g., blue, red-edge, and near-infrared), spectral indices (e.g., SI2-reg, EVI, and DVI), and texture data (e.g., the mean, variance, and second-order moment of the GLCM). These features captured the spectral properties of the soil and its spatial structure, providing the model with sufficient input variables. Numerous studies have demonstrated that the red-edge band is more sensitive to soil salinity and is strongly correlated with the spectral properties of the vegetation canopy, thereby improving the accuracy of soil salinity estimation30,52. Hu et al.53 used a UAV to acquire remote sensing images of both vegetated and bare areas for modeling and prediction, and the vegetated regions yielded the most accurate predictions. Accordingly, spectral indices that incorporate the red-edge band can significantly increase the accuracy of soil salinity estimation in saline agricultural land with vegetation cover. The red-edge and near-infrared bands showed high importance at all depths in this study, as they responded sensitively to changes in soil moisture and salt content, serving as important indicators for soil salinity prediction. This result is consistent with the findings of Taghadosi et al.54. Spectral indices such as EVI, EVI-reg, and BI indirectly indicated the salinity status of the soil by enhancing the vegetation characteristics. Lobell et al.55 showed that EVI is more reliable than NDVI for salinity monitoring. Spectral indices such as SI2-reg, EVI-reg, and DVI-reg are calculated with the red-edge band, which further supports the importance of the red-edge band in soil salinity monitoring and is consistent with the results of Ma et al.56. Textural features, such as the mean, variance, and contrast of the GLCM, were instrumental in capturing subtle soil surface changes and spatial structure information. Tai57 showed that incorporating texture features could effectively improve the accuracy of soil salinity estimation models under vegetation cover conditions.
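As a minimal illustration of how red-edge variants of spectral indices might be derived, the following Python sketch computes EVI and DVI from per-pixel reflectance and then forms EVI-reg and DVI-reg by substituting the red-edge band for the red band. The substitution rule, the function names, and the reflectance values are assumptions made for illustration; the exact index definitions used in this study are not restated in this section.

```python
def evi(nir, red, blue):
    """Enhanced Vegetation Index with the commonly used coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def dvi(nir, red):
    """Difference Vegetation Index: near-infrared minus red reflectance."""
    return nir - red


# Hypothetical surface reflectance for a single vegetated pixel (0-1 scale).
pixel = {"blue": 0.04, "red": 0.08, "red_edge": 0.20, "nir": 0.40}

evi_val = evi(pixel["nir"], pixel["red"], pixel["blue"])
dvi_val = dvi(pixel["nir"], pixel["red"])

# Assumption: the "-reg" variants replace the red band with the red-edge band.
evi_reg = evi(pixel["nir"], pixel["red_edge"], pixel["blue"])
dvi_reg = dvi(pixel["nir"], pixel["red_edge"])

for name, value in [("EVI", evi_val), ("EVI-reg", evi_reg),
                    ("DVI", dvi_val), ("DVI-reg", dvi_reg)]:
    print(f"{name}: {value:.4f}")
```

In practice the same functions would be applied band-wise to whole reflectance rasters rather than to a single pixel.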
The choice of variable group also influenced the performance of the estimation model. In the single-variable modelling groups, spectral indices contributed most to soil salinity estimation, with the RF model demonstrating high accuracy across all four depths (Fig. 4). Bian et al.58 found that spectral indices are correlated with soil salinity, which makes them valuable for salinity estimation. Although the accuracy of the models built on band reflectance was slightly lower than that of the models built on spectral indices, the overall difference was not significant, in agreement with Wang et al.59, who evaluated the prediction accuracy of different variable groups for soil salinity in an oasis. In contrast, texture data performed poorly as a single variable group. However, when combined with other variables, the texture information from multispectral images improved the accuracy of soil salinity estimation at different depths under vegetation cover. Zheng et al.18 showed that combining texture data with spectral information significantly improved the accuracy of rice biomass estimation. The texture features of the UAV multispectral images provided rich information60, making them applicable for estimating soil salinity at different depths. Incorporating texture data alongside other variables notably improved model accuracy, particularly in the RF model. This study also demonstrates that using sensitive bands and spectral indices as input variables leads to better modeling and validation outcomes in soil salinity estimation models. Nevertheless, in multivariate groups, variables may affect and constrain each other, and combining variables of high importance does not always achieve optimal results. For instance, in the BPNN model, the group combining all variables did not achieve the best accuracy relative to the other groups. Introducing an excessive number of independent variables could lead to information redundancy, overfitting, and a decrease in model accuracy61.
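The texture variables discussed above are statistics of a gray-level co-occurrence matrix (GLCM). The following Python sketch shows one way such statistics could be computed for a small quantized image patch. The 4 x 4 patch, the four gray levels, the one-pixel horizontal offset, and the symmetric normalization are all assumptions chosen for illustration, and the formulas follow common Haralick-style definitions that may differ in detail from the software used in this study.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # pair (i, j)
                counts[j][i] += 1  # mirrored pair, making the matrix symmetric
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Mean, variance, contrast, and second-order (angular second) moment."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, _, v in cells)
    variance = sum((i - mean) ** 2 * v for i, _, v in cells)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    second_moment = sum(v * v for _, _, v in cells)
    return {"mean": mean, "variance": variance,
            "contrast": contrast, "second_moment": second_moment}


# Hypothetical patch, already quantized to gray levels 0..3.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

p = glcm(patch, levels=4)
features = glcm_features(p)
print(features)

# A perfectly uniform patch has no contrast and maximal second-order moment.
flat = glcm_features(glcm([[1, 1], [1, 1]], levels=4))
```

Texture features of this kind are usually computed in a moving window over each band, so that every sampling point receives one value per statistic and per band.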
Previous studies have highlighted the superiority of machine learning methods in estimating soil salinity content due to the complexity and indirect relationships between variables62. In this study, RF, SVM, GPR, and BPNN were used to model soil salinity at different depths and to select the most effective model. The evaluation metrics revealed that RF and GPR have advantages in estimating soil salinity. Although SVM has strong generalization capability for handling nonlinear problems, it is sensitive to noisy data63. The models exhibited varying levels of accuracy in soil salinity estimation at different depths. The GPR model demonstrated better prediction accuracy than the other methods for surface and deep soils, while the RF model performed well in intermediate soils. Given the complexity of the soil salinization mechanism and the intricate nonlinear relationship between soil spectral characteristics and soil salinity, machine learning algorithms are well suited to revealing these relationships. Their strong nonlinear fitting and generalization capabilities make them ideal for simulating the complex interactions among variables. Ma et al.56 found that the RF model gave better estimation results when predicting soil salinity in the Ebinur Lake wetland using multispectral and Digital Elevation Model (DEM) data. Machine learning algorithms have therefore shown superior performance in both the modeling and validation of soil salinity estimation models. For instance, Wei et al.5 developed SSC estimation models using RF, SVM, and BPNN algorithms, with the RF model achieving the highest prediction accuracy (R² = 0.84). Additionally, Zhu et al.64 and Yu et al.65 concluded that RF-based SSC estimation models had high accuracy.
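Model selection of this kind rests on comparing validation metrics across candidate models. The sketch below, written in Python, computes the coefficient of determination and the root mean square error for two sets of predictions and picks the one with the higher R². The observed values, the predictions, and the model labels are hypothetical and are not results from this study. R² is computed here as one minus the ratio of residual to total sum of squares, which is an assumption, since some studies report the squared correlation coefficient instead.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical validation data for one soil layer (arbitrary salinity units).
observed = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "model_a": [1.1, 1.9, 3.2, 3.8],
    "model_b": [2.0, 2.0, 3.0, 3.0],
}

scores = {
    name: {"r2": r_squared(observed, pred), "rmse": rmse(observed, pred)}
    for name, pred in predictions.items()
}

# Select the candidate with the highest R2 on the validation set.
best = max(scores, key=lambda name: scores[name]["r2"])

for name, s in scores.items():
    print(f"{name}: R2 = {s['r2']:.3f}, RMSE = {s['rmse']:.3f}")
print("best:", best)
```

Repeating this comparison separately for each soil depth gives a depth-specific choice of model, which is how different algorithms can emerge as best for different layers.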
The essence of estimating soil salinity during the vegetation cover period is to obtain information on soil salinity indirectly, through the spectral response of the crop to soil salinity. The vegetation cover ratio of a given crop in the same period may correlate with the degree of soil salinity. The depth at which the estimation model performed best varied with vegetation cover conditions: the optimal model was obtained for the 0–10 cm layer under both low and medium vegetation cover, and for the 20–30 cm layer under high vegetation cover. During the vegetation cover period, crops absorbed soil water mainly through their lateral roots. Consequently, the salinity of the soil layer in which the lateral roots are located significantly affects crop water absorption66, causing crop growth stress. Higher soil salinity levels result in more severe stress and poorer vegetation growth, which is indirectly expressed in the vegetation canopy.