Melia azedarach L. (Meliaceae) is a fast-growing species whose timber is suited to multiple uses, including construction and the manufacture of furniture, farm tools, boats, vehicles, and musical instruments1. The species' roots, bark, flowers, and fruits are of high medicinal value2,3. Additionally, its fruit and leaf extracts can control numerous agricultural pests and are commonly used as raw materials for biological pesticides4. The species is an excellent urban greening tree that is resistant to smoke and dust and can absorb many toxic and harmful gases. At present, it is planted in more than 50 countries5,6. The species' productivity is climate-dependent, and climate change is expected to reshape its future suitable habitat7.
The intensification of global warming, accompanied by the frequent occurrence of extreme natural disturbances such as wind storms, droughts, fires, and floods, will undoubtedly impact the global forest ecosystem8. Tree species respond differently to climate change, and the effects can be positive or negative depending on the area. For example, climate change is expected to increase the suitable habitats of Mediterranean oaks in the western temperate areas9 as well as the total suitable habitat for Cypripedium japonicum10. Conversely, eucalyptus species are expected to face future challenges due to their poor spread capability11, and Persian oak (Quercus macranthera) is expected to experience a reduction in its contemporary range and to move to higher altitudes12. Consequently, assessing the impact of climate change on the potential distribution of species and formulating sustainable forest management strategies are critical to maintaining the integrity of forest ecosystems.
Given the challenges posed by climate change, species distribution models (SDMs) have become essential tools for projecting plant adaptation to a changing climate11. At present, a variety of data mining techniques have been applied to model species distribution data. These include: 1) Generalized linear model (GLM), a common regression model first introduced by Austin et al. (1983) to simulate tree species distribution and subsequently used to predict the spread of Emerald Ash Borer (Agrilus planipennis) in southern Ontario, Canada13; 2) Gradient boosting machine (GBM), a machine-learning technique that builds a predictive model as a collection of weak predictive models14 and is currently used to predict invasive plant species distribution, to produce high-resolution, high-precision multi-type vegetation maps, and to build species distribution models15,16; 3) Random Forest (RF), a machine-learning technique that builds a large number of decision trees during the training phase17 and is presently used to predict invasive species ranges, to classify tree species based on hyperspectral information, and to predict stand basal area and the distribution of plantation forests18,19; 4) Support Vector Machine (SVM), a supervised learning model used for data classification and regression analysis and widely used to classify invasive species and detect the presence of farmland weeds20,21; 5) Maximum Entropy (MaxEnt), a machine-learning technique that finds the maximum-entropy probability distribution of a species from its occurrence and environmental data in order to estimate and predict its future distribution22, and is mainly used for predicting crop niches, plant diseases and insect pests, and species invasions23,24; 6) Extreme Gradient Boosting (XGBoost), an algorithm implemented in an open-source software library that is effective in predicting species abundance and identifying critical environmental factors25, and that is also playing an essential role in designing new drugs to treat related diseases; and 7) Naive Bayesian Model (NBM), a family of simple probabilistic classifiers based on Bayes' theorem and the assumption of independence between features26, which is mainly applied to predicting the potential distribution areas of Taxus chinensis in forestry and to identifying plant long non-coding RNAs and predicting their functions27.
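To make the idea of a climate-based presence/absence classifier concrete, the following is a minimal sketch of the simplest technique listed above, a naive Bayes model with Gaussian feature likelihoods, written in plain Python. It is illustrative only and is not the implementation used in this study: the two climate variables, the synthetic site data, the function names, and the use of AUC on held-out sites as the accuracy measure are all assumptions made for the example.

```python
import math
import random


def fit_gaussian_nb(X, y):
    """Estimate the prior, feature means, and feature variances of each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def presence_probability(model, x):
    """Posterior probability of presence (class 1) at one site."""
    log_post = {}
    for label, (prior, means, variances) in model.items():
        lp = math.log(prior)
        # Independence assumption: per-feature log-likelihoods are summed.
        for v, m, s2 in zip(x, means, variances):
            lp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_post[label] = lp
    top = max(log_post.values())  # subtract the maximum for numerical stability
    weights = {k: math.exp(v - top) for k, v in log_post.items()}
    return weights[1] / sum(weights.values())


def auc(scores, labels):
    """AUC: chance that a presence site outscores an absence site (ties count half)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic_sites(n, rng):
    """Hypothetical sites: [mean annual temperature (deg C), annual precipitation (mm)]."""
    X, y = [], []
    for _ in range(n):
        present = rng.random() < 0.5
        if present:
            X.append([rng.gauss(16, 3), rng.gauss(1200, 250)])
        else:
            X.append([rng.gauss(8, 4), rng.gauss(600, 300)])
        y.append(1 if present else 0)
    return X, y


rng = random.Random(42)
X_train, y_train = make_synthetic_sites(400, rng)
X_test, y_test = make_synthetic_sites(200, rng)

model = fit_gaussian_nb(X_train, y_train)
test_scores = [presence_probability(model, x) for x in X_test]
test_auc = auc(test_scores, y_test)
print(f"Held-out AUC: {test_auc:.3f}")
```

Under these assumptions, any of the other techniques could be scored on the same held-out sites with the same `auc` function, which is one simple way to compare algorithms on an equal footing.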
Understanding the potential distribution of M. azedarach is of great significance to its cultivation and conservation. Previous studies of M. azedarach have mainly focused on tree and stand productivity, extraction of active ingredients, and pest resistance potential3,28. Research on how climate change affects the potential distribution of M. azedarach is lacking. The present study therefore explores the seven data mining techniques described above to establish climate-based distribution prediction models and to select the best model for predicting the species' future suitable habitat. Our specific objectives were to: 1) compare the prediction accuracy of the seven modeling algorithms and select the one with the best performance; 2) determine the key climatic factors related to the species' distribution; 3) develop maps of the species' current and future suitable habitat, highlighting the areas of change; and 4) assess the potential impact of future climate change on the species' suitable habitat.