Deep learning reveals shifting precipitation patterns on the Qinghai-Tibetan Plateau (1980-2020) linked to Southwest Asian monsoon

doi:10.21203/rs.3.rs-5204062/v1

This work is licensed under a CC BY 4.0 License

Version 1

High-precision precipitation estimates with high temporal and spatial resolution are essential for depicting hydrological processes in ecological and environmental research. Various spatial interpolation algorithms have been developed, but large uncertainties remain for the Qinghai-Tibetan Plateau (QTP), where meteorological stations are sparsely distributed over complex topography. This study developed an Attention-Gated Convolutional Neural Network (A-GCN) algorithm to produce more accurate spatial interpolation of precipitation. Spatiotemporal changes in the A-GCN-based precipitation from 1980 to 2020 were explored, and their underlying mechanism was analyzed from the perspective of the Asian monsoon. The results showed that the A-GCN algorithm, through the local connectivity and local weight sharing of convolutional neural networks, focuses better on local features and performs well when compared with independent observations and with available precipitation datasets. Around the year 2000, the spatial pattern of the interannual precipitation trend shifted from a decreasing north and increasing south to an increasing north and decreasing south. This transition could be attributed to the dipole precipitation pattern on a global scale and to teleconnection with the Southwest Asian Monsoon, which strengthened in the early period and has weakened since 2005. This study provides a state-of-the-art methodological framework for the spatial interpolation of geographic variables in regions with sparse observations. Precipitation changes would profoundly influence ecology and the environment and deserve more attention.

Earth and environmental sciences/Climate sciences/Climate change/Attribution

Earth and environmental sciences/Climate sciences/Hydrology

meteorological observations

machine learning

spatial interpolation

Precipitation is a fundamental water source for maintaining ecosystem services and human wellbeing. Its amount and spatiotemporal changes are important information for hydrological research and water resource management policy-making (Devi et al., 2015; Prakash et al., 2015). However, more than half of the uncertainty in hydrological modelling has often been attributed to the precision of the precipitation estimates (Chao et al., 2018; Seyyedi et al., 2015). Deriving spatially distributed precipitation data with higher accuracy is therefore a difficult and important problem.

Various methods have been developed to estimate temporal and spatial changes in precipitation by considering different topographic characteristics (Brus and Heuvelink, 2007; Hong et al., 2005). Previously used methods mainly include spatial distance-based interpolation methods, such as inverse distance weighting (Franke, 1982) and Thiessen polygons (Thiessen, 1911); geostatistical methods, such as kriging and its improved variants (Cressie, 1993; Oliver and Webster, 1990); and thin plate smoothing splines (Hutchinson and Bischof, 1983). These methods depend strongly on the number of meteorological station observations and on the topographical variables included (Aalto et al., 2013; Chen et al., 2010; Ishida and Kawashima, 1993). The precision of the interpolation is therefore often diminished where meteorological stations are sparsely located over complex topography (dos Santos, 2020; Lin and Chen, 2004; Xu et al., 2018).

In recent years, machine learning (ML) algorithms have been widely applied to spatial interpolation in different disciplines (Appelhans et al., 2015; Kisi et al., 2017). These algorithms have been found to produce remarkably precise results when the dependent variable is strongly correlated with the independent variables (Sekulić et al., 2020). They can handle non-linear relationships between variables (Hashimoto et al., 2019; Hengl et al., 2015) and can maintain high prediction accuracy even with fewer input data (Erdélyi et al., 2023). For example, the k-nearest neighbors (KNN) algorithm interpolates with high accuracy and is easy to implement because it does not need to be trained and requires no parameters to be estimated in advance (Keller et al., 1985; Peterson, 2009). ML algorithms have been further improved by introducing solution algorithms such as Bayesian parameter estimation (Song et al., 2015), generalized additive models (Aalto et al., 2013) and radial basis functions (Carlson and Foley, 1991; Lin and Chen, 2004). However, these algorithms depend heavily on the strength of the predictive relationship between dependent and independent variables and generally ignore the location information, or spatial autocorrelation, of the data to be interpolated. They therefore still face great challenges over complex terrain with sparse observations (Hengl et al., 2018; Sekulić et al., 2020).

The convolutional neural network (CNN), a deep learning method, is increasingly widely applied because of several significant advantages. One advantage is that it extracts and applies features from adjacent areas through convolution operations (Hinton and Salakhutdinov, 2006; LeCun et al., 1989; Shi et al., 2015; Zhu et al., 2020), which makes the resulting algorithm extremely effective in representing data and approximating functions (LeCun et al., 2015). CNNs have been applied to downscale reanalysis data from a spatial resolution of about 25 km to about 3 km (Jiang et al., 2021). A CNN was also found to perform better than machine learning in merging on-ground observations, remote sensing data and downscaled data to estimate spatiotemporal precipitation (Nan et al., 2023). As direct on-the-ground observations, precipitation gauge data are generally considered more reliable (Fensholt and Rasmussen, 2011; Jiang et al., 2012). Gauge-based spatial interpolation data for precipitation are important as benchmarks for satellite remote sensing data and reanalysis data, and as input for land surface process simulation and ecosystem process modelling (Wang, 2017). However, it has not yet been clear how well a CNN performs for spatial interpolation from on-the-ground precipitation observations, especially in a region with sparse observations over complex terrain.

The Qinghai-Tibetan Plateau (QTP) plays an important role in regional and global atmospheric circulation. It is often described as the Earth’s “third pole” and, because it contains the headwaters of many rivers, as the “Asian water tower.” This region provides ecosystem services and supports human wellbeing across a large downstream area of Asia (Yao et al., 2022). Insights into the spatiotemporal change of precipitation over the QTP and its possible mechanisms can provide a deeper understanding of the instability of the hydrological cycle in the Asian Water Tower region (Guo and Tian, 2022; You et al., 2015). Accurate precipitation data are therefore crucial for studying precipitation changes over the QTP. Spatial precipitation data are currently either interpolated from insufficient meteorological station observations or retrieved through satellite remote sensing or climate model-based reanalysis (He et al., 2022; Jiang et al., 2021). However, satellite-based precipitation estimation, such as that using the TRMM dataset (Islam and Uyeda, 2007; Vallejo-Bernal et al., 2021), can be distorted by the surrounding environment (Pedersen et al., 2010) and is subject to uncertainties that require thorough validation (Nan et al., 2023). Reanalysis data, including the ERA5 dataset (Copernicus, 2017), usually have coarse spatial resolution (Andermann et al., 2011) and have been found to contain larger uncertainties in areas of complex terrain (Gao et al., 2020; Krakauer et al., 2013), which restricts their application for quantifying spatial distribution (Peng et al., 2019; Y et al., 2021). As a more reliable data source, gauge observations are widely used to interpolate spatial precipitation data (Gong et al., 2022; He et al., 2022; Wang, 2017).
However, a major challenge is how to generate accurate precipitation data with high spatiotemporal resolution from insufficient gauge observations, especially in remote mountainous areas such as the QTP (Chen et al., 2010; Nan et al., 2023). A deep learning-based CNN algorithm offers the potential to reproduce spatial precipitation data more accurately by extracting intricate spatial features from sparse observations, especially in mountainous terrain.

To our knowledge, this paper is the first attempt to apply an Attention-Gated Convolutional Neural Network (A-GCN) to the spatial interpolation of precipitation from gauge observations. It is intended to improve spatial interpolation accuracy with insufficient observations over mountainous terrain. The algorithm is then used to estimate the spatiotemporal change in precipitation over the QTP and its possible mechanisms, which not only provides understanding of climate change and the hydrological cycle on the plateau, but also facilitates the application of a new deep learning algorithm in environmental research.

2.1 Daily precipitation observations on meteorological stations

Daily precipitation observations from January 1980 to December 2020 were obtained from the National Meteorological Information Center of the China Meteorological Administration (CMA). The data were collected at 215 meteorological stations in the QTP region (Fig. S1). Data completeness was checked, and stations missing more than 20% of data points were labeled as incomplete and excluded from the input to the A-GCN algorithm. For stations missing less than 20% of data points, the gaps were filled by inverse distance weighted interpolation so that the limited number of station observations could be used as fully as possible (Wang et al., 2017). Monthly total precipitation was then calculated and used as the input for the A-GCN and KNN algorithms.
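The screening and gap-filling step can be illustrated with a minimal sketch. The function names, the use of `None` for a missing value and the distance exponent of 2 are assumptions made for illustration; the text does not state the IDW settings that were actually used.

```python
import math

def completeness(series):
    """Fraction of non-missing values in a station's daily series (None = missing)."""
    return sum(v is not None for v in series) / len(series)

def idw_fill(target_xy, neighbours, power=2.0):
    """Inverse-distance-weighted estimate for a missing value at target_xy.

    neighbours: list of ((x, y), value) pairs observed on the same day.
    power: distance exponent (assumed to be 2; not given in the text).
    """
    num = den = 0.0
    for (x, y), value in neighbours:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0.0:
            return value  # coincident station: use its value directly
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# A station is kept when at most 20% of its records are missing.
series = [1.2, None, 0.0, 3.4, 0.5]
keep_station = completeness(series) >= 0.8
filled = idw_fill((0.0, 0.0), [((1.0, 0.0), 2.0), ((2.0, 0.0), 6.0)])
```

In this toy case the nearer neighbour carries four times the weight of the farther one, so the filled value lies closer to 2.0 than to 6.0.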

2.2 Data for algorithm validation and evaluation

The algorithms were evaluated through cross-validation, validation against independent observations, and comparison with currently available datasets. Precipitation measured independently at four ecological stations was used for validation in this study. The four ecological stations are Dangxiong, Haibei, Gonggashan, and Linzhi, all located on the QTP (Fig. S1), and daily precipitation was provided for the periods 2004–2010, 2003–2010, 1998–2006, and 2001–2007, respectively. Monthly total precipitation was calculated and used for the independent validation of the A-GCN and KNN algorithms.

The algorithms were also evaluated by comparison with five published datasets: WorldClim, CRU TS4.06 (PRCP_CRU), PRE_DS, ERA5 and MeteoGrid. WorldClim is a monthly dataset with a spatial resolution of 2.5 arc minutes (approximately 4.6 km) for China from 1960 to 2021 and was downscaled from the 0.5° gridded data of the Climatic Research Unit (CRU). CRU TS v4 is a 0.5° monthly precipitation dataset for the globe from 1901 to 2018 and was generated with an angular-distance weighting algorithm using station observations as inputs. PRE_DS is a 1 km monthly precipitation dataset for China from 1901 to 2022 and was downscaled from the global 0.5° climate dataset of the CRU and the global high-resolution climate dataset of WorldClim. ERA5 is the fifth-generation atmospheric reanalysis dataset generated by the European Centre for Medium-Range Weather Forecasts (ECMWF), for which 1 km monthly global climate measures are available from 1979 to 2022 (ERA5-Land Monthly Aggregated, ECMWF Climate Reanalysis, Earth Engine Data Catalog). The MeteoGrid dataset provides precipitation data with a spatial resolution of 1 km and a temporal resolution of 8 days from 1980 to 2018 and was generated by thin plate smoothing spline interpolation with meteorological station observations as inputs.

2.3 Topographic data

The topographic data in this study were extracted from a digital elevation model (DEM) of the QTP. The DEM data were from the Shuttle Radar Topography Mission (SRTM) (https://srtm.csi.cgiar.org/). Relief amplitude, elevation, latitude, and longitude, at a spatial resolution of 1 km, were used as covariates, which can reduce the influence of sparse meteorological stations and complex terrain to a certain extent and thus improve interpolation accuracy (Appelhans et al., 2015; Hutchinson, 1995).

2.4 Monsoon index data

The spatiotemporal distribution of precipitation on the QTP is influenced by the monsoon climate, and the correlation was analyzed using the South Asian Summer Monsoon Index (SASMI), the Southwest Asian Summer Monsoon Index (SWAMI) and the East Asian Summer Monsoon Index (EASMI) (Li and Zeng, 2002; Li and Zheng, 2003). The SASMI is defined as the area-averaged seasonal (June, July and August; JJA) index at 850 hPa over the South Asian domain, 5°-22.5°N, 35°-97.5°E, and the SWAMI as the same quantity over Southwest Asia, 2.5°-20°N, 35°-70°E. The EASMI is defined as the area-averaged seasonal (JJA) index over the region 10°-40°N, 110°-140°E.

2.5 Watershed Boundary Dataset

The watershed boundary dataset of the QTP was created from Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) data acquired in the year 2000 (Guoqing, 2019). Using hydrological modeling in ArcGIS, the river network was analyzed and extracted, and the QTP was divided into 12 sub-basins: Amu Darya, Brahmaputra, Ganges, Hexi, Indus, Inner, Mekong, Qaidam, Salween, Tarim, Yangtze, and Yellow.

3.1 A-GCN algorithm

In this study, the Attention-Gated Convolutional Neural Network (A-GCN) was adapted to the spatial interpolation of meteorological observations. The adaptation was then evaluated through ablation experiments to quantify the contributions of the gated convolution and the attention mechanism to the final algorithm.

3.1.1 Convolutional Neural Networks (CNN)

In this study, the CNN was adapted to spatial interpolation with the meteorological station observations and their geographic information as covariates. Firstly, the vector data on the station distribution were converted to raster data with a spatial resolution of 1 km. Each pixel was given the value 1 if it contains a station and 0 otherwise (Yu et al., 2018) (Fig. 1A). Secondly, a weighted pooling layer is used to predict precipitation for the pixels without observations by multiplying the precipitation observed at the meteorological stations by the corresponding weights within a $2 \times 2$ moving window (Fig. 1B). If, in a $2 \times 2$ moving window, there are $n$ ($n > 0$) meteorological observation points at the same time, the weight of the pixel is $\frac{1}{n}$; otherwise the weight is 0. Compared with the maximum and average pooling layers, the weighted pooling layer effectively reduces data contamination by filtering out invalid values (Zeiler and Fergus, 2013; Zhai et al., 2017). Additionally, it enhances training efficiency, improves algorithmic stability, and mitigates the risk of overfitting (Stergiou et al., 2021).
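A minimal sketch of the weighted pooling idea is given here for illustration. It assumes non-overlapping $2 \times 2$ windows (stride 2) and marks a pooled pixel as valid whenever its window contains at least one station; the function name and these choices are assumptions, not details confirmed by the text.

```python
def weighted_pool_2x2(values, mask):
    """Pool a grid with 2x2 windows, averaging only the station pixels.

    values: 2-D list of precipitation; mask: 2-D list with 1 = station, 0 = none.
    Returns (pooled_values, pooled_mask). Stride 2 is an assumption.
    """
    rows, cols = len(values), len(values[0])
    out_v, out_m = [], []
    for r in range(0, rows - 1, 2):
        row_v, row_m = [], []
        for c in range(0, cols - 1, 2):
            cells = [(r + dr, c + dc) for dr in (0, 1) for dc in (0, 1)]
            n = sum(mask[i][j] for i, j in cells)
            if n > 0:
                weight = 1.0 / n  # each of the n stations contributes 1/n
                total = sum(values[i][j] * mask[i][j] for i, j in cells)
                row_v.append(total * weight)
                row_m.append(1)
            else:
                row_v.append(0.0)  # no station in the window: stays invalid
                row_m.append(0)
        out_v.append(row_v)
        out_m.append(row_m)
    return out_v, out_m

values = [[10, 0, 0, 0], [0, 30, 0, 0], [0, 0, 0, 0], [0, 0, 0, 8]]
mask = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]
pooled, pooled_mask = weighted_pool_2x2(values, mask)
```

Unlike plain average pooling, the empty pixels do not dilute the result: the upper-left window averages only its two station values.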

Multi-scale precipitation features, such as $m \times m$, $\frac{m}{2} \times \frac{m}{2}$, etc., are generated by the weighted pooling layer and fed into the convolutional layer, so that precipitation information at different scales enables the convolution to learn richer features (Gu et al., 2018). The filter then performs the convolution on the multi-scale local input data (Fig. 1D). The convolution is applied to local data in a $2 \times 2$ moving window over the whole study area. Precipitation is predicted by spatial interpolation using many different kernels, formulated as follows (Zhu et al., 2020):

$$a^{l}=f\left(\sum_{i\in M}x_{i}^{l-1}\otimes\omega_{i}^{l}+b_{i}^{l}\right) \tag{1}$$

where $x_{i}^{l-1}$ is the element in the convolution region of the $i$th convolution kernel of layer $l-1$; $\otimes$ denotes the convolution (cross-product) operation; $b_{i}^{l}$ and $\omega_{i}^{l}$ are the bias and weight matrix of the $i$th convolution kernel of layer $l$; $M$ is the convolution region; and $a^{l}$ is the output of the convolution of layer $l$ after the activation function $f$, which extracts nonlinear features for the CNN (Simonyan and Zisserman, 2014). The activation function used in this study is the LeakyReLU function, with which the network parameters are continuously optimized during training (Ioffe and Szegedy, 2015; Liu et al., 2018). Predicted values are obtained by fusing multi-scale information through $\otimes$. The information loss was quantified by the loss function between the observations and the predictions. In this study, a joint loss combining a smoothing loss and the mean squared error (MSE) is used as the overall loss function; it converges quickly and gives proper weight to the gradient, so that the direction of the gradient update is more accurate (Nair and Hinton, 2010; Weinman et al., 2011). Once the loss approaches its minimum and stabilizes, the predictions are output as the interpolation results.

3.1.2 Applying Gated Convolution

In this study, a gated convolution was applied to automatically learn features at different scales from the observations and from the surrounding topography of regions without stations. After the filling operation of the weighted pooling layer, prediction values in regions without any meteorological stations are further filled by gated convolution in the $2 \times 2$ moving window, which adds more feature information, reduces the influence of invalid values and improves interpolation accuracy (Fig. 1C). As an improved version of general convolution that takes peripheral features into account, a gated convolution network can learn the features of each location labeled as valid in the given space and extract the spatial trend in the observations at the higher scale to output prediction values for the lower scale (Yu et al., 2019). The network is deepened gradually until all invalid pixels are eventually updated to valid ones, that is, to the value 1 as defined in Section 3.1.1 (Yu et al., 2019). The gated convolution is computed as follows:

$$gating=\mathrm{sigmoid}\left(\sum_{i\in M}x_{i}^{l-1}\otimes\omega_{i}^{l}+b_{i}^{l}\right) \tag{2}$$

$$a^{l}=gating\odot a^{l} \tag{3}$$

The variables in these equations are the same as in Eq. (1). The $\mathrm{sigmoid}$ activation function limits the output values to the range 0 to 1, $gating$ is the weight between elements in a $2 \times 2$ matrix, and Eq. (3) is used as the prediction function.
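The gating step can be sketched for a single flattened $2 \times 2$ window. This is an illustrative sketch only: it assumes separate weight sets for the feature and gate branches (as is common for gated convolutions) and a LeakyReLU slope of 0.01, neither of which is specified in the text, and all names are hypothetical.

```python
import math

def leaky_relu(z, slope=0.01):
    """LeakyReLU activation; the slope of 0.01 is an assumed default."""
    return z if z > 0 else slope * z

def sigmoid(z):
    """Logistic function, bounding the gate between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def gated_conv_window(x, w_feat, b_feat, w_gate, b_gate):
    """Gated response for one flattened 2x2 window x (four values).

    The feature branch follows the form of Eq. (1), the gate branch the form
    of Eq. (2), and the product corresponds to Eq. (3).
    """
    feat = leaky_relu(sum(xi * wi for xi, wi in zip(x, w_feat)) + b_feat)
    gate = sigmoid(sum(xi * wi for xi, wi in zip(x, w_gate)) + b_gate)
    return gate * feat

window = [1.0, 2.0, 3.0, 4.0]
half_open = gated_conv_window(window, [0.25] * 4, 0.0, [0.0] * 4, 0.0)
closed = gated_conv_window(window, [0.25] * 4, 0.0, [0.0] * 4, -50.0)
```

With a zero gate input the gate is 0.5 and the feature (the window mean, 2.5) is halved; a strongly negative gate bias suppresses the window almost entirely, which is how invalid locations can be down-weighted.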

3.1.3 Applying Attention Mechanism

The gated convolution process considers data features layer by layer through a $2 \times 2$ convolution kernel, but it cannot take global information into account, and it is difficult to learn useful features if there are no meteorological stations near the point to be predicted (Yu et al., 2018). To overcome this limitation, we introduce the attention mechanism into the CNN for precipitation spatial interpolation for the first time. In this study, the attention mechanism was realized through the spatial autocorrelation between meteorological stations:

$$distance_{i}=\sqrt{\left(m_{i}-\dot{m}\right)^{2}+\left(p_{i}-\dot{p}\right)^{2}+\left[\left(x_{i}-\dot{x}\right)^{2}+\dots\right]} \tag{4}$$

$$weight_{i}=\mathrm{softmin}\left(distance_{i},\left\{distance_{0},distance_{i+1},\dots,distance_{n}\right\}\right) \tag{5}$$

where $m_{i}$ and $\dot{m}$ represent the locations of the predicted point and of all surrounding meteorological observation stations, $p_{i}$ and $\dot{p}$ are the corresponding precipitation trends, and $x_{i}$ and $\dot{x}$ are the corresponding topographical factors, such as elevation and relief amplitude. Previous studies showed that considering geographic context (longitude, latitude and topographical data) and the precipitation amount of adjacent pixels can greatly improve prediction accuracy (Behrens et al., 2018; He et al., 2016; Li et al., 2011). The $distance_{i}$ incorporates information on topography, location and precipitation trends and is the distance between surrounding meteorological observation station $i$ and the predicted point; $\left\{distance_{0},distance_{i+1},\dots,distance_{n}\right\}$ represents the distance information for the surrounding meteorological observation stations $0$, $i-1$, $i+1$, $n$ relative to the predicted point $i$. The Euclidean distance in the cdist function was used to calculate the weights (Zeng et al., 2019).

The $weight_{i}$ represents the weight of the surrounding meteorological observation stations on the predicted point $i$, indicating the spatial autocorrelation between these points. The spatial autocorrelation among meteorological stations is calculated by normalizing these weights using the $Softmin$ function within the attention mechanism (Fig. S7). The attention mechanism was assumed to improve the interpolation accuracy of precipitation based on the spatial correlations among meteorological station observations in sparsely covered areas with complex topography.
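The distance-based attention weights can be sketched as follows. The feature vectors are assumed to concatenate location, precipitation trend and topographic factors and to be scaled comparably beforehand; that scaling, the function names and the final weighted sum are assumptions for illustration rather than the authors' implementation.

```python
import math

def softmin(distances):
    """Softmin: softmax of the negated distances, shifted for numerical stability."""
    m = min(distances)
    exps = [math.exp(-(d - m)) for d in distances]
    total = sum(exps)
    return [e / total for e in exps]

def feature_distance(target, station):
    """Euclidean distance between two feature vectors (location, trend, topography)."""
    return math.sqrt(sum((t - s) ** 2 for t, s in zip(target, station)))

def attention_estimate(target, stations, values):
    """Weight station values by softmin of their feature distance to the target."""
    weights = softmin([feature_distance(target, s) for s in stations])
    return sum(w * v for w, v in zip(weights, values)), weights

# Two stations at feature distances 1 and 3 from the target.
estimate, weights = attention_estimate(
    (0.0, 0.0, 0.0), [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)], [10.0, 20.0]
)
```

The weights sum to one and decay exponentially with distance, so the station closer in feature space dominates the estimate.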

In the attention mechanism, computing the similarity is a key step in calculating the attention weights. Here, the distance ($pdist$) similarity was used to replace the dot product similarity of the attention mechanism in the previous study (Yu et al., 2018). The dot product similarity and the $pdist$ similarity are formulated as follows:

$$corr=Softmax\left(XX^{T}\right) \tag{6}$$

$$corr=Softmin\left(pdist\left(X\right)\right) \tag{7}$$

where $corr$ represents a measure of similarity. $Softmax$ is a common mathematical function used to convert the raw output of a model into category probabilities and is usually used in maximization problems. $Softmin$, the counterpart of $Softmax$, also converts the elements of a vector into a probability distribution and is usually used in minimization problems. $pdist$ computes the pairwise distances between individual stations. To determine the better similarity function, iteration experiments were run with the same network but with the different similarity functions of Eqs. (6) and (7). Both experiments converged after 50000 iterations, and the losses were 13.5 mm and 14.0 mm for the $pdist$ and dot product similarity modules, respectively (Fig. S2a). Based on these experiments, the $pdist$ similarity module was adopted in our algorithm for its better performance (Fig. S2b).
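The difference between the two similarity definitions can be seen in a small illustrative comparison. The toy station vectors and function names below are hypothetical, and row-wise normalization is assumed; the sketch is only meant to show one way the two measures can disagree, not to reproduce the reported experiment.

```python
import math

def softmax(row):
    """Row-wise softmax, shifted by the maximum for numerical stability."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def softmin(row):
    """Softmin is softmax applied to the negated values."""
    return softmax([-v for v in row])

def dot_similarity(X):
    """Form of Eq. (6): softmax over dot products between all pairs of rows."""
    return [softmax([sum(a * b for a, b in zip(xi, xj)) for xj in X]) for xi in X]

def pdist_similarity(X):
    """Form of Eq. (7): softmin over pairwise Euclidean distances."""
    return [softmin([math.dist(xi, xj) for xj in X]) for xi in X]

# Stations 0 and 1 are close together; station 2 is far away but has a large norm.
X = [(1.0, 0.0), (1.1, 0.0), (5.0, 0.0)]
dot_corr = dot_similarity(X)
dist_corr = pdist_similarity(X)
```

For station 0, the dot product gives the largest weight to the distant station 2 simply because its vector is long, whereas the distance-based form favours the nearby station 1 over station 2. This is one plausible reason a distance similarity suits spatial interpolation, although the text attributes the choice only to the lower loss.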

3.1.4 Ablation experiments

Ablation experiments were designed with three network structures to evaluate the proposed modules in terms of convergence speed during training and accuracy in predicting precipitation from sparse observations. The three structures are a gated convolution network with elevation as a covariate (GCN), an attention-general convolution network with elevation as a covariate (AGC), and an attention-gated convolution network with elevation and relief amplitude as covariates (A-GCN). Each model is iteratively convolved and trained until its residual loss is minimized and stable. The model structures were assessed by the residual loss between predicted and observed precipitation and by the coefficient of determination and RRMSE obtained through cross-validation.

3.2 K-Nearest Neighbors (KNN) algorithm

The K-nearest neighbor (KNN) algorithm, one of the most popular machine learning algorithms, is applied as a candidate algorithm in this study. KNN is a supervised, non-linear regression method in which the quality of the interpolation results hinges on the parameter K, the number of nearest neighbors (Larose and Larose, 2014). The prediction accuracy can be significantly affected if the K value is either too large or too small. In this study, based on interpolation experiments, a K value of 5 was chosen because it yielded the highest accuracy. The KNN algorithm involves two functions: the distances to the nearest neighbors and their weights.

The weights are calculated as the inverse of the distance between the point to be predicted and each of its 5 nearest neighbors. The distance can be defined as the Euclidean, Mahalanobis or Chebyshev distance (Peterson, 2009); in this study the Euclidean distance is used:

$$d_{i}=\sqrt{\sum_{i=1}^{n}\left(V-\bar{V}_{i}\right)^{2}} \tag{8}$$

where $d_{i}$ is the distance between the point to be predicted and its nearest neighbor $i$, $V$ denotes the coordinates, elevation, and relief amplitude of the point to be predicted, $\bar{V}_{i}$ denotes the corresponding values of the $i$th neighbor of the point being predicted, and $n$ is the number of nearest neighbors of the point to be predicted.
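A minimal sketch of inverse-distance-weighted KNN regression is shown here. It assumes the feature vectors (coordinates, elevation, relief amplitude) have already been brought to comparable scales, which the text does not discuss; the function name and toy data are hypothetical.

```python
import math

def knn_predict(target, stations, values, k=5):
    """Predict a value at `target` from its k nearest stations.

    Each neighbour is weighted by the inverse of its Euclidean distance in
    feature space. Feature scaling is assumed to be done by the caller.
    """
    nearest = sorted(
        (math.dist(target, s), v) for s, v in zip(stations, values)
    )[:k]
    for d, v in nearest:
        if d == 0.0:
            return v  # target coincides with a station
    weights = [1.0 / d for d, _ in nearest]
    weighted = sum(w * v for w, (_, v) in zip(weights, nearest))
    return weighted / sum(weights)

stations = [(1.0, 0.0), (2.0, 0.0), (0.0, 4.0), (10.0, 10.0)]
values = [10.0, 20.0, 40.0, 1000.0]
prediction = knn_predict((0.0, 0.0), stations, values, k=3)
```

With `k=3` the distant fourth station is ignored, and the three neighbours at distances 1, 2 and 4 receive weights 1, 0.5 and 0.25.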

3.3 Evaluation Metrics

The relative root mean square error (RRMSE) and the coefficient of determination (R²) were used to evaluate the performance of the algorithm and the accuracy of the predictions. R² was calculated as the square of the correlation between the predictions and observations (Chen et al., 2010; Graf et al., 2019). RRMSE is defined as follows:

$$RRMSE=\left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{s-s_{i}}{s_{i}}\right)^{2}\right]^{1/2} \tag{9}$$

where $n$ is the number of prediction-observation pairs, and $s$ and $s_{i}$ are the prediction and the observation, respectively.
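Both metrics can be computed with a short sketch. Because each error is divided by the observation, the RRMSE is undefined for zero observations; how dry months were handled is not stated in the text, so non-zero observations are assumed here. Function names are hypothetical.

```python
import math

def rrmse(pred, obs):
    """Relative RMSE: root of the mean squared relative error (obs must be non-zero)."""
    n = len(obs)
    return math.sqrt(sum(((p - o) / o) ** 2 for p, o in zip(pred, obs)) / n)

def r_squared(pred, obs):
    """Squared Pearson correlation between predictions and observations."""
    n = len(obs)
    mean_p, mean_o = sum(pred) / n, sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov * cov / (var_p * var_o)

obs = [10.0, 20.0, 30.0]
pred = [11.0, 18.0, 33.0]  # each prediction is off by 10% of the observation
error = rrmse(pred, obs)
fit = r_squared([2.0 * o for o in obs], obs)
```

Note that R² defined this way measures linear association only: predictions that are exactly twice the observations still give R² = 1, which is why the RRMSE is reported alongside it.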

3.4 Random sampling

Random sampling was applied to explore how the sampling rate, defined as the percentage of the available observations that are used in the interpolation, affects interpolation accuracy. A total of 179 station observations of precipitation were available in the QTP region in 2015. Observations at the same time were randomly sampled at rates from 10% to 100% in steps of 10%, that is, the sampled observation numbers were 18, 36, 54, 72, 90, 107, 122, 143, 161, and 179, respectively. Precipitation was interpolated from the sampled observations with the A-GCN algorithm and its accuracy was evaluated, which shows how sparse observations influence the precision of spatial interpolation over the complex terrain of the QTP.
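The sampling design can be sketched with the standard library. The seed, the function names and the use of simple rounding are assumptions; simple rounding of 70% of 179 gives 125 rather than the 122 listed in the text, so the authors' exact rule may differ.

```python
import random

def sample_sizes(total, steps=10):
    """Number of stations drawn at each rate from 1/steps to 100%."""
    return [round(total * k / steps) for k in range(1, steps + 1)]

def sample_stations(station_ids, rate, seed=0):
    """Draw a random subset of stations without replacement at the given rate."""
    rng = random.Random(seed)  # fixed seed (assumed) so the draw is repeatable
    k = round(len(station_ids) * rate)
    return rng.sample(station_ids, k)

sizes = sample_sizes(179)
subset = sample_stations(list(range(179)), 0.1)
```

Each subset would then be passed to the interpolation and scored against the withheld stations.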

3.5 Surface Rainfall Calculation Based on Thiessen Polygons

The Thiessen polygon method was proposed by the meteorologist A. H. Thiessen in 1911 to calculate average rainfall from discretely located rainfall stations. Neighboring rainfall stations within an area or in adjacent areas are connected to form triangles, and the perpendicular bisectors of the triangle sides, which meet at the circumcenter of each triangle, delimit the polygons (Thiessen, 1911). Each polygon contains one rainfall station, and the rainfall measured at that station represents the average rainfall over the polygon's area. In hydrology, the surface rainfall is expressed as (Nganro et al., 2020):

$$P=\frac{a_{1}x_{1}+a_{2}x_{2}+\dots+a_{n}x_{n}}{a_{1}+a_{2}+\dots+a_{n}} \tag{10}$$

where $P$ is the average precipitation in the region for a certain period of time (mm); $x_{i}$ is the precipitation at the $i$th station in the region at the same time (mm); $a_{i}$ is the area of the polygon in which the $i$th station is located (km²); and $n$ is the number of stations in the region.
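The area-weighted mean can be sketched together with a simple raster approximation of the polygon areas, in which every grid cell is assigned to its nearest station. The grid-based area estimate and the function names are assumptions for illustration; a GIS would normally construct the polygons exactly.

```python
import math

def thiessen_areas(stations, grid_points, cell_area=1.0):
    """Approximate polygon areas by assigning each grid cell to its nearest station."""
    areas = [0.0] * len(stations)
    for g in grid_points:
        nearest = min(range(len(stations)), key=lambda i: math.dist(g, stations[i]))
        areas[nearest] += cell_area
    return areas

def thiessen_mean(areas, rainfall):
    """Area-weighted mean rainfall over all polygons."""
    return sum(a * x for a, x in zip(areas, rainfall)) / sum(areas)

# Two stations on a line of ten 1 km² cells; each station claims five cells.
stations = [(0.0, 0.0), (9.0, 0.0)]
grid = [(float(x), 0.0) for x in range(10)]
areas = thiessen_areas(stations, grid)
mean_equal = thiessen_mean(areas, [20.0, 40.0])
mean_unequal = thiessen_mean([100.0, 300.0], [20.0, 40.0])
```

When the polygons are equal in size the result is the plain average (30 mm); with areas of 100 and 300 km² the larger polygon pulls the mean to 35 mm.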

4.1 Algorithm validation

The newly developed A-GCN algorithm was evaluated through cross-validation with the observations from meteorological stations, independent validation with observations from ecological stations, and comparison with the datasets currently available for the QTP. According to the cross-validation, the A-GCN-based precipitation was more accurate than the precipitation from the machine learning-based KNN model. The A-GCN had a higher R² of 0.95 (p < 0.001) and a lower RRMSE of 0.30 (Fig. S3a), while the KNN model had an R² and RRMSE of 0.84 and 0.51, respectively (Fig. S3b). The R² of the A-GCN algorithm was 13.10% higher and its RRMSE 41.18% lower than those of the KNN model.
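The relative differences quoted above follow from the reported metrics, as this short check shows. The helper name is hypothetical; the inputs are the R² and RRMSE values given in the text.

```python
def relative_change(new, reference):
    """Percentage change of `new` relative to `reference`."""
    return (new - reference) / reference * 100.0

r2_gain = relative_change(0.95, 0.84)      # A-GCN R² relative to KNN R²
rrmse_drop = relative_change(0.30, 0.51)   # A-GCN RRMSE relative to KNN RRMSE
print(round(r2_gain, 2), round(rrmse_drop, 2))
```

The negative sign of the second value indicates a reduction, so it corresponds to the 41.18% lower RRMSE.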

The station-by-station cross-validation showed that the A-GCN algorithm can explain 95% of the temporal variability (R² = 0.95) averaged over all stations, with an average RRMSE of 0.28. Spatially, for over 95% of the stations, the A-GCN algorithm can explain over 90% of the temporal variability. Only 0.49% of the stations had an R² lower than 0.8, and these stations are predominantly situated in the extremely arid regions of the QTP (Fig. S4). Meanwhile, the RRMSE is between 0.1 and 0.3 at most stations (75.86% of the total number of stations), and stations with an RRMSE over 0.5 account for only 6.37% of the total. The stations with higher RRMSE are mainly distributed in the extremely arid regions of the QTP (Fig. S4). Because the cross-validation used the time series of all stations over the whole region, it indicates that the A-GCN algorithm performs very well in estimating the spatiotemporal variability of monthly precipitation observations over the QTP, although accuracy was lowest in the extremely arid region.

The A-GCN algorithm was further validated against independent observations from the ecological stations to overcome possible autocorrelation in the cross-validation. The monthly precipitation interpolated by both A-GCN and KNN was significantly and linearly correlated with the precipitation observed at the ecological stations (Fig. 2a, b, c, d). The precipitation interpolated by the A-GCN algorithm explained 85% of the monthly variability in the observed data across all four stations, compared with 82% for the KNN. The A-GCN also had a smaller bias (RRMSE = 0.45) than the KNN (RRMSE = 0.48). Overall, the A-GCN performed better, with an R² 2.97% higher and an RRMSE 6.93% lower than the corresponding metrics of the KNN model (Fig. 2e, f, g, h and Table S2).

Compared with the five available datasets, the A-GCN-based precipitation (PRCP_A-GCN) best captured the spatiotemporal variability in the precipitation observations (PRCP_OBS) at all meteorological stations over the QTP (Fig. 5b and Fig. S8). PRCP_A-GCN was significantly and linearly correlated with PRCP_OBS, with the highest R² (0.95) and the lowest RRMSE (0.30) among the datasets. By contrast, the available datasets (ERA5, PRE_DS, PRCP_CRU, MeteoGrid, WorldClim) explained on average only 67% of the spatiotemporal variability (R² = 0.67), with a range from 56% to 73%, and had a larger mean RRMSE of 0.86, with a range from 0.69 to 1.14, for the region (Table S3). The dataset from this study had an R² 41.79% higher and an RRMSE 65.12% lower than the corresponding mean metrics of the five datasets and showed the best performance among all available datasets for the QTP.

4.2 Spatiotemporal changes in precipitation

According to the A-GCN-based interpolated data, the annual total precipitation (ATP) showed an insignificant declining trend over most of the area from 1980 to 2020 (Fig. 3 and Table S4). The regional mean ATP decreased insignificantly at a rate of -5.40 mm 10a⁻¹ (R² = 0.04, p = 0.21). Spatially, significant declining trends covered 7.78% of the QTP and significant increasing trends only 1.76%, with pronounced differences among sub-regions (Fig. 3 and Table S4). The southeastern QTP showed the fastest decrease, at -14.06 mm 10a⁻¹ (R² = 0.13, p = 0.02), followed by the southwestern (-8.41 mm 10a⁻¹, R² = 0.03, p = 0.30) and the northwestern (-3.15 mm 10a⁻¹, R² = 0.01, p = 0.56) sub-regions. In the northeastern sub-region the interannual trend was insignificant, at 3.68 mm 10a⁻¹ (R² = 0.02, p = 0.37), although precipitation increased significantly over 3.14% of its area from 1980 to 2020 (Table S4).
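The rates and R² values above are of the kind produced by an ordinary least-squares fit of ATP against year. The sketch below assumes that approach and uses a synthetic series (not the study's data) to show how a clear decadal rate can coexist with a low R² when interannual variability is large. The significance test that yields p is omitted.

```python
def linear_trend(years, values):
    """Least-squares trend; returns (rate per decade, R squared)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    syy = sum((y - mean_y) ** 2 for y in values)
    slope = sxy / sxx                 # mm per year
    r2 = sxy * sxy / (sxx * syy)      # share of variance explained by the trend
    return slope * 10.0, r2           # convert to mm per decade

years = list(range(1980, 2021))
# Synthetic ATP (mm): a -0.54 mm/yr trend plus alternating +/-20 mm variability.
atp = [450.0 - 0.54 * (y - 1980) + (20.0 if y % 2 == 0 else -20.0) for y in years]

rate_per_decade, r2 = linear_trend(years, atp)
print(f"rate = {rate_per_decade:.2f} mm per decade, R2 = {r2:.2f}")
```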

A turning point (TP) was then detected in the ATP over the plateau during 1980 to 2020. Although the TP varied among stations, it occurred most often around the year 2000, at 36.8% of the 215 meteorological stations (Fig. S9). In contrast to the insignificant increasing trend of 17.19 mm 10a⁻¹ (R² = 0.10, p = 0.17) during the early period from 1980 to 2000, precipitation over the whole plateau decreased significantly at a rate of -29.18 mm 10a⁻¹ (R² = 0.39, p < 0.01) in the most recent period from 2000 to 2020, which is -1.71 times the rate of the early period. Except for the northeast, the rates in the most recent period were 2.32, -1.28 and -1.56 times those of the early period for the southeast, southwest and northwest sub-regions, respectively, indicating accelerated drying. Precipitation in the northeastern plateau, by contrast, showed no significant trend in either period or over the whole period from 1980 to 2020.
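The turning-point detection method is not restated in this section. As one plausible illustration only, the sketch below assumes a two-segment least-squares search, in which the break year is shared by both segments (as in the 1980 to 2000 and 2000 to 2020 periods). The series is synthetic and the function names are hypothetical; the study's own method may differ.

```python
def fit_line(xs, ys):
    """Least-squares line; returns (slope, sum of squared residuals)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, sse

def find_turning_point(years, values, min_len=5):
    """Break year that minimises the total residual of two fitted segments."""
    best_year, best_sse = None, float("inf")
    for k in range(min_len - 1, len(years) - min_len + 1):
        _, sse_left = fit_line(years[:k + 1], values[:k + 1])
        _, sse_right = fit_line(years[k:], values[k:])
        if sse_left + sse_right < best_sse:
            best_year, best_sse = years[k], sse_left + sse_right
    return best_year

years = list(range(1980, 2021))
# Synthetic ATP (mm): rising until 2000, then falling.
atp = [400.0 + 1.7 * (y - 1980) if y <= 2000 else 434.0 - 2.9 * (y - 2000)
       for y in years]

turning_year = find_turning_point(years, atp)
k = years.index(turning_year)
early_rate = fit_line(years[:k + 1], atp[:k + 1])[0] * 10.0   # mm per decade
late_rate = fit_line(years[k:], atp[k:])[0] * 10.0            # mm per decade
print(turning_year, round(early_rate, 2), round(late_rate, 2))
```

The ratio `late_rate / early_rate` corresponds to the "times the rate of the early period" figures quoted above.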

Drying was also more widespread in 2000 to 2020 than in 1980 to 2000 (Table S4). Over the whole plateau, the area with significantly decreasing precipitation was only 3.91% in the early two decades but 37.34% in the recent two decades. In the recent period the drying area was mainly distributed in the northwest (64.73% of the sub-region), the southwest (41.88%) and the southeast (26.04%), whereas in the previous period it covered only 0.00%, 0.01% and 7.15% of these three sub-regions, respectively.

These results illustrate that the QTP has not become wetter since the turn of the century. On the contrary, the precipitation trend shifted from increasing in the southeast and southwest and decreasing in the northeast during the previous 20 years to decreasing in the southeast and southwest and increasing in the northeast during the most recent 20 years, with accelerated drying over a wider spatial extent in the most recent period than in the previous one.

5.1 Uncertainties assessment

Spatial interpolation of precipitation is a great challenge because of the stochasticity of precipitation, insufficient observations and often complex terrain (Aalto et al., 2013; Zhu et al., 2019). This study developed, for the first time, a deep learning-based precipitation spatial interpolation algorithm (A-GCN). The A-GCN achieved higher accuracy and lower bias, as quantified by cross-validation (Fig. S3), independent validation (Fig. 2) and dataset comparisons (Fig. S8). This performance can be attributed to improvements to the CNN structure, namely the attention mechanism and gated convolution, and to the use of terrain factors as co-variables.
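To make the gated-convolution idea concrete, the sketch below follows the general form described by Yu et al. (2019), in which the output is an activated feature map multiplied element-wise by a sigmoid gate. It is a single-channel, pure-Python illustration with hand-set (not learned) kernels and a hypothetical 4 x 4 grid; it is not the A-GCN implementation.

```python
import math

def conv2d_same(grid, kernel):
    """3x3 convolution with zero padding (cross-correlation, as in CNNs)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += kernel[di + 1][dj + 1] * grid[r][c]
            out[i][j] = acc
    return out

def gated_conv2d(grid, feature_kernel, gate_kernel):
    """output = ReLU(feature conv) * sigmoid(gate conv), element-wise."""
    features = conv2d_same(grid, feature_kernel)
    gates = conv2d_same(grid, gate_kernel)
    return [[max(0.0, f) / (1.0 + math.exp(-g)) for f, g in zip(f_row, g_row)]
            for f_row, g_row in zip(features, gates)]

# Hypothetical sparse grid: station precipitation (mm) at three cells, zero elsewhere.
precip_grid = [
    [0.0, 0.0, 12.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [8.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 20.0],
]
feature_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # hand-set averaging kernel
gate_kernel = [[0.1] * 3 for _ in range(3)]           # hand-set gate kernel

gated = gated_conv2d(precip_grid, feature_kernel, gate_kernel)
features = conv2d_same(precip_grid, feature_kernel)
```

In a trained network both kernels are learned, so the gate can learn to down-weight windows that contain little or no station information.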

The contribution of each structural improvement was shown by ablation experiments (Fig. S2), which tested three versions of the algorithm: the final A-GCN, the gated convolutional version (GCN) and the version with the attention mechanism (AGC). The A-GCN showed the lowest bias and highest R² among the three versions in this region with few station observations (Fig. S5). Without the attention mechanism, the GCN is less accurate because it can only learn local features through a moving window and struggles to learn useful features for a target location with few neighboring stations. Both the A-GCN and AGC algorithms introduce the attention mechanism, which can extract the spatial autocorrelation between the location to be predicted and its surrounding neighbor stations by fusing information on location, terrain and precipitation amount. This is consistent with the ability of attention mechanisms to extract long-distance dependencies (Vaswani et al., 2017).
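The role of attention in weighting neighbor stations can be illustrated with scaled dot-product attention (Vaswani et al., 2017). The sketch below is a simplification under stated assumptions: the query and keys are raw, pre-standardized feature vectors (no learned projections), the values are station precipitation amounts, and all numbers and names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_interpolate(query, keys, values):
    """Scaled dot-product attention over neighbor stations.

    query  : feature vector of the target location
    keys   : feature vectors of the neighbor stations
    values : precipitation observed at those stations
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)  # positive weights that sum to one
    estimate = sum(w * v for w, v in zip(weights, values))
    return estimate, weights

# Hypothetical standardized features: (longitude, latitude, elevation, relief).
target = [0.5, 0.2, 1.0, 0.3]
stations = [
    [0.4, 0.1, 0.9, 0.2],
    [-1.0, 0.8, -0.5, 1.2],
    [0.6, 0.3, 1.1, 0.4],
]
station_precip = [35.0, 80.0, 30.0]  # mm, hypothetical

estimate, weights = attention_interpolate(target, stations, station_precip)
```

Because the weights depend on feature vectors rather than on a fixed window, a distant station can still receive a large weight, which is the property that a moving-window convolution alone lacks.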

Using terrain relief as a co-variable, along with the traditional co-variables of geographic coordinates and elevation, is likely another reason for the higher interpolation accuracy of the A-GCN algorithm. In geography, stations with similar topography are considered to share similar geographic attributes and should therefore have similar precipitation distributions (Antonić et al., 2001; Pereira et al., 2010; Zhu et al., 2018). Additional topographic information should thus improve the accuracy of precipitation interpolation, as illustrated by the ablation experiments, in which both GCN and AGC, which do not consider terrain, showed larger bias and lower accuracy than the A-GCN (Fig. S5).

In summary, by incorporating the gated-attentive mechanism, the new algorithm captures well both the autocorrelation among precipitation observations and their correlation with topography across sparsely distributed stations, which to a great extent resolves the problem of insufficient observations over complex topography in precipitation interpolation.

The accuracy of spatial interpolation, however, is bound to be influenced by many sources of uncertainty, such as the number of meteorological observations (Chen et al., 2010; Hutchinson et al., 2009), which cannot be completely avoided by any interpolation algorithm, including the A-GCN. According to the random sampling test, when the sample rate is higher than 40%, the A-GCN reaches a relatively high accuracy, with an R² above 0.8 and an RRMSE below 0.2 (Fig. S6b). Meanwhile, the accuracy, as indicated by the RRMSE and R², is significantly and positively correlated with precipitation intensity and can be well fitted by an exponential function (Fig. S6a). Previous studies have also shown that the minimum spatial autocorrelation increases with precipitation intensity, and that the range of spatial autocorrelation is very large under conditions of lower precipitation (Chen et al., 2010; Hutchinson et al., 2009). In this study, the distance of the nearest neighbor that influences the weights also increases in the western region with lower precipitation (Fig. S7), which means that more distant stations influence the values estimated there. This improves the accuracy of the interpolated inter-annual variability where stations are sparse, but it may also lead to an overestimation of precipitation in the western region and thus a systematic bias. This is an area where the algorithm can be further improved in the future.
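An exponential relationship of the kind mentioned for Fig. S6a can be fitted by linear least squares on the logarithm of the response. The sketch below assumes the form y = a·exp(b·x) with an RRMSE that declines as precipitation intensity increases; the functional form, the coefficients and the data points are illustrative assumptions, not values from this study.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on log(y); requires y > 0."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx
    a = math.exp(mean_l - b * mean_x)
    return a, b

# Synthetic example: RRMSE generated from a known curve, then recovered by the fit.
intensity = [10.0, 20.0, 40.0, 80.0, 160.0]          # mm per month, hypothetical
rrmse_values = [0.9 * math.exp(-0.01 * x) for x in intensity]

a, b = fit_exponential(intensity, rrmse_values)
print(f"a = {a:.3f}, b = {b:.4f}")
```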

5.2 Precipitation change and possible monsoon influence

This study found an insignificant drying trend in precipitation over the QTP from 1980 to 2020, with a faster drying rate in the last 20 years than over the whole 41-year period, as shown by both the interpolated data and observations from the meteorological stations (Fig. 3). Different trends and rates have been reported in publications for different periods, and were also calculated in this study from the currently available datasets for the same period from 1980 to 2020 (Table S5). Nevertheless, the terrestrial total water storage estimated from the Gravity Recovery and Climate Experiment (GRACE) also showed a decreasing trend, of -3.4 mm 10a⁻¹ (p = 0.57) from 2002 to 2020 and -4.3 mm 10a⁻¹ (p < 0.001) from 2002 to 2019, as estimated from different data sources (Table S5). Based on those results, it can be inferred that precipitation over the QTP has an increasing trend over the long term since 1961, but has been decreasing since 1980, and that the decrease accelerated after the turning point of 2000 (Fig. 3, Fig. 4, and Fig. S10a). The precipitation trend shifted from increasing in the southeast and southwest and decreasing in the northeast to decreasing in the southeast and southwest and increasing in the northeast in more recent decades (Fig. 3). Such a transition, with a drying south and a wetting north, is considered an interdecadal manifestation of the precipitation dipole shaped by two large-scale circulations, the westerlies and the Indian monsoon (Yao et al., 2022).

The influence of the Indian monsoon was quantified through the monsoon indexes defined by Li and Zeng (2002, 2003) and differed among the regions of the QTP. Precipitation in the south, the northwest and the whole region was significantly correlated with the SWAMI (Fig. 6e, f), but the correlations with the EAMSI were not significant (Fig. 6a, b). Meanwhile, the SWAMI showed a decreasing trend over the whole period from 1980 to 2020 (Rate = -0.01 a⁻¹, R² = 0.11, p = 0.03), with a significant turning point around 2005. Before the turning point the SWAMI increased at a rate of 0.01 a⁻¹ (R² = 0.24, p < 0.01), whereas after it the SWAMI decreased at a rate of -0.04 a⁻¹ (R² = 0.32, p < 0.01).

Regionally, precipitation in the western QTP is influenced by the Southwest Asian summer monsoon, as indicated by its correlation with the SWAMI (R² = 0.25, p < 0.001) and with the SASMI (R² = 0.18, p < 0.05) in this study (Fig. 6c, e). A stronger Indian monsoon brings abundant moisture to the QTP, mainly to its western part. However, the summer Indian monsoon over the Bay of Bengal has been weakening, which reduces the moisture transported to the west and leads to an insignificant decreasing trend in precipitation there (Liu and Yin, 2001; Xie et al., 2010; Zhang et al., 2024). Apart from weak effects of the Indian monsoon, precipitation in the northern QTP is probably influenced by the westerly winds (Zhang et al., 2019; Zhang et al., 2024). The enhanced mid-latitude westerlies transport more moisture to the northern QTP and result in more precipitation in this region (Yao et al., 2012; Zhang et al., 2024). Together, these mechanisms could explain the spatiotemporal transition in precipitation over the QTP in the most recent 40 years.

5.3 Implications

This study, for the first time, applied an Attention-Gated Convolutional Neural Network algorithm to interpolate precipitation observations from the meteorological stations on the Qinghai-Tibetan Plateau. The independent validation, cross-validation, and comparison with currently available datasets demonstrated that the resulting interpolated data have the highest accuracy. This indicates that the A-GCN is particularly suitable for precipitation interpolation in regions with sparsely distributed stations and complex terrain, such as the QTP.

The analysis revealed an insignificant drying trend in the interpolated precipitation data over the QTP, which could be attributed to the weakening of the Southwest Asian summer monsoon. Further, this study quantified the spatiotemporal transition of the north-south precipitation trend that occurred around the year 2000. The transition was not only influenced by the dipole precipitation pattern on a global scale, but can also be attributed to the teleconnection with the Southwest Asian monsoon, which strengthened in the early period and has weakened since 2005. This provides a quantitative explanation of precipitation changes on the QTP.

It should be noted that such a spatiotemporal transition in precipitation will inform climate change and hydrological cycle research, and could further threaten biodiversity conservation and ecosystem restoration on the QTP. Therefore, more attention should be paid to climate feedbacks and ecosystem responses to precipitation changes.

In summary, this study provides a state-of-the-art methodological framework and a practical application with broad implications. The framework could be applied to other geographic variables, such as air temperature, relative humidity and sunshine hours, in this region or in other regions with few observations.

Spatially interpolated precipitation data are widely used as input to land surface process modelling in hydrology, but their value has been limited by sparse observations, especially over regions of complex terrain such as the Qinghai-Tibetan Plateau. This is the first study to introduce a novel spatial interpolation algorithm, an Attention-Gated Convolutional Neural Network (A-GCN), to address this problem. By applying an attention mechanism and gated convolution, the new algorithm reduces the influence of insufficient observations and improves spatial interpolation accuracy. Analysis of the interpolated results reveals a shift in the precipitation trend around the year 2000, from a wetting south and drying north in 1980 to 2000 to a drying south and wetting north in recent decades, along with a noticeable north-south dipole pattern of precipitation variability over the QTP. This precipitation trend is influenced by the weakening of the SWAMI. Further research is needed to understand how these precipitation changes over the QTP affect climate change (or vice versa), hydrological cycles, biodiversity, and ecosystems. Nevertheless, this study not only provides a dataset and methods but also offers a new solution for high-precision spatial interpolation in complex terrain areas with sparse observations.

Funding

This work was financially supported by the Second Tibetan Plateau Scientific Expedition and Research (STEP) program (Grant No. 2019QZKK0302-02), the CERN long-term observation data mining and annual data report (Grant No. KFJ-SW-YW043) and the Chief Scientist Program of Qinghai Province (2024-SF-102).

Data availability statement

The CMA station data were obtained from the National Meteorological Information Center (https://data.cma.cn/en/?r=data/detail&dataCode=A.0012.0001). The DEM data were from the Shuttle Radar Topography Mission (SRTM) (https://srtm.csi.cgiar.org/). The Qinghai-Tibetan Plateau Basin Boundaries Dataset (2016) is available at https://poles.tpdc.ac.cn/en/data/dff6b437-90a1-4729-8140-faafc544860f/. The monsoon index data are available at lijianping.cn/dct/page/65576. The precipitation dataset generated by the interpolation in this paper is available at https://ecodb.scidb.cn/en/detail?dataSetId=75dc6c8899d64fbd88fc8345cabe260c&version=V1&code=o00119.

Aalto, J., Pirinen, P., Heikkinen, J., & Venäläinen, A. (2013). Spatial interpolation of monthly climate data for Finland: comparing the performance of kriging and generalized additive models. Theoretical and Applied Climatology, 112, 99-111. https://doi.org/10.1007/s00704-012-0716-9.
Andermann, C., Bonnet, S., & Gloaguen, R. (2011). Evaluation of precipitation data sets along the Himalayan front. Geochemistry, Geophysics, Geosystems, 12(7). https://doi.org/10.1029/2011GC003513.
Appelhans, T., Mwangomo, E., Hardy, D. R., Hemp, A., & Nauss, T. (2015). Evaluating machine learning approaches for the interpolation of monthly air temperature at Mt. Kilimanjaro, Tanzania. Spatial Statistics, 14, 91-113. https://doi.org/10.1016/j.spasta.2015.05.008.
Behrens, T., Schmidt, K., Viscarra Rossel, R. A., Gries, P., Scholten, T., & MacMillan, R. A. (2018). Spatial modelling with Euclidean distance fields and machine learning. European Journal of Soil Science, 69(5), 757-770. https://doi.org/10.1111/ejss.12687.
Brus, D. J., & Heuvelink, G. B. (2007). Optimization of sample patterns for universal kriging of environmental variables. Geoderma, 138(1-2), 86-95. https://doi.org/10.1016/j.geoderma.2006.10.016.
Carlson, R. E., & Foley, T. A. (1991). The parameter R2 in multiquadric interpolation. Computers & Mathematics with Applications, 21(9), 29-42. https://doi.org/10.1016/0898-1221(91)90123-L.
Chao, L. J., Zhang, K., Li, Z. J., Zhu, Y. L., Wang, J. F., & Yu, Z. B. (2018). Geographically weighted regression based methods for merging satellite and gauge precipitation. Journal of Hydrology, 558, 275-289. https://doi.org/10.1016/j.jhydrol.2018.01.042.
Chen, D. L., Ou, T. H., Gong, L. B., Xu, C. Y., Li, W. J., & Ho, C. H., et al. (2010). Spatial interpolation of daily precipitation in China: 1951–2005. Advances in Atmospheric Sciences, 27, 1221-1232. https://doi.org/10.1007/s00376-010-9151-y.
Copernicus. (2017). Fifth generation of ECMWF atmospheric reanalyses of the global climate. Copernicus Climate Change Service Climate Data Store (CDS).
Cressie, N. (1993). Statistics for Spatial Data. Wiley-Interscience, 928. https://doi.org/10.1002/9781119115151.
Devi, G., Ganasri, B., & Dwarakish, G. (2015). A Review on Hydrological Models. Aquatic Procedia, 4, 1001–1007. https://doi.org/10.1016/j.aqpro.2015.02.126.
dos Santos, R. S. (2020). Estimating spatio-temporal air temperature in London (UK) using machine learning and earth observation satellite data. International Journal of Applied Earth Observation and Geoinformation, 88, 102066. https://doi.org/10.1016/j.jag.2020.102066.
Erdélyi, D., Hatvani, I. G., Jeon, H., Jones, M., Tyler, J., & Kern, Z. (2023). Predicting spatial distribution of stable isotopes in precipitation by classical geostatistical- and machine learning methods. Journal of Hydrology, 617, 129129. https://doi.org/10.1016/j.jhydrol.2023.129129.
Fensholt, R., & Rasmussen, K. (2011). Analysis of trends in the Sahelian ‘rain-use efficiency’ using GIMMS NDVI, RFE and GPCP rainfall data. Remote Sensing of Environment, 115(2), 438-451. https://doi.org/10.1016/j.rse.2010.09.014.
Franke, R. (1982). Scattered data interpolation: tests of some methods. Mathematics of Computation, 38(157), 181-200. https://doi.org/10.1090/s0025-5718-1982-0637296-4.
Gao, Y., Chen, F., & Jiang, Y. (2020). Evaluation of a convection-permitting modeling of precipitation over the Tibetan Plateau and its influences on the simulation of snow-cover fraction. Journal of Hydrometeorology, 21(7), 1531-1548. https://doi.org/10.1175/JHM-D-19-0277.1.
Gong, H. B., Liu, H. Y., Xiang, X. Q., Jiao, F. S., Cao, L., & Xu, X. J. (2022). 1km Monthly Precipitation and Temperatures Dataset for China from 1952 to 2019 based on a Brand-New and High-Quality Baseline Climatology Surface. Earth System Science Data Discussions, 2022, 1-30. https://doi.org/10.5194/essd-2022-45.
Graf, R., Zhu, S., & Sivakumar, B. (2019). Forecasting river water temperature time series using a wavelet–neural network hybrid modelling approach. Journal of Hydrology, 578, 124115. https://doi.org/10.1016/j.jhydrol.2019.124115.
Gu, J. X., Wang, Z. H., Kuen, J., Ma, L. Y., Shahroudy, A., & Shuai, B., et al. (2018). Recent advances in convolutional neural networks. Pattern Recognition, 77, 354-377. https://doi.org/10.1016/j.patcog.2017.10.013.
Guo, X., & Tian, L. (2022). Spatial patterns and possible mechanisms of precipitation changes in recent decades over and around the Tibetan Plateau in the context of intense warming and weakening winds. Climate Dynamics, 59(7-8), 1-22. https://doi.org/10.1007/s00382-022-06197-1.
Guoqing, Z. (2019). Dataset of river basins map over the TP (2016). A Big Earth Data Platform for Three Poles. https://doi.org/10.11888/BaseGeography.tpe.249465.file.
Hashimoto, H., Wang, W. L., Melton, F. S., Moreno, A. L., Ganguly, S., & Michaelis, A. R., et al. (2019). High-resolution mapping of daily climate variables by aggregating multiple spatial data sets with the random forest algorithm over the conterminous United States. International Journal of Climatology, 39(6), 2964-2983. https://doi.org/10.1002/joc.5995.
He, Q., Wang, M., Liu, K. W., Li, K., & Jiang, Z. Y. (2022). GPRChinaTemp1km: a high-resolution monthly air temperature data set for China (1951–2020) based on machine learning. Earth System Science Data, 14(7), 3273-3292. https://doi.org/10.5194/essd-14-3273-2022.
He, X. G., Chaney, N. W., Schleiss, M., & Sheffield, J. (2016). Spatial downscaling of precipitation using adaptable random forests. Water Resources Research, 52(10), 8217-8237. https://doi.org/10.1002/2016WR019034.
Hengl, T., Heuvelink, G. B. M., Kempen, B., Leenaars, J. G. B., Walsh, M. G., & Shepherd, K. D., et al. (2015). Mapping Soil Properties of Africa at 250 m Resolution: Random Forests Significantly Improve Current Predictions. PloS one, 10(6), e0125814. https://doi.org/10.1371/journal.pone.0125814.
Hengl, T., Nussbaum, M., Wright, M. N., Heuvelink, G. B., & Gräler, B. (2018). Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables. PeerJ, 6, e5518. https://doi.org/10.7717/peerj.5518.
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504-507. https://doi.org/10.1126/science.1127647.
Hong, Y., Nix, H. A., Hutchinson, M. F., & Booth, T. H. (2005). Spatial interpolation of monthly mean climate data for China. International Journal of Climatology, 25(10), 1369-1379. https://doi.org/10.1002/joc.1187.
Hutchinson, M. F. (1995). Interpolating mean rainfall using thin plate smoothing splines. International Journal of Geographical Information Systems, 9(4), 385-403. https://doi.org/10.1080/02693799508902045.
Hutchinson, M. F., & Bischof, R. (1983). A New Method for Estimating the Spatial Distribution of Mean Seasonal and Annual Rainfall Applied to the Hunter Valley, New South Wales. Australian Meteorological Magazine, 31, 179-184.
Hutchinson, M. F., Mckenney, D. W., Lawrence, K., Pedlar, J. H., Hopkinson, R. F., & Milewska, E., et al. (2009). Development and testing of Canada-wide interpolated spatial models of daily minimum–maximum temperature and precipitation for 1961–2003. Journal of Applied Meteorology and Climatology, 48(4), 725-741. https://doi.org/10.1175/2008JAMC1979.1.
Ioffe, S., & Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv. abs/1502.03167.
Ishida, T., & Kawashima, S. (1993). Use of cokriging to estimate surface air temperature from elevation. Theoretical and Applied Climatology, 47, 147-157. https://doi.org/10.1007/BF00867447.
Islam, M. N., & Uyeda, H. (2007). Use of TRMM in determining the climatic characteristics of rainfall over Bangladesh. Remote Sensing of Environment, 108(3), 264-276. https://doi.org/10.1016/j.rse.2006.11.011.
Jiang, S. H., Ran, L. L., Hong, Y., Yong, B., Yang, X. L., & Yuan, F., et al. (2012). Comprehensive evaluation of multi-satellite precipitation products with a dense rain gauge network and optimally merging their simulated hydrological flows using the Bayesian model averaging method. Journal of Hydrology, 452-453, 213-225. https://doi.org/10.1016/j.jhydrol.2012.05.055.
Jiang, Y. Z., Yang, K., Shao, C. K., Zhou, X., Zhao, L., & Chen, Y. Y., et al. (2021). A downscaling approach for constructing high-resolution precipitation dataset over the Tibetan Plateau from ERA5 reanalysis. Atmospheric Research, 256, 105574. https://doi.org/10.1016/j.atmosres.2021.105574.
Keller, J. M., Gray, M. R., & Givens, J. A. (1985). A fuzzy k-nearest neighbor algorithm. IEEE Transactions on Systems, Man, and Cybernetics, SMC-15(4), 580-585. https://doi.org/10.1109/TSMC.1985.6313426.
Kisi, O., Sanikhani, H., & Cobaner, M. (2017). Soil temperature modeling at different depths using neuro-fuzzy, neural network, and genetic programming techniques. Theoretical and Applied Climatology, 129, 833-848. https://doi.org/10.1007/s00704-016-1810-1.
Krakauer, N. Y., Pradhanang, S. M., Lakhankar, T., & Jha, A. K. (2013). Evaluating satellite products for precipitation estimation in mountain regions: A case study for Nepal. Remote Sensing, 5(8), 4107-4123. https://doi.org/10.3390/rs5084107.
Larose, D. T., & Larose, C. D. (2014). k‐nearest neighbor algorithm. Discovering Knowledge in Data: An Introduction to Data Mining, 149-164. https://doi.org/10.1002/9781118874059.ch7.
LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., & Hubbard, W., et al. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4), 541-551. https://doi.org/10.1162/neco.1989.1.4.541.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444. https://doi.org/10.1038/nature14539.
Li, J., Heap, A. D., Potter, A., & Daniell, J. J. (2011). Application of machine learning methods to spatial interpolation of environmental variables. Environmental Modelling & Software, 26(12), 1647-1659. https://doi.org/10.1016/j.envsoft.2011.07.004.
Li, J. P., & Zeng, Q. C. (2002). A unified monsoon index. Geophysical Research Letters, 29(8), 115(1-4). https://doi.org/10.1029/2001GL013874.
Li, J. P., & Zeng, Q. C. (2003). A new monsoon index and the geographical distribution of the global monsoons. Advances in Atmospheric Sciences, 20(2), 299-302. https://doi.org/10.1007/s00376-003-0016-5.
Lin, G. F., & Chen, L. H. (2004). A spatial interpolation method based on radial basis function networks incorporating a semivariogram model. Journal of Hydrology, 288(3-4), 288-298. https://doi.org/10.1016/j.jhydrol.2003.10.008.
Liu, G. L., Reda, F. A., Shih, K. J., Wang, T. C., Tao, A., & Catanzaro, B. (2018). Image Inpainting for Irregular Holes Using Partial Convolutions. Computer Vision and Pattern Recognition, 11215, 89-105. https://doi.org/10.48550/arXiv.1804.07723.
Liu, X. D., & Yin, Z. Y. (2001). Spatial and temporal variation of summer precipitation over the eastern Tibetan Plateau and the North Atlantic oscillation. Journal of Climate, 14(13), 2896-2909. https://doi.org/10.1175/1520-0442(2001)014<2896:SATVOS>2.0.CO;2.
Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted boltzmann machines. Department of Computer Science, 27, 807-814.
Nan, T. Y., Chen, J., Ding, Z. W., Li, W., & Chen, H. (2023). Deep learning-based multi-source precipitation merging for the Tibetan Plateau. Science China Earth Sciences, 66(4), 852-870. https://doi.org/10.1007/s11430-022-1050-2.
Nganro, S., Trisutomo, S., Barkey, R. A., & Ali, M. (2020). Rainfall Analysis of the Makassar City using Thiessen Polygon Method Based on GIS. Journal of Engineering and Applied Sciences, 15(6), 1426-1430. https://doi.org/10.36478/jeasci.2020.1426.1430.
Antonić, O., Križan, J., Marki, A., & Bukovec, D. (2001). Spatio-temporal interpolation of climatic variables over large region of complex terrain using neural networks. Ecological Modelling, 138(1-3), 255-263. https://doi.org/10.1016/S0304-3800(00)00406-3.
Oliver, M. A., & Webster, R. (1990). Kriging: a method of interpolation for geographical information systems. International Journal of Geographical Information Systems, 4(3), 313-332. https://doi.org/10.1080/02693799008941549.
Pedersen, L., Jensen, N. E., & Madsen, H. (2010). Calibration of Local Area Weather Radar—Identifying significant factors affecting the calibration. Atmospheric Research, 97(1-2), 129-143. https://doi.org/10.1016/j.atmosres.2010.03.016.
Peng, S. Z., Ding, Y. X., Liu, W. Z., & Li, Z. (2019). 1 km monthly temperature and precipitation dataset for China from 1901 to 2017. Earth System Science Data, 11(4), 1931-1946. https://doi.org/10.5194/essd-11-1931-2019.
Pereira, P., Oliva, M., & Baltrėnaitė, E. (2010). Modelling extreme precipitation in hazardous mountainous areas. Contribution to landscape planning and environmental management. Journal of Environmental Engineering and Landscape Management, 18(4), 329-342. https://doi.org/10.3846/jeelm.2010.38.
Peterson, L. E. (2009). K-nearest neighbor. Scholarpedia, 4(2), 1883. https://doi.org/10.4249/scholarpedia.1883.
Prakash, S., Mitra, A. K., AghaKouchak, A., & Pai, D. (2015). Error characterization of TRMM Multisatellite Precipitation Analysis (TMPA-3B42) products over India for different seasons. Journal of Hydrology, 529, 1302-1312. https://doi.org/10.1016/j.jhydrol.2015.08.062.
Sekulić, A., Kilibarda, M., Heuvelink, G. B. M., Nikolić, M., & Bajat, B. (2020). Random Forest Spatial Interpolation. Remote Sensing, 12(10), 1687. https://doi.org/10.3390/rs12101687.
Seyyedi, H., Anagnostou, E. N., Beighley, E., & McCollum, J. (2015). Hydrologic evaluation of satellite and reanalysis precipitation datasets over a mid-latitude basin. Atmospheric Research, 164, 37-48. https://doi.org/10.1016/j.atmosres.2015.03.019.
Shi, X. J., Chen, Z. R., Wang, H., & Yeung, D. Y. (2015). Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. arXiv: Computer Vision and Pattern Recognition.
Simonyan, K., & Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. Computer Vision and Pattern Recognition. abs/1409.1556. https://doi.org/10.48550/arXiv.1409.1556.
Song, J. J., Kwon, S., & Lee, G. (2015). Incorporation of parameter uncertainty into spatial interpolation using Bayesian trans-Gaussian kriging. Advances in Atmospheric Sciences, 32, 413-423. https://doi.org/10.1007/s00376-014-4040-4.
Stergiou, A., Poppe, R., & Kalliatakis, G. (2021). Refining activation downsampling with SoftPool. IEEE/CVF International Conference on Computer Vision (ICCV), 10337-10346. https://doi.org/10.1109/ICCV48922.2021.01019.
Thiessen, A. H. (1911). Precipitation averages for large areas. Monthly Weather Review, 39(7), 1082-1084. https://doi.org/10.1175/1520-0493(1911)39<1082b:PAFLA>2.0.CO;2.
Vallejo‐Bernal, S. M., Urrea, V., Bedoya‐Soto, J. M., Posada, D., Olarte, A., & Cárdenas‐Posso, Y., et al. (2021). Ground validation of TRMM 3B43 V7 precipitation estimates over Colombia. Part I: Monthly and seasonal timescales. International Journal of Climatology, 41(1), 601-624. https://doi.org/10.1002/joc.6640.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., & Gomez, A. N., et al. (2017). Attention Is All You Need. Computation and Language, 5. https://doi.org/10.48550/arXiv.1706.03762.
Wang, J. B., Wang, J. W., Ye, H., Liu, Y., & He, H. L. (2017). Spatially interpolated dataset of national temperature and precipitation on 1 km grid from 2000 to 2012. China Science Data, 8, 8. https://doi.org/10.11922/csdata.170.2016.
Weinman, J. J., Lidaka, A., & Aggarwal, S. (2011). Chapter 19-large-scale machine learning. GPU Computing Gems Emerald Edition, 277-291. https://doi.org/10.1016/B978-0-12-384988-5.00019-X.
Xie, H., Ye, J. S., Liu, X. M., & E, C. Y. (2010). Warming and drying trends on the Tibetan Plateau (1971–2005). Theoretical and Applied Climatology, 101(3-4), 241-253. https://doi.org/10.1007/s00704-009-0215-9.
Xu, Y. M., Knudby, A., Shen, Y., & Liu, Y. H. (2018). Mapping monthly air temperature in the Tibetan Plateau from MODIS data based on machine learning methods. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 11(2), 345-354. https://doi.org/10.1109/JSTARS.2017.2787191.
Miao, Y. X., Liu, R. M., Wang, Q. R., Jiao, L. J., Wang, Y. F., & Li, L., et al. (2021). Study of uncertainty of satellite and reanalysis precipitation products and their impact on hydrological simulation. Environmental Science and Pollution Research International, 28(43), 60935-60953. https://doi.org/10.1007/s11356-021-14847-w.
Yao, T. D., Bolch, T., Chen, D. L., & Gao, J. (2022). The imbalance of the Asian water tower. Nature Reviews Earth & Environment, 3(10), 618-632. https://doi.org/10.1038/s43017-022-00299-4.
Yao, T. D., Thompson, L., Yang, W., & Yu, W. S. (2012). Different glacier status with atmospheric circulations in Tibetan Plateau and surroundings. Nature Climate Change, 2(9), 663-667. https://doi.org/10.1038/nclimate1580.
You, Q. L., Min, J. Z., Zhang, W., Pepin, N., & Kang, S. (2015). Comparison of multiple datasets with gridded precipitation observations over the Tibetan Plateau. Climate Dynamics, 45(3-4), 791-806. https://doi.org/10.1007/s00382-014-2310-6.
Yu, J. H., Lin, Z., Yang, J. M., Shan, X. H., Lu, X., & Huang, T. (2019). Free-Form Image Inpainting with Gated Convolution. Computer Vision and Pattern Recognition, 4470-4479. https://doi.org/10.48550/arXiv.1806.03589.
Yu, J. H., Lin, Z., Yang, J. M., Shan, X. H., Lu, X., & Huang, T. (2018). Generative Image Inpainting with Contextual Attention. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 5505-5514. https://doi.org/10.48550/arXiv.1801.07892.
Zeiler, M. D., & Fergus, R. (2013). Stochastic Pooling for Regularization of Deep Convolutional Neural Networks. CoRR. abs/1301.3557. https://doi.org/10.48550/arXiv.1301.3557.
Zeng, Y. H., Fu, J. L., Chao, H. Y., & Guo, B. N. (2019). Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1486-1494. https://doi.org/10.48550/arXiv.1904.07475.
Zhai, S. F., Wu, H., Kumar, A., Cheng, Y., Lu, Y. X., & Zhang, Z. F., et al. (2017). S3Pool: Pooling with Stochastic Spatial Sampling. IEEE Conference on Computer Vision and Pattern Recognition, 4003-4011. https://doi.org/10.1109/cvpr.2017.426.
Zhang, C., Tang, Q. H., Chen, D. L., van der Ent, R. J., Liu, X. C., & Li, W. H., et al. (2019). Moisture Source Changes Contributed to Different Precipitation Changes over the Northern and Southern Tibetan Plateau. Journal of Hydrometeorology, 20(2), 217-229. https://doi.org/10.1175/jhm-d-18-0094.1.
Zhang, C., Zhang, X., Tang, Q. H., Chen, D. L., Huang, J. C., & Wu, S. H., et al. (2024). Quantifying precipitation moisture contributed by different atmospheric circulations across the Tibetan Plateau. Journal of Hydrology, 628, 130517. https://doi.org/10.1016/j.jhydrol.2023.130517.
Zhu, A. X., Lu, G. N., Liu, J., Qin, C. Z., & Zhou, C. G. (2018). Spatial prediction based on Third Law of Geography. Annals of GIS, 24(4), 225-240. https://doi.org/10.1080/19475683.2018.1534890.
Zhu, D., Cheng, X. M., Zhang, F., Yao, X., Gao, Y., & Liu, Y. (2020). Spatial interpolation using conditional generative adversarial neural networks. International Journal of Geographical Information Science, 34(4), 735-758. https://doi.org/10.1080/13658816.2019.1599122.
Zhu, X. D., Zhang, Q., Xu, C. Y., Sun, P., & Hu, P. (2019). Reconstruction of high spatial resolution surface air temperature data across China: A new geo-intelligent multisource data-based machine learning technique. Science of The Total Environment, 665(1), 300-313. https://doi.org/10.1016/j.scitotenv.2019.02.077.

The authors declare no competing interests.

SupplementalMaterial.docx
