Precipitation is a fundamental water source for maintaining ecosystem services and human wellbeing. Its amount and spatiotemporal variation are important information for hydrological research and for water resource management policy-making (Devi et al., 2015; Prakash et al., 2015). However, more than half of the uncertainty in hydrological modelling has often been attributed to the limited precision of precipitation data estimates (Chao et al., 2018; Seyyedi et al., 2015). Deriving spatially distributed precipitation data with higher accuracy is therefore a difficult and important problem.
Various methods have been developed to estimate temporal and spatial changes in precipitation by considering different topographic characteristics (Brus and Heuvelink, 2007; Hong et al., 2005). Previously used methods mainly include spatial distance-based interpolation methods, such as inverse distance weighting (Franke, 1982) and Thiessen polygons (Thiessen, 1911); geostatistical methods, such as kriging and its improved variants (Cressie, 1993; Oliver and Webster, 1990); and thin plate smoothing splines (Hutchinson and Bischof, 1983). These methods depend strongly on the number of meteorological station observations and on the topographical variables included (Aalto et al., 2013; Chen et al., 2010; Ishida and Kawashima, 1993). The precision of such interpolation is therefore often reduced where meteorological stations are sparsely located over topographically complex areas (dos Santos, 2020; Lin and Chen, 2004; Xu et al., 2018).
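To make the distance-based family concrete, the following is a minimal sketch of inverse distance weighting, not the implementation used in any of the cited studies. The function name `idw_interpolate`, the planar coordinates, and the default exponent `power=2.0` are assumptions chosen for illustration.

```python
import math


def idw_interpolate(stations, target, power=2.0):
    """Estimate a value at `target` from (x, y, value) station records.

    Each station is weighted by 1 / distance**power, so nearby gauges
    dominate the estimate. Coordinates are assumed to be planar.
    """
    if not stations:
        raise ValueError("at least one station is required")

    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a gauge: return the observation itself.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical gauges: (x, y, precipitation in mm)
gauges = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw_interpolate(gauges, (1.0, 0.0))
```

Because the estimate is a weighted average of the observations alone, it cannot exceed the range of the gauge values, and with few gauges a single distant station can control the result. This is one way to see why sparse networks over complex terrain degrade the accuracy of such methods.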
In recent years, machine learning (ML) algorithms have been widely applied to spatial interpolation in different disciplines (Appelhans et al., 2015; Kisi et al., 2017). These algorithms have been found to produce remarkably precise results when the dependent variable is strongly correlated with the independent variables (Sekulić et al., 2020). They can handle non-linear relationships between variables (Hashimoto et al., 2019; Hengl et al., 2015) and can maintain high prediction accuracy even with fewer input data (Erdélyi et al., 2023). For example, the k-nearest neighbors (KNN) algorithm interpolates with high accuracy and is easy to implement because it requires neither a training phase nor the estimation of parameters in advance (Keller et al., 1985; Peterson, 2009). ML algorithms have been further improved by introducing solution algorithms such as Bayesian parameter estimation (Song et al., 2015), generalized additive models (Aalto et al., 2013) and radial basis functions (Carlson and Foley, 1991; Lin and Chen, 2004). However, these algorithms depend heavily on the strength of the predictive relationship between the dependent and independent variables, and generally ignore the location information, or spatial autocorrelation, of the data to be interpolated. They therefore still face great challenges over complex terrain with sparsely observed data (Hengl et al., 2018; Sekulić et al., 2020).
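The KNN idea can be illustrated with a short sketch, given here under stated assumptions rather than as the procedure of the cited works. The function name `knn_predict`, the unweighted averaging of neighbours, and the Euclidean distance in feature space are all illustrative choices.

```python
import math


def knn_predict(samples, query, k=3):
    """Predict a value at `query` as the mean of its k nearest samples.

    `samples` is a list of (features, value) pairs, where `features` is a
    tuple of predictors (for example scaled longitude, latitude, elevation).
    No model is fitted in advance: all work happens at prediction time.
    """
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")

    # Rank samples by Euclidean distance to the query in feature space.
    ranked = sorted(samples, key=lambda sample: math.dist(sample[0], query))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / k


# Hypothetical samples: ((feature_1, feature_2), precipitation in mm)
observations = [((0.0, 0.0), 10.0), ((1.0, 0.0), 20.0), ((5.0, 5.0), 100.0)]
estimate = knn_predict(observations, (0.4, 0.0), k=2)
```

In this sketch the neighbours are chosen only by similarity of the predictors, so the features should be scaled to comparable ranges before use. Unless coordinates are themselves included as predictors, the spatial arrangement of the stations plays no role, which reflects the limitation regarding location information noted above.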
The convolutional neural network (CNN), a deep learning method, is increasingly widely applied because of several significant advantages. One advantage is that convolution operations extract and apply features from adjacent areas (Hinton and Salakhutdinov, 2006; LeCun et al., 1989; Shi et al., 2015; Zhu et al., 2020), which makes the resulting algorithm extremely effective in representing data and approximating functions (LeCun et al., 2015). CNNs have been applied to downscale reanalysis data from a spatial resolution of about 25 km to about 3 km (Jiang et al., 2021). They have also been found to perform better than machine learning methods in merging ground observations, remote sensing data and downscaled data to estimate spatiotemporal precipitation data (Nan et al., 2023). As direct ground observations, precipitation gauge data are generally considered more reliable (Fensholt and Rasmussen, 2011; Jiang et al., 2012). Spatially interpolated precipitation data based on gauge observations are therefore important as benchmarks for satellite remote sensing data and reanalysis data, and as input for land surface process simulation and ecosystem process modelling (Wang, 2017). However, it remains unclear how well CNNs perform in spatial interpolation from ground precipitation observations, especially in regions with sparse observations over complex terrain.
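How a convolution draws on adjacent areas can be shown with a small sketch of a single two-dimensional filter applied to a gridded field. This is a generic illustration, not the architecture examined in this paper; the function name `convolve2d_valid` and the example grid and kernel are assumptions. As in most CNN libraries, the operation is written as a cross-correlation (the kernel is not flipped).

```python
def convolve2d_valid(grid, kernel):
    """Slide `kernel` over `grid` and return the 'valid' output.

    Each output cell is the weighted sum of the grid cells under the
    kernel, so it combines information from a neighbourhood rather than
    from a single location. No padding is applied.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(grid) - kernel_h + 1
    out_w = len(grid[0]) - kernel_w + 1

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kernel_h):
                for dj in range(kernel_w):
                    total += grid[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3 x 3 field and a 2 x 2 summing kernel
field = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
summing_kernel = [[1, 1], [1, 1]]
feature_map = convolve2d_valid(field, summing_kernel)
```

In a trained network the kernel weights are learned rather than fixed, and many such filters are stacked, but the neighbourhood-based computation is the same.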
The Qinghai-Tibetan Plateau (QTP) plays an important role in influencing regional and global atmospheric circulation. It is often described as the Earth’s “third pole” and, because it contains the headwaters of many rivers, as the “Asian water tower.” The region supports ecosystem services and human wellbeing across a large downstream area of Asia (Yao et al., 2022). Insights into the spatiotemporal change of precipitation over the QTP and its possible driving mechanisms can deepen understanding of the instability of the hydrological cycle in the Asian water tower region (Guo and Tian, 2022; You et al., 2015). Accurate precipitation data are therefore crucial for studying changes in precipitation over the QTP. Spatial precipitation data are currently mainly interpolated from insufficient meteorological station observations, or retrieved through satellite remote sensing or climate model-based reanalysis (He et al., 2022; Jiang et al., 2021). However, satellite-based precipitation estimates, such as those from the TRMM dataset (Islam and Uyeda, 2007; Vallejo-Bernal et al., 2021), can be distorted by the surrounding environment (Pedersen et al., 2010) and are subject to uncertainties that require thorough validation (Nan et al., 2023). Reanalysis data, including the ERA5 dataset (Copernicus, 2017), usually have coarse spatial resolution (Andermann et al., 2011) and have been found to contain larger uncertainties in areas of complex terrain (Gao et al., 2020; Krakauer et al., 2013), which restricts their application for quantifying spatial distribution (Peng et al., 2019; Y et al., 2021). As a more reliable data source, gauge observations can be used more widely to interpolate spatial precipitation data (Gong et al., 2022; He et al., 2022; Wang, 2017).
However, a major challenge is how to generate accurate precipitation data with high spatiotemporal resolution from insufficient gauge observations, especially in remote mountainous areas such as those of the QTP (Chen et al., 2010; Nan et al., 2023). A deep learning-based CNN algorithm offers the potential to reproduce spatial precipitation data more accurately by leveraging its ability to extract intricate spatial features from sparse observations, especially in mountainous terrain.
To our knowledge, this paper is the first attempt to apply an Attention-Gated Convolutional Neural Network (A-GCN) to the spatial interpolation of precipitation from gauge observations. The approach is intended to improve spatial interpolation accuracy where observations are insufficient over mountainous terrain. It is explored as a tool to estimate the spatiotemporal change of precipitation over the QTP and its possible mechanisms, which not only provides understanding of climate change and the hydrological cycle on the plateau, but also facilitates the application of a new deep learning algorithm in environmental research.