The impacts of climate change on the environment have become increasingly visible, from rising sea levels to more severe hurricanes and heatwaves. Scientists predict a significant rise in temperature over the next century and a severe intensification of climate change effects over time [1, 2]. According to the Fourth National Climate Assessment [3], the United States will face long-term impacts of climate change, including higher temperatures, longer frost-free seasons, and increased rainfall rates and storm intensity. To tackle these challenges, communities crucially need to enhance their preparedness and their capacity to absorb disasters. The availability of local-scale, long-term projections of climate variables gives communities better insights for monitoring climate change and mitigating its impacts. Wind in particular is a crucial climate field to predict: it plays a critical part during adverse events, and it is a widely leveraged source of renewable energy.
General Circulation Models (GCMs) are physics-based numerical models that simulate the physical processes and dynamics taking place in the atmosphere. Simulated data from GCMs can be obtained at the daily scale for future periods, usually up to the year 2100. A GCM delivers important information and insights about the physical processes and dynamics governing the atmospheric climate system. GCM simulations provide vital assessments of the impacts of climate change on various lifeline entities, including public health, critical infrastructure, and the ecosystem, among others. It is known, however, that these simulations fall short of adequately addressing interdisciplinary climate questions, particularly at local and regional scales. For instance, the spatial scales represented by a GCM may be too coarse for what stakeholders require. Moreover, GCM simulations are known to carry biases relative to the data used in developing the model. Although advances in model formulation are expected to reduce these impediments over time, even improved models are unlikely to address all the scales of interest.
Currently, the resolution of GCM simulations is limited to 100 km or more, due to the computationally expensive processes employed within GCMs. However, regional climate can fluctuate significantly, even within a single GCM grid cell. Local research organizations and practitioners are interested in analyzing such regional fluctuations, especially over specific areas such as watersheds and small islands. As a result, the focus on downscaling climate projections has grown significantly over the past decades, especially with the increasingly visible impacts of climate change. Downscaled projections of climate variables such as wind speed lead to a better analysis of local climate variability [4], and enable better local efforts for climate change mitigation and disaster risk management.
Downscaling of GCM outputs is performed using dynamical [5-7] or statistical approaches [8-10]. Dynamical downscaling approaches mimic the methodology of GCMs by numerically solving the equations that govern the relationships between climate variables, but at a finer scale. These approaches are computationally expensive and require expertise in the physical interactions between climate variables. Statistical downscaling approaches, on the other hand, assume a stationary statistical relationship between local observations (e.g., wind speed at a local station) and large-scale GCM outputs (e.g., simulated wind speed at a coarse scale) [11]. Moreover, statistical approaches assume that local small-scale climate patterns are mainly driven by global large-scale climate patterns [12, 13]. Such approaches are the dominant choice for downscaling in the climate field and often rely on machine learning-based regression techniques [14-16].
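To make the stationarity assumption concrete, the sketch below frames statistical downscaling as a supervised regression problem: coarse-grid GCM values around a station serve as predictors, and the observed station-level wind speed is the predictand. The relationship is fitted on a historical period and then applied unchanged to a held-out period, which is exactly what the stationarity assumption licenses. All data here are synthetic, and a simple least-squares model stands in for the more elaborate regressors cited above; this is an illustration, not the method proposed in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: 1000 daily samples of a coarse GCM wind field on a
# 4x4 grid of cells around the station (16 predictors), and the
# corresponding observed daily wind speed at the station (predictand).
n_days, grid_cells = 1000, 16
X = rng.normal(5.0, 2.0, size=(n_days, grid_cells))       # coarse GCM predictors
true_w = rng.normal(0.0, 0.3, size=grid_cells)
y = X @ true_w + 1.5 + rng.normal(0.0, 0.5, size=n_days)  # station observations

# Fit the statistical relationship on a "historical" period...
X_design = np.hstack([X, np.ones((n_days, 1))])           # add intercept column
w, *_ = np.linalg.lstsq(X_design[:800], y[:800], rcond=None)

# ...then apply it unchanged to a held-out "future" period (stationarity).
y_hat = X_design[800:] @ w
rmse = np.sqrt(np.mean((y_hat - y[800:]) ** 2))
print(f"hold-out RMSE: {rmse:.2f} m/s")
```

In practice the linear map would be replaced by a nonlinear ML regressor, and the predictors would span several climate variables rather than wind speed alone, but the train-on-history, apply-to-future structure is the same.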
In this paper, we present a novel deep learning framework for statistical downscaling, specifically for forecasting the daily average wind speed at the station level using GCM simulations. Our framework, named Wind Convolutional Neural Networks with Transformers, or WCT for short, consists of multi-head convolutional neural networks (CNNs) prefixed to stacked transformers, together with an uncertainty quantification [17, 18] component based on the Monte-Carlo dropout sampling approach [19-21]. We apply WCT to forecast daily wind speed over four stations in New Jersey and Pennsylvania, United States.
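The Monte-Carlo dropout idea behind the uncertainty quantification component can be sketched in a few lines: dropout is kept active at prediction time, and repeated stochastic forward passes through the same network yield a distribution of forecasts whose mean is the point prediction and whose spread quantifies uncertainty. The toy regressor below is a one-hidden-layer network with random, untrained weights standing in for WCT; it is a minimal sketch of the sampling mechanism only, not of the WCT architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the trained network: random, untrained weights.
W1 = rng.normal(size=(16, 64)) * 0.1
b1 = np.zeros(64)
W2 = rng.normal(size=(64, 1)) * 0.1
b2 = np.zeros(1)

def forward(x, p_drop=0.2):
    """One stochastic forward pass: dropout stays ON at prediction time."""
    h = np.maximum(x @ W1 + b1, 0.0)         # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop      # Monte-Carlo dropout mask
    h = h * mask / (1.0 - p_drop)            # inverted-dropout scaling
    return (h @ W2 + b2).ravel()

x = rng.normal(size=(1, 16))                 # one day's predictor vector
samples = np.array([forward(x)[0] for _ in range(200)])

# Sample mean = point forecast; sample spread = predictive uncertainty.
print(f"forecast: {samples.mean():.3f} +/- {samples.std():.3f}")
```

The number of stochastic passes trades compute for a smoother estimate of the predictive distribution; a few hundred passes is a common choice in the MC-dropout literature.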
In [22], we introduced the problem of downscaling daily wind speed. Here, we make new contributions and extend the work in [22] along five directions, listed below.
- We provide more background information related to GCMs, and the use of machine learning for statistical downscaling.
- We adopt GCM simulations as input to WCT instead of the NCEP dataset (which is a proxy of GCM simulations) used in [22].
- We leverage CNNs with attention to perform spatial feature embedding explicitly and more effectively. Specifically, we replace the AIG component used in [22] with multi-head CNNs with self-attention, and replace the single transformer used in [22] with a stack of two transformers.
- We expand the comparative study to include more benchmark methods, and analyze the time and space complexities of WCT and the related machine learning (ML) methods.
- We report future projections of wind speed up to the year 2100, which are absent in [22].
We organize the rest of this paper as follows. Section 2 presents background information and reviews some of the ML-based approaches for statistical downscaling. Section 3 formulates the wind downscaling problem as a multivariate time series forecasting problem, and describes the datasets used in our study. Section 4 details the WCT framework. Section 5 reports experimental results. Section 6 concludes the paper and points out some directions for future research.