Data
Yield data
To estimate the crop exposure to snow droughts, we used the Global Dataset of Historical Yield (GDHY)47 for winter wheat yields, assessing its sensitivity to snow drought. The GDHY dataset provides gridded winter wheat yields for the period of 1982–2016, with a spatial resolution of 0.5°. It is grounded in the crop calendar48, which provides relatively accurate planting and harvesting dates for each winter wheat cultivation globally, at a 0.5° spatial resolution. Comprising agricultural census statistics and satellite remote sensing, the GDHY dataset has been widely used as a primary source in recent global crop-climate studies49–52. In addition, to enhance the robustness of the sensitivity of winter wheat yield to snow droughts, we also used an alternative winter wheat yield dataset at county-level obtained from the USDA NASS QuickStats35 for the period of 1980–2018 (Supplementary Fig. S14 and Supplementary Fig. S15).
Climate data
Climate data, including hourly 2-m air temperatures (°C), daily precipitation (mm), and monthly vapor pressure deficit (VPD; hPa) computed from dew point temperature (°C), were obtained from the ERA5-Land dataset from 1960 to 2020. ERA5-Land is a global atmospheric reanalysis product produced by the European Centre for Medium-Range Weather Forecasts (ECMWF)53. The center has recently made substantial improvements in data accuracy, enhancing its applicability to various land surface applications. Daily root-zone soil moisture was obtained from the Global Land Evaporation Amsterdam Model (GLEAM) dataset54. The GLEAM datasets are observationally constrained and have been widely used to analyze global and regional soil moisture changes55.
To assess the impacts of extreme heat stress, extreme freezing stress, dry stress, and heavy rainfall on wheat yields during snow drought conditions, we analyzed climate extreme events across growing seasons (fall, winter, and spring) due to their adverse impacts on winter wheat growth56. We calculated the cumulative exposures to growing degree days between the base and optimum growth temperature thresholds (GDD, ℃d, Eq. (1)), extreme growing degree days (EDD, ℃d, Eq. (2)) above the optimum growing temperature thresholds, and freezing growing degree days (FDD, ℃d, Eq. (3)) below the freezing growing temperature thresholds over the winter wheat growing season. GDDs, EDDs and FDDs are calculated using the following formulas57:
$$\:\text{G}\text{D}\text{D}=\sum\:_{h=1}^{N}{DD}_{h};\:{DD}_{h}=\:\left\{\begin{array}{c}\frac{{T}_{opt}-{T}_{base}}{24},\:\:{T}_{h\:}>\:{T}_{opt}\:\:\\\:\frac{{T}_{h}-{T}_{base}}{24},{\:{T}_{base}\le\:T}_{h\:}\le\:\:{T}_{opt}\\\:0,{T}_{h\:}<\:{T}_{base}\end{array}\right.$$
1
$$\:\text{E}\text{D}\text{D}=\sum\:_{h=1}^{N}{DD}_{h};\:{DD}_{h}=\:\left\{\begin{array}{c}0,{\:\:\:\:\:\:\:\:\:T}_{h\:}\le\:\:{T}_{opt}\:\\\:\frac{{T}_{h}-{T}_{opt}}{24},\:{T}_{h\:}>\:{T}_{opt}\end{array}\right.$$
2
$$\:\text{F}\text{D}\text{D}=\sum\:_{h=1}^{N}{DD}_{h};\:{DD}_{h}=\:\left\{\begin{array}{c}0,{\:\:\:\:\:\:\:\:\:\:\:\:\:\:T}_{h\:}>\:{T}_{frez}\:\\\:\frac{{T}_{frez}-{T}_{h}}{24},\:{T}_{h\:}\le\:\:{T}_{frez}\end{array}\right.$$
3
Where hourly temperature (\(\:{T}_{h\:}\)) was obtained from the ERA5-Land and GLDAS datasets, respectively. \(\:{T}_{base}\) and \(\:{T}_{opt}\) are the base and optimum growing temperature thresholds specific to each growing phase of wheat (Supplementary Table S2). \(\:{T}_{frez}\) is the temperature causes freeze injury (Supplementary Table S2).
In addition to snow effect, rainfall act as an important water supply to affect winter wheat growth. In this study, precipitation was partitioned into rainfall and snow to represent water supply in different forms on the basis of daily temperature using the following empirical model:
$$\:\text{R}\text{a}\text{i}\text{n}\text{f}\text{a}\text{l}\text{l}\:=\left\{\begin{array}{c}Pre,\:\:{T}_{d}\:>\:{T}_{rain}\\\:Pre\left(\frac{{T}_{a}-{T}_{snow}}{{T}_{rain}-{T}_{snow}}\right),\:\:{{T}_{rain}\ge\:T}_{d}\ge\:{T}_{snow}\\\:0,\:\:{T}_{d}\le\:{T}_{snow}\:\:\:\:\:\end{array}\right.$$
4
Where Pre is daily precipitation (mm) and \(\:{T}_{d}\) is average daily air temperature (℃). Partitioning thresholds \(\:{T}_{rain}\) and \(\:{T}_{snow}\) are 3.0 and \(\:-\)1.0 ℃, respectively5,58. To estimate the effects on winter wheat yields due to specific rainfall extremes, we used two extreme rainfall indices (DrySpell and Max.5D Rain)59 to represent the effect of rainfall deficit and excess on crop yields. The DrySpell (days) is defined as the maximum number of consecutive days with no rainfall in each season. The Max.5D Rain (mm) is defined as the maximum accumulative rainfall in a five-day period in each season.
Snow data
We identified snow droughts using two datasets: ERA5-Land53 and GLDAS60. The choice to use these reanalysis datasets is due to their inclusion of essential snow variables (snow depth and snow depth water equivalent) with global winter wheat croplands at the daily timescale, enabling a comprehensive and uniform analysis across different regions and across-season periods2. More importantly, the daily snow water equivalent of ERA5-Land agrees better with station observations compared with other datasets, making it an ideal dataset to characterize snow droughts23,61. We used the ERA5-Land dataset for our main analysis, and assessed the sensitivity of our results with the alternative GLDAS dataset.
Identification of snow drought
We calculated daily snow water equivalent (SWE; mm) from 1 October to 31 May for the entire water year 1948–2022, using a 7-day moving window9. We then restricted the analysis to winter wheat croplands in the Northern Hemisphere during the winter season, particularly for January-February-March (JFM) accumulations of daily SWE, to compute the standardized SWE index (SWEI). This is because the date of 1 April is commonly used in water supply management to distinguish winter snow accumulations and spring snowmelt, representing as a proxy for the date of maximum annual SWE8,22,62.
We calculated the SWEI at the yearly scale, choosing the 1961‒1990 as a reference baseline, by fitting a Gamma distribution for each year of snow dynamics, at each grid cell. The SWEI thus provides a comprehensive measure of snow balance over consecutive years, facilitating active monitoring of snow droughts (e.g., SWE on April 1; Supplementary Fig. S4). The SWEI is also useful for finer-scale analyses of crop-snow response corresponding to the occurrence of snow droughts8. Here, snow drought is defined as the SWEI values below − 0.87,63.
Trends of snow drought frequency
Grid cells with SWEI ≤ − 0.8 (or SWEI ≤ − 1 in Supplementary Fig. S2) are defined as snow drought-affected croplands, indicating potential impacts on crop growth due to snow deficits. To address temporal variations of snow drought frequency across winter wheat croplands, we split the data from the entire 1960‒2020 period into 50 10-year blocks (1960‒1970, 1961‒1971 … 2010‒2020). We identified the frequency of snow droughts independently by 3 × 3 data points if more than 50 data points were included. We defined the slope of trends in snow drought frequency estimated from the Theil-sen regression. We then used the Mann-Kendall’s test64 to detect the statistical significance of the trend in snow drought frequency (Fig. 1), which did not require data with a normal distribution. To ensure the robustness of the 10-year blocks, we also assessed trends of snow drought frequency in 15-year blocks and found no significant differences (Supplementary Fig. S3).
Data pre-processing
The SWE, snow depth, soil moisture, VPD, precipitation, and temperature data, obtained from ERA5-Land, GLDAS, and GLEAM datasets, were re-gridded to the same resolution of 0.5° × 0.5° using the bilinear interpolation. In all experiments and for all hydro-climatic variables, we selected growing-season data based on the crop calendar dataset48 for temporal consistency. The winter wheat growing season, defined as the number of days between the planting date and harvesting date, remains consistent within each growth cycle from 1982 to 2016. We aggregated the daily energy-related (i.e., VPD and temperature) values to growing-season averages and the daily water-related (i.e., soil moisture and rainfall) values to growing-season sums. Daily SWE data were aggregated using 3-month (JFM) sum to calculate yearly SWEI to match yearly yield data. Climate extreme indices of GDD, EDD, FDD, DrySpell, and Max.5D Rain were calculated in fall, winter, and spring, respectively (Supplementary Table S3). We restricted the analysis to winter wheat croplands in the Northern Hemisphere.
Seasonality and long-term trends were removed to obtain the anomaly of every single yield and climate extremes indices by subtracting the long-term mean monthly signals and by subtracting a locally weighted scatterplot smoothing regression (LOESS) with a smoothing span of 0.3, respectively. LOESS is one of well-known de-trending methods for studying crop-climate interactions to remove the long-term effect of technological improvements27,49,51. In this way, we excluded long-term trends derived by changes in the equilibrium state, such as long-term successional cycles and human frequent management. We also excluded regions without snow-affected winter wheat croplands and focus specifically on the short-term crop yield responses to snow anomalies.
Detection of causal relationships between winter wheat yields and snow droughts
Convergent cross mapping (CCM) is a powerful method that can help distinguish causality from spurious correlation in time series of non-linear dynamical systems28,55. In CCM, causality is detected by measuring the extent to which the sign of the affected variable Y (e.g., yield anomalies) reliably estimates the states of a causal variable X (e.g., SWEI). That is, if variable SWEI is influencing winter wheat yield, then, based on the generalized Takens’ theorem28,65, the causal variable SWEI can be recovered from the historical record of the affected variable yields. The skill of cross mapping is defined as the coefficient (ρ) of the Pearson correlation between predictions and observations of SWEI. If the ρ increases with the length of time series and convergence is present, then the causal effect of SWEI on yields can be inferred29.
Overall sensitivity of winter wheat yield to SWEI
Wheat yields are affected not only by snow droughts, but also by the occurrence of soil droughts, freezing stress, spring frosts, and severe heats6,31,66. To examine the sensitivity of winter wheat yield to snow anomalies, we first trained random forest models and then employed explainable machine learning (SHapley Additive exPlanations; SHAP) to isolate the contribution of SWEI to yield anomalies from the influence of other climate extremes stress (Supplementary Fig. S6). Random forest67 is one of the data-driven models based on a bootstrap aggregating strategy for improving the stability of model results, and it requires no statistical assumptions on predictors and target variables, given sufficient data. SHAP 30 is a game theoretic approach to explain the outputs of the random forest model by accounting for contributions of individual variables to the overall prediction. We treated the yield anomaly as the target variable and corresponding season climate extreme indices, including season anomalies of GDD, EDD, FDD, DrySpell, and Max.5D Rain, as well as SWEI are the predictors.
We collected all predictors and target data during 1982‒2016 from one grid cell and the surrounding grid cells (3×3 shape) to train a model if more than 50 data points were included. We removed grid cells that model performance worse than the mean of training data itself using OOB R2 < 068. Generally, 22%‒71% of the yield variability can be explained by relevant climate extremes variability (Supplementary Fig. S7).
For each trained model, we employed the SHAP dependence method to isolate the marginal contributions of SWEI on the yield anomaly. We defined the overall sensitivity as the slope estimated from the Theil-sen regression between SHAP dependence for yield and SWEI, by assuming that the grid cell-level interaction between yield anomaly and SWEI is nearly linear. It is important to note that because sensitivity is inferred through linear regression, it may not capture the entirety of the interactions between yield and SWEI for each grid cell. This method combines the benefits of bootstrap aggregating and non-distribution assumption through random forest modeling, as well as the advantages of global interpretations being consistent with the local explanations in the SHAP algorithm, hence strengthening the robustness of the results than using traditional statistical methods68.
Trends of sensitivity of winter wheat yield to SWEI
Grid cells with non-significant (P ≥ 0.1) overall sensitivities are defined as non-snow controlled regions, meaning that energy-related variables such as temperature, radiation, and VPD could dominantly control crop growth, and the detected dependence on water is likely due to other alternative agriculture water. Therefore, we excluded non-significant grid cells (P ≥ 0.1) for studying changes in crop-snow relationships. To address temporal variations in winter wheat yield sensitivity to SWEI, we split the data from the entire 1982‒2016 period into 25 10-year blocks (1982‒1992, 1983‒1993…2006‒2016). We independently trained random forests models again for each 3 × 3 data point if more than 25 data points were included, inferring temporal sensitivity through the SHAP and Theil-sen regression, assuming that the grid cell-level interaction between yield and SWEI within 10-year blocks was nearly linear (Supplementary Fig. S6).
We excluded grid cells with OOB R2 < 0 and increasing thresholds of OOB R2 did not affect our main conclusions (Supplementary Fig. S10). We also used the Mann-Kendall’s test to detect the statistical significance of the trend in yield sensitivity to SWEI (\(\:{T}_{sen}\)). To confirm the 10-year split would not bias results, we examined the \(\:{T}_{sen}\) using 15-year blocks and found no significant differences (Supplementary Fig. S11). To confirm the 3×3 split does not bias results, we tested the \(\:{T}_{sen}\) based on 5×5 data point and found similar results (Supplementary Fig. S12).
Attribution analysis
To understand changes in \(\:{T}_{sen}\), we used random forests and the SHAP attribution method to predict the \(\:{T}_{sen}\). We treated the \(\:{T}_{sen}\) as the target variable, and multiple relevant hydro-climatic trends (i.e., GDD, EDD, FDD, DrySpell, and Max.5D Rain, total precipitation, VPD, and soil moisture) and snow drought frequency as predictors to train a random forest model68,69.
Then we employed SHAP values to quantify marginal contributions of each single factor trends on \(\:{T}_{sen}\) and then rank the variable importance by the sum of absolute contributions across all grid cells. We ensured that the model performed effectively by maintaining a cross-validation out-of-bag R2 higher than 0.567,69. After identifying the dominant factors for \(\:{T}_{sen}\) (Supplementary Fig. S16), we presented the combined impact from the top four important factors which were GDD trends, VPD trends, total precipitation trends, and Max.5D Rain trends to elucidate potential mechanisms across winter wheat croplands (Fig. 4).