Study area
The Amazon basin (https://github.com/gamamo/AmazonBasinLimits) delimited our study area26. We further restricted the analyses to intact forests using the intact forest map of Potapov et al. (2008)25 (http://www.intactforests.org/data.ifl.html) for the year 2020.
TRENDY model simulations
We used simulations of a set of 16 Dynamic Global Vegetation Models (DGVMs), which participated in the past global carbon budgets assessment27, named TRENDY-v11. Because we focus on intact forests, we extracted model outputs from TRENDY-v11 scenario S2 in which CO2 and climate vary with time but which uses a time-invariant “pre-industrial” land use mask. In TRENDY, all models are forced with the same climate forcing, namely CRUJRA, see44–46. We mainly extracted GPP monthly outputs from TRENDY-v11 but we also pulled monthly averages of NPP and heterotrophic respiration to compute NEP from their difference for every model that provided those variables as well. NEP values were only used as is (i.e. did not serve for GBT model training) to investigate their correlation with GPP. TRENDY-v11 model outputs are available for the time period 1901-2021.
Remote sensing estimates
To benchmark TRENDY model outputs, we used four recent GPP products, hereafter referred to as Li and Xiao (2019)8, Bi et al. (2022)47, Wild et al. (2022)48, and Wang et al. (2021)49. Briefly, Li and Xiao (2019) developed a global, high temporal (8-days) and spatial (0.05°) resolutions GPP product based on Orbiting Carbon Observatory-2 SIF data, MODIS and meteorological re-analyses through a data-driven approach over the 2000-2022 period8. Bi et al. (2022) generated a global 0.05°, 8-day dataset for GPP (1992-2020) with a two-leaf light use efficiency model, driven by CRUJRA reanalysis, ESA-CCI land cover and leaf area index from GLOBMAP47. Wild et al. (2022) utilised microwave remote sensing of vegetation optical depth to derive global GPP at moderate spatial (0.25°) and high temporal (8-days) resolutions48 for the period 1988-2020. Finally, Wang et al. (2021) correlated eddy-covariance GPP and AVHRR near-infrared reflectance from the Land Long Term Data Record50 to generate a global long-term (1982-2018) time series at high spatial resolution (0.05°) of monthly GPP. We aggregated all four datasets at the monthly timescale for further analyses.
AI models
We used the extreme gradient boosting (XGBoost51) approach, as implemented in the R-package xgboost52, to reproduce time series of GPP as estimated from DGVMs or from remote sensing. XGBoost is an advanced machine learning algorithm reputed to be highly efficient in terms of model performance and computational speed. Individual GBT models were trained for each DGVM/RS estimate of GPP using the longest possible time series and the largest possible area. We splitted all the data into training (60%), validation (20%) and test (20%) for each grid cell and year. So on average, 2.4 months per year served as validation and test for each grid cell while 7.2 months per year were used for training. The months were randomly attributed to training, validation and test dataset. We optimised/tuned GBT model (hyper)parameters on validation data with the R-package caret53 and 8 resampling iterations, focusing on the maximum decision tree depth and the number of decision trees to grow.
Training variables included monthly averages of climate data (monthly mean, minimum, and maximum temperature, monthly precipitation, VPD, total incoming short- and long-wave radiation), atmospheric CO2 concentration (constant spatially), as well as temporal (year/month) and spatial (latitude/longitude) information. The latter variables were important to include to account for model-specific spatial variability in e.g., soil maps and plant functional type spatial distribution. We computed VPD from sub-daily air temperatures and specific humidity using the R-package Pecan.data.atmosphere54 before taking the monthly average. Similarly, we used sub-daily air temperature values to compute the daily minimum and maximum temperatures and finally calculate the monthly minimum and maximum temperature as the average of all daily minimum and maximum temperatures of a given month.
The CRUJRA meteorological reanalysis, which is used as the driver for TRENDY models, is only updated once a year (typically between April and August), which prevents its use for near-real time forecasting. Therefore we decided to use ERA5 and JRA-55 reanalysis as input data for the GBT models. Doing so we internalised the bias between reanalysis sources into the GBT models, as CRUJRA is constructed by adjusting data from JRA-55 where possible to align with the CRU TS data. We downloaded all ERA5 and JRA-55 climatic drivers available in early June 2024, i.e. until May 2024 and December 2023 respectively.
We performed the same model training and validation described above, independently with ERA5 and JRA-55 reanalysis to check the influence of climatic forcings on the model predictions. We also checked how absolute values and anomalies correlated for both reanalysis sources for the region of interest. For ERA5, we used the 2-metre air temperature variable for the air temperature.
Final GBT model performance was evaluated on test data only and quantified using the root mean square deviation and the square of the Pearson correlation coefficient (R²) between reference (DGVM outputs or RS estimates) and predicted values.
Analyses
Given the different resolutions of the individual TRENDY models and the remote sensing GPP estimates, we reprojected the GBT model features (climate forcings) on each DGVM/RS grid before training with the R-package raster55. To generate spatial ensemble means of GPP, we also reprojected all rasters to a common grid (ERA5) using bilinear interpolations.
When reprojection was needed for the intact forest map, we used bilinear interpolations, introducing decimal numbers. We then limited the analyses to all grid cells with interpolated values larger than 50% intact forest cover. Unless otherwise stated, we limited our analyses to the past 30 years (1994-2023).
For generating model ensemble means, we applied simple means to all RS products and DGVM outputs, giving equal weight to each ensemble member. We used the GBT models for extending RS time series to recent years (e.g., 2023/2024) but also to older times (from 1994 to the actual start of the product time series) to avoid discontinuities when computing ensemble means when not all products were available.
We compared the recent dry and hot episode of 2023-2024 with previous major drought events, namely August 2015-March 2016, and September 1997-April 1998. We delimited those drought events of the past from their precipitation and air temperature anomalies in ERA5.
To assess whether the GBT models were able to accurately predict GPP in case of unprecedented events, we verified how the density distribution of the residuals compared for the test data of the dry and hot episodes of 1997-1998 and 2015-2016 and all other climatic conditions. We also repeated the model training and validation described above using pre-2015 data only and tested the model performance specifically for the 2015-2016 dry and hot episode.
In both RS estimates and DGVM simulations, GPP presents both a seasonality and an increasing long-term trend due to CO2 fertilisation. We detrended the GPP time series by subtracting to the raw data the best linear model estimates from the 1994-2023 time period and its average seasonal variation. We used a linear model for detrending as GPP has been steadily increasing the past decades due to CO2 fertilisation. We detrended the climate variables by subtracting the overall variable average and its seasonal mean to the raw time series. The anomalies were normalised using the monthly standard deviation of the respective variables for the overall time period (1994-2023). For spatial analyses, we applied the detrending method described above to the grid cells individually, while we first averaged the time series for biome-wide analyses. Anomalies were computed on the model ensemble mean and the average of RS products.
To smooth monthly time series, we applied rolling averages with a centred 6-month time window. We tested the effect of this choice by changing the window size between 1 month (no average) and one year and verified how it changed the correlations between variables.
For future trends, we downloaded monthly temperature and precipitation outputs from Earth system models (ESM) available in January 2024 for the models participating in the Coupled Model Intercomparison project 6 (CMIP6)56. We downloaded model outputs for the historical period and the following shared socioeconomic pathways57: SSP1-2.7, SSP2-4.5, SSP3-7.0, and SSP5-8.5. We restricted the analysis to the models which had model outputs available for every shared socioeconomic pathway. We then compared the regional monthly mean temperature and the mean annual precipitation time series predicted by each ESM for the recent historical period (1985-2014) with those of ERA5, and only kept the 10 with the lowest root mean square error for each variable. We then averaged those 10 models using the inverse of the squared bias as weight for both the historical period and the future simulations. Temperature and precipitation anomalies were defined using 1985-2014 or 1994-2023 as reference for the historical period or the future scenarios, respectively, since CMIP6 historical simulations stop in 2014.
Temperature anomaly outliers were identified following the classical outlier 1.5 interquartile (IQR) rule58: any observation that was smaller than 1.5 IQR below the first quartile or larger than 1.5 IQR above the third quartile was considered as a cold or hot outlier, respectively.