The GRIT river network
We used the recently-developed Global RIver Topology (GRIT) river network26,36 as the basis for developing the global bankfull discharge model. GRIT is the first global, high-resolution bifurcating river network, developed from observation-based river data27 and an improved high-resolution global Digital Elevation Model (DEM): the Forest and Building removed DEM (FABDEM25). The centrelines were generated from the GRWL and the OpenStreetMap water layer was used to maximize the topographic accuracy of the river network outside of the GRWL mask. GRIT is pioneering the representation of bifurcating and multi-threaded rivers which are missing from previous global river networks such as HydroSheds37 and MERIT Hydro38. GRIT represents all rivers with a drainage area larger than 50 km2 or rivers with widths wider than 30 m globally, and the average length of a GRIT river reach is 1 km. We estimated the bankfull discharge for all the river reaches wider than 30 m in the GRIT network except Polar Köppen climate regions30, where river channels are likely to be frozen for a large proportion of the year.
Bankfull discharge reference data
We assume that the point-based bankfull discharge values, typically measured at gauging stations, are representative of the nearest river reach. The observed bankfull discharge locations were matched to river reaches on the GRIT network using a snapping approach, pairing the QBFobs with the nearest river reach, defined by the distance from the QBFobs to the river centreline. The QBFobs values were discarded if this distance exceeded 1 km, or the drainage area of the QBFobs did not match with the drainage area of the paired river reach (deviated by more than 20%). We obtained the climate region of the observed bankfull discharge sites using the Köppen climate zone dataset30. Considering the high seasonal variation of the river water status (free-flowing or frozen), sites in the Polar region were excluded. After applying the filtering steps, this left a total of 2,657 QBFobs values to train and test the RF model. These QBFobs spread across four climate regions (minimum number of 156 measurements), with bankfull discharge values ranging from 10 to 42,235 m3s-1 (median value of 165 m3s-1). Temporal changes in bankfull river discharge associated with river channel evolution were not considered in this paper due to the very small number of sites where such observations exist. More details of this dataset can be found in the Supplementary Fig. S2, Text S1.
Bankfull discharge predictors
A selection of hydrological, landscape and climatic indicators known to control river conveyance capacity (bankfull discharge) were identified based on the literature33,39,40 and used as predictors for estimating the bankfull discharge. These indicators include drainage area, annual mean discharge, annual total runoff, river width at multiple inundation frequencies, river sinuosity, reservoir capacity, elevation, slope, water table depth, soil moisture and soil thickness, Leaf Area Index (LAI), Fraction of Absorbed Photosynthetically Active Radiation (FAPAR), impervious percentage area, and a suite of climate indicators (temperature, precipitation, aridity index, snow cover fraction), detailed below.
Depending on the indicator type, different aggregation approaches were used to process these indicators as GRIT river reach predictors. These approaches include Catchment Sum (CS: total value accumulated from the upstream catchment), Catchment Average (CA: mean of all raster values within the catchment), Catchment Majority (CM: the most common value among all raster values within the catchment), Reach Median (RM: median value of all raster values within the river reach), Reach Average (RA: mean value of all raster values within the river reach). It is worth noting that at bifurcated rivers, the calculation of CS follows the partitioning approach in the GRIT network26; no extra processing is done for the RM and the RA as each channel of a bifurcated river is treated as an individual river reach. Some of the predictors were available over multiple years, so time averaged values (mean) were computed. The predictors and processing details are described below (Supplementary Table S1).
Drainage area has a controlling effect on river channel size and discharge6,20,41. The upstream drainage area was estimated at each point along the river network from the river attributes of the GRIT network. The predictors GLOFAS_v4.0_discharge and GLOFAS_v4.0_runoff are re-analysis values of global annual mean discharge and annual total runoff available at 0.05° (° means degree of latitude and longitude throughout the manuscript unless otherwise stated) resolution for the period 1993 to 2022 (~ 5 km at the equator)42. From these data, we computed the time-average (mean) discharge value and mean annual total runoff between 1993 and 2022. Values of the time-averaged discharge and annual runoff were extracted for each GRIT reach and we took the spatial median value (RM) at each reach as the predictor.
The GRWL dataset is an observation-based, global compilation of river planform geometry at mean annual discharge27, derived from thousands of Landsat images acquired in the months during which the mean annual discharge occurs at each location. Here, the river width attribute was calculated from the GRWL river mask by taking the reach median (RM) perpendicular distance from the GRIT river centre line to the river bank26. We only considered rivers which overlap with the GRWL river mask (grwl_overlap > = 0.5). To take account of the uncertainty of the GRIT river width in our model, we included the overlap ratio36 between GRIT river and the GRWL river mask as an additional predictor.
The Wetted River Widths (WRWs) dataset was derived from the Landsat surface water occurrence data43. We use the term “wetted river widths” to describe the dynamic nature of the river width from various discharge scenarios. The Landsat surface water occurrence data is a global surface water mask which indicates the ratio between the number of months in which water was detected and the total number of valid observations over a 37-year period (1984–2021). We calculated the WRWs as the wetted area connected with the reach divided by the overlapping length between that river and the water layer. The calculation is applied to each river reach directly, resulting in the WRWs predictors. We extract these values for different water occurrence values (50%, 40%, 30%, 20%, 10%, 1%), where a smaller occurrence value indicates larger discharge events. We include several frequencies of occurrence because they provide different information on the surface extent (and morphology) of the channel based on the frequency with which the water goes out of bank in each location. The RF model is known to be relatively insensitive to collinear predictors44, supporting this decision. We show the comparison between the WRWs at 1% occurrence and the GRWL in the Supplementary Fig. S6.
River sinuosity controls channel gradient and hence impacts the flow conveyance5. This was calculated as the river planform distance divided by the straight-line distance45 for each segment in the GRIT network, using the GRASS GIS line sinuosity tool (https://grass.osgeo.org/grass83/manuals/v.to.db.html).
The Global Reservoir and Dam (GRanD) database is a global compilation of existing (up until 2016) dam and reservoir data. GRanD includes dams and reservoirs with capacity greater than 0.1 km3, with a cumulative storage capacity of 6864 km3 46. The reservoir capacity predictor was computed by mapping the reservoir location (vector) to the nearest river reach and employing the upstream accumulated (CS) reservoir capacity as the reservoir predictor.
Catchment averaged (mean) (CA) elevation and slope were used for each river reach. They were derived based on the recent Forest And Buildings removed DEM (FABDEM, ~ 30 m spatial resolution)25. This DEM is an artefact error-removed DEM based on the recent released 30 m global DEM, Copernicus DEM. The Copernicus DEM was resampled from the high-solution (12 m) Interferometry Synthetic Aperture Radar (InSAR) acquired global DEM, which has been demonstrated to be more accurate than previous global DEMs.
Climate predictors (Temperature, Annual precipitation, Aridity Index, Snow cover fraction) were averaged (mean) within the direct contributing upstream drainage area of each GRIT reach (CA). Here, we use the MSWX temperature and precipitation data from 1979–2022 (spatial resolution of 0.1°) to compute the annual mean temperature and annual mean precipitation47. The temperature variation predictor (predictor: MSWX_temperature_range) was calculated using the monthly maximum temperature minus the monthly minimum temperature, averaged (mean) over all months.
Water table depth and soil moisture data were obtained from SOIL-WATERGRIDS45. They are monthly data at a spatial resolution of 0.25° over the period 1970–2014. We used the mean depth of the highest monthly water table and of the lowest monthly water table as predictors Water Table Depth-High: WTDH, Water Table Depth-Low: WTDL, respectively. The monthly mean soil moisture was extracted at depth ranges of 0–30 cm, 30–60 cm, and 60–100 cm in the root zone based on all months of the period 1970–2014 (predictors: SI_0_30, SI_30_60, SI_60_100). Additionally, we used the mean soil thickness over the period of 1900 to 201548. The above predictors were computed as river catchment mean values (CA).
The aridity index49 (version 3) has a spatial resolution of 30” (~ 1 km). Time-averaged values of the aridity index from 1970–2000 were computed, then they are averaged over each GRIT river catchment as the predictor (CA). The Moderate Resolution Imaging Spectroradiometer (MODIS) daily snow cover fraction dataset from 2000–2019 with a spatial resolution of ~ 1 km50 was used to compute the percentile values used for the snow cover fraction predictors. We computed the time-averaged value at the 5th, 50th, and 95th percentile of the snow cover fraction as the three predictors. The averaged value (mean) over each GRIT reach catchment was used as the predictor (CA). The Köppen-Geiger climate category30 is also used as a predictor where the climate region is defined by the most common Köppen value within the catchment.
Vegetation in the upstream catchment controls evaporation and infiltration rates, influencing stream discharge and the resulting river channel network51,52. Here we used the LAI as a proxy for vegetation in the catchment. Leaf Area Index (LAI) tracks the one-sided green leaf area per unit of ground surface area, while the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) quantifies the solar radiation absorbed by plants within the photosynthetically active radiation spectral region. Gridded daily data from 1981–2022 of LAI and FAPAR were accessed from NOAA Climate Data Record of Advanced Very High Resolution Radiometer53. Both datasets have a spatial resolution of 0.05°. The time-averaged value over the full study period was computed for each GRIT river catchment. Averaged value (mean) over each GRIT reach catchment was used as the predictor (CA).
Urbanisation may also affect runoff rates, and therefore we also used the percentage of impervious area as a predictor. We obtained the urban areas from the gridded map of the land cover dataset derived from multiple satellite images acquired for the year 2010, then calculated the ratio between urban areas and the total area per grid. This ratio represents the impervious percentage per pixel at a spatial resolution of 0.25°54. The catchment averaged (CA) value of this data is used as the predictor.
Random Forest model
Benefiting from a bootstrap aggregating technique, RF models can assimilate information from a large number of predictors to estimate the target without overfitting. This is achieved by creating multiple regression models from different subsets of the training samples and combining their outputs to compute the estimation55,56. The importance score is defined as the total decrease in node impurity resulting from splitting the predictor variables, averaged over all trees55, measured by the residual sum of squares. The importance indicates the relative significance of each predictor for predicting the target variable (Supplementary Fig. S3). We used the scikit-learn package (version 1.1.2) in python57 to construct our RF model.
Eighty percent of the observations were randomly as the training data. We enforced a proportional selection of the training samples on top of the randomness, meaning that the magnitude and climate region distribution of training samples are proportional to the total distribution in the observation dataset. We achieved this by using the class_weight in scikit-learn package. A five-fold cross-validation was used in the training process to optimize the model parameters (max_features, min_samples_leaf, n_estimators), ultimately achieving the highest R2 value in predicting the bankfull discharge. To ensure the robustness of the cross-validation, we ran the cross-validation 10 times (samples used in the cross-validation are different in each run) and report the averaged (mean) result as the final performance (Supplementary Fig. S3). We adopted the best-performing model from the multiple cross-validation runs to estimate the QBF for global ungauged rivers.
Flood frequency analysis
In order to estimate the return period of the QBF, long time series of discharge records have to be used. We applied a three-step process to obtain the return period of river bankfull discharge and the value of widely used bankfull discharge assumption: 2-year return period flow. We relied on a global gauging station dataset to identify the discharge records associated with the QBFobs and QBFest. We first selected gauging stations with high-quality discharge records (a minimum of 10 complete years of data, where each complete year has at least 90% of the daily discharge records). We then paired the selected gauging stations with the QBFobs by the gauging site number, and with the QBFest based on the snapping approach mentioned in the bankfull discharge reference data section above. Our initial source of global gauging station dataset includes a total number of 22,538 stations collected from various agencies for the period 1950-202158. Eventually, we paired 1,818 sites of QBFobs, 9,058 sites of QBFest with the qualified discharge records (1,545 sites are overlapping). We estimated the return period of the QBFobs and the QBFest as well as the value of bankfull discharge assumption for these sites (QRP2) using flood frequency analysis.
The return period estimation, and estimation of various return periods of discharge (QRP2), were conducted using Flood Frequency Analysis (FFA). FFA establishes the relationship between the flood magnitude and frequency of occurrence (or return period). It is typically used to estimate the flood magnitude of design floods (such as the 2-year return period of discharge or more extreme ones). To obtain the relationship between the flood magnitude and frequency of occurrence, flood events are first selected from long-time series of discharge records, then the probability of occurrence is estimated based on the statistics of selected flood events. To select the flood events, there are two commonly used approaches: the Annual Maximum Series (AMS) approach, which uses the maximum discharge of each year, and the Peak-Over-Threshold (POT) approach, which employs all daily discharge values above a specified threshold59. Because the AMS only considers the maximum discharge in each year, it is limited to estimating flows with a return period of one year or greater. However, some rivers may reach bankfull more frequently than once per year on average. For this reason, we selected the POT approach. The POT method was shown to perform better than the AMS approach in at-site FFA60.
We selected the flood events based on the POT method offered by the floodnetRfa R package (https://github.com/floodnetProject16/floodnetRfa) and used a Locally Weighted Scatterplot Smoothing (Loess) curve61 to extract the bankfull discharge return period (QBFobs−RP and QBFest−RP) from the relationship between the recurrence probability and discharge. In the POT method, the user must determine the appropriate threshold and the adjacent flood events separation rule to select independent flood events62. Here, to be able to estimate the return period of bankfull discharge, the event identification threshold must be smaller than QBF. Rather than manually identifying this threshold, we systematically assessed four different thresholds, namely 10%, 25%, 50%, and 75% of the bankfull discharge value. We used the adjacent peaks separation rule recommended by the Water Research Council of the United States63. We fitted a Loess curve to the selected flood events and estimated the QBFobs−RP and QBFest−RP for each threshold. Loess is a non-parametric method that fits a smooth curve to the data, particularly useful for capturing the underlying patterns of non-linear relationships. In our case, we used a Loess span of 0.5, which provides a moderate degree of smoothing.
Our analysis of the four bankfull discharge return periods (sensitivity tested using four different event thresholds) at each gauging station revealed that some sites with low frequency of QBF return periods (> 10 years) could also have very large variation in return period (Supplementary Figs. S7-8). We thus kept only those sites where the estimated QBF return period was robust to the choice of the peak threshold, with no greater than 50% variation among the four estimated QBF return periods (QBF−RP, from the four sensitivity-testing thresholds above). The variation was calculated as the maximum QBF−RP minus the minimum QBF−RP, divided by the mean QBF−RP. We chose this value of 50% variation to reduce uncertainty and avoid removing all the sites with relatively large return periods. Details of the relationship between the variation in the estimated return period and the number of events exceeding QBF can be found in Supplementary Figs. S7-8. We averaged the QBF−RP estimated from the four sensitivity thresholds as the final return period. Eventually, we obtained the QBFest−RP for 8,519 sites, among which 1,545 sites has QBFobs−RP (Fig. 2).
Using the above selected flood events, we estimated the discharge at the 2-year return period. Again, a Loess curve is used here. Averaged values from all the thresholds were used as the final QRP2. We calculated the bias of 2-year return period of discharge by comparing to the QBFobs at 1,725 sites (Fig. 3).