Delineating the concept of REDD+
As we saw above, REDD+ is typically seen as a prototype of action (i.e., a means) that nonetheless remains described exclusively by its intended outcome of reduced emissions (i.e., an end). This is fundamentally different from other interventions, for instance “protected areas” or “forest certification”, whose labels describe means, not ends. Observers can thus come to confuse a model for action (the alleged market-based offsetting strategy of REDD+) with an expected final goal (of having forests store more carbon).35
In this subsection, we thus explicitly walk through the typical stages and assumptions underlying a REDD + intervention, using a Theory of Change (ToC) approach, designed for causally linking the stages of inputs, treatments, outputs, outcomes and impacts.43 Figure S2 (SI) outlines these stages going from left to right, with key critical assumptions flagged in bubble shapes.
As for inputs, REDD + is directly triggered by, and thus essentially dependent on the presence of external finance flows, be it from global markets for carbon credits (as originally envisaged), or from donors: bilateral development or environmental donors (such as Norway’s NICFI), multilaterals with a climate mandate (e.g. the Global Environment Facility or the Green Climate Fund), and private-sector non-market flows for direct emission offsets, based on some notion of corporate social responsibility.
However, serious claims for REDD+ achievements can ultimately only be made if knowledge about pre-existing carbon stocks, land-use trends, key drivers, and the stakeholders triggering forest loss (and protection) can jointly be merged into a credible baseline: what would have happened under the laissez-faire assumption of ‘no REDD+ intervention’? Notably, a proper assessment of the levels of, and changes in, threat is essential: if threats from deforestation drivers are rising, treatment may have to be intensified. Conversely, if a projected threat does not materialize at all, then neither the dynamic counterfactual nor the project will exhibit any deforestation, so the measured difference between the two is nil.
REDD+ treatments are highly heterogeneous in their composition. We thus distinguish between the subcategories of incentives, disincentives, and enabling measures.44 First, some incentives are invariably present in REDD+, either as a general local benefit-sharing mechanism or as compensation for the opportunity costs of newly introduced or newly enforced restrictions on forest use or on conversion to alternative land uses. Incentives can either be conditioned upon compliance with certain land-use rules (e.g. PES-type contracts), or take the form of unconditional investments in alternative, environmentally more benign livelihoods, social sectors (health, education), etc. Second, REDD+ interventions often also entail disincentives, through newly introduced restrictions or a more thorough monitoring and sanctioning of non-compliance with already existing ones. Typically, REDD+ has thus included both carrots and sticks. Third, enabling measures, as a residual category, include tools such as securing the free, prior and informed consent (FPIC) of local people for their participation in REDD+, a clarification of land tenure and access rules, etc.
Many real-world REDD + projects and programmes, such as the Bolsa Floresta Programme in Brazil’s Amazonas State (Cisneros et al. 2019), or the Sustainable Settlements in the Amazon (PAS) project in the Transamazon region of Pará State (Simonet et al. 2019; Carrilho et al. 2022) have been using the full spectrum of conditional and unconditional incentives, disincentives, and enabling measures to reach their goals. They used these pilot interventions to experiment with different components, but also with an underlying belief that holistic, locally customized approaches carried higher probabilities of success, especially in market-remote, cash-strapped frontier regions. Unsurprisingly, many REDD + projects are in their holistic range of actions ‘ICDP-like’, with a predominant focus on non-conditional livelihood enhancements.9,34 For the same reason, REDD + projects have also had much to learn from ICDPs.45
Public PES programmes with a partial focus on forest carbon goals constitute a second type of intervention. Often, carbon financing has helped to boost the funding of these national-level, or at least larger-scale programmes. Costa Rica’s PSA, Peru’s National Forest Conservation Programme, and Ecuador’s Socio Bosque all constitute such examples, although the latter two combined PES with ICDP components (Giudice et al. 2019; Jones and Lewis 2015). Hence, with forest carbon enhancement for climate change mitigation being flagged as an explicit goal, these PES-based interventions need to be included as another pathway of implementing REDD+.
Outputs are to be understood as the immediate, often short-term results of the ‘treatment’: the treated recipients understand the goals and modalities of the intervention, the rules of the game (incl. land and resource tenure) are clear, and (dis)incentives are well applied. Delivered outputs imply that stakeholder motivations have been successfully aligned with the goals of the intervention. For this to occur, treatments need to have been well designed and carefully implemented. From the PES literature, we know that spatial targeting in the selection of participants and their to-be-treated land areas has been important along two complementary dimensions: a) the site-specific environmental service density (here: forest carbon stocks per hectare), and b) the projected on-site threat (here: of deforestation/degradation) endangering that stock over time. Also, customization of the benefits (e.g. multiple payment levels) can help make the intervention more cost-effective and equitable.32,33
The outcome level is where the REDD + rubber hits the road: do critical stakeholders make the required behavioural on-the-ground changes? That is, do they reduce forest clearing, charcoal making, or timber harvesting in the REDD + required manner (environmental outcomes)? Similarly, do income, consumption, and perceptions of welfare and security increase among those targeted stakeholders (socioeconomic outcomes)? These are measurable indicators that can also be included in impact evaluations.
The final transition towards impacts – the overarching primary carbon-related goal of reduced forest-based emissions, as well as ethically and politically important side-objectives related to biodiversity, human wellbeing and equity, indigenous land rights etc. – still entails some subtleties.
First, a reality check concerns the extent to which intervention-targeted stakeholders and deforestation drivers have been adequately aligned. For instance, many REDD+ projects focus on getting smallholders to reduce their deforestation, but a local surge in land grabbing by more powerful external agents might render these efforts less fruitful in terms of mitigating deforestation.
Second, income and consumption outcomes trigger development feedback loops on the final impacts. Rebound effects refer to the fact that treatment-induced changes in household incomes may also change consumption patterns (e.g., higher incomes stimulating meat and dairy consumption) that per se change ecological footprints. Magnet effects refer to the potential of these income changes to attract outside migrants, e.g. through successful employment creation in a REDD + project. Pull migration could have a bearing on land use, as migrants open up new land plots for subsistence agriculture. Both effects are well-established in the PES literature.31
Third, the goal of mitigating climate change is both universal and perpetual. Classical concerns vis-à-vis REDD + projects are thus to what extent these time- and place-bound interventions contribute to the hoped-for universal and perpetual impacts. As for permanence, the impact of a time-limited treatment on carbon stocks may also only be transitory – though as such still important for mitigating climate change in the short run. Conversely, to the extent the treatment triggers desirable structural changes at the output and outcomes level, permanence might be increased.
Likewise, a REDD+ treatment may not only reduce on-site deforestation, but also push some of these pressures outside of the intervention area, a phenomenon known as leakage. This spillover effect will typically diminish, though not fully erase, the (universal) REDD+ mitigation impact.46,47 The larger the scale of the REDD+ intervention, the less leakage we should expect, which is a key argument for favouring national programmes over REDD+ projects. The size of leakage in conservation incentive programmes is seldom quantified.31 For high-value products sold on international markets, such as precious timbers, leakage may be exceptionally high.48 In general, the higher the price elasticity and the geographical mobility on output and input markets (incl. access to land), the more leakage we should expect.31
Delimitation
As briefly mentioned above, our aim is to take stock of the currently available evidence from rigorous quantitative impact evaluations of REDD+ interventions. This means that we need to apply a number of a priori inclusion filters (cf. Table S1), related both to the underlying REDD+ intervention (Factors 1–4), and subsequently to the case study evaluating its impacts (Factors A–F).
As “REDD+ interventions” (1), we understand here, firstly, actions that implementers self-denominated using the RED(D)+ label, and secondly, other actions that fully or partially featured forest-based climate mitigation/carbon outcomes in an explicit way. As mentioned above, this also includes national-level PES programmes that aim to further forest-carbon objectives; in turn, some large national-level watershed-focused PES programmes (e.g. in China, Mexico, and Vietnam) are excluded. As for actions (2), many forest carbon programmes include both conservation/regeneration of standing forests and afforestation/reforestation (A/R) activities; those focused entirely on A/R do not seem to functionally fit the REDD+ definition, and we thus exclude them. In terms of scale (3), we choose to be inclusive of both subnational REDD+ (incl. projects) and emerging national programmes, keeping in mind that they likely have different characteristics (cf. also (1)). Finally, as a temporal cut-off point for the start of implementation (4), we use the year 2007, coinciding with the Bali UNFCCC COP13: everything that happened in the forest carbon sphere before 2007 (Joint Implementation, Clean Development Mechanism, etc.) is of comparative interest,28 but is inevitably bound to differ from REDD+.
A second layer of filters refers to the analytical level. First, we choose to be inclusive with respect to the screened literature (A), incorporating not only peer-reviewed but also grey-literature studies, which allows us to include, in a quickly moving field, some more recent contributions (assessed by us as high-quality) that are still at the working paper stage. As for analysed impacts (B), we looked at both forest carbon (main goal) and welfare effects (primary side-objective). By “bottom line”, we mean effects observed at the right-hand side of the REDD+ ToC, i.e. both outcomes and impacts (see above). Impact evaluations are often stated in terms of outcomes (C), such as forest cover (deforestation areas, rates) and land-use proxies (e.g. fire incidences), which are more precisely observable than forest carbon in the short-to-medium run. The more process-oriented, intermediary outputs (middle part of the ToC) are not of interest to us here (D): they are often less clearly (sometimes, ambiguously) linked to REDD+ bottom-line outcomes, and are often more qualitative than quantitative. However, we also included subjectively stated wellbeing (“do you now feel better/worse off/unchanged than prior to the REDD+ project start?”), as a popular socioeconomic bottom line of evaluation (E). We realize, though, that it is an indicator with its own potential response biases, which is best triangulated with more objectively measurable socioeconomic outcomes.
The final filtering criterion is arguably the trickiest: the quality of impact evaluation (F). To rigorously attribute impacts to interventions, counterfactuals need to be constructed: what would have happened without the REDD+ intervention? We only included impact studies using counterfactuals, i.e. experimental and quasi-experimental methods. This includes the alleged ‘gold standard’ of randomized controlled trials (RCT), as well as Before-After-Control-Intervention (BACI) designs. Various econometric techniques attempt to model a counterfactual ex post, including matching to identify adequate control observations, or selecting non-treated units to synthesize a control unit. Yet, different recall techniques can also be used to gather baseline data ex post in the field. To make impact estimates quantitatively comparable, we also needed standard deviation estimates. Many case-study authors did not publish these; we had to contact several of them to obtain this supplementary information.
Study identification strategy
We started by screening our pool of REDD+ studies from prior reviews.26,29,49,50 An initial set of 15 eligible studies with quantitative estimates of REDD+ and carbon-focused PES projects using counterfactual impact evaluation methods was identified. A Boolean search string based on title and abstract of this initial sample was semi-automatically generated following the method described by Grames et al. (2019) (cf. SI, Fig. S1).
We extracted study characteristics such as location, intervention details, and sample characteristics, along with Hedges’ g effect sizes. Our final sample comprises a total of 30 REDD+ interventions, analysed in 32 studies, with 52 effect sizes being included (35 forest-related, 17 socioeconomic outcomes). This includes disaggregated effects used in the moderation analysis. For the main analyses, we aggregated effects, resulting in 23 and 12 estimates for environmental and socioeconomic indicators, respectively. For a meta-study, this remains a fairly small sample, which also restricts our analytical options: although the number of rigorous impact studies has expanded rapidly in recent years (around half of the articles included have been published since 2017), more are still needed to reach a critical mass for detailed statistical analysis. Our studies are just about equally divided between specialized REDD+ projects/programmes and PES schemes; yet the latter concentrate on fewer cases. In the former category, some studies are multi-case comparisons, e.g. a pool of Amazon Fund-financed and VCS-certified private REDD+ projects41 and cases from CIFOR’s Global Comparative Study on REDD+ (GCS-REDD) (Bos et al. 2017; Duchelle et al. 2017; Larson et al. 2018; Sunderlin et al. 2017).
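To make the effect-size metric concrete, the following is a minimal Python sketch of how Hedges' g and its sampling variance can be computed for two independent groups (e.g. treated versus control units). It uses the standard textbook formulas rather than the authors' own code; the function name `hedges_g` and the input numbers are hypothetical illustrations, not values taken from the reviewed studies.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return (g, var_g) for two independent groups.

    g is the standardized mean difference with the usual
    small-sample correction factor J applied.
    """
    df = n_t + n_c - 2
    # Pooled within-group standard deviation
    s_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / s_pooled
    # Approximate sampling variance of the uncorrected d
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    # Small-sample correction (approximation commonly used in meta-analysis)
    j = 1 - 3 / (4 * df - 1)
    return j * d, j**2 * var_d


if __name__ == "__main__":
    # Hypothetical summary statistics, for illustration only
    g, var_g = hedges_g(mean_t=10.0, mean_c=8.0, sd_t=2.0, sd_c=2.0, n_t=20, n_c=20)
    print(f"g = {g:.3f}, variance = {var_g:.4f}")
```

For outcomes where lower values are desirable, such as deforestation rates, the sign convention has to be fixed consistently before effects are pooled; that choice is left out of the sketch.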
How well does our final sample represent the REDD+ universe? To recap, it is shaped by the filters we have applied (cf. Table 1), overlaying geographically an initial implementation bias (where have REDD+ investors gone?) with a research bias (where have scientists preferred to work, and found access to data?), and a publication bias (is it more likely that positive results are published than negative or null results?). Our small sample mirrors an ‘absolute’ implementation bias towards Latin America (Brazil, Andes, Mesoamerica); it covers less well some ‘high-density’ REDD+ countries (Kenya, Colombia, Guatemala). We did find evidence for a moderate publication bias based on Egger’s regression test for funnel plot asymmetry (cf. SI): environmentally positive, significant results have a slightly higher likelihood of getting published. On aggregate, the external validity of our sample is limited, yet still vastly exceeds that of earlier meta-studies of forest conservation incentives, which were based on much smaller and geographically more biased samples.24,25
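As an illustration of the logic behind an Egger-type test, the sketch below regresses the standardized effect (effect divided by its standard error) on precision (the inverse standard error); an intercept far from zero signals funnel-plot asymmetry. This is the classical formulation, which may differ from the variant actually run for this study, and it ignores the multi-level structure of the data. The function name `egger_test` and the example numbers are hypothetical.

```python
import numpy as np


def egger_test(effects, std_errors):
    """Classical Egger regression: (effect / se) ~ intercept + slope * (1 / se).

    Returns (intercept, se_intercept, t_statistic). The t statistic has
    len(effects) - 2 degrees of freedom.
    """
    se = np.asarray(std_errors, dtype=float)
    y = np.asarray(effects, dtype=float) / se   # standardized effects
    x = 1.0 / se                                # precision
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - 2
    sigma2 = float(resid @ resid) / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    intercept = float(beta[0])
    se_intercept = float(np.sqrt(cov[0, 0]))
    # Guard against a perfect fit, where the standard error collapses to zero
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


if __name__ == "__main__":
    # Hypothetical effects and standard errors, for illustration only
    effects = [0.42, 0.31, 0.55, 0.20, 0.61, 0.35]
    std_errors = [0.10, 0.08, 0.25, 0.05, 0.30, 0.12]
    print(egger_test(effects, std_errors))
```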
Meta-analysis
The meta-analysis was carried out using the standardized mean difference as the outcome measure. We used the metafor and clubSandwich packages in R. A multi-level random-effects model was fitted to the data, including random effects at the study and country levels. For the main estimates, we assumed a correlation of 0.8 within studies and countries, and report robust variance estimates based on the correlated hierarchical effects procedure.52 We conducted subgroup analyses, testing for differences between self-declared REDD+ and PES-cum-carbon programmes. Similarly, we tested for differences between the outcome and impact levels of the socioeconomic variables. For the moderation analysis, we also included binary moderators indicating a) deforestation pressure (1 for high pressure; 0 otherwise); b) spatial targeting (1 if the study explicitly mentions ecosystem service density and/or deforestation threat as determining factors for the location and/or intensity of the intervention; 0 otherwise); and c) benefit differentiation (1 if the study explicitly mentions differently sized benefit levels within the same scheme; 0 otherwise).
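The working assumption of a constant correlation of 0.8 among estimates from the same cluster can be pictured as a block-structured sampling covariance matrix. The sketch below builds such a matrix in Python. It only illustrates the idea and is not the authors' R code; the function name, the use of a single clustering variable, and the example values are assumptions.

```python
import numpy as np


def constant_correlation_cov(variances, clusters, rho=0.8):
    """Sampling covariance matrix with correlation `rho` inside each cluster.

    Estimates from different clusters are treated as independent
    (covariance zero); the diagonal holds the sampling variances.
    """
    v = np.asarray(variances, dtype=float)
    ids = np.asarray(clusters)
    sd = np.sqrt(v)
    same_cluster = ids[:, None] == ids[None, :]
    cov = rho * np.outer(sd, sd) * same_cluster
    np.fill_diagonal(cov, v)
    return cov


if __name__ == "__main__":
    # Hypothetical example: two estimates from study "A", one from study "B"
    V = constant_correlation_cov([0.04, 0.09, 0.01], ["A", "A", "B"], rho=0.8)
    print(V)
```

A matrix of this kind would typically be passed to the multi-level model as the known sampling covariance, with robust standard errors computed afterwards.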
Our binary division between high and low deforestation threat was based on the position vis-à-vis the mean annual deforestation rate across all countries over the period 2001–2021 (0.28% per year), taken from Global Forest Watch (GFW). We compared this threshold with the average case-level deforestation rate over the five years preceding the start year of the evaluated REDD+ intervention.
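The classification rule can be written down in a few lines. The following Python sketch assumes that annual deforestation rates are available per case as a year-to-rate mapping (in % per year) and that "high pressure" means a five-year pre-intervention mean strictly above the 0.28% threshold; the treatment of ties, the function name, and the example values are assumptions made for illustration.

```python
GLOBAL_MEAN_RATE = 0.28  # % of forest lost per year, threshold quoted in the text


def is_high_pressure(annual_rates, start_year, threshold=GLOBAL_MEAN_RATE, window=5):
    """Return 1 if the mean rate over the `window` years before `start_year`
    exceeds `threshold`, else 0.

    `annual_rates` maps calendar year -> deforestation rate in % per year.
    """
    years = range(start_year - window, start_year)
    missing = [y for y in years if y not in annual_rates]
    if missing:
        raise KeyError(f"no deforestation rate for years: {missing}")
    mean_rate = sum(annual_rates[y] for y in years) / window
    return 1 if mean_rate > threshold else 0


if __name__ == "__main__":
    # Hypothetical case with an intervention starting in 2010
    rates = {2005: 0.5, 2006: 0.4, 2007: 0.6, 2008: 0.3, 2009: 0.2}
    print(is_high_pressure(rates, start_year=2010))
```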
Robustness checks
Lacking significant differences between the two forest-size outcomes (forest cover, deforestation rate), we included them both in the same primary-effect analysis. In addition, we found no evidence that impact estimates would vary systematically with programme duration. We tested to what extent our results were driven by a few influential studies by a) consecutively excluding studies with high weights, namely Groom et al. (2022) and Guizar et al. (2022); b) excluding studies using the synthetic control method, and c) excluding studies with a Cook’s Distance larger than two standard deviations. The coefficient sizes slightly changed, but our conclusions remain robust.
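The idea behind the influence checks can be sketched with a leave-one-out loop: re-estimate the pooled effect after dropping each study in turn and inspect how far the estimate moves. For brevity, the Python sketch below swaps in simple inverse-variance (fixed-effect) pooling as a stand-in for the multi-level random-effects model used in the paper, so it illustrates the procedure rather than reproducing the reported analysis. Function names and numbers are hypothetical.

```python
import numpy as np


def pooled_effect(effects, variances):
    """Inverse-variance weighted mean (simplified stand-in for the full model)."""
    effects = np.asarray(effects, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    return float(np.sum(weights * effects) / np.sum(weights))


def leave_one_out(effects, variances):
    """Pooled effect re-estimated with each study dropped in turn."""
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(variances, dtype=float)
    idx = np.arange(len(effects))
    return [pooled_effect(effects[idx != i], variances[idx != i]) for i in idx]


if __name__ == "__main__":
    # Hypothetical effect sizes and sampling variances
    effects = [0.20, 0.40, 0.60, 0.10]
    variances = [0.01, 0.04, 0.09, 0.02]
    print("all studies:", round(pooled_effect(effects, variances), 3))
    for i, est in enumerate(leave_one_out(effects, variances)):
        print(f"without study {i}: {est:.3f}")
```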
Several studies employed matching techniques, and to calculate effect sizes one requires the correlation between pairs of observations (Borenstein 2009:29, Formula 4.27).53 Due to missing data, we assumed a correlation coefficient of 0.5 for our main specification but tested also more extreme values (0.3 and 0.7) as a robustness check. Indeed, we found that REDD + effect estimates are sensitive to the assumed parameter in the calculation method, but not enough to alter our findings based on Figs. 2 and 3.
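Why the assumed correlation matters can be seen from the standard conversion for matched designs, in which the standard deviation of pairwise differences is rescaled by the factor sqrt(2(1 - r)). The Python sketch below applies that conversion for r = 0.3, 0.5 and 0.7. It assumes that this textbook conversion corresponds to the formula cited above; the function name and the input values are hypothetical.

```python
import math


def matched_pairs_d(mean_diff, sd_diff, n_pairs, r):
    """Standardized mean difference and its variance for matched pairs.

    `r` is the assumed correlation between paired observations.
    """
    if not -1.0 <= r < 1.0:
        raise ValueError("r must lie in [-1, 1)")
    # Convert the SD of differences into a within-group SD
    s_within = sd_diff / math.sqrt(2 * (1 - r))
    d = mean_diff / s_within
    var_d = (1 / n_pairs + d**2 / (2 * n_pairs)) * 2 * (1 - r)
    return d, var_d


if __name__ == "__main__":
    # Hypothetical matched sample: sensitivity of d to the assumed correlation
    for r in (0.3, 0.5, 0.7):
        d, var_d = matched_pairs_d(mean_diff=1.0, sd_diff=2.0, n_pairs=50, r=r)
        print(f"r = {r}: d = {d:.3f}, variance = {var_d:.4f}")
```

With the same observed differences, a higher assumed correlation yields a smaller standardized effect, which is why the estimates shift across the three specifications.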