The patterns of rubber crop diversity are flexible and depend heavily on farmers' preferences, and intercrop species vary widely across rubber farms and regions in Thailand. Considering data availability and previous studies on crop diversity, the present study defines rubber crop diversity in terms of the variety of crops on rubber farms: diversity is observed when additional crop species (such as rice, corn, or durian) are intercropped, bordered, or mixed with rubber trees in rubber plots.
Farmers are assumed to decide whether to adopt rubber crop diversity by maximizing their expected utility, given their own characteristics and local factors. Following the previous literature, a random utility framework is used to model the adoption of rubber crop diversity (Asfaw et al. 2012). For a risk-neutral, utility-maximizing rubber farmer, adoption is represented by a latent index function \({T}_{i}^{*}\), which denotes the difference between the utility from adopting rubber crop diversity (\({U}_{ai}\)) and the utility from not adopting it (\({U}_{ni}\)). Farmer \(i\) adopts rubber crop diversity if the utility gained from adoption exceeds that from non-adoption \(\left({T}_{i}=1|{T}_{i}^{*}= {U}_{ai}-{U}_{ni}>0\right)\) and does not adopt otherwise \(\left({T}_{i}=0|{T}_{i}^{*}= {U}_{ai}-{U}_{ni}\le 0\right)\) (Leepromrath et al. 2021). The utility difference is specified by the following equation.
\({T}_{i}^{*}=\beta {Z}_{i}+{u}_{i} \quad \text{with} \quad {T}_{i}=\left\{\begin{array}{ll}1 & \text{if } {T}_{i}^{*}>0\\ 0 & \text{otherwise}\end{array}\right.\) (Eq. 1)
Here \({Z}_{i}\) is a vector of explanatory variables, including family characteristics, inputs, and the geographic and climatic conditions of the farm; \(\beta\) is a vector of parameters to be estimated; and \({u}_{i}\) is the error term. Net income is a very common welfare objective in the utility optimization process (Kassie et al. 2011; Mendola 2007). As mentioned earlier, the inputs and costs of agricultural production change substantially when monoculture rubber farms are transformed into crop-diversity farms, and so does the net income of rubber farmers. For household \(i\), the treatment effect of rubber crop diversity in a counterfactual framework is equal to:
\({TE}_{i}={Y}_{1i}-{Y}_{0i}\) (Eq. 2)
Where \({Y}_{1i}\) and \({Y}_{0i}\) denote the net income of household \(i\) when it adopts rubber crop diversity and when it does not, respectively. In practice, only one of \({Y}_{1i}\) and \({Y}_{0i}\) is observed for each rubber farmer; the two outcomes cannot be recorded at the same time. The observed net income is therefore:
\({Y}_{i}={Y}_{1i}{T}_{i}+{Y}_{0i}\left(1-{T}_{i}\right), \quad {T}_{i}\in \left\{0,1\right\}\) (Eq. 3)
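The counterfactual setup can be illustrated with a minimal sketch, taking the observed net income to be \({Y}_{1i}\) for adopters and \({Y}_{0i}\) for non-adopters. The function names and the income figures are hypothetical and are not taken from the study's data.

```python
def treatment_effect(y1, y0):
    """Household-level treatment effect: net income with diversity minus without."""
    return y1 - y0


def observed_income(y1, y0, t):
    """Observed net income: y1 if the household adopts (t = 1), y0 otherwise (t = 0)."""
    if t not in (0, 1):
        raise ValueError("t must be 0 or 1")
    return y1 * t + y0 * (1 - t)


if __name__ == "__main__":
    # Hypothetical potential outcomes for one household (arbitrary units).
    y1, y0 = 120.0, 100.0
    print("treatment effect:", treatment_effect(y1, y0))
    print("observed if adopter:", observed_income(y1, y0, 1))
    print("observed if non-adopter:", observed_income(y1, y0, 0))
```

In real data only one of the two potential outcomes is available per household, so `treatment_effect` cannot be evaluated directly; this is the missing-data problem that the estimation strategy has to address.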
The net income \({Y}_{i}\) is commonly assumed to be a linear function of a vector of explanatory variables \({X}_{i}\) and the dummy variable for rubber crop diversity adoption \({T}_{i}\):
\({Y}_{i}=\alpha {X}_{i}+\eta {T}_{i}+{\epsilon }_{i}\) (Eq. 4)
Where \(\alpha\) is a vector of parameters to be estimated, and \({\epsilon }_{i}\) is an error term. The impact of diversity adoption on net income, \({TE}_{i}\), is measured by the parameter \(\eta\). However, \(\eta\) measures the impact of rubber crop diversity accurately only when adoption is exogenously determined, that is, when rubber farmers are randomly assigned to the diversity and monoculture groups (Becerril and Abdulai 2009; Kassie et al. 2011). In fact, the adoption of rubber crop diversity depends on expected benefits and on the characteristics of the household and farm, and the factors that affect rubber crop diversity choices may also influence net income. This implies that adopters and non-adopters of rubber crop diversity are not randomly assigned ex ante, so the two groups cannot be assumed to have equivalent distributions. Because the same factors drive both adoption and net income, the two error terms \({\epsilon }_{i}\) and \({u}_{i}\) are correlated (their correlation is non-zero). This endogeneity in the adoption process would create a spurious link between diversity adoption and net income and lead to a biased estimate of the impact of crop diversity (Rubin 1974).
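The selection problem can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the covariate terms \(\beta {Z}_{i}\) and \(\alpha {X}_{i}\) are set to zero, both error terms are standard normal, and the function name, sample size, and parameter values are hypothetical. With these assumptions, the naive income gap between adopters and non-adopters equals \(\eta\) plus \(E\left[{\epsilon }_{i}|{T}_{i}=1\right]-E\left[{\epsilon }_{i}|{T}_{i}=0\right]\), which vanishes only when the two errors are uncorrelated.

```python
import random
from statistics import mean


def naive_income_gap(rho, eta=1.0, n=20000, seed=1):
    """Mean net income of adopters minus that of non-adopters.

    rho is the assumed correlation between the adoption error u and the
    income error eps; eta is the true effect of adoption on net income.
    """
    rng = random.Random(seed)
    adopters, others = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)
        # Build eps so that corr(eps, u) = rho and var(eps) = 1.
        eps = rho * u + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        t = 1 if u > 0 else 0   # latent-index rule with beta * Z set to 0
        y = eta * t + eps       # outcome equation with alpha * X set to 0
        (adopters if t else others).append(y)
    return mean(adopters) - mean(others)


if __name__ == "__main__":
    for rho in (-0.6, 0.0, 0.6):
        print(f"rho = {rho:+.1f}: naive gap = {naive_income_gap(rho):.3f} (true eta = 1.0)")
```

In this sketch the gap recovers \(\eta\) only when `rho` is zero; a positive correlation overstates the effect and a negative one understates it.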
The two-step Heckman and instrumental variable (IV) approaches can deal with the bias caused by self-selection. However, the Heckman method depends on the restrictive assumption of normally distributed errors, and for IV it is not easy to identify reasonable instruments in practice. Moreover, these two methods address the endogeneity problem by imposing distributional and linear functional form assumptions, which may cause welfare outcomes to be extrapolated over regions where no comparable adopter and non-adopter observations exist. In addition, both methods rely on the strong assumption that the coefficients on the control variables are the same for diversity adopters and non-adopters, whereas in practice these coefficients may differ between the two groups (Becerril and Abdulai 2009; Kassie et al. 2011).
To reduce the potential selection bias and endogeneity problem, many studies have suggested avoiding functional form assumptions and imposing a common support condition (Cerulli 2015). Propensity score matching (PSM) models the probability that a participant receives a treatment, based on observed characteristics, and uses the resulting propensity scores together with a matching algorithm to calculate the causal effect (Li 2013). PSM does not require a specific functional form for the relationship between confounding variables and potential outcomes (Kassie et al. 2011), and it identifies the causal effect of interest by matching each adopter with non-adopters who have similar observable characteristics. Moreover, PSM allows a wide set of matching procedures, which enables researchers to compare various estimators and check the robustness of the results (Cerulli 2015; Rosenbaum and Rubin 1983). Thus, the propensity score matching method is adopted to recover the treatment effect. Under the conditional independence assumption, that is, diversity adoption is independent of potential outcomes given a set of covariates, the average treatment effect on the crop diversity farmers (ATT) is defined within the region of common support, which ensures that farmers with similar covariates \(X\) have a positive probability of being either an adopter or a non-adopter (Guo et al. 2020; Rosenbaum and Rubin 1983). The ATT can then be estimated as follows:
$${ATT}_{i}=E\left[{Y}_{1i}|{T}_{i}=1\right]-E\left[{Y}_{0i}|{T}_{i}=1\right]$$
$$={E}_{p|T=1}\left\{E\left[{Y}_{1i}|{T}_{i}=1,p\left(X\right)\right]-E\left[{Y}_{0i}|{T}_{i}=1,p\left(X\right)\right]\right\}$$
\(={E}_{p|T=1}\left\{E\left[{Y}_{1i}|{T}_{i}=1,p\left(X\right)\right]-E\left[{Y}_{0i}|{T}_{i}=0,p\left(X\right)\right]\right\}\) (Eq. 5)
Where \(E\left(\cdot \right)\) is the expectation in the population, \(p\left(X\right)\) is the propensity score, and \({T}_{i}\) is the treatment indicator for crop diversity. In practice, probit and logit models are both commonly used to predict the propensity score for each observation. Several methods proposed in previous studies to impose the common support region and match similar adopters and non-adopters, such as nearest-neighbor matching and kernel-based matching, are adopted in the present study. In addition, because PSM assumes that diversity adoption depends only on observable variables, systematic differences in net income between diversity and non-diversity rubber farmers may remain even after conditioning on observables, owing to unobservable variables that affect both the crop diversity choice and net income. Therefore, the Rosenbaum bounds test is commonly employed as a sensitivity analysis that examines how strong the influence of an unobservable variable would have to be in order to change the inference about the effect of crop diversity.
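The estimation steps described above can be sketched on simulated data. Everything in this example is an assumption made for illustration: the data-generating process, the single covariate, the hand-written logit fit, and all function names are hypothetical and do not reproduce the study's specification. The sensitivity check uses the sign-test form of the Rosenbaum bound, chosen here only because it is the simplest to write down. The parameter `gamma` is the largest factor by which two matched farmers with the same observed covariates may differ in their odds of adoption, and `gamma = 1` corresponds to no hidden bias.

```python
import math
import random
from bisect import bisect_left
from statistics import mean


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def simulate_farms(n=2000, effect=1.0, seed=7):
    """Hypothetical data: one covariate x drives both adoption and net income."""
    rng = random.Random(seed)
    xs, ts, ys = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        t = 1 if rng.random() < sigmoid(0.8 * x) else 0
        xs.append(x)
        ts.append(t)
        ys.append(2.0 * x + effect * t + rng.gauss(0, 1))
    return xs, ts, ys


def fit_logit(xs, ts, steps=300, lr=0.5):
    """Logit model P(T=1|x) = sigmoid(b0 + b1*x), fitted by gradient ascent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, t in zip(xs, ts):
            resid = t - sigmoid(b0 + b1 * x)
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def matched_differences(ps, ts, ys):
    """Nearest-neighbor matching with replacement on the score, within common support."""
    treated = [(p, y) for p, t, y in zip(ps, ts, ys) if t == 1]
    controls = sorted((p, y) for p, t, y in zip(ps, ts, ys) if t == 0)
    scores = [p for p, _ in controls]
    lo = max(min(p for p, _ in treated), scores[0])
    hi = min(max(p for p, _ in treated), scores[-1])
    diffs = []
    for p, y in treated:
        if not lo <= p <= hi:
            continue  # adopter outside the common support region
        j = bisect_left(scores, p)
        nearest = min((k for k in (j - 1, j) if 0 <= k < len(scores)),
                      key=lambda k: abs(scores[k] - p))
        diffs.append(y - controls[nearest][1])
    return diffs  # the mean of these differences estimates the ATT


def sign_test_upper_pvalue(n_pairs, n_positive, gamma):
    """Upper bound on the one-sided sign-test p-value under hidden bias gamma >= 1."""
    p_plus = gamma / (1.0 + gamma)
    total = 0.0
    for k in range(n_positive, n_pairs + 1):
        log_pmf = (math.lgamma(n_pairs + 1) - math.lgamma(k + 1)
                   - math.lgamma(n_pairs - k + 1)
                   + k * math.log(p_plus) + (n_pairs - k) * math.log(1.0 - p_plus))
        total += math.exp(log_pmf)
    return total


if __name__ == "__main__":
    xs, ts, ys = simulate_farms()
    b0, b1 = fit_logit(xs, ts)
    ps = [sigmoid(b0 + b1 * x) for x in xs]
    diffs = matched_differences(ps, ts, ys)
    print("ATT estimate:", round(mean(diffs), 3), "(simulated effect = 1.0)")
    n_pos = sum(d > 0 for d in diffs)
    for gamma in (1.0, 1.5, 2.0, 3.0):
        print("gamma =", gamma, "upper p-value =", sign_test_upper_pvalue(len(diffs), n_pos, gamma))
```

The value of `gamma` at which the upper p-value first exceeds a chosen significance level indicates how much hidden bias would be needed to overturn the conclusion. Kernel-based matching would replace the single nearest control in `matched_differences` with a weighted average of controls, and a probit model could replace the logit in `fit_logit`.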