3a. Potential outcomes framework
Following Ghanem, Sant’Anna, and Wüthrich (2024), we define G as membership in the treatment group. The probability that G = 1 is a function of unobserved time-invariant expected gains (U). For simplicity, let G = 1 if U > 0 and G = 0 if U ≤ 0. Let ε be some positive value of U. Our DID estimate can be written as:
{E[Y(post) | G = 1] – E[Y(pre) | G = 1]} - {E[Y(post) | G = 0] – E[Y(pre) | G = 0]}
This quantity is the ATT of G. Let the variable D represent receipt of the treatment. D = 1 for members of the treatment group in the post-treatment period, and D = 0 otherwise. The observed outcomes in the treated group are a function of both receipt of treatment and expected gains. That is,
E[Y(post) | G = 1] = E[Y(post) | D = 1] + E[Y(post) | U = ε]
and E[Y(pre) | G = 1] = E[Y(pre) | D = 1] + E[Y(pre) | U = ε]
When there are no expected gains, E[Y(post) | G = 1] = E[Y(post) | D = 1] and the ATT of G equals the ATT of D. If membership in the treatment group is effectively randomly determined, the ATT of D can be generalized to the ATE of D. Similarly, Gupta, Martinez, and Navathe (2023) conceptualize the ATT as being composed of the ATE and the effect of expected gains.
If the effect of expected gains on the outcome is time-invariant (E[Y(post) | U = ε] = E[Y(pre) | U = ε]), the ATT of G will also equal the ATT of D, and no further adjustment is needed:
{E[Y(post) | G = 1] – E[Y(pre) | G = 1]} - {E[Y(post) | G = 0] – E[Y(pre) | G = 0]} =
{( E[Y(post) | D = 1] + E[Y(post) | U = ε]) – (E[Y(pre) | D = 1] + E[Y(pre) | U = ε]) } - {E[Y(post) | D = 0] – E[Y(pre) | D = 0]} =
{E[Y(post) | D = 1] – E[Y(pre) | D = 1]} - {E[Y(post) | D = 0] – E[Y(pre) | D = 0]} =
the ATT of D.
However, if the effect of expected gains on the outcome varies with time (E[Y(post) | U = ε] > E[Y(pre) | U = ε]; Fig. 1), the ATT of G will no longer equal the ATT of D:
{E[Y(post) | G = 1] – E[Y(pre) | G = 1]} - {E[Y(post) | G = 0] – E[Y(pre) | G = 0]} = {( E[Y(post) | D = 1] + E[Y(post) | U = ε]) – (E[Y(pre) | D = 1] )} - {E[Y(post) | D = 0] – E[Y(pre) | D = 0]}.
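Collecting terms makes the source of the discrepancy explicit. The following is a sketch that uses only the decompositions of E[Y(post) | G = 1] and E[Y(pre) | G = 1] given above and keeps the pre-period term rather than assuming it is zero:

$$\text{ATT of } G \;=\; \text{ATT of } D \;+\; \bigl\{E[Y(\text{post}) \mid U = \varepsilon] - E[Y(\text{pre}) \mid U = \varepsilon]\bigr\}$$

The term in braces vanishes when the effect of expected gains is time-invariant, and it reduces to E[Y(post) | U = ε] when expected gains have no effect at all in the pre-treatment period.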
What sort of unobserved expected gains might affect membership in the treatment group and have time-varying effects on outcomes? In a job training or smoking cessation program, or a difficult treatment regimen that entails active participation by the patient, a subject’s resolve to faithfully attend all the sessions, absorb the training material, and follow the instructor’s advice might not manifest itself at all in the pre-treatment period. Only when the subject’s resolve is combined with the information and instructions that are part of the intervention does higher resolve affect the outcome. Similarly, some subjects may know that they have an inherent capacity, such as personal or human capital, that allows them to respond better to the prescribed treatment. Those resources also may be unobserved by the analyst and manifest themselves only in the post-treatment time period and only for subjects in the treatment group.
Essential heterogeneity is possible for units of analysis besides individuals. Alternative payment model interventions conducted by the Center for Medicare and Medicaid Innovation often rely on voluntary enrollment, so unobserved organizational-level expected gains may be associated with both participation and changes in outcomes.
If unobserved expected gains affect outcomes only in the post-treatment period, the effect of expected gains will not be detected by any comparison of pre-treatment trends in the treatment and comparison groups (Fig. 2). The analyst will proceed, assuming the post-treatment experience of the comparison group is a valid counterfactual for the treatment group in the absence of the treatment.
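A small numerical sketch can make this point concrete. The cell means below are hypothetical values chosen for illustration (they are not taken from the paper's data or figures): both groups share a common trend, the treated group has a time-invariant level difference, the true effect of D is zero, and expected gains raise the treated group's outcome only in the post-treatment periods.

```python
# Hypothetical cell means; every number here is an illustrative assumption.
PERIODS = [1, 2, 3, 4, 5, 6]
FIRST_POST = 4       # periods 4-6 are post-treatment
TRUE_EFFECT = 0.0    # true effect of receiving the treatment (D)
GAIN_EFFECT = 1.5    # effect of expected gains, active only post-treatment


def group_mean(period, treated):
    """Mean outcome for one group-period cell."""
    common_trend = 10.0 + 0.5 * period      # shared by both groups
    level_gap = 2.0 if treated else 0.0     # time-invariant group difference
    is_post = period >= FIRST_POST
    extra = (TRUE_EFFECT + GAIN_EFFECT) if (treated and is_post) else 0.0
    return common_trend + level_gap + extra


def average(values):
    values = list(values)
    return sum(values) / len(values)


pre = [p for p in PERIODS if p < FIRST_POST]
post = [p for p in PERIODS if p >= FIRST_POST]

# Gap between groups in each pre-period; a constant gap means parallel pre-trends.
pre_gaps = [group_mean(p, True) - group_mean(p, False) for p in pre]
pre_trend_difference = pre_gaps[-1] - pre_gaps[0]

# Standard DID contrast of group-by-period means.
treated_change = (average(group_mean(p, True) for p in post)
                  - average(group_mean(p, True) for p in pre))
comparison_change = (average(group_mean(p, False) for p in post)
                     - average(group_mean(p, False) for p in pre))
did = treated_change - comparison_change

print("pre-period gaps:", pre_gaps)
print("difference in pre-trends:", pre_trend_difference)
print("DID estimate:", did, "| true effect of D:", TRUE_EFFECT)
```

In this constructed example the pre-period gaps are identical in every period, so a pre-trend comparison finds nothing, yet the DID estimate equals the assumed post-period effect of expected gains rather than the (zero) effect of the treatment.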
3b. Two-stage estimation methods
Any model of a subject-specific endogenous explanatory variable, e.g., an unobserved confounder, consists of at least two equations. The first equation, often referred to as the sample selection equation, describes how the subject is assigned to the treatment or comparison group. The second equation is the outcome equation or “equation of interest.” Either equation can be estimated using any appropriate functional form (Lee 1983).
Equations 2a and 2b show a different way of writing the DID model. Again, we assume that the treatment is administered at the same point in time for all subjects in the treatment group. The variable \(w_{i}\) represents the subject’s expected gain and appears in both equations, and \(v_{i}\) is a “clean” error term, uncorrelated with \(w_{i}\).
\(Treat_{i} = f(\gamma w_{i} + v_{i})\) (treatment selection equation) (2a)
$$Y_{it} = Subject_{i}\beta_{1} + Year_{t}\beta_{2} + TreatEffect_{it}\beta_{3} + (\delta_{TreatEffect} w_{it} + u_{it})$$
(outcome “equation of interest”) (2b)
The coefficient on w in Eq. 2a (\(\gamma\)) represents the causal effect of expected gains on selection into the treatment group. The coefficient on w in Eq. 2b (\(\delta_{TreatEffect}\)) represents the causal effect of expected gains on the outcome variable. Notice that in Eq. 2b, w has acquired a t subscript, indicating that expected gains affect Y only in the post-treatment period.
If w retained only an i subscript in the outcome equation, its effect would be cancelled out by the pre-post comparison of the same individual. But w has both an i and a t subscript: the effect of expected gains exists only for subjects in the treatment group, and only in the post-treatment time period, so it cannot be addressed by subject or time fixed effects. Thus, the effect of unobserved expected gains must be addressed through econometric techniques including instrumental variables, sample selection models, two-stage residual inclusion (Terza 2018), and regression discontinuity (Cook 2008).
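The cancellation argument can be written out for a two-period version of Eq. 2b. As a sketch, assume for illustration that \(TreatEffect_{it} = Treat_{i}\) in the post period and 0 in the pre period, and that \(w_{it} = w_{i} \cdot TreatEffect_{it}\); this functional form is a simplification introduced here, not one specified above. Differencing within subject removes \(Subject_{i}\beta_{1}\) but not the expected-gains term:

$$Y_{i,post} - Y_{i,pre} = (Year_{post} - Year_{pre})\beta_{2} + Treat_{i}\beta_{3} + \delta_{TreatEffect}\, w_{i}\, Treat_{i} + (u_{i,post} - u_{i,pre})$$

Comparing the average change across groups then removes the common time effect, and if the change in \(u_{it}\) is mean-independent of \(Treat_{i}\),

$$E[\Delta Y_{i} \mid Treat_{i} = 1] - E[\Delta Y_{i} \mid Treat_{i} = 0] = \beta_{3} + \delta_{TreatEffect}\, E[w_{i} \mid Treat_{i} = 1]$$

Because Eq. 2a makes selection depend on \(w_{i}\), the conditional mean \(E[w_{i} \mid Treat_{i} = 1]\) will generally differ from the population mean of \(w_{i}\), so the second term does not drop out.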
3c. Illustration with Simulated Data
A simulated dataset can illustrate the impact of time-varying effects of expected gains on estimates. Using Stata, we generate a dataset with 10,000 individual observations, each followed for 20 time periods (see appendix for details of the data generating process and results). Years 10–20 are considered the post-treatment period. Receipt of treatment is modeled as a function of expected gains, an instrumental variable Z that affects assignment to the treatment group but has no direct effect on the outcome variable, and a random error term. We then generate untreated and treated potential outcomes, where the treated potential outcomes among recipients of the treatment are a function of a time-varying effect of expected gains. We set the true treatment effect to zero so that the only effect on the outcome is due to expected gains.
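The following is a minimal Python sketch of a data generating process in the spirit of the one described above. It is not the Stata code from the appendix; the functional forms, the parameter values (`gamma`, `pi_z`, `delta`), the binary instrument, and the seed are all assumptions made for illustration. Only the sample size, the number of periods, the post-treatment window, and the zero true treatment effect follow the text.

```python
import random

TRUE_EFFECT = 0.0  # true effect of receiving treatment, set to zero as in the text


def simulate(n_subjects=10_000, n_periods=20, first_post=10,
             gamma=1.0, pi_z=1.0, delta=2.0, seed=12345):
    """Return one summary record per subject (all parameters are assumptions)."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n_subjects):
        w = rng.gauss(0.0, 1.0)                 # unobserved expected gains
        z = 1 if rng.random() < 0.5 else 0      # instrument: shifts selection only
        v = rng.gauss(0.0, 1.0)                 # selection error
        treat = 1 if gamma * w + pi_z * z + v > 0 else 0
        level = 0.5 * w + rng.gauss(0.0, 1.0)   # time-invariant subject effect
        pre, post = [], []
        for t in range(1, n_periods + 1):
            y = level + 0.1 * t + rng.gauss(0.0, 1.0)   # untreated outcome
            if t >= first_post:
                if treat:
                    # Expected gains matter only for treated subjects, post-treatment.
                    y += TRUE_EFFECT + delta * w
                post.append(y)
            else:
                pre.append(y)
        subjects.append({
            "treat": treat,
            "first_pre": pre[0],
            "last_pre": pre[-1],
            "pre_mean": sum(pre) / len(pre),
            "post_mean": sum(post) / len(post),
        })
    return subjects


def group_avg(subjects, key, treat):
    values = [s[key] for s in subjects if s["treat"] == treat]
    return sum(values) / len(values)


def naive_did(subjects):
    """Difference in mean pre-to-post change, treated minus comparison."""
    def change(g):
        return group_avg(subjects, "post_mean", g) - group_avg(subjects, "pre_mean", g)
    return change(1) - change(0)


def pre_trend_gap(subjects):
    """Difference between groups in the change from first to last pre-period."""
    def trend(g):
        return group_avg(subjects, "last_pre", g) - group_avg(subjects, "first_pre", g)
    return trend(1) - trend(0)


data = simulate()
did_estimate = naive_did(data)
pre_gap = pre_trend_gap(data)
print("naive DID estimate:", round(did_estimate, 3), "| true effect:", TRUE_EFFECT)
print("pre-trend gap between groups:", round(pre_gap, 3))
```

Under these assumed parameters the naive DID estimate is clearly positive even though the true treatment effect is zero, while the pre-trend gap is close to zero, because the time-invariant part of expected gains is differenced out and the time-varying part appears only after treatment begins.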
3d. When Do Expected Gains Threaten the Validity of Inferences?
Does the time-varying effect of unobserved expected gains on outcomes in DID models necessarily represent bias? That depends on the research question. The research question will determine the population to which one wants to generalize the results from the evaluation intervention (the target population), and the target population, in turn, depends on what policymakers intend to do if the initial evaluation suggests that the intervention is successful (Imai, King, and Stuart 2008; Lesko et al. 2017).
The potential effect of expected gains on program effects can arise through non-random assignment to treatment, thus threatening internal validity (Abadie et al. 2020), as well as through non-random sampling, thus threatening external validity. However, the threats to internal validity only have a practical impact if one wishes to generalize the treatment effect estimate to a population with a different treatment assignment mechanism. If the intent is to continue offering the program on a voluntary basis to a new, perhaps larger, but similar population, then the research question essentially concerns the effect of offering the treatment (intention to treat). In that case, the ATT for the voluntary participants in the evaluation sample need exhibit only internal validity. In the expanded application, subjects with relatively high expected gains will continue to enroll in the treatment group, thus replicating the results from the original evaluation. The estimated ATT will capture the effect of the program on a population of volunteers with higher-than-average expected gains relative to the comparison group with lower expected gains. No correction for the effect of expected gains is needed in this case.
However, if policymakers intend to expand the program to a population with different characteristics, perhaps by mandating participation in the treatment, the most policy-relevant parameter would be the ATE for the entire population. The ATE will capture the effect of the treatment for the entire target population with average expected gains, not just volunteers. That ATE could be obtained from a randomized, controlled trial (RCT) conducted on the expanded target population, but an RCT may be politically difficult, expensive, or unethical (Black 1996). Another option would be to collect additional data, presumably survey data, on subjects’ expected gains, and control for that variable in the DID analyses. However, that approach also may be infeasible. In some cases, the analyst must turn to econometric corrections for unobserved confounders, remembering that methods such as instrumental variables estimate only a local average treatment effect for the subpopulation whose membership in the treatment versus control group is influenced by the value of the instrument (Imbens and Angrist 1994; Imbens and Rubin 1007).
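The distinction between the ATT, the ATE, and a local average treatment effect can be illustrated with a simple Wald (instrumental variables) estimator applied to simulated pre-to-post changes. The sketch below is an illustration under stated assumptions, not the estimation approach used in this paper: the binary instrument `z`, the selection rule, and the values of `delta` and `pi_z` are all hypothetical, the true effect of treatment is set to zero, and expected gains have mean zero in the population, so the ATE in this setup is zero.

```python
import random

TRUE_EFFECT = 0.0      # assumed true effect of treatment
Z, TREAT, CHANGE = 0, 1, 2   # column positions in each simulated row


def simulate_changes(n=20_000, delta=2.0, pi_z=1.0, seed=2024):
    """Simulate each subject's pre-to-post change in the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)                  # unobserved expected gains
        z = 1 if rng.random() < 0.5 else 0       # instrument, independent of w
        treat = 1 if w + pi_z * z + rng.gauss(0.0, 1.0) > 0 else 0
        # Common time trend of 1.0; gains affect only treated subjects' changes.
        change = 1.0 + treat * (TRUE_EFFECT + delta * w) + rng.gauss(0.0, 1.0)
        rows.append((z, treat, change))
    return rows


def mean_by(rows, value_index, group_index, group_value):
    values = [r[value_index] for r in rows if r[group_index] == group_value]
    return sum(values) / len(values)


rows = simulate_changes()

# Naive DID: mean change among the treated minus mean change among the comparison group.
naive = mean_by(rows, CHANGE, TREAT, 1) - mean_by(rows, CHANGE, TREAT, 0)

# Wald estimator: effect of the instrument on the outcome change (reduced form)
# divided by its effect on treatment take-up (first stage).
reduced_form = mean_by(rows, CHANGE, Z, 1) - mean_by(rows, CHANGE, Z, 0)
first_stage = mean_by(rows, TREAT, Z, 1) - mean_by(rows, TREAT, Z, 0)
wald = reduced_form / first_stage

print("naive DID:", round(naive, 3))
print("first stage:", round(first_stage, 3))
print("Wald (IV) estimate:", round(wald, 3), "| ATE in this setup:", TRUE_EFFECT)
```

In this constructed example the two estimates differ because they average expected gains over different subpopulations: the naive DID reflects everyone who selected into treatment, whereas the Wald estimate reflects only subjects whose treatment status is changed by the instrument. Neither is guaranteed to equal the ATE for the full population.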
Although our focus is on the simplest DID model, expected gains also influence analyses that use more flexible DID estimators. Some estimators allow for parallel trends after conditioning on covariates – both pre and post, in the case of the two-way Mundlak estimator, and only in the post period, in the case of the Callaway & Sant’Anna estimator. However, these methods do not adjust for unobserved confounding. Although coefficients on treatment indicators in the two-way Mundlak estimator allow an investigator to “study the nature of selection bias into exposure”, they do not allow one to isolate the effect of a treatment from the influence of selection (Wooldridge 2021). Similarly, robustness tests that compare results of DID models assuming parallel trends with results of models allowing time trends to differ between treatment and comparison groups will not allow the analyst to isolate the impact of unobserved, time-varying, treatment group-specific confounders like expected gains (Bilinski and Hatfield 2020).