Summary and Interpretation of Findings
This analysis is the first systematic assessment of potential reasons for measurement non-invariance of the five DHS controlling-behavior items administered during 2005–2019 in 19 LMICs and 42 DHS surveys, in which 373,167 ever-partnered women of reproductive age responded to at least one item (8). It is also the first to use a carefully sequenced analytical strategy that tested, across seven to nine survey-design groups, for the following types of measurement invariance: first, configural and scalar invariance using MGCFA without covariates (Step 1); then, partial invariance using MGCFA while allowing the model parameters for some controlling-behavior items to vary across groups (Step 2); and then, approximate invariance of the full item set using alignment optimization (Step 3). In the absence of measurement invariance for the DHS controlling-behavior items across survey-design groups, we introduced in Step 4 a technique new to IPV research—MGCFA with covariate adjustment—to explore why measurement non-invariance of the item set was observed. This sequenced approach applied state-of-the-art guidance on the cross-cultural assessment of measurement invariance (33) to controlling-behavior items—a major dimension of psychological IPV against women that heightens the risk of other forms of IPV (34). The approach allowed us to understand more clearly the nature and sources of measurement non-invariance for this item set, providing guidance on how to improve the measurement of this dimension of psychological IPV against women and, more generally, on the cross-cultural measurement of psychological IPV against women.
In MGCFA with this large and diverse sample of women, the set of five controlling-behavior items achieved configural invariance across nine survey-design groups that varied in interviewer skill (weeks of training) and respondent burden (number of prior questionnaire modules). Importantly, this finding suggests that these DHS items are related in the same direction to a single “controlling behavior” construct that has conceptual meaning across diverse survey-design environments (33).
Despite evidence for configural invariance, the controlling-behavior item set did not exhibit scalar invariance, or full score equivalence (33), across the nine survey-design groups. Scalar invariance would indicate that the DHS measure of controlling behaviors has the same loadings and thresholds across the nine survey-design groups and across the 2005–2019 period. One interpretation of this finding is that the observed non-invariance of these items may be attributable to variability in respondent burden (questionnaire length) and interviewer skill (weeks of training).
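Nested invariance models of this kind are conventionally compared with a chi-square (likelihood-ratio) difference test, alongside changes in fit indices. The following is a minimal, stdlib-only sketch of that generic test; all fit statistics below are hypothetical illustrations, not values from this analysis.

```python
import math

def chi2_sf(x, df):
    """Survival function P(X >= x) for a chi-square variable with EVEN df,
    computed via the closed-form Poisson series (no SciPy required)."""
    assert df % 2 == 0, "this closed form requires an even df"
    lam = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= lam / i          # lam**i / i!, built incrementally
        total += term
    return math.exp(-lam) * total

def chi2_difference_test(chisq_constrained, df_constrained,
                         chisq_free, df_free):
    """Compare a constrained model (e.g., scalar: equal loadings and
    thresholds) against a freer nested model (e.g., configural)."""
    d_chisq = chisq_constrained - chisq_free
    d_df = df_constrained - df_free
    return d_chisq, d_df, chi2_sf(d_chisq, d_df)

# Hypothetical fit statistics, for illustration only.
d_chisq, d_df, p = chi2_difference_test(512.4, 310, 401.7, 270)
# A small p-value indicates a significant loss of fit under the equality
# constraints, i.e., scalar invariance would be rejected.
```

In practice this test is known to be sensitive in very large samples, which is one reason changes in approximate fit indices are reported alongside it.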
Given the observational nature of this study, however, other explanations are possible. One alternative explanation is that the constraints required for scalar invariance may be unrealistic in comparisons involving multiple groups, settings, cultures, and time periods (35). To address this issue, Byrne, Shavelson, and Muthén (36) introduced the concept of partial measurement invariance, in which one subset of parameters in MGCFA is constrained to be invariant while another subset is allowed to vary across groups. When partial invariance is observed, the invariant subset of items can be compared across countries, cultures, groups, and/or time (36). However, limited guidance exists on the maximum proportion of non-invariant parameters that can be released under this approach. In our assessment of the DHS controlling-behavior items, we were unable to establish partial invariance even after releasing 20% of the parameter estimates.
In the absence of either scalar invariance or partial invariance of the controlling-behavior items in MGCFA, we turned to alignment optimization (AO)—a novel approach to invariance testing in cross-cultural research on IPV. AO incorporates a simplicity function to discover the simplest model with the fewest non-invariant parameters and to estimate the factor mean and variance parameters in each group. In the application of AO here, two controlling-behavior items (“does not permit her to meet her female friends” and “tries to limit her contact with her family”) exhibited extreme non-invariance, as evidenced by R2 values of 0. This finding alone could suggest that item-specific non-equivalence contributed to non-invariance of the item set; however, the R2 values were relatively low (≤ 0.31) for all controlling-behavior items in the set. In fact, 39% of all parameter estimates were found to be non-invariant, substantially exceeding the suggested threshold of 25% or fewer non-invariant parameters for trustworthy comparisons of factor means of controlling behavior across survey-design groups (29, 30). Thus, our findings from AO suggested that the DHS controlling-behavior item set was not approximately invariant across nine survey-design groups capturing survey conditions considered important for collecting high-quality data in IPV research (9).
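The simplicity function in AO can be illustrated with a component loss that behaves approximately like the square root of the absolute parameter difference, which rewards solutions with many near-zero differences and a few large ones. The sketch below is illustrative only: the smoothing constant, the parameter counts, and the non-invariance flags are assumptions rather than values from this analysis; only the 39%-versus-25% comparison mirrors the result reported above.

```python
import math
from itertools import combinations

def component_loss(d, eps=0.01):
    # Component loss f(d) = (d**2 + eps) ** 0.25, approximately |d| ** 0.5;
    # eps is a small smoothing constant (its exact value is an assumption).
    return math.sqrt(math.sqrt(d * d + eps))

def total_loss(params_by_group):
    # Sum the component loss over every pair of groups and every item
    # parameter; AO seeks the factor means/variances minimizing this total.
    loss = 0.0
    for ga, gb in combinations(params_by_group, 2):
        for pa, pb in zip(ga, gb):
            loss += component_loss(pa - pb)
    return loss

def share_noninvariant(flags):
    # Proportion of (item, group) parameters flagged non-invariant.
    return sum(flags) / len(flags)

# Hypothetical bookkeeping: 5 items x 9 groups x 2 parameters (loading,
# threshold) = 90 parameters, of which 35 are flagged non-invariant.
flags = [1] * 35 + [0] * 55
print(round(100 * share_noninvariant(flags)))  # 39, above the 25% rule of thumb
```

The near-square-root shape of the loss is the design choice that drives alignment toward a few large, interpretable non-invariant parameters rather than many small ones.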
A second alternative explanation for the observed non-invariance in this controlling-behavior item set across survey-design groups is confounding: other covariates could explain the non-invariance across these groups (37). To explore this possibility, we estimated MGCFA with adjustment for theoretically relevant covariates at the woman level (completed years of schooling and attitudes about IPV against women) and the population or national level (e.g., national means for these variables and the gender-related legal environment). However, adjusting separately and jointly for these covariates did not reduce the observed non-invariance of the items across survey-design groups. Thus, despite adjustment for relevant individual and contextual factors, the item set remained measurement non-invariant across survey-design groups defined by the number of preceding questionnaire modules (9–11, 12, or 13–15), a proxy for respondent burden, and the number of weeks of interviewer training (1–3, 4, or 5–6), a proxy for interviewer skill.
Implications for Comparative Research and Global Monitoring of Controlling Behaviors
Our findings have several implications for cross-national research and monitoring of controlling behavior as a dimension of psychological IPV against women. First, further secondary analyses may be conducted to assess whether the observed measurement non-invariance across survey-design groups is reduced with adjustment for other theoretically relevant variables, such as the survey year, to adjust for potential temporal differences in item meaning, and the language of the questionnaire (which is available) and the primary language of the participant, interviewer, and interview (which are not available), to adjust for linguistic sources of variation in item meaning (38). Guidelines for translating scales into other languages exist (39), but the suggested steps are resource intensive and may not be fully implemented in many surveys. Covariates for other relevant survey conditions may include the nature and extent of interviewer–participant matching, survey-team sizes or the extent of interviewer supervision, the expected daily quota of completed interviews per interviewer, the duration of the fieldwork, and the season or conditions of the fieldwork. Many of these potential sources of non-invariance are modifiable and should be tested systematically for their contribution to measurement non-equivalence of the controlling-behavior items, and of psychological IPV against women more generally.
Second, we recommend that global researchers with the appropriate expertise advance theory and basic research on controlling behavior and, more broadly, psychological IPV against women. The promotion of standard definitions, such as the working definition we propose, may inform refinement of the existing DHS controlling-behavior items in ways that improve measurement invariance of the two most non-invariant items as well as the other three items, which also exhibited low invariance. Cross-cultural qualitative research may inform revisions to the wording of items in the current set so that they align better with how lay women conceptualize controlling behavior. Such research also may identify other cross-culturally salient controlling behaviors that more fully reflect its definition: one partner’s credible demands for compliance with behaviors that systematically constrain the other partner’s actions, relationships, and activities. For example, the current set of DHS controlling-behavior items emphasizes constraints on a woman’s relationships, and new items may operationalize constraints on a woman’s actions and activities. Survey experiments to understand how variations in survey design may causally affect the measurement invariance of the current DHS items and newly identified items are also warranted. Finally, we recommend that rigorous psychometric assessment, applying the sequenced steps performed here, alongside randomized experiments, become the standard for assessing the measurement invariance of all item sets designed to capture women’s experiences of IPV in cross-cultural research.
Despite the limitations of existing measures, current evidence suggests that controlling behaviors are among the most common forms of IPV against women globally, with substantial health implications for women (1). Thus, efforts to improve existing measures of controlling behavior, and psychological IPV against women more generally, remain critical.