Study settings
We use data from previous studies in Ethiopia, Malawi, Mozambique, and Zambia. The Ethiopia sample comprised 1,586 people from 24 communities (81% rural) across six districts (woredas) in three regions of the country. The Malawi sample comprised 1,400 people from 70 rural villages in Chiradzulu district. The Mozambique sample comprised 601 people, half from 24 urban blocks (quarteirões) in Maputo City and half from 18 blocks in the large town of Dondo in Sofala province. The Zambia sample comprised 365 people from nine rural villages in Chongwe district. Our final sample includes 3,952 households across the four sites, representing heterogeneous geographies, cultures, and sanitation infrastructure availability. Households were randomly sampled in all sites, although the sampling approach differed between sites. Further details of the design of the underlying studies are in Supplementary Material A.
Table 1
SanQoL-5 questions (descriptive system)
Attribute | Question (item)* | Responses |
Disgust | How often do you feel disgusted while using the toilet? | Always / Sometimes / Never |
Disease | How often do you worry that the toilet spreads diseases? | Always / Sometimes / Never |
Safety | How often do you feel unsafe while using the toilet? | Always / Sometimes / Never |
Shame | How often do you feel ashamed about using the toilet? | Always / Sometimes / Never |
Privacy | How often do you worry about being seen while using the toilet? | Always / Sometimes / Never |
* A preamble is as follows: “The following questions are about your sanitation experiences in the past 30 days, meaning defecation, urination, and anything else you do in a toilet. Please respond with always, sometimes or never.” If less literate respondents struggle with a question, it can be reformulated as “Do you feel disgusted while using the toilet? How often?”. Before the SanQoL-5 questions, the respondent is asked about the last place they defecated. If the respondent practiced open defecation (OD), e.g. in fields or wasteland, they are directed to OD-specific questions, e.g. “How often do you worry about being seen while practising open defecation?”.
SanQoL-5 data and weighting
The SanQoL-5 questions are in Table 1. There are 243 (= 3⁵) possible combinations of SanQoL-5 attribute levels. Attribute weights can be derived using a range of preference elicitation approaches. The four samples included in this study used a discrete choice experiment (Mozambique), attribute scoring (Malawi), and attribute ranking (Ethiopia, Zambia). Further details are provided in Supplementary Material B. The index represents a given population’s relative valuation of the attributes, so weights typically differ slightly between countries, as with health-related QoL indices such as EQ-5D (Brazier et al., 2016). For example, if avoiding disgust is more important to people in a given country than avoiding shame, disgust would have a higher weight in the index. One person’s SanQoL-5 index value might be 0.46 and another’s 0.78, based on their responses to the five questions combined with the population weights. Higher scores are better, with 1 denoting “full sanitation capability” (maximum QoL) and 0 “no sanitation capability” (minimum QoL).
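To illustrate how responses and population weights might combine into an index value, the following is a minimal sketch rather than the scoring algorithm used in the underlying studies. The weights and the equally spaced level scores are hypothetical assumptions chosen for illustration only. Actual weights come from the preference elicitation exercises described in Supplementary Material B.

```python
from itertools import product

ATTRIBUTES = ("disgust", "disease", "safety", "shame", "privacy")
LEVELS = ("always", "sometimes", "never")

# Hypothetical attribute weights summing to 1 (NOT the published weights).
WEIGHTS = {"disgust": 0.25, "disease": 0.20, "safety": 0.20,
           "shame": 0.15, "privacy": 0.20}

# Assumed level scores under the "always = worst" framing of Table 1.
# Equal spacing between levels is an illustrative assumption.
LEVEL_SCORE = {"always": 0.0, "sometimes": 0.5, "never": 1.0}


def sanqol5_index(responses):
    """Weighted sum of level scores; 0 = minimum QoL, 1 = maximum QoL."""
    return sum(WEIGHTS[a] * LEVEL_SCORE[responses[a]] for a in ATTRIBUTES)


# Enumerate every possible response profile: 3 levels for each of 5 attributes.
profiles = list(product(LEVELS, repeat=len(ATTRIBUTES)))

best = dict(zip(ATTRIBUTES, ("never",) * 5))
worst = dict(zip(ATTRIBUTES, ("always",) * 5))
```

Under these assumptions, `len(profiles)` equals the 243 combinations noted above, and the `best` and `worst` profiles map to the two anchors of the scale.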
Overall study design
We apply a combination of classical test theory and item response theory (IRT) to assess different aspects of validity and reliability, including: (i) construct validity – whether an instrument measures the construct it intends to measure; (ii) convergent validity – whether two instruments aiming to measure similar constructs are correlated (an aspect of construct validity); (iii) known groups validity – whether an instrument can discriminate between two groups expected to differ in terms of the outcome (another aspect of construct validity); and, (iv) internal reliability – how consistently different questions in a measure capture the same construct (Fayers & Machin, 2015).
We took a predictive approach to construct validity, testing hypotheses about how SanQoL-5 would covary with hypothesised variables, explained below. We assessed convergent validity by the correlation between SanQoL-5 and a sanitation visual analogue scale (VAS) with scores ranging from 0 to 100 (Cheung et al., 2024) (Supplementary Material C). We used Spearman’s rank correlation (ρ) because, like EQ-5D index values (Parkin et al., 2016), SanQoL-5 index values are not usually normally distributed in a given population. We hypothesised that there would be moderate correlation (0.4 < ρ < 0.6), following norms for health VAS (Whynes et al., 2008). We explored known groups validity by assessing whether people with higher sanitation service levels tended to have higher SanQoL-5 index values. We assessed internal reliability using Cronbach's alpha (threshold > 0.7, following Nunnally (1978)) and item-total correlation (threshold > 0.4, following Ware et al. (1980)).
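The two headline statistics in this paragraph can be sketched in a few lines. The code below is an illustrative, standard-library implementation of Spearman's ρ (Pearson correlation of average ranks, which handles ties) and Cronbach's alpha. It is not the code used for the analyses reported here, and the function names are our own.

```python
from statistics import mean, variance


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cronbach_alpha(items):
    """items: list of k lists, each holding one score per respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var / variance(totals))
```

For example, `spearman_rho(index_values, vas_scores)` would give the convergent validity statistic, and `cronbach_alpha` applied to the five item score lists would give the internal reliability statistic, where both argument names are placeholders for the analyst's own data.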
Hypotheses for construct validation
We pre-specified hypotheses about the presence of associations between SanQoL-5 index values and a set of toilet characteristics, hereafter “hypothesised variables” (Al-Janabi et al., 2013; Savoia et al., 2006; Smith, 2005). These were predominantly fieldworker observations of toilet characteristics including: walls being “solid”; faeces not being observed on the pan/slab; the pan/slab being concrete or similar; a water seal being present; the toilet having an inside lock; and, the toilet not being shared with other households. Variables were binary coded (1 = better outcome, 0 = worse) so that positive regression coefficients are hypothesised. For example, if the fieldworker observes that the toilet has solid walls, we hypothesise a positive association with SanQoL-5. This is because solid walls are more likely to provide privacy and safety than makeshift or absent walls. In making hypotheses, we drew on the literature on sanitation and mental wellbeing, as well as motives for sanitation behaviours (Novotný et al., 2018; Sclar et al., 2018). Further details and rationales for hypothesised variables are provided in Supplementary Material D. We also included negative controls hypothesised not to be associated with SanQoL-5 (Arnold & Ercumen, 2016), namely household size and whether the respondent had a partner.
Construct validity analyses were completed for each country separately. We only assessed a binary variable for a given country if > 15% of the sample with non-missing data was in each category, to ensure a minimum level of statistical power. We tested hypotheses using generalised linear mixed models (GLMM) in Stata 18, with standard errors clustered at the sampling level above the household (e.g. urban block in Mozambique, village in Malawi). For each country, we regressed SanQoL-5 index values on each hypothesised variable in turn. We also explored the consequences of accounting for covariance between toilet characteristics by including all hypothesised variables as covariates concurrently.
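The inclusion rule for binary variables can be expressed as a simple check. The sketch below is illustrative only: the function name is hypothetical, and missing values are assumed to be coded as `None`. The analyses themselves were run in Stata 18.

```python
def eligible_for_testing(values, threshold=0.15):
    """Return True if more than `threshold` of non-missing observations
    fall in each category of a binary (0/1) variable.

    Missing observations are assumed to be coded as None and are excluded
    from the denominator, matching the "non-missing data" rule above.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        return False
    share_better = sum(1 for v in observed if v == 1) / len(observed)
    share_worse = 1 - share_better
    return share_better > threshold and share_worse > threshold
```

Under this rule, a characteristic present in only 5% of a country's non-missing sample would not be assessed for that country, whereas one present in 30% would.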
Item response theory (IRT)
We used the graded response model (GRM) to assess the psychometric properties of each attribute and its contribution to the information function for unweighted SanQoL-5. GRM is widely used in the evaluation of health-related QoL measures because it accommodates polytomous items, i.e., those with more than two response levels (Liu et al., 2021; Reeve et al., 2007). GRM is not part of the “Rasch family” because it allows discrimination to vary across items (Fayers & Machin, 2015). For IRT analyses we pooled data across countries to meet recommended minimum sample sizes (Fayers & Machin, 2015). Based on the GRM, we present item information and test information functions, as well as category characteristic curves.
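For readers less familiar with the GRM, the sketch below shows how category characteristic curves and item information follow from an item's discrimination and threshold parameters, using the standard logistic formulation of the model. The parameter values in the usage note are hypothetical and are not estimates from this study.

```python
import math


def cumulative_prob(theta, a, b):
    """P(response in category k or higher | theta) for threshold b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def boundary_curves(theta, a, thresholds):
    """Cumulative probabilities padded with 1 (lowest) and 0 (beyond highest)."""
    return [1.0] + [cumulative_prob(theta, a, b) for b in thresholds] + [0.0]


def category_probs(theta, a, thresholds):
    """Category probabilities: differences of adjacent cumulative curves.

    `thresholds` must be increasing; an item with three response levels
    has two thresholds.
    """
    cum = boundary_curves(theta, a, thresholds)
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def item_information(theta, a, thresholds):
    """Item information: sum over categories of (dP_k/dtheta)^2 / P_k."""
    cum = boundary_curves(theta, a, thresholds)
    # Derivative of each logistic boundary curve; zero for the fixed 1 and 0.
    dcum = [a * p * (1.0 - p) for p in cum]
    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]
        dp_k = dcum[k] - dcum[k + 1]
        info += dp_k ** 2 / p_k
    return info
```

As a usage example, `category_probs(0.0, 1.5, [-1.0, 1.0])` returns three probabilities that sum to 1 for a hypothetical item with discrimination 1.5. Summing `item_information` across the five items at a given value of the latent trait gives the test information at that value.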
Comparing question framings
In the first two studies in which the SanQoL-5 was used (Ross, Greco, et al., 2021; Tidwell et al., 2022), the questions had been framed such that “always” was the best outcome, for example, “Can you use the toilet without feeling disgusted?”. Mixed-methods cognitive and piloting work in support of the Zambia study identified this framing as challenging to understand without detailed explanation, both in local languages and in other languages spoken by the team (e.g., Hindi). To facilitate a comparison, we included the old (“always = best”) questions alongside the new/current question framing (Table 1) in Zambia. A third of the Ethiopia sample (n = 506), where fieldwork took place at a similar time, were also asked both sets of questions. A further analysis in our present study therefore compared the performance of the “always = best” and “always = worst” framings, using the same validity and reliability methods as above. For example, we tested the construct validity hypotheses under the two question framings for the five SanQoL-5 attributes and compared results. For a fair comparison in Ethiopia, we only compared results for the n = 506 who completed both question formulations (rather than the full n = 1,586 sample).
Ethics
The Malawi study received prior approval from the National Committee on Research in the Social Sciences and Humanities (ref: NCST/RTT/2/6). The Mozambique study received prior approval from the Comité Institucional de Ética do Instituto Nacional de Saúde (ref: 028/CIE-INS/2023). The Zambia study received prior approval from the University of Zambia Biomedical Research Ethics Committee (ref: UNZA-1389/2020). The Ethiopia data were collected as part of an internal evaluation by World Vision, who secured a prior approval letter from each district sampled for data collection. Use of the Ethiopia data was approved by LSHTM since anonymised data had been made openly available online by World Vision at https://osf.io/x5myz/ before this study commenced. The protocol covering Ethiopia and Zambia was approved by the LSHTM MSc Research Ethics Committee (ref: 29049), while the LSHTM Observations/Interventions Research Committees approved the studies in Malawi (ref: 28249) and Mozambique (ref: 28190).
Table 2
Respondent and toilet characteristics
| Ethiopia (n = 1,586) | Malawi (n = 1,400) | Mozambique (n = 601) | Zambia (n = 365) |
Milieu | 81% rural | 100% rural | 100% urban | 100% rural |
Demographic characteristics | | | | |
Respondent is female | 829 (52%) | 1,167 (84%) | 330 (55%) | 182 (50%) |
Respondent age (mean/SD) | 41.8 (13.6) | 39.9 (16.1) | 40.4 (16.4) | 43.3 (15.6) |
Aged 18–29 | 286 (18%) | 401 (29%) | 205 (34%) | 80 (22%) |
Aged 30–44 | 663 (42%) | 529 (38%) | 182 (30%) | 129 (35%) |
Aged 45–59 | 419 (26%) | 275 (20%) | 111 (18%) | 91 (25%) |
Aged 60+ | 217 (14%) | 187 (13%) | 103 (17%) | 65 (18%) |
Household size (mean/SD) | 5.3 (2.1) | 4.5 (1.7) | 5.3 (2.5) | 5.8 (3.0) |
Completed primary school or above | 722 (91%)* | 1,256 (90%) | 399 (66%) | 258 (71%) |
Piped water on-premises | 469 (30%) | 4 (0.3%) | 358 (60%) | 17 (5%) |
Sanitation characteristics | | | | |
Toilet type | | | | |
Flush or pour-flush toilet | 5 (0.3%) | 1 (0.1%) | 195 (32%) | 5 (1%) |
Pit latrine with concrete slab | 219 (14%) | 194 (14%) | 301 (50%) | 61 (17%) |
Pit latrine with wood/soil slab | 961 (61%) | 1,159 (83%) | 47 (8%) | 254 (70%) |
Open defecation | 401 (25%) | 46 (3%) | 58 (10%) | 45 (12%) |
Toilet shared with other households | 967 (84%) | 515 (38%) | 389 (72%) | 219 (60%) |
Toilet has solid walls | 358 (30%) | 961 (72%) | 430 (72%) | 237 (79%) |
Faeces not observed on pan/slab | 381 (32%) | 215 (16%) | 579 (96%) | 135 (45%) |
Pan/slab is concrete, porcelain or similar | 220 (19%) | 195 (15%) | 496 (91%) | 70 (23%) |
Water seal is present (flush or pour-flush) | 5 (0.4%) | 1 (0.1%) | 195 (36%) | 5 (2%) |
Toilet has inside lock | n/a (not collected) | 65 (5%) | 152 (28%) | 6 (2%) |
Data are n (%) for categorical variables and mean (SD) for numerical variables. Percentages for categorical variables are % of those with non-missing data for that variable. * In Ethiopia, data are for highest level of education “reached” rather than “completed”, and the question was asked of only a randomly selected half of the sample.