Setting
Zambia is a sub-Saharan African (SSA) country situated in the southern part of Africa, with an estimated population of 16.9 million people in 2018. It has a relatively young population, with a median age of 16.9 years (20). Twenty-two percent of the population is aged between 15 and 24 years, with an estimated HIV prevalence of 6.6% (21).
Study Design
We reanalysed data from the 2013–14 Zambia Demographic and Health Survey (ZDHS), a cross-sectional survey based on a two-stage stratified cluster sample design. A total sample of 18,052 households was selected for the 2013–14 ZDHS, which provided reliable estimates of population and health indicators at the national and provincial levels, and for rural and urban areas within provinces. Details of the sample design and methodology are reported in the 2013–14 ZDHS report and in an article we previously published based on the same data (21, 22).
Study Population
In the 2013–14 ZDHS, the eligible study population included women aged 15–49 years and men aged 15–59 years who were either permanent residents of the sampled households or visitors present in the households on the night before the survey (21). A total of 16,411 women and 14,773 men were successfully interviewed, yielding response rates of 96.2% and 91.1%, respectively. The principal reason for non-response among eligible adults was failure to find individuals at home despite repeated visits, followed by refusal to be interviewed. Of the respondents who completed the structured interview, 29,007 aged 15–59 years were tested for HIV, corresponding to a response rate of 87.2%. Young people aged 15–24 years accounted for 11,571 (39.9%) of the people tested. Further details about participation in the survey and HIV testing response rates can be found in the 2013–14 ZDHS report (21).
HIV Data Collection Procedure
The 2013–14 ZDHS HIV testing protocol allowed for linking of the HIV results to the socio-demographic data. Eligible women and men who consented to HIV testing were asked to voluntarily provide about five drops of blood from a finger prick for anonymous testing. Each dried blood spot (DBS) sample was given a bar code label, a duplicate label was attached to the individual’s questionnaire, and a third bar code label was attached to the blood sample transmittal form to track the blood sample from the field to the central laboratory (21). The HIV laboratory testing algorithm used for the DHS surveys follows the UNAIDS/WHO guidelines for HIV testing in population-based surveys. This testing algorithm included the Vironostika HIV antigen/antibody combination assay (Biomerieux) as the first assay and the Enzygnost HIV Integral II assay (Dade Behring) as the second, confirmatory assay. Both tests are fourth-generation enzyme-linked immunosorbent assays (ELISAs) (23). Western blot was used as a third confirmatory test. Further details about the testing methodology can be found in the 2013–14 ZDHS report (21).
Variables
The selection of variables for the analysis in this article was based on the proximate determinants framework for factors affecting the risk of HIV transmission developed by Boerma and Weir (24). According to this conceptual framework, underlying variables such as socioeconomic status influence proximate determinants, which in turn act on biological mechanisms to influence health outcomes (i.e., HIV infection). The proximate determinants include, among others, concurrent sexual partnerships, number of sexual partners, and condom use. The conceptual framework helps to understand the causal pathways from distal socioeconomic factors to HIV infection. The focus was on underlying determinants, i.e., the socioeconomic context of the neighbourhood (neighbourhood educational attainment, wealth, and employment status) and individual demographic and socioeconomic factors. The operational definitions are as follows:
Dependent variable:
The dependent variable for this study was HIV status, which is defined as serostatus determined by testing blood samples collected from each consenting individual (0 indicating “HIV negative” and 1 “HIV positive”).
Independent variables:
Individual level: The following variables were included in the analysis: sex, age, marital status, residence, educational attainment, wealth, and employment status. The wealth index score from the ZDHS was used, and separate wealth tertiles were created for urban and rural populations to reduce residence bias. The wealth score was calculated from the first component of a principal component analysis (PCA) of household assets, housing characteristics, and access-to-amenities data (e.g., roof and floor material, electricity, water supply, and possession of goods such as a bicycle and television). The methodology used to calculate the wealth index is based on the Filmer and Pritchett approach, and a detailed explanation is available in the Measure DHS report (25). Educational attainment was defined as the reported number of years spent in school and was used as a continuous variable in the regression model. Employment status was measured by asking respondents if they had been working in the 12 months preceding the survey, and responses were categorised as unemployed or employed. Occupation was defined as the main type of work done in the 12 months preceding the survey and was classified using the ILO International Standard Classification of Occupations (ISCO). However, this variable was not included in the regression model, as it partly represents the same information as employment status.
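As an illustration of the residence-specific wealth tertiles described above, the following minimal Python sketch assigns tertiles separately within urban and rural strata. It is not the code used for the analysis: the field names `residence` and `wealth_score`, the helper names, and the example scores are hypothetical, and the cut points are those returned by the default method of Python's `statistics.quantiles`, which may differ slightly from other software.

```python
from statistics import quantiles


def tertile_labels(scores):
    """Label each score low/medium/high using the tertile cut points of `scores`."""
    lower, upper = quantiles(scores, n=3)  # the two tertile cut points
    return [
        "low" if s <= lower else "medium" if s <= upper else "high"
        for s in scores
    ]


def wealth_tertiles_by_residence(rows):
    """Add a 'wealth_tertile' key to each row, computed within its residence stratum.

    `rows` is a list of dicts with hypothetical keys 'residence' and 'wealth_score'.
    """
    strata = {}
    for row in rows:
        strata.setdefault(row["residence"], []).append(row)
    for members in strata.values():
        labels = tertile_labels([m["wealth_score"] for m in members])
        for member, label in zip(members, labels):
            member["wealth_tertile"] = label
    return rows


# Illustrative scores only (not ZDHS data).
sample = [{"residence": "urban", "wealth_score": s} for s in (10, 20, 30, 40, 50, 60)]
sample += [{"residence": "rural", "wealth_score": s} for s in (-6, -5, -4, -3, -2, -1)]
wealth_tertiles_by_residence(sample)
```

Because the cut points are computed within each stratum, the wealthiest rural households are classified as high even when their scores fall below those of the poorest urban households, which is the sense in which stratification reduces residence bias.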
Neighbourhood level: In this study, neighbourhoods were defined as geographic areas located in close proximity to each other, and an enumeration area was used as a proxy for a neighbourhood. Variables describing the characteristics of the neighbourhoods were derived by aggregating individual responses within each cluster for all respondents and then categorising the means or proportions into three levels (low, medium, and high). Neighbourhood educational attainment was derived by calculating the mean number of years of school attained by the individuals who were interviewed in each enumeration area. Neighbourhood employment was the proportion of individuals interviewed in each cluster who were categorised as employed in the last 12 months. Neighbourhood relative wealth was derived by calculating the mean of the wealth scores of all respondents in the clusters.
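The derivation of neighbourhood-level variables can be sketched as follows. This is an illustrative Python example under stated assumptions, not the procedure used in the study: the field names `cluster_id` and `employed` are hypothetical, and the use of tertile cut points across clusters to define the low, medium, and high levels is an assumption, since the exact cut-offs are not specified here.

```python
from collections import defaultdict
from statistics import mean, quantiles


def neighbourhood_levels(rows, value_key, cluster_key="cluster_id"):
    """Return {cluster: (cluster mean, level)} for one respondent-level variable."""
    values = defaultdict(list)
    for row in rows:
        values[row[cluster_key]].append(row[value_key])
    cluster_means = {cluster: mean(vals) for cluster, vals in values.items()}
    # Assumption: tertile cut points across clusters define the three levels.
    lower, upper = quantiles(list(cluster_means.values()), n=3)
    result = {}
    for cluster, m in cluster_means.items():
        level = "low" if m <= lower else "medium" if m <= upper else "high"
        result[cluster] = (m, level)
    return result


# Illustrative respondents only: 'employed' is a 0/1 indicator,
# so its cluster mean is the proportion employed.
respondents = [
    {"cluster_id": c, "employed": e}
    for c, flags in {1: (0, 0, 0, 1), 2: (0, 0, 1, 1), 3: (0, 1, 1, 1)}.items()
    for e in flags
]
levels = neighbourhood_levels(respondents, "employed")
```

The same function applies to years of schooling or wealth scores by changing `value_key`; for a 0/1 indicator the cluster mean is simply the proportion.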
Analysis
We restricted the analyses to young people (15–24 years). We conducted the analyses for both sexes combined and then stratified by sex. Analyses were carried out in two phases. In the first phase, we explored bivariate associations between HIV serostatus and individual demographic and socioeconomic factors, and neighbourhood-level socioeconomic factors. In the second phase, we fitted multilevel mixed-effects logistic regression models of HIV status on the associated factors using Stata 15. Multilevel modelling takes into account both individual and neighbourhood-level variability. The bivariate and multivariate analyses also controlled for the potential confounder age, which was adjusted for as a linear effect. We tested five separate models to examine the association between HIV prevalence and socioeconomic factors. Model 1 was the null or random intercept-only model, which did not include any socioeconomic or demographic factors. Subsequent models gradually included socioeconomic and demographic variables, starting with individual-level variables only (model 2), then neighbourhood-level variables only (model 3), and finally both individual and neighbourhood-level variables (model 4). The final model (model 5) included only the variables that were significant at the individual or neighbourhood level. The log-likelihood test was used to assess the goodness of fit of the models and to determine whether adding independent variables to the intercept-only model significantly improved the fit.
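The comparison of each model with the intercept-only model can be illustrated with a likelihood ratio test. The Python sketch below assumes that the log-likelihood test referred to above is the standard likelihood ratio test for nested models fitted to the same observations; the function names and the log-likelihood values in the example are hypothetical and are not results from this study. The chi-square upper tail is computed with the standard library only, using the closed forms for 1 and 2 degrees of freedom and the usual recurrence for higher degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        p, k = math.exp(-half), 2                 # closed form for df = 2
    else:
        p, k = math.erfc(math.sqrt(half)), 1      # closed form for df = 1
    # Recurrence: Q(x | k + 2) = Q(x | k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        p += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return p


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (test statistic, p-value) for nested models.

    `df` is the number of additional parameters in the fuller model.
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods: null model vs. a model with 3 extra parameters.
statistic, p_value = likelihood_ratio_test(-1000.0, -990.0, df=3)
```

A small p-value indicates that the added variables improve the fit over the reduced model. In Stata, the equivalent comparison is typically made with `lrtest` after storing the estimates of both models.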
The multilevel statistics estimated were the explained variance, the intraclass correlation (ICC), and log-likelihood tests. To estimate the explained variance ($R^2$), we used the approach described by Hox (2010) and proposed by Snijders and Bosker (1999) for multilevel logistic regression models, which does not rely on the likelihood. The estimated variance is decomposed into the lowest-level residual variance ($\sigma^2_R$), which is fixed to $\pi^2/3 \approx 3.29$ in logistic models, the second-level variance ($\sigma^2_{u0}$), and the variance of the linear predictor from the fixed part of the model ($\sigma^2_F$) (26). The explained variance is estimated using the formula:
See formula 1 in the supplemental files.
The ICC, or rho, indicates the proportion of the variance explained by the grouping structure in the population, which in our case is the neighbourhoods (26). This is the same as the unexplained neighbourhood-level variance. We tested for interactions between individual and neighbourhood variables in the final model, but none were found to be statistically significant.
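For readers without access to the supplemental files, the two quantities can be sketched in Python as follows. This is a minimal illustration that assumes the standard latent-variable formulation for multilevel logistic models, in which the explained variance is the variance of the linear predictor divided by the sum of that variance, the neighbourhood-level variance, and the fixed level-1 variance, and the ICC is the neighbourhood-level variance divided by the sum of the neighbourhood-level and level-1 variances. The function names and input values are hypothetical and are not estimates from this study.

```python
import math

# Level-1 residual variance of the standard logistic distribution (about 3.29).
LEVEL1_VARIANCE = math.pi ** 2 / 3


def intraclass_correlation(var_neighbourhood):
    """Share of the unexplained variance that lies between neighbourhoods."""
    return var_neighbourhood / (var_neighbourhood + LEVEL1_VARIANCE)


def explained_variance(var_linear_predictor, var_neighbourhood):
    """Explained variance: share of total variance due to the fixed part of the model."""
    total = var_linear_predictor + var_neighbourhood + LEVEL1_VARIANCE
    return var_linear_predictor / total


# Hypothetical variance estimates for illustration only.
icc_example = intraclass_correlation(var_neighbourhood=0.40)
r2_example = explained_variance(var_linear_predictor=0.60, var_neighbourhood=0.40)
```

In the intercept-only model the variance of the linear predictor is zero, so the explained variance is zero and the ICC describes how much of the total variation is attributable to neighbourhoods.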
Ethics
Ethical approval was obtained from the Tropical Disease and Research Centre (TDRC) ethical committee, the institutional review board of ICF International, and the Centers for Disease Control and Prevention (CDC) Atlanta research ethics review board. Participation in the survey was based on verbal, voluntary informed consent. Participants were informed that the survey’s HIV testing results were anonymous. Home-based counselling and testing following the national HIV testing algorithm were offered in parallel so that respondents who consented could learn their HIV infection status. The home-based procedure consisted of concurrent HIV testing with Determine™ HIV-1/2 (Alere Healthcare) and Uni-Gold™ (Trinity Biotechnology), with nurses and lay counsellors providing pre- and post-test counselling. If either of the rapid tests was HIV reactive (positive), the respondent was referred to the nearest health facility for further assessment, treatment, and care.