Study setting and data source
The United Nations (UN) Statistics Division subdivides the African continent into five regions. Among these regions, East Africa is the largest and includes 19 countries (Burundi, Comoros, Djibouti, Ethiopia, Eritrea, Kenya, Madagascar, Malawi, Mauritius, Mozambique, Reunion, Rwanda, Seychelles, Somalia, Somaliland, Tanzania, Uganda, Zambia, and Zimbabwe). This study was a secondary data analysis based on Demographic and Health Surveys (DHS). Of these 19 East African countries, 13 have DHS data, whereas 6 (Djibouti, Somalia, Somaliland, Seychelles, Mauritius, and Reunion) do not. Of the 13 countries with DHS data, 2 have surveys that were conducted before 2010 (Eritrea, 2002, and Madagascar, 2008). In this study, we therefore included DHS data from the remaining 11 countries, whose surveys were conducted after 2010.
The data for these 11 East African countries were accessed from the official database of the Demographic and Health Surveys (DHS) Program, www.measuredhs.com, after authorization was granted through an online request explaining the goal of our study. We used the Individual Record (IR) file data set and extracted the dependent and independent variables. To collect data that are comparable across countries, the DHS Program adopts standardized methods involving uniform questionnaires, manuals, and field procedures. The DHS is a nationally representative household survey that provides data for a wide range of monitoring and impact evaluation indicators in the areas of population, health, and nutrition, collected through face-to-face interviews with women aged 15 to 49. The surveys use stratified, multi-stage, random sampling. In each country, information was obtained from eligible women aged 15 to 49 years. Detailed survey techniques and sampling methods used to collect the data have been described elsewhere (13).
Variables
Outcome variable
The response (outcome) variable of this study was ANC utilization. It is binary, coded as 1 if a woman received ANC from a skilled health care provider (doctor, midwife, nurse, or health officer) at least four times and 0 otherwise.
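The coding rule for the outcome can be illustrated with a minimal sketch. This is not the authors' code: the function name, its arguments, and the example records are hypothetical, and the sketch assumes that the number of visits and the provider type have already been derived for each woman.

```python
def code_anc4(n_visits, skilled_provider):
    """Return 1 if the woman had at least four ANC visits with a
    skilled provider, and 0 otherwise (hypothetical helper)."""
    return 1 if (skilled_provider and n_visits >= 4) else 0


# Hypothetical records: (number of ANC visits, attended by a skilled provider?)
records = [(5, True), (3, True), (4, False), (0, False), (4, True)]
outcome = [code_anc4(n, skilled) for n, skilled in records]
print(outcome)
```

Only women who meet both conditions (four or more visits and a skilled provider) are coded as 1.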
Independent variables
Based on the literature, two types of independent variables were considered: individual-level and community-level variables. The community-level variables were literacy rate, country, and residence. The individual-level variables were age, level of education, distance from health facility, birth order, mass media, and wealth index.
Data processing and management
Data processing and analysis were performed using STATA 15 software. Before any statistical analysis, the data were weighted using the sampling weight, primary sampling unit, and strata. Weighting restores the representativeness of the survey and allows STATA to take the sampling design into account when calculating standard errors, so that the statistical estimates are reliable. Cross tabulations and summary statistics were used to describe the study population.
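The effect of the sampling weight on a point estimate can be shown with a short sketch. It is illustrative only and is not the STATA procedure used in the study: it assumes that the raw DHS women's weight is stored with six implied decimal places (so it is divided by 1,000,000), the data values are hypothetical, and it does not reproduce design-based standard errors, which also require the primary sampling unit and strata.

```python
def weighted_proportion(outcomes, raw_weights, scale=1_000_000):
    """Weighted proportion of a 0/1 outcome.

    raw_weights are assumed to be DHS-style weights with six implied
    decimals, so each one is divided by `scale` before use.
    """
    weights = [w / scale for w in raw_weights]
    total = sum(weights)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    return sum(y * w for y, w in zip(outcomes, weights)) / total


# Hypothetical data: four women, their 0/1 outcome and raw weights.
outcomes = [1, 0, 1, 1]
raw_weights = [500_000, 1_500_000, 1_000_000, 1_000_000]

unweighted = sum(outcomes) / len(outcomes)
weighted = weighted_proportion(outcomes, raw_weights)
print(unweighted, weighted)
```

In this toy example the woman with the largest weight has outcome 0, so the weighted proportion is lower than the unweighted one.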
Statistical analysis
Since the DHS data have a hierarchical structure, women within a cluster may be more similar to each other than to women in other clusters. As a result, the assumptions of independent observations and equal variance across clusters might be violated. Therefore, an advanced statistical model is required that takes the between-cluster variability into account in order to obtain reliable standard errors and unbiased estimates.
Given the dichotomous nature of the outcome variable, a multilevel mixed-effect logistic regression model was fitted. Models were compared using the Akaike and Bayesian Information Criteria (AIC and BIC), and the mixed-effect model with the lowest AIC and BIC was selected.
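The selection rule can be sketched with the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of estimated parameters, n the number of observations, and ln L the maximized log-likelihood. The log-likelihoods, parameter counts, and sample size below are hypothetical and are not results from this study; the choice of n for BIC in a multilevel model is itself an assumption.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical fits: name -> (log-likelihood, number of parameters)
fits = {
    "null model": (-5200.0, 2),
    "model I": (-4900.0, 8),
    "model II": (-5000.0, 5),
    "model III": (-4850.0, 11),
}
n_obs = 10_000  # hypothetical number of women

best_by_aic = min(fits, key=lambda m: aic(*fits[m]))
best_by_bic = min(fits, key=lambda m: bic(*fits[m], n_obs))
print(best_by_aic, best_by_bic)
```

Lower values indicate a better trade-off between fit and complexity; BIC penalizes additional parameters more heavily than AIC when n is large.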
The individual- and community-level variables associated with ANC utilization were first checked independently in bi-variable multilevel mixed-effect logistic regression models. Variables that were statistically significant at a p-value of 0.20 in the bi-variable analysis were considered for the final individual- and community-level model adjustments. In the multivariable multilevel mixed-effect analysis, variables with a p-value≤0.05 were declared significant determinants of ANC utilization. The intra-class correlation coefficient (ICC) was used to check whether a multilevel model was appropriate and how much of the overall variation in the response was explained by clustering.
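For a two-level logistic model, the ICC is commonly computed with the latent-variable formulation, in which the individual-level residual variance is fixed at π²/3 (about 3.29). The sketch below assumes that formulation; the text does not state which ICC formula was used, and the variance value in the example is hypothetical.

```python
import math


def icc_logistic(community_variance):
    """ICC for a random-intercept logistic model under the
    latent-variable approach: var_u / (var_u + pi^2 / 3)."""
    if community_variance < 0:
        raise ValueError("variance cannot be negative")
    return community_variance / (community_variance + math.pi ** 2 / 3)


# Hypothetical community-level variance taken from a null model.
example_icc = icc_logistic(0.8)
print(round(example_icc, 3))
```

An ICC near zero suggests that clustering explains little of the variation, whereas a larger value supports the use of a multilevel model.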
Four models were fitted. The first was the null model, which included no exposure variables and was used to estimate the community-level variance and provide evidence for assessing random effects at the community level. Model I was the multivariable model adjusted for individual-level variables, and model II was adjusted for community-level variables. Model III was fitted with candidate variables from both the individual and community levels.
The fixed effects (measures of association) were used to estimate the association between ANC utilization and the explanatory variables and were expressed as odds ratios with 95% confidence intervals. For the measures of variation (random effects), the community-level variance with its standard deviation and the intra-class correlation coefficient (ICC) were used.
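An odds ratio and its 95% confidence interval are conventionally obtained by exponentiating a logistic regression coefficient and its Wald limits, exp(β ± 1.96 × SE). The sketch below assumes this Wald-type interval; the coefficient and standard error are hypothetical and are not estimates from this study.

```python
import math


def odds_ratio_ci(beta, se, z=1.959964):
    """Return (odds ratio, lower limit, upper limit) from a log-odds
    coefficient and its standard error, using a Wald interval."""
    if se < 0:
        raise ValueError("standard error cannot be negative")
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Hypothetical coefficient for one explanatory variable.
or_est, lower, upper = odds_ratio_ci(beta=0.40, se=0.10)
print(f"OR = {or_est:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

When the interval excludes 1, the association is statistically significant at the 5% level under this approximation.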