Machine learning algorithm to predict determinants of home delivery after ANC visit among reproductive age women in East Africa: Using SHAP

doi:10.21203/rs.3.rs-4223378/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

Background: Home birth is described as a delivery that takes place at home without the presence of a skilled birth attendant. Home delivery after an ANC visit is a major public health concern, and reducing the proportion of home births in East Africa is a key strategy for lowering the maternal death rate. However, no studies have addressed this public health issue. Therefore, this study aimed to apply a machine learning approach to predict determinants of home delivery after ANC visits among reproductive-age women in East Africa.

Methods: A community-based, cross-sectional study was conducted using recent Demographic and Health Survey (DHS) data sets collected from 2011 to 2021. Nine supervised machine learning algorithms were applied to a total weighted sample of 44,123 women and evaluated with performance metrics in Python version 3.11. This study also followed Yufeng Guo’s widely used steps of supervised machine learning, together with SHAP analysis, to predict and identify important predictors of home delivery after ANC visits in East Africa.

Results: Home delivery after ANC visit was highest in Malawi, Uganda, and Kenya. Among the nine machine learning algorithms, random forest was selected as the best-performing model for this study. The beeswarm plot of the SHAP analysis showed that being a rural resident and having the ANC visit in the second trimester increased the likelihood of home delivery after an ANC visit, whereas rich household income, secondary educational level of husband, contraceptive use, short birth interval, primary educational level of husband, having no problems of distance to health facility, and having more than four ANC visits decreased women’s home delivery after ANC visits in East Africa.

Conclusion: The random forest classification algorithm effectively predicted home delivery after the ANC visit. Based on the top ten determinants of home delivery, this study recommends that women begin antenatal care services early and attend often throughout their pregnancies, and that health institutions guarantee high-quality services from qualified practitioners. It also recommends developing health facilities, promoting health education through the media, and encouraging women to get adequate information on health care services. Moreover, healthcare policy should give great consideration to women from low-income households.

Machine learning

Home delivery

ANC visit

East Africa

SHAP

Home birth is described as a delivery that takes place at home without the presence of a skilled birth attendant (midwife, nurse, or doctor) (1). Maternal health is a top concern in the global health agenda (2). Though the majority of maternal deaths are preventable, women lose their lives due to complications related to pregnancy and childbirth every minute of every day in the world (3). In 2017, nearly 295,000 mothers died from various pregnancy and childbirth-related problems, accounting for approximately 810 maternal deaths per day (3).

Complications associated with pregnancy and childbirth account for a considerable share of fatalities and disabilities worldwide, particularly in underdeveloped nations (4). Prolonged or obstructed labor, complications from unsafe abortion, bleeding, malaria during pregnancy, anemia, and sepsis are the leading causes of death (4). In Ethiopia, the percentage of women who gave birth at home was highest in the Afar and Somali regions (89.6 percent and 81.7 percent, respectively), while just 3.3 percent of Addis Ababa residents gave birth at home (5). Except for a few countries (Benin, Namibia, Zimbabwe, and Vietnam), the use of professional care during delivery is significantly lower in Sub-Saharan Africa and South/Southeast Asia, according to findings on levels and trends in the use of maternal health services in developing nations (5).

Several studies revealed that educational status of women (6-12), cultural factors (7, 8, 12, 13), region (6), parity (6), limited access to health facilities (7, 8), poor quality of care (7, 8), lack of transportation (7, 8, 14), age (5, 10, 11, 13, 15), marital status (14), environment (12, 16), distance to the health facility (3, 6, 8, 9, 14, 17), source of information (3, 6, 9, 14), antenatal care visit (3, 6-8, 18), birth order (6, 7), wealth index (6, 7, 10, 19), place of residence (3, 5, 6, 8, 11, 18), religion (6, 7, 13), employment of women and husband (5), knowledge on place of delivery (3, 10, 11, 18), unplanned pregnancy (9), and decision on place of delivery (3, 10, 12) were statistically significant predictors of home delivery.

Home birth can result in obstetric complications that lead to maternal morbidity and mortality (20). The burden of home delivery, particularly unattended delivery, is not simply a maternal health issue; it also results in perinatal and neonatal illness and mortality (9). Home deliveries were found to have a 21% higher perinatal mortality rate than institutional deliveries. In addition, home deliveries are linked to infection and other negative neonatal and maternal outcomes (9). Research has been conducted to investigate the factors that influence the place of delivery in specific regions using traditional analysis methods. However, no studies have examined home delivery after ANC visits and its predictors at the level of the East African region using machine learning algorithms (6).

Home delivery after ANC visit is a major public health concern, and reducing the proportion of home births in East Africa is a key strategy for lowering the maternal death rate (3). Accordingly, this investigation is important for prioritizing and designing public health interventions. Besides, this study was based on the weighted pooled DHS data of 12 East African countries, which has adequate power to detect the true effect of the predictors on home delivery after ANC visit, and it can assist health planners and policymakers in developing specific measures to reduce home deliveries. Therefore, this study aims to investigate determinants of home delivery after ANC visits among reproductive-age women in East Africa using supervised machine learning algorithms and SHAP analysis.

Study design and study setting

A community-based cross-sectional study was conducted to predict home delivery after ANC visits among women aged 15-49 using recent DHS data collected from 2011 to 2021 in the East Africa region. East Africa is the largest region, including 19 countries (Burundi, Comoros, Djibouti, Ethiopia, Eritrea, Kenya, Madagascar, Malawi, Mauritius, Mozambique, Rwanda, Seychelles, Somalia, Tanzania, Uganda, Zambia, South Sudan, Zimbabwe, and Sudan) (21). Of these 19 East African countries, 14 have DHS data, whereas five do not (Djibouti, Somalia, South Sudan, Seychelles, and Mauritius). Among these 14 countries, one has restricted DHS data (Eritrea), and one (Sudan) has an old data set from 1989-1990. Thus, this study included the recent standard DHS data of 12 countries (Burundi, Ethiopia, Comoros, Uganda, Rwanda, Tanzania, Mozambique, Madagascar, Zimbabwe, Kenya, Zambia, and Malawi), collected between 2011 and 2021, to make the sample more representative of East Africa.

Data source, study population and sampling technique

The study was a secondary data analysis based on Demographic and Health Surveys (DHS). Each country's survey includes different datasets, such as those for men, women, children, births, individuals, and households; for this study, we used the Individual Record (IR) file. DHS used the Population and Housing Census (PHC) as a sampling frame for a two-stage stratified cluster sampling technique. In the first stage, Enumeration Areas (EAs) were selected independently in each sampling stratum with probability proportional to EA size. In the second stage, households were selected systematically. The full DHS reports describe the sampling procedure in detail (22, 23). We extracted 75,047 reproductive-age women for the study. However, after data management (excluding women who had no ANC visit, missing values, and unknown responses), a total weighted sample of 44,123 respondents was included in the analysis.

Study variables

The outcome variable for this study was home delivery after an ANC visit, which is described as a delivery that takes place at home without the presence of a skilled birth attendant, even though the woman received antenatal care services in a health facility (24). Maternal age, maternal education, marital status, wealth index, media exposure, sex of household head, previous contraceptive use, the timing of ANC visits, number of ANC visits, residence, birth interval, husband education, and health facility problem were used as the independent variables for this study.

Data management and analysis

Before conducting the statistical analysis, the data were weighted using the primary sampling unit, sampling weight, and strata to restore the survey's representativeness and account for the sampling design in the statistical estimates. Sampling statisticians determine the number of samples required in each stratum to obtain accurate estimates; in the DHS, certain areas were oversampled while others were under-sampled. Therefore, the distribution of reproductive-age women in the sample was weighted (mathematically adjusted) using the sampling weight (v005), primary sampling unit (v021), and strata (v022) so that it resembles the true distribution in East Africa and yields statistics that are representative of the region.
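
The weighting step can be illustrated with a minimal sketch. It relies on the DHS convention that v005 is stored as the sampling weight multiplied by 1,000,000; the toy records and the helper name `weighted_proportion` are hypothetical, and the sketch covers only a weighted point estimate, not the design-based variance that would also use v021 and v022.

```python
import numpy as np


def weighted_proportion(outcome, v005):
    """Weighted share of outcome == 1 using DHS sampling weights.

    v005 is rescaled by 1,000,000 before use, following the DHS convention.
    """
    outcome = np.asarray(outcome, dtype=float)
    weights = np.asarray(v005, dtype=float) / 1_000_000
    return float(np.sum(weights * outcome) / np.sum(weights))


# Hypothetical toy records: 1 = home delivery after ANC, 0 = otherwise.
home_delivery = [1, 0, 0, 1]
v005 = [500_000, 1_500_000, 1_000_000, 1_000_000]

unweighted = float(np.mean(home_delivery))
weighted = weighted_proportion(home_delivery, v005)
print(f"unweighted={unweighted:.3f}, weighted={weighted:.3f}")
```

In this toy case the under-weighted first record pulls the weighted estimate below the raw proportion, which is the kind of adjustment the weighting is meant to make.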

A total of 75,047 unweighted records with the selected variables were extracted from the DHS using STATA software version 17 and exported to a CSV file. The data were then imported into a Jupyter Notebook running Python version 3.11 for further analysis. To make the data suitable for machine learning tasks, exploratory data analysis, missing value management, data discretization, outlier detection, balancing of the target feature, and feature selection were applied as preprocessing tasks. Finally, the data were split into training (80%) and test (20%) sets, and only the variables that passed feature selection were used to fit the selected model.
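
An 80/20 split of this kind might look as follows with scikit-learn. This is a sketch on synthetic data: the stratification on the outcome and the random seed are assumptions, since the text does not state how the split was drawn.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed predictors and binary outcome.
rng = np.random.default_rng(0)
n_samples = 1000
X = rng.integers(0, 2, size=(n_samples, 5))
y = (rng.random(n_samples) < 0.3).astype(int)

# 80% training, 20% test; stratify=y keeps the class ratio similar in both
# parts (an assumption, not something the study reports).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)
```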

In this study, nine supervised machine learning algorithms, namely Random Forest (RF), AdaBoost, Gaussian Naïve Bayes, Multi-Layer Perceptron (MLP), Decision Tree, Logistic Regression (LR), K-Nearest Neighbors (KNN), Extreme Gradient Boosting (XGBoost), and Support Vector Machine (SVM) (25-28), were used to predict determinants of home delivery after ANC visit among reproductive-age women in East Africa. A tenfold cross-validation method was used for training the models on the training data. Model performance was measured using metrics such as the confusion matrix and the area under the receiver operating characteristic curve (AUC). Finally, home delivery was predicted after hyperparameter tuning of the best-performing model. All analyses were performed using the Python version 3.11 programming language in Jupyter Notebook with the imblearn (29), sklearn (30), and SHAP (31) packages.
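
The comparison procedure described here can be sketched with scikit-learn's stratified tenfold cross-validation. The example uses synthetic data and only three of the nine classifiers to stay short; the model settings and seeds are illustrative assumptions, not the study's configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB

# Synthetic, moderately imbalanced data standing in for the training set.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.7, 0.3], random_state=0,
)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "GNB": GaussianNB(),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # Each model is scored on accuracy and ROC AUC over the same ten folds.
    scores = cross_validate(model, X, y, cv=cv, scoring=("accuracy", "roc_auc"))
    results[name] = {
        "accuracy": float(np.mean(scores["test_accuracy"])),
        "auc": float(np.mean(scores["test_roc_auc"])),
    }

for name, metrics in results.items():
    print(name, metrics)
```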

The relationship between the predictors and the outcome variable was evaluated using the SHAP feature importance method, which also helped identify the independent variables that are most crucial for predicting home delivery after an ANC visit. SHapley Additive exPlanations (SHAP) analysis employs a game theory framework to provide a global or local interpretation and explanation of any machine learning model's prediction (27). Since tree-based ensemble models are typically "black-box" systems, it is uncommon to find interpretations and explanations of high-performing models in machine learning research (27). SHAP has been used as a feature selection mechanism by several researchers, and their results show that machine learning using the SHAP value feature selection method performs better in terms of classification with model explainability (27, 32). Plotting the Shapley value of each feature for each sample also helps show how each predictor affects the prediction of home delivery, and clarifies whether a given characteristic makes a woman more or less likely to give birth at home after an ANC visit.
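
The game-theoretic idea can be made concrete with a brute-force computation of Shapley values for a tiny model. This is a sketch for intuition only: the three binary features, the linear `toy_model`, and the background set are hypothetical, and dedicated SHAP explainers rely on much faster algorithms than this enumeration of all feature subsets.

```python
from itertools import combinations, product
from math import factorial

import numpy as np


def exact_shapley(f, x, background):
    """Exact Shapley values for one sample x.

    Features outside a coalition are filled in from the background data,
    so value(S) is the mean model output with the features in S fixed to x.
    """
    x = np.asarray(x, dtype=float)
    background = np.asarray(background, dtype=float)
    m = len(x)

    def value(subset):
        data = background.copy()
        for j in subset:
            data[:, j] = x[j]
        return float(np.mean(f(data)))

    phi = np.zeros(m)
    for i in range(m):
        others = [j for j in range(m) if j != i]
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi, value(())


def toy_model(data):
    # Hypothetical score; columns are [rural, rich, more_than_four_anc].
    return 0.3 + 0.3 * data[:, 0] - 0.2 * data[:, 1] - 0.1 * data[:, 2]


background = np.array(list(product([0, 1], repeat=3)), dtype=float)
x = np.array([1.0, 0.0, 0.0])  # rural, not rich, four or fewer ANC visits

phi, base = exact_shapley(toy_model, x, background)
prediction = float(toy_model(x[None, :])[0])
print("base:", base, "phi:", phi, "prediction:", prediction)
```

The base value plus the sum of the contributions reproduces the model output for the sample, which is the additivity property that the SHAP plots depend on.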

Additionally, the contribution of each feature to the prediction of the positive class (home delivery) was explained using a waterfall plot (27). The waterfall plot's y-axis shows the independent variables and their corresponding feature values for the sample, while the x-axis shows the likelihood that the sample will be classified as belonging to the "home delivery" class. In the waterfall plot, a horizontal bar represents the contribution of each feature. Positive contributions (red bars) indicate that the feature increases the probability that the sample belongs to the positive class, while negative contributions (blue bars) indicate a decline in that probability. Finally, the overall methodology workflow is shown in Figure 1.

Ethical consideration

This study was a secondary data analysis. A permission letter for data access was obtained from the Demographic and Health Surveys (DHS) Program through an online request at http://www.dhsprogram.com. The study's data were publicly available and devoid of any personally identifiable information. Respondents, families, or sample communities cannot be identified in any way according to the IRB-approved processes for DHS public-use datasets. The data files do not contain names of people or addresses of households. The geographic IDs only descend to the regional level (where regions are frequently very large geographical areas covering several states or provinces).

Socio-demographic and economic characteristics of the study participants

A total of 44,123 reproductive-age women who gave birth during the study period were included. The mean age of the participants was 28.41 years (SD ± 1.02). About three-fourths of the participants, 33,048 (74.9%), were rural residents, and 7,889 (17.88%) were from Malawi. A total of 16,693 (37.83%) participants had attained primary education, and 19,147 (43.40%) were in the poor wealth category. In addition, 28,779 (65.22%) of the women were married, about two-thirds, 29,472 (66.80%), had media exposure, and 31,713 (71.87%) had 2-4 ANC visits. Finally, 26,268 (59.53%) of the participants reported that distance to the health facility was not a big problem (Table 1).

Table 1: Sociodemographic characteristics of reproductive age women in East Africa, DHS 2011-2021

Variable	Category	Frequency	Percentage
Residence	Urban	11,075	25.1
Residence	Rural	33,048	74.9
Country	Burundi	2,764	6.26
	Ethiopia	3,267	7.40
	Kenya	4,995	11.3
	Comoros	1,052	2.38
	Madagascar	2,753	6.24
	Malawi	7,889	17.88
	Mozambique	1,548	3.51
	Rwanda	4,014	9.10
	Tanzania	3,002	6.80
	Uganda	6,788	15.38
	Zambia	4,394	9.96
	Zimbabwe	1,657	3.76
Maternal age in the year	15-24	7,764	17.60
	25-34	23,365	52.95
	35-49	12,994	29.45
Maternal education	Unable to read and write	9,492	21.51
	Primary	16,693	37.83
	Secondary	14,587	33.06
	Higher	3,351	7.60
Marital status	Single	11,778	26.70
	Married	28,779	65.22
	Widowed	1,378	3.12
	Divorced	2,188	4.96
Wealth index	Poor	19,147	43.40
	Middle	8,410	19.06
	Rich	16,566	37.54
Media exposure	Yes	29,472	66.80
Media exposure	No	14,651	33.20
Sex of household head	Male	36,042	81.69
	Female	8,081	18.31
Previous contraceptive use	No	22,523	51.05
Previous contraceptive use	Yes	21,600	48.95
Timing of ANC visit	First trimester	14,174	32.12
	Second trimester	26,340	59.70
	Third trimester	3,609	8.18
Number of ANC visit	One	1,383	3.13
	2-4	31,713	71.87
	Above 4	11,027	25.0
Birth interval	Short	18,286	41.44
Birth interval	Long	25,837	58.56
Husband education	Unable to read and write	7,939	18.00
	Primary education	22,284	50.50
	Secondary education	11,005	24.94
	Higher education	2,895	6.56
Distance to a health facility problem	Big problem	17,855	40.47
Distance to a health facility problem	Not a big problem	26,268	59.53

Machine learning analysis of home delivery after ANC visit

Balancing

To balance the skewed distribution of the outcome variable (home delivery after ANC visit), SMOTE oversampling created 13,590 synthetic observations for the minority class (Yes category). As a result, to obtain equal class sizes and build reliable prediction models, the distribution of home delivery following ANC visits was changed from 12,060 home deliveries and 25,650 non-home deliveries to 25,650 in each class (Figure 2).
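
The interpolation idea behind SMOTE can be sketched in a few lines of NumPy. This is a simplified illustration on toy data, not the imblearn implementation used in the study; the function name `smote_sketch`, the neighbour count, and the seed are assumptions. The class counts are the ones reported above.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (simplified SMOTE)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise squared distances among minority samples.
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # position along the connecting segment
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Number of synthetic samples needed to equalise the reported class sizes.
n_home, n_no_home = 12_060, 25_650
needed = n_no_home - n_home

# Toy minority data; the real feature matrix is not reproduced here.
X_minority = np.random.default_rng(1).normal(size=(20, 3))
synthetic = smote_sketch(X_minority, n_new=30)
print(needed, synthetic.shape)
```

Because each synthetic point lies on a segment between two real minority samples, dummy-coded categorical predictors can take fractional values under plain interpolation; how the study handled this is not stated.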

Model performance comparison

When the classifiers were compared using stratified tenfold cross-validation on imbalanced training data, the support vector machine had the highest accuracy (74.82%), with a 66.17% area under the receiver operating characteristic (ROC) curve. However, this result was deceptive because the outcome variable was unbalanced. After the training data were balanced using the SMOTE oversampling technique, random forest was the best predictive model, with an accuracy of 70.60% and a 77.88% area under the ROC curve (Table 2).

Table 2: Model comparison through cross-validation of training data

Machine learning model	Metric	Unbalanced (%)	Balanced (%)
Random Forest (RF)	Accuracy	72.11	70.60
Random Forest (RF)	AUC	68.06	77.88
Support Vector Machine (SVM)	Accuracy	74.82	68.57
Support Vector Machine (SVM)	AUC	66.17	74.41
Logistic Regression (LR)	Accuracy	74.81	67.36
Logistic Regression (LR)	AUC	73.10	74.18
K-Nearest Neighbors (KNN)	Accuracy	71.00	67.67
K-Nearest Neighbors (KNN)	AUC	65.38	72.87
AdaBoost (AdB)	Accuracy	74.72	67.41
AdaBoost (AdB)	AUC	73.08	74.17
Gaussian Naïve Bayes (GNB)	Accuracy	66.13	64.70
Gaussian Naïve Bayes (GNB)	AUC	69.99	70.95
Multi-Layer Perceptron (MLP)	Accuracy	74.34	68.82
Multi-Layer Perceptron (MLP)	AUC	72.08	75.60
Decision Tree (DT)	Accuracy	74.32	65.69
Decision Tree (DT)	AUC	70.62	71.98
Extreme Gradient Boosting (XGBoost)	Accuracy	74.34	69.35
Extreme Gradient Boosting (XGBoost)	AUC	71.92	76.20
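
Why accuracy alone can mislead on unbalanced data is easy to show with a small sketch. The labels below are hypothetical (75 negatives, 25 positives), and the AUC helper is a plain pairwise implementation written for this illustration rather than a library call.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


def auc_from_scores(y_true, scores):
    """AUC as the probability that a random positive is scored above a
    random negative, with ties counted as one half."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (len(pos) * len(neg)))


# Hypothetical labels: 75 "no home delivery" (0) and 25 "home delivery" (1).
y_true = np.array([0] * 75 + [1] * 25)
always_no = np.zeros(100)  # a classifier that always predicts the majority

acc = accuracy(y_true, always_no)
auc = auc_from_scores(y_true, always_no)
print(f"accuracy={acc:.2f}, AUC={auc:.2f}")
```

A model that never predicts home delivery still reaches 75% accuracy here while its AUC stays at chance level, which is why the balanced comparison in Table 2 leans on AUC.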

After the best model (RF) was selected, home delivery after ANC visit was predicted on previously unseen test data. Predictions were made with random forest models trained on unbalanced training data and on balanced data with default parameters, and these were compared with an optimized model trained on balanced data. The random forest models trained on unbalanced and balanced data achieved area under the curve scores of 0.69 and 0.68 on the unseen test data, respectively. Similarly, the hyperparameter-tuned random forest achieved an AUC of 0.68 (Figure 3).

Hyperparameter tuning of Random Forest

Although scikit-learn provides a set of sensible default hyperparameters for all models, they are not guaranteed to be optimal for a given problem. Therefore, to maximize the performance of random forest, the following hyperparameters were optimized with one hundred trials over a given search space using stratified 10-fold cross-validation (Table 3): the number of decision trees in the forest (n_estimators), the number of features considered when splitting a node (max_features), the minimum number of samples required to split an internal node (min_samples_split), the minimum number of samples required to be at a leaf node (min_samples_leaf), and the number of samples drawn from the independent variables to train each tree (max_samples). Finally, a random forest model was built with these tuned hyperparameters on balanced training data through 10-fold cross-validation and yielded 79% accuracy and a 0.78 area under the curve.

Table 3: Default and optimally tuned hyperparameters of the Random Forest model

Hyperparameter	Default	Optimal Value
Number of trees	100	250
Number of features considered for the best split	The square root of the number of features	0.21
The minimum number of samples required to split an internal node	2	2
The minimum number of samples required to be at a leaf node	1	1
Number of samples to draw from X to train each base estimator	None	0.96
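
A search over these five hyperparameters could be sketched with scikit-learn's RandomizedSearchCV. The text does not name the tuning tool, so the randomized search, the ranges in `search_space`, and the synthetic data are assumptions; the trial count and fold count are reduced here so the sketch runs quickly. The `tuned_rf` object simply plugs in the optimal values from Table 3.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold

# Synthetic stand-in for the balanced training data.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical search space over the hyperparameters named in the text.
search_space = {
    "n_estimators": randint(50, 301),
    "max_features": uniform(0.1, 0.9),   # fraction of features, 0.1 to 1.0
    "min_samples_split": randint(2, 11),
    "min_samples_leaf": randint(1, 6),
    "max_samples": uniform(0.5, 0.5),    # fraction of rows, 0.5 to 1.0
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    search_space,
    n_iter=5,  # the study reports one hundred trials
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),  # study: 10 folds
    random_state=0,
)
search.fit(X, y)
best_params = search.best_params_
best_auc = float(search.best_score_)
print(best_params, round(best_auc, 3))

# The optimal values reported in Table 3, expressed as scikit-learn arguments.
tuned_rf = RandomForestClassifier(
    n_estimators=250,
    max_features=0.21,
    min_samples_split=2,
    min_samples_leaf=1,
    max_samples=0.96,
    random_state=0,
)
```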

Important feature selection using Random Forest

This study used model-agnostic SHAP global feature importance to select the top predictors of home delivery after ANC visit. This technique computes the mean absolute SHAP value of each predictor across all of the data, which quantifies the feature’s contribution to the predicted home delivery. The SHAP global importance scores identified the ten most important factors for predicting home delivery after ANC visit using the optimized random forest model with test data. The predictors are sorted in descending order of their impact on the prediction, and features with higher mean absolute SHAP values are more influential. Rich household income (wealth_status_2), contraceptive use (contraceptive_use_1), short birth interval (birth_interval_1), secondary educational level of husband (edu_status_husband_2), and rural residence (residence_2) were the top five predictors of women’s home delivery after ANC visit. Furthermore, second trimester of ANC visit (timing_ANC_1), problems of distance to health facility (distance_HF_2), primary educational level of the husband (edu_status_husband_1), media exposure (media_exposure_1), and middle household income (wealth_status_1) were also important predictors. As presented in the figure, the red and blue colors each occupy half of the horizontal bar for every feature. This means that each feature has an equal impact on the classification of both home delivery after ANC visit (label = yes) and no home delivery after ANC visit (label = no) cases (Figure 4).
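
The global importance score described here is the column-wise mean of absolute SHAP values. The sketch below uses three of the study's feature names, but the SHAP matrix itself is hypothetical and chosen only to show the ranking step.

```python
import numpy as np

feature_names = ["wealth_status_2", "contraceptive_use_1", "residence_2"]

# Hypothetical SHAP values: one row per woman, one column per feature.
shap_matrix = np.array([
    [-0.10, -0.05, 0.04],
    [0.12, 0.06, -0.03],
    [-0.08, -0.04, 0.05],
    [0.10, 0.05, -0.04],
])

# Mean absolute SHAP value per feature, then sort from most to least influential.
importance = np.abs(shap_matrix).mean(axis=0)
order = np.argsort(importance)[::-1]
ranking = [feature_names[i] for i in order]
print(dict(zip(ranking, importance[order].round(3))))
```

Taking absolute values discards the sign of each contribution, so this ranking says how much a feature matters but not in which direction; the direction is read from the beeswarm plot.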

Model interpretation/explanation

The beeswarm plot shows the Shapley values of the features related to home delivery status after ANC visit, giving insight into the importance and direction of association of each of the top ten features with the outcome variable. Points to the right of the vertical line (SHAP value of 0) increase the likelihood of home delivery after the ANC visit, while points to the left decrease it. Red points represent the category coded as 1 (high value) and blue points represent the category coded as 0 (low value). Accordingly, being a rural resident (residence_2) and having the ANC visit in the second trimester (timing_ANC_1) increased the likelihood of home delivery after ANC visit, whereas rich household income (wealth_status_2), secondary educational level of husband (edu_status_husband_2), contraceptive use (contraceptive_use_1), short birth interval (birth_interval_1), primary educational level of husband (edu_status_husband_1), having no problems of distance to health facility (distance_HF_2), and having more than four ANC visits (ANC_visit_3) decreased women’s home delivery after ANC visit in East Africa (Figure 5).

Waterfall plot

The waterfall plot begins with the expected value of the model output on the x-axis (E[f(X)] = 0.5), which represents the initial prediction for the given sample before any feature contributions are considered; it is typically the average prediction for the dataset. For a given observation, a model output above this value corresponds to the positive class (home delivery), whereas an output below this value corresponds to the negative class (no home delivery). Figure 6 shows that, for the first observation, the combination of the positive contributions (in red) and the negative contributions (in blue) moves the expected value to the final model output (f(x) = 0.86), which is classified as the positive class (home delivery after ANC visit).

Accordingly, for this particular woman, not being from a rural area (0=residence_2), secondary educational level of husband (1=edu_status_husband_2), rich household income (1=wealth_status_2), having more than four ANC visits (1=ANC_visit_3), contraceptive use (1=contraceptive_use_1), being aged 35-49 (1=age_2), and having no problems of distance to health facility (1=distance_HF_2) drive up the probability of home delivery after ANC visit, whereas not having a short birth interval (0=birth_interval_1) and not being aged 25-34 (0=age_1) drive down that probability (Figure 6).
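
How a waterfall plot moves from the expected value to the final output can be traced with a running sum. In this sketch the base value 0.50 and the final output 0.86 come from the text, while the individual contributions are hypothetical numbers invented only so that they add up to the 0.36 gap; they are not the values shown in Figure 6.

```python
base_value = 0.50  # E[f(X)] reported in the text

# Hypothetical SHAP contributions (not the study's values); they sum to 0.36.
contributions = {
    "residence_2 = 0": 0.10,
    "edu_status_husband_2 = 1": 0.09,
    "wealth_status_2 = 1": 0.08,
    "ANC_visit_3 = 1": 0.07,
    "contraceptive_use_1 = 1": 0.06,
    "birth_interval_1 = 0": -0.04,
}


def waterfall_path(base, contribs):
    """Accumulate contributions from smallest to largest magnitude,
    returning (feature, contribution, running output) for each step."""
    running = base
    path = []
    for name, phi in sorted(contribs.items(), key=lambda item: abs(item[1])):
        running += phi
        path.append((name, phi, round(running, 4)))
    return path


path = waterfall_path(base_value, contributions)
final_output = path[-1][2]
for name, phi, running in path:
    print(f"{name:28s} {phi:+.2f} -> {running:.2f}")
```

Positive steps correspond to the red bars and negative steps to the blue bars, and the last running value equals the model output for that woman.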

The purpose of this study was to evaluate the use of machine learning algorithms to identify important determinants of home delivery after ANC visit in East Africa. The Support Vector Machine (SVM) classifier had the highest accuracy among the classifiers in the initial phase of predictive modeling on unbalanced training data. Random Forest (RF) outperformed the other classifiers in the second phase of model prediction on balanced training data. The Random Forest prediction model proved to be the most effective when fitted to test data, and additional analysis was carried out once its hyperparameters were optimized.

The SHAP analysis based on the Random Forest (RF) model showed that rich household income, contraceptive use, short birth interval, secondary educational level of husband, rural residence, second trimester of ANC visit, problems of distance to health facility, primary educational level of husband, media exposure, and middle household income were important predictors of home delivery after ANC visit in East Africa.

Women who lived in rural areas had an increased likelihood of home delivery after ANC visits. This finding was in line with previous studies (6, 33-35). A possible explanation is that women who reside in rural areas are often in remote locations far from healthcare facilities. East Africa's target aims to establish more primary healthcare facilities by utilizing health extension workers at the health post level. However, East Africa's topography and infrastructure make it difficult for an ambulance to reach these areas, even when services are available in the community. Because of this, women living in rural areas experienced delays in receiving care, arising both from the decision to seek delivery care and from the inability to reach the health institution on time (6).

Rich household income decreased women’s home delivery after ANC visits. This implies that home delivery after ANC visits was higher among women with poor wealth status than among those with rich wealth status. This finding was supported by previous studies (4-6, 10). This could be because facility delivery involves transport and food-related expenses that a mother with a low income may not be able to afford (36). Moreover, it may be because the majority of women from rich households in East Africa reside in urban areas. Consequently, these women can more easily access transportation and health facilities, which could result in fewer home deliveries (37).

Women who had their ANC visit in the second trimester had an increased likelihood of home delivery after the ANC visit. This could be attributed to the level of service provided during the ANC visit. Despite improvements over the past ten years, there were still severe shortages of medical professionals as well as irregular drug and equipment supplies (6). This could discourage women from delivering in health facilities. This finding contradicts a previous study in Ethiopia, in which women who had their first ANC visit late, during the third trimester, were less likely to deliver at home than those who had their first ANC visit during the first trimester of pregnancy (38).

Primary and secondary educational levels of the husband decreased women’s home delivery after ANC visit. This finding is supported by previous studies (39-41). A possible reason is that education is an important means of improving the use of health care services. Without widespread access to education, there would be a significant knowledge gap between those with and without education about the challenges of childbirth and labor, as well as the dangers associated with unattended deliveries (42). Moreover, education increases understanding of the benefits of preventative healthcare, allowing couples to decide where to give birth and to discuss it with others. Accordingly, women whose husbands had received formal education were less likely than their counterparts to give birth at home (43).

Contraceptive use decreased women’s home delivery after ANC visit, which is in line with previous studies. This implies that women who have used contraceptives may change their behavior in favor of health facility delivery. In addition, contraceptive use offers mothers the chance to receive more counseling from medical professionals, which can significantly reduce the number of home births (6, 42, 44). Short birth intervals also decreased women’s home delivery after ANC visits, which implies that uptake of health facility delivery was higher among women who had short birth intervals. A possible explanation is that, during ANC visits, women become more aware of the potential risks associated with short birth intervals, and providers also assist mothers in creating an effective birth plan, which may reduce the likelihood that they will give birth at home (45, 46).

Having no problems with distance to a health facility decreased women’s home delivery after ANC visits. A possible reason is that the closer the health facility, the easier it is for a woman to reach it, even if she has to walk. When a pregnant woman lives close to a health institution, she may be more likely to walk there even in the absence of other transportation options. Walking longer distances during labor is challenging, and it gets worse if labor begins at night. This is why home births were chosen by the majority of respondents (42).

Having more than four ANC visits decreased women’s home delivery after ANC visits. Accordingly, the number of ANC visits was also significantly associated with home delivery after ANC visits. This finding was in line with previous studies (42, 44, 47). The reason could be that more ANC visits with medical experts increase the likelihood of receiving advice regarding place of delivery and birth preparedness (48). Furthermore, mothers who had more ANC visits had the opportunity to learn the value of professional birth attendants and to deliver in institutions, whereas women who had fewer than four ANC visits had less contact with skilled healthcare personnel (36).

Limitations and strengths of the study

This study's primary strength was the use of a large sample and nationally representative data. Another key point was the use of a sophisticated and suitable statistical method (machine learning), which revealed previously undiscovered relationships and patterns in the field. Moreover, to reduce the interpretation limits that result from the black-box nature of machine learning, the researchers conducted additional analyses to ascertain how predictors increased or decreased home delivery after ANC visits. To determine the relative significance of each predictor and understand how each component contributed to the model's predictions, the researchers employed a variety of methodologies, including SHAP.

The limitation of this study was that the DHS survey depends on respondents' self-reports, which could be prone to recall bias because respondents were asked to recall prior events. Additionally, because of the cross-sectional study design, causality could not be determined; the study only revealed associations between the factors and home birth after the ANC visit.

Conclusion

Home delivery after ANC visit was highest in Malawi, Uganda, and Kenya. The random forest (RF) model provided better predictive power than the other models used in this study to predict determinants of home delivery after ANC visits in East Africa. The beeswarm plot of the SHAP analysis based on the RF model showed that being a rural resident and having an ANC visit in the second trimester increased the likelihood of home delivery after an ANC visit. In contrast, rich household income, secondary educational level of husband, contraceptive use, short birth interval, primary educational level of husband, having no problem with distance to a health facility, and having more than four ANC visits decreased home delivery after ANC visits in East Africa.

As a result, this study recommends that women begin antenatal care early and attend often throughout their pregnancies, so that they receive high-quality ANC from a qualified practitioner. Another implication of this study is that education and girls' empowerment are very important components of programs aimed at reducing maternal and infant mortality by improving the utilization of health institution services, especially among rural residents. In addition, developing health facilities, promoting health education through the media, and encouraging women to obtain adequate information on health care services are recommended, especially in Malawi, Uganda, and Kenya. Moreover, healthcare policy should give great consideration to women from low-income households.

Abbreviations

ANC: Antenatal care

AUC: Area Under Curve

DHS: Demographic and Health Survey

LR: Logistic regression

ML: Machine learning

RF: Random Forest

ROC: Receiver operating characteristic

SHAP: SHapley Additive exPlanations

SMOTE: Synthetic Minority Oversampling Technique

WHO: World Health Organization

Declarations

Ethics approval and consent to participate

All methods were carried out in accordance with the relevant guidelines of the Demographic and Health Surveys (DHS) program. Informed consent was waived by the International Review Board of the DHS program data archivists after the consent paper was submitted to the DHS program, and a letter of permission to download the dataset for this study was issued. The dataset was not shared or passed on to other bodies, and its confidentiality has been maintained.

Consent for publication

Not applicable

Availability of data and materials

All relevant data are in the manuscript. However, the minimal data underlying all the findings in the manuscript will be available upon request. DHS data (2011-2021) were used, which are available in the public domain through the Measure DHS website (www.measuredhs.com).

Competing Interests

There are no competing interests.

Funding

No funding was obtained for this study.

Author Contributions

ADW made a significant contribution to the conceptualization, study selection, data curation, formal analysis, funding acquisition, investigation, methodology, and original draft preparation. Project administration, resources, software, supervision, validation, visualization, and reviewing were handled by SDK, JBA, and ZAG. ADW, SDK, and DNM wrote the final draft of the manuscript, which was read, edited, and approved by all authors.

Acknowledgments

We are grateful to the MEASURE DHS program for granting permission and data access authorization, which enabled us to conduct the study.

References

1. Ganle JK, Mahama MS, Maya E, Manu A, Torpey K, Adanu R. Understanding factors influencing home delivery in the context of user‐fee abolition in Northern Ghana: Evidence from 2014 DHS. The International Journal of Health Planning and Management. 2019;34(2):727-43.
2. Smith SL, Shiffman J. Setting the global health agenda: the influence of advocates and ideas on political priority for maternal and newborn survival. Social Science & Medicine. 2016;166:86-93.
3. Nigatu AM, Gelaye KA, Degefie DT, Birhanu AY. Spatial variations of women’s home delivery after antenatal care visits at Lay Gayint District, Northwest Ethiopia. BMC Public Health. 2019;19(1):1-14.
4. Devkota B, Maskey J, Pandey AR, Karki D, Godwin P, Gartoulla P, et al. Determinants of home delivery in Nepal–A disaggregated analysis of marginalised and non-marginalised women from the 2016 Nepal Demographic and Health Survey. PLoS One. 2020;15(1):e0228440.
5. Chernet AG, Dumga KT, Cherie KT. Home delivery practices and associated factors in Ethiopia. Journal of Reproduction & Infertility. 2019;20(2):102.
6. Tessema ZT, Tiruneh SA. Spatio-temporal distribution and associated factors of home delivery in Ethiopia. Further multilevel and spatial analysis of Ethiopian demographic and health surveys 2005–2016. BMC Pregnancy and Childbirth. 2020;20:1-16.
7. Tiruneh SA, Lakew AM, Yigizaw ST, Sisay MM, Tessema ZT. Trends and determinants of home delivery in Ethiopia: further multivariate decomposition analysis of 2005–2016 Ethiopian Demographic Health Surveys. BMJ Open. 2020;10(9):e034786.
8. Yebyo H, Alemayehu M, Kahsay A. Why do women deliver at home? Multilevel modeling of Ethiopian National Demographic and Health Survey data. PLoS One. 2015;10(4):e0124718.
9. Kasaye HK, Endale ZM, Gudayu TW, Desta MS. Home delivery among antenatal care booked women in their last pregnancy and associated factors: community-based cross sectional study in Debremarkos town, North West Ethiopia, January 2016. BMC Pregnancy and Childbirth. 2017;17(1):1-12.
10. Mrisho M, Schellenberg JA, Mushi AK, Obrist B, Mshinda H, Tanner M, et al. Factors affecting home delivery in rural Tanzania. Tropical Medicine & International Health. 2007;12(7):862-72.
11. Abdella M, Abraha A, Gebre A, Reddy PS. Magnitude and associated factors for home delivery among women who gave birth in last 12 months in Ayssaita, Afar, Ethiopia-2016. A community based cross sectional study. Glob J Fertil Res. 2017;2(1):030-9.
12. Ramezani Siakhulake F, Tabatabaei SM, Mohammadi M, Behmanesh Pour F. Home Delivery Practices among Pregnant Women in Southeast of Iran and Associated Factors after the Implementation of the Health Transformation Plan: a Case-control Study. Women’s Health Bulletin. 2021;8(3):142-51.
13. Abubakar S, Adamu D, Hamza R, Galadima JB. Determinants of home delivery among women attending antenatal care in Bagwai town, Kano Nigeria. African Journal of Reproductive Health. 2017;21(4):73-9.
14. Scott NA, Henry EG, Kaiser JL, Mataka K, Rockers PC, Fong RM, et al. Factors affecting home delivery among women living in remote areas of rural Zambia: a cross-sectional, mixed-methods analysis. International Journal of Women's Health. 2018;10:589.
15. Idris S, Gwarzo U, Shehu A. Determinants of place of delivery among women in a semi-urban settlement in Zaria, northern Nigeria. Annals of African Medicine. 2006;5(2):68-72.
16. Central Statistical Agency. Ethiopia mini demographic and health survey 2014. 2014.
17. Simfukwe ME. Factors contributing to home delivery in Kongwa District, Dodoma-September 2008. Dar Es Salaam Medical Students' Journal. 2011;18(1):13-22.
18. Abebe F, Berhane Y, Girma B. Factors associated with home delivery in Bahirdar, Ethiopia: a case control study. BMC Research Notes. 2012;5(1):1-6.
19. Ogolla JO. Factors associated with home delivery in West Pokot County of Kenya. Advances in Public Health. 2015;2015.
20. Bang RA, Bang AT, Reddy MH, Deshmukh MD, Baitule SB, Filippi V. Maternal morbidity during labour and the puerperium in rural homes and the need for medical attention: a prospective observational study in Gadchiroli, India. BJOG: An International Journal of Obstetrics & Gynaecology. 2004;111(3):231-8.
21. Raru TB, Ayana GM, Zakaria HF, Merga BT. Association of higher educational attainment on antenatal care utilization among pregnant women in East Africa using Demographic and Health Surveys (DHS) from 2010 to 2018: a multilevel analysis. International Journal of Women's Health. 2022:67-77.
22. Tesema GA, Teshale AB, Tessema ZT. Incidence and predictors of under-five mortality in East Africa using multilevel Weibull regression modeling. Archives of Public Health. 2021;79:1-13.
23. Rutstein SO, Rojas G. Guide to DHS statistics. Calverton, MD: ORC Macro. 2006;38:78.
24. Muluneh AG, Animut Y, Ayele TA. Spatial clustering and determinants of home birth after at least one antenatal care visit in Ethiopia: Ethiopian demographic and health survey 2016 perspective. BMC Pregnancy and Childbirth. 2020;20(1):97.
25. Kebede SD, Sebastian Y, Yeneneh A, Chanie AF, Melaku MS, Walle AD. Prediction of contraceptive discontinuation among reproductive-age women in Ethiopia using Ethiopian Demographic and Health Survey 2016 Dataset: A Machine Learning Approach. BMC Medical Informatics and Decision Making. 2023;23(1):1-17.
26. Mamo DN, Yilma TM, Fekadie M, Sebastian Y, Bizuayehu T, Melaku MS, et al. Machine learning to predict virological failure among HIV patients on antiretroviral therapy in the University of Gondar Comprehensive and Specialized Hospital, in Amhara Region, Ethiopia, 2022. BMC Medical Informatics and Decision Making. 2023;23(1):75.
27. Kebede SD, Mamo DN, Adem JB, Semagn BE, Walle AD. Machine learning modeling for identifying predictors of unmet need for family planning among married/in-union women in Ethiopia: Evidence from performance monitoring and accountability (PMA) survey 2019 dataset. PLOS Digital Health. 2023;2(10):e0000345.
28. Demsash AW, Chereka AA, Walle AD, Kassie SY, Bekele F, Bekana T. Machine learning algorithms’ application to predict childhood vaccination among children aged 12–23 months in Ethiopia: Evidence 2016 Ethiopian Demographic and Health Survey dataset. PLoS One. 2023;18(10):e0288867.
29. Chen T, Guestrin C. XGBoost: A scalable tree boosting system. 2016. p. 785-94.
30. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011;12:2825-30.
31. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems. 2017;30.
32. Gebreyesus Y, Dalton D, Nixon S, De Chiara D, Chinnici M. Machine Learning for Data Center Optimizations: Feature Selection Using Shapley Additive exPlanation (SHAP). Future Internet. 2023;15(3):88.
33. Habte F, Demissie M. Magnitude and factors associated with institutional delivery service utilization among childbearing mothers in Cheha district, Gurage zone, SNNPR, Ethiopia: a community based cross sectional study. BMC Pregnancy and Childbirth. 2015;15(1):1-12.
34. Doctor HV, Nkhana-Salimu S, Abdulsalam-Anibilowo M. Health facility delivery in sub-Saharan Africa: successes, challenges, and implications for the 2030 development agenda. BMC Public Health. 2018;18:1-12.
35. Huda TM, Chowdhury M, El Arifeen S, Dibley MJ. Individual and community level factors associated with health facility delivery: A cross sectional multilevel analysis in Bangladesh. PLoS One. 2019;14(2):e0211113.
36. Ayalew HG, Liyew AM, Tessema ZT, Worku MG, Tesema GA, Alamneh TS, et al. Spatial variation and factors associated with home delivery after ANC visit in Ethiopia; spatial and multilevel analysis. PLoS One. 2022;17(8):e0272849.
37. Wagle RR, Sabroe S, Nielsen BB. Socioeconomic and physical distance to the maternity hospital as predictors for place of delivery: an observation study from Nepal. BMC Pregnancy and Childbirth. 2004;4:1-10.
38. Tariku M, Enyew DB, Tusa BS, Weldesenbet AB, Bahiru N. Home delivery among pregnant women with ANC follow-up in Ethiopia; Evidence from the 2019 Ethiopia mini demographic and health survey. Frontiers in Public Health. 2022;10.
39. Nduka I, Nduka E. Determinants of noninstitutional deliveries in an urban community in Nigeria. Journal of Medical Investigations and Practice. 2014;9(3):102.
40. Mengesha ZB, Biks GA, Ayele TA, Tessema GA, Koye DN. Determinants of skilled attendance for delivery in Northwest Ethiopia: a community based nested case control study. BMC Public Health. 2013;13(1):1-6.
41. Feyissa TR, Genemo GA. Determinants of institutional delivery among childbearing age women in Western Ethiopia, 2013: unmatched case control study. PLoS One. 2014;9(5):e97194.
42. Kasaye HK, Endale ZM, Gudayu TW, Desta MS. Home delivery among antenatal care booked women in their last pregnancy and associated factors: community-based cross sectional study in Debremarkos town, North West Ethiopia, January 2016. BMC Pregnancy and Childbirth. 2017;17:1-12.
43. Dickson KS, Adde KS, Amu H. What Influences Where They Give Birth? Determinants of Place of Delivery among Women in Rural Ghana. International Journal of Reproductive Medicine. 2016;2016:7203980.
44. Muluneh AG, Animut Y, Ayele TA. Spatial clustering and determinants of home birth after at least one antenatal care visit in Ethiopia: Ethiopian demographic and health survey 2016 perspective. BMC Pregnancy and Childbirth. 2020;20(1):1-13.
45. Ejigu AG, Yismaw AE, Limenih MA. The effect of sex of last child on short birth interval practice: the case of northern Ethiopian pregnant women. BMC Research Notes. 2019;12:1-6.
46. Berhan Y, Berhan A. Antenatal care as a means of increasing birth in the health facility and reducing maternal mortality: a systematic review. Ethiopian Journal of Health Sciences. 2014;24:93-104.
47. Wodaynew T, Fekecha B, Abdisa B. Magnitude of home delivery and associated factors among antenatal care booked mothers in Delanta District, South Wollo Zone, North East Ethiopia: a cross-sectional study, March 2018. Int J Womens Health Wellness. 2018;4(2):1-11.
48. World Health Organization. Making pregnancy safer: the critical role of the skilled attendant: a joint statement by WHO, ICM and FIGO. World Health Organization; 2004.
