Propensity Score Analysis with Missing Data Using a Multi-Task Neural Network

doi:10.21203/rs.3.rs-2075081/v1

Research Article

This work is licensed under a CC BY 4.0 License

Journal publication: published 15 Feb, 2023 in BMC Medical Research Methodology.

Abstract

Background: Propensity score analysis is increasingly used to control for confounding factors in observational studies. Missing values, which are largely unavoidable in such studies, make propensity scores difficult to estimate. We propose a new method, based on a multi-task neural network (MTNN), for estimating propensity scores in data with missing values.

Materials and Methods: We use both simulated and real-world datasets. The simulated datasets were constructed under two scenarios: the presence (true effect = 1) and the absence (true effect = 0) of a treatment effect. The real-world dataset comes from LaLonde's employment training program. We introduce missing values at varying missing rates under three missing mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). We then compare MTNN with two traditional methods in each scenario. The experiments in each scenario were repeated 1000 times. Our code is publicly available at https://github.com/ljwa2323/MTNN.

Results: Under all three missing mechanisms (MCAR, MAR, and MNAR), the effect estimated by our proposed method has the smallest RMSE relative to the true effect, in both the simulations and the real-world data. The standard deviation of the effect estimated by our method is also generally the smallest. Our method's estimates are most accurate when the missing rate is low.

Conclusions: MTNN performs propensity score estimation and missing value imputation simultaneously through shared hidden layers and joint learning. This addresses the dilemma faced by traditional methods and makes MTNN well suited to estimating the true effect in samples with missing values. We therefore expect it to be widely applicable in real-world observational studies.

Keywords: Observational study; Propensity score analysis; Neural network; Multi-task learning; Causal effect estimation; Inverse probability weighting

1. Introduction

In observational studies, propensity scores are increasingly used to control for confounding^[1, 2]. When the observed baseline characteristics are sufficient to correct for confounding bias and the propensity model is correctly specified, subjects with the same propensity score are conditionally exchangeable^[3, 4]. Observational studies almost inevitably have missing covariate values, and estimating the propensity score in the presence of missing values remains a challenge for causal inference^[5–8]. Common approaches to handling missing values in propensity score analysis include complete-case analysis, adding missing indicator variables to the propensity model, and multiple imputation^[9–11]. Unfortunately, these methods are inherently flawed. For example, the missing indicator method introduces new biases^[12].

Some studies have used machine learning methods in place of traditional logistic regression^[13–17]. However, they do not address the propensity score misestimation caused by overfitting. In contrast to hand-crafted models^[18], neural networks can automatically learn interactions between variables. A multi-task neural network is a network with multiple outputs, and it has been widely used in the medical field. With a multi-task neural network, propensity score computation and missing value imputation can be performed jointly. Optimizing a global objective function reduces overfitting to the propensity score task while also addressing the estimation of missing values^[19].

This study develops a new pipeline, based on a multi-task neural network, for calculating propensity scores in samples with missing values. To evaluate the accuracy of our model in estimating the true effect, we conduct experiments on simulated and real-world data separately and compare our method with traditional methods.

2. Data and Methods

2.1 Propensity Score

In a study, each subject may have multiple covariates. Propensity scoring is a way of simplifying them^[20]: it condenses multiple covariates into a single variable, the propensity score, defined as the conditional probability of being assigned to the experimental group given the covariates^[21]. Because the propensity score is a function of the original covariates, it carries information about them. Rosenbaum and Rubin showed that the propensity score $e\left(X\right)$ can be used to balance the distribution of the covariates X between the experimental and control groups when treatment assignment is strongly ignorable given X^[3].

$$e\left({X}_{i}\right)=\text{Pr}\left({T}_{i}=1\mid {X}_{i}\right)$$

2.2 Propensity score estimation

In complete data, logistic regression is the most commonly used method for estimating propensity scores for a binary treatment or exposure^[22]. The treatment or exposure indicator is regressed on the covariates (i.e., the potential confounders), which can be written as

$$\text{logit}\left({p}_{i}\right)={X}_{i}^{{\prime }}\beta ,\quad i=1,2,\dots ,n$$

where ${p}_{i}=\text{Pr}\left({T}_{i}=1\mid {X}_{i}\right)$, ${X}_{i}^{{\prime }}=\left(1,{X}_{1i},{X}_{2i},\dots ,{X}_{Ki}\right)$, $\beta =\left({\beta }_{0},{\beta }_{1},{\beta }_{2},\dots ,{\beta }_{K}\right){\prime }$, K is the number of covariates, and n is the number of observations. An individual’s propensity score can be estimated as

$${p}_{i}=\frac{{\text{e}}^{{X}_{i}^{{\prime }}\beta }}{1+{\text{e}}^{{X}_{i}^{{\prime }}\beta }}$$
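The two formulas above can be illustrated with a minimal NumPy sketch. It fits $\beta$ by Newton-Raphson and then evaluates ${p}_{i}$ for each subject. The function names, the sample size, and the coefficient values used to generate the toy data are assumptions made for illustration, not part of the study's code.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25):
    """Fit logit(p_i) = X_i' beta by Newton-Raphson.

    X must already contain a leading column of ones for the intercept.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)                           # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])    # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def propensity_scores(X, beta):
    """p_i = exp(X_i' beta) / (1 + exp(X_i' beta))."""
    return 1.0 / (1.0 + np.exp(-X @ beta))

# Toy data (hypothetical): two covariates and known coefficients.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_beta = np.array([-0.8, 0.5, 0.5])
t = rng.binomial(1, propensity_scores(X, true_beta))

beta_hat = fit_logistic(X, t)
ps = propensity_scores(X, beta_hat)
```

With an intercept in the model, the fitted scores average to the observed treated fraction, which is a quick sanity check on any logistic propensity model.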

Logistic regression is not always the best choice for estimating propensity scores. It assumes that the log odds of exposure are linearly related to the covariates, and this assumption does not always hold. Logistic regression cannot estimate propensity scores accurately when covariates interact with each other or when the relationship between covariates and treatment is nonlinear. To address this inherent problem, some studies replace logistic regression with machine learning algorithms such as decision trees, random forests, naive Bayes, and support vector machines^[13–15, 23, 24]. These methods are claimed to provide more accurate propensity score estimates. Nevertheless, these claims have not been validated by systematic simulation studies.

2.3 Missing data

In realistic observational studies, individual covariates may have large amounts of missing data, which can lead to both loss of efficiency and biased estimates. The magnitude of the bias depends on how strongly the confounders are related to the outcome and the exposure.

2.3.1 Types of missing data

Missing data are classified into three types according to the missing mechanism: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR)^[25, 26]. Under MCAR, every value has the same probability of being missing, regardless of any observed or unobserved data. The term MAR is counterintuitive: it does not mean that missingness is purely random, but that the probability of being missing depends only on the observed information. Data are MNAR when the probability of being missing depends on unobserved data, such as the missing value itself.

2.3.2 Methods for handling missing values

Complete case analysis is the simplest way to deal with missing confounder data: it restricts the analysis to cases in which all variables are observed. If missingness of the covariates is independent of treatment and outcome, this approach provides unbiased estimates of group effects.

Another simple method is the missing indicator method^[27]. For a partially observed categorical confounder, a “missing” category is added before the confounder enters the propensity score model. For a continuous confounder, missing values are set to a specific value, such as 0, and both the confounder and a missingness indicator (a variable that flags whether the value is observed) are included in the propensity score model. In many cases, this approach leads to biased results.

Missing pattern analysis is a generalization of the missing indicator method: individuals are grouped by missing pattern, and propensity scores are estimated separately within each group. In practice, this method fails when the number of participants with a given missing pattern is lower than the number of observed covariates, which usually occurs when the data contain many missing patterns.

Multiple imputation by chained equations imputes the missing covariates repeatedly with plausible values drawn from their predictive distribution given the observed data, creating several complete datasets^[28, 29]. We used the mice package (version 3.3.0) in R (version 3.6.3) to perform multiple imputation, with Bayesian linear regression as the imputation model, which is commonly used when covariates and outcomes are continuous. All other parameters were left at their defaults.
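The missing indicator method for a continuous confounder can be sketched in a few lines. This is an illustrative NumPy example, not the study's implementation: the function name and the toy values are assumptions, missing entries are encoded as NaN, and the indicator is coded 1 where a value is missing (the coding used for the missing indicator M in the simulation section).

```python
import numpy as np

def missing_indicator_design(x):
    """Return (filled, indicator) for one continuous covariate with NaNs.

    Missing entries are set to 0 and flagged by an indicator that is
    1 where the value was missing and 0 where it was observed.
    """
    x = np.asarray(x, dtype=float)
    indicator = np.isnan(x).astype(float)
    filled = np.where(np.isnan(x), 0.0, x)
    return filled, indicator

# Toy covariate (hypothetical values) with two missing entries.
x2 = np.array([0.3, np.nan, -1.2, np.nan])
filled, m = missing_indicator_design(x2)

# Both columns enter the propensity model next to the intercept
# and any fully observed covariates.
design = np.column_stack([np.ones_like(filled), filled, m])
```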

2.4 Inverse probability weighting

Inverse probability weighting (IPW) uses the inverse of the propensity score as weights to create a synthetic sample in which the baseline covariate distribution is independent of treatment assignment^[30]. In this study, we use IPW to estimate the true effect. Unlike propensity score matching, IPW uses all individuals in both groups, which avoids discarding samples and maintains statistical power to detect effects. However, IPW is more sensitive to misestimated propensity scores. This limitation underscores the importance of careful model specification before applying propensity score weighting. Multi-task neural networks can help overcome this limitation.
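To make the weighting concrete, the sketch below computes IPW weights and a weighted difference in outcome means on toy data with a single confounder. All names, coefficients, and the sample size are illustrative assumptions. The true propensity score is used so that the example isolates the weighting step; in practice the score would come from a fitted model.

```python
import numpy as np

def ipw_weights(t, ps):
    """1/e(X) for treated subjects, 1/(1 - e(X)) for untreated subjects."""
    t = np.asarray(t, dtype=float)
    ps = np.asarray(ps, dtype=float)
    return t / ps + (1.0 - t) / (1.0 - ps)

def weighted_mean_difference(y, t, w):
    """Difference of weighted means of y between treated and untreated."""
    y, t, w = (np.asarray(a, dtype=float) for a in (y, t, w))
    treated = np.sum(w * t * y) / np.sum(w * t)
    control = np.sum(w * (1.0 - t) * y) / np.sum(w * (1.0 - t))
    return treated - control

# Toy confounded data (hypothetical): x drives both treatment and outcome.
rng = np.random.default_rng(1)
n = 20_000
x = rng.normal(size=n)
ps = 1.0 / (1.0 + np.exp(-(-0.8 + 0.5 * x)))   # true propensity score
t = rng.binomial(1, ps)
y = x + 1.0 * t + rng.normal(size=n)           # treatment effect of 1

w = ipw_weights(t, ps)
naive = y[t == 1].mean() - y[t == 0].mean()      # biased by confounding
ipw = weighted_mean_difference(y, t, w)          # close to the true effect
balance_gap = weighted_mean_difference(x, t, w)  # near 0 after weighting
```

The near-zero `balance_gap` is the "synthetic sample" property in action: after weighting, the confounder has about the same mean in both groups.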

2.5 Multi-task neural network

Neural networks are excellent function approximators that can estimate both linear and nonlinear functions. They are trained in a supervised manner on data samples with known outcomes, building a nonlinear model that predicts the outputs from the inputs. Figure 1 (a) shows three independent neural networks. All three take the same inputs, and each is trained separately with backpropagation. Because there is no connection between the three networks, what one learns cannot help the others. This is known as single-task learning (STL). Figure 1 (b) shows a single network with the same inputs as in (a) but with three outputs, one per learning task. Each of the three outputs is connected to the same hidden layer, and backpropagation is performed in parallel on the three outputs. Because the outputs share a hidden layer, the internal representation learned for one task is available to the other tasks. The core idea of multi-task learning (MTL) is to train the tasks simultaneously so that knowledge learned for one task is shared with the others.

In this study, we propose a new pipeline that uses a multi-task neural network (MTNN) to estimate propensity scores. Our task set has three parts: reconstructing the input covariates, estimating propensity scores, and predicting missing patterns. These tasks are closely related. The structure of MTNN is shown in Fig. 1 (c). To achieve joint optimality across all tasks, the MTNN must correctly learn the relationships among covariates, between covariates and missingness, and between covariates and exposure. Through joint learning and shared hidden layers, MTNN reduces overfitting when estimating propensity scores. The detailed calculation procedure and more information about MTNN training can be found in Supplementary S.1. Our tutorial and source code for MTNN are also available on GitHub so that readers can apply the method to real problems and gain a deeper understanding of it. The models for missing value imputation and propensity score estimation are selected based on convergence of the objective function; in all experiments in this study, we used the model from the last epoch after convergence.
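The shared-layer idea can be illustrated with a forward pass and a joint objective in NumPy. This is only a sketch under stated assumptions and is not the authors' architecture, which is specified in Supplementary S.1 and the GitHub repository. The layer sizes, the tanh activation, zero-filling of missing inputs, the equal task weights, and the choice of losses (masked squared error for reconstruction, binary cross-entropy for the other two tasks) are all hypothetical. Training by backpropagation over the summed loss is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(target, prob, eps=1e-7):
    """Mean binary cross-entropy."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(prob) + (1 - target) * np.log(1 - prob)))

def init_params(n_features, n_hidden, rng, scale=0.1):
    """One shared hidden layer plus three task heads (illustrative sizes)."""
    return {
        "W_shared": rng.normal(scale=scale, size=(n_features, n_hidden)),
        "b_shared": np.zeros(n_hidden),
        "W_recon": rng.normal(scale=scale, size=(n_hidden, n_features)),
        "b_recon": np.zeros(n_features),
        "W_ps": rng.normal(scale=scale, size=(n_hidden, 1)),
        "b_ps": np.zeros(1),
        "W_miss": rng.normal(scale=scale, size=(n_hidden, n_features)),
        "b_miss": np.zeros(n_features),
    }

def forward(params, x_filled):
    h = np.tanh(x_filled @ params["W_shared"] + params["b_shared"])  # shared representation
    recon = h @ params["W_recon"] + params["b_recon"]                # task 1: reconstruct covariates
    ps = sigmoid(h @ params["W_ps"] + params["b_ps"])[:, 0]          # task 2: propensity score
    miss = sigmoid(h @ params["W_miss"] + params["b_miss"])          # task 3: missing pattern
    return recon, ps, miss

def joint_loss(params, x, t):
    """Sum of the three task losses; x uses NaN for missing entries."""
    m = np.isnan(x).astype(float)              # missing pattern, 1 = missing
    x_filled = np.where(np.isnan(x), 0.0, x)
    recon, ps, miss = forward(params, x_filled)
    observed = 1.0 - m
    # Reconstruction error is measured on observed entries only.
    recon_loss = float(np.sum(observed * (recon - x_filled) ** 2) / max(observed.sum(), 1.0))
    return recon_loss + bce(t, ps) + bce(m, miss)

# Toy batch (hypothetical): 8 subjects, 2 covariates, 2 missing values.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 2))
x[[1, 4], 1] = np.nan
t = rng.binomial(1, 0.3, size=8).astype(float)
params = init_params(n_features=2, n_hidden=4, rng=rng)
recon, ps, miss = forward(params, np.where(np.isnan(x), 0.0, x))
loss = joint_loss(params, x, t)
```

Because all three heads read the same hidden representation `h`, a gradient step on the summed loss updates `W_shared` using signal from every task, which is the mechanism the text credits with reducing overfitting on the propensity task.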

2.6 Data

2.6.1 Simulation data

We adopted a data generation process similar to that of Choi et al.^[7]. Two scenarios were considered: one in which the outcome was treatment-related (effect$\ne$0) and one in which it was treatment-independent (effect = 0). In each scenario, we considered three different missing mechanisms. First, we generated two continuous covariates, ${X}_{1}$ and ${X}_{2}$, for each subject. ${X}_{1}$ follows a normal distribution with mean 0 and standard deviation 1. ${X}_{2}$ depends on ${X}_{1}$.

$${X}_{2i}=0.5{X}_{1i}+{\epsilon }_{i}\text{ with }{\epsilon }_{i}\sim N\left(0,0.75\right)$$

Here 0.75 is the variance of ${\epsilon }_{i}$, so the standard deviation of ${X}_{2}$ is also 1 and the correlation between ${X}_{1}$ and ${X}_{2}$ is 0.5. The treatment T was generated from a Bernoulli distribution, with the probability that subject i receives the treatment given by:

$$\text{logit}\left(P\left({T}_{i}=1|{X}_{1i},{X}_{2i}\right)\right)=-0.8+0.5{X}_{1i}+0.5{X}_{2i}$$

By this equation, about 30% of subjects were treated.
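The moments quoted for ${X}_{2}$ can be checked directly from its definition, taking 0.75 as the variance of ${\epsilon }_{i}$ and assuming ${\epsilon }_{i}$ is independent of ${X}_{1i}$ (the independence is an assumption, as the text does not state it explicitly):

$$\text{Var}\left({X}_{2i}\right)={0.5}^{2}\,\text{Var}\left({X}_{1i}\right)+\text{Var}\left({\epsilon }_{i}\right)=0.25+0.75=1$$

$$\text{Corr}\left({X}_{1i},{X}_{2i}\right)=\frac{\text{Cov}\left({X}_{1i},0.5{X}_{1i}+{\epsilon }_{i}\right)}{\sqrt{\text{Var}\left({X}_{1i}\right)\text{Var}\left({X}_{2i}\right)}}=\frac{0.5}{1}=0.5$$

As a rough check on the treated fraction, a subject with ${X}_{1i}={X}_{2i}=0$ has treatment probability ${\text{e}}^{-0.8}/\left(1+{\text{e}}^{-0.8}\right)\approx 0.31$, which is in line with the stated figure of about 30%.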

We constructed two scenarios:

Scenario 1: the outcome is affected by the treatment. We assume, without loss of generality, that the treatment has an effect of 1 on the subject’s outcome.

$${Y}_{i}={X}_{1i}+{X}_{2i}+{T}_{i}+{\epsilon }_{i},\text{ with }{\epsilon }_{i}\sim N\left(0,1\right)$$

Scenario 2: the outcome is unrelated to the treatment.

$${Y}_{i}={X}_{1i}+{X}_{2i}+{\epsilon }_{i},\text{ with }{\epsilon }_{i}\sim N\left(0,1\right)$$
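The data-generating equations above translate into a short NumPy sketch. The function name, the seed, and the sample size are assumptions (the text does not state the number of subjects per replication), and 0.75 is treated as the variance of the noise in ${X}_{2}$.

```python
import numpy as np

def simulate(n, effect, rng):
    """Draw (X1, X2, T, Y) following the simulation equations.

    effect=1.0 gives Scenario 1, effect=0.0 gives Scenario 2.
    The noise terms are assumed independent of each other.
    """
    x1 = rng.normal(0.0, 1.0, size=n)
    x2 = 0.5 * x1 + rng.normal(0.0, np.sqrt(0.75), size=n)
    p_treat = 1.0 / (1.0 + np.exp(-(-0.8 + 0.5 * x1 + 0.5 * x2)))
    t = rng.binomial(1, p_treat)
    y = x1 + x2 + effect * t + rng.normal(0.0, 1.0, size=n)
    return x1, x2, t, y

rng = np.random.default_rng(2023)
x1, x2, t, y = simulate(n=50_000, effect=1.0, rng=rng)   # Scenario 1
_, _, t0, y0 = simulate(n=50_000, effect=0.0, rng=rng)   # Scenario 2
```

A large `n` is used here only so that the sample moments sit close to their theoretical values; each replication in a simulation study would typically use a smaller sample.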

To test the effect of different missing rates on effect estimation in the simulated datasets, we preset 7 missing rates: 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, and 0.8. Missing values in ${X}_{2}$ were generated using three mechanisms:

(1) MCAR: A randomly selected subset of the ${X}_{2}$ observations, of the given proportion, is set to missing.

(2) MAR: The higher the value of ${X}_{1}$, the more likely the value of ${X}_{2}$ is to be missing. Taking $M$ as the missing indicator of ${X}_{2}$, the probability that ${X}_{2}$ is missing is:

$$\text{logit}\left(P\left({M}_{i}=1\right)\right)={X}_{1i}+C$$

(3) MNAR: The higher the value of ${X}_{2}$, the more likely it is to be missing. The probability that ${X}_{2}$ is missing is:

$$\text{logit}\left(P\left({M}_{i}=1\right)\right)={X}_{2i}+C$$

C is a constant used to control the missing rate. For example, for a missing rate of about 50%, C can be set to 0.
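The three mechanisms can be sketched as a single mask-generating function. The text says only that C controls the missing rate; solving for C by bisection, as done below, is one possible way to hit a target rate and is an assumption of this example, as are the function names and sample size.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def missing_mask(x1, x2, mechanism, rate, rng):
    """Boolean mask M (True = X2 missing) with missing rate close to `rate`."""
    n = len(x2)
    if mechanism == "MCAR":
        mask = np.zeros(n, dtype=bool)
        mask[rng.choice(n, size=int(round(rate * n)), replace=False)] = True
        return mask
    # MAR depends on the observed X1, MNAR on the value of X2 itself.
    driver = {"MAR": x1, "MNAR": x2}[mechanism]
    # Bisection on C so that the mean of expit(driver + C) matches the target rate.
    lo, hi = -20.0, 20.0
    for _ in range(60):
        c = 0.5 * (lo + hi)
        if expit(driver + c).mean() < rate:
            lo = c
        else:
            hi = c
    return rng.random(n) < expit(driver + c)

rng = np.random.default_rng(0)
x1 = rng.normal(size=20_000)
x2 = 0.5 * x1 + rng.normal(0.0, np.sqrt(0.75), size=20_000)
masks = {m: missing_mask(x1, x2, m, rate=0.3, rng=rng) for m in ("MCAR", "MAR", "MNAR")}
x2_observed = np.where(masks["MNAR"], np.nan, x2)   # X2 as the analyst would see it
```

Under MAR the masked subjects have higher ${X}_{1}$ on average, and under MNAR the masked values of ${X}_{2}$ are themselves higher, which is what distinguishes the two mechanisms from MCAR.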

2.6.2 Real-world data

The real-world data combine the treated group from the National Supported Work Demonstration (NSWD) with a comparison sample from the Panel Study of Income Dynamics (PSID). The dataset has been used by many researchers to test different propensity score analysis methods^[31, 32]. It contains 614 samples (185 treated and 429 controls), each with nine variables. Table S1 provides more details. Treat is the intervention variable, re78 is the outcome, and the other 7 variables are covariates. Table S2 summarizes the distribution of covariates between the treatment groups.

Our experiments used the inverse probability-weighted effect estimate obtained from propensity scores calculated on the complete data as the true value. Experiments were then performed to estimate this effect under the three missing mechanisms. Missing values were introduced into both re74 and re75, with missingness generated independently for the two variables. As with the simulated datasets, we used 7 missing rates for the real-world dataset: 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, and 0.8.

MCAR: For each of re74 and re75, a randomly selected subset of observations, of the given proportion, is set to missing.

MAR: The log odds of being missing are assumed to be a linear function of age and education. To make the missing probability easier to set, we standardize age and years of education so that the mean is 0. Let ${M}_{1}$ and ${M}_{2}$ represent the missing indicators of re74 and re75, respectively. Their missing probabilities are:

$$\text{logit}\left(P\left({M}_{i1}=1\right)\right)={\text{age}}_{i}+{\text{educ}}_{i}+C$$

$$\text{logit}\left(P\left({M}_{i2}=1\right)\right)={\text{age}}_{i}+{\text{educ}}_{i}+C$$

MNAR: The higher the value of a variable, the more likely that value is to be missing. As with age and years of education, we also standardize re74 and re75. The probabilities that re74 and re75 are missing are:

$$\text{logit}\left(P\left({M}_{i1}=1\right)\right)={\text{re74}}_{i}+C$$

$$\text{logit}\left(P\left({M}_{i2}=1\right)\right)={\text{re75}}_{i}+C$$

2.7 Estimation of the true effect

The first step is to deal with the missing values in the samples. Because MTNN computes propensity scores and imputed values simultaneously, it does not require separate missing value processing. When propensity scores were estimated by logistic regression, missing values were handled by either multiple imputation or the missing indicator method. For the real-world data, we estimated propensity scores using age, education, race, marital status, high school degree status (nodegree), re74, and re75 as covariates. These 7 covariates were also included in the regression analysis used to estimate the effect. Lastly, we estimated the effect using an inverse probability-weighted regression analysis, in which subjects receiving treatment were weighted by 1/propensity score and subjects not receiving treatment were weighted by 1/(1 - propensity score). Figure 2 shows the workflow for estimating the effects with the three methods.
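The final step, an inverse probability-weighted regression of the outcome on treatment and covariates, can be sketched with weighted least squares in NumPy. The function name and the toy data are assumptions. The example uses the true propensity score of the simulated design; in the workflow described above, the score would instead come from logistic regression or from MTNN.

```python
import numpy as np

def ipw_regression_effect(y, t, covariates, ps):
    """Weighted least squares of Y on (1, T, covariates).

    Weights are 1/ps for treated and 1/(1 - ps) for untreated subjects.
    Returns the coefficient on T as the effect estimate.
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    ps = np.asarray(ps, dtype=float)
    w = t / ps + (1.0 - t) / (1.0 - ps)
    design = np.column_stack([np.ones_like(y), t, covariates])
    sw = np.sqrt(w)   # scaling rows by sqrt(w) turns WLS into ordinary least squares
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return float(coef[1])

# Toy data following the simulation design with a true effect of 1.
rng = np.random.default_rng(7)
n = 10_000
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(0.0, np.sqrt(0.75), size=n)
ps_true = 1.0 / (1.0 + np.exp(-(-0.8 + 0.5 * x1 + 0.5 * x2)))
t = rng.binomial(1, ps_true)
y = x1 + x2 + 1.0 * t + rng.normal(size=n)

effect_hat = ipw_regression_effect(y, t, np.column_stack([x1, x2]), ps_true)
```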

2.8 Evaluation

The simulated-data experiments cover two effect scenarios and three missing mechanisms, i.e., 6 data-generating scenarios, each analyzed with 3 methods for handling missing values. The real-world experiments cover the three missing mechanisms, i.e., three scenarios. For each scenario, the same process of missing value imputation, propensity score calculation, and effect estimation was repeated 1000 times before evaluating the results of the different methods. The methods are compared in terms of the standard deviation (SD) of the estimates over the 1000 replications and the root mean square error (RMSE), calculated as the square root of the mean squared difference between each estimate ${\widehat{\beta }}_{i}$ and the true value $\beta$, where n is the number of replications:

$$RMSE=\sqrt{\frac{1}{n}\sum _{i=1}^{n} {\left({\widehat{\beta }}_{i}-\beta \right)}^{2}}$$
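Both evaluation metrics are simple to compute from the replicated estimates. The sketch below uses four made-up estimates in place of the 1000 replications; the function name and the use of the sample SD (`ddof=1`) are assumptions, since the text does not say which SD convention was used.

```python
import numpy as np

def sd_and_rmse(estimates, true_value):
    """SD of the estimates and RMSE against the true effect."""
    estimates = np.asarray(estimates, dtype=float)
    sd = estimates.std(ddof=1)   # sample SD (assumed convention)
    rmse = np.sqrt(np.mean((estimates - true_value) ** 2))
    return float(sd), float(rmse)

# Hypothetical effect estimates from four replications, true effect 1.
estimates = [0.9, 1.1, 1.2, 1.0]
sd, rmse = sd_and_rmse(estimates, true_value=1.0)
bias = float(np.mean(estimates)) - 1.0
```

The squared RMSE equals the squared bias plus the variance of the estimates (computed with divisor n), so a method can have a small SD and still a large RMSE if its estimates are systematically off.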

3. Results

3.1 Analysis results on simulation datasets

Fig. 3 shows the RMSE of the effect estimates under the two true-effect scenarios and three missing mechanisms. MTNN achieves the smallest RMSE in all six data scenarios and thus outperforms the other two methods. In addition, regardless of the method used, the RMSE increases with the missing rate: when the missing rate increased from 0.2 to 0.8, the RMSE of each of the three methods roughly doubled. Table 1, Table S3, and Table S4 present more detailed estimation results for the three methods. In all data scenarios, MTNN has not only the smallest deviation from the true effect but also the smallest standard deviation of the estimates. This shows that MTNN provides the most accurate estimates and is more stable than the other methods.

3.2 Analysis results on real-world datasets

We first calculated the propensity score by logistic regression on the complete data and then used the inverse probability-weighted regression to calculate an effect of 712.743, which we used as the true value of the effect for subsequent comparisons.

Fig. 4 compares the RMSE of the different methods under the three missing mechanisms. As with the simulated data, MTNN exhibited the smallest RMSE across missing mechanisms and missing rates. Unlike in the simulations, however, the missing rate has less influence on the RMSE in the real-world dataset. Table 2, Table S5, and Table S6 provide further details of the estimation results for the various methods. The standard deviation of the MTNN estimates is lower than that of the other two methods in most settings; under MCAR, the only exception in Table 2 is a missing rate of 0.8.

4. Discussion

In this study, we develop a novel method that uses a multi-task neural network to calculate propensity scores directly for samples with missing values. On simulated and real-world datasets, we compare the proposed method with two commonly used ones. Under the three missing mechanisms, our proposed method has the smallest RMSE in estimating the true effect. In addition, the standard deviation of the effect estimated by MTNN is generally the smallest, indicating that it is more robust than the other two methods. Previous studies have demonstrated smaller RMSEs for machine learning algorithms^[33–36], and our study confirms these findings in scenarios with missing values. We also found that at lower missing rates, the missing indicator method has a smaller RMSE than multiple imputation under all 3 missing mechanisms. This result is consistent with a previous study^[7].

Recent studies have used autoencoders to reduce the dimension of high-dimensional features and then calculated propensity scores from the reduced features^[17]. This approach leverages the ability of neural networks to handle high-dimensional data. However, it does not treat reconstruction and propensity score computation as joint tasks. In contrast, we train the model with reconstruction of the input, prediction of missing patterns, and estimation of propensity scores as joint tasks to prevent overfitting, which would otherwise push propensity scores toward zero or one and bias the effect estimates.

As the dimension of variables and the number of data modalities in observational studies increase, the relationships between variables will become more complex and missing values will be harder to avoid. It also becomes increasingly difficult to specify propensity models manually for high-dimensional variables. Neural networks can model complex relationships, so there is no need to manually specify the so-called correct model; the network learns adaptively from the data. Performing multiple imputation is expensive in large datasets, whereas the computational cost of the MTNN model is smaller. Furthermore, compared with multiple imputation^[37], MTNN does not require prior assumptions about the distribution of the data. It automatically learns the correlations between variables and thereby imputes their missing values.

Limitations

Our study has some limitations. First, the performance of the MTNN model on simulated and real-world data is not completely consistent. In real-world data, the relationships among variables, and between variables and outcomes and treatments, are more complex, and these unknown relationships are difficult to simulate manually. Our simulations cover only the simplest case, so the results on the two types of data differ slightly. Second, in the real-world experiment, we used the effect calculated by inverse probability weighting on the complete data as the true effect, because the true effect in real-world data cannot be observed and can only be estimated.

Conclusion

In this study, we proposed a novel method for estimating propensity scores in data with missing values. It is based on a multi-task neural network in which missing value imputation and propensity score estimation are jointly trained as related tasks. The experimental results on simulated and real-world data show that our model has the smallest error in estimating the true effect under different missing mechanisms and missing rates, and that the standard deviation of its effect estimates is generally the smallest as well. This indicates that our method is well suited to real-world observational studies with missing values.

Declarations

Ethics approval and consent to participate: Not applicable.

Consent for publication: Not applicable.

Availability of data and materials: The data in this study are available from the corresponding author on reasonable request. Readers interested in the code of the simulation analysis may contact the corresponding author.

Competing interests: The authors have no conflicts of interest to declare.

Funding: This work was partially supported by the National Natural Science Foundation of China [grant number 11901352]; the Research Grants Council of the Hong Kong Special Administrative Region, China [HKU C7123-20G]; and the "Coronavirus Disease Special Project" of Xinglin Scholars of Chengdu University of Traditional Chinese Medicine [grant number XGZX2013].

Author contributions: Study conception and design: S Yang, J Luo and X Yan; Collection and creation of data: J Luo, P Du and S Yang; Data analysis and interpretation: S Yang, J Luo, X Yan, X Feng, P Du; Drafting the manuscript and figures: all authors; Final approval of manuscript: all authors.

Acknowledgment: The authors would like to thank Professor He Daihai for theoretical guidance.

References

1. WEBSTER-CLARK M, STÜRMER T, WANG T, et al. Using propensity scores to estimate effects of treatment initiation decisions: state of the science [J]. Stat Med. 2021;40(7):1718–35.
2. AUSTIN P C, JEMBERE N, CHIU M. Propensity score matching and complex surveys [J]. Statistical Methods in Medical Research. 2018;27(4):1240–57.
3. ROSENBAUM P R, RUBIN D B. The central role of the propensity score in observational studies for causal effects [J]. Biometrika. 1983;70(1):41–55.
4. LIN J, GAMALO-SIEBERS M, TIWARI R. Propensity-score-based priors for Bayesian augmented control design [J]. Pharm Stat. 2019;18(2):223–38.
5. CHAM H, WEST S G. Propensity score analysis with missing data [J]. Psychol Methods. 2016;21(3):427.
6. D'AGOSTINO JR R B, RUBIN D B. Estimating and using propensity scores with partially missing data [J]. J Am Stat Assoc. 2000;95(451):749–59.
7. CHOI J, DEKKERS O M, LE CESSIE S. A comparison of different methods to handle missing data in the context of propensity score analysis [J]. Eur J Epidemiol. 2019;34(1):23–36.
8. MALLA L, PERERA-SALAZAR R, MCFADDEN E, et al. Handling missing data in propensity score estimation in comparative effectiveness evaluations: a systematic review [J]. J Comp Eff Res. 2018;7(3):271–9.
9. SHAO J, WANG L. Semiparametric inverse propensity weighting for nonignorable missing data [J]. Biometrika. 2016;103(1):175–87.
10. QU Y, LIPKOVICH I. Propensity score estimation with missing values using a multiple imputation missingness pattern (MIMP) approach [J]. Stat Med. 2009;28(9):1402–14.
11. CROWE B J, LIPKOVICH I A, WANG O. Comparison of several imputation methods for missing baseline data in propensity scores analysis of binary outcome [J]. Pharm Stat. 2010;9(4):269–79.
12. MATTEI A. Estimating and using propensity score in presence of missing background data: an application to assess the impact of childbearing on wellbeing [J]. Statistical Methods and Applications. 2009;18(2):257–73.
13. LINDEN A, YARNOLD P R. Combining machine learning and propensity score weighting to estimate causal effects in multivalued treatments [J]. J Eval Clin Pract. 2016;22(6):875–85.
14. CANNAS M, ARPINO B. A comparison of machine learning algorithms and covariate balance measures for propensity score matching and weighting [J]. Biom J. 2019;61(4):1049–72.
15. TU C. Comparison of various machine learning algorithms for estimating generalized propensity score [J]. J Stat Comput Simul. 2019;89(4):708–19.
16. SETOGUCHI S, SCHNEEWEISS S, BROOKHART M A, et al. Evaluating uses of data mining techniques in propensity score estimation: a simulation study [J]. Pharmacoepidemiol Drug Saf. 2008;17(6):546–55.
17. WEBERPALS J, BECKER T, DAVIES J, et al. Deep Learning-based Propensity Scores for Confounding Control in Comparative Effectiveness Research: A Large-scale, Real-world Data Study [J]. Epidemiology. 2021;32(3):378–88.
18. KUBAT M. Neural networks: a comprehensive foundation by Simon Haykin, Macmillan, 1994, ISBN 0-02-352781-7 [J]. The Knowledge Engineering Review. 1999;13(4):409–12.
19. CARUANA R. Multitask learning [J]. Mach Learn. 1997;28(1):41–75.
20. GUO S, FRASER M W. Propensity score analysis: Statistical methods and applications [M]. SAGE Publications; 2014.
21. STUART E A. Matching methods for causal inference: A review and a look forward [J]. Stat Sci. 2010;25(1):1.
22. CEPEDA M S, BOSTON R, FARRAR J T, et al. Comparison of logistic regression versus propensity score when the number of events is low and there are multiple confounders [J]. Am J Epidemiol. 2003;158(3):280–7.
23. LEE B K, LESSLER J, STUART E A. Improving propensity score weighting using machine learning [J]. Stat Med. 2010;29(3):337–46.
24. WESTREICH D, LESSLER J, FUNK M J. Propensity score estimation: machine learning and classification methods as alternatives to logistic regression [J]. J Clin Epidemiol. 2010;63(8):826.
25. SANTOS M S, PEREIRA R C, COSTA A F, et al. Generating synthetic missing data: A review by missing mechanism [J]. IEEE Access. 2019;7:11651–67.
26. GARCIARENA U, SANTANA R. An extensive analysis of the interaction between missing data types, imputation methods, and supervised classifiers [J]. Expert Syst Appl. 2017;89:52–65.
27. WEST S G, CHAM H, THOEMMES F, et al. Propensity scores as a basis for equating groups: basic principles and application in clinical treatment outcome research [J]. J Consult Clin Psychol. 2014;82(5):906.
28. ZHANG P. Multiple imputation: theory and method [J]. International Statistical Review/Revue Internationale de Statistique. 2003:581–92.
29. LI P, STUART E A, ALLISON D B. Multiple imputation: a flexible tool for handling missing data [J]. JAMA. 2015;314(18):1966–7.
30. AUSTIN P C. An introduction to propensity score methods for reducing the effects of confounding in observational studies [J]. Multivar Behav Res. 2011;46(3):399–424.
31. LALONDE R J. Evaluating the econometric evaluations of training programs with experimental data [J]. The American Economic Review. 1986:604–20.
32. DEHEJIA R H, WAHBA S. Causal effects in nonexperimental studies: Reevaluating the evaluation of training programs [J]. J Am Stat Assoc. 1999;94(448):1053–62.
33. KARIM M E, PANG M, PLATT R W. Can we train machine learning methods to outperform the high-dimensional propensity score algorithm? [J]. Epidemiology. 2018;29(2):191–8.
34. WYSS R, SCHNEEWEISS S, VAN DER LAAN M, et al. Using super learner prediction modeling to improve high-dimensional propensity score estimation [J]. Epidemiology. 2018;29(1):96–106.
35. JU C, COMBS M, LENDLE S D, et al. Propensity score prediction for electronic healthcare databases using super learner and high-dimensional propensity score methods [J]. J Appl Stat. 2019;46(12):2216–36.
36. CHOI B Y, WANG C-P, MICHALEK J, et al. Power comparison for propensity score methods [J]. Comput Stat. 2019;34(2):743–61.
37. LIU X. Methods and applications of longitudinal data analysis [M]. Elsevier; 2015.

Tables

Table 1 Estimation of the true effect in the simulated datasets using three different methods under the MCAR mechanism.

Missing rate	Method	Mean (true effect = 0)	SD (true effect = 0)	RMSE (true effect = 0)	Mean (true effect = 1)	SD (true effect = 1)	RMSE (true effect = 1)
0.2	Missing indicator	0.123	0.057	0.135	1.123	0.057	0.125
	Multiple imputation	0.116	0.060	0.129	1.126	0.057	0.127
	Multi-task neural network	0.096	0.052	0.108	1.084	0.055	0.085
0.3	Missing indicator	0.147	0.060	0.158	1.147	0.060	0.149
	Multiple imputation	0.152	0.060	0.163	1.163	0.074	0.165
	Multi-task neural network	0.112	0.056	0.124	1.127	0.054	0.128
0.4	Missing indicator	0.174	0.058	0.182	1.174	0.058	0.175
	Multiple imputation	0.185	0.084	0.202	1.200	0.068	0.201
	Multi-task neural network	0.141	0.048	0.148	1.142	0.055	0.143
0.5	Missing indicator	0.205	0.065	0.214	1.205	0.065	0.207
	Multiple imputation	0.218	0.065	0.226	1.212	0.072	0.214
	Multi-task neural network	0.173	0.054	0.181	1.174	0.058	0.175
0.6	Missing indicator	0.237	0.068	0.246	1.237	0.068	0.239
	Multiple imputation	0.230	0.076	0.241	1.237	0.068	0.239
	Multi-task neural network	0.203	0.061	0.211	1.202	0.059	0.204
0.7	Missing indicator	0.249	0.080	0.260	1.249	0.080	0.251
	Multiple imputation	0.251	0.082	0.262	1.250	0.075	0.252
	Multi-task neural network	0.214	0.077	0.226	1.213	0.072	0.215
0.8	Missing indicator	0.266	0.064	0.273	1.266	0.064	0.268
	Multiple imputation	0.269	0.070	0.277	1.257	0.077	0.259
	Multi-task neural network	0.236	0.059	0.243	1.233	0.062	0.235

SD, standard deviation; RMSE, root mean square error; MCAR, missing completely at random; MAR, missing at random; MNAR, missing not at random.

Table 2 Estimation of the true effect in the real-world datasets using three different methods under the MCAR mechanism.

Missing rate	Method	Mean	SD	RMSE
0.2	Missing indicator	347.199	236.990	429.151
	Multiple imputation	285.931	282.518	503.987
	Multi-task neural network	639.059	108.559	126.632
0.3	Missing indicator	316.332	366.169	527.080
	Multiple imputation	278.732	328.906	534.534
	Multi-task neural network	647.959	137.392	145.554
0.4	Missing indicator	283.208	238.148	485.328
	Multiple imputation	289.757	295.469	507.433
	Multi-task neural network	664.981	126.017	128.738
0.5	Missing indicator	203.584	276.721	572.852
	Multiple imputation	200.004	137.295	529.024
	Multi-task neural network	616.633	106.922	139.736
0.6	Missing indicator	312.163	236.664	459.209
	Multiple imputation	211.421	151.232	521.448
	Multi-task neural network	566.496	125.629	188.66
0.7	Missing indicator	337.854	195.997	418.467
	Multiple imputation	308.159	199.459	446.647
	Multi-task neural network	556.685	126.365	196.788
0.8	Missing indicator	333.714	122.130	396.342
	Multiple imputation	352.637	97.894	371.888
	Multi-task neural network	561.435	154.324	210.543

SD, standard deviation; RMSE, root mean square error; MCAR, missing completely at random; MAR, missing at random; MNAR, missing not at random.

Editorial decision (major revision): 14 Nov, 2022
Reviews received at journal: 24 Oct, 2022
Reviewers agreed at journal: 17 Oct, 2022
Reviewers invited by journal: 14 Oct, 2022
Editor assigned by journal: 07 Oct, 2022
Editor invited by journal: 29 Sep, 2022
Submission checks completed at journal: 29 Sep, 2022
First submitted to journal: 17 Sep, 2022
