There were 115 papers selected for the systematic review. Table 1 shows the study characteristics and attack performance from the 111 papers that included quantitative assessments of MIAs; the remaining papers described concepts and theories without experiments. Each paper presented at least one experiment with a performance metric above 50%, with 81.1% of papers achieving attack performance ≥ 75% and 50.5% reaching attack performance ≥ 90%. Of the 115 papers, 42 met the inclusion criteria for the meta-analysis, providing evidence on attack performance from 1,910 experiment scenarios (Fig. 1). The average attack performance across these experiments was: accuracy, 65.0% (95%CI: 64.0%, 66.0%); recall (sensitivity), 74.0% (95%CI: 72.0%, 76.0%); precision, 65.0% (95%CI: 64.0%, 67.0%); F1 score, 63.0% (95%CI: 60.0%, 65.0%); and AUC, 61.0% (95%CI: 60.0%, 63.0%), with variation across training datasets and target model types (Table 2). Health data were used in 32.4% of the papers in the systematic review2–35 and 52.4% of the papers in the meta-analysis3–5,9,11–14,16–28,31. The most frequently analyzed health dataset was the Texas Inpatient Hospital data (18 studies), with performance ranging from an accuracy of 70.0% (95%CI: 68.0%, 73.0%) to a recall (sensitivity) of 88.0% (95%CI: 82.0%, 94.0%)3,5,9,11,12,14,16,17,19–25,31,36,37.
Review of the literature highlighted three overarching concepts related to the vulnerability of ML models to MIA. The first is retention of training data in the ML model35,36,38,40–42,45. The second is the extent to which single data points can influence model decisions, which is most often a consequence of the model architecture36,38. The third concerns the distributions in the training data38,39,41–43. These explanations are not mutually exclusive, as each may contribute to the success of a privacy attack to varying degrees.
3.1 Retention of Training Data in ML Models
A model consists of functions that map a vector of feature values to an output, which is typically a class label in classification models. Model parameters are the variables within these functions that are adjusted during the training process to minimize the difference between the model's output and the true values in the data. This is typically done by iteratively adjusting the parameters with an optimization algorithm until the model achieves maximal performance. Through this data-driven process, the model can become a comprehensive representation of the training data, retaining details that make it vulnerable to privacy breaches2,4,8,31,33,35,36,38,40–42,45–47. Retention of data in ML models is related to three concepts: over-parameterization, overfitting, and memorization33,35,36,38,40–42,45.
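As a concrete illustration of this iterative parameter-fitting process, the sketch below trains a toy logistic classifier by gradient descent. It is a minimal Python example with synthetic data and an illustrative learning rate, not an implementation drawn from any of the reviewed studies.

```python
import numpy as np

def train_logistic(X, y, lr=0.1, n_iter=500):
    """Toy illustration of iterative parameter fitting: gradient descent
    on the cross-entropy loss of a logistic classifier."""
    n, d = X.shape
    w = np.zeros(d)      # model parameters, adjusted during training
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        grad_w = X.T @ (p - y) / n               # gradient of the loss
        grad_b = np.mean(p - y)
        w -= lr * grad_w                         # step that reduces the
        b -= lr * grad_b                         # training-set loss
    return w, b

# Toy usage with synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
w, b = train_logistic(X, y)
```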
A model is considered over-parameterized if it contains more parameters than necessary to accurately fit the training data for a given task40,42. The more parameters a model has, the greater its capacity to learn the underlying patterns and values in the data, including noise33,35,40–42. In supervised learning, memorization refers to the explicit mapping of inputs to their respective outputs, rather than learning the underlying patterns and dependencies in the data8,31,41,42,45. An input value is memorized if its output is predicted with higher confidence than expected given its frequency in the dataset, and if its correct output cannot be predicted from the distributions in the dataset when the value is excluded from the training data41,42,45. This phenomenon has predominantly been identified in deep neural networks, which are often highly complex, with an unnecessarily high capacity to retain training data8,38. Memorization of input values may be essential for high model performance when the underlying data are heavily skewed, or when there is insufficient data to derive patterns41,42. However, the stark contrast in confidence values for outputs when memorized values are present versus absent from the training data also leaves these models vulnerable to MIAs31,41,42. If the model becomes too large and complex relative to the amount of training data, it can overfit2,4,33,35,38,39,41,42,45. Overfitting occurs when the model performs better on its training data than on unseen data, a discrepancy known as generalization error2,4,33,35,38,39,41,42,45. It is often measured with the generalization gap, calculated as the difference between training and test performance.
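Under this definition, the generalization gap is simply the training-minus-test performance difference; a minimal sketch with placeholder accuracy values:

```python
train_accuracy = 0.98   # performance on the data the model was fit to (placeholder)
test_accuracy = 0.71    # performance on held-out data (placeholder)

# Generalization gap: the difference between training and test performance.
# Larger gaps indicate overfitting and, per the reviewed literature, tend to
# accompany higher membership-inference attack performance.
generalization_gap = train_accuracy - test_accuracy   # 0.27, i.e. 27 percentage points
```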
The number of model parameters was not reported in the reviewed papers. We therefore estimated the effect of the generalization gap for studies (n = 27) and scenarios (n = 1,138) that reported both training and validation accuracy. Through recursive partitioning, the following categories were defined, within which there was minimal variance in attack performance: accuracy < 17.4%, 17.4–34.1%, ≥ 34.1%; recall and precision < 12.4%, 12.4–24.2%, ≥ 24.2%; and AUC < 17.1%, 17.1–34.4%, ≥ 34.4%. Table 3 shows the adjusted attack performance for different magnitudes of generalization gap by target model type, and the adjusted difference in attack performance by generalization gap. The most robust estimates of the effect of generalization gap on attack performance were for attacks on neural networks and for the accuracy of attacks on decision trees (Table 3). While a higher generalization gap was associated with higher attack performance, the magnitude of the effect varied, with a larger effect on neural networks than on other model architectures (Table 3).
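As an illustration of how such category thresholds can be derived, the sketch below fits a shallow regression tree (recursive partitioning) to synthetic scenario-level data using scikit-learn; the data, tree settings, and printed thresholds are illustrative assumptions, not the actual analysis code.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins: generalization gap (%) per experiment scenario and
# the corresponding attack performance (%); real values come from the
# extracted study data.
gap = rng.uniform(0, 50, size=500)
attack_perf = 55 + 0.5 * gap + rng.normal(0, 5, size=500)

# A shallow regression tree partitions the gap into ranges within which
# attack performance has minimal variance; the split points become the
# reporting categories (e.g. <17.4, 17.4-34.1, >=34.1 for accuracy).
tree = DecisionTreeRegressor(max_leaf_nodes=3, min_samples_leaf=20)
tree.fit(gap.reshape(-1, 1), attack_perf)

# Leaves carry the sentinel value -2; the remaining entries are the cut points.
thresholds = sorted(t for t in tree.tree_.threshold if t != -2.0)
print(thresholds)
```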
Table 3: Adjusted attack performance for different magnitudes of generalization gap by target model type, and the adjusted difference in attack performance by generalization gap

| Target Model Type | Study References | Generalization Gap (%) | n (%) | Adjusted Attack Performance % (95%CI) | Adjusted Difference in Attack Performance % (95%CI) |
|---|---|---|---|---|---|
| Attack Accuracy | | | | | |
| Decision Tree | 2,21,75 | < 17.4 | 52 (74.9) | 70.5 (64.0, 77.1) | ref |
| | 2,21 | 17.4–34.1 | 11 (15.1) | 75.3 (67.7, 82.8) | 4.7 (-1.3, 10.7) |
| | 2,21 | ≥ 34.1 | 8 (11.0) | 77.2 (69.3, 85.1) | 6.6 (0.4, 12.9) |
| Neural Network | 2,5,11,40,75,78,87,91,93,108,109,112,113 | < 17.4 | 190 (67.9) | 56.3 (53.0, 59.5) | ref |
| | 5,11,40,43,44,80,85,91,108,113 | 17.4–34.1 | 51 (18.2) | 64.7 (61.0, 68.4) | 8.5 (4.7, 12.2) |
| | 2,43,89,91,108,109 | ≥ 34.1 | 39 (13.9) | 78.8 (73.6, 84.0) | 22.5 (17.1, 28.0) |
| Regression | 2 | < 17.4 | 1 (25.0) | 74.7 (58.5, 91.0) | ref |
| | 2 | 17.4–34.1 | 2 (50.0) | 93.0 (80.2, 100) | 18.2 (-0.5, 37.0) |
| | 2 | ≥ 34.1 | 1 (25.0) | 91.1 (74.3, 100) | 16.4 (-6.3, 39.0) |
| k-Nearest Neighbour | 2 | < 17.4 | 1 (25.0) | 55.9 (37.2, 74.7) | ref |
| | 2 | 17.4–34.1 | 1 (25.0) | 59.6 (40.8, 78.3) | 3.7 (-13.4, 20.7) |
| | 2 | ≥ 34.1 | 2 (50.0) | 58.0 (44.7, 71.3) | 2.1 (-20.9, 25.0) |
| Attack Recall (Sensitivity) | | | | | |
| Neural Network | 5,37,81,82,87,113 | < 12.4 | 57 (32.4) | 64.4 (55.0, 73.9) | ref |
| | 5,81–83 | 12.4–24.2 | 82 (46.6) | 70.1 (60.7, 79.4) | 5.6 (-2.3, 13.6) |
| | 77,81–83,113 | ≥ 24.2 | 37 (21.0) | 76.2 (65.9, 86.5) | 11.8 (2.0, 21.5) |
| Random Forest | 87 | < 12.4 | 1 (50.0) | 69.3 (33.4, 100) | ref |
| | 87 | 12.4–24.2 | 1 (50.0) | 80.9 (45.0, 100) | 11.6 (-38.5, 61.6) |
| | | ≥ 24.2 | 0 | - | - |
| Attack Precision | | | | | |
| Decision Tree | 2 | < 12.4 | 1 (33.3) | 81.9 (54.4, 109.5) | ref |
| | 2 | 12.4–24.2 | 1 (33.3) | 73.9 (46.3, 101.4) | -8.1 (-47.0, 30.9) |
| | 2 | ≥ 24.2 | 1 (33.3) | 92.2 (64.7, 119.7) | 10.3 (-28.7, 49.2) |
| Neural Network | 2,22,37,38,82,83,87,113 | < 12.4 | 71 (35.9) | 63.1 (58.5, 67.7) | ref |
| | 5,22,81–83 | 12.4–24.2 | 66 (33.3) | 62.8 (58.1, 67.4) | -0.4 (-5.4, 4.7) |
| | 2,22,38,77,81–83,113 | ≥ 24.2 | 61 (30.8) | 67.7 (62.8, 72.5) | 4.6 (-1.1, 10.2) |
| Regression | 2 | < 12.4 | 1 (33.3) | 83.0 (56.2, 109.7) | ref |
| | 2 | 12.4–24.2 | 1 (33.3) | 65.0 (38.2, 91.8) | -18.0 (-55.2, 19.3) |
| | 2 | ≥ 24.2 | 1 (33.3) | 76.2 (49.5, 103.0) | -6.7 (-44.0, 30.5) |
| Random Forest | 87 | < 12.4 | 1 (50.0) | 65.0 (37.5, 92.5) | ref |
| | 87 | 12.4–24.2 | 1 (50.0) | 56.0 (28.5, 83.5) | -9.0 (-48.0, 30.0) |
| | | ≥ 24.2 | 0 | - | - |
| Attack AUC | | | | | |
| Neural Network | 38,40,44,83,84,93,109 | < 17.1 | 162 (87.6) | 63.9 (58.0, 69.9) | ref |
| | 38,40,44,84 | 17.1–34.4 | 18 (9.7) | 63.9 (56.7, 71.0) | 0.0 (-5.6, 5.5) |
| | 109 | ≥ 34.4 | 5 (2.7) | 91.4 (68.4, 100) | 27.5 (3.8, 51.2) |
Table 4: Adjusted attack performance for different magnitudes of partitioned density by target model type, and the adjusted difference in attack performance by partitioned density

| Target Model Type | Study References | Partitioned Density | n (%) | Adjusted Attack Performance % (95%CI) | Adjusted Difference in Attack Performance % (95%CI) |
|---|---|---|---|---|---|
| Attack Accuracy | | | | | |
| Decision Tree | 2,21,75,86 | < 6.7 | 16 (22.2) | 86.4 (76.9, 95.8) | ref |
| | 2,20,86 | 6.7–16.8 | 23 (31.9) | 77.2 (67.9, 86.5) | -9.2 (-22.4, 4.1) |
| | 2,20,21,86 | 16.8–33.1 | 24 (33.3) | 79.5 (69.9, 89.1) | -6.9 (-20.3, 6.6) |
| | 2,20,21,86,87 | ≥ 33.1 | 9 (12.5) | 63.9 (55.1, 72.6) | -22.5 (-35.3, -9.6) |
| Neural Network | 2,3,5,11,24,43,44,76,79,80,85,88,90,91,93,108,109,114,116 | < 6.7 | 237 (59.9) | 70.9 (67.2, 74.6) | ref |
| | 2,5,20,108 | 6.7–16.8 | 14 (3.5) | 67.4 (57.8, 77.1) | -3.5 (-13.8, 6.8) |
| | 5,20,40,93,113 | 16.8–33.1 | 50 (12.6) | 60.7 (51.7, 69.7) | -10.2 (-19.9, -0.5) |
| | 5,20,87,88,90,93,113,115 | ≥ 33.1 | 95 (24.0) | 53.9 (47.5, 60.3) | -17.0 (-24.4, -9.6) |
| Regression | 2,79 | < 6.7 | 3 (27.3) | 76.1 (65.9, 86.3) | ref |
| | 2,20 | 6.7–16.8 | 3 (27.3) | 76.7 (65.8, 87.6) | 0.6 (-14.3, 15.6) |
| | 2,20 | 16.8–33.1 | 2 (18.2) | 71.3 (59.2, 83.3) | -4.8 (-20.6, 11.0) |
| | 2,20 | ≥ 33.1 | 3 (27.2) | 54.4 (43.7, 65.1) | -21.7 (-36.6, -6.9) |
| Naïve Bayes | 2 | < 6.7 | 2 (28.6) | 60.7 (47.0, 74.4) | ref |
| | 2 | 6.7–16.8 | 3 (42.9) | 45.1 (33.3, 56.9) | -15.6 (-33.7, 2.5) |
| | 2 | 16.8–33.1 | 1 (14.3) | 50.6 (34.0, 67.2) | -10.1 (-31.6, 11.5) |
| | 2 | ≥ 33.1 | 1 (14.3) | 53.8 (37.5, 70.1) | -6.9 (-28.2, 14.4) |
| k-Nearest Neighbour | 2 | < 6.7 | 1 (20.0) | 56.1 (40.0, 72.3) | ref |
| | 2 | 6.7–16.8 | 2 (40.0) | 50.5 (37.6, 63.5) | -5.6 (-26.3, 15.1) |
| | 2 | 16.8–33.1 | 1 (20.0) | 53.4 (36.8, 70.0) | -2.7 (-25.9, 20.4) |
| | 2 | ≥ 33.1 | 1 (20.0) | 54.8 (38.5, 71.1) | -1.3 (-24.3, 21.6) |
| Attack Recall (Sensitivity) | | | | | |
| Decision Tree | | < 4.2 | 0 | - | - |
| | 20 | 4.2–13.7 | 1 (20.0) | 71.7 (36.1, 100) | ref |
| | 20 | 13.7–25.6 | 1 (20.0) | 58.3 (22.6, 94.0) | -13.5 (-58.1, 31.1) |
| | 20,87 | ≥ 25.6 | 3 (60.0) | 62.6 (39.9, 85.3) | -9.1 (-51.5, 33.2) |
| Neural Network | 3,5,41,76,82,83,112,119 | < 4.2 | 180 (65.0) | 79.5 (72.3, 86.7) | ref |
| | 5,20,37,83,112,119 | 4.2–13.7 | 46 (16.6) | 63.5 (51.8, 75.1) | -16.0 (-29.8, -2.3) |
| | 5,20,83 | 13.7–25.6 | 24 (8.7) | 69.0 (51.2, 86.8) | -10.6 (-29.8, 8.6) |
| | 5,20,37,77,82,87,112,113,119 | ≥ 25.6 | 55 (18.0) | 66.0 (56.8, 75.2) | -13.5 (-25.2, -1.9) |
| Regression | 112,119 | < 4.2 | 5 (29.4) | 86.3 (68.0, 100) | ref |
| | 20,112 | 4.2–13.7 | 5 (38.5) | 60.0 (40.5, 79.5) | -26.3 (-53.0, 0.4) |
| | 20 | 13.7–25.6 | 1 (7.7) | 67.0 (31.3, 100) | -19.3 (-59.4, 20.8) |
| | 20,112 | ≥ 25.6 | 2 (15.4) | 74.8 (57.6, 92.0) | -11.5 (-36.6, 13.6) |
| Random Forest | 112,119 | < 4.2 | 5 (45.5) | 66.5 (48.2, 84.8) | ref |
| | 112 | 4.2–13.7 | 4 (36.4) | 66.9 (45.1, 88.8) | 0.4 (-28.0, 28.9) |
| | | 13.7–25.6 | 0 | - | - |
| | 87,112 | ≥ 25.6 | 2 (18.2) | 57.2 (40.3, 74.1) | -9.3 (-34.2, 15.6) |
| Support Vector Machine | 112 | < 4.2 | 4 (33.3) | 88.7 (68.1, 100) | ref |
| | 112 | 4.2–13.7 | 4 (33.3) | 47.0 (25.1, 68.8) | -41.7 (-71.8, -11.7) |
| | | 13.7–25.6 | 0 | - | - |
| | 112 | ≥ 25.6 | 4 (33.3) | 65.8 (45.3, 86.2) | -22.9 (-52.0, 6.1) |
| Attack Precision | | | | | |
| Decision Tree | 2 | < 4.2 | 1 (11.1) | 92.2 (65.2, 100) | ref |
| | 20 | 4.2–13.7 | 1 (11.1) | 85.5 (59.3, 100) | -6.7 (-44.3, 31.0) |
| | 2,20 | 13.7–25.6 | 2 (22.2) | 78.6 (59.5, 97.6) | -13.7 (-46.7, 19.4) |
| | 2,20,87 | ≥ 25.6 | 5 (55.6) | 59.1 (46.4, 71.8) | -33.1 (-62.9, -3.2) |
| Neural Network | 2,5,22,41,76,82,83,112,119 | < 4.2 | 193 (60.3) | 71.9 (67.7, 76.0) | ref |
| | 5,20,22,83,112,119 | 4.2–13.7 | 45 (14.1) | 58.6 (51.7, 65.6) | -13.2 (-21.2, -5.3) |
| | 2,5,20,83 | 13.7–25.6 | 25 (7.8) | 58.5 (48.1, 68.8) | -13.4 (-24.5, -2.3) |
| | 2,5,20,37,77,82,83,87,112,113,119 | ≥ 25.6 | 57 (17.8) | 56.7 (51.1, 62.3) | -15.2 (-22.2, -8.2) |
| Regression | 2,112,119 | < 4.2 | 6 (28.6) | 70.1 (56.6, 83.6) | ref |
| | 20,112 | 4.2–13.7 | 5 (23.8) | 60.0 (44.6, 75.5) | -10.1 (-30.6, 10.4) |
| | 2,20 | 13.7–25.6 | 2 (9.5) | 76.5 (57.7, 95.2) | 6.4 (-16.7, 29.4) |
| | 20,112 | ≥ 25.6 | 8 (38.1) | 57.1 (45.6, 68.6) | -13.0 (-30.7, 4.7) |
| Random Forest | 112,119 | < 4.2 | 5 (33.3) | 65.0 (37.5, 92.5) | ref |
| | 112 | 4.2–13.7 | 4 (26.7) | 56.0 (28.5, 83.5) | -9.0 (-48.0, 30.0) |
| | | 13.7–25.6 | 0 | - | - |
| | 87,112 | ≥ 25.6 | 6 (40.0) | 56.4 (42.9, 69.9) | -9.3 (-29.9, 35.2) |
| Naïve Bayes | 2 | < 4.2 | 1 (25.0) | 50.4 (24.5, 76.3) | ref |
| | 2 | 4.2–13.7 | 0 | - | - |
| | 2 | 13.7–25.6 | 1 (25.0) | 51.0 (24.8, 77.2) | 0.6 (-36.2, 37.4) |
| | 2 | ≥ 25.6 | 2 (50.0) | 50.3 (31.9, 68.8) | -0.1 (-31.8, 31.7) |
| k-Nearest Neighbour | 2 | < 4.2 | 1 (25.0) | 60.1 (33.1, 87.1) | ref |
| | 2 | 4.2–13.7 | 0 | - | - |
| | 2 | 13.7–25.6 | 1 (25.0) | 55.4 (28.4, 82.4) | -4.8 (-42.9, 33.4) |
| | 2 | ≥ 25.6 | 2 (50.0) | 52.5 (33.4, 71.6) | -7.7 (-40.7, 25.4) |
| Support Vector Machine | 112 | < 4.2 | 4 (33.3) | 50.6 (31.5, 69.7) | ref |
| | 112 | 4.2–13.7 | 4 (33.3) | 53.4 (34.3, 72.5) | 2.8 (-24.2, 29.8) |
| | | 13.7–25.6 | 0 | - | - |
| | 112 | ≥ 25.6 | 4 (33.3) | 58.9 (39.9, 78.0) | 8.3 (-18.7, 35.3) |
| Attack AUC | | | | | |
| Neural Network | 44,83,93,109 | < 4.3 | 77 (42.3) | 72.2 (65.4, 79.0) | ref |
| | 83,84 | 4.3–12.6 | 8 (4.4) | 63.6 (49.3, 78.0) | -8.6 (-24.4, 7.3) |
| | 40,83,93 | 12.6–25.1 | 52 (28.6) | 60.8 (49.8, 71.8) | -11.4 (-24.4, 1.5) |
| | 83,84,93 | ≥ 25.1 | 45 (24.7) | 56.2 (46.1, 66.3) | -16.0 (-28.1, -3.8) |
Table 5: Relative and absolute importance of partitioned density and generalization gap by model architecture

| Target Model Architecture | Study References | Scenarios [n] | Relative Importance: Partitioned Density | Relative Importance: Generalization Gap | Absolute Importance: Partitioned Density | Absolute Importance: Generalization Gap |
|---|---|---|---|---|---|---|
| Attack Accuracy | | | | | | |
| Decision Tree | 2,21,75 | 33 | 1.0 | 0.28 | 0.62 | 0.18 |
| Neural Network | 2,5,11,40,43,44,80,85,87,91,93,108,109,112,113 | 260 | 0.30 | 1.0 | 0.39 | 1.29 |
| Regression | 2 | 4 | 1.0 | 0.74 | 0.05 | 0.04 |
| k-Nearest Neighbour | 2 | 4 | 1.0 | 0.52 | 0.08 | 0.04 |
| Naïve Bayes | 2 | 6 | 1.0 | 0.93 | 0.27 | 0.26 |
| Attack Recall (Sensitivity) | | | | | | |
| Neural Network | 5,37,77,82,83,87,113 | 159 | 1.0 | 0.52 | 1.48 | 0.77 |
| Attack Precision | | | | | | |
| Neural Network | 2,5,22,37,77,82,83,87,113 | 173 | 1.0 | 0.35 | 1.46 | 0.51 |
| Attack AUC | | | | | | |
| Neural Network | 40,44,83,84,93,109 | 182 | 0.67 | 1.0 | 0.49 | 0.73 |
3.2 Training Data Attributes
There are several training data attributes that are hypothesized to affect vulnerability to attacks4,23,28,36,38,39,41–43,47,58,59. Overall, the dimensionality of the data matters28,36,38,39,47,58. This is most often discussed in reference to the number of classes in classification tasks, with higher numbers associated with higher privacy risk, but the concept extends to the independent variables23,36,59. There are two main explanations. First, in terms of the dependent variable, a larger number of classes translates into decision boundaries that are positioned closely around the training instances, which increases the probability of a single instance having an observable impact on the model's decision boundary36. More broadly, higher-dimensional data means a larger matrix of potential values over which the training data are distributed36. If the independent variables are too high-dimensional relative to the number of training records, there is a greater likelihood of unique combinations of variable values among members of the training dataset36. Second, a higher number of classes requires the training algorithm to use more information to distinguish between classes, increasing the number of variables that are important in determining the class of a given instance and, subsequently, the amount of information about each individual that can be extracted from the model23.
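The uniqueness argument can be illustrated with a small simulation: as more variables are included, the share of records with a unique combination of values rises quickly. The sketch below uses synthetic categorical data with illustrative parameters, not data from any reviewed study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_records = 1000

# Synthetic categorical data: each added variable multiplies the space of
# possible value combinations, so records become more likely to be unique.
data = rng.integers(0, 5, size=(n_records, 10))   # 10 five-level variables

for n_features in (2, 4, 6, 8, 10):
    subset = data[:, :n_features]
    _, counts = np.unique(subset, axis=0, return_counts=True)
    unique_share = np.sum(counts == 1) / n_records
    print(f"{n_features} features: {unique_share:.1%} of records are unique")
```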
Partitioned density was used to combine the effects of the number of records, classes, and features reported in the literature, accounting for the relationship between these factors. The following categories were defined with recursive partitioning: accuracy: < 6.7, 6.7–16.8, 16.8–33.1, ≥ 33.1; recall and precision: < 4.2, 4.2–13.7, 13.7–25.6, ≥ 25.6; and AUC: < 4.3, 4.3–12.6, 12.6–25.1, ≥ 25.1. Data were insufficient for analysis of the F1 score. Table 4 shows the estimated effect of partitioned density on attack performance, adjusted for target model type and generalization gap, with training dataset and study as random effects. Data were insufficient for precise estimation of the effect of partitioned density on attack performance for most model types; however, the evidence shows reduced attack performance with increasing partitioned density for all model architectures, to varying magnitudes (Table 4). The largest effects were seen for decision trees, followed by regression models and neural networks (Table 4). Within each model architecture, the estimated importance of generalization gap and partitioned density for attack performance is shown in Table 5. Partitioned density was associated with a greater reduction in residual sum of squares (RSS) than generalization gap, indicating greater importance in predicting attack performance for all model architectures and performance metrics, except for neural networks when performance was measured with accuracy or AUC (Table 5).
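For readers interested in the general form of these adjusted models, the sketch below fits a mixed-effects regression of attack performance on partitioned density, generalization gap, and model type with a random intercept per study, using statsmodels on synthetic data. The column names, category labels, and single grouping factor are illustrative assumptions, not the exact specification used in the meta-analysis (which also treated the training dataset as a random effect).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the extracted scenario-level data: one row per
# experiment scenario with attack accuracy (%), the partitioned-density and
# generalization-gap categories, the target model type, and the source study.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "attack_accuracy": rng.normal(65, 10, n),
    "partitioned_density": rng.choice(["<6.7", "6.7-16.8", "16.8-33.1", ">=33.1"], n),
    "generalization_gap": rng.choice(["<17.4", "17.4-34.1", ">=34.1"], n),
    "model_type": rng.choice(["neural network", "decision tree", "regression"], n),
    "study_id": rng.integers(1, 28, n),
})

# Fixed effects for partitioned density, generalization gap, and model type;
# random intercept per study (MixedLM accepts one grouping factor here, so
# this is a simplification of the full random-effects structure).
model = smf.mixedlm(
    "attack_accuracy ~ C(partitioned_density) + C(generalization_gap) + C(model_type)",
    data=df,
    groups=df["study_id"],
)
print(model.fit().summary())
```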
Another attribute of training data hypothesized to influence privacy risk is the distribution of variables4,23,39,41–43,59. Skewed data, characterized by an imbalance of records across classes or categories, or by long-tailed distributions of continuous variables, are associated with increased privacy risk, particularly for records in the minority classes or with rare values4,23,39,41,42,59. The theoretical explanation is that the small number of records with those values leads to overfitting and/or memorization23,39,41,42. The extent to which imbalanced groups or skewed distributions lead to overfitting or memorization depends on the model architecture and can be modified by training parameters39,41,42. Finally, a concept related to both previous hypotheses for classification tasks is the degree of variation within and between classes36. Less variation within classes is thought to be associated with reduced privacy risk, as single instances are less likely to impact decision boundaries, compared with situations where classes are composed of individuals with varied values36. Conversely, a low level of variation between classes may result in higher privacy risk, since tighter decision boundaries may be more influenced by single data points23,36,39. The data required to estimate the effect of these factors in the meta-analysis were not reported in the reviewed literature.
3.3 Model Architecture
Model architecture is the underlying structure of the model, which determines how input data are processed to make predictions or classifications, and it has a strong influence on privacy risk12,23,35,36,41,42,45,60. There are several theories on why model architecture contributes to privacy risk. First, architecture determines the complexity and capacity of the model12,23,41,42,45,60. This theory is particularly applicable to neural networks, and specifically to deep or convolutional neural networks, which are designed to capture highly complex relationships in the training data12,23,41,42,45,60. Second, model architecture influences how the model determines optimal decision boundaries36. Architectures that are more sensitive to specific data points or distributions are more vulnerable to privacy attacks36,38. Conversely, if a model's decisions are unlikely to be affected by the presence or absence of particular inputs, it will be more resilient to attacks36,38. For example, a naïve Bayes algorithm estimates the probability of a given class for each variable independently, so, assuming sufficient sample size, single training data members have only marginal influence on those probabilities36. Conversely, models such as decision trees or support vector machines (SVMs) are highly influenced by specific data points36. An SVM works by identifying the specific data points that define the boundary between classes, known as support vectors, which are therefore intrinsic to the model and can be exploited in privacy attacks36. Similarly, decision trees use the values of input variables and the relationships between them to determine the decision boundaries36. With this architecture, a unique combination of features may lead to a new branch, or otherwise modify the decision of the model36. Third, some model architectures have known assumptions and error distributions, which an adversary can leverage when evaluating model predictions during an MIA35. This applies to models in the regression family, where the error distribution is expected to be Gaussian or Bernoulli.
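The point about support vectors can be made concrete: after fitting, a scikit-learn SVM stores the boundary-defining training records verbatim in the model object. The toy example below is illustrative only; the dataset and kernel choice are assumptions, not drawn from the reviewed studies.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Toy dataset standing in for a training set containing personal records.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = SVC(kernel="linear").fit(X, y)

# The fitted model retains the boundary-defining training records verbatim:
# anyone with access to the model object can read these rows back out, which
# is one reason SVMs are considered exposed to membership inference.
print(clf.support_vectors_.shape)   # rows of X kept inside the model
print(clf.support_[:5])             # indices of those training records
```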
Evidence from this analysis was consistent with these explanations, with the largest effects observed for decision trees, regression models, and neural networks, modified by partitioned density and generalization gap (Tables 3 and 4). The most robust estimates of model-specific effects were for attack accuracy. Relative to neural networks, decision trees and regression models were associated with adjusted increases in attack accuracy of 15.5% (95%CI: 5.5%, 25.4%) and 18.5% (95%CI: 2.0%, 35.0%), respectively, for sparse data (partitioned density < 6.7). Attack accuracy was also higher on regression models relative to neural networks at generalization gaps ≥ 34.1% (12.3%; 95%CI: -5.2%, 29.9%). Conversely, at generalization gaps ≥ 34.1%, decision trees had lower attack accuracy than neural networks (-1.6%; 95%CI: -11.1%, 7.7%). However, the confidence intervals leave uncertainty about the direction and magnitude of these effects.