Models such as MobileNet, ResNet-152, and ViT achieve excellent performance on lung radiograph classification
MobileNet [28,29], ResNet-152 [30], ViT [31,32], an ordinary convolutional neural network [33], and AlexNet [34] are widely used for image classification and achieve high accuracy when classifying lung X-ray images for disease. Our classification task assigns lung radiographs to two categories, 'disease' and 'no disease' (Figure 1b). The 'disease' category covers common lung diseases such as Atelectasis, Cardiomegaly, Effusion, Infiltration, Mass, Nodule, Pneumonia, and Pneumothorax, alone or in combination. We trained the lightweight MobileNet (Figure 1a), ResNet-152, ViT, the ordinary convolutional neural network, and AlexNet on the dataset provided by the ChestX-ray8 database, and the resulting classifiers achieved high accuracy. The mean area under the receiver operating characteristic curve (AUROC) reaches 0.964 for MobileNet (Table 1), 0.956 for ResNet-152, 0.952 for ViT, 0.761 for the ordinary CNN, and 0.785 for AlexNet; MobileNet attains the highest accuracy. These models also generalize well: applied to external validation sets, the classifiers correctly categorize typical images based on the scores assigned to the X-ray photographs. Overall, these models classify disease very well and maintain a high level of accuracy.
Table 1: AUROC of each model in lung radiograph classification

| Model | MobileNet | ResNet-152 | Vision Transformer | Ordinary CNN | AlexNet |
|-------|-----------|------------|--------------------|--------------|---------|
| AUROC | 0.964     | 0.956      | 0.952              | 0.761        | 0.785   |
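As an illustration of the evaluation pipeline, the following is a minimal sketch (assuming PyTorch, torchvision, and scikit-learn; the data loader and training loop are omitted, and all names are placeholders rather than our exact code) of scoring a MobileNet binary classifier with AUROC:

```python
# Minimal sketch: adapt torchvision's MobileNetV2 for binary chest X-ray
# classification and compute AUROC over a validation loader.
import torch
import torch.nn as nn
from torchvision import models
from sklearn.metrics import roc_auc_score

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = models.mobilenet_v2(weights="IMAGENET1K_V1")
model.classifier[1] = nn.Linear(model.last_channel, 2)  # 'disease' / 'no disease'
model = model.to(device)

@torch.no_grad()
def evaluate_auroc(model, loader):
    """Score every image and compute AUROC against the binary labels."""
    model.eval()
    scores, labels = [], []
    for x, y in loader:
        logits = model(x.to(device))
        # probability assigned to the 'disease' class
        scores.extend(torch.softmax(logits, dim=1)[:, 1].cpu().tolist())
        labels.extend(y.tolist())
    return roc_auc_score(labels, scores)
```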
Adversarial attacks have a large impact on the models
We apply a variety of adversarial attacks [27] to the models to ensure the rigor of the experiments. These methods include the white-box attacks [39] FGSM [35] (Figure 2c), FAB [36], PGD [37], and AutoAttack [38]; the black-box attack [41] AdvDrop [40]; and Square [42], a query-based, gradient-free attack. The effect of these attacks is clearly visible in the images (Figure 2a): after an attack, the amount of noise in the image rises significantly, and the amount of noise depends on the attack strength ε. The larger ε is, the more noise there is (Figure 2d), and this interference inevitably degrades the accuracy of the classifier. We attack each model with the strategies described above at different attack strengths (ε = 0.5e-3, 1.0e-3, 1.5e-3) to measure the impact on each model. Under the FGSM attack, as the attack strength increases, the AUROC of MobileNet falls to 0.895, 0.764, and 0.455; ResNet-152 falls to 0.847, 0.618, and 0.375; ViT falls to 0.931, 0.873, and 0.798; the ordinary convolutional neural network falls to 0.614, 0.446, and 0.215; and AlexNet falls to 0.642, 0.496, and 0.287 (Table 2). The accuracy of every model clearly drops substantially under adversarial attack. In our experiments, ViT's accuracy decreases the least, which we attribute to its architecture: ViT is based on the transformer, whereas the other models are convolutional neural networks. Among the convolutional models, MobileNet is the most robust. Collectively, these models are significantly affected by adversarial attacks.
Table 2: AUROC of each model under the default adversarial attack (FGSM)

| ε        | MobileNet | ResNet-152 | Vision Transformer | Ordinary CNN | AlexNet |
|----------|-----------|------------|--------------------|--------------|---------|
| 5.00E-04 | 0.895     | 0.847      | 0.931              | 0.614        | 0.642   |
| 1.00E-03 | 0.764     | 0.618      | 0.873              | 0.446        | 0.496   |
| 1.50E-03 | 0.455     | 0.375      | 0.798              | 0.215        | 0.287   |
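For reference, FGSM admits a very compact implementation. The sketch below (assuming a PyTorch model and inputs scaled to [0, 1]; not our exact experimental code) perturbs each image by ε times the sign of the loss gradient; libraries such as torchattacks expose FGSM, PGD, FAB, Square, and AutoAttack behind a similar interface.

```python
# Minimal FGSM sketch (Goodfellow et al.): perturb each image by the sign of
# the loss gradient with respect to the input, scaled by the strength eps.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=1.0e-3):
    """Return adversarial examples x + eps * sign(grad_x loss)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()  # keep pixels in the valid range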
The robustness of convolutional neural network models can be improved by adversarial training
Since the accuracy of the convolutional neural networks can drop sharply under strong perturbations, we tried to address this problem with adversarial training. In adversarial training, attacks such as FGSM are applied directly to the X-ray images in the training set, so that the noise is learned as part of model training; the model thereby learns to ignore such noise during subsequent image recognition, improving accuracy (Figure 2b). For FGSM at different attack strengths (ε = 0.5e-3, 1.0e-3, 1.5e-3), adversarial training raises the AUROC of MobileNet to 0.905, 0.783, and 0.469 and that of ResNet-152 to 0.851, 0.636, and 0.385 compared with before. The ordinary convolutional neural network increases to 0.821, 0.568, and 0.239, and AlexNet to 0.845, 0.710, and 0.399 (Table 3). These data show that adversarial training can improve the robustness of the convolutional neural network models to some extent.
Table 3: AUROC of each adversarially trained model under the default adversarial attack (FGSM)

| ε        | MobileNet | ResNet-152 | Ordinary CNN | AlexNet |
|----------|-----------|------------|--------------|---------|
| 5.00E-04 | 0.905     | 0.851      | 0.821        | 0.845   |
| 1.00E-03 | 0.783     | 0.636      | 0.568        | 0.710   |
| 1.50E-03 | 0.469     | 0.385      | 0.239        | 0.399   |
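A minimal sketch of this training scheme, assuming PyTorch and reusing the fgsm_attack helper sketched earlier (a simplified illustration, not our exact experimental code; one could also mix clean and adversarial batches):

```python
# Minimal adversarial-training sketch: each batch is perturbed with FGSM
# before the usual training step, so the model learns to classify noisy
# images correctly. Assumes fgsm_attack from the earlier sketch.
import torch
import torch.nn.functional as F

def adversarial_train_epoch(model, loader, optimizer, eps=1.0e-3, device="cpu"):
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = fgsm_attack(model, x, y, eps=eps)  # craft adversarial batch on the fly
        optimizer.zero_grad()  # clear stray gradients left by the attack step
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
```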
MobileNet is the convolutional neural network model with the best robustness against adversarial attacks
Among the convolutional neural network models (MobileNet, ResNet-152, the ordinary convolutional neural network, and AlexNet), MobileNet's accuracy decreases noticeably less than the others under the various adversarial attacks. To validate this result, we used the six adversarial attacks described above and selected 1000 images from the dataset for an attack success rate (ASR) test. Both without an adversarial attack and under attacks of different strengths (ε = 0.5e-3, 1.0e-3, 1.5e-3), and for both the baseline and the adversarially trained models, MobileNet has a lower ASR than the other models in most of the experiments (Table 4), suggesting that MobileNet is less susceptible to attack and more stable. We also tested the adversarial training proposed above and found that the ASR of every adversarially trained model is reduced (Table 5); here MobileNet retains some advantage over the other models, although it is not especially pronounced. We speculated that these differences might stem from the models' differing parameter counts. We therefore ran the same experiments on GhostNet [43], a lightweight convolutional neural network like MobileNet but with more parameters. The results show that GhostNet, with its larger parameter count, has a higher ASR than MobileNet and achieves lower accuracy under attacks of different strengths. Because the attack strengths used so far are small, we also tested a stronger attack with ε = 0.1 (Table 4, Table 5). This attack severely degraded the accuracy of all models, but attacks at this strength are of limited practical relevance because the perturbation is strong enough to be visible to the naked eye. By contrast, our earlier attack strengths (ε = 0.5e-3, 1.0e-3, 1.5e-3) are difficult to detect with the naked eye and are therefore much more interesting to study.
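For clarity, the sketch below shows one common definition of ASR, the fraction of initially correctly classified images that an attack flips, assuming PyTorch; the exact bookkeeping in our experiments may differ.

```python
# Minimal ASR sketch: among images the model classifies correctly before the
# attack, count the fraction it misclassifies after the attack.
import torch

@torch.no_grad()
def _predict(model, x):
    return model(x).argmax(dim=1)

def attack_success_rate(model, attack, loader, device="cpu"):
    flipped, correct = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        clean_pred = _predict(model, x)
        mask = clean_pred == y               # only count initially correct images
        x_adv = attack(model, x, y)          # e.g. the fgsm_attack sketch above
        adv_pred = _predict(model, x_adv)
        flipped += (adv_pred[mask] != y[mask]).sum().item()
        correct += mask.sum().item()
    return flipped / max(correct, 1)
```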
Reasons for MobileNet's better robustness against adversarial attacks
To investigate why MobileNet is more stable, we measured the adversarial noise produced by white-box attacks such as FGSM for both MobileNet and ResNet-152. Quantitatively, the gradient magnitude of MobileNet is significantly lower than that of ResNet-152; qualitatively, MobileNet's adversarial noise pattern is more ordered, whereas ResNet-152's noise appears more disorganized. This suggests that, unlike ResNet-152, MobileNet focuses more on overall feature learning than on local details (e.g., edges and lines), which reduces its sensitivity to high-frequency perturbations. We also used principal component analysis (PCA) to reduce the dimensionality of the deep activations of ResNet-152 and MobileNet and examine their latent spatial structure. On the raw images of the lung disease classification task, MobileNet shows tighter within-class clustering and larger between-class distances. After an adversarial attack this difference becomes even more pronounced: the latent space of ResNet-152 becomes more dispersed, while MobileNet maintains better clustering. In other words, compared with ResNet-152, MobileNet distinguishes the features of lung radiographs more effectively. We then used the Grad-CAM method to visualize the high-importance regions of ResNet-152 and MobileNet in the input images. In the baseline case, ResNet-152 concentrates on a single portion of the input image, while MobileNet assigns high importance to multiple regions, attending more comprehensively. Under adversarial attack, as the attack strength rises, ResNet-152's region of attention scatters and includes more irrelevant parts of the image, while MobileNet's region of attention remains almost unchanged. Based on these observations, we conclude that among the convolutional neural network models, MobileNet shows better robustness under white-box attacks because it separates the features of lung radiographs more effectively and the important regions it attends to are more stable.
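As an illustration, deep activations can be captured with a forward hook and projected with PCA. The following is a minimal sketch assuming PyTorch and scikit-learn; the layer choice (model.features[-1] for torchvision's MobileNetV2) is given only as an example, not as our exact setup.

```python
# Minimal sketch: project a layer's activations to 2-D with PCA to compare
# the latent structure of clean vs. adversarial batches.
import torch
from sklearn.decomposition import PCA

def extract_features(model, layer, x):
    """Capture one layer's activations with a forward hook."""
    feats = []
    handle = layer.register_forward_hook(
        lambda mod, inp, out: feats.append(out.flatten(1).detach().cpu()))
    with torch.no_grad():
        model(x)
    handle.remove()
    return feats[0]

def pca_2d(features):
    return PCA(n_components=2).fit_transform(features.numpy())

# Example usage (layer choice is illustrative):
# layer    = model.features[-1]
# clean_2d = pca_2d(extract_features(model, layer, x))
# adv_2d   = pca_2d(extract_features(model, layer, x_adv))
```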
Table 4: Attack success rate (ASR) of each model under different adversarial attacks

| Attack     | ε        | MobileNet | ResNet-152 | Ordinary CNN | AlexNet | GhostNet |
|------------|----------|-----------|------------|--------------|---------|----------|
| FGSM       | 5.00E-04 | 11.53%    | 14.61%     | 19.85%       | 17.94%  | 13.69%   |
| FGSM       | 1.00E-03 | 29.64%    | 34.75%     | 40.11%       | 38.62%  | 35.78%   |
| FGSM       | 1.50E-03 | 44.59%    | 46.13%     | 51.68%       | 50.21%  | 48.98%   |
| FGSM       | 1.00E-01 | 67.01%    | 66.71%     | 73.53%       | 71.75%  | 69.34%   |
| PGD        | 5.00E-04 | 15.29%    | 13.45%     | 16.24%       | 17.35%  | 14.28%   |
| PGD        | 1.00E-03 | 36.78%    | 32.94%     | 39.48%       | 38.73%  | 33.76%   |
| PGD        | 1.50E-03 | 53.23%    | 56.32%     | 59.11%       | 55.12%  | 54.29%   |
| PGD        | 1.00E-01 | 67.98%    | 68.82%     | 70.18%       | 67.94%  | 68.26%   |
| Square     | 5.00E-04 | 5.84%     | 4.56%      | 9.46%        | 8.32%   | 7.22%    |
| Square     | 1.00E-03 | 13.96%    | 14.39%     | 17.83%       | 15.47%  | 15.85%   |
| Square     | 1.50E-03 | 25.92%    | 24.85%     | 28.37%       | 23.81%  | 27.95%   |
| Square     | 1.00E-01 | 56.74%    | 55.21%     | 60.16%       | 58.24%  | 54.24%   |
| FAB        | 5.00E-04 | 12.18%    | 14.27%     | 16.25%       | 18.26%  | 17.31%   |
| FAB        | 1.00E-03 | 26.36%    | 25.88%     | 29.64%       | 25.14%  | 30.54%   |
| FAB        | 1.50E-03 | 40.83%    | 47.54%     | 48.17%       | 47.57%  | 41.73%   |
| FAB        | 1.00E-01 | 58.91%    | 57.73%     | 55.78%       | 57.06%  | 57.65%   |
| AutoAttack | 5.00E-04 | 15.87%    | 14.25%     | 14.74%       | 18.39%  | 20.06%   |
| AutoAttack | 1.00E-03 | 35.19%    | 38.24%     | 43.48%       | 39.85%  | 40.17%   |
| AutoAttack | 1.50E-03 | 44.05%    | 48.33%     | 55.94%       | 49.17%  | 50.29%   |
| AutoAttack | 1.00E-01 | 54.38%    | 56.73%     | 58.23%       | 57.96%  | 57.89%   |
| AdvDrop    | 20       | 59.14%    | 57.24%     | 74.25%       | 62.84%  | 58.12%   |
| AdvDrop    | 40       | 67.87%    | 65.82%     | 89.13%       | 81.68%  | 70.64%   |
| AdvDrop    | 60       | 78.14%    | 63.52%     | 88.54%       | 89.25%  | 79.26%   |

For AdvDrop, the strength column gives its own perturbation parameter (20, 40, 60) rather than ε; AdvDrop was not evaluated at ε = 1.00E-01.
Table 5: Attack success rate (ASR) of each adversarially trained model under different adversarial attacks

| Attack     | ε        | MobileNet | ResNet-152 | Ordinary CNN | AlexNet | GhostNet |
|------------|----------|-----------|------------|--------------|---------|----------|
| FGSM       | 5.00E-04 | 11.53%    | 14.61%     | 19.85%       | 17.94%  | 13.69%   |
| FGSM       | 1.00E-03 | 29.64%    | 34.75%     | 40.11%       | 38.62%  | 35.78%   |
| FGSM       | 1.50E-03 | 44.59%    | 46.13%     | 51.68%       | 50.21%  | 48.98%   |
| FGSM       | 1.00E-01 | 67.01%    | 66.71%     | 73.53%       | 71.75%  | 69.34%   |
| PGD        | 5.00E-04 | 15.29%    | 13.45%     | 16.24%       | 17.35%  | 14.28%   |
| PGD        | 1.00E-03 | 36.78%    | 32.94%     | 39.48%       | 38.73%  | 33.76%   |
| PGD        | 1.50E-03 | 53.23%    | 56.32%     | 59.11%       | 55.12%  | 54.29%   |
| PGD        | 1.00E-01 | 67.98%    | 68.82%     | 70.18%       | 67.94%  | 68.26%   |
| Square     | 5.00E-04 | 5.84%     | 4.56%      | 9.46%        | 8.32%   | 7.22%    |
| Square     | 1.00E-03 | 13.96%    | 14.39%     | 17.83%       | 15.47%  | 15.85%   |
| Square     | 1.50E-03 | 25.92%    | 24.85%     | 28.37%       | 23.81%  | 27.95%   |
| Square     | 1.00E-01 | 56.74%    | 55.21%     | 60.16%       | 58.24%  | 54.24%   |
| FAB        | 5.00E-04 | 12.18%    | 14.27%     | 16.25%       | 18.26%  | 17.31%   |
| FAB        | 1.00E-03 | 26.36%    | 25.88%     | 29.64%       | 25.14%  | 30.54%   |
| FAB        | 1.50E-03 | 40.83%    | 47.54%     | 48.17%       | 47.57%  | 41.73%   |
| FAB        | 1.00E-01 | 58.91%    | 57.73%     | 55.78%       | 57.06%  | 57.65%   |
| AutoAttack | 5.00E-04 | 15.87%    | 14.25%     | 14.74%       | 18.39%  | 20.06%   |
| AutoAttack | 1.00E-03 | 35.19%    | 38.24%     | 43.48%       | 39.85%  | 40.17%   |
| AutoAttack | 1.50E-03 | 44.05%    | 48.33%     | 55.94%       | 49.17%  | 50.29%   |
| AutoAttack | 1.00E-01 | 54.38%    | 56.73%     | 58.23%       | 57.96%  | 57.89%   |
| AdvDrop    | 20       | 59.14%    | 57.24%     | 74.25%       | 62.84%  | 58.12%   |
| AdvDrop    | 40       | 67.87%    | 65.82%     | 89.13%       | 81.68%  | 70.64%   |
| AdvDrop    | 60       | 78.14%    | 63.52%     | 88.54%       | 89.25%  | 79.26%   |

As in Table 4, the AdvDrop strength column gives its own perturbation parameter (20, 40, 60) rather than ε, and AdvDrop was not evaluated at ε = 1.00E-01.