Breast cancer is the second most common cancer overall and the leading malignancy among women. Breast cancers are those that begin in the breast itself, typically in the lobules that produce the milk that flows through the ducts. It is the second most frequent non-skin carcinoma among women worldwide (after lung cancer) and the fifth leading cause of cancer death, accounting for 10.4% of all cancer diagnoses among women[11]. In India in 2020, breast cancer accounted for 13.5% (178,361) of all cancer cases and 10.6% (90,408) of cancer fatalities[41]. The DNA and RNA of cancer cells are similar to, but not identical to, those of the organism from which they arose. As a result, the immune system, especially if compromised, does not typically recognise them.
Cancer forms either when the immune system is not functioning properly or when there are too many abnormal cells for the immune system to destroy. An unfavourable environment (radiation, toxins, etc.), poor diet (an unhealthy cell environment), inherited genetic predispositions, and old age (over 80) can all contribute to an abnormally high mutation rate in DNA and RNA[41], [42]. Several distinct malignancies can arise in the various breast tissues, although benign (non-malignant) breast alterations cause the vast majority of tumours. Fibrocystic change is a noncancerous condition that can affect women and is characterised by cysts (collections of fluid), fibrosis (the production of scar-like connective tissue), lumpiness, areas of thickening, discomfort, or breast pain[43].
3.2 Techniques of screening for Breast Cancer
Methodological, clinical, and ethical difficulties arise when attempting to evaluate screening techniques, especially in a community setting.
The best way to evaluate a novel screening test is through randomised clinical trials: women randomly assigned to receive a novel breast cancer screening test are compared with women given the standard of care.
However, such trials are challenging to carry out. They must track tens of thousands of women for a minimum of 15 years. Establishing the added efficacy of new tests is likely to be even harder, given that mammography has already been found to be successful in some settings. Finally, it may be harder to determine screening's impact on breast cancer death rates because breast cancer treatment has improved over time[47]–[49].
Because of these obstacles, researchers generally focus on characterising new screening assays before determining their impact on patient outcomes such as breast cancer mortality. Sensitivity, specificity, safety, affordability, simplicity, and patient and clinician acceptability are all crucial test properties. We summarise the research done on the various new testing modalities and the results, and detail the research methodology and outcomes evaluated for each screening tool. Although cost-effectiveness analyses should be considered when contemplating a population-based diagnostic exam, they are beyond the scope of this review.
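The test characteristics mentioned above can be made concrete with a short sketch. The counts below are invented for illustration only, not taken from any cited study.

```python
def screening_metrics(tp, fp, fn, tn):
    """Basic test characteristics of a screening modality.

    tp/fp/fn/tn: true/false positives and negatives from a screening study.
    """
    sensitivity = tp / (tp + fn)  # share of cancers the test flags
    specificity = tn / (tn + fp)  # share of healthy women it clears
    ppv = tp / (tp + fp)          # chance a positive result is a true cancer
    return sensitivity, specificity, ppv

# Hypothetical counts for 1000 screened women:
sens, spec, ppv = screening_metrics(tp=80, fp=50, fn=20, tn=850)
```

Note how even a modest false-positive count pulls the positive predictive value well below the sensitivity, which is one reason specificity matters so much in population screening.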
If a screening test is going to be utilized in a community environment, it needs to be evaluated there first.
Women with a greater-than-average chance of breast cancer, or those who present to a diagnostic setting with breast complaints or detected anomalies in the breasts, are typically used to evaluate the test characteristics of new modalities. The sensitivity and specificity a test demonstrates in these high-risk patients may differ from those reported when the same test is used in a general screening sample[50]. It is therefore important to note whether a test has been studied for diagnosis or for screening, and in the latter case, whether the study was conducted on women deemed to be at elevated risk or not[51]. Various breast cancer screening techniques are shown in Fig. 20.
3.2.1 Mammography
Common methods of breast cancer screening include self-examination, examination by a medical professional, and mammography[46], [52]. Digital mammograms have become the standard for diagnosing breast cancer. Rather than recording X-rays on film, digital mammography employs solid-state detectors that transform X-rays into electrical signals, the same kinds of detectors used in digital cameras. This technique is also known as full-field digital mammography (FFDM). Images of the breasts, created from these electrical signals, can then be displayed[53].
To assist in interpreting mammograms, experts have developed computer-aided detection (CAD) systems. CAD systems typically pre-read mammograms and flag suspicious areas for further examination by a radiologist[54].
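The flagging step a CAD system performs can be sketched as thresholding a per-pixel suspicion map. The map, threshold, and function below are hypothetical stand-ins, not any cited system's interface.

```python
import numpy as np

def flag_suspicious(prob_map, threshold=0.8):
    """Return (row, col) coordinates whose suspicion score meets the
    threshold, for a radiologist to review."""
    ys, xs = np.where(prob_map >= threshold)
    return list(zip(ys.tolist(), xs.tolist()))

# Toy 5x5 suspicion map with a single hot spot:
pm = np.zeros((5, 5))
pm[2, 3] = 0.95
flags = flag_suspicious(pm)  # [(2, 3)]
```

A real system would group adjacent pixels into regions rather than report raw coordinates, but the review workflow (model scores, threshold, radiologist confirms) is the same.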
Ribli et al.[55] proposed a CAD program based on Faster R-CNN that can automatically determine whether a tumour on a mammogram is malignant or benign. Wang et al.[56] proposed an end-to-end strategy for mammographic diagnostics that avoids the need for manual preparation. In one scenario, a new method combining a Multi-Instance (MI) and a Multi-Scale (MS) module was introduced for processing mammograms.
The MS module selects the most essential characteristics of the mammogram, while the MI module considers the image as a whole. Combining the outputs of the two components yields the improved results.
Heidari et al.[57] established a novel CADx technique based on analysing global mammographic image characteristics, suggesting that an efficient, globally applied image-processing approach to CADx mammography is feasible and a significant improvement over its predecessors. Ekici et al.[56] accomplished thermographic breast cancer screening using a CNN (Convolutional Neural Network).
Five stages were used: data collection, image processing, feature extraction, feature segmentation, and feature classification. In terms of prediction accuracy, the CNN was superior to the other methods compared[58].
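The five-stage pipeline can be sketched as a chain of functions. Every stage below is a toy stand-in, not the implementation from[56]; data collection is assumed to supply the `raw` array.

```python
import numpy as np

# Toy stand-ins for the pipeline stages:
preprocess = lambda x: (x - x.min()) / (x.max() - x.min() + 1e-8)  # image processing
extract    = lambda x: np.array([x.mean(), x.std()])               # feature extraction
segment    = lambda f: f                                           # feature segmentation
classify   = lambda f: int(f[0] > 0.5)                             # feature classification

def thermography_pipeline(raw):
    """Chain the stages: raw thermogram in, class label out."""
    return classify(segment(extract(preprocess(raw))))

label = thermography_pipeline(np.array([[0.1, 0.9], [0.6, 0.8]]))
```

In the cited work each stage is learned or hand-engineered; the point here is only the dataflow from raw image to final label.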
Table 2
Review of studies based on mammography datasets
Dataset | Author | DL Technique | Performance Measure | Year |
INbreast CBIS-DDSM | Shu et al.[59] | CNN | INbreast: Accuracy = 92.2% CBIS: Accuracy = 76.7% | 2020 |
DDSM | Li et al.[60] | CNN-RNN (Recurrent Neural Network) | AUC = 0.968 Accuracy = 94.7%, Recall = 94.1% | 2021 |
MIAS | Agnes et al.[61] | Multiscale All CNN | Accuracy = 96.47% | 2020 |
DDSM | Boumaraf et al.[62] | DBN (Deep Belief Network) | Accuracy = 84.5% | 2020 |
MIAS | Zhang et al.[63] | GNN (Graph Neural Network) + CNN | Accuracy = 96.1% | 2021 |
INbreast DDSM-BCRP | Zhu et al.[64] | FCN + CRF | DDSM-BCRP: Dice = 91.3% INbreast: Dice = 90.97% | 2018 |
MIAS CBIS-DDSM | Ahmed et al.[65] | DeepLab/mask RCNN | Mask RCNN: C: Accuracy = 98% S: MAP = 80% DeepLab: C: Accuracy = 95% S: MAP = 72% | 2020 |
MIAS | Saber et al.[66] | Transfer learning/CNN | F-score = 99.3% Accuracy = 98.87% | 2021 |
MIAS CBIS-DDSM INbreast | Soleimani et al.[67] | CNN | INbreast: Dice = 96.39% CBIS: Dice = 97.69% MIAS: Dice = 97.59% | 2020 |
INbreast CBIS-DDSM | Chen et al.[68] | Modified U-Net | CBIS: Dice = 82.16% INbreast: Dice = 81.64% | 2020 |
INbreast DDSM | Al-antari et al.[69] | YOLO | S: INbreast: F1-score = 98.02% DDSM: F1-score = 99.28% C: DDSM: Accuracy = 97.5% INbreast: Accuracy = 95.32% | 2020 |
Figure 21 presents some mammography images of breast cancer, taken from a Kaggle subset of the Curated Breast Imaging DDSM dataset.
3.2.2 Ultrasound
Ultrasound is highly effective at finding malignancies and helps cut down on unnecessary biopsies[71]. It is therefore no surprise that researchers employ such images in DL models for cancer detection.
For instance, a GoogleNet[23]-based CNN has been trained on potentially malignant ROIs of ultrasound (US) images. The AUC for the method suggested in[78] is 96%, which is 6% better than the CAD-based method using manually engineered features[72]–[74].
Datasets of US images are scarcer and often comprise fewer images than mammography datasets. As a result, most proposed DL models employ some type of data augmentation, such as rotation, to expand the training data and enhance model performance. However, care must be taken when augmenting US images, since doing so incorrectly can reduce the model's accuracy. It has been demonstrated, for instance, that shifting or flipping the image along the horizontal axis can have detrimental effects on model performance[74]. Synthetic US images, with or without tumours, can also be generated using generative adversarial networks (GANs); including these images in the training set can enhance model accuracy.
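A rotation-only augmentation policy of the kind described above might look like the following sketch; horizontal flips and shifts are deliberately left out, since[74] reports they can hurt performance on US images.

```python
import numpy as np

def augment_ultrasound(patch, rng):
    """Rotate a square US patch by a random multiple of 90 degrees.

    Flips/shifts along the horizontal axis are intentionally omitted.
    """
    k = int(rng.integers(0, 4))  # 0, 90, 180 or 270 degrees
    return np.rot90(patch, k)

rng = np.random.default_rng(seed=0)
patch = np.arange(16, dtype=np.float32).reshape(4, 4)
augmented = [augment_ultrasound(patch, rng) for _ in range(8)]
```

Each augmented copy contains exactly the original pixel values, so no anatomy-distorting interpolation is introduced.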
Some techniques have integrated lesion detection and classification in ultrasound images[75]. A variety of DL architectures for US image detection and classification are compared in depth in[76]. Accuracy levels of 85% for whole-image classification and 87.5% for pre-defined ROIs demonstrate that DenseNet is a promising option for US image classification. The authors in[77] trained VGG16, ResNet34, and GoogleNet on a dataset of 1000 unlabelled US images to create a weakly supervised DL system, finding an average AUC of 88%.
Some studies verify DL algorithm performance[78]–[80] against expert interpretation, demonstrating DL algorithms' usefulness to radiologists. Most commonly, an expert first identifies a lesion and a DL model then categorises it. In contrast to mammography research, however, the majority of these studies neither demonstrate the generalizability of their approach across multiple datasets nor undergo independent physician validation.
Table 3
Review of studies based on ultrasound datasets
Dataset | Year | Author | DL Technique | Performance measure |
OASBUD | 2019 | Byra et al.[81] | Transfer learning on InceptionV3 and VGG-19 | VGG19: AUC = 0.822 InceptionV3: AUC = 0.857 |
SNUH BUSI | 2020 | Moon et al.[82] | ResNet+ VGGNet + DenseNet (Ensemble loss) | SNUH: Accuracy = 91.1% AUC = 0.9697 BUSI: Accuracy = 94.62% AUC = 0.9711 |
BUSI | 2020 | Vakanski et al.[83] | CNN | Accuracy = 98% Dice score = 90.5% |
Mendeley UDIAT | 2020 | Singh et al.[84] | CNN | UDIAT: Dice = 86.82% Mendeley: Dice = 93.76% |
1-Ultrasoundcases.info and BUSI 2- UDIAT 3- Radiopaedia | 2021 | Wang et al.[85] | Residual Feedback Network | 1-Dice = 86.91% 2-Dice = 81.79% 3-Dice = 87% |
Private | 2019 | Byra et al.[74] | VGG 19 by Transfer learning | AUC = 0.936 |
Ultrasoundcases.info BUSI STUHospital | 2021 | Wang et al.[86] | CNN | BUSI: Dice = 83.76% Ultrasoundcases: Dice = 84.71% STUHospital: Dice = 86.52% |
Private | 2020 | Fujioka et al.[87] | CNN | AUC = 0.87 |
Private | 2020 | Wu et al.[88] | Random Forest | Accuracy = 86.97% |
Private | 2020 | Gong et al.[89] | Multi-view Deep Neural Network Support Vector Machine (MDNNSVM) | AUC = 0.908 Accuracy = 86.36% |
Private | 2022 | Byra et al.[90] | Y-Net | S: Dice = 64.0% C: AUC = 0.87 |
Figure 22 shows some ultrasound images used for breast cancer classification, taken from a Kaggle dataset.
Figure 22 Ultrasound images: benign, malignant, and normal examples[91].
3.2.3 Magnetic Resonance Imaging
In MRI, DL is most often utilised to perform or assist with the classification, detection, and segmentation of breast lesions, much as in DM, DBT, and US. However, the dimensionality of an MRI scan sets it apart from these other modalities: MRI generates 3D scans, while DM, DBT, and US generate only 2D images. Dynamic contrast-enhanced (DCE) MRI sequences, which track temporal changes such as the uptake and washout of contrast agents, add a fourth dimension. Most DL models built outside the medical field are geared toward 2D images, which can cause problems when they are applied to these 3D or 4D MRI scans.
Several potential answers to this problem have been offered. The most common approach is to convert the 3D images to 2D so that standard 2D DL models can be used, either through a maximum intensity projection (MIP) or by slicing the 3D MR image into 2D slices. Many industry-standard DL models, however, were originally built for colour images with three channels (red, green, and blue), and so expect a three-dimensional input in which the extra dimension holds the colour channels. Because MR scans are monochrome, three slices or MIPs can be joined to fill this input.
This opens the door to using several post-contrast slices or MIPs in a single input image, or using three consecutive slices to form a semi-3D MRI input.
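The two conversions just described (MIP and three-slice stacking) reduce to simple array operations. The sketch below assumes a volume laid out as (slices, height, width); the function names are illustrative, not from any cited work.

```python
import numpy as np

def mip_2d(volume):
    """Maximum intensity projection: collapse (D, H, W) to (H, W) by
    taking the brightest voxel along the slice axis."""
    return volume.max(axis=0)

def three_slice_input(volume, i):
    """Stack slices i-1, i, i+1 into a channel axis, mimicking the RGB
    input that standard 2D CNNs expect."""
    return np.stack([volume[i - 1], volume[i], volume[i + 1]], axis=-1)

vol = np.random.default_rng(1).random((8, 64, 64))  # toy monochrome MR volume
mip = mip_2d(vol)                # shape (64, 64)
rgb = three_slice_input(vol, 4)  # shape (64, 64, 3)
```

The same stacking idea extends to filling the three channels with post-contrast time points instead of adjacent slices.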
Other approaches include using genuine 3D MRI scans directly, extending 2D DL models to 3D data, or turning to dedicated 3D DL models such as a 3D DenseNet. AI algorithms that use the full 3D or 4D breast MRI data are hypothesized to outperform baseline methods that resort to dimension reduction.
Each of the aforementioned methods has been used at some point in lesion classification research using MRI. Several research teams fed in 2D slices of the ROIs[92]–[94] and obtained AUCs between 0.908 and 0.991. Using MIPs, other research[95] has found AUCs of 0.88 and 0.895. Researchers found AUCs between 0.84 and 0.92 when filling the three RGB channels with multiple post-contrast slices[96]–[98]. Finally, research using genuine 3D MRI images has yielded AUC values of 0.852 and 0.859[98], [99]. All of these studies used different datasets, so although the AUCs are comparable and may even appear to decrease as one moves from conventional 2D methods to full 3D techniques, it is important to note that these values cannot be directly compared. Studies that compared their results with radiologists' interpretations, however, did use a variety of methods[97]–[99].
On average, the specificity of the AI models was greater than that of the radiologists, although their sensitivity was about the same or even lower.
Table 4
Review of studies based on MRI datasets
Dataset | Year | Author | DL Technique | Performance measure |
TCIA | 2020 | Zheng et al.[100] | CNN | Accuracy = 97.2% |
Private | 2022 | Liu et al.[101] | Weakly ResNet-101 | Accuracy = 94% AUC = 0.92 |
Private | 2022 | Wu et al.[102] | CNN | Accuracy = 87.7% AUC = 0.912 |
QIN Breast DCE-MRI | 2021 | Carvalho et al.[103] | SegNet and UNet | IOU = 95.3% Dice = 97.6% |
Private | 2021 | Wang et al.[104] | CNN | Dice = 76.4% |
TCGA-BRCA | 2022 | Khaled et al.[105] | 3D U-Net | Dice = 68% |
Private | 2022 | Zhu et al.[106] | V-Net | C: Avg. AUC = 0.84 S: Dice = 86% |
Private | 2022 | Rahimpour et al.[107] | 3D U-Net | Dice = 78% |
Private | 2022 | Yue et al.[108] | Res_U-Net | Dice = 89% |
Private | 2021 | Dutta et al.[109] | Multi-contrast D-R2UNet | F1 score = 95% |
Private | 2022 | Verburg et al.[110] | CNN | AUC = 0.83 |
3.2.4 Digital Breast Tomosynthesis
DBT has become a standard breast imaging technique because of its excellent cancer detection rates. The cancer detection rate (CDR) is higher with DBT than with FFDM, while the recall rate (RR) is lower[111]–[113]. Several DL algorithms for cancer detection on DBT images have been proposed along the same lines[61], [114]–[117]. For instance, to determine whether an image is normal, benign, high-risk, or cancerous, the researchers in[118] developed a ResNet-based DL model. The model was initially trained on an FFDM dataset and subsequently fine-tuned on 2D reconstructions of DBT images gathered using the Highest Density Emission in two dimensions technique. On the DBT dataset, their method had an AUC of 84.7%. A deep CNN operating on DBT volumes was designed in[114] to classify large datasets; the AUC of the proposed procedure was 84.7%, around 2% better than a standard CAD technique using manually engineered features.
Medical image analysis is one area where DL models excel, but they face a significant limitation: a lack of suitable training datasets. Data collection and labelling is a costly endeavour in the medical industry. Some research has attempted to address this issue with transfer learning. The authors of[119] created a two-step transfer learning strategy to classify DBT images as mass or normal. First, a pre-trained AlexNet model was fine-tuned with FFDM data before being trained with DBT images. In the second phase, features of the DBT images were extracted with the CNN model, and a random forest classifier then decided whether each image showed a mass or was normal. Their AUC on the test dataset was 90%. To classify FFDM and DBT images as cancerous or benign, the authors of another study[120] employed a VGG19 network trained on the ImageNet dataset to extract features.
An SVM was then used to evaluate the probability of malignancy based on the retrieved features. On the DBT images, their approach achieved an AUC of 98% in the CC view and 97% in the MLO view. These techniques demonstrate that DL models can achieve satisfactory results even with a modest training dataset when transfer learning strategies are incorporated. The majority of these studies contrast their DL algorithms with more conventional CAD techniques; however, direct comparison with radiologists is the gold standard for assessing a DL method's efficacy. The effectiveness of DL models on DBT and FFDM has, for instance, been studied. This research demonstrates that a DL system can reduce recall rates for FFDM images while maintaining or improving sensitivity relative to radiologists, and that an AI system can match radiologists' behaviour on DBT images.
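The two-stage pattern used in[119] and[120] (a pretrained backbone for features, then a classical classifier) can be illustrated schematically. Everything below is a stand-in: crude global statistics replace the CNN features, a nearest-centroid rule replaces the random forest/SVM, and the data are synthetic.

```python
import numpy as np

def extract_features(images):
    """Stand-in for a pretrained CNN backbone: three global statistics
    per image instead of learned deep features."""
    flat = images.reshape(len(images), -1)
    return np.stack([flat.mean(1), flat.std(1), flat.max(1)], axis=1)

class NearestCentroid:
    """Stand-in for the second-stage classifier (random forest in[119],
    SVM in[120])."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(0) for c in self.classes_])
        return self

    def predict(self, X):
        d = np.linalg.norm(X[:, None] - self.centroids_[None], axis=2)
        return self.classes_[d.argmin(1)]

rng = np.random.default_rng(0)
normal = rng.normal(0.2, 0.05, (20, 16, 16))  # synthetic "normal" patches
mass = rng.normal(0.6, 0.05, (20, 16, 16))    # synthetic "mass" patches
X = extract_features(np.concatenate([normal, mass]))
y = np.array([0] * 20 + [1] * 20)
clf = NearestCentroid().fit(X, y)
train_acc = (clf.predict(X) == y).mean()
```

The design point is the split itself: the feature extractor needs large generic datasets (ImageNet, FFDM) and is reused, while only the small second-stage classifier must be fitted to the scarce DBT data.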
Table 5
Review of studies based on DBT datasets
Dataset | Year | Author | DL Technique | Performance measure |
DBTex challenge | 2022 | Hossain et al.[121] | CNN | Avg. sensitivity = 0.84 |
DBTex challenge | 2022 | Hossain et al.[122] | CNN | Avg. sensitivity = 0.815 |
Private | 2022 | Buda et al. | CNN | Sensitivity = 65% |
BCS-DBT Private | 2022 | Bai et al.[123] | GCN (Graph Convolutional Network) | Accuracy = 84% AUC = 0.87 |
VICTRE | 2022 | Mota et al.[124] | CNN | AUC = 0.941 |
Private | 2021 | Matthews et al.[125] | Transfer learning based on ResNet | AUC = 0.9 |
Private | 2020 | Singh et al.[118] | CNN | AUC = 0.85 |
DBTex challenge | 2021 | Shoshan et al.[126] | CNN | Avg. sensitivity = 0.91 |