3.1. Data Collection and Preprocessing
The dataset comprised 1,000 breast-lesion ultrasound images collected from open databases and clinical archives. The ultrasound images come from 600 female patients aged between 25 and 75 and were collected in 2018. The collection contains 780 images in PNG format with an average size of 500x500 pixels [8]. All images were resized to a standardized resolution of 64x64 pixels to ensure uniformity across the dataset. We fine-tuned the YOLOv8 model on this dataset for 50 epochs using a batch size of 64.
The dataset was balanced between benign and malignant cases, and the lesions in both classes vary in shape, size, and texture. Preprocessing included noise reduction, contrast improvement through median filtering, and resizing to standardize the images, all intended to improve model performance. Data augmentation (rotation, flipping, and zoom) was applied to the training set to add variability and reduce overfitting. Figure 2 shows sample images, and Table 1 summarizes the dataset, listing the number of images per class and the augmentation methods applied [9],[10].
Table 1
A summary of the used dataset
| Lesion Type | Number of Images | Data Augmentation Techniques |
| --- | --- | --- |
| Benign | 500 | Rotation, Flip, Zoom |
| Malignant | 500 | Rotation, Flip, Contrast Adjustment |
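The preprocessing and augmentation steps described above can be illustrated with a small, dependency-free sketch. It is not the pipeline used in this work: the 3x3 filter window, the nearest-neighbour resizing, and all function names are assumptions made for illustration, and an image is represented simply as a list of rows of grayscale values. Zoom augmentation is omitted for brevity.

```python
def median_filter3(img):
    """3x3 median filter for noise reduction.

    Border pixels use only the neighbours that exist; for even-sized
    windows the upper median is taken so values stay integers.
    """
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = sorted(
                img[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            )
            row.append(window[len(window) // 2])
        out.append(row)
    return out


def resize_nearest(img, size=64):
    """Resize to size x size pixels with nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def horizontal_flip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]
```

In practice an image library would normally replace these hand-written loops; the sketch only makes the order of operations explicit (filter, resize, then augment the training images).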
3.2. YOLOv8 Model Architecture
YOLOv8 represents a significant advancement in the YOLO series, improving both accuracy and speed. It introduces anchor-free detection and a more efficient backbone, allowing it to process images in real time with minimal computational overhead. The model's architecture consists of a feature extraction network followed by detection layers that predict the presence and location of objects within the image.
The network was fine-tuned for breast cancer detection using transfer learning, starting from a model pre-trained on the COCO (Common Objects in Context) dataset [10].
The architecture of the YOLOv8 model is illustrated in Fig. 3, which shows the flow of information from input to prediction.
The training process involved optimizing the model's hyperparameters, including the batch size, learning rate, and number of epochs. Early stopping was employed to prevent overfitting, and model checkpoints were saved based on validation loss [11].
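A fine-tuning run with the settings reported in Section 3.1 (50 epochs, batch size 64, 64x64 inputs) might be configured as in the sketch below. It assumes the Ultralytics Python package is used; the early-stopping patience of 10 epochs is an illustrative assumption, and the checkpoint file and dataset path are left as parameters because the text does not name them.

```python
# Settings reported in Section 3.1; `patience` is an assumed value.
TRAIN_ARGS = {
    "epochs": 50,    # number of training epochs
    "batch": 64,     # batch size
    "imgsz": 64,     # input resolution (64x64 pixels)
    "patience": 10,  # assumption: stop after 10 epochs without improvement
}


def fine_tune(weights, data):
    """Fine-tune a pre-trained YOLOv8 checkpoint on the ultrasound dataset.

    `weights` is the path or name of a pre-trained checkpoint and `data`
    points to the dataset; both are placeholders supplied by the caller.
    """
    from ultralytics import YOLO  # third-party package, imported lazily

    model = YOLO(weights)
    return model.train(data=data, **TRAIN_ARGS)
```

Keeping the hyperparameters in a single dictionary makes it easy to vary the learning rate, batch size, or epoch count between runs while leaving the training call unchanged.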
3.3. Web-based Accessibility: Convenience at Your Fingertips
One of the defining characteristics of this project lies in its web-based architecture. This strategic choice prioritizes accessibility and user convenience. The platform will be accessible through any standard web browser, without the need for users to download and install additional software. This approach caters to a wide range of users, from healthcare professionals in well-equipped medical facilities to individuals with limited access to specialized technology.
Uploading ultrasound images for analysis will be straightforward: a user-friendly front end requires minimal technical knowledge. The flowchart of the process for classifying an ultrasound image as benign or malignant is presented in Fig. 4.
This design consideration ensures that the platform's benefits can be widely disseminated, fostering inclusivity and democratizing access to this sophisticated diagnostic tool as shown in Fig. 5.A and Fig. 5.B.
Figure 4.B demonstrates that patients can easily upload their medical scans with the "Upload your scan" button. The predict button is enabled after the image passes verification of authenticity and quality. The predict button then runs the analysis on the uploaded scan, predicts the cancer type and characteristics, and outputs the prediction results.
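The text does not specify how an upload is verified before the predict button is enabled. As one possible illustration only, the sketch below checks that the uploaded bytes begin with a valid PNG header and that the image is not smaller than the 64x64 model input. The function name `check_upload`, the `min_side` threshold, and the returned messages are all hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def check_upload(data, min_side=64):
    """Return (ok, reason) for the raw bytes of an uploaded file.

    Only the first 24 bytes are inspected: the 8-byte PNG signature,
    the IHDR chunk header, and the width and height fields.
    """
    if len(data) < 24 or data[:8] != PNG_SIGNATURE:
        return False, "not a PNG file"
    if data[12:16] != b"IHDR":
        return False, "missing IHDR chunk"
    width, height = struct.unpack(">II", data[16:24])
    if min(width, height) < min_side:
        return False, "image too small"
    return True, "ok"
```

A front end could enable the predict button only when the first element of the returned pair is true. A production check would also need to decode the full image and assess its quality, which this sketch does not attempt.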
3.4. Training of the neural network model
The training of the neural network is analyzed using the training diagram in Fig. 6.
The diagram plots the training progress of the model. The x-axis shows the number of training epochs (one epoch is one complete pass through the entire training dataset), and the y-axis shows the loss and accuracy metrics. There are four subplots: two for the loss functions and two for the accuracy metrics (top-1 and top-5).
Loss Functions
- Train/Loss: This graph shows how the model's loss on the training data changes over the epochs. A falling curve means the model is learning and its performance on the training data is improving.
- Val/Loss: This graph shows the model's loss on the validation dataset. It must be watched closely, because a rising validation loss signals overfitting: the model becomes increasingly specialized to the training data and performs poorly on new data. The typical sign is a validation loss that increases while the training loss keeps decreasing.
Accuracy Metrics
- Metrics/Accuracy_top1: This curve shows the percentage of predictions in which the highest-scoring predicted class matches the ground-truth label. A higher curve indicates better overall performance.
- Metrics/Accuracy_top5: This curve shows the percentage of predictions in which the correct class is among the five most probable predicted classes. It is a frequently used measure in multi-class classification because it also gives credit for lower-ranked predictions.
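The two accuracy metrics can be computed as in the following sketch. The function name and the example scores are hypothetical. Note that if the classifier has only the two classes listed in Table 1 (benign and malignant), the correct class is always among the top five, so top-5 accuracy is 1.0 by construction and only top-1 accuracy is informative.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    `scores` holds one list of per-class scores per sample and
    `labels` holds the index of the true class for each sample.
    """
    hits = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(sample_scores)),
                        key=lambda i: sample_scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical scores for three images over two classes (0 = benign, 1 = malignant).
example_scores = [[0.9, 0.1], [0.4, 0.6], [0.7, 0.3]]
example_labels = [0, 1, 1]
top1 = top_k_accuracy(example_scores, example_labels, k=1)
top5 = top_k_accuracy(example_scores, example_labels, k=5)
```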
Observations
- Training Loss: The training loss decreases steadily, which indicates that the model is learning from the training data.
- Validation Loss: The validation loss first decreases and then plateaus, suggesting that the model may have started to overfit.
- Accuracy: Both top-1 and top-5 accuracy increase over the epochs, indicating that the model classifies examples more accurately as training progresses.
- Smoothness: The "smooth" line in the validation loss plot is likely a smoothed version of the raw loss curve, which makes the trend easier to see.
Overall, the plots reflect a normal training process in which performance improves on both the training and validation data; the risk of overfitting remains, however, and should continue to be monitored [10].
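The smoothing of a loss curve and the early-stopping rule discussed above can be sketched as follows. This is a generic illustration rather than the exact procedure of the training framework: the exponential moving average, the smoothing factor `alpha`, and the `patience` value are assumptions.

```python
def smooth(values, alpha=0.3):
    """Exponential moving average of a loss curve (alpha is an assumed factor)."""
    out = []
    for v in values:
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out


def early_stop_epoch(val_losses, patience=3):
    """Return the epoch index at which training would stop, or None.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    stale = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None
```

For a validation loss that falls and then plateaus or rises, as described in the observations, `early_stop_epoch` returns an epoch shortly after the minimum, and the checkpoint saved at the minimum would be the one to keep.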
3.5. Evaluation Metrics
The performance of the YOLOv8 model was benchmarked using precision, recall, F1-score, and accuracy. In addition, confusion matrices were generated to provide insights into the model's classification performance. The evaluation was conducted on a test set that was not used during training or validation, so that the results reflect the model's ability to generalize to new data [12].
Evaluation metric descriptions:
- Precision: The proportion of positive predictions made by the model that are correct.
- Recall: The proportion of actual positive cases that the model correctly identifies.
- F1-score: The harmonic mean of precision and recall, which provides a balanced measure of the two.
- Accuracy: The overall percentage of correct predictions, both positive and negative.
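The four metrics follow directly from the entries of a binary confusion matrix, as the sketch below shows. The function name is illustrative, the counts in the example are hypothetical and are not the results of this study, and the sketch assumes that none of the denominators is zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1-score, and accuracy from confusion-matrix counts.

    tp / fp: true / false positives (e.g. predicted malignant)
    fn / tn: false / true negatives (e.g. predicted benign)
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical counts for illustration only.
example = classification_metrics(tp=90, fp=10, fn=10, tn=90)
```

With these example counts every metric equals 0.9, because the errors are symmetric; in general the four values differ, which is why they are reported separately.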
Table 2 presents the model results and performance metrics for the YOLOv8 model compared to other popular CNN models used in breast cancer detection.
Table 2
The model results and performance metrics
| Model | Precision | Recall | F1-Score | Accuracy |
| --- | --- | --- | --- | --- |
| YOLOv8 | 0.93 | 0.92 | 0.92 | 90% |
| ResNet50 | 0.87 | 0.85 | 0.86 | 88% |
| VGG16 | 0.83 | 0.81 | 0.82 | 85% |