1 Introduction
Breast cancer accounts for nearly 15% of cancer-related deaths among women,
underscoring the critical need for advanced research to facilitate early detection,
accurate diagnosis, and effective treatment to enhance survival rates. Mammography
and biopsy stand out as the two primary diagnostic techniques employed for breast
cancer identification. Mammography involves the use of specialized breast imaging
by a radiologist to discern early signs of breast cancer in women. The widespread
adoption of mammography has contributed to a reduction in mortality rates associated
with breast cancer. Another precise diagnostic method is a biopsy, wherein a
pathologist examines a sample of tissue from the affected area of the breast under a
microscope to identify and characterize tumors. Biopsy enables the pathologist to
differentiate between malignant and benign lesion types, with benign anomalies often
stemming from irregularities in epithelial cells that generally do not lead to breast
cancer. Malignant or cancerous cells, on the other hand, exhibit abnormal growth and
division patterns.
Analyzing microscopic images manually becomes exceptionally challenging due to
the uneven presence of malignant and benign cells. Researchers have therefore
proposed various cell classification methods for breast cancer detection in images. In
recent years, Artificial Intelligence (AI) has significantly advanced breast cancer
detection and recognition, aiming to categorize patients into "malignant" or "benign"
groups. Research efforts have explored diverse ML algorithms for breast cancer
classification. For instance, a study [1] utilized the weighted Naïve Bayesian
algorithm, achieving an accuracy of 98.54%. Another work [2] compared six ML
algorithms, using the Wisconsin Diagnostic Breast Cancer dataset, demonstrating
efficient classification of malignant and benign cases.
While traditional ML methods offer effective classification, challenges arise in
accurately detecting tumor subtypes from histopathological images using automated
approaches. A study [3] applied Deep Learning (DL) with Inception and ResNet
architectures to distinguish tiny malignant images and proposed a highly accurate
automatic framework for cancer diagnosis and subtype classification. However, DL
algorithms demand extensive training datasets, posing a challenge due to the limited
accessibility of breast images.
To address this, transfer learning (TL) has been applied to develop a DL model for
breast cancer initial diagnosis. TL involves leveraging a model pre-trained on
ImageNet and adapting it for the prediction, segmentation, or classification of Breast
Cancer Imaging (BCI) [4]. Additionally, a DL approach for categorizing breast
ultrasound images based on transfer learning was presented in [5]. Although
conventional transfer learning methods using an ImageNet pre-trained AlexNet have
demonstrated improved performance, challenges persist, such as low test precision
and undesirable false negatives in the presence of previously unseen examples [6],
limiting their applicability in clinical settings.
This research focuses on optimizing breast cancer classification through advanced
Convolutional Neural Networks (CNNs) and innovative techniques. The initial data
pre-processing addresses noise in breast tissue images, while the proposed data
augmentation method enhances the diversity of the training dataset. Three pre-trained
CNN architectures, including MobileNetV2 and Thin MobileNet, are employed for
feature extraction, complemented by transfer learning for improved model refinement.
The optimization involves the integration of the Osprey Optimization Algorithm
(OOA) to enhance CNN classification performance and reduce computational costs.
The key contributions of this proposed methodology include:
- Introduces a novel data augmentation technique, diversifying the training dataset to improve CNN model accuracy.
- Utilizes fine-tuned MobileNetV2, Thin MobileNet, and Reduced MobileNetV2 for effective feature extraction from breast cancer images.
- Applies transfer learning to refine CNN models, leveraging previously learned characteristics to enhance recognition accuracy.
2 Related work
In recent years, researchers have investigated the use of image analysis methods to
classify breast cancer. Several research works have used image-
processing methods with ML and DL algorithms to distinguish between benign and
malignant conditions. The evaluation of the literature highlights the importance of
conducting more research in this field to develop trustworthy and easy-to-use
diagnostic tools that can help healthcare providers diagnose breast cancer promptly
and accurately. Ragab et al. [7] employed a computer-aided diagnosis (CAD) system
for categorizing tumor types in mammography images of the breast. Features are
extracted using a DCNN and classification is done by an SVM. Additionally, CLAHE
is used to enhance contrast and suppress noise in the mammogram images, and the
region of interest (ROI) is extracted from the mammography using two methods: the
first estimates the ROI manually with circular contours, while the second uses
threshold and region-based approaches. Because it is challenging to gather a
substantial amount of training data, DCNN performance for mammography image
recognition remains uncertain. Furthermore, the DCNN's capacity to obtain a good
feature representation is also unknown because the patterns in mammography images
differ from those in natural images. Therefore, Suzuki
et al. [8] introduced transfer learning in CAD systems to overcome the training
problem of DCNN. Both natural images and mammography images are used for
training. Transfer learning uses a small number of mammography images after
pre-training on many natural photographs. Deniz et al. [9] employed transfer learning
(TL) in the classification of histopathological BCI. Feature extraction is performed
using the AlexNet and VGG16 models, and the feature vectors from both models
form a high-dimensional feature representation. An SVM classifier with homogeneous
mapping is then utilized to categorize the image classes according to the extracted
features. Finally, they fine-tune the AlexNet model by replacing the final three layers
with new layers to adapt it to the breast cancer detection problem.
Farhadi et al. [10] emphasize the potential of structured data for improving the early
detection of malignant breast cancer through advanced ML techniques. Their approach handles
imbalanced data prominent in breast cancer datasets utilizing deep TL on structured
data, as opposed to large image databases.
Zhu et al. [11] introduce a hybrid model merging Convolutional Neural Networks
and Logistic Regression-Principal Component Analysis (LR-PCA) for improved
diagnosis accuracy. The feature extraction is enhanced by using the pseudo-coloured
images which enrich the input data for the DL models. Furthermore, the LR-PCA
overcomes the multicollinearity among high-level deep characteristics derived from
CNNs and the image is pre-processed by the CLAHE and Pixel-wise intensity
alteration. The diagnostic results were enhanced by Samee et al. [12] through the
implementation of a multi-view screening image-processing architecture. Tumor
patches were segmented by a texture-oriented local entropy method, and the results
of the feature extraction process were used to create malignancy markers such as
radius and area. Alruwaili and Gouda [13] employed transfer learning to distinguish
between types of breast cancer, implementing a variety of augmentation approaches
to avoid overfitting and enhance the stability of outcomes. The researchers
skillfully combined a modified ResNet50 with a blend of NASNet and MobileNet
architectures. This strategy allowed them to train network weights effectively on large
datasets while also fine-tuning pre-existing network weights on smaller, more
specialized datasets. Khamparia et al. [14] focus on a hybrid TL technique
combining MVGG and ImageNet to detect breast cancer from mammograms. They
also discuss the advantages of 3D mammography over standard 2D mammograms
in breast cancer screening, although their dataset is small and lacks detail. Aly et al. [15]
pre-process mammograms from DICOM to image format and use YOLO
architectures to detect and classify masses as malignant or benign. Feature
extraction is done using ResNet and Inception, and YOLO-V3 is then introduced by
applying k-means clustering to the dataset. The method was evaluated on a
dataset of 322 FFDMs and attained an accuracy of 96.6% for mass detection and
94.7% for mass classification. Zhang et al. [16] combined two sophisticated neural
networks, a CNN and a graph convolutional network (GCN), enhancing the
identification of malignant lesions in breast mammograms by combining rank-based
stochastic pooling, dropout, and batch normalization with a two-layer GCN.
Khafaga [17] diagnosed breast cancer from thermal images utilizing four pre-trained
CNN architectures; features are then selected with an optimization algorithm to train
the neural network for classification. Yet the optimization method depends on
several factors, which may result in premature convergence and decreased
efficiency. Overall, existing approaches share such limitations, including premature
convergence and decreased efficiency in their optimization methods. Addressing
these gaps motivates the proposed approach, which aims to overcome these
challenges and improve the accuracy and efficiency of breast cancer classification
methodologies.
3 Proposed methodology
This research is centered on optimizing the performance of CNN for breast cancer
classification. The approach involves a comprehensive data pre-processing and
augmentation step, where noise is eliminated, and the CNN's effectiveness is
enhanced through the incorporation of a data augmentation technique. This technique
involves random transformations like rotation, flipping, scaling, and blurring to
diversify the training dataset, thereby improving the classifier's generalizability and
mitigating overfitting; a minimal sketch of this augmentation step is shown below.
The feature extraction process utilizes three pre-trained CNN architectures, which
sets the stage for subsequent model learning.
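For illustration, a minimal sketch of such an augmentation step using Pillow is given below; the transformation ranges and probabilities are assumptions for illustration, not the paper's exact settings.

```python
# A minimal augmentation sketch: random rotation, flipping, scaling, and
# blurring, as described in the text. Ranges/probabilities are assumed.
import random
from PIL import Image, ImageFilter

def augment(img: Image.Image) -> Image.Image:
    # Random rotation within a small angle range.
    img = img.rotate(random.uniform(-30, 30))
    # Random horizontal / vertical flips.
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_TOP_BOTTOM)
    # Random scaling, then resize back to the original shape.
    w, h = img.size
    s = random.uniform(0.9, 1.1)
    img = img.resize((int(w * s), int(h * s))).resize((w, h))
    # Random Gaussian blur.
    if random.random() < 0.5:
        img = img.filter(ImageFilter.GaussianBlur(radius=random.uniform(0, 1.5)))
    return img
```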
The point-wise convolution layer creates new features by combining input channels
through linear combinations. The information from the deep convolutional layer
exists as a manifold within the low-dimensional subspace of the residual bottleneck;
this representation can be captured by reducing the dimensionality of both the layer
and operational spaces.
Reducing the dimensionality of the activation space, however, risks collapsing the
manifold. ReLU, the nonlinear per-coordinate transformation used in deep
convolutional networks, behaves counterintuitively here: if the manifold of interest
still occupies a non-zero volume after the ReLU transformation, the operation
reduces to a linear transformation. On this basis, the more memory-efficient
inverted residual blocks are constructed. The final layers of
MobileNetV2 were replaced with new layers adapted to the target dataset, and the
refined model is trained via a transfer learning strategy with a batch size of 8, a
learning rate of 0.00001, and 100 epochs. Ultimately, the improved model utilizes
global average pooling (GAP) to extract deep features for additional processing; this
layer's vector output measures N × 1280. A minimal sketch of this fine-tuning setup
follows.
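The hedged Keras sketch below reflects this setup; the frozen ImageNet base, GAP layer, learning rate of 0.00001, and 100 epochs follow the text, while the input size, Adam optimizer, and binary sigmoid head are assumptions.

```python
# A hedged sketch of the described MobileNetV2 fine-tuning, assuming
# TensorFlow/Keras; train_ds / val_ds are placeholder dataset objects.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # transfer learning: reuse ImageNet features

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),        # N x 1280 deep features
    tf.keras.layers.Dense(1, activation="sigmoid"),  # benign vs. malignant
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=100)  # datasets batched at 8
```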
A softplus-based activation, tanh(ln(1 + e^x)), is then applied to enhance gradient
flow and non-linearity, which is expressed in Equation (5):

f(x) = x · tanh(ln(1 + e^x))    (5)
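For concreteness, Equation (5), which matches the form of the Mish activation, can be written as a short function; a minimal numpy sketch:

```python
# Equation (5) as code: x times tanh of the softplus, ln(1 + e^x),
# computed stably via logaddexp.
import numpy as np

def f(x):
    softplus = np.logaddexp(0.0, x)  # ln(1 + e^x), numerically stable
    return x * np.tanh(softplus)
```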
A CNN that is being trained from the beginning usually requires enormous volumes
of data, but organizing a large data collection with pertinent problems can be
somewhat challenging in certain situations. While the goal is to have matching
training and testing data, achieving this is often impractical or presents a challenging
task. Consequently, the notion of transfer learning has been presented. Transfer
learning is among the best-renowned machine learning methods. Which uses prior
information gained from solving one problem to solve other pertinent problems.
Using a model learned on a particular assignment as the basis for a model on another
is defined as transfer learning. When there is little data for the second task or when
the tasks are comparable to each other, it may be helpful. By using the learned
characteristics from the first assignment, the framework can learn quicker and more
efficiently on the following task. This can also mitigate overfitting, as the system will
have already acquired generic features that are likely to be beneficial in subsequent
operations. The pre-training process and the transfer technique make it possible to
import neural network parameters from real-world imaging datasets. The similarity
and comparability of natural imaging datasets and medical images within their
respective categories made this achievable in part. It makes sense that a CNN
architecture can be trained on large and complex ImageNet datasets to get well-
trained parameters, which is essential to initiating the next classification. As a result,
the pre-training technique enhances the MobileNet to classify breast cancer from the
data.
One of the most effective ML methods for medical image analysis is the CNN, since
it maintains spatial correlations even after filtering the input image; these
relationships are highly valued in medical analytics. In a CNN, the image analysis
operation known as convolution involves two inputs: first, the pixel values of the
features extracted from the three pre-trained CNNs; second, a numerical array
known as a kernel (or filter). The outcome of the two is their dot product. The stride length is
then used to determine the kernel's location in the image. Feature maps, sometimes
called activation maps, are created by repeating the calculation until the whole image
is enclosed. The feature maps that are produced are then used as contributions to the
subsequent layer. Sparse connections, sharing of weights, and invariant representation
are necessary for efficient computational machine learning when using convolution.
A CNN employs sparse connections, transmitting only a subset of outputs from each
layer to the subsequent one; unlike fully connected neural networks, it does not
connect every neuron in one layer to every neuron in the next. Because the kernel
covers only a small part of the image at each step, the network gradually learns the
significant attributes while substantially reducing the number of weights to be learned.
computing costs and assisting in avoiding the vanishing gradient issue, the ReLU
layer expedites training. Pooling follows the ReLU layer; max and average pooling
are the most commonly used methods because they decrease the image dimensions
and the number of parameters. The fully connected layer establishes
a connection between all neurons in the preceding layer and all neurons in the
subsequent layer. It uses the output of the previous layers such as pooling, ReLU, or
convolutional to calculate the probabilities of different classifications. The CNN can
learn to recognize cancer cells from images by using standard methods like stochastic
gradient descent and backpropagation to train on previously labelled images.
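To make the sliding-kernel computation concrete, the minimal numpy sketch below computes a feature map as the dot product of each image patch with the kernel at the given stride; it is illustrative, not the paper's implementation.

```python
# Single-channel convolution: slide the kernel over the image with a
# stride and take the dot product at each position to build a feature map.
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray, stride: int = 1) -> np.ndarray:
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    fmap = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            fmap[i, j] = np.sum(patch * kernel)  # dot product of patch and kernel
    return fmap
```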
Exploration phase: Finding the spot and pursuing the fish (exploration) is the first
phase of updating the population in the OOA. In this phase, each osprey in the
population randomly detects the position of another osprey that has a better objective
function value and considers it as a fish. The osprey then approaches the fish, goes
under the water, and captures it. The osprey's position undergoes substantial changes
due to this process, which boosts OOA's exploration capacity in locating the primary
optimum region and avoiding local optima. The mathematical model of this phase is
given by equations (7)-(9):

FP_i = { X_k | k ∈ {1, 2, …, N} ∧ F_k < F_i } ∪ { X_best }    (7)

x_{i,j}^{P1} = x_{i,j} + r_{i,j} · (SF_{i,j} − I_{i,j} · x_{i,j})    (8)

X_i = X_i^{P1} if F_i^{P1} < F_i, and X_i otherwise    (9)

where FP_i is the set of candidate fish positions for the i-th osprey, SF_i is the fish
it selects, X_i^{P1} is its new position, x_{i,j}^{P1} is the j-th dimension of that
position, F_i^{P1} is its objective function value, r_{i,j} is a random number in the
range [0, 1], and I_{i,j} is a random number from the set {1, 2}.
Exploitation phase: The exploitation phase models the osprey carrying the caught
fish to a secure location for consumption, as represented in equation (10), and
updates the OOA population accordingly. It makes minor adjustments within the
search space, boosting the capacity of OOA for local search and accelerating
convergence toward improved solutions. The osprey's prior position is then replaced
in accordance with equation (11) whenever the move improves the objective function.
x_{i,j}^{P2} = x_{i,j} + (lb_j + r · (ub_j − lb_j)) / t    (10)

X_i = X_i^{P2} if F_i^{P2} < F_i, and X_i otherwise    (11)

where X_i^{P2} is the new position of the i-th osprey, x_{i,j}^{P2} is the j-th
dimension of that position, F_i^{P2} is its objective function value, r is a random
number in the range [0, 1], lb_j and ub_j are the lower and upper bounds of the j-th
variable, t is the algorithm's iteration counter, and T is the total number of iterations.
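To make the two update phases concrete, a minimal Python sketch of one OOA iteration is given below; it is an illustrative reconstruction from equations (7)-(11), with `f`, `lb`, `ub`, and all names assumed rather than taken from the paper.

```python
# One OOA iteration per equations (7)-(11): exploration toward a "fish"
# chosen among better solutions, then a small exploitation move.
import numpy as np

def ooa_step(X, f, lb, ub, t):
    N, dim = X.shape
    F = np.array([f(x) for x in X])
    best = X[np.argmin(F)].copy()
    for i in range(N):
        # Exploration (eqs. 7-9): candidate fish = strictly better ospreys
        # plus the global best; move toward a randomly selected one.
        fish = [X[k].copy() for k in range(N) if F[k] < F[i]] + [best]
        SF = fish[np.random.randint(len(fish))]
        r = np.random.rand(dim)
        I = np.random.randint(1, 3, dim)                    # I_ij in {1, 2}
        cand = np.clip(X[i] + r * (SF - I * X[i]), lb, ub)  # eq. (8)
        fc = f(cand)
        if fc < F[i]:                                       # eq. (9): greedy accept
            X[i], F[i] = cand, fc
        # Exploitation (eqs. 10-11): step size shrinks as iteration t grows.
        r2 = np.random.rand(dim)
        cand = np.clip(X[i] + (lb + r2 * (ub - lb)) / t, lb, ub)  # eq. (10)
        fc = f(cand)
        if fc < F[i]:                                       # eq. (11)
            X[i], F[i] = cand, fc
    return X
```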
4.1 Dataset
Fig.: Sample benign and malignant images from the dataset.
Accuracy alone can be deceiving if one class is far more common than the other,
i.e., when the dataset is unbalanced. Sensitivity and specificity metrics are therefore very important for binary
classification issues like the categorization of breast cancer. Sensitivity quantifies the
number of true positives the model accurately detects, whereas Specificity quantifies
the number of true negatives that are accurately detected. The models' sensitivity
values in this instance varied from 0.915 to 0.967, suggesting that they are capable of
identifying true positive situations with similar levels of performance. The models'
specificity values, which varied in success, showed that they could accurately detect
real negative situations. These values ranged from 0.852 to 0.901. It is significant to
remember that selecting the decision threshold might have an impact on sensitivity
and specificity and that various thresholds may yield varying presentation levels.
Since the NPV and Precision measures reflect the frequency of FP and FN
predictions, they are also pertinent assessment metrics for binary classification.
Precision measures the proportion of cases classified as positive that are truly
positive, while the NPV measures the proportion of cases classified as negative that
are truly negative. The Precision in this instance ranged from 0.893 to 0.942,
showing that the models' rates of FP predictions are comparatively low. With FPR
ranging from 0.053 to 0.072, the models likewise keep false positives at modest
levels. Lastly, a metric that combines recall and precision into a single number is
the F-score. It offers a useful synopsis of how well the model performed overall in
classifying negative and positive cases. The F1-score values in this instance varied
from 0.521 to 0.529, showing that the models' recall and precision are comparable.
This performance gives a comprehensive description of how successfully the models
under investigation classify breast cancer.
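As a reference for how these quantities relate, the metrics above can be computed directly from confusion-matrix counts; a minimal sketch with illustrative names:

```python
# Binary-classification metrics from confusion-matrix counts.
def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    sensitivity = tp / (tp + fn)   # recall: true positives found
    specificity = tn / (tn + fp)   # true negatives found
    precision   = tp / (tp + fp)   # predicted positives that are correct
    npv         = tn / (tn + fn)   # predicted negatives that are correct
    fpr         = fp / (fp + tn)   # = 1 - specificity
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "NPV": npv, "FPR": fpr, "F1": f1}
```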
Table: Classification performance of the three fine-tuned models with a 70%-30% data split (B = benign, M = malignant).

Model | Data Splitting | Type | Accuracy | Sensitivity | Specificity | Precision | F1-Score | AUC | FPR | Time (s)
Fine-tuned MobileNetV2 | 70%-30% | B | 0.974 | 0.991 | 0.941 | 0.984 | 0.979 | 0.98 | 0.021 | 64.5
Fine-tuned MobileNetV2 | 70%-30% | M | 0.954 | 0.981 | 0.924 | 0.951 | 0.947 | 0.97 | 0.045 | 59.7
RMV2 | 70%-30% | B | 0.946 | 0.961 | 0.908 | 0.937 | 0.943 | 0.95 | 0.052 | 64.6
RMV2 | 70%-30% | M | 0.931 | 0.928 | 0.893 | 0.928 | 0.916 | 0.93 | 0.058 | 85.4
Thin MobileNet | 70%-30% | B | 0.925 | 0.945 | 0.884 | 0.916 | 0.925 | 0.92 | 0.067 | 98.9
Thin MobileNet | 70%-30% | M | 0.908 | 0.915 | 0.852 | 0.893 | 0.914 | 0.90 | 0.072 | 128.7
Further analysis is required to completely understand and put these results into
context. Nevertheless, these outcomes indicate that the suggested OOA-CNN model
is a promising method for classifying breast cancer.
Fig. 5 displays the confusion matrix illustrating the outcomes of the OOA-CNN
method in classifying breast cancer. The matrix provides a strong visual
representation of the model's accuracy, affirming its effectiveness in the medical field.
The accurate classification of breast cancer underscores the robust performance of the
suggested approach in contributing to the field of diagnostic methodologies.
Fig. 6 indicates the accuracy curve during the training and validation. The graph
reveals that the proposed model provides a high accuracy rate during training and
validation because the proposed method extracts more information from the existing
data using transfer learning. It creates new features from the existing ones, which help
the model to learn and classify breast cancer better. Overfitting is avoided by
introducing cross-validation when dividing the dataset. The graph in Fig. 7
illustrates the trajectory of the loss function for the suggested approach throughout
the training and validation phases. Both losses decrease sharply, indicating that the
model is learning and stabilizing, because the hyperparameters of the classification
model are optimized with the advanced optimization algorithm to achieve better
results with less loss.
5 Conclusion
This article addresses the imperative need for enhanced breast cancer classification
methods by introducing a novel approach leveraging advanced pre-trained CNN
architectures, transfer learning, and the Osprey Optimization Algorithm (OOA).
Acknowledgements. All authors named on the title page have made substantial
contributions to the research, have reviewed the manuscript, confirm the integrity and
valid interpretation of the data, and consent to its submission.
References
1. Karabatak, M.: A new classifier for breast cancer detection based on Naïve
Bayesian. Measurement 72, 32-36 (2015).
2. Rane, N., Sunny, J., Kanade, R., Devi, S.: Breast cancer classification and
prediction using machine learning. International Journal of Engineering
Research and Technology 9(2), 576-580 (2020).
3. Motlagh, M. H., Jannesari, M., Aboulkheyr, H., Khosravi, P., Elemento, O.,
Totonchi, M., Hajirasouliha, I.: Breast cancer histopathological image
classification: A deep learning approach. BioRxiv 242818 (2018).
4. Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., He, Q.: A comprehensive
survey on transfer learning. Proceedings of the IEEE 109(1), 43-76 (2020).
5. Byra, M., Galperin, M., Ojeda Fournier, H., Olson, L., O'Boyle, M., Comstock,
C., Andre, M.: Breast mass classification in sonography with transfer learning
using a deep convolutional neural network and color conversion. Medical
physics 46(2), 746-755 (2019).
6. Khan, S., Islam, N., Jan, Z., Din, I. U., Rodrigues, J. J. C.: A novel deep learning
based framework for the detection and classification of breast cancer using
transfer learning. Pattern Recognition Letters 125, 1-6 (2019).
7. Ragab, D. A., Sharkas, M., Marshall, S., Ren, J.: Breast cancer detection using
deep convolutional neural networks and support vector machines. PeerJ 7,
e6201 (2019).
8. Suzuki, S., Zhang, X., Homma, N., Ichiji, K., Sugita, N., Kawasumi, Y.,
Yoshizawa, M.: Mass detection using deep convolutional neural network for
mammographic computer-aided diagnosis. 55th Annual conference of the
society of instrument and control engineers of Japan (SICE), Tsukuba, Japan,
pp. 1382-1386 (2016).
9. Deniz, E., Şengür, A., Kadiroğlu, Z., Guo, Y., Bajaj, V., Budak, Ü.: Transfer
learning based histopathologic image classification for breast cancer
detection. Health information science and systems 6, 1-7 (2018).
10. Farhadi, A., Chen, D., McCoy, R., Scott, C., Miller, J. A., Vachon, C. M.,
Ngufor, C.: Breast cancer classification using deep transfer learning on
structured healthcare data. IEEE International Conference on Data Science and
Advanced Analytics (DSAA), Washington, DC, USA, pp. 277-286 (2019).
11. Zhu, W., Braun, B., Chiang, L. H., Romagnoli, J. A.: Investigation of transfer
learning for image classification and impact on training sample size.
Chemometrics and Intelligent Laboratory Systems 211, 104269 (2021).
12. Samee, N. A., Alhussan, A. A., Ghoneim, V. F., Atteia, G., Alkanhel, R., Al-
Antari, M. A., Kadah, Y. M.: A Hybrid Deep Transfer Learning of CNN-Based
LR-PCA for Breast Lesion Diagnosis via Medical Breast
Mammograms. Sensors 22(13), 4938 (2022).
13. Alruwaili, M., Gouda, W.: Automated breast cancer detection models based on
transfer learning. Sensors 22(3), 876 (2022).
14. Khamparia, A., Bharati, S., Podder, P., Gupta, D., Khanna, A., Phung, T. K.,
Thanh, D. N.: Diagnosis of breast cancer based on modern mammography using
hybrid transfer learning. Multidimensional systems and signal processing 32,
747-765 (2021).
15. Aly, G. H., Marey, M., El-Sayed, S. A., Tolba, M. F.: YOLO based breast
masses detection and classification in full-field digital mammograms. Computer
methods and programs in biomedicine 200, 105823 (2021).
16. Zhang, Y. D., Satapathy, S. C., Guttery, D. S., Górriz, J. M., Wang, S. H.:
Improved breast cancer classification through combining graph convolutional
network and convolutional neural network. Information Processing &
Management 58(2), 102439 (2021).
17. Khafaga, D.: Meta-heuristics for feature selection and classification in diagnostic
breast cancer. Computers, Materials and Continua 73(1), 749-765 (2022).
18. Hameed, Z., Zahia, S., Garcia-Zapirain, B., Javier Aguirre, J., Maria Vanegas,
A.: Histopathology image classification using an ensemble of deep learning
models. Sensors 20(16), 4373 (2020).
19. Zahoor, S., Shoaib, U., Lali, I. U.: Breast cancer mammograms classification
using deep neural network and entropy-controlled whale optimization
algorithm. Diagnostics 12(2), 557 (2022).
20. Sinha, D., El-Sharkawy, M.: Thin mobilenet: An enhanced mobilenet
architecture. 10th annual ubiquitous computing, electronics & mobile
communication conference (UEMCON), New York, NY, USA, pp. 0280-0285
(2019).
21. Ayi, M., El-Sharkawy, M.: Rmnv2: Reduced mobilenet v2 for cifar10. 10th
Annual Computing and Communication Workshop and Conference
(CCWC), Las Vegas, NV, USA, pp. 0287-0292 (2020).
22. Dehghani, M., Trojovský, P.: Osprey optimization algorithm: A new bio-
inspired metaheuristic algorithm for solving engineering optimization problems.
Frontiers in Mechanical Engineering 8, 1126450 (2023).
23. Yari, Y., Nguyen, T. V., Nguyen, H. T.: Deep learning applied for histological
diagnosis of breast cancer. IEEE Access 8, 162432–162448 (2020).
24. Zaychenko, Y., Hamidov, G.: Medical images of breast tumors diagnostics with
application of hybrid CNN–FNN network. Syst. Res. Inf. Technol. 4, 37-47
(2018).
25. Wang, D., Chen, Z., Zhao, H.: Prototype transfer generative adversarial network
for unsupervised breast cancer histology image classification. Biomed. Signal
Process. Control, 68, 102713 (2021).
26. Hu, C., Sun, X., Yuan, Z., Wu, Y.: Classification of breast cancer
histopathological image with deep residual learning. Int. J. Imaging Syst.
Technol 31(3), 1583–1594 (2021).
27. Liu, M., He, Y., Wu, M., Zeng, C.: Breast histopathological image classification
method based on autoencoder and Siamese framework. Information 13(3), 107
(2022).
28. Zheng, Y., Li, C., Zhou, X., Chen, H., Xu, H., Li, Y., Grzegorzek, M.:
Application of transfer learning and ensemble learning in image-level
classification for breast histopathology. Intelligent Medicine 3(02), 115-128
(2023).
29. Patel, V., Chaurasia, V., Mahadeva, R., Patole, S. P.: GARL-Net: Graph Based
Adaptive Regularized Learning Deep Network for Breast Cancer
Classification. IEEE Access 11, 9095-9112 (2023).