An Empirical Study of Different Machine Learning Techniques For Brain Tumor Classification and Subsequent Segmentation Using Hybrid Texture Feature
https://doi.org/10.1007/s00138-021-01262-x
Received: 9 April 2021 / Revised: 29 August 2021 / Accepted: 24 October 2021 / Published online: 23 November 2021
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2021
Abstract
Brain tumor classification and segmentation on differently weighted MRIs are among the most tedious tasks for many researchers
due to the high variability of tumor tissues in texture, structure, and position. Our study is divided into two stages:
supervised machine learning-based tumor classification and image processing-based extraction of the tumor region. For this task,
seven methods have been used for texture feature generation. We have experimented with various state-of-the-art supervised
machine learning classification algorithms such as support vector machines (SVMs), K-nearest neighbors (KNNs), binary
decision trees (BDTs), random forest (RF), and ensemble methods. Then, taking the texture features into account, we have
tried fuzzy C-means (FCM), K-means, and hybrid image segmentation algorithms. The experiments
achieved classification accuracies of 94.25%, 87.88%, 89.57%, 96.99%, and 97% with SVM, KNN, BDT, RF, and ensemble
methods, respectively, on FLAIR-, T1C-, and T2-weighted MRI, and the hybrid segmentation attained a mean Dice
score of 90.16% for tumor area segmentation against ground-truth images.
Keywords Medical image processing · Machine Learning · Brain tumor classification · Brain tumor segmentation
Biswajit Jena
[email protected]
Gopal Krishna Nayak
[email protected]
Sanjay Saxena
[email protected]
1 International Institute of Information Technology, Bhubaneswar, India
2 IEEE Member, International Institute of Information Technology, Bhubaneswar, India

1 Introduction

The brain is the control unit of the human body, and brain tumors are fatal, life-threatening diseases whose incidence has
been escalating in recent years. A brain tumor is generally a mass of tissue that originates from an irregular stimulus of
tumorous cells in any part of the human brain. It starts with an activation in a single cell's genes, which results in the
uncontrollable division of nearby cells. More often than not, a brain tumor emerges inside the brain, its nerves, or the
brain's coverings. These tumors are generally classified as malignant or benign [1–3]. Non-cancerous tumors are termed
benign, whereas cancerous ones are called malignant. Benign tumors do not pervade the brain or its neighboring tissues.
However, they can still harm nearby tissues or vital organs, so they need to be treated in time. Malignant tumors, on the
other hand, are life-threatening because they invade healthy tissues of the brain and spread further throughout the brain or
to other regions of the body.

Medical imaging, especially magnetic resonance imaging (MRI) [4,5], plays an essential role in diagnosing and treating brain
tumors with positive outcomes. As an important medical imaging modality [6–8], MRI supports automated and precise
diagnosis. One of the most demanding problems when dealing with MR images is to separate a few distinct cells and tissues
from the rest of the image, which leads to the segmentation process. Segmentation of the required objects helps physicians
identify lesions more precisely; hence, it plays a noteworthy role in computerized medical imaging. While there exist many
mainstream segmentation techniques such as thresholding [9], region-based seed growing [10], and graph partitioning [11],
they still lag behind in the domain of brain tumor classification due to similarities
6 Page 2 of 16 B. Jena et al.
in intensity between some healthy tissues and brain tumors, which can give rise to uncertainty within the algorithm. To
overcome this problem, researchers have used multispectral MR sequences [12] for tumor identification. Nevertheless, four
main difficulties were identified in this approach [13]. First, the acquisition of such MR sequences is not always feasible
owing to the patient's condition, such as its severity and urgency. Second, it is an expensive procedure. Third, the presence
of redundant information increases image processing time, and still, the chances of segmentation errors cannot be ruled out.
Finally, multispectral MRI scans suffer from misalignment and inconsistency, which require bias correction and image
registration before the scans can be used in segmentation algorithms [9]. Taking these limitations into consideration, in
this research article, we propose a scheme for the classification and segmentation of tumors on two-dimensional single
spectral MR sequences. The main contributions of this article are:

• To the best of our knowledge, this is the first article to perform brain tumor classification on the BraTS dataset; we
  also consider some data from The Cancer Imaging Archive (TCIA).
• We consider as many as five machine learning algorithms along with multiple hyperparameter conditions.
• We consider as many as seven texture feature extraction methodologies to generate texture features.
• Along with tumor classification, we also perform tumor segmentation to detect the tumor.
• Finally, we use a hybrid segmentation approach (hybrid of K-means and fuzzy C-means) rather than any individual approach.

The remaining parts of the article are organized as follows: Section 2 discusses some notable related works in the domain of
brain tumor classification and segmentation. Section 3 presents a system overview of the current work. Details of the dataset
and preprocessing tasks are given in Sect. 4. Sections 5 and 6 discuss the tumor classification and segmentation process,
respectively, with the experimental outcome and discussion. Section 7 is devoted to the conclusion, followed by some selected
references.

2 Related works

Many approaches have been proposed for tumor classification and detection on MRI scans. Some of the related works are
reviewed here to guide our work in that direction.

In [14], Mishra et al. (2020) present brain tumor MRI classification using a support vector machine (SVM) with wavelet
feature extraction such as DWT, SWT, and DMWT. In [15], Gumei et al. (2019) proposed a regularized extreme learning machine
(RELM) method for brain tumor classification. They applied hybrid feature extraction to 3064 brain MRI images with max–min
preprocessing steps, using GIST and normalized GIST feature descriptors.

In [16], Mishra et al. (2019) discussed the classification of microscopic images to separate normal white cells from cells
infected by leukemia. They used the discrete orthonormal S-transform for feature extraction, linear discriminant analysis for
feature reduction, and finally an Adaboost-based random forest classifier for classification. In [17], Amin et al. (2017)
proposed a distinctive approach to detect and classify brain tumors from an MRI scan. They use an SVM classifier with
cross-validation, after selecting shape, texture, and intensity features, to classify cancerous MRI from non-cancerous MRI.

In [18], Bahadure et al. (2017) investigated support vector machine (SVM) and Berkeley wavelet transformation (BWT)
approaches for image analysis on MRI. Joseph et al. (2014) [19] proposed K-means clustering for the segmentation of tumor
regions, and Alfonse et al. (2016) [20] used an SVM for automatic brain tumor classification, with accuracy further improved
by using the Fourier transform for feature extraction.

In [21], Ahmadvand et al. (2016) extracted a feature vector of the MRI based on the wavelet transform, described as a
modality fusion vector (MFV), and then used a Markov random field model to segment the tumorous images. In [22], Atiq Islam
et al. (2013) used a new texture feature extraction method, MultiFD, with an AdaBoost classification technique to detect and
segment brain tumors. In [23], Abbasi et al. (2017) performed automatic tumor detection on 3D images: histogram-oriented
gradients (HOGs) and local binary patterns (LBP) in three orthogonal planes of the MRI were used with a random forest
classifier to segment the region of interest. The work in [24] by Işın, Ali et al. (2016) emphasizes modern deep learning
architectures, mostly convolutional neural networks (CNNs), for brain tumor classification. A CNN captures spatial
information within the input pixels using convolution and pooling. The convolution process extracts features, and round-up
operations result in successful classification. Still, it takes a considerably long training period and has difficulty
converging to a particular solution during training [25].

After going through the relevant literature, we found that most of the works are based on either one or two classification
algorithms. Similarly, for feature extraction, some works use wavelet features, while others use texture features.
Furthermore, each of the classification algorithms and feature extraction techniques has its own pros and cons. So, the
motivation behind this work is to use multiple classification algorithms and texture features to reach a better-founded
conclusion on the classification of brain tumors. By doing so, we can eventually choose the best classification algorithm for
the need at hand.

In this paper, we have considered texture features and seven techniques of texture feature generation. For classification, we
choose SVM, KNN (with Euclidean, City-Block, and Minkowski distances), binary decision tree, random forest, and ensemble
methods (with Adaboost, Gentleboost, Logitboost, LPboost, Robustboost, RUSboost, and Totalboost). So, in total, we have
considered five mainstream classification methods, with several hyperparameter settings for some of them.

3 System overview

Our whole system is partitioned into two parts, i.e., tumor classification and tumor segmentation. In tumor classification,
an MR image has been considered as input and classified

Fig. 1 System for tumor classification and segmentation

4.1 Dataset

The datasets used in our system have been gathered from the NCI-MICCAI 2017 and the 2019 Challenge (BraTS-2017 and
BraTS-2019) [26–28]. These benchmark datasets consist of fully anonymous images from institutions like Bern University, ETH
Zurich, Utah University, and Debrecen University. They are datasets of 3D brain tumor magnetic resonance imaging (MRI) scans,
specifically designed for brain tumor segmentation. They consist of multimodal brain MRI scans along with manually annotated
tumor regions corresponding to each scan in four different volumes, namely (a) native T1 (T1), (b) contrast-enhanced
T1-weighted (T1ce), (c) T2-weighted, and (d) fluid attenuated inversion recovery (Flair). Samples of all these modalities are
shown in Fig. 2. We have also considered data from MR sequences from The Cancer Imaging Archive (TCIA). Overall, the manually
inspected central slices of 100 T1C-, T2-, and FLAIR-weighted MR sequences of brains with high-grade glioma (HGG) or low-grade
glioma (LGG), and 100 MRI scans of normal brains, are utilized in this study from both datasets.

4.2 Preprocessing

The raw MRI scans usually contain various artifacts and noise, such as salt noise. They also show an uneven intensity
distribution, as the MRI scans have been collected from different scanners and even under different acquisition conditions
and positions. This

Fig. 2 Middle slices of an HGG patient in (a) Flair, (b) T1, (c) T1ce, (d) T2
Table 1 Accuracy of classifiers for the BraTS-2017 and TCIA dataset

Classifier             Variant              k   Accuracy (%)
                                                T1C    T2     FLAIR
SVM                    Linear               –   93     94     93
KNN                    Euclidean distance   1   90.5   90     90
                                            3   87.5   84     82.5
                                            5   87     84     84
                                            7   87     84     83
                       City-Block distance  1   89.5   89.5   83.5
                                            3   85     85.5   80
                                            5   79     81     81.5
                                            7   79.5   80.5   81
                       Minkowski distance   1   90     91.5   91
                                            3   83.5   87     81
                                            5   81     80     80
                                            7   84.5   83.5   82.5
Binary Decision Trees  –                    –   89     90     88.5
Random Forest          –                    –   96     97     96.5
Ensemble               Adaboost             –   97     98.5   97
                       Gentleboost          –   99     97     98.5
                       Logitboost           –   97     99     98.5
                       LPboost              –   97     97     97.5
                       Robustboost          –   95.5   98.5   90
                       RUSboost             –   87     90     86
                       Totalboost           –   94.5   97.5   94.5

looks fine or coarse, smooth or irregular, homogeneous or inhomogeneous, etc. Generally, texture denotes the characteristics
of the surface and appearance of an object based on the structure, size, density, arrangement, and proportion of its
fundamental parts. The collection of such features through texture analysis is described as texture feature extraction. The
following texture features are considered in this study:

5.1.1 First-order statistical feature

The most widely used first-order statistical features that characterize texture for image classification are mean, median,
skewness, kurtosis, energy, entropy, average contrast. So, these six features are considered as first-order statistical
features used for the feature extraction process [31].

5.1.2 Gray-level co-occurrence matrix (GLCM) feature

The GLCM is one of the essential and traditional methods for texture feature extraction. It is a two-dimensional histogram
that describes the occurrence of pairs of pixels separated by a particular distance (i.e., d = 1, 2) and angle (0°, 45°,
90°, and 135°). The number of gray levels present in an image determines the required size of the matrix. The statistical
features extracted from this matrix are energy, entropy, inertia, homogeneity, correlation, absolute value, maximum
probability, and inverse difference, i.e., eight features [32]. Taking the two distances and four angles into consideration,
there are finally 8 x 2 x 4 = 64 consolidated features.

5.1.3 Gray-level run length matrix (GLRLM) feature

In GLRLM [33], a gray-level run is a texture primitive, regarded as a connected group of maximally collinear pixels with
exactly the same gray level. Gray-level runs are described by the direction and length of the run for a specific gray value.
To determine the GLRLM, the different lengths of gray-level runs must be found. The GLR matrices are calculated for angles
0°, 45°, 90°, and 135°.

The extracted features are short-run emphasis (SRE), long-run emphasis (LRE), run percentage (RP), run length non-uniformity
(RLN), gray-level non-uniformity (GLN), low gray-level run emphasis (LGRE), high gray-level run emphasis (HGRE), short-run
high gray-level emphasis (SRHGE), short-run low gray-level emphasis (SRLGE), long-run high gray-level emphasis (LRHGE), and
long-run low gray-level emphasis (LRLGE), i.e., 11 features. Taking the four angles into account, there are 11 x 4 = 44
features.

5.1.4 Histogram-oriented gradient (HOG) feature

The reasoning behind the use of histogram-oriented gradient features [34] is that they capture the appearance of local
objects and structures, which can be identified by edge directions, and then generalize it. The method locally outlines the
gradient orientation of an image. Hence, 80 histogram-oriented gradient features are derived.

5.1.5 Local binary patterns (LBP) feature

Local binary patterns [35] can be used to describe the shape and texture of an image. The image is partitioned into small
regions from which features are extracted. Binary patterns that describe the neighborhood of pixels in these regions
constitute the features. The features obtained from the regions are concatenated into a single feature histogram, which forms
a representation of the image. The resulting histogram has 256 dimensions, so 256 LBP features are extracted.

5.1.6 Cross-diagonal texture matrix (CDTM) feature

CDTM [36] represents the spatial connection between a pixel and its neighboring pixel at a particular angle and distance. It
also finds out texture information around the central pixel by
using the eight neighboring pixels. A set of six features is extracted from the matrix, i.e., homogeneity, entropy, absolute
value, contrast, energy, and inertia difference moment.

5.1.7 Simplified texture spectrum feature

It characterizes local texture information in four directions instead of the eight directions used in the original texture
spectrum feature. One of the advantages of the texture spectrum approach in image processing is that, instead of texture
features, it characterizes the texture aspects of an image by the corresponding texture spectrum. The simplified texture
spectrum groups the 81 features obtained from the texture spectrum into 15 features [37].

So a total of 6 first-order statistical features, 44 gray-level run length matrix features, 64 gray-level co-occurrence
matrix features, 80 histogram-oriented gradient features, 256 local binary pattern features, 6 cross-diagonal texture matrix
features, and 15 simplified texture spectrum features form a 471-dimensional texture feature vector.

5.2 Feature matrix

All the texture feature vectors obtained from the 100 tumorous and 100 non-tumorous images are consolidated to form a feature
matrix of size 200 x 471.

5.3 Feature classification

This section deals with classifying images as tumorous or non-tumorous using the features obtained. Five robust supervised
binary classification techniques are applied, and their results are compared. The techniques applied are SVM (linear)
[38,39], KNN (distance = Euclidean, City-Block, and Minkowski; k = 1, 3, 5, 7) [40], binary decision trees [41], random
forest [42], and ensemble methods [43], i.e., Adaboost, Gentleboost, Logitboost, LPboost, Robustboost, RUSboost, and
Totalboost, as shown in Fig. 5. Training and testing samples are chosen randomly. Ten-fold cross-validation is applied to
ensure robustness and prevent over-fitting of our models.

The advantages of the SVM model are most visible in high-dimensional spaces: a larger separation margin between classes gives
better results, and the model remains effective when the number of samples is small compared to the number of dimensions
[38,39]. The strengths of KNN are that it is straightforward to implement, is robust to noisy data, and is effective when the
training data are large or augmented [40]. The benefits of a decision tree as a classifier are that it requires less effort
for data preparation during preprocessing than other algorithms and does not require normalized or scaled data [41]. The
advantages of a random forest classifier are its handling of unlabeled data, robustness to outliers and nonlinear data, quick
prediction and training time, and ability to deal with high-dimensional data [42]. Ensemble learning combines various machine
learning models to improve the final model's performance; the ensemble technique can be called a meta-algorithm. By following
this approach, ensemble learning reduces bias and variance or improves predictions [43]. After training is complete, the
classifiers' recognition rate on unseen data is used to indicate the performance of the algorithm.

5.4 Performance evaluation and experimental setup

The following parameters are used as the performance evaluation metrics for the brain tumor classification system.

• Accuracy: The texture features provide many different capabilities for the precise classification of MRI lesions. So we
  computed the confusion matrix, i.e., a table that is often used to illustrate a trained model's performance on a given
  test dataset where the true values are known. The terms used in a confusion matrix are as follows:

  True positives (TP): positive samples classified correctly
  True negatives (TN): negative samples classified correctly
  False negatives (FN): positive samples classified incorrectly
  False positives (FP): negative samples classified incorrectly

  We then quantify this capability using accuracy from the following expression:

  $$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FN + FP} \tag{1}$$

• Sensitivity: It is the metric that evaluates a model's ability to predict the true positives of each available category.

  $$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{2}$$

• Specificity: It is the metric that evaluates a model's ability to predict the true negatives of each available category.

  $$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{3}$$

• Precision: It is the ratio of correctly predicted positive observations to the total predicted positive observations.

  $$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{4}$$
• F1-Score: It is the harmonic mean of precision and recall (sensitivity).

  $$F1\text{-}\mathrm{Score} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{5}$$

• Experimental Setup: All the source code for the preprocessing steps, classification, and segmentation is implemented in
  Python 3.6 with the help of standard machine learning libraries. The software was run on a Windows 10 (64-bit) system with
  8 GB of memory, an Intel(R) Core(TM) i5-10300H CPU @ 2.50 GHz, and NVIDIA GeForce GTX 1650 Ti graphics.

5.5 Experimental results

We have trained our classifiers on 200 T1C-, T2-, and FLAIR-weighted images of tumorous and non-tumorous MRI scans. The
classifiers used for these image classification tasks are a support vector machine (SVM) with a linear model; K-nearest
neighbors (KNNs) with k values 1, 3, 5, and 7 for Euclidean, City-Block, and Minkowski distance; binary decision trees
(BDTs); random forest (RF); and ensemble methods with Adaboost, Gentleboost, Logitboost, LPboost, Robustboost, RUSboost, and
Totalboost. The accuracy of the individual classifiers is given in Table 1.

With reference to Table 1, when we compare the classifiers on classification accuracy, the ensemble algorithms outperform the
rest. However, the accuracy of the random forest algorithm comes close to that of the ensemble. The average accuracy of SVM
is 93.33%, KNN is 84.57%, BDT is 89.16%, RF is 96.5%, and ensemble is 96.98%, as depicted in Table 2 for the BraTS-2017
dataset. We have also performed exploratory analysis using box-plots for the T1C, T2, and FLAIR classes of tumor to examine
the shape of the distribution, central value, and variability of classification accuracy, sensitivity, and specificity.
Figures 6, 7, and 8 show the box-plot distributions of accuracy, sensitivity, and specificity, respectively, for the various
classes of tumor. Figure 9 shows the comparison graph of average classification accuracy, sensitivity, and specificity for
the various classifiers.

Table 2 Average accuracy of classifiers

Classifiers   Average accuracy     Average accuracy
              (BraTS-2017+TCIA)    (BraTS-2019+TCIA)
SVM           93.33                94.25
KNN           84.57                87.88
BDT           89.16                89.57
RF            96.5                 96.99
Ensemble      96.98                97.01

Fig. 6 Box-plot for the value of accuracy distribution on T1C, T2, and Flair of BraTS-2017+TCIA

Fig. 7 Box-plot for the value of sensitivity distribution on T1C, T2, and Flair of BraTS-2017+TCIA

We have also calculated the sensitivity, specificity, precision, and F1-score of the respective classifiers, as these
parameters are equally important for measuring classifier performance. Table 3 summarizes the sensitivity of all
classification models for all the MRI weightings, whereas Table 4 gives the average sensitivity of the classifiers.
Similarly, Tables 5 and 6 give the specificity and average specificity of the classifiers for all volumes of brain images,
respectively. Tables 7 and 8 present the average precision and F1-score of all the classifiers used in this study. Further,
the mean classification CPU time in seconds, depicted in Table 9, shows the performance of the various algorithms.

Although we went through various articles on brain tumor classification and detection, we closely follow some very recent
works by Mishra et al. (2020) [14] and Abbasi et al. (2017) [23]. Most works follow a certain type of feature extraction
method and then a sole classifier for the classification task. Considering all the pros and cons, we go for seven methods for
texture feature extraction, and
then, we consider five classifiers (SVM, KNN, BDT, RF, and ensemble) for the classification task. The datasets used in this
work are MICCAI's 2017 and 2019 challenge datasets of 3D brain MRI scans (BraTS-2017 and BraTS-2019). The comparison of our
work with some relevant recent works is tabulated in Table 10.

Fig. 8 Box-plot for the value of specificity distribution on T1C, T2, and Flair of BraTS-2017+TCIA

Table 3 Sensitivity of classifiers

Classifier             Variant              k   Sensitivity (%)
                                                T1C    T2     FLAIR
SVM                    Linear               –   94     95     95
KNN                    Euclidean distance   1   91     91.5   92
                                            3   89.5   85     84
                                            5   89     85     85.5
                                            7   87     86     85
                       City-Block distance  1   90     91     85
                                            3   86.5   86     81
                                            5   79.5   81.5   82
                                            7   80.5   82     81.5
                       Minkowski distance   1   90.5   92     92
                                            3   85     87.5   81.5
                                            5   82     80.5   81
                                            7   85.5   83.5   83.5
Binary Decision Trees  –                    –   89.5   91     89
Random Forest          –                    –   97     97     98
Ensemble               Adaboost             –   98     99     97.5
                       Gentleboost          –   99     98.5   98.5
                       Logitboost           –   98.5   99     98
                       LPboost              –   98     97.5   98
                       Robustboost          –   96     98     92
                       RUSboost             –   97.5   90.5   87
                       Totalboost           –   95.5   98.5   94

Fig. 9 Comparison graph showing the average accuracy, sensitivity, and specificity of various classifiers for BraTS-2017 and
TCIA dataset

Table 4 Average sensitivity of classifiers

Classifiers   Average sensitivity   Average sensitivity
              (BraTS-2017+TCIA)     (BraTS-2019+TCIA)
SVM           94.66                 95.02
KNN           85.59                 88.16
BDT           89.83                 90.11
RF            97.33                 97.89
Ensemble      96.57                 97.55
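The evaluation protocol of Sect. 5.3 (random assignment of samples and ten-fold cross-validation over the feature matrix) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it covers only KNN with Euclidean distance, it replaces the 200 x 471 texture feature matrix with a small synthetic matrix of 8 features, and the helper names `knn_predict` and `cross_validate` are hypothetical.

```python
import math
import random


def knn_predict(train_x, train_y, query, k=1):
    """Majority vote among the k training vectors nearest to `query` (Euclidean)."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def cross_validate(features, labels, k=1, folds=10, seed=0):
    """Mean accuracy over `folds` folds with a random sample-to-fold assignment."""
    order = list(range(len(features)))
    random.Random(seed).shuffle(order)           # random train/test assignment
    accuracies = []
    for f in range(folds):
        test_idx = order[f::folds]               # every folds-th shuffled sample
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(
            knn_predict(train_x, train_y, features[i], k) == labels[i]
            for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / folds


# Synthetic stand-in for the feature matrix (assumption): 100 "tumorous" and
# 100 "non-tumorous" vectors drawn around two different means.
rng = random.Random(42)
features = (
    [[rng.gauss(1.0, 0.5) for _ in range(8)] for _ in range(100)]
    + [[rng.gauss(-1.0, 0.5) for _ in range(8)] for _ in range(100)]
)
labels = [1] * 100 + [0] * 100

for k in (1, 3, 5, 7):                            # k values listed in Sect. 5.3
    print(k, round(cross_validate(features, labels, k=k), 3))
```

Because the synthetic classes are well separated, the printed accuracies are not comparable to Table 1; the sketch only shows how each of the ten folds is held out once while the remaining nine are used for training.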
Table 5 Specificity of classifiers for the BraTS-2017 and TCIA dataset [columns: Classifier; Specificity for T1C, T2, FLAIR]

Table 8 Average F1-score of classifiers [columns as extracted: Classifiers; Average specificity (BraTS-2017+TCIA); Average specificity (BraTS-2019+TCIA)]
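The metrics reported in these tables follow directly from the confusion-matrix counts through Eqs. (1) to (5). The sketch below shows that arithmetic; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper's results. It assumes every denominator is non-zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Evaluate Eqs. (1)-(5) from confusion-matrix counts (non-zero denominators assumed)."""
    accuracy = (tp + tn) / (tp + tn + fn + fp)        # Eq. (1)
    sensitivity = tp / (tp + fn)                      # Eq. (2), also called recall
    specificity = tn / (tn + fp)                      # Eq. (3)
    precision = tp / (tp + fp)                        # Eq. (4)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # Eq. (5)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for 100 tumorous and 100 non-tumorous test images.
metrics = classification_metrics(tp=95, tn=93, fp=7, fn=5)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

With these counts the accuracy is 188/200 = 94%, the sensitivity 95%, and the specificity 93%.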
Fig. 10 Tumor segmentation process

background of the image. The main principle of this algorithm is that it partitions the given dataset into k clusters based
on k centroids. It is used when the data are unlabeled: based on a certain similarity in the data, we find groups, where the
number of groups is represented by k. The initial requirement of the k-means algorithm is the number of clusters, k. The
centers of the k clusters are then selected randomly. Using one of the available distance metrics, the distance between each
cluster center and each pixel is determined. Each pixel is then assigned to the cluster whose center is nearest, and the
centroids are re-estimated. The same procedure is repeated until the centers converge. Finally, this algorithm aims at
minimizing an objective function known as the squared error function, given by Eq. 6:

$$J(v) = \sum_{i=1}^{k} \sum_{j=1}^{n} \left( \lVert x_i - v_j \rVert \right)^2 \tag{6}$$

where $\lVert x_i - v_j \rVert$ is the Euclidean distance between $x_i$ and $v_j$,
k = number of clusters,
n = number of data points in a cluster.

6.1.2 Fuzzy C-means

Fuzzy C-means (FCM) [45] is another form of clustering technique, in which a single piece of information may belong to more
than one cluster. Hence, the pixels of an image will be present in various outcome classes with different degrees of
membership. Fuzzy behavior can be studied in fuzzy C-means when data are bound

$$J(u, v) = \sum_{i=1}^{k} \sum_{j=1}^{n} (\mu_{ij})^m \left( \lVert x_i - v_j \rVert \right)^2 \tag{7}$$

where $\lVert x_i - v_j \rVert$ is the Euclidean distance between $x_i$ and $v_j$,
k = number of clusters,
n = number of data points in a cluster,
$\mu_{ij}$ = membership of the ith data point to the jth cluster center,
m = fuzziness index.

6.1.3 Hybrid segmentation

This segmentation method is the combination of K-means and fuzzy C-means [46]. The K-means clustering method is simple to run
and very fast on large datasets, but it fails to determine the specific tumorous regions. On the other hand, fuzzy C-means
methods are mainly used because they help retain more of the minute information of the original image, detecting tumor cells
more precisely than K-means. After experimenting with the algorithms mentioned above, we chose the hybrid segmentation
algorithm based on the results. The hybrid segmentation algorithm first performs K-means on the given dataset and then FCM
using the results of K-means. The resulting cluster centers of the K-means algorithm are used as the cluster seeds for FCM:
for the initial iteration of the FCM, the cluster centers and the membership matrix are calculated based on the results of
K-means, and the remaining iterations continue as in the FCM algorithm until the termination condition is reached. Two output
segmented images from the set of tumorous images after applying the hybrid segmentation technique are shown in Fig. 11.

6.2 Morphological operations on the segmented image

Binary images can have many disfigurements. In specific, after segmentation, the binary regions induced are mainly
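Returning to the hybrid scheme of Sect. 6.1.3, the idea of seeding FCM with the K-means result can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: pixels are reduced to scalar intensities, the K-means centers are initialized at evenly spaced quantiles instead of randomly so that the example is deterministic, the FCM updates are the standard textbook ones, and all function names are hypothetical.

```python
def kmeans_1d(values, k, iters=50):
    """K-means on scalar intensities; quantile initialization is an assumption."""
    ordered = sorted(values)
    n = len(ordered)
    centers = [float(ordered[(2 * j + 1) * n // (2 * k)]) for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:                    # centers have converged
            break
        centers = updated
    return sorted(centers)


def memberships(values, centers, m=2.0, eps=1e-12):
    """Standard FCM membership of each value in each cluster, fuzziness index m."""
    power = 2.0 / (m - 1.0)
    table = []
    for v in values:
        d = [abs(v - c) + eps for c in centers]   # eps avoids division by zero
        table.append([1.0 / sum((dj / dl) ** power for dl in d) for dj in d])
    return table


def hybrid_kmeans_fcm(values, k, m=2.0, iters=100, tol=1e-6):
    """Run K-means first, then FCM seeded with the K-means cluster centers."""
    centers = kmeans_1d(values, k)
    u = memberships(values, centers, m)           # initial membership matrix
    for _ in range(iters):
        updated = [
            sum((row[j] ** m) * v for row, v in zip(u, values))
            / sum(row[j] ** m for row in u)
            for j in range(k)
        ]
        shift = max(abs(a - b) for a, b in zip(updated, centers))
        centers = updated
        u = memberships(values, centers, m)
        if shift < tol:                           # termination condition
            break
    labels = [max(range(k), key=lambda j: row[j]) for row in u]
    return centers, labels


# Toy intensities: dark background, mid-gray tissue, bright region.
pixels = [8, 10, 12, 11, 118, 122, 120, 125, 228, 232, 230, 235]
centers, labels = hybrid_kmeans_fcm(pixels, k=3)
print([round(c, 2) for c in centers], labels)
```

Each pixel is finally assigned to the cluster in which it has the highest membership; on a real slice the brightest cluster would then be taken as the candidate tumor region, which is again an assumption of this sketch.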
Table 10 Comparison of our work with some relevant recent works

Reference            Features and classifier                   Dataset                             Reported performance
Mishra et al. [14]   Features: wavelet;                        REMBRANDT, 200 images               Accuracy: 98%
                     classifier: SVM
Gumei et al. [15]    Features: PCA-NGIST;                      3064 brain tumor MRI images         Accuracy: 94.23%
                     classifier: RELM
Amin et al. [17]     Features: shape, texture, and             Harvard dataset: 100; RIDER: 126;   Accuracy: 97.1; AUC: 0.98;
                     intensity; classifier: SVM                Local: 85                           specificity: 98.0
Abbasi et al. [23]   Features: LBP and HOG;                    BRATS 2013 (85 Flair MRI)           High accuracy rate
                     classifier: RF
Our work             Features: 7 methods for texture           BRATS 2017 (200 Flair, T1C and      Accuracy: 89.22; Sensitivity: 90.22;
                     feature extraction; classifier: SVM,      T2 MRI)                             Specificity: 90.06
                     KNN, BDT, RF, ensemble

Image preprocessing of the ground-truth images is required because they are in grayscale format. We have converted them to
binary format and then filled the holes in order to compare the ground-truth images with the segmented images. Figure 13
shows the application of preprocessing on a segmented ground-truth image.
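The two preprocessing steps named above (conversion to binary and hole filling) can be sketched as follows. This is only an illustration, assuming that background pixels of the ground-truth image are 0 and that any non-zero gray value belongs to the tumor; the functions `binarize` and `fill_holes` are hypothetical helpers, and the hole filling is a plain flood fill from the image border.

```python
from collections import deque


def binarize(gray, threshold=0):
    """Map a grayscale image to 0/1: pixels above `threshold` become foreground."""
    return [[1 if p > threshold else 0 for p in row] for row in gray]


def fill_holes(mask):
    """Set to 1 every background pixel that is not connected to the image border."""
    h, w = len(mask), len(mask[0])
    outside = [[False] * w for _ in range(h)]
    # Start from all background pixels lying on the border.
    queue = deque(
        (r, c)
        for r in range(h)
        for c in range(w)
        if (r in (0, h - 1) or c in (0, w - 1)) and mask[r][c] == 0
    )
    for r, c in queue:
        outside[r][c] = True
    # Flood fill through 4-connected background pixels.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] == 0 and not outside[nr][nc]:
                outside[nr][nc] = True
                queue.append((nr, nc))
    return [[0 if outside[r][c] else 1 for c in range(w)] for r in range(h)]


# Toy ground-truth patch with several gray levels and one enclosed hole.
ground_truth = [
    [0, 0, 0, 0, 0],
    [0, 2, 2, 2, 0],
    [0, 4, 0, 1, 0],
    [0, 2, 2, 2, 0],
    [0, 0, 0, 0, 0],
]
filled = fill_holes(binarize(ground_truth))
for row in filled:
    print(row)
```

After this step the enclosed pixel at the center belongs to the foreground, so the mask can be compared pixel by pixel with the binary segmentation output.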
Table 12 Comparison of our segmentation performance with other similar works

Reference               Method                                 Dataset                   Reported performance
Saxena et al. [47]      Fuzzy c-means                          Brain MRI (BraTS 2017)    Dice: 91
Alam et al. [48]        Hybrid of K-means and fuzzy c-means    Brain MRI                 Accuracy: 97.5
Duggirala et al. [46]   Hybrid of K-means and fuzzy c-means    Heart images              Error Index (SSE): 0.0024
Our work                Hybrid of K-means and fuzzy c-means    Brain MRI (BraTS 2017)    Dice: 90.16; Accuracy: 98.4

6.4 Comparison of segmented image and ground-truth image

JSI value, the better the similarity between the two sets. For two given sets P and Q, it can be expressed as:
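As a hedged illustration of such overlap measures, the sketch below uses the standard definitions of the Jaccard similarity index, |P ∩ Q| / |P ∪ Q|, and of the Dice score, 2|P ∩ Q| / (|P| + |Q|), where P and Q are the foreground pixel sets of the segmented and ground-truth masks. It is assumed here that the paper's expressions match these standard forms; the function name `overlap_scores` and the toy masks are hypothetical, and both masks are assumed to be non-empty.

```python
def overlap_scores(segmented, ground_truth):
    """Return (Jaccard, Dice) for two binary masks given as nested 0/1 lists."""
    p = {(r, c) for r, row in enumerate(segmented) for c, v in enumerate(row) if v}
    q = {(r, c) for r, row in enumerate(ground_truth) for c, v in enumerate(row) if v}
    intersection = len(p & q)
    jaccard = intersection / len(p | q)
    dice = 2 * intersection / (len(p) + len(q))
    return jaccard, dice


# Toy 3 x 3 masks: 4 foreground pixels each, 3 of them shared.
segmented = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
ground_truth = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]
jaccard, dice = overlap_scores(segmented, ground_truth)
print(f"Jaccard = {jaccard:.2f}, Dice = {dice:.2f}")
```

For these masks the intersection has 3 pixels and the union 5, giving a Jaccard index of 0.6 and a Dice score of 0.75; both measures equal 1 only when the two masks coincide.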
posed system, we compare our segmentation performance with other similar works, which is tabulated in Table 12.

As a further qualitative check, we superimposed the segmented image on the tumorous MR image. The segmented region coincides
with the tumorous region of the image, which indicates the correctness of our segmentation; this is depicted in Fig. 14 for a
randomly selected image.

7 Conclusion

This work is directed toward brain tumor analysis from MRI scans. We have classified T1C-, T2-, and FLAIR-weighted MR images
into tumorous and non-tumorous, and then segmented the tumorous images. The noteworthy accuracy of the procedure for tumor
classification and segmentation on the benchmark datasets indicates our method's efficacy.

3. Muhammad, S., Salman, K., Khan, M., Wanqing, W., Amin, U., Sung Wook, B.: Multi-grade brain tumor classification using deep cnn with extensive data augmentation. Journal of computational science 30, 174–182 (2019)
4. Suneetha, B., JhansiRani, A.: A survey on image processing techniques for brain tumor detection using magnetic resonance imaging. In: 2017 International Conference on Innovations in Green Energy and Healthcare Technologies (IGEHT), pp. 1–6. IEEE (2017)
5. Pei, L., Bakas, S., Vossough, A., Reza, S.M.S., Davatzikos, C., Iftekharuddin, K.M.: Longitudinal brain tumor segmentation prediction in mri using feature and label fusion. Biomedical Signal Processing and Control 55, 101648 (2020)
6. Scarapicchia, V., Brown, C., Mayo, C., Gawryluk, J.R.: Functional magnetic resonance imaging and functional near-infrared spectroscopy: insights from combined recording studies. Frontiers in human neuroscience 11, 419 (2017)
7. Asimakis, N.P., Karanasiou, I.S., Gkonis, P.K., Uzunoglu, N.K.: Theoretical analysis of a passive acoustic brain monitoring system. Progress in Electromagnetics Research 23, 165–180 (2010)
8. Chaturvedi, C.M., Singh, V.P., Singh, P., Basu, P., Singaravel, M.,
The observations prove that the texture features are enough Shukla, R.K., Dhawan, A., Pati, A.K., Gangwar, R.K., Singh, S.:
to differentiate between tumor tissues and non-tumorous tis- 2.45 ghz (cw) microwave irradiation alters circadian organization,
sues in the T1C-, T2-, and FLAIR-weighted scans. Moreover, spatial memory, dna structure in the brain cells and blood cell counts
clustering-based segmentation techniques are one of the best of male mice, mus musculus. Progr. Electromagn. Res. B 29, 23–42
(2011)
ways to segment tumorous regions. By keeping the positive 9. Lemieux, L., Hagemann, G., Krakow, K., Woermann, F.G.: Fast,
outcomes of our proposed methods, we can claim it on other accurate, and reproducible automatic segmentation of the brain in
benchmark datasets of brain tumors and compare the results. t1-weighted volume mri data. Magnetic Resonance in Medicine:
Also, we could extend our observation other than texture fea- An Official Journal of the International Society for Magnetic Res-
onance in Medicine 42(1), 127–135 (1999)
tures and strive to work on multifeatures and diverse datasets. 10. Tang, H., Wu, E.X., Ma, Q.Y., Gallagher, D., Perera, G.M.,
Zhuang, T.: MRI brain image segmentation by multi-resolution
Author Contributions First author is the main contributor and super- edge detection and region selection. Comput. Med. Imaging Gr.
vised by other authors. 24(6), 349–357 (2000)
11. Chen, V., Ruan, S.: Graph cut based segmentation of brain tumor
Funding This study is not funded by any organization or individual. from mri images. International Journal on Sciences and Techniques
of Automatic control & computer engineering 3(2), 1054–1063
Availability of Data and Material Yes. (2009)
12. Jara, H., Sakai, O., Mankal, P., Irving, R.P., Norbash, A.M.: Mul-
tispectral quantitative magnetic resonance imaging of brain iron
Declarations stores: a theoretical perspective. Topics in Magnetic Resonance
Imaging 17(1), 19–30 (2006)
Conflicts of interest All the authors declare that they have no conflict 13. Kabir, Y., Dojat, M., Scherrer, B., Forbes, F., Garbay, C.: Mul-
of interest. timodal MRI segmentation of ischemic stroke lesions. In: 2007
29th annual international conference of the IEEE engineering in
Ethics Approval This article does not contain any studies with human medicine and biology society, pp. 1595–1598 (2007)
participants or animals performed by any of the authors. 14. Mishra, S.K., Deepthi, VH.: Brain image classification by the
combination of different wavelet transforms and support vector
Consent to Participate Yes. machine classification. J. Am. Intell. Human Comput. 12(6), 6741–
6749 (2021)
Consent for Publication Yes. 15. Gumaei, A., Hassan, M.M., Rafiul Hassan, Md., Alelaiwi, A.: A
hybrid feature extractionmethod with regularized extreme learning
machine for brain tumor classification. IEEE Access 7, 36266–
References 36273 (2019) hybrid feature extractionmethod with regularized
extreme learning machine for brain tumor classification. IEEE
1. Tandel, Gopal S., Balestrieri, Antonella., Jujaray, Tanay., Khanna, Access 7, 36266–36273 (2019)
Narender N., Saba, Luca.,Suri, Jasjit S.: Multiclass magnetic 16. Mishra, Sonali, Majhi, Banshidhar, Sa, Pankaj Kumar: Texture
resonance imaging brain tumor classification using artificial intelli- feature based classification on microscopic blood smear for acute
gence paradigm. Computers in Biology and Medicine, 122:103804, lymphoblastic leukemia detection. Biomed. Signal Process. Con-
(2020) trol 47, 303–311 (2019)
2. Padma Nanthagopal, A., Sukanesh Rajamony, R.: Classification of 17. Amin, J., Sharif, M., Yasmin, M., Fernandes, S.L.: A distinctive
benign and malignant brain tumor ct images using wavelet tex- approach in brain tumor detection and classification using mri.
ture parameters and neural network classifier. J. Vis. 16(1), 19–28 Pattern Recognition Letters 139, 118–127 (2020)
(2013)
18. Bahadure, N.B., Ray, A.K., Thethi, H.P.: Image analysis for MRI based brain tumor detection and feature extraction using biologically inspired BWT and SVM. International Journal of Biomedical Imaging 2017 (2017)
19. Joseph, R.P., Singh, C.S., Manikandan, M.: Brain tumor MRI image segmentation and detection in image processing. International Journal of Research in Engineering and Technology 3(1), 1–5 (2014)
20. Marco, A., Salem, A.B.M.: An automatic classification of brain tumors through MRI using support vector machine. Egy. Comp. Sci. J. 40(3) (2016)
21. Ahmadvand, A., Kabiri, P.: Multispectral MRI image segmentation using Markov random field model. Signal Image Video Process 10(2), 251–258 (2016)
22. Islam, A., Reza, S.M.S., Iftekharuddin, K.M.: Multifractal texture estimation for detection and segmentation of brain tumors. IEEE Transactions on Biomedical Engineering 60(11), 3204–3215 (2013)
23. Abbasi, S., Tajeripour, F.: Detection of brain tumor in 3D MRI images using local binary patterns and histogram orientation gradient. Neurocomputing 219, 526–535 (2017)
24. Işın, A., Direkoğlu, C., Şah, M.: Review of MRI-based brain tumor image segmentation using deep learning methods. Procedia Computer Science 102, 317–324 (2016)
25. Arı, B., Şengür, A., Arı, A.: Local receptive fields extreme learning machine for apricot leaf recognition. In: International Conference on Artificial Intelligence and Data Processing (IDAP16), pp. 17–18 (2016)
26. Menze, B.H., Jakab, A., Bauer, S., Kalpathy-Cramer, J., Farahani, K., Kirby, J., Burren, Y., Porz, N., Slotboom, J., Wiest, R., et al.: The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans. Med. Imaging 34(10), 1993–2024 (2014)
27. Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J.S., Freymann, J.B., Farahani, K., Davatzikos, C.: Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data 4(1), 1–13 (2017)
28. Spyridon, B., Mauricio, R., Andras, J., Stefan, B., Markus, R., Alessandro, C., Russell, T.S., Christoph, B., Sung, M.H., Martin, R., et al.: Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge. arXiv preprint arXiv:1811.02629 (2018)
29. Tustison, N.J., Avants, B.B., Cook, P.A., Zheng, Y., Egan, A., Yushkevich, P.A., Gee, J.C.: N4ITK: improved N3 bias correction. IEEE Transactions on Medical Imaging 29(6), 1310–1320 (2010)
30. Humied, I.A., Abou-Chadi, F.E.Z., Rashad, M.Z.: A new combined technique for automatic contrast enhancement of digital images. Egypt. Inf. J. 13(1), 27–37 (2012)
31. Haralick, R.M., Shanmugam, K., Dinstein, I.: Textural features for image classification. IEEE Trans. Syst. Man Cybern. 6, 610–621 (1973)
32. Qurat-Ul-Ain, G.L., Kazmi, S.B., Jaffar, M.A., Mirza, A.M.: Classification and segmentation of brain tumor using texture analysis. In: Recent Advances in Artificial Intelligence, Knowledge Engineering and Data Bases, pp. 147–155 (2010)
33. Chu, A., Sehgal, C.M., Greenleaf, J.F.: Use of gray value distribution of run lengths for texture analysis. Pattern Recognit. Lett. 11(6), 415–419 (1990)
34. Tian, S., Bhattacharya, U., Lu, S., Su, B., Wang, Q., Wei, X., Lu, Y., Tan, C.L.: Multilingual scene character recognition with co-occurrence of histogram of oriented gradients. Pattern Recognition 51, 125–134 (2016)
35. Prakasa, E.: Texture feature extraction by using local binary pattern. INKOM J. 9(2), 45–48 (2016)
36. Al-Janobi, A.: Performance evaluation of cross-diagonal texture matrix method of texture analysis. Pattern Recognit. 34(1), 171–180 (2001)
37. He, D.-C., Wang, L.: Simplified texture spectrum for texture analysis. Journal of Communication and Computer 7(8), 44–53 (2010)
38. Tandel, G.S., Biswas, M., Kakde, O.G., Tiwari, A., Suri, H.S., Turk, M., Laird, J.R., Asare, C.K., Ankrah, A.A., Khanna, N.N., et al.: A review on a deep learning perspective in brain cancer classification. Cancers 11(1), 111 (2019)
39. Braun, A.C., Weidner, U., Hinz, S.: Classification in high-dimensional feature spaces-assessment using SVM, IVM and RVM with focus on simulated EnMAP data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 5(2), 436–443 (2012)
40. Tan, S.: An effective refinement strategy for KNN text classifier. Exp. Syst. Appl. 30(2), 290–298 (2006)
41. Ho, T.K.: A data complexity analysis of comparative advantages of decision forest constructors. Pattern Anal. Appl. 5(2), 102–112 (2002)
42. Guns, R., Rousseau, R.: Recommending research collaborations using link prediction and random forest classifiers. Scientometrics 101(2), 1461–1473 (2014)
43. Oza, N.C., Tumer, K.: Classifier ensembles: Select real-world applications. Information Fusion 9(1), 4–20 (2008)
44. Dhanalakshmi, P., Kanimozhi, T.: Automatic segmentation of brain tumor using k-means clustering and its area calculation. International Journal of Advanced Electrical and Electronics Engineering 2(2), 130–134 (2013)
45. Kalema, K.A., Bukenya, F., Rose, A.A.: A review and analysis of fuzzy-c means clustering techniques. Int J Sci Eng Res 5(11), 1072–7 (2014)
46. Raja, K.D.: Segmenting images using hybridization of k-means and fuzzy c-means algorithms. In: Introduction to Data Science and Machine Learning, IntechOpen (2019)
47. Sanjay, S., Suraj, S.: Brain tumor segmentation by texture feature extraction with the parallel implementation of fuzzy c-means using CUDA on GPU. In: 2018 5th International Conference on Parallel, Distributed and Grid Computing (PDGC), pp. 580–585. IEEE (2018)
48. Alam, M.S., Rahman, M.M., Hossain, M.A., Islam, M.K., Ahmed, K.M., Ahmed, K.T., Singh, B.C., Sipon, M.M.: Automatic human brain tumor detection in MRI image using template-based k means and improved fuzzy c means clustering algorithm. Big Data Cognit Comput 3(2), 27 (2019)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Biswajit Jena received the B.Tech. degree in Computer Science and Engineering from the Biju Pattnaik University of Technology (BPUT), Rourkela, India, and the M.Tech. degree in Computer Science and Engineering with Information Security as specialization from National Institute of Technology (NIT), Rourkela, India, and is currently working toward the Ph.D. degree from International Institute of Information Technology (IIIT), Bhubaneswar, India. His research interests are in the field of Medical Image Processing, Machine Learning, and Deep Learning.