sustainability-15-02754-v2
Article
A Comparative Analysis of Machine Learning Models: A Case
Study in Predicting Chronic Kidney Disease
Hasnain Iftikhar 1,2 , Murad Khan 3 , Zardad Khan 3 , Faridoon Khan 4 , Huda M Alshanbari 5, *
and Zubair Ahmad 2
Abstract: In the modern world, chronic kidney disease is one of the most severe diseases that nega-
tively affects human life. It is becoming a growing problem in both developed and underdeveloped
countries. An accurate and timely diagnosis of chronic kidney disease is vital in preventing and
treating kidney failure. The diagnosis of chronic kidney disease through history has been considered
unreliable in many respects. To classify healthy people and people with chronic kidney disease,
non-invasive methods like machine learning models are reliable and efficient. In our current work, we
predict chronic kidney disease using different machine learning models, including logistic, probit, ran-
dom forest, decision tree, k-nearest neighbor, and support vector machine with four kernel functions
(linear, Laplacian, Bessel, and radial basis kernels). The dataset is a record taken as a case–control
study containing chronic kidney disease patients from district Buner, Khyber Pakhtunkhwa, Pakistan.
To compare the models in terms of classification and accuracy, we calculated different performance
measures, including accuracy, Brier score, sensitivity, Youdent, specificity, and F1 score. The Diebold
and Mariano test of comparable prediction accuracy was also conducted to determine whether there
is a substantial difference in the accuracy measures of different predictive models. As confirmed
by the results, the support vector machine with the Laplace kernel function outperforms all other
models, while the random forest is competitive.

Keywords: chronic kidney disease; machine learning models; comparative analysis; predictions

Citation: Iftikhar, H.; Khan, M.; Khan, Z.; Khan, F.; Alshanbari, H.M.; Ahmad, Z. A Comparative Analysis of Machine Learning Models: A Case Study in Predicting Chronic Kidney Disease. Sustainability 2023, 15, 2754. https://doi.org/10.3390/su15032754
1. For the first time, we included primary data from CKD patients in district Buner, Khyber
Pakhtunkhwa, Pakistan, to motivate developing countries to implement machine
learning algorithms to reliably and efficiently classify healthy people and people with
chronic kidney disease;
2. To assess the consistency of the considered ML models, three different scenarios of
training and testing sets were adopted: (a) 90% training, 10% testing; (b) 75% training,
25% testing; and (c) 50% training, 50% testing. Additionally, within each validation
scenario, the simulation was run one thousand times to test the models' consistency;
3. The prominent machine learning models were used for the comparison of predicting
CKD, including logistic, probit, random forest, decision tree, k-nearest neighbor, and
support vector machine with four kernel functions (linear, Laplacian, Bessel, and
radial basis kernels);
4. The performance of the models is evaluated using the six performance measures,
including accuracy, Brier score, sensitivity, Youdent, specificity, and F1 score. More-
over, to assess the significance of the differences in the prediction performance of the
models, the Diebold and Mariano test was performed.
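The validation protocol in point 2 can be sketched in a few lines of Python. This is an illustrative sketch only: scikit-learn is assumed, the dataset below is synthetic (the study's data are not public), and the model and repetition count are placeholders.

```python
# Illustrative sketch of the three-scenario repeated train/test protocol;
# scikit-learn is assumed, and the data here are synthetic placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)  # toy binary labels

scenarios = {"90/10": 0.10, "75/25": 0.25, "50/50": 0.50}
n_repeats = 50  # the paper repeats each scenario one thousand times

for name, test_size in scenarios.items():
    accs = []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed)
        model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        accs.append(accuracy_score(y_te, model.predict(X_te)))
    print(name, "mean accuracy:", round(float(np.mean(accs)), 3))
```

Averaging each measure over the repeated random splits, as above, is what makes the per-scenario comparison stable rather than dependent on one lucky split.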
The rest of the article is organized as follows: Section 2 contains materials and methods,
Section 3 contains results and discussion, and Section 4 contains a conclusion.
n = z²pq / m²    (1)
Here, n is the sample size, z is the statistic for the chosen level of confidence, p is the
expected prevalence proportion of CKD patients, q = 1 − p, and m is the precision with
a corresponding effect size. For the calculation of the expected sample size, we have
assumed that 270 patients had CKD and 230 did not have CKD, and calculated the expected
prevalence proportion p = 0.54, q = 0.46, z = 1.96 (95% confidence interval), and m = 0.05;
some authors [22] recommended using 5% precision if the expected prevalence proportion
lies between 10% and 90%. After putting the values of p, z, and m in Equation (1), the
approximated sample of size n = 382 was produced, which is further used for analysis in
this research.
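The sample-size calculation above can be reproduced in a few lines; the function name below is our own, but the formula and values are those stated in the text.

```python
import math

def sample_size(p, z=1.96, m=0.05):
    """Sample size for a prevalence estimate: n = z^2 * p * q / m^2 (Equation (1))."""
    q = 1.0 - p
    return math.ceil(z**2 * p * q / m**2)

# Values reported in the text: p = 0.54 (270 of 500 with CKD), 95% confidence, 5% precision.
n = sample_size(p=0.54, z=1.96, m=0.05)
print(n)  # 382
```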
multiple variable model, known as multiple logistic regression (MLR). The mathematical
equation for MLR is
Pr(Z) = exp(α0 + α1 z1 + α2 z2 + ... + αk zk) / (1 + exp(α0 + α1 z1 + α2 z2 + ... + αk zk))    (2)
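Equation (2) is the logistic sigmoid applied to a linear predictor. A minimal sketch (the coefficient values used below are illustrative, not fitted to the study's data):

```python
import math

def mlr_probability(alphas, z):
    """Pr(Z) from Equation (2): the logistic sigmoid of the linear predictor
    a0 + a1*z1 + ... + ak*zk. `alphas` = (a0, a1, ..., ak); `z` = (z1, ..., zk)."""
    eta = alphas[0] + sum(a * x for a, x in zip(alphas[1:], z))
    return math.exp(eta) / (1.0 + math.exp(eta))

# When the linear predictor is zero, the probability is exactly 0.5.
p = mlr_probability((0.0, 1.0), (0.0,))
print(p)  # 0.5
```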
K(zm , zn ) = exp(−‖zm − zn‖ / σ); σ > 0    (4)
Bessel kernel: The Bessel function is well-known in the theory of kernel function
spaces of fractional smoothness, which is given as
K(zm , zn ) = B_{v+1}(σ‖zm − zn‖) / ‖zm − zn‖^(−n(v+1))    (5)
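The four kernels compared in this study can be sketched as follows. NumPy/SciPy are assumed; the parameter names, the RBF parameterization, and the zero-distance convention for the Bessel kernel are our own choices, following the common kernlab-style forms rather than anything stated in the paper.

```python
import numpy as np
from scipy.special import jv  # Bessel function of the first kind

def linear_kernel(zm, zn):
    return float(np.dot(zm, zn))

def rbf_kernel(zm, zn, sigma=1.0):
    # Radial basis (Gaussian) kernel; sigma is the bandwidth.
    return float(np.exp(-np.sum((zm - zn) ** 2) / (2 * sigma**2)))

def laplacian_kernel(zm, zn, sigma=1.0):
    # Equation (4): K(zm, zn) = exp(-||zm - zn|| / sigma), sigma > 0.
    return float(np.exp(-np.linalg.norm(zm - zn) / sigma))

def bessel_kernel(zm, zn, sigma=1.0, v=0, n=1):
    # Equation (5): K(zm, zn) = B_{v+1}(sigma * ||zm - zn||) / ||zm - zn||^(-n(v+1)).
    d = np.linalg.norm(zm - zn)
    if d == 0.0:
        return 1.0  # convention chosen here for identical inputs
    return float(jv(v + 1, sigma * d) / d ** (-n * (v + 1)))
```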
2.3.1. Accuracy
Accuracy is defined as the proportion of data items that are correctly classified, meaning
that the predictions of the data points are close to the actual values. The mathematical
equation can be described as
Accuracy = (TP + TN) / (TP + FP + TN + FN)    (7)
Here, the r-th predicted probability in the c-th category is compared with the corresponding
observed binary outcome in that category. A smaller BS indicates that the method is
consistent and accurate.
2.3.3. Sensitivity
The measure of true positive observations that are properly identified as positive is
called sensitivity. The mathematical formula is
Sensitivity = TP / (TP + FN)    (9)
2.3.4. Youdent
Youdent can be generally described as Youdent = max_a (Sensitivity(a) + Specificity(a) − 1).
The cut-point that attains this maximum is described as the optimum cut-point (a*), since it
is the cut-point that maximizes the biomarker's distinguishing capability when equal
weight is given to specificity and sensitivity.
2.3.5. Specificity
Specificity is the proportion of true negative observations that are correctly identified as
negative. The mathematical formula is
Specificity = TN / (TN + FP)    (10)
2.3.6. F1 Score
The F1 score combines precision and recall so as to maintain a balance between them.
The mathematical function for computing the F1 score is given by

F1-score = 2 × Precision × Recall / (Precision + Recall)    (11)

The F1 score confirms the goodness of classifiers in terms of precision and recall.
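The confusion-matrix-based measures above (Equations (7), (9), (10), and (11), plus Youdent and the error rate) can be computed together. The function below is an illustrative sketch; the Brier score is omitted because it requires predicted probabilities rather than counts.

```python
def classification_measures(tp, fp, tn, fn):
    """Performance measures used in the paper, from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)       # Equation (7)
    sensitivity = tp / (tp + fn)                      # Equation (9); also recall
    specificity = tn / (tn + fp)                      # Equation (10)
    youden = sensitivity + specificity - 1            # Youdent at this cut-point
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # Equation (11)
    error = 1 - accuracy
    return dict(accuracy=accuracy, sensitivity=sensitivity,
                specificity=specificity, youden=youden, f1=f1, error=error)

# Example with illustrative counts (not from the study's data).
m = classification_measures(tp=90, fp=10, tn=80, fn=20)
print(round(m["accuracy"], 3))  # 0.85
```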
Figure 1. Performance measure plots (90% training, 10% testing) of the models.
Table 2. First scenario (90%, 10%): different performance measures in columns and predictive models in rows.
Models     Accuracy   Specificity   Sensitivity   Youdent   Brier Score   Error    F Score
Logistic   0.8945     0.8736        0.9073        0.7809    0.0699        0.1055   0.9135
Probit     0.8942     0.8736        0.9073        0.7809    0.0686        0.1058   0.9139
D-Tree     0.8839     0.8411        0.9096        0.7507    0.0953        0.1161   0.8985
KNN        0.6309     0.4826        0.7209        0.2035    0.2457        0.3691   0.7800
SVM-RB     0.8995     0.8794        0.9127        0.7921    0.0671        0.1005   0.9202
SVM-L      0.8978     0.9219        0.8846        0.8065    0.0644        0.1022   0.9108
SVM-LAP    0.9171     0.8671        0.9484        0.8155    0.0643        0.0829   0.9319
SVM-B      0.8961     0.8751        0.9096        0.7846    0.0672        0.1039   0.9175
RF         0.9129     0.8808        0.9330        0.8138    0.0652        0.0871   0.9270
In the second scenario, the training dataset is 75% and testing is 25%. The results
are shown in Table 3. In Table 3, it is evident again that SVM-Lap and RF led to a better
prediction than the competitor predictive models. The best predictive model produced
0.9135, 0.8643, 0.9468, 0.8111, 0.0663, 0.0865, and 0.9328 for mean accuracy, specificity,
sensitivity, Youdent, Brier score, error, and F1 score, respectively. RF again produced the
second best results. The superiority of the model can be observed in the figures. For
example, Figure 2 shows the mean accuracy, specificity, sensitivity, Youdent, Brier score,
and F1 score of all models. As seen from all figures, SVM-Lap produces superior results
compared to the rest of the models. RF is competitive, while KNN shows the worst
results compared to the rest.
Figure 2. Performance measure plots (50% training, 50% testing) of the models.
Table 3. Second scenario (75%, 25%): different performance measures in columns and predictive models in rows.
Models     Accuracy   Specificity   Sensitivity   Youdent   Brier Score   Error    F Score
Logistic   0.8923     0.8736        0.9073        0.7809    0.0760        0.1077   0.9135
Probit     0.8927     0.8686        0.9088        0.7774    0.0742        0.1073   0.9137
D-Tree     0.8722     0.8297        0.9072        0.7369    0.1040        0.1278   0.9024
KNN        0.6809     0.5811        0.7423        0.3233    0.1933        0.3191   0.7800
SVM-RB     0.9005     0.8761        0.9151        0.7913    0.0686        0.0995   0.9191
SVM-L      0.8918     0.9143        0.8841        0.7984    0.0673        0.1082   0.9125
SVM-LAP    0.9135     0.8643        0.9468        0.8111    0.0663        0.0865   0.9328
SVM-B      0.8972     0.8729        0.9115        0.7845    0.0691        0.1028   0.9163
RF         0.9084     0.8764        0.9318        0.8082    0.0672        0.0916   0.9284
Finally, in the third scenario, the training dataset is 50% and the testing is 50%. The
outcomes are listed in Table 4. As seen in Table 4, it is evident again that SVM-Lap and
RF have a better prediction. This time, the best predictive model produced 0.9052, 0.8585,
0.9449, 0.8034, 0.0701, 0.0948, and 0.9304 for mean accuracy, specificity, sensitivity,
Youdent, Brier score, error, and F1 score, respectively. RF again produced the second best
results. The superiority of the model is shown in the figures. For example, Figure 3 shows
the mean accuracy, specificity, sensitivity, Youdent, Brier score, and F1 score of all models.
It conforms to all other figures; SVM-Lap produces the lowest error. Although RF is
competitive, KNN shows the worst results compared to all the models.
Once the accuracy measures were computed, the superiority of these results was
evaluated. For this purpose, many researchers have previously performed the Diebold and
Mariano (DM) test [28–30]. In this study, to verify the superiority of the predictive model
results (accuracy measures) listed in Tables 2–4, we used the DM test [31]. The DM test is
the most widely used statistical test for comparing predictions acquired from different models.
The DM test for identical prediction accuracy was performed for pairs of models. The
results (p-values) of the DM test are listed in Table 5; each entry in the table shows the
p-value of a hypothesis test, where the null hypothesis supposes no difference in the
accuracy of the predictor in the column or row, against the research hypothesis that the
predictor in the column is more accurate than the predictor in the row. In Table 5,
we observed that the prediction accuracy of the best model is not statistically
different from all other models except for KNN.
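A minimal sketch of the DM test under squared loss is given below, using the large-sample normal approximation. This simplified version ignores the autocorrelation correction that a full implementation would include, so it should be read as an illustration of the test's logic, not as the paper's implementation.

```python
import math

def diebold_mariano(errors_a, errors_b):
    """Sketch of the DM test under squared loss. H0: equal prediction accuracy.
    A small one-sided p-value favours model A (smaller squared errors)."""
    d = [ea**2 - eb**2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    dm_stat = mean_d / math.sqrt(var_d / n)
    p_value = 0.5 * (1.0 + math.erf(dm_stat / math.sqrt(2.0)))  # Phi(dm_stat)
    return dm_stat, p_value

# Model A's errors are consistently smaller, so dm_stat is negative and the
# p-value is small, rejecting equal accuracy in favour of model A.
dm_stat, p = diebold_mariano([0.1, 0.2] * 10, [0.5, 0.6] * 10)
print(round(dm_stat, 2), p < 0.01)
```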
Figure 3. Performance measure plots (75% training, 25% testing) of the models.
Table 4. Third scenario (50%, 50%): different performance measures in columns and predictive models in rows.

Models     Accuracy   Specificity   Sensitivity   Youdent   Brier Score   Error    F Score
Logistic   0.8858     0.8736        0.9073        0.7809    0.0929        0.1142   0.9135
Probit     0.8873     0.8628        0.9090        0.7718    0.0900        0.1127   0.9125
D-Tree     0.8457     0.8021        0.9065        0.7086    0.1207        0.1543   0.8950
KNN        0.7145     0.6492        0.7553        0.4045    0.1554        0.2855   0.7685
SVM-RB     0.9001     0.8712        0.9182        0.7893    0.0705        0.0999   0.9196
SVM-L      0.8878     0.9077        0.8843        0.7920    0.0720        0.1122   0.9110
SVM-LAP    0.9052     0.8585        0.9449        0.8034    0.0701        0.0948   0.9304
SVM-B      0.8978     0.8692        0.9144        0.7836    0.0714        0.1022   0.9171
RF         0.9021     0.8694        0.9314        0.8008    0.0712        0.0979   0.9264
Figure 4. (a) Error box-plots (90% training, 10% testing) of the models; (b) Error box-plots (50% training, 50% testing) of the models; (c) Error box-plots (75% training, 25% testing) of the models.
4. Conclusions

In this research, we attempted a comparative analysis of different machine learning
methods using the CKD data of district Buner, Khyber Pakhtunkhwa, Pakistan. The dataset
consists of records collected as part of a case–control study involving patients with CKD
from the entire Buner district. For the training and testing prediction, we considered three
different scenarios, including 50%, 75%, and 90%. For the comparison of the models
in terms of classification, we calculated various performance measures, i.e., accuracy, Brier
score, sensitivity, Youdent, specificity, and F1 score. The results indicate that the SVM-LAP
model outperforms other models in all three scenarios, while the RF model is competitive.
Additionally, the DM test was used to ensure the superiority of predictive model accuracy
measures. The result (DM test p-values) shows the best performance of all used methods in
the prediction of chronic kidney disease, excluding the KNN method.

This study can be further extended in future research projects in medical sciences by,
for example, predicting the effectiveness of a particular medicine in specific diseases. Furthermore,
the reported best ML models in this work can be used to predict other conditions,
such as heart disease, cancer, and tuberculosis. Moreover, a novel hybrid system for the
same dataset will be proposed to obtain more accurate and efficient prediction results.
Author Contributions: H.I. conceptualized and analyzed the data, and prepared the figures and draft; M.K. collected the data, designed the methodology, performed the computation work, prepared the tables and graphs, and authored the draft; Z.K. supervised, authored, reviewed, and approved the final draft; F.K., H.M.A. and Z.A. authored, reviewed, administered, and approved the final draft. All authors have read and agreed to the published version of the manuscript.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R 299), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: The research data will be provided upon request to the first author.

Acknowledgments: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R 299), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Yan, M.T.; Chao, C.T.; Lin, S.H. Chronic kidney disease: Strategies to retard progression. Int. J. Mol. Sci. 2021, 22, 10084. [CrossRef]
[PubMed]
2. Lozano, R.; Naghavi, M.; Foreman, K.; Lim, S.; Shibuya, K.; Aboyans, V.; Abraham, J.; Adair, T.; Aggarwal, R.; Ahn, S.Y.; et al.
Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: A systematic analysis for the Global
Burden of Disease Study 2010. Lancet 2012, 380, 2095–2128. [CrossRef]
3. Jha, V.; Garcia-Garcia, G.; Iseki, K.; Li, Z.; Naicker, S.; Plattner, B.; Saran, R.; Wang, A.Y.M.; Yang, C.W. Chronic kidney disease:
Global dimension and perspectives. Lancet 2013, 382, 260–272. [CrossRef] [PubMed]
4. Eckardt, K.U.; Coresh, J.; Devuyst, O.; Johnson, R.J.; Köttgen, A.; Levey, A.S.; Levin, A. Evolving importance of kidney disease:
From subspecialty to global health burden. Lancet 2013, 382, 158–169. [CrossRef]
5. Rapa, S.F.; Di Iorio, B.R.; Campiglia, P.; Heidland, A.; Marzocco, S. Inflammation and oxidative stress in chronic kidney
disease—Potential therapeutic role of minerals, vitamins and plant-derived metabolites. Int. J. Mol. Sci. 2019, 21, 263. [CrossRef]
6. Jayasumana, C.; Gunatilake, S.; Senanayake, P. Glyphosate, hard water and nephrotoxic metals: Are they the culprits behind
the epidemic of chronic kidney disease of unknown etiology in Sri Lanka? Int. J. Environ. Res. Public Health 2014, 11, 2125–2147.
[CrossRef] [PubMed]
7. Mubarik, S.; Malik, S.S.; Mubarak, R.; Gilani, M.; Masood, N. Hypertension associated risk factors in Pakistan: A multifactorial
case-control study. J. Pak. Med. Assoc. 2019, 69, 1070–1073.
8. Naqvi, A.A.; Hassali, M.A.; Aftab, M.T. Epidemiology of rheumatoid arthritis, clinical aspects and socio-economic determinants
in Pakistani patients: A systematic review and meta-analysis. JPMA J. Pak. Med. Assoc. 2019, 69, 389–398.
9. Hsu, R.K.; Powe, N.R. Recent trends in the prevalence of chronic kidney disease: Not the same old song. Curr. Opin. Nephrol.
Hypertens. 2017, 26, 187–196. [CrossRef]
10. Salazar, L.H.A.; Leithardt, V.R.; Parreira, W.D.; da Rocha Fernandes, A.M.; Barbosa, J.L.V.; Correia, S.D. Application of machine
learning techniques to predict a patient’s no-show in the healthcare sector. Future Internet 2022, 14, 3. [CrossRef]
11. Elsheikh, A.H.; Saba, A.I.; Panchal, H.; Shanmugan, S.; Alsaleh, N.A.; Ahmadein, M. Artificial intelligence for forecasting the
prevalence of COVID-19 pandemic: An overview. Healthcare 2021, 9, 1614. [CrossRef] [PubMed]
12. Khamparia, A.; Pandey, B. A novel integrated principal component analysis and support vector machines-based diagnostic
system for detection of chronic kidney disease. Int. J. Data Anal. Tech. Strateg. 2020, 12, 99–113. [CrossRef]
13. Zhao, Y.; Zhang, Y. Comparison of decision tree methods for finding active objects. Adv. Space Res. 2008, 41, 1955–1959. [CrossRef]
14. Vijayarani, S.; Dhayanand, S.; Phil, M. Kidney disease prediction using SVM and ANN algorithms. Int. J. Comput. Bus. Res.
(IJCBR) 2015, 6, 1–12.
15. Dritsas, E.; Trigka, M. Machine learning techniques for chronic kidney disease risk prediction. Big Data Cogn. Comput. 2022, 6, 98.
[CrossRef]
16. Wickramasinghe, M.P.N.M.; Perera, D.M.; Kahandawaarachchi, K.A.D.C.P. Dietary prediction for patients with Chronic Kidney Disease (CKD) by considering blood potassium level using machine learning algorithms. In Proceedings of the 2017 IEEE Life Sciences Conference (LSC), Sydney, Australia, 13–15 December 2017; IEEE: Piscataway, NJ, USA, 2018; pp. 300–303.
17. Gupta, A.; Eysenbach, B.; Finn, C.; Levine, S. Unsupervised meta-learning for reinforcement learning. arXiv 2018, arXiv:1806.04640.
18. Lakshmi, K.; Nagesh, Y.; Krishna, M.V. Performance comparison of three data mining techniques for predicting kidney dialysis
survivability. Int. J. Adv. Eng. Technol. 2014, 7, 242.
19. Zhang, H.; Hung, C.L.; Chu, W.C.C.; Chiu, P.F.; Tang, C.Y. Chronic kidney disease survival prediction with artificial neural
networks. In Proceedings of the 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Madrid, Spain,
3–6 December 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1351–1356.
20. Kavakiotis, I.; Tsave, O.; Salifoglou, A.; Maglaveras, N.; Vlahavas, I.; Chouvarda, I. Machine learning and data mining methods in
diabetes research. Comput. Struct. Biotechnol. J. 2017, 15, 104–116. [CrossRef]
21. Singh, V.; Asari, V.K.; Rajasekaran, R. A Deep Neural Network for Early Detection and Prediction of Chronic Kidney Disease.
Diagnostics 2022, 12, 116. [CrossRef]
22. Pourhoseingholi, M.A.; Vahedi, M.; Rahimzadeh, M. Sample size calculation in medical studies. Gastroenterol. Hepatol. Bed Bench
2013, 6, 14.
23. Naing, L.; Winn TB, N.R.; Rusli, B.N. Practical issues in calculating the sample size for prevalence studies. Arch. Orofac. Sci.
2006, 1, 9–14.
24. Nhu, V.H.; Shirzadi, A.; Shahabi, H.; Singh, S.K.; Al-Ansari, N.; Clague, J.J.; Jaafari, A.; Chen, W.; Miraki, S.; Dou, J.; et al. Shallow
landslide susceptibility mapping: A comparison between logistic model tree, logistic regression, naïve bayes tree, artificial neural
network, and support vector machine algorithms. Int. J. Environ. Res. Public Health 2020, 17, 2749. [CrossRef] [PubMed]
25. Joachims, T. Making large-scale svm learning. In Practical Advances in Kernel Methods-Support Vector Learning; MIT Press:
Cambridge, MA, USA, 1999.
26. Criminisi, A.; Shotton, J.; Konukoglu, E. Decision forests: A unified framework for classification, regression, density estimation,
manifold learning and semi-supervised learning. Found. Trends Comput. Graph. Vis. 2012, 7, 81–227. [CrossRef]
27. Tyralis, H.; Papacharalampous, G.; Langousis, A. A brief review of random forests for water scientists and practitioners and their
recent history in water resources. Water 2019, 11, 910. [CrossRef]
28. Shah, I.; Iftikhar, H.; Ali, S.; Wang, D. Short-term electricity demand forecasting using components estimation technique. Energies
2019, 12, 2532. [CrossRef]
29. Shah, I.; Iftikhar, H.; Ali, S. Modeling and forecasting medium-term electricity consumption using component estimation
technique. Forecasting 2020, 2, 163–179. [CrossRef]
30. Shah, I.; Iftikhar, H.; Ali, S. Modeling and forecasting electricity demand and prices: A comparison of alternative approaches.
J. Math. 2022, 2022. [CrossRef]
31. Diebold, F.X.; Mariano, R.S. Comparing predictive accuracy. J. Bus. Econ. Stat. 2002, 20, 134–144. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.