A Survey on Classification Problem Using ID3, CART, and C4.5
Decision Tree Algorithms
Abdiwak T.1, Alemayehu Sh.1, Ermias N.1, Senedu G.1, Yonas G.1
1AI SIG Lab, Department of Computer Science and Engineering, Adama Science and Technology University, ETHIOPIA
[email protected]
ABSTRACT
Decision tree learning is one of the predictive modeling approaches used in statistics, machine learning, and artificial intelligence. This paper offers an empirical study on the use of decision tree algorithms on the Iris dataset for flower classification. We performed experiments to recommend which DT algorithm performs well on classification problems in terms of accuracy. Among the different DT algorithms, we chose three frequently used ones: ID3, CART, and C4.5. ID3 stands for Iterative Dichotomiser 3. It is a decision tree algorithm first introduced by Ross Quinlan in 1986. It generates a decision tree from the data set in a top-down manner using selected attributes of the dataset. C4.5 is an improved version of the ID3 algorithm. The major improvements include processing both numeric and discrete data, handling missing attribute values, and producing easily interpreted rules; it is also fast compared with other algorithms because it uses the main memory of the computer. CART stands for Classification and Regression Trees. It constructs binary trees, which means each internal node has exactly two outgoing edges. The twoing criterion is used for selecting splits, and cost-complexity pruning is used to prune the resulting tree.
To evaluate the performance of the selected algorithms, we used the Iris flower dataset, which has four independent numerical attributes and one dependent attribute. The results show that the ID3 algorithm outperformed the others on the accuracy measure.
Contents
1. Introduction
2. Literature Survey
3. Methodology
   4.1. Decision Tree Classification Algorithms
      4.1.1. CHAID
      4.1.2. CART
      4.1.3. ID3
      4.1.4. C4.5
5. Results and Discussion
6. Conclusion
7. Future Work
8. References
1. Introduction

The development and application of "machine learning" and "artificial intelligence" systems have become popular within the last decade. Both terms are used frequently in science and the media, sometimes interchangeably, sometimes with different meanings [1]. Since artificial intelligence (AI) was first recognized as a discipline in the mid-1950s, machine learning (ML) has been a central research area within it. Two reasons can be given as evidence that machine learning is a subset of artificial intelligence: the ability to learn from experience is one characteristic of intelligent behavior, as is acting like a human being, so any attempt to understand intelligence as a phenomenon must include an understanding of learning from past experience [2]. Machine learning enables machines to learn from data without being explicitly programmed.

Machine learning algorithms are categorized as supervised, unsupervised, and reinforcement learning. In this paper we summarize one type of supervised learning algorithm called the decision tree. A decision tree (DT) is a supervised learning algorithm with a structure that includes a root node, branches, and leaf nodes. Each internal node denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf node holds a class label [3].

The decision tree algorithms discussed in this paper are ID3 (Iterative Dichotomiser 3), C4.5 (the successor of ID3), CART (Classification and Regression Trees), and CHAID (Chi-squared Automatic Interaction Detector). The research questions addressed in this paper are:

• What is the difference among DT algorithms?
• Which DT algorithm performs well for classification problems on datasets similar to Iris?

Section 1 introduces artificial intelligence, machine learning, and decision trees. Section 2 presents the literature survey, Section 3 explains the methodology and the decision tree algorithms studied, Section 5 presents the results and discussion, Section 6 concludes this work, and Section 7 outlines future work.
2. Literature Survey

Marina Milanović et al. [4] present the conceptual characteristics of the decision tree, an important data mining method which, due to its explorative nature, is exceptionally suitable for detecting the structure of data when analyzing various problem situations. They also demonstrate the applicative characteristics of this method using the CHAID algorithm in leadership studies in the empirical part of their analysis: the interdependence of selected personal characteristics and the manager's leadership style is investigated. Finally, they developed a classification model for the identification of the dominant leadership style. The study was conducted on a sample of 417 managers of privately owned small-sized enterprises in Serbia, using a specially designed questionnaire. As predictors of the dominant leadership style, the classification model identified a set of six statistically significant personal characteristics.

Gilbert Ritschard [5] discussed the origin of tree methods. He surveyed earlier methods that led to CHAID and then explained in detail how CHAID functions, especially the differences between the original method as described by Kass and the currently implemented extension that was proposed by Biggs et al. (1991).

Brian Miller et al. [7] used a CHAID decision tree to detect early metabolic syndrome (MetS) in young adults. A user-specified CHAID model was compared to both a CHAID model with no user-specified first level and logistic regression-based models. The analysis identified waist circumference as a strong predictor in the MetS diagnosis. The accuracy of the final model they built, with waist circumference user-specified as the first level, was 92.3%, with an ability to detect MetS of 71.8%, which outperformed the comparison models. The researchers concluded that these preliminary findings suggest that young adults at risk for MetS could be identified for further follow-up based on their waist circumference, and that the approach shows promise for the development of a preliminary detection algorithm for MetS.
Flora M. Díaz-Pérez et al. [8] note that studies of the segmentation of tourism markets have traditionally been undertaken with regression methods. The need for a significant number of segments and qualifying variables has led, however, to the use of other procedures of multivariate analysis. CHAID (Chi-square Automatic Interaction Detection), which is more complex than other multivariate techniques, has rarely been used. Their study applies the traditional methods of multivariate analysis and CHAID to the same population of tourists visiting a particular destination, to compare the quality of the information obtained on tourism market segmentation. The results suggest that the analysis based on CHAID matches the nature of the problem studied better than that provided by discriminant analysis.

Yishan Li et al. [9] explain that the main objective of their research is to develop a predictive model for ship collision risk using classification and regression trees (CARTs) under different situations. The proposed CART prediction model is better than existing ship collision risk prediction models in terms of prediction accuracy and prediction speed when the feature dimension is low and the sample size is small. The model is produced by combining a fuzzy comprehensive evaluation method, used to evaluate the risk of the collected samples, with a collision risk identification library that contains information based on expert collision avoidance. Six primary factors affecting ship collision risk are considered as the input of the CART model, and the degree of risk calculated using the fuzzy comprehensive evaluation method serves as the actual output when constructing the CART-based collision risk model. A total of 100 investigation reports for collision accidents during 2006–2015 were collected, which were published by Chinese provincial authorities and local maritime bureaus on their official websites. The experimental results show that the proposed model is comprehensively superior to the other models in terms of collision risk recognition accuracy and prediction speed.

Leandro Pecchia et al. [10] proposed a platform to enhance the effectiveness and efficiency of home monitoring, using data mining for early detection of any worsening in a patient's condition. The proposed data mining model, based on Classification and Regression Trees (CART), briefly describes the Remote Health Monitoring (RHM) platform they designed and realized, which supports Heart Failure (HF) severity assessment. Preliminary results of classifiers for HF severity detection are presented, which are innovative in comparison to others previously published. The system developed achieved an accuracy and a precision, respectively, of 96.39% and 100.00% in detecting HF, and of 79.31% and 82.35% in distinguishing severe versus mild HF.
Richard K. Zimmerman et al. [11] explain the use of classification and regression tree (CART) modeling to estimate probabilities of influenza. The study concluded that the CART algorithm has good sensitivity and a high negative predictive value (NPV), but a low positive predictive value (PPV), for identifying influenza among outpatients aged ≥5 years. Thus, it is good at identifying a group that does not need testing or antivirals, and it had fair-to-good predictive performance for influenza. In this paper, an algorithm that included fever, cough, and fatigue had a sensitivity of 84%, a specificity of 48%, a PPV of 23%, and an NPV of 94% for the development sample. It is suggested that further testing of the algorithm in other influenza seasons would help to optimize decisions for lab testing or treatment.

A predictive model for blood pressure prediction was developed by Bing Zhang et al. [12]. Classification and Regression Trees (CARTs) are proposed and applied to tackle the problem. The data were collected from 18 healthy young people (12 males and 6 females) in simulated resting and exercise environments. The protocol consists of a 30-minute resting measurement, followed by a 45-minute exercise session and a 3-minute cool-down. The proposed model has more than 90% accuracy.

Namita Puri et al. [13] state that prediction or classification is the main goal behind the use of the ID3 algorithm. ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan in 1986 and used to generate a decision tree from a dataset. It builds the decision tree from top to bottom, with no backtracking. In this paper, the authors used the ID3 algorithm to predict student placement by identifying the relevant attributes based on the academics, skills, and curricular activities of final-year students. The authors treated student placement as a classification problem and designed a model to classify or distribute students into different streams. This model can be useful for faculties, universities, and students, to put more emphasis on those who are not eligible for placement according to the model. The authors concluded that ID3 is the best-performing algorithm for classification and prediction of students' placement in an engineering college; the results indicate that the ID3 decision tree algorithm is the best classifier, with 95% accuracy.
Song Danwa et al. [14] emphasized the design and implementation of the ID3 algorithm to classify forestry resources. They used the ID3 algorithm to analyze the correlation information among different forestry species, altitude, origin, forest group, and rows by establishing a decision tree model, providing a reference for related decision support. They showed that this method has good application prospects in forest resource evaluation. In addition to finding the correlations between the attributes, the model provides a scientific basis for operational decisions in forest management and development strategies, according to the valuable rule information uncovered through the construction of the model's knowledge base.

Shruthi E et al. [15] note that WhatsApp Messenger is a cross-platform messaging social media application through which text, video, images, and other data can be sent to anyone simply by having the person's contact, and that its usage is increasing day by day. In this paper, the authors used the ID3 decision tree algorithm to classify whether messages are abusive or not. The Iterative Dichotomiser (ID3) decision tree algorithm is used to make decisions on WhatsApp text and is evaluated to measure the accuracy of the model. To make a decision, they used techniques such as extracting messages from WhatsApp with different tools, selecting features from the text, and classifying based on those features.

Venugopalreddy Aalagadda et al. [16] summarize the use of the ID3 decision tree algorithm for identifying dropout students based on different attributes. The objective of this research work is to identify relevant attributes from the socio-demographic, academic, and institutional data of first-year undergraduate students at the university, and to classify whether students will drop out or not based on the collected information, in the form of a machine learning model that automatically determines whether a student can continue their studies; for this, they used a classification method based on the decision tree. For a powerful decision-making tool, different parameters need to be considered, such as socio-demographic data, parental attitude, and institutional factors. The generated knowledge and results will be quite useful for the tutors and management of the university in developing policies and strategies to increase the enrolment rate, take precautions, give advice, and reduce student dropout. It can also be used to find the reasons and relevant factors that affect dropout students using the ID3 algorithm. The implementation results indicate that the ID3 decision tree algorithm is the best classifier for classifying whether a student will drop out or not, with 98% accuracy.
S. Sharma et al. [17] discuss how the information industry and social sites nowadays give us huge amounts of data, which are collected and stored in databases; these data need to be mined in order to extract and classify important information and knowledge for knowledge discovery, and data mining is the most popular knowledge acquisition technique. The authors used Shannon entropy to find the information gain, which quantifies the information contained in the data and, in turn, helps to construct the decision tree and make predictions. The results obtained from the Shannon entropy are complex, so in order to minimize the difficulties they use other entropies, such as Rényi entropy, quadratic entropy, Havrda and Charvát entropy, and Taneja entropy, in place of the Shannon entropy. The authors break the architecture of their experimental classification model into three steps: first they perform data pre-processing, then they compute the information gain using the various entropies, and lastly they produce the output. The researchers applied various performance metrics, including Mean Squared Error, Cross-Validation, and Test2 methods. The main objective of their research is to classify the data set, and they also performed a comparative analysis between three entropies, namely Shannon, quadratic, and Havrda & Charvát. They experimented on eight real datasets using five methods: the C4.5 decision tree algorithm based on Shannon entropy, on Havrda and Charvát entropy, on quadratic entropy, on Rényi entropy, and on Taneja entropy. Their experimental results show that the accuracy of the experimental method based on the three entropies outperformed the standard C4.5 algorithm.

C. Anuradha et al. [18], in their paper entitled "A Data Mining based Survey on Student Performance Evaluation System," analyze the ID3 and C4.5 algorithms for the classification of a student performance evaluation system. Classification, clustering, prediction, association rules, decision trees, neural networks, and many others are data mining techniques; among the classification algorithms, the decision trees ID3 and C4.5 play an important role in Educational Data Mining. In data mining there are several classification algorithms, and decision tree algorithms are the most widely used because they are easy to implement. The ID3 algorithm, which accepts only categorical attributes, selects the splitting attribute using the information gain measure. The enhanced version of the ID3 algorithm is C4.5, which can accept both continuous and categorical attributes when building a decision tree. The authors concluded that, out of many classification algorithms, the C4.5 algorithm's performance is the highest.
Sonia Singh et al. [19], in their paper entitled "Comparative Study Id3, Cart And C4.5 Decision Tree Algorithm: A Survey," give details on three of the most used decision tree algorithms, ID3, C4.5, and CART, in order to understand their use and scalability on different types of attributes and features. The authors state that when dealing with discrete attributes, as many outcomes can be obtained from one test as there are distinct values of the attribute. They also state that when dealing with continuous attributes, data is stored in accordance with binary cuts, and the entropy gain is calculated for each individual value in a single scan of the sorted data; this process is repeated for all continuous attributes. The authors further state that the C4.5 algorithm allows pruning of the resulting decision tree; accordingly, the error rate on the training data increases while the error rate on unseen testing data decreases. They state that the C4.5 algorithm has the ability to deal with numeric attributes, missing values, and noisy data.

Rafik Khairul Amin et al. [20], in their paper entitled "Implementation of Decision Tree Using C4.5 Algorithm in Decision Making of Loan Application by Debtor (Case Study: Bank Pasar of Yogyakarta Special Region)," state that the C4.5 algorithm is commonly used in generating decision trees since it has high accuracy in decision making. They evaluated the system by calculating its accuracy on given training and test data. The data types they used for classification are categorical and discrete, so other data types must be converted to those categories. The authors also used a pruning technique to simplify the tree structure generated by the C4.5 algorithm. They used 100 data records, of which 70% were approved and 30% rejected. Their report discusses the performance of the C4.5 algorithm in identifying a debtor's eligibility. Their results show that the highest precision value is 78.08%, using the C4.5 algorithm with a data partition of 90%:10%; they also indicate that the highest recall value is 96.4%, with a data partition of 80%:20%.
presented herein based on their capability’s
3. Methodology
simplicity and robustness. Classification is
likewise characterized as the task of target
function learning for mapping each attribute
set to its corresponding, class label. There are
numerous
Decision
tree
classification
algorithms such as CHAID, CART, Id3,
C4.5.
4.1.1. CHAID
Among
the
well-known
decision
tree
algorithms, CHAID (Chi-square Automatic
Interactive Detector Algorithm) is one of the
Figure 1. Algorithm Evaluation Architecture
oldest classification tree methods. which
Figure 1 shows the methodology of this term
creates predictors by dividing continuous
paper that consists of two phases namely the
distributions into several categories with an
classification
performance
equal number of observations. CHAID
evaluation phase. The Iris dataset is chosen as
developed by Gordon V Kass in 1980 to
an experimental dataset. The classification
evaluate
phase uses three Decision tree classification
predictors and display modeling results in tree
algorithms namely Id3, C4.5, CART. These
diagrams which are easy to interpret. CHAID
algorithms are analyzed and validated using
algorithm is one decision tree classification
an accuracy performance evaluation metric.
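To make the two-phase pipeline concrete, below is a minimal sketch in Python, assuming scikit-learn is available. Note that scikit-learn does not ship ID3, C4.5, or CHAID; its DecisionTreeClassifier is a CART-style learner, so this only illustrates the classify-then-evaluate flow of Figure 1, not the exact algorithms compared in this paper.

```python
# Minimal sketch of the two-phase methodology in Figure 1 (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Phase 1: classification -- fit a decision tree on a training split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
clf = DecisionTreeClassifier(criterion="entropy", random_state=42)
clf.fit(X_train, y_train)

# Phase 2: performance evaluation -- score the held-out test split.
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```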
4.1. Decision Tree Classification Algorithms

Classification is a data mining task that allocates each record in the data set to one of a few predefined classes. The data set is divided into training and test sets: the training set has known class labels, while the test set labels are unknown. Classification can likewise be characterized as the task of learning a target function that maps each attribute set to its corresponding class label. There are numerous decision tree classification algorithms, such as CHAID, CART, ID3, and C4.5; some of the most popular and common ones are presented herein based on their simplicity and robustness.

4.1.1. CHAID

Among the well-known decision tree algorithms, CHAID (Chi-square Automatic Interaction Detector) is one of the oldest classification tree methods. It creates predictors by dividing continuous distributions into several categories with an equal number of observations. CHAID was developed by Gordon V. Kass in 1980 to evaluate complex interactions among predictors and to display modeling results in tree diagrams that are easy to interpret. The CHAID algorithm works well with all kinds of variables, both categorical and continuous, and uses a chi-square splitting criterion for tree construction. It can be used for various tasks such as prediction, detection of interactions between variables, and classification, and it can be considered an extension of the Automatic Interaction Detection and THeta Automatic Interaction Detection methods. It merges the least significant categories with respect to the dependent variable; the predictor variable with the smallest adjusted p-value is then chosen for the split, since it yields the most significant split, and this continues until no further splits are possible. The CHAID algorithm offers advantages such as easy interpretation and a highly visual output. It produces reliable output but requires rather large sample sizes: because it uses multiway splits by default, respondent groups can become very small when small sample sizes are used. It is also a non-parametric decision tree algorithm.
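As an illustration of the split selection described above, the following sketch cross-tabulates each candidate predictor against the class and picks the predictor with the smallest chi-square p-value. This is only the core idea: the category-merging step and the Bonferroni adjustment of the p-values used by real CHAID are omitted, and the toy data frame and column names are hypothetical.

```python
# A minimal sketch of CHAID-style chi-square split selection (assumes pandas, SciPy).
import pandas as pd
from scipy.stats import chi2_contingency

def best_chaid_split(df, predictors, target):
    """Return the predictor whose cross-tabulation with the target has the
    smallest chi-square p-value, i.e. the most significant split."""
    p_values = {}
    for col in predictors:
        table = pd.crosstab(df[col], df[target])  # contingency table
        _, p, _, _ = chi2_contingency(table)      # chi-square test of independence
        p_values[col] = p
    return min(p_values, key=p_values.get)

# Hypothetical toy data: weather-style categories and a binary class.
df = pd.DataFrame({
    "outlook": ["sunny", "sunny", "rain", "rain", "overcast", "overcast"],
    "windy":   ["yes", "no", "yes", "no", "yes", "no"],
    "play":    ["no", "no", "yes", "yes", "yes", "yes"],
})
print(best_chaid_split(df, ["outlook", "windy"], "play"))  # -> "outlook"
```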
4.1.2. CART

CART stands for Classification and Regression Trees. It constructs binary trees, which means each internal node has exactly two outgoing edges. The twoing criterion is used for selecting splits, and cost-complexity pruning is used to prune the resulting tree (a short pruning sketch follows the lists below). CART can take misclassification costs into account during tree induction when they are provided, and it enables users to specify a prior probability distribution. The ability to generate regression trees is an important feature of CART: regression trees are trees whose leaves predict a real number rather than a class. In the case of regression, CART looks for splits that minimize the squared prediction error (the least-squares deviation), and the weighted mean of the node in each leaf is used for prediction. CART has the following advantages and disadvantages.

Advantages:
• It can easily handle both categorical and numerical variables.
• It identifies the most significant variables and eliminates non-significant ones.
• It can easily handle outliers.

Disadvantages:
• It may produce an unstable decision tree.
• It splits by only one variable.
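The cost-complexity pruning step can be illustrated with scikit-learn, whose DecisionTreeClassifier is a CART-style learner; note it uses Gini impurity or entropy rather than the twoing criterion, so this is an approximation of textbook CART rather than an exact implementation.

```python
# A short sketch of CART-style cost-complexity pruning (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compute the effective alphas of the pruning path on the training data.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(
    X_train, y_train)

# Refit once per alpha; larger alphas prune more aggressively (smaller trees).
for alpha in path.ccp_alphas:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha)
    tree.fit(X_train, y_train)
    print(f"alpha={alpha:.4f}  leaves={tree.get_n_leaves()}  "
          f"test acc={tree.score(X_test, y_test):.3f}")
```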
4.1.3. ID3

4.1.3.1. General overview of the ID3 decision tree algorithm

ID3 stands for Iterative Dichotomiser 3 (sometimes expanded as "induced decision tree"). It is a decision tree algorithm first introduced by Ross Quinlan in 1986. It generates a decision tree from the data set in a top-down manner using selected attributes of the dataset; the top-down approach uses different kinds of nodes, called the root node, branches, and leaf nodes. To select the root node, ID3 uses entropy and information gain: the attribute with the highest information gain (i.e., the largest reduction in entropy) is selected as the root node. Although the ID3 decision tree algorithm can be used to classify data or objects, growing the nodes can be a problem, and supporting different data types is a major limitation of the ID3 algorithm.

A. Entropy

Entropy is a measure of information in a set, which indicates the impurity of an arbitrary collection of examples. If the target attribute can take on $n$ different values, then the entropy of $S$ relative to this $n$-wise classification is defined as

$$\mathrm{Entropy}(S) = -\sum_{i=1}^{n} p(x_i)\,\log_2 p(x_i)$$

where $p(x_i)$ is the proportion (probability) of $S$ belonging to class $i$. The logarithm is base 2 because entropy is a measure of the expected encoding length, measured in bits.

For example, if the training data has 14 instances with 6 positive and 8 negative instances, the entropy is calculated as

$$\mathrm{Entropy}([6+, 8-]) = -\tfrac{6}{14}\log_2\tfrac{6}{14} - \tfrac{8}{14}\log_2\tfrac{8}{14} = 0.985$$
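The worked example above can be reproduced with a few lines of Python; the helper below is a straightforward transcription of the entropy formula.

```python
# Shannon entropy of a collection of class labels, in bits (base-2 logarithm).
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S) = -sum_i p(x_i) * log2(p(x_i)) over the class proportions."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# The worked example: 14 instances, 6 positive and 8 negative.
print(round(entropy(["+"] * 6 + ["-"] * 8), 3))  # 0.985
```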
A key point to note here is that the more uniform the probability distribution, the greater its entropy [13].

B. Information gain

Information gain measures the change in entropy before and after splitting the data, i.e., the expected reduction in entropy achieved by partitioning the examples according to the selected attribute. The information gain $\mathrm{Gain}(S, B)$ of an attribute $B$, relative to a collection of examples $S$, is defined as

$$\mathrm{Gain}(S, B) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(B)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$$

where $\mathrm{Values}(B)$ is the set of all possible values of attribute $B$, and $S_v$ is the subset of $S$ for which attribute $B$ has value $v$. This measure can be used to rank attributes and build the decision tree: at each node of the tree is placed the attribute with the highest information gain among the attributes not yet considered in the path from the root [13]. Because decision trees are broadly applicable to classification and regression problems, many researchers have used the ID3 decision tree algorithm in different areas.
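A minimal transcription of this definition into Python follows; it reuses the entropy helper from the sketch above, and the toy records and attribute names are hypothetical.

```python
# Information gain Gain(S, B) over a list of dict-records (reuses entropy()).
from collections import defaultdict

def information_gain(examples, attribute, target):
    """Gain(S, B) = Entropy(S) - sum_v (|S_v|/|S|) * Entropy(S_v)."""
    total = entropy([e[target] for e in examples])
    partitions = defaultdict(list)
    for e in examples:
        partitions[e[attribute]].append(e[target])  # S_v, grouped by value v
    weighted = sum(len(s) / len(examples) * entropy(s)
                   for s in partitions.values())
    return total - weighted

# Hypothetical toy set: "windy" splits the class perfectly, so its gain
# equals the full entropy of the target (1.0 bit).
data = [{"windy": "yes", "play": "no"}, {"windy": "yes", "play": "no"},
        {"windy": "no", "play": "yes"}, {"windy": "no", "play": "yes"}]
print(information_gain(data, "windy", "play"))  # 1.0
```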
4.1.4. C4.5

C4.5 is one of the algorithms used to generate decision trees. It is employed to generate decisions based on a given set of sample data. C4.5 can be either a univariate or a multivariate predictor, and it can also be referred to as a statistical classifier. In order to construct a decision tree, the C4.5 algorithm uses the information gain ratio for feature selection [17]. The feature at each node of the tree is selected based on the information gain ratio, and this measure is known as feature or attribute selection [17]; a small gain-ratio sketch follows the lists below. C4.5 works with both continuous and discrete features [17]. It is a quick classification technique and has high precision [17]. It can be examined with various entropies, such as the Shannon entropy, quadratic entropy, and Havrda and Charvát entropy [17].

C4.5 is an improved version of the ID3 algorithm. The major improvements include processing both numeric and discrete data, handling missing attribute values, and producing easily interpreted rules; it is also fast compared with other algorithms because it uses the main memory of the computer [20]. A depth-first strategy enables the decision tree to grow [19]. In order to split the data, C4.5 considers all possible tests and then selects the test that gives the highest information gain [19].

Advantages and disadvantages of the C4.5 algorithm [19]:

Advantages:
• Continuous and discrete attributes can be handled by the C4.5 algorithm.
• Missing attribute values are not used during gain and entropy calculations.
• It removes unimportant branches and replaces them with leaf nodes by going back through the tree once the tree has been created.

Disadvantages:
• It can construct empty branches.
• It can suffer from overfitting when the model picks up data with uncommon characteristics.
• It is susceptible to noise.
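As promised above, here is a minimal gain-ratio sketch. C4.5's gain ratio normalizes the information gain by the split information, i.e., the entropy of the attribute's own value distribution; the sketch reuses the entropy and information_gain helpers and the toy data from the ID3 section above.

```python
# C4.5-style gain ratio: GainRatio(S, B) = Gain(S, B) / SplitInfo(S, B),
# where SplitInfo is the entropy of the attribute's value distribution.
def gain_ratio(examples, attribute, target):
    split_info = entropy([e[attribute] for e in examples])
    if split_info == 0:            # attribute has a single value: no useful split
        return 0.0
    return information_gain(examples, attribute, target) / split_info

print(gain_ratio(data, "windy", "play"))  # 1.0 / 1.0 = 1.0
```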
5. Results and Discussion

A. Dataset Description

Table 1 describes the Iris data set, which has 150 instances; the attributes sepal.length, sepal.width, petal.length, and petal.width have been used for predicting which species an Iris flower belongs to.

Table 1. Iris dataset description

Attribute       Description
sepal.length    length of sepal
sepal.width     width of sepal
petal.length    length of petal
petal.width     width of petal

Table 2 compares the four decision tree models using performance evaluation criteria including precision, recall, F1-score, and accuracy.
Table 2. Performance measure results

Decision Tree algorithm    Precision   Recall   F1-score   Accuracy
ID3 without Pruning        97%         96.7%    96.7%      96%
ID3 with Pruning           100%        100%     100%       97.5%
C4.5                       96%         96%      96%        85%
CART                       95%         95.6%    95.3%      95%
The three decision tree classification algorithms have been validated using four important metrics, namely precision, recall, F1-score, and accuracy; the validation results are shown in Table 2. The results show that ID3 with pruning gives better classification accuracy (97.5%) than the other algorithms.
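For reference, the sketch below shows one way such metrics can be computed, assuming scikit-learn and macro-averaging over the three Iris classes; since scikit-learn's tree is CART-style, the printed numbers are illustrative rather than a reproduction of Table 2.

```python
# Computing the four Table 2 metrics on a held-out Iris split (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# classification_report prints per-class and macro-averaged precision,
# recall, and F1, plus overall accuracy -- the four metrics of Table 2.
print(classification_report(y_test, clf.predict(X_test), digits=3))
```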
6. Conclusion

This paper conducted a study of three decision tree classification algorithms on the Iris dataset, and the experimental results show that the ID3 with pruning classifier gives the best accuracy, 97.5%. The second-best algorithm is ID3 without pruning, which gives 96% accuracy. Based on these results, we recommend that future researchers implementing decision tree algorithms use ID3 with pruning for classification.

7. Future Work

There are many improvements that could be made in the future in order to generalize these findings. Experiments can be run on more than one data set to evaluate the performance of the algorithms; in our case we used only the Iris dataset. In addition, the C4.5 algorithm with pruning can be included in the experiment for a better comparison. An exhaustive survey of this area can also be done in the future.
8. References

[1] N. Kühl, M. Goutier, R. Hirt, and G. Satzger, "Machine Learning in Artificial Intelligence: Towards a Common Understanding," Proc. 52nd Hawaii Int. Conf. Syst. Sci., Jan. 2019.
[2] J. R. Quinlan, "Induction of Decision Trees," Mach. Learn., vol. 1, no. 1, pp. 81–106, 1986.
[3] H. Sharma and S. Kumar, "A Survey on Decision Tree Algorithms of Classification in Data Mining," Int. J. Sci. Res., vol. 5, no. 4, pp. 2094–2097, 2016.
[4] M. Milanović and M. Stamenković, "CHAID Decision Tree: Methodological Frame and Application," Econ. Themes, vol. 54, no. 4, pp. 563–586, 2017.
[5] G. Ritschard, "CHAID and earlier supervised tree methods," Contemp. Issues Explor. Data Min. Behav. Sci., pp. 48–74, 2013.
[6] G. V. Kass, "An Exploratory Technique for Investigating Large Quantities of Categorical Data," Appl. Stat., vol. 29, no. 2, p. 119, 1980.
[7] B. Miller, M. Fridline, P. Y. Liu, and D. Marino, "Use of CHAID decision trees to formulate pathways for the early detection of metabolic syndrome in young adults," Comput. Math. Methods Med., vol. 2014, 2014.
[8] F. M. Díaz-Pérez and M. Bethencourt-Cejas, "CHAID algorithm as an appropriate analytical method for tourism market segmentation," J. Destin. Mark. Manag., vol. 5, no. 3, pp. 275–282, 2016.
[9] Y. Li, Z. Guo, J. Yang, H. Fang, and Y. Hu, "Prediction of ship collision risk based on CART," IET Intell. Transp. Syst., vol. 12, no. 10, pp. 1345–1350, 2018.
[10] L. Pecchia, P. Melillo, and M. Bracale, "Remote health monitoring of heart failure with data mining via CART method on HRV feature," IEEE Trans. Biomed. Eng., vol. 58, no. 3, pp. 800–804, 2011.
[11] R. K. Zimmerman et al., "Classification and Regression Tree (CART) analysis to predict influenza in primary care patients," BMC Infect. Dis., vol. 16, no. 1, pp. 1–11, 2016.
[12] B. Zhang, Z. Wei, J. Ren, Y. Cheng, and Z. Zheng, "An Empirical Study on Predicting Blood Pressure Using Classification and Regression Trees," IEEE Access, vol. 6, pp. 21758–21768, 2018.
[13] N. Puri, D. Khot, P. Shinde, K. Bhoite, and P. D. Maste, "Student Placement Prediction Using ID3," vol. 3, no. III, pp. 81–84, 2015.
[14] D. Song, N. Han, and D. Liu, "Construction of forestry resource classification rule decision tree based on ID3 Algorithm," Proc. 1st Int. Work. Educ. Technol. Comput. Sci. (ETCS 2009), vol. 3, pp. 867–870, 2009.
[15] J. Wiley, "International Journal of Cancer International Journal of Cancer," vol. 2, no. 2, pp. 1–24, 2015.
[16] V. Aalagadda and I. M. Latha, "Identifying Dropout Students using ID3 Decision Tree Algorithm," vol. 7, no. 01, pp. 1203–1206, 2019.
[17] S. Sharma, J. Agrawal, and S. Sharma, "Classification Through Machine Learning Technique: C4.5 An algorithm based on Various Entropies," Int. J. Comput. Appl., vol. 82, no. 16, pp. 28–32, 2013.
[18] C. Anuradha and T. Velmurugan, "A data mining based survey on student performance evaluation system," 2014 IEEE Int. Conf. Comput. Intell. Comput. Res. (IEEE ICCIC 2014), pp. 43–47, 2015.
[19] S. Singh and P. Gupta, "Comparative Study Id3, Cart and C4.5 Decision Tree Algorithm: A Survey," Int. J. Adv. Inf. Sci. Technol., vol. 27, no. 27, pp. 97–103, 2014.
[20] R. K. Amin, Indwiarti, and Y. Sibaroni, "Implementation of a decision tree using C4.5 algorithm in decision making of loan application by debtor (Case study: Bank Pasar of Yogyakarta Special Region)," 2015 3rd Int. Conf. Inf. Commun. Technol. (ICoICT 2015), pp. 75–80, 2015.