A Survey on Classification Problem Using ID3, CART, and C4.5 Decision Tree Algorithms

Abdiwak T., Alemayehu Sh., Ermias N., Senedu G., Yonas G.
AI SIG Lab, Department of Computer Science and Engineering, Adama Science and Technology University, ETHIOPIA
[email protected]

ABSTRACT

Decision tree learning is one of the predictive modeling approaches used in statistics, machine learning, and artificial intelligence. This paper offers an empirical study on the use of decision tree algorithms for flower classification on the Iris dataset. We carried out experiments to recommend which decision tree (DT) algorithm performs well on classification problems in terms of accuracy. Among the different DT algorithms, we chose the most frequently used ones: ID3, CART, and C4.5. ID3 stands for Induced decision tree or Iterative Dichotomizer 3. It is a decision tree algorithm first invented by Ross Quinlan in 1986. It generates a decision tree from the data set in a top-down manner using selected attributes of the dataset. C4.5 is an improved version of the ID3 algorithm. Its major improvements include processing both numeric and discrete data, handling missing attribute values, producing easily interpreted rules, and being faster than comparable algorithms because it works in the computer's main memory. CART stands for Classification and Regression Trees. It constructs binary trees, which means each internal node has exactly two outgoing edges. The twoing criterion is used for selecting the splits, and cost-complexity pruning is used to prune the obtained tree. To evaluate the performance of the selected algorithms we used the Iris flower dataset, which has four independent numerical attributes and one dependent attribute. The results show that the ID3 algorithm outperformed the others on the accuracy measurement.

Contents
1. Introduction
2. Literature Survey
3. Methodology
4.1. Decision Tree Classification Algorithms
4.1.1. CHAID
4.1.2. CART
4.1.3. ID3
4.1.4. C4.5
5. Results and Discussion
6. Conclusion
7. Future Work
8. References
1. Introduction

The development of "machine learning" and "artificial intelligence" systems and applications has become popular within the last decade. Both terms are used frequently in science and the media, sometimes interchangeably, sometimes with different meanings [1]. Ever since artificial intelligence (AI) first achieved recognition as a discipline in the mid-1950s, machine learning (ML) has been a central research area within it. Two reasons can be given as evidence that machine learning is a subset of artificial intelligence: the ability to learn from experience is one characteristic of intelligent behavior and of acting like a human being, so any attempt to understand intelligence as a phenomenon must include an understanding of learning from past experience [2]. Machine learning makes a machine learn from data without being explicitly programmed.

Machine learning algorithms are categorized as supervised, unsupervised, and reinforcement learning. In this paper we summarize one type of supervised learning algorithm, the decision tree. A decision tree (DT) is a supervised learning algorithm with a structure that includes a root node, branches, and leaf nodes. Each internal node denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf node holds a class label [3].

The decision tree algorithms discussed in this paper are ID3 (Iterative Dichotomizer 3), C4.5 (the successor of ID3), CART (Classification and Regression Trees), and CHAID (Chi-squared Automatic Interaction Detector). The research questions addressed in this paper are:

• What is the difference among DT algorithms?
• Which DT algorithm performs well on classification problems for datasets similar to Iris?

Section 1 introduces artificial intelligence, machine learning, and decision trees. The remainder of the paper is organized as follows: Section 2 presents the literature survey, Section 3 explains the methodology and the decision tree algorithms studied, the results are then presented and discussed, followed by the conclusion and future work.
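To make the structure just described concrete, the sketch below models a decision tree as nested nodes. It is an illustrative toy, not code from any of the surveyed papers; the attribute and class names are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One decision tree node: internal nodes test an attribute,
    leaf nodes carry a class label."""
    attribute: str = None                          # attribute tested (internal nodes)
    branches: dict = field(default_factory=dict)   # test outcome -> child Node
    label: str = None                              # class label (leaf nodes only)

def classify(node: Node, example: dict) -> str:
    """Follow the branch matching each attribute value down to a leaf."""
    while node.label is None:
        node = node.branches[example[node.attribute]]
    return node.label

# Toy tree: the root tests an invented 'petal' attribute, branches are the
# test outcomes, and the leaves hold class labels.
tree = Node(attribute="petal",
            branches={"short": Node(label="setosa"),
                      "long": Node(label="versicolor")})
print(classify(tree, {"petal": "long"}))  # -> versicolor
```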
2. Literature Survey

Marina Milanović et al. [4] present the conceptual characteristics of the decision tree, an important data mining method which, due to its explorative nature, is exceptionally suitable for detecting the structure of data when analyzing various problem situations. They also demonstrate the applicative characteristics of this method using the CHAID algorithm in an empirical analysis of leadership studies: the interdependence between selected personal characteristics and a manager's leadership style was investigated. Finally, they developed a classification model for identification of the dominant leadership style. The study was conducted on a sample of 417 managers of privately owned small-sized enterprises in Serbia, using a specially designed questionnaire. As predictors of the dominant leadership style, the model identified a set of six statistically significant personal characteristics.

Gilbert Ritschard [5] discussed the origin of tree methods. He surveyed the earlier methods that led to CHAID, then explained in detail the differences between the original method as described in Kass [6] and the currently implemented CHAID function, especially the extension proposed by Biggs et al. (1991).

Brian Miller et al. [7] used a CHAID decision tree to detect early metabolic syndrome (MetS) in young adults. A user-specified CHAID model was compared to both a CHAID model with no user-specified first level and logistic regression-based models. The analysis identified waist circumference as a strong predictor in the MetS diagnosis. The accuracy of the final model they built, with waist circumference user-specified as the first level, was 92.3%, with an ability to detect MetS of 71.8%; this model outperformed the comparison models. The researchers concluded that these preliminary findings suggest that young adults at risk for MetS could be identified for further follow-up based on their waist circumference, and that the approach shows promise for the development of a preliminary detection algorithm for MetS.

F. M. Díaz-Pérez et al. [8] note that studies of the segmentation of tourism markets have traditionally been undertaken by regression methods. The need for a significant number of segments and qualifying variables has led, however, to the use of other procedures of multivariate analysis. CHAID (Chi-square Automatic Interaction Detection), which is more complex than other multivariate techniques, has rarely been used. Their study applies discriminant analysis and CHAID to the same population of tourists visiting a particular destination to compare the quality of the information obtained on tourism market segmentation. The results suggest that the analysis based on CHAID matches the nature of the problem studied better than the results provided by discriminant analysis and other traditional methods.

Yishan Li et al. [9] explain that the main objective of their research is to develop a predictive model for ship collision risk using classification and regression trees (CARTs) under different situations. A fuzzy comprehensive evaluation method is combined with the CART model to construct the collision risk model: the risk of the collected samples is used to build a collision risk identification library that contains information based on expert collision avoidance, six primary factors affecting ship collision risk are considered as the inputs of the CART model, and the degree of risk calculated by the fuzzy comprehensive evaluation method is used as the actual output for the proposed CART model. A total of 100 investigation reports for collision accidents during 2006–2015 were collected, published by Chinese provincial authorities and local maritime bureaus on their official websites. The experimental results show that the proposed model is comprehensively superior to the other models in terms of collision risk recognition accuracy and prediction speed; in particular, it is better than the existing ship collision risk prediction models in prediction accuracy and speed when the feature dimension is low and the sample size is small.

Leandro Pecchia et al. [10] proposed a platform to enhance the effectiveness and efficiency of home monitoring, using data mining for early detection of any worsening in a patient's condition. The proposed data mining model is based on Classification and Regression Trees (CART). They briefly describe the Remote Health Monitoring (RHM) platform they designed and realized, which supports Heart Failure (HF) severity assessment. In their paper, preliminary results of classifiers for HF severity detection are presented, which are innovative in comparison to others previously published.
The system developed achieved an accuracy and a precision of 96.39% and 100.00%, respectively, in detecting HF, and of 79.31% and 82.35% in distinguishing severe versus mild HF.

Richard K. Zimmerman et al. [11] explain the use of classification and regression tree (CART) modeling to estimate probabilities of influenza. The study concluded that the CART algorithm has good sensitivity and high negative predictive value (NPV), but low positive predictive value (PPV), for identifying influenza among outpatients aged ≥5 years. Thus, it is good at identifying a group that does not need testing or antivirals, and it had fair to good predictive performance for influenza. In this paper, an algorithm that included fever, cough, and fatigue had a sensitivity of 84%, a specificity of 48%, a PPV of 23%, and an NPV of 94% for the development sample. The authors suggest that further testing of the algorithm in other influenza seasons would help optimize decisions for lab testing or treatment.

A predictive model for blood pressure prediction was developed by Bing Zhang et al. [12]. Classification and Regression Trees (CARTs) are proposed and applied to tackle the problem. The data were collected from 18 healthy young people (12 males and 6 females) in simulated resting and exercise environments. The protocol consists of 30 minutes of resting measurement, followed by a 45-minute exercise session and 3 minutes of cooling down. The proposed model has more than 90% accuracy.

Namita Puri et al. [13]: prediction or classification is the main goal behind the use of the ID3 algorithm. ID3 (Iterative Dichotomizer 3) is an algorithm invented by Ross Quinlan in 1986 and used to generate a decision tree from a dataset. It builds the decision tree from the top down, with no backtracking. In this paper the authors used the ID3 algorithm to predict student placement by identifying the relevant attributes based on the academics, skills, and curricular record of final-year students. The authors took student placement as a classification problem and designed a model to classify or distribute students into different streams. This model can be useful for faculties, universities, and students to put more emphasis on those who are not eligible for placement according to the model. The authors concluded that ID3 is the best-performing algorithm for classification and prediction of students' placement in an engineering college; the results indicate that the ID3 decision tree algorithm is the best classifier, with 95% accuracy.

Song Danwa et al. [14]: this paper emphasized the design and implementation of the ID3 algorithm to classify forestry resources. They used the ID3 algorithm to analyze the correlation information among different forestry species, altitude, origin, forest group, and rows by establishing a decision tree model, providing a reference for the related decision support. The method is shown to have good application prospects in forest resource evaluation. In addition to finding the correlations between the attributes, the model provides a scientific basis for operational decisions on forest management and development strategies, according to the valuable rule information uncovered by the construction of the model knowledge base.
Shruthi E et al. [15]: WhatsApp Messenger is a cross-platform messaging social media application in which text, video, images, and other data can be sent to anyone simply by having the contact of the person, and its usage is increasing day by day. In this paper the authors used the ID3 decision tree algorithm to classify messages as abusive or not: the Iterative Dichotomizer (ID3) algorithm is applied to make decisions on WhatsApp text and is evaluated to measure the accuracy of the model. To make a decision they used several techniques, such as extracting messages from WhatsApp using different tools, selecting features from the text, and classifying based on the data features.

Venugopalreddy Aalagadda et al. [16]: this paper summarizes the use of the ID3 decision algorithm for identifying dropout students based on different attributes. The objective of this research work is to identify relevant attributes from the socio-demographic, academic, and institutional data of first-year undergraduate students at the university, in order to classify the students as dropouts or not based on the information collected, in the form of a machine learning model that automatically determines whether a student can continue his or her studies; they used a classification method based on the decision tree. For powerful decision-making tools, different parameters need to be considered, such as socio-demographic data, parental attitude, and institutional factors. The generated knowledge and results will be quite useful for the tutors and management of the university to develop policies and strategies for increasing the enrolment rate, to take precautions, to give advice, and to reduce student dropout. The ID3 algorithm can also be used to find the reasons and relevant factors that affect dropout students. The implementation results indicate that the ID3 decision tree algorithm is the best classifier for classifying students as dropouts or not, with 98% accuracy.

S. Sharma et al. [17] discussed how, nowadays, the information industry and social sites enable us to collect and store huge amounts of data in databases. The authors state that these data are needed to extract and classify important information and for knowledge discovery, and that data mining is the most popular knowledge acquisition technique. The authors used the Shannon entropy to find the information gain, which measures the information contained in the data and in turn helps to construct the decision tree and make predictions. The results obtained from the Shannon entropy are complex, so to minimize the difficulties they use other entropies, namely the Rényi entropy, quadratic entropy, Havrda and Charvát entropy, and Taneja entropy, in place of the Shannon entropy. The authors break the architecture of the experimental model down into three steps: first they perform data pre-processing, then they compute the information gain using the various entropies, and lastly they produce the output. The researchers applied various performance metrics, including mean squared error, cross-validation, and Test2 methods. The main objective of their research is to classify the data sets. The authors also performed a comparative analysis between three entropies: Shannon, quadratic, and Havrda & Charvát. They ran experiments on eight real datasets using five methods: the C4.5 decision tree algorithm based on Shannon entropy, on Havrda and Charvát entropy, on quadratic entropy, on Rényi entropy, and on Taneja entropy. Their experimental results show that the accuracy of the experimental method based on the three entropies outperformed the standard C4.5 algorithm.
C. Anuradha [18], in their paper entitled "A Data Mining based Survey on Student Performance Evaluation System", analyze the ID3 and C4.5 algorithms for the classification of student performance evaluation systems. Classification, clustering, prediction, association rules, decision trees, neural networks, and many others are data mining techniques; among the classification algorithms, the decision trees ID3 and C4.5 play an important role in educational data mining. In data mining there are several classification algorithms, and decision tree algorithms are the most used because they are easy to implement. The ID3 algorithm, which accepts only categorical attributes, selects the splitting attribute using the information gain measure. The enhanced version of the ID3 algorithm is C4.5, which can accept both continuous and categorical attributes when building a decision tree. The authors concluded that, of the many classification algorithms, the C4.5 algorithm's performance is the highest.

Sonia Singh [19], in their paper entitled "Comparative Study ID3, CART and C4.5 Decision Tree Algorithm: A Survey", gave details on the three most-used decision tree algorithms, ID3, C4.5, and CART, in order to understand their use and scalability on different types of attributes and features. The authors state that when dealing with discrete attributes, one test can produce as many outcomes as there are distinct values of the attribute. They also state that when dealing with continuous attributes, binary cuts are used: the data are sorted and the entropy gain is calculated for each individual value in a single scan of the sorted data, and this process is repeated for all continuous attributes. The authors further state that the C4.5 algorithm allows pruning of the resulting decision tree; accordingly, there will be an increased error rate on the training data but decreased error rates on unseen testing data. They also stated that the C4.5 algorithm has the ability to deal with numeric attributes, missing values, and noisy data.

Rafik Khairul Amin et al. [20], in their paper entitled "Implementation of Decision Tree Using C4.5 Algorithm in Decision Making of Loan Application by Debtor (Case Study: Bank Pasar of Yogyakarta Special Region)", state that the C4.5 algorithm is commonly used for generating decision trees since it has high accuracy in decision making. They evaluated the system by calculating its accuracy for a given input of training and test data. The data types they used for classification are categorical and discrete, so other data types must be converted to these categories. The authors also used a pruning technique to simplify the tree structure generated by the C4.5 algorithm. They used 100 data records, of which 70% were approved and 30% rejected. Their report discusses the performance of the C4.5 algorithm in identifying a debtor's eligibility. Their results show that the highest precision value is 78.08%, obtained using the C4.5 algorithm with a data partition of 90%:10%.
They also indicate that the biggest recall value, 96.4%, is obtained with a data partitioning of 80%:20%.

3. Methodology

Figure 1. Algorithm Evaluation Architecture

Figure 1 shows the methodology of this paper, which consists of two phases, namely the classification phase and the performance evaluation phase. The Iris dataset is chosen as the experimental dataset. The classification phase uses three decision tree classification algorithms, namely ID3, C4.5, and CART. These algorithms are analyzed and validated using an accuracy performance evaluation metric.
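As a concrete illustration of this two-phase setup, the sketch below trains decision trees on the Iris dataset and reports held-out accuracy. It is a minimal reconstruction under stated assumptions, not the authors' actual code: the paper does not name its tooling, and scikit-learn's `DecisionTreeClassifier` implements an optimized CART-style learner, so the `entropy` criterion here only approximates ID3/C4.5-style splitting.

```python
# Minimal sketch of the classification and evaluation phases (assumed tooling:
# scikit-learn; the paper does not state what was actually used).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)  # 150 instances, 4 numeric attributes
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Classification phase: scikit-learn trees are CART-style; the 'entropy'
# criterion only approximates ID3/C4.5 information-gain splitting.
models = {
    "CART (gini)": DecisionTreeClassifier(criterion="gini", random_state=42),
    "ID3-like (entropy)": DecisionTreeClassifier(criterion="entropy", random_state=42),
}

# Performance evaluation phase: accuracy on the held-out test set.
for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy = {acc:.3f}")
```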
It constructs binary trees, General overview ID3 Decision tree algorithm which means each internal node has exactly ID3 stands for Induced decision tree or two outgoing edges. Towing criteria is used Iterator Dichotomizer 3. It is a type of for selecting the splits and cost–complexity decision tree first invented by Ross Quinlan Pruning is used to prune the obtained tree. in 1986. It generates a decision tree from the CART can consider misclassification costs in data set using a top-bottom approach using the tree induction when provided and enables the selected attributes of the dataset. The top- users to provide prior probability distribution. bottom approach is using different node The ability to generate regression trees is an called root node, branches and leaf nodes. To important feature of CART. Trees where their select the root node ID3 uses entropy or leaves predict a real number and not a class are information gain, therefore based on the Regression trees. In the case of regression, highest entropy value the root node will be CART looks for splits that minimize the selected. Even the ID3 decision tree prediction squared error (the least–squared algorithm is used to classify the data or object 10 there is a problem of growing the nodes and Gain (S, B) =Entropy(S)-∑ (|Sv|/ |S|) Entropy supporting different data types is a major (Sv). Where Values (B) is the set of all problem of the ID3 decision algorithm. possible values for attribute B, and Svis the A. Entropy subset of S for which the attribute A has value Entropy is a measure of information in the v. This measure can be used to rank attributes set, which indicates the impurity of an and build the decision tree. In the tree at each arbitrary collection of examples. If the target node is located the attribute with the highest attribute takes on a different value, then the information gain among the attributes not yet entropy S relative to this a-wise classification considered in the path from the root [13]. As is defined as a general use of the decision tree algorithm in Entropy(s) = − ∑𝑛𝑖=1 𝑝(𝑥𝑖)𝑙𝑜𝑔2𝑝(𝑥𝑖) classification and regression problem many Where p(xi) is the proportion/probability of S researchers used the ID3 decision tree belonging to class i. A logarithm is base 2 algorithm in different areas. because entropy is a measure of the expected 4.1.4. C4.5 encoding length measured in bits. C4.5 is one of the algorithms which is used to For example, if training data has 14 instances generate decision trees. It is employed to with 6 positive and 8 negative instances, the generate decisions, based on a certain entropy is calculated as available set of sample data. The C4.5 can be Entropy ([6+, 8-]) = - (6 /14) log (6 /14) - either a univariate or multivariate predictor. (8/14) log (8/14) = 0.985 It can also be referred to as a statistic A key point to note here is that the more classifier. uniform is the probability distribution, the In order to construct a decision tree, the C4.5 greater is its entropy[13]. algorithm uses information gain ratio for B. Information gain feature selection[17]. The features at each It is a change before and after splitting the node on the tree will be selected based on the data which measures the expected reduction information gain ratio and such measure is in entropy by partitioning the examples known as feature or attribute selection [17]. according to the selected attribute. The It works with both continuous and discrete information gain, Gain (S, B) of an attribute features [17]. 
4.1.4. C4.5

C4.5 is one of the algorithms used to generate decision trees. It is employed to generate decisions based on a given set of sample data. C4.5 can act as either a univariate or a multivariate predictor, and it can also be referred to as a statistical classifier. To construct a decision tree, the C4.5 algorithm uses the information gain ratio for feature selection [17]. The feature at each node of the tree is selected based on the information gain ratio, and this measure is known as feature or attribute selection [17]. C4.5 works with both continuous and discrete features [17]. It is a quick classification technique and has high precision [17]. It can be examined on the basis of various entropies, such as the Shannon entropy, quadratic entropy, and Havrda and Charvát entropy [17]. C4.5 is an improved version of the ID3 algorithm. The major improvements include processing both numeric and discrete data, handling missing attribute values, producing easily interpreted rules, and being faster than comparable algorithms because it works in the computer's main memory [20]. A depth-first strategy is used to grow the decision tree [19]. To split the data, C4.5 studies all of the possible tests and then selects the test that gives the highest possible information gain [19].

Advantages and disadvantages of the C4.5 algorithm [19]:

Advantages:
• Continuous and discrete attributes can be handled by the C4.5 algorithm.
• Missing attribute values are not used during gain and entropy calculations.
• It removes branches that are not important and replaces them with leaf nodes by going back through the tree once it has been created.

Disadvantages:
• It can construct empty branches.
• It can suffer from overfitting when the model picks up data with uncommon characteristics.
• It is susceptible to noise.

5. Results and Discussion

A. Dataset Description

Table 1 describes the Iris data set, which has 150 instances; the attributes sepal.length, sepal.width, petal.length, and petal.width have been used to predict which species an Iris flower belongs to.

Table 1. Iris dataset description

Attribute    | Description
sepal.length | length of sepal
sepal.width  | width of sepal
petal.length | length of petal
petal.width  | width of petal

Table 2 compares the decision tree models using performance evaluation criteria including precision, recall, F1-score, and accuracy.

Table 2. Performance measure results

Decision tree algorithm | Precision | Recall | F1-score | Accuracy
ID3 without pruning     | 97%       | 96.7%  | 96.7%    | 96%
ID3 with pruning        | 100%      | 100%   | 100%     | 97.5%
C4.5                    | 96%       | 96%    | 96%      | 85%
CART                    | 95%       | 95.6%  | 95.3%    | 95%

The three decision tree classification algorithms have been validated using four important metrics, namely precision, recall, F1-score, and accuracy, and these validation results are shown in Table 2. The results show that ID3 with pruning gives better classification accuracy, 97.5%, than the other algorithms.
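For reproducibility, the four metrics in Table 2 can be computed from a model's test-set predictions as sketched below. This is an assumed scikit-learn workflow, not the authors' code; in particular, the macro averaging across the three Iris classes and the `ccp_alpha` stand-in for "with pruning" are our assumptions, since the paper states neither.

```python
# Sketch of the evaluation phase (assumed tooling: scikit-learn). 'macro'
# averaging over the three Iris classes is an assumption, as is using
# cost-complexity pruning (ccp_alpha > 0) for the "with pruning" variant.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

clf = DecisionTreeClassifier(criterion="entropy", ccp_alpha=0.01,
                             random_state=0).fit(X_train, y_train)
pred = clf.predict(X_test)

print("precision:", precision_score(y_test, pred, average="macro"))
print("recall:   ", recall_score(y_test, pred, average="macro"))
print("F1-score: ", f1_score(y_test, pred, average="macro"))
print("accuracy: ", accuracy_score(y_test, pred))
```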
6. Conclusion

This paper conducted an experimental study of three decision tree classification algorithms on the Iris dataset. The experimental results show that the ID3 with pruning classifier gives the best accuracy, 97.5%; the second-best algorithm is ID3 without pruning, with 96% accuracy. Based on the results found, we recommend that future researchers implementing a decision tree algorithm for classification use ID3 with pruning.

7. Future Work

There are many improvements that could be made in the future in order to generalize these findings. Experiments can be run on more than one data set to evaluate the performance of the algorithms, whereas in our case we used only the Iris dataset. In addition, the C4.5 algorithm with pruning could be included in the experiment for a better comparison. An exhaustive survey of this area could also be conducted in the future.

8. References

[1] N. Kühl, M. Goutier, R. Hirt, and G. Satzger, "Machine Learning in Artificial Intelligence: Towards a Common Understanding," Proc. 52nd Hawaii Int. Conf. Syst. Sci., Jan. 2019.
[2] J. R. Quinlan, "Induction of Decision Trees," Mach. Learn., vol. 1, no. 1, pp. 81–106, 1986.
[3] H. Sharma and S. Kumar, "A Survey on Decision Tree Algorithms of Classification in Data Mining," Int. J. Sci. Res., vol. 5, no. 4, pp. 2094–2097, 2016.
[4] M. Milanović and M. Stamenković, "CHAID Decision Tree: Methodological Frame and Application," Econ. Themes, vol. 54, no. 4, pp. 563–586, 2017.
[5] G. Ritschard, "CHAID and earlier supervised tree methods," Contemp. Issues Explor. Data Min. Behav. Sci., pp. 48–74, 2013.
[6] G. V. Kass, "An Exploratory Technique for Investigating Large Quantities of Categorical Data," Appl. Stat., vol. 29, no. 2, p. 119, 1980.
[7] B. Miller, M. Fridline, P. Y. Liu, and D. Marino, "Use of CHAID decision trees to formulate pathways for the early detection of metabolic syndrome in young adults," Comput. Math. Methods Med., vol. 2014, 2014.
[8] F. M. Díaz-Pérez and M. Bethencourt-Cejas, "CHAID algorithm as an appropriate analytical method for tourism market segmentation," J. Destin. Mark. Manag., vol. 5, no. 3, pp. 275–282, 2016.
[9] Y. Li, Z. Guo, J. Yang, H. Fang, and Y. Hu, "Prediction of ship collision risk based on CART," IET Intell. Transp. Syst., vol. 12, no. 10, pp. 1345–1350, 2018.
[10] L. Pecchia, P. Melillo, and M. Bracale, "Remote health monitoring of heart failure with data mining via CART method on HRV feature," IEEE Trans. Biomed. Eng., vol. 58, no. 3, pp. 800–804, 2011.
[11] R. K. Zimmerman et al., "Classification and Regression Tree (CART) analysis to predict influenza in primary care patients," BMC Infect. Dis., vol. 16, no. 1, pp. 1–11, 2016.
[12] B. Zhang, Z. Wei, J. Ren, Y. Cheng, and Z. Zheng, "An Empirical Study on Predicting Blood Pressure Using Classification and Regression Trees," IEEE Access, vol. 6, pp. 21758–21768, 2018.
[13] N. Puri, D. Khot, P. Shinde, K. Bhoite, and P. D. Maste, "Student Placement Prediction Using ID3," vol. 3, no. III, pp. 81–84, 2015.
[14] D. Song, N. Han, and D. Liu, "Construction of forestry resource classification rule decision tree based on ID3 Algorithm," Proc. 1st Int. Workshop Educ. Technol. Comput. Sci. (ETCS 2009), vol. 3, pp. 867–870, 2009.
[15] J. Wiley, "International Journal of Cancer," vol. 2, no. 2, pp. 1–24, 2015.
[16] V. Aalagadda and I. M. Latha, "Identifying Dropout Students using ID3 Decision Tree Algorithm," vol. 7, no. 01, pp. 1203–1206, 2019.
[17] S. Sharma, J. Agrawal, and S. Sharma, "Classification Through Machine Learning Technique: C4.5 Algorithm based on Various Entropies," Int. J. Comput. Appl., vol. 82, no. 16, pp. 28–32, 2013.
[18] C. Anuradha and T. Velmurugan, "A data mining based survey on student performance evaluation system," 2014 IEEE Int. Conf. Comput. Intell. Comput. Res. (ICCIC 2014), pp. 43–47, 2015.
[19] S. Singh and P. Gupta, "Comparative Study ID3, CART and C4.5 Decision Tree Algorithm: A Survey," Int. J. Adv. Inf. Sci. Technol., vol. 27, no. 27, pp. 97–103, 2014.
[20] R. K. Amin, Indwiarti, and Y. Sibaroni, "Implementation of a decision tree using C4.5 algorithm in decision making of loan application by debtor (Case study: Bank Pasar of Yogyakarta Special Region)," 2015 3rd Int. Conf. Inf. Commun. Technol. (ICoICT 2015), pp. 75–80, 2015.