
Decision Tree Induction

A decision tree is a supervised learning method used in data mining for classification and regression tasks. It is a tree-structured model that supports decision-making. The decision tree builds classification or regression models in the form of a tree: it splits a data set into smaller and smaller subsets while, at the same time, the associated tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes. A decision node has two or more branches, while a leaf node represents a classification or decision; leaf nodes cannot be split any further. The topmost decision node in a tree, which corresponds to the best predictor, is called the root node. Decision trees can handle both categorical and numerical data.

Key factors:
Entropy:
Entropy is a common way to measure impurity. In a decision tree, it measures the randomness or impurity of a data set: for a set S with class proportions p1, ..., pk, the entropy is H(S) = -(p1 log2 p1 + ... + pk log2 pk), which is 0 for a pure set and largest when the classes are evenly mixed.

Information Gain:
Information gain is the decrease in entropy after the data set is split on an attribute; it is also called entropy reduction, and it equals the entropy of the parent set minus the weighted average entropy of the resulting subsets. Building a decision tree is all about finding the attributes that return the highest information gain.
In short, a decision tree is just like a flowchart diagram, with the terminal nodes representing decisions. Starting with the whole dataset, we measure entropy to choose splits that segment the set until the data in each segment belongs to a single class.
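
These two quantities can be computed in a few lines of Python. The sketch below is illustrative only; the helper names entropy and information_gain are not part of any particular library.

from collections import Counter
from math import log2

def entropy(labels):
    # Entropy: -sum(p_i * log2(p_i)) over the class proportions in `labels`.
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(parent_labels, subsets):
    # Entropy of the parent minus the weighted average entropy of the child subsets.
    total = len(parent_labels)
    weighted = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(parent_labels) - weighted

parent = ["yes", "yes", "no", "no"]
print(entropy(parent))                                            # 1.0 (evenly mixed classes)
print(information_gain(parent, [["yes", "yes"], ["no", "no"]]))   # 1.0 (a pure split removes all impurity)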

Why are decision trees useful?


• It enables us to analyze the possible consequences of a decision thoroughly.
• It provides a framework to quantify the values of outcomes and the probabilities of achieving them.
• It helps us make the best decisions based on the available data and well-founded estimates.
In other words, a decision tree is a hierarchical tree structure that can be used to split a large collection of records into successively smaller class sets by applying a sequence of simple decision rules. A decision tree model consists of a set of rules for partitioning a large, heterogeneous population into smaller, more homogeneous (mutually exclusive) classes. The attributes used for splitting can be any mix of nominal, ordinal, binary, or quantitative variables, whereas the class must be of a qualitative type, such as categorical, ordinal, or binary. In brief, given data on a set of attributes together with their class labels, a decision tree produces a set of rules that can be used to identify the class. One rule is applied after another, resulting in a hierarchy of segments within segments. This hierarchy is called the tree, and each segment is called a node. With each successive division, the members of the resulting sets become more and more similar to one another. Hence, the approach used to build a decision tree is referred to as recursive partitioning. A well-known algorithm of this kind is CART (Classification and Regression Trees).
Consider the example of a factory where:
Expanding the factory costs $3 million; the probability of a good economy is 0.6 (60%), which leads to an $8 million profit, and the probability of a bad economy is 0.4 (40%), which leads to a $6 million profit.
Not expanding the factory costs $0; the probability of a good economy is 0.6 (60%), which leads to a $4 million profit, and the probability of a bad economy is 0.4 (40%), which leads to a $2 million profit.
The management team needs to make a data-driven decision on whether to expand, based on the given data.

Net (Expand) = (0.6 * 8 + 0.4 * 6) - 3 = $4.2M

Net (Not Expand) = (0.6 * 4 + 0.4 * 2) - 0 = $3.2M

Since $4.2M > $3.2M, the factory should be expanded.
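
The same expected-value arithmetic can be reproduced in a short Python snippet, using the figures from the example above:

p_good, p_bad = 0.6, 0.4

net_expand     = (p_good * 8 + p_bad * 6) - 3   # expansion costs $3M
net_not_expand = (p_good * 4 + p_bad * 2) - 0   # no cost if the factory is not expanded

print(round(net_expand, 2))      # 4.2
print(round(net_not_expand, 2))  # 3.2  -> expanding is the better choice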

Decision Tree Algorithm:


The decision tree algorithm may appear long, but the basic technique is quite simple and proceeds as follows:
The algorithm is based on three parameters: D, attribute_list, and Attribute_selection_method.
Generally, we refer to D as a data partition.
Initially, D is the entire set of training tuples and their associated class labels (the input training data).
The parameter attribute_list is the set of attributes describing the tuples.
Attribute_selection_method specifies a heuristic procedure for choosing the attribute that "best" discriminates the given tuples according to class.
To do this, the Attribute_selection_method procedure applies an attribute selection measure, such as information gain.

Advantages of using decision trees:


• A decision tree does not require feature scaling.
• Missing values in the data do not influence the process of building a decision tree to any considerable extent.
• A decision tree model is intuitive and simple to explain to the technical team as well as to stakeholders.
• Compared to other algorithms, decision trees need less effort for data preparation during pre-processing.
• A decision tree does not require standardization of data.

A decision tree is a structure that includes a root node, branches, and leaf nodes. Each internal node
denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf node holds a
class label. The topmost node in the tree is the root node.
A classic illustration is a decision tree for the concept buy_computer, which indicates whether a customer at a company is likely to buy a computer or not. Each internal node represents a test on an attribute, and each leaf node represents a class.
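
As a hedged sketch of how such a tree can be built in practice, the snippet below fits a decision tree for a buy_computer-style problem with scikit-learn. The attribute names and the tiny table are assumptions made for illustration, not data taken from the text.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# A made-up table in the spirit of the buy_computer example (illustrative only).
data = pd.DataFrame({
    "age":           ["youth", "youth", "middle_aged", "senior", "senior"],
    "income":        ["high", "high", "high", "medium", "low"],
    "student":       ["no", "no", "no", "no", "yes"],
    "credit_rating": ["fair", "excellent", "fair", "fair", "excellent"],
    "buys_computer": ["no", "no", "yes", "yes", "no"],
})

# One-hot encode the categorical attributes so the tree can test them at internal nodes.
X = pd.get_dummies(data.drop(columns="buys_computer"))
y = data["buys_computer"]

clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(clf, feature_names=list(X.columns)))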

The benefits of having a decision tree are as follows −


• It does not require any domain knowledge.
• It is easy to comprehend.
• The learning and classification steps of a decision tree are simple and fast.
Decision Tree Induction Algorithm
A machine learning researcher named J. Ross Quinlan developed a decision tree algorithm known as ID3 (Iterative Dichotomiser) in 1980. He later presented C4.5, the successor of ID3. ID3 and C4.5 adopt a greedy approach: there is no backtracking, and the trees are constructed in a top-down, recursive, divide-and-conquer manner.
Generating a decision tree from the training tuples of data partition D
Algorithm: Generate_decision_tree

Input:
Data partition, D, which is a set of training tuples
and their associated class labels.
attribute_list, the set of candidate attributes.
Attribute_selection_method, a procedure to determine the
splitting criterion that best partitions the data
tuples into individual classes. This criterion consists of a
splitting_attribute and, possibly, either a split point or a splitting subset.

Output:
A Decision Tree

Method:
create a node N;

if tuples in D are all of the same class, C, then
    return N as a leaf node labeled with the class C;

if attribute_list is empty then
    return N as a leaf node labeled with the majority class in D; // majority voting

apply Attribute_selection_method(D, attribute_list)
    to find the "best" splitting_criterion;
label node N with splitting_criterion;

if splitting_attribute is discrete-valued and
    multiway splits allowed then // not restricted to binary trees
    attribute_list = attribute_list - splitting_attribute; // remove the splitting attribute

for each outcome j of splitting_criterion
    // partition the tuples and grow subtrees for each partition
    let Dj be the set of data tuples in D satisfying outcome j; // a partition
    if Dj is empty then
        attach a leaf labeled with the majority class in D to node N;
    else
        attach the node returned by Generate_decision_tree(Dj, attribute_list) to node N;
end for
return N;
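
For readers who prefer concrete code, the following is a minimal, runnable Python sketch of the same recursive procedure: an ID3-style learner with multiway splits on discrete-valued attributes and information gain as the attribute selection measure. The function names, the dict-based representation of tuples, and the toy data set are assumptions made for this illustration.

from collections import Counter
from math import log2

def entropy(D, target):
    counts = Counter(t[target] for t in D)
    return -sum(c / len(D) * log2(c / len(D)) for c in counts.values())

def best_attribute(D, attribute_list, target):
    # Attribute_selection_method: choose the attribute with the highest information gain.
    def gain(A):
        parts = [[t for t in D if t[A] == v] for v in {t[A] for t in D}]
        weighted = sum(len(Dj) / len(D) * entropy(Dj, target) for Dj in parts)
        return entropy(D, target) - weighted
    return max(attribute_list, key=gain)

def generate_decision_tree(D, attribute_list, target="class"):
    classes = {t[target] for t in D}
    if len(classes) == 1:                               # all tuples in D are of the same class C
        return classes.pop()
    if not attribute_list:                              # attribute_list is empty: majority voting
        return Counter(t[target] for t in D).most_common(1)[0][0]

    A = best_attribute(D, attribute_list, target)       # the "best" splitting attribute
    remaining = [a for a in attribute_list if a != A]   # remove the splitting attribute
    node = {A: {}}
    for v in {t[A] for t in D}:                         # one subtree per outcome of the split
        Dj = [t for t in D if t[A] == v]                # a partition (never empty here, since v comes from D)
        node[A][v] = generate_decision_tree(Dj, remaining, target)
    return node

# Example: a tiny two-attribute data set with hypothetical values.
D = [
    {"outlook": "sunny", "windy": "no",  "class": "play"},
    {"outlook": "sunny", "windy": "yes", "class": "stay"},
    {"outlook": "rainy", "windy": "no",  "class": "stay"},
    {"outlook": "rainy", "windy": "yes", "class": "stay"},
]
print(generate_decision_tree(D, ["outlook", "windy"]))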

Tree Pruning
Tree pruning is performed in order to remove anomalies in the training data due to noise or outliers.
The pruned trees are smaller and less complex.
Tree Pruning Approaches
There are two approaches to prune a tree −
• Pre-pruning − The tree is pruned by halting its construction early.
• Post-pruning − This approach removes a sub-tree from a fully grown tree.

Cost Complexity
The cost complexity is measured by the following two parameters −
• Number of leaves in the tree, and
• Error rate of the tree
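
These two parameters are traded off against each other during post-pruning. As a hedged illustration, scikit-learn exposes cost-complexity pruning through the ccp_alpha parameter and the cost_complexity_pruning_path method; the data set below is synthetic and used only for demonstration.

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, for illustration only.
X, y = make_classification(n_samples=200, random_state=0)

# The pruning path trades off the number of leaves against the error rate:
# larger values of ccp_alpha penalize leaves more heavily and yield smaller trees.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)

for alpha in path.ccp_alphas:
    pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X, y)
    print(f"alpha={alpha:.4f}  leaves={pruned.get_n_leaves()}")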
