Ai Unit 3


UNIT III

What is planning in AI?

• Planning in Artificial Intelligence is the decision-making task performed by
robots or computer programs to achieve a specific goal.
• Executing a plan means choosing a sequence of actions with a high likelihood
of completing the specific task.

Blocks-World planning problem


• A well-known blocks-world problem is the Sussman Anomaly.
• Noninterleaved planners of the early 1970s were unable to solve this problem,
hence it is considered anomalous.
• When two subgoals G1 and G2 are given, a noninterleaved planner produces
either a plan for G1 concatenated with a plan for G2, or vice versa.
• In the blocks-world problem, three blocks labeled 'A', 'B' and 'C' rest on a
flat surface. The given condition is that only one block can be moved at a time
to achieve the goal.
• The start state and goal state are shown in the following diagram.

Components of Planning System

Planning consists of the following important steps:

• Choose the best rule to apply next, based on the best available heuristic
information.
• Apply the chosen rule to compute the new problem state.
• Detect when a solution has been found.
• Detect dead ends so that they can be abandoned and the system's effort is directed
in more fruitful directions.
• Detect when an almost correct solution has been found.

Goal stack planning

This is one of the most important planning algorithms, which is specifically used
by STRIPS.

• The algorithm uses a stack to hold the goals still to be satisfied and the actions
chosen to satisfy them. A knowledge base holds the current state and the available
actions.
• A goal stack is similar to a node in a search tree, where branches are created
whenever there is a choice of action.

Example :

• The example below is from the blocks-world domain.

Solution:-
Planning using State Space Search
State space consists of the initial state, set of goal states, set of actions or
operations, set of states and the path cost. This state space needs to be searched to
find a sequence of actions leading to the goal state. This can be done in the forward
or backward direction.
Forward State Space Search
It is also called Progression. It starts from the initial state and searches in the
forward direction until the goal is reached. It uses the STRIPS representation. The
problem formulation looks like this:

Initial state: start state


Actions: Each action has a particular precondition to be satisfied before the action
can be performed and an effect that the action will have on the environment.
Goal test: To check if the current state is the goal state or not.
Step cost: Cost of each step which is assumed to be 1.
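
The formulation above can be sketched as a small breadth-first progression search.
This is only an illustration under stated assumptions: the two-block facts and the
action names (unstack(A,B), stack(B,A)) are hypothetical and not taken from these
notes, and each action is written STRIPS-style as preconditions, add effects and
delete effects.

```python
from collections import deque

# Hypothetical STRIPS-style actions: name -> (preconditions, add effects, delete effects).
ACTIONS = {
    "unstack(A,B)": ({"on(A,B)", "clear(A)"}, {"ontable(A)", "clear(B)"}, {"on(A,B)"}),
    "stack(B,A)": ({"clear(A)", "clear(B)", "ontable(B)"}, {"on(B,A)"}, {"clear(A)", "ontable(B)"}),
}


def forward_search(initial, goal, actions):
    """Breadth-first progression search; every step has cost 1."""
    start = frozenset(initial)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                      # goal test: all goal facts hold
            return plan
        for name, (pre, add, delete) in actions.items():
            if pre <= state:                   # applicable only if preconditions hold
                nxt = frozenset((state - delete) | add)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None                                # no plan found


initial_state = {"on(A,B)", "ontable(B)", "clear(A)"}
goal_state = {"on(B,A)"}
plan = forward_search(initial_state, goal_state, ACTIONS)
print(plan)
```

Because every step costs 1, breadth-first search returns a shortest plan for this
toy problem.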
Backward State Space Search
It is also called Regression. It uses the STRIPS representation. The problem
formulation is similar to that of forward state space search (FSSS) and consists of
the initial state, actions, goal test and step cost. In backward state space search
(BSSS), the search starts from the goal state and moves in the backward direction
until the initial state is reached. It starts at the goal and checks whether it is
the initial state. If not, it applies the inverse of the actions to produce sub goals
until start state is reached. For instance,
Total Order planning (TOP)
FSSS and BSSS are examples of TOP. They only explore linear sequences of
actions from the start to the goal state. They cannot take advantage of problem
decomposition, i.e. splitting the problem into smaller sub-problems and solving
them individually.
Partial Order Planning (POP)
It works on problem decomposition. It will divide the problem into parts and
achieve these sub goals independently. It solves the sub problems with sub plans
and then combines these sub plans and reorders them based on requirements. In
POP, ordering of the actions is partial. It does not specify which action will come
first out of the two actions which are placed in the plan. Let’s look at this with the
help of an example. The problem of wearing shoes can be performed through total
order or partial order planning.
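
As a rough sketch of the shoe example, a partial-order plan can be stored as a set
of ordering constraints, and any topological order of those constraints is one valid
total order. The action names below (left_sock, left_shoe and so on) are assumptions
made for illustration.

```python
from graphlib import TopologicalSorter

# Partial-order plan: each action maps to the actions that must come before it.
# Nothing is said about the order of left-foot versus right-foot actions.
partial_order = {
    "left_shoe": {"left_sock"},
    "right_shoe": {"right_sock"},
    "left_sock": set(),
    "right_sock": set(),
}


def linearize(plan):
    """Return one total order that is consistent with the partial order."""
    return list(TopologicalSorter(plan).static_order())


def respects(order, plan):
    """Check that every action appears after all of its predecessors."""
    position = {action: i for i, action in enumerate(order)}
    return all(position[before] < position[after]
               for after, befores in plan.items() for before in befores)


total_order = linearize(partial_order)
print(total_order)
```

Several different total orders satisfy the same constraints, which is exactly the
freedom that POP keeps open until the sub plans are combined.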
Hierarchical Planning
Here the plans are organized in a hierarchical format. It works on plan
decomposition. Complex actions are decomposed into simpler or primitive ones and
it can be denoted with the help of links between various states at different levels of
the hierarchy. This is called operator expansion.
Primitive tasks- these correspond to the actions of STRIPS,
Compound tasks- these are a set of simpler tasks,
Goal tasks- these correspond to goals of STRIPS.

Conditional Planning
It works regardless of the outcome of an action. It deals with uncertainty by
inspecting what is happening in the environment at predetermined points in the
plan.
Types of Learning in Agents in Artificial Intelligence
Learning agents, as described earlier, are systems which are capable of
training themselves by learning from their own actions and experiences.
The learning process in the agent is broadly classified into four types:
1. Supervised Machine Learning

As its name suggests, Supervised machine learning is based on supervision. It
means that in the supervised learning technique, we train the machines using a
"labelled" dataset, and based on the training, the machine predicts the output. Here,
labelled data means that the inputs are already mapped to their outputs.
More precisely, we first train the machine with the input and the
corresponding output, and then we ask the machine to predict the output using the
test dataset.

Let's understand supervised learning with an example. Suppose we have an input
dataset of cat and dog images. First, we train the machine to understand the
images, using features such as the shape and size of the tail of a cat and a dog,
the shape of the eyes, colour, height (dogs are taller, cats are smaller), etc. After
completion of training, we input the picture of a cat and ask the machine to identify
the object and predict the output. Now the machine is well trained, so it will check
all the features of the object, such as height, shape, colour, eyes, ears, tail, etc., and
find that it's a cat. So, it will put it in the Cat category. This is how
the machine identifies objects in Supervised Learning.

The main goal of the supervised learning technique is to map the input
variable(x) with the output variable(y). Some real-world applications of
supervised learning are Risk Assessment, Fraud Detection, Spam filtering, etc.
Categories of Supervised Machine Learning
Supervised machine learning can be classified into two types of problems, which
are given below:

o Classification
o Regression

a) Classification
Classification algorithms are used to solve classification problems in which the
output variable is categorical, such as "Yes" or "No", Male or Female, Red or
Blue, etc. The classification algorithms predict the categories present in the
dataset. Some real-world examples of classification algorithms are Spam
Detection, Email filtering, etc.

Some popular classification algorithms are given below:

o Random Forest Algorithm
o Decision Tree Algorithm
o Logistic Regression Algorithm
o Support Vector Machine Algorithm

b) Regression
Regression algorithms are used to solve regression problems, in which the model
learns a relationship (linear in the simplest case) between input and output
variables. These are used to predict continuous output variables, such as market
trends, weather prediction, etc.

Some popular Regression algorithms are given below:

o Simple Linear Regression Algorithm
o Multivariate Regression Algorithm
o Decision Tree Algorithm
o Lasso Regression
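
To make the idea of predicting a continuous output concrete, here is a minimal sketch
of simple linear regression fitted by ordinary least squares. The data points are
made up for illustration (they lie exactly on y = 2x + 1), and the function name
fit_line is an assumption, not part of any library.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative labelled training data: inputs xs with known outputs ys.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept   # predict the output for an unseen input
print(slope, intercept, prediction)
```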

Advantages and Disadvantages of Supervised Learning


Advantages:
o Since supervised learning works with a labelled dataset, we can have an
exact idea about the classes of objects.
o These algorithms are helpful in predicting the output on the basis of prior
experience.

Disadvantages:

o These algorithms are not able to solve complex tasks.
o It may predict the wrong output if the test data is different from the training
data.
o It requires lots of computational time to train the algorithm.

Applications of Supervised Learning


Some common applications of Supervised Learning are given below:

o Image Segmentation:
Supervised Learning algorithms are used in image segmentation. In this
process, image classification is performed on different image data with
pre-defined labels.
o Medical Diagnosis:
Supervised algorithms are also used in the medical field for diagnosis
purposes. It is done by using medical images and past data labelled with
disease conditions. With such a process, the machine can identify
a disease for new patients.
o Fraud Detection - Supervised Learning classification algorithms are used
for identifying fraud transactions, fraud customers, etc. It is done by using
historic data to identify the patterns that can lead to possible fraud.
o Spam detection - In spam detection & filtering, classification algorithms are
used. These algorithms classify an email as spam or not spam. The spam
emails are sent to the spam folder.
o Speech Recognition - Supervised learning algorithms are also used in
speech recognition. The algorithm is trained with voice data, and various
identifications can be done using the same, such as voice-activated
passwords, voice commands, etc.
2. Unsupervised Machine Learning

Unsupervised learning is different from the Supervised learning technique; as its
name suggests, there is no need for supervision. It means that in unsupervised machine
learning, the machine is trained using an unlabeled dataset, and the machine
predicts the output without any supervision.

In unsupervised learning, the models are trained with the data that is neither
classified nor labelled, and the model acts on that data without any supervision.

The main aim of the unsupervised learning algorithm is to group or categorize
the unsorted dataset according to similarities, patterns, and
differences. Machines are instructed to find the hidden patterns in the input
dataset.

Let's take an example to understand it more precisely. Suppose there is a basket of
fruit images, and we input it into the machine learning model. The images are
totally unknown to the model, and the task of the machine is to find the patterns
and categories of the objects.

So, now the machine will discover its patterns and differences, such as colour
difference, shape difference, and predict the output when it is tested with the test
dataset.

Categories of Unsupervised Machine Learning


Unsupervised Learning can be further classified into two types, which are given
below:

o Clustering
o Association

1) Clustering
The clustering technique is used when we want to find the inherent groups from
the data. It is a way to group the objects into a cluster such that the objects with the
most similarities remain in one group and have fewer or no similarities with the
objects of other groups. An example of the clustering algorithm is grouping the
customers by their purchasing behaviour.
Some of the popular clustering algorithms are given below:

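o K-Means Clustering algorithm
o Mean-shift algorithm
o DBSCAN Algorithm
o Principal Component Analysis (strictly a dimensionality-reduction technique
rather than a clustering algorithm, often applied before clustering)
o Independent Component Analysis (likewise a decomposition technique rather
than a clustering algorithm)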
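
As an illustration of clustering, the following is a bare-bones K-Means sketch on a
handful of made-up 2-D points. It assumes a naive initialisation (the first k points
become the starting centroids) so that the result is reproducible; real
implementations usually initialise more carefully.

```python
def kmeans(points, k, iterations=10):
    """Tiny K-Means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [points[i] for i in range(k)]          # naive init (assumption)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        for i, members in enumerate(clusters):
            if members:                                # keep old centroid if cluster is empty
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, clusters


# Illustrative unlabeled data: two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

No labels are supplied anywhere; the grouping emerges only from the distances
between the points.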

2) Association
Association rule learning is an unsupervised learning technique, which finds
interesting relations among variables within a large dataset. The main aim of this
learning algorithm is to find the dependency of one data item on another data item
and map those variables accordingly so that it can generate maximum profit. This
algorithm is mainly applied in Market Basket analysis, Web usage mining,
continuous production, etc.

Some popular algorithms of Association rule learning are the Apriori Algorithm,
Eclat, and the FP-growth algorithm.
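
The dependency of one item on another is usually measured with support and
confidence. The sketch below computes both for a made-up set of market-basket
transactions; it is not an implementation of Apriori, Eclat or FP-growth, only the
two measures those algorithms rely on.

```python
# Illustrative market-basket transactions (assumed data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


bread_milk_support = support({"bread", "milk"}, transactions)
bread_to_milk_confidence = confidence({"bread"}, {"milk"}, transactions)
print(bread_milk_support, bread_to_milk_confidence)
```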

Advantages and Disadvantages of Unsupervised Learning Algorithm


Advantages:

o These algorithms can be used for more complicated tasks than the
supervised ones because these algorithms work on the unlabeled dataset.
o Unsupervised algorithms are preferable for various tasks as getting an
unlabeled dataset is easier than getting a labelled dataset.

Disadvantages:

o The output of an unsupervised algorithm can be less accurate as the dataset
is not labelled, and the algorithms are not trained with the exact output beforehand.
o Working with Unsupervised learning is more difficult as it works with an
unlabelled dataset that is not mapped to the output.
Applications of Unsupervised Learning
o Network Analysis: Unsupervised learning is used for identifying plagiarism
and copyright in document network analysis of text data for scholarly
articles.
o Recommendation Systems: Recommendation systems widely use
unsupervised learning techniques for building recommendation applications
for different web applications and e-commerce websites.
o Anomaly Detection: Anomaly detection is a popular application of
unsupervised learning, which can identify unusual data points within the
dataset. It is used to discover fraudulent transactions.
o Singular Value Decomposition: Singular Value Decomposition or SVD is
used to extract particular information from the database. For example,
extracting information of each user located at a particular location.

3. Semi-Supervised Learning

Semi-Supervised learning is a type of Machine Learning algorithm that lies
between Supervised and Unsupervised machine learning. It represents the
intermediate ground between Supervised (with labelled training data) and
Unsupervised learning (with no labelled training data) algorithms and uses a
combination of labelled and unlabeled datasets during the training period.

Although Semi-supervised learning is the middle ground between supervised and
unsupervised learning and operates on data that contains a few labels, the data is
mostly unlabeled. Labels are costly, so in practice an organisation may have only a
few of them. This differs from supervised and unsupervised learning, which are
defined by the presence and absence of labels respectively.

To overcome the drawbacks of supervised learning and unsupervised learning
algorithms, the concept of Semi-supervised learning is introduced. The main
aim of semi-supervised learning is to effectively use all the available data, rather
than only labelled data as in supervised learning. Initially, similar data is
clustered using an unsupervised learning algorithm, and the clusters then help to
label the unlabeled data. This is useful because labelled data is comparatively
more expensive to acquire than unlabeled data.
We can imagine these algorithms with an example. Supervised learning is where a
student is under the supervision of an instructor at home and college. Further, if
that student is self-analysing the same concept without any help from the
instructor, it comes under unsupervised learning. Under semi-supervised learning,
the student has to revise himself after analyzing the same concept under the
guidance of an instructor at college.

Advantages and disadvantages of Semi-supervised Learning


Advantages:

o The algorithm is simple and easy to understand.
o It is highly efficient.
o It is used to solve drawbacks of Supervised and Unsupervised Learning
algorithms.

Disadvantages:

o Iteration results may not be stable.
o We cannot apply these algorithms to network-level data.
o Accuracy is low.

4. Reinforcement Learning

Reinforcement learning works on a feedback-based process, in which an AI
agent (a software component) automatically explores its surroundings by
trial and error: taking actions, learning from experience, and improving its
performance. The agent gets rewarded for each good action and punished for each
bad action; hence the goal of a reinforcement learning agent is to maximize the
rewards.

In reinforcement learning, there is no labelled data as in supervised learning, and
agents learn from their experiences only.

The reinforcement learning process is similar to how a human being learns; for
example, a child learns various things through experience in day-to-day life. An
example of reinforcement learning is playing a game, where the game is the
environment, the moves of the agent at each step define states, and the goal of the
agent is to get a high score. The agent receives feedback in terms of punishments
and rewards.

Due to its way of working, reinforcement learning is employed in different fields
such as game theory, operations research, information theory, and multi-agent
systems.

A reinforcement learning problem can be formalized using a Markov Decision
Process (MDP). In an MDP, the agent constantly interacts with the environment and
performs actions; at each action, the environment responds and generates a new
state.
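
The agent and environment loop can be sketched with tabular Q-learning on a tiny
MDP. Everything concrete here is an assumption made for illustration: a corridor of
four states with a reward of 1 for reaching the last one, a learning rate of 0.5 and
a discount factor of 0.9. To keep the sketch deterministic, every state-action pair
is tried in turn instead of being explored at random.

```python
# Hypothetical 4-state corridor: states 0..3, state 3 is the terminal goal.
N_STATES, GOAL = 4, 3
ACTIONS = ("left", "right")
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (assumed values)


def step(state, action):
    """Environment response: the new state and the reward for one action."""
    nxt = min(state + 1, GOAL) if action == "right" else max(state - 1, 0)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


def q_learning(sweeps=200):
    """Apply the Q-learning update to every state-action pair, sweep after sweep."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(GOAL):                 # the goal state itself needs no update
            for a in ACTIONS:
                nxt, reward = step(s, a)
                best_next = 0.0 if nxt == GOAL else max(q[(nxt, b)] for b in ACTIONS)
                q[(s, a)] += ALPHA * (reward + GAMMA * best_next - q[(s, a)])
    return q


q = q_learning()
# Greedy policy: in each state pick the action with the highest learned value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

The learned values grow as the agent gets closer to the goal, so the greedy policy
simply follows the rewards.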

Categories of Reinforcement Learning


Reinforcement learning is categorized mainly into two types of
methods/algorithms:

o Positive Reinforcement Learning: Positive reinforcement learning
increases the tendency that the required behaviour will occur
again by adding something. It enhances the strength of the behaviour of the
agent and positively impacts it.
o Negative Reinforcement Learning: Negative reinforcement learning works
in the opposite way to positive RL. It increases the tendency that the
specific behaviour will occur again by avoiding the negative condition.

Real-world Use cases of Reinforcement Learning


o Video Games:
RL algorithms are very popular in gaming applications, where they are used to
reach super-human performance. Well-known game-playing programs that use RL
algorithms are AlphaGo and AlphaGo Zero.
o Resource Management:
The "Resource Management with Deep Reinforcement Learning" paper
showed how to use RL to automatically learn to allocate and schedule
computer resources to waiting jobs in order to minimize average job
slowdown.
o Robotics:
RL is widely being used in Robotics applications. Robots are used in the
industrial and manufacturing area, and these robots are made more powerful
with reinforcement learning. There are different industries that have their
vision of building intelligent robots using AI and Machine learning
technology.
o Text Mining
Text-mining, one of the great applications of NLP, is now being
implemented with the help of Reinforcement Learning by Salesforce
company.

Advantages and Disadvantages of Reinforcement Learning


Advantages

o It helps in solving complex real-world problems which are difficult to
solve with general techniques.
o The learning model of RL is similar to the learning of human beings; hence
accurate results can often be found.
o Helps in achieving long term results.

Disadvantages

o RL algorithms are not preferred for simple problems.
o RL algorithms require huge data and computations.
o Too much reinforcement learning can lead to an overload of states which
can weaken the results.

The curse of dimensionality limits reinforcement learning for real physical
systems.

Inductive Learning Algorithm


Inductive Learning Algorithm (ILA) is an iterative and inductive machine learning
algorithm used for generating a set of classification rules of the form "IF-THEN"
from a set of examples. It produces rules at each iteration and appends them to the
set of rules.

Basic Idea: There are basically two methods for knowledge extraction: first from
domain experts, and then with machine learning. For a very large amount of data,
domain experts are not very useful or reliable, so we move towards the machine
learning approach. One way to use machine learning is to replicate the expert's logic
in the form of algorithms, but this work is very tedious, time-consuming and
expensive. So we move towards inductive algorithms, which generate the strategy for
performing a task themselves and need not be instructed separately at each step.

Need of ILA in presence of other machine learning algorithms: ILA was introduced
even though other inductive learning algorithms such as ID3 and AQ were already
available.
• The need was due to the pitfalls present in the previous algorithms; one of the
major pitfalls was a lack of generalisation of rules.
• ID3 and AQ used the decision tree production method, which was too
specific, difficult to analyse and very slow even for basic
short classification problems.
• A decision tree-based algorithm was unable to work for a new problem if
some attributes were missing.
• ILA produces a general set of rules instead of
decision trees, which overcomes the above problems.
THE ILA ALGORITHM: General requirements at start of the algorithm:-
1. list the examples in the form of a table ‘T’ where each row corresponds to an
example and each column contains an attribute value.
2. create a set of m training examples, each example composed of k attributes and
a class attribute with n possible decisions.
3. create a rule set, R, having the initial value false.
4. initially all rows in the table are unmarked.

How AI Classification Works

AI classification works when the business feeds the AI data points, such as
product stock, along with their predetermined categories. The algorithm studies the
information in this database. For each category, it creates a model, based on what it
learned, that likely represents the type of product in that category. It then applies
this model to new products to decide which category they belong to.
Types of Classification Algorithms

While you don’t have to understand the details of your AI classifier, here is a basic
overview of some strategies computers use to categorize data points in business.
• Bayes' Theorem. This approach predicts the probability that a product belongs to
a class based on its features. If many parts of the product seem to share
characteristics with the category, then it's likely part of it.

• Decision Trees. You've likely heard of this type if you've played the "20
questions" game. The algorithm slowly deduces the attributes of the product by
asking questions to narrow down where the data point belongs.

• K-nearest Neighbors. This algorithm compares a new piece of data with other
similar data points already in the database. It has popular applications in
predicting the prices of goods in the marketplace and providing product
recommendations.

• Neural Networks. This buzzword is popular in the machine learning field since
these models are loosely inspired by how the human brain picks up new information.
Arguably the most challenging type of AI to develop, neural networks have caught
the interest of large enterprises like Facebook, Google, and Amazon for their
versatility and virtually endless applications.
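
As one concrete instance of the strategies above, here is a minimal K-nearest
Neighbors sketch. The product features (price, weight in kg) and the category names
are invented for illustration, and knn_predict is a hypothetical helper rather than
a library function.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train: list of (features, label). Returns the majority label of the k nearest points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Assumed training data: products described by (price, weight_kg) with known categories.
train = [
    ((1.0, 0.2), "snack"), ((1.5, 0.3), "snack"), ((2.0, 0.25), "snack"),
    ((40.0, 2.0), "appliance"), ((55.0, 3.5), "appliance"), ((60.0, 3.0), "appliance"),
]

label = knn_predict(train, (50.0, 2.5))
print(label)
```

In practice the features would normally be scaled first, because an attribute with a
large numeric range (here the price) dominates the distance.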

Naïve Bayes Classifier Algorithm


o Naïve Bayes algorithm is a supervised learning algorithm, which is based
on Bayes theorem and used for solving classification problems.
o It is mainly used in text classification that includes a high-dimensional
training dataset.
o Naïve Bayes Classifier is one of the simple and most effective Classification
algorithms which helps in building the fast machine learning models that can
make quick predictions.
o It is a probabilistic classifier, which means it predicts on the basis of the
probability of an object.
o Some popular examples of Naïve Bayes Algorithm are spam filtration,
sentiment analysis, and classifying articles.

Why is it called Naïve Bayes?

The name Naïve Bayes is made up of two words, Naïve and Bayes, which
can be described as:
o Naïve: It is called Naïve because it assumes that the occurrence of a certain
feature is independent of the occurrence of other features. For example, if a fruit
is identified on the basis of color, shape, and taste, then a red, spherical, and
sweet fruit is recognized as an apple. Hence each feature individually
contributes to identifying it as an apple without depending on the others.
o Bayes: It is called Bayes because it depends on the principle of Bayes'
Theorem.

Bayes' Theorem:

o Bayes' theorem is also known as Bayes' Rule or Bayes' law, which is used
to determine the probability of a hypothesis with prior knowledge. It
depends on the conditional probability.
o The formula for Bayes' theorem is given as:

P(A|B) = P(B|A)*P(A)/P(B)

Where,

P(A|B) is Posterior probability: Probability of hypothesis A given the observed
event B.

P(B|A) is Likelihood probability: Probability of the evidence B given that
hypothesis A is true.

P(A) is Prior Probability: Probability of the hypothesis before observing the
evidence.

P(B) is Marginal Probability: Probability of the evidence.

Working of Naïve Bayes' Classifier:

Working of Naïve Bayes' Classifier can be understood with the help of the below
example:

Suppose we have a dataset of weather conditions and a corresponding target
variable "Play". Using this dataset, we need to decide whether we should
play or not on a particular day according to the weather conditions. To solve this
problem, we need to follow the below steps:

1. Convert the given dataset into frequency tables.
2. Generate a Likelihood table by finding the probabilities of the given features.
3. Now, use Bayes theorem to calculate the posterior probability.

Problem: If the weather is sunny, should the Player play or not?

Solution: To solve this, first consider the below dataset:

      Outlook     Play
0     Rainy       Yes
1     Sunny       Yes
2     Overcast    Yes
3     Overcast    Yes
4     Sunny       No
5     Rainy       Yes
6     Sunny       Yes
7     Overcast    Yes
8     Rainy       No
9     Sunny       No
10    Sunny       Yes
11    Rainy       No
12    Overcast    Yes
13    Overcast    Yes

Frequency table for the Weather Conditions:

Weather     Yes    No
Overcast    5      0
Rainy       2      2
Sunny       3      2
Total       10     4

Likelihood table weather condition:

Weather     No             Yes
Overcast    0              5               5/14 = 0.36
Rainy       2              2               4/14 = 0.29
Sunny       2              3               5/14 = 0.36
All         4/14 = 0.29    10/14 = 0.71

Applying Bayes' theorem:

P(Yes|Sunny) = P(Sunny|Yes)*P(Yes)/P(Sunny)

P(Sunny|Yes) = 3/10 = 0.3

P(Sunny) = 5/14 = 0.36

P(Yes) = 10/14 = 0.71

So P(Yes|Sunny) = (3/10)*(10/14)/(5/14) = 3/5 = 0.60

P(No|Sunny) = P(Sunny|No)*P(No)/P(Sunny)

P(Sunny|No) = 2/4 = 0.5

P(No) = 4/14 = 0.29

P(Sunny) = 5/14 = 0.36

So P(No|Sunny) = (2/4)*(4/14)/(5/14) = 2/5 = 0.40

As we can see from the above calculation, P(Yes|Sunny) > P(No|Sunny).

Hence on a Sunny day, the Player can play the game.
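
The same calculation can be reproduced directly from the 14-row dataset. This is a
small sketch, not a general Naïve Bayes implementation: it handles the single
Outlook feature only and uses exact fractions so that no rounding is involved. The
function name posterior is chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction

# The (Outlook, Play) rows of the dataset, in order 0..13.
data = [
    ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Sunny", "No"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Rainy", "No"), ("Sunny", "No"), ("Sunny", "Yes"), ("Rainy", "No"),
    ("Overcast", "Yes"), ("Overcast", "Yes"),
]


def posterior(outlook, play, rows):
    """P(play | outlook) = P(outlook | play) * P(play) / P(outlook)."""
    n = len(rows)
    class_counts = Counter(p for _, p in rows)      # frequency of Yes / No
    outlook_counts = Counter(o for o, _ in rows)    # frequency of each outlook
    joint_counts = Counter(rows)                    # frequency table
    likelihood = Fraction(joint_counts[(outlook, play)], class_counts[play])
    prior = Fraction(class_counts[play], n)
    evidence = Fraction(outlook_counts[outlook], n)
    return likelihood * prior / evidence


p_yes = posterior("Sunny", "Yes", data)
p_no = posterior("Sunny", "No", data)
print(p_yes, p_no)
```

With exact fractions the two posteriors sum to 1, which is a useful sanity check on
a hand calculation.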

Advantages of Naïve Bayes Classifier:


o Naïve Bayes is one of the fast and easy ML algorithms to predict a class of
datasets.
o It can be used for Binary as well as Multi-class Classifications.
o It performs well in Multi-class predictions as compared to the other
Algorithms.
o It is the most popular choice for text classification problems.

Disadvantages of Naïve Bayes Classifier:


o Naive Bayes assumes that all features are independent or unrelated, so it
cannot learn the relationship between features.
Applications of Naïve Bayes Classifier:
o It is used for Credit Scoring.
o It is used in medical data classification.
o It can be used in real-time predictions because Naïve Bayes Classifier is an
eager learner.
o It is used in Text classification such as Spam filtering and Sentiment
analysis.

Decision Tree Induction


Decision Tree is a supervised learning method used in data mining for
classification and regression tasks. It is a tree that helps us in decision-making.
The decision tree creates classification or regression models as a tree
structure. It separates a data set into smaller and smaller subsets while, at the
same time, the decision tree is incrementally developed. The final tree is a tree
with decision nodes and leaf nodes. A decision node has at least two branches. The
leaf nodes show a classification or decision, and no further split is made on a leaf
node. The uppermost decision node in a tree, which corresponds to the best
predictor, is called the root node. Decision trees can deal with both categorical
and numerical data.

Key factors:

Entropy: Entropy is a common way to measure impurity. In the decision
tree, it measures the randomness or impurity in data sets.
Information Gain:
Information Gain refers to the decline in entropy after the dataset is split. It is also
called Entropy Reduction. Building a decision tree is all about discovering
attributes that return the highest information gain.
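
Both measures can be computed in a few lines. The sketch below uses the usual
definitions (entropy as the negative sum of p * log2(p) over the classes, and
information gain as the entropy before a split minus the weighted entropy after it)
on a made-up four-row example.

```python
import math
from collections import Counter


def entropy(labels):
    """Impurity of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Decline in entropy after splitting the examples by one attribute's values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Assumed toy data: the attribute separates the two classes perfectly.
outlook = ["Sunny", "Sunny", "Overcast", "Overcast"]
play = ["No", "No", "Yes", "Yes"]

before = entropy(play)
gain = information_gain(outlook, play)
print(before, gain)
```

A gain equal to the starting entropy means the split leaves every subset pure.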

In short, a decision tree is just like a flow chart diagram with the terminal nodes
showing decisions. Starting with the dataset, we can measure the entropy to find a
way to segment the set until the data belongs to the same class.

Why are decision trees useful?

It enables us to analyze the possible consequences of a decision thoroughly.

It provides us a framework to measure the values of outcomes and the probability
of accomplishing them.

It helps us to make the best decisions based on existing data and best speculations.

In other words, we can say that a decision tree is a hierarchical tree structure that
can be used to split an extensive collection of records into smaller sets of classes
by implementing a sequence of simple decision rules. A decision tree model
comprises a set of rules for partitioning a huge heterogeneous population into
smaller, more homogeneous, or mutually exclusive classes. The attributes of the
classes can be variables of any type (nominal, ordinal, binary, or quantitative);
in contrast, the classes must be of a qualitative type, such as categorical, ordinal or
binary. In brief, given the data of attributes together with its class, a decision tree
creates a set of rules that can be used to identify the class. One rule is implemented
after another, resulting in a hierarchy of segments within a segment. The hierarchy
is known as the tree, and each segment is called a node. With each progressive
division, the members of the resulting sets become more and more similar to
each other. Hence, the algorithm used to build a decision tree is referred to as
recursive partitioning. A well-known algorithm of this kind is CART (Classification
and Regression Trees).

Consider the given example of a factory where

Expanding the factory costs $3 million; the probability of a good economy is 0.6
(60%), which leads to $8 million profit, and the probability of a bad economy is 0.4
(40%), which leads to $6 million profit.

Not expanding the factory costs $0; the probability of a good economy is 0.6 (60%),
which leads to $4 million profit, and the probability of a bad economy is 0.4 (40%),
which leads to $2 million profit.

The management team needs to take a data-driven decision to expand or not based
on the given data.
Net Expand = (0.6*8 + 0.4*6) - 3 = $4.2M
Net Not Expand = (0.6*4 + 0.4*2) - 0 = $3.2M
$4.2M > $3.2M, therefore the factory should be expanded.
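
The comparison is an expected-value calculation and can be checked with a few lines
of code. The helper name expected_net is an assumption for illustration; amounts are
in millions of dollars. Evaluating the figures gives about 4.2 for expanding and
about 3.2 for not expanding.

```python
def expected_net(outcomes, cost):
    """outcomes: list of (probability, profit in $M). Returns expected profit minus cost."""
    return sum(p * profit for p, profit in outcomes) - cost


expand = expected_net([(0.6, 8), (0.4, 6)], cost=3)
not_expand = expected_net([(0.6, 4), (0.4, 2)], cost=0)
decision = "expand" if expand > not_expand else "do not expand"
print(round(expand, 2), round(not_expand, 2), decision)
```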

Decision tree Algorithm:

The decision tree algorithm may appear long, but the basic technique is quite
simple and is as follows:

The algorithm is based on three parameters: D, attribute_list, and
Attribute_selection_method.

Generally, we refer to D as a data partition.

Initially, D is the entire set of training tuples and their related class labels (input
training data).

The parameter attribute_list is a set of attributes defining the tuples.

Attribute_selection_method specifies a heuristic process for choosing the
attribute that "best" discriminates the given tuples according to class.

The Attribute_selection_method process applies an attribute selection measure.
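
A compact sketch of how the three parameters fit together is given below. It is one
possible reading of the description, not a full algorithm: attributes are assumed to
be categorical, information gain is assumed as the attribute selection measure, and
the four training tuples and the names build_tree and info_gain_selector are
invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def info_gain_selector(D, attribute_list):
    """One possible Attribute_selection_method: highest information gain wins."""
    labels = [label for _, label in D]

    def gain(attr):
        groups = {}
        for row, label in D:
            groups.setdefault(row[attr], []).append(label)
        return entropy(labels) - sum(len(g) / len(D) * entropy(g) for g in groups.values())

    return max(attribute_list, key=gain)


def build_tree(D, attribute_list, attribute_selection_method):
    """D: list of (attribute dict, class label) tuples. Returns a nested dict or a leaf label."""
    labels = [label for _, label in D]
    if len(set(labels)) == 1:              # all tuples share one class: make a leaf
        return labels[0]
    if not attribute_list:                 # nothing left to split on: majority class
        return Counter(labels).most_common(1)[0][0]
    best = attribute_selection_method(D, attribute_list)
    remaining = [a for a in attribute_list if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row, _ in D}):
        subset = [(row, label) for row, label in D if row[best] == value]
        tree[best][value] = build_tree(subset, remaining, attribute_selection_method)
    return tree


# Assumed training tuples: the class depends only on "windy".
D = [
    ({"outlook": "Sunny", "windy": "No"}, "Yes"),
    ({"outlook": "Sunny", "windy": "Yes"}, "No"),
    ({"outlook": "Rainy", "windy": "No"}, "Yes"),
    ({"outlook": "Rainy", "windy": "Yes"}, "No"),
]
tree = build_tree(D, ["outlook", "windy"], info_gain_selector)
print(tree)
```

Passing the selection method in as a parameter mirrors the description above: a
different measure can be swapped in without changing the recursive partitioning.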

Advantages of using decision trees:

A decision tree does not need scaling of the data.

Missing values in data also do not influence the process of building a decision tree
to any considerable extent.

A decision tree model is intuitive and simple to explain to the technical team as
well as stakeholders.

Compared to other algorithms, decision trees need less effort for data
preparation during pre-processing.

A decision tree does not require standardization of data.
