Machine Learning Unit-I


Machine Learning

UNIT -I
Definition of Machine Learning
Machine learning is a subfield of artificial intelligence, which is broadly defined as the capability of a machine to imitate intelligent human behavior.
Artificial intelligence systems are used to perform complex tasks in a way that is similar to how humans solve problems.

A subset of artificial intelligence (AI), machine learning (ML) is the area of computational science that focuses on analyzing and interpreting patterns and
structures in data to enable learning, reasoning, and decision making without human intervention.

As explained, machine learning algorithms have the ability to improve themselves through training. Today, ML algorithms are most commonly trained using three
methods: supervised learning, unsupervised learning, and reinforcement learning. A fourth, semi-supervised learning, combines the first two.

Machine learning is an application of AI that enables systems to learn and improve from experience without being explicitly programmed. Machine
learning focuses on developing computer programs that can access data and use it to learn for themselves.
An “intelligent” computer uses AI to think like a human and perform tasks on its own. Machine learning is how a computer system develops its
intelligence. One way to train a computer to mimic human reasoning is to use a neural network, which is a series of algorithms that are modeled after the
human brain.

Advantages and Disadvantages of Machine Learning

Advantages:

• Continuous Improvement. Machine Learning algorithms are capable of learning from the data we provide. ...

• Automation for everything. ...

• Trends and patterns identification. ...

• Wide range of applications. ...

Disadvantages:

• Data Acquisition. ...

• Highly error-prone. ...

Types of Machine Learning


Machine learning is a subset of AI that enables a machine to learn automatically from data, improve its performance from past
experience, and make predictions. It comprises a set of algorithms that work on large amounts of data.
Data is fed to these algorithms to train them, and on the basis of that training they build a model and perform a specific
task.
These ML algorithms help to solve different business problems such as regression, classification, forecasting, clustering,
and association.

Based on the method and way of learning, machine learning is mainly divided into four types:

1. Supervised Machine Learning

2. Unsupervised Machine Learning

3. Semi-Supervised Machine Learning

4. Reinforcement Learning
In this topic, we will provide a detailed description of the types of Machine Learning along with their respective
algorithms:

1. Supervised Machine Learning


As its name suggests, supervised machine learning is based on supervision. In the supervised learning
technique, we train the machine using a "labelled" dataset, and based on that training, the machine predicts the
output. Here, labelled data means that each training input is already mapped to its correct output. More precisely,
we first train the machine with inputs and their corresponding outputs, and then ask the machine to predict
the output for a test dataset.

Let's understand supervised learning with an example. Suppose we have an input dataset of cat and dog images.
First, we train the machine to understand the images using features such as the shape and size of the tail,
the shape of the eyes, colour, and height (dogs are typically taller, cats smaller). After training is complete, we input the picture of
a cat and ask the machine to identify the object and predict the output. Because the machine is now trained, it
checks the features of the object, such as height, shape, colour, eyes, ears, and tail, and finds that it is a cat. So, it
puts it in the Cat category. This is how the machine identifies objects in supervised learning.

The main goal of the supervised learning technique is to learn a mapping from the input variable (x) to the output variable (y). Some real-world
applications of supervised learning are risk assessment, fraud detection, spam filtering, etc.
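
The train-then-predict workflow described above can be sketched in a few lines of Python. This is only a minimal illustration, not a production method: it uses a 1-nearest-neighbour rule, and the feature values (height and weight) and the function name `predict_label` are invented for the example.

```python
import math

# Hypothetical labelled training data: (height_cm, weight_kg) -> label.
# The numbers are invented purely for illustration.
training_data = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((60.0, 25.0), "dog"),
    ((55.0, 20.0), "dog"),
]


def predict_label(features, labelled_examples):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(labelled_examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]


# A "test" input that the model has not seen during training.
print(predict_label((24.0, 4.2), training_data))
```

The labelled pairs play the role of the training dataset, and the final call plays the role of the test dataset: the model maps an unseen input x to a predicted output y.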

Categories of Supervised Machine Learning

Supervised machine learning can be classified into two types of problems, which are given below:

o Classification

o Regression

a) Classification

Classification algorithms are used to solve classification problems, in which the output variable is categorical, such as
"Yes" or "No", Male or Female, Red or Blue, etc. The classification algorithms predict the categories present in the dataset.
Some real-world examples of classification are spam detection, email filtering, etc.

Some popular classification algorithms are given below:

o Random Forest Algorithm

o Decision Tree Algorithm

o Logistic Regression Algorithm

o Support Vector Machine Algorithm


b) Regression

Regression algorithms are used to solve regression problems, in which the output variable is continuous and is modelled as a function of the
input variables (a linear relationship in the simplest case). They are used to predict continuous outputs, such as market trends, weather prediction, etc.

Some popular Regression algorithms are given below:

o Simple Linear Regression Algorithm

o Multivariate Regression Algorithm

o Decision Tree Algorithm

o Lasso Regression
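
As a rough illustration of the first algorithm in this list, simple linear regression can be fitted with the standard least-squares formulas. The data and the function name `fit_simple_linear_regression` below are assumptions made for the example.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimising squared error for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_linear_regression(xs, ys)
print(slope, intercept)
print(slope * 5.0 + intercept)  # prediction for the unseen input x = 5
```

The fitted line is then used to predict a continuous output for an input that was not in the training data.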

Advantages and Disadvantages of Supervised Learning

Advantages:

o Since supervised learning works with a labelled dataset, we can have an exact idea about the classes of objects.

o These algorithms are helpful in predicting the output on the basis of prior experience.

Disadvantages:

o These algorithms may struggle with very complex tasks.

o They may predict the wrong output if the test data differs from the training data.

o Training the algorithm can require a lot of computational time.

Applications of Supervised Learning


Some common applications of Supervised Learning are given below:

o Image Segmentation:

Supervised Learning algorithms are used in image segmentation. In this process, image classification is performed on
different image data with pre-defined labels.

o Medical Diagnosis:

Supervised algorithms are also used in the medical field for diagnosis. This is done by using medical images and past
data labelled with disease conditions. With such a process, the machine can identify a disease in new
patients.

o Fraud Detection - Supervised learning classification algorithms are used for identifying fraudulent transactions, fraudulent customers,
etc. This is done by using historical data to identify the patterns that can indicate possible fraud.

o Spam detection - Classification algorithms are used in spam detection and filtering. These algorithms classify an email as spam
or not spam. The spam emails are sent to the spam folder.

o Speech Recognition - Supervised learning algorithms are also used in speech recognition. The algorithm is trained with voice
data, and various identification tasks can then be performed, such as voice-activated passwords, voice commands, etc.

2. Unsupervised Machine Learning


Unsupervised learning differs from the supervised learning technique; as its name suggests, there is no need for
supervision. In unsupervised machine learning, the machine is trained using an unlabelled dataset, and the
machine produces its output without any supervision.

In unsupervised learning, the models are trained with data that is neither classified nor labelled, and the model acts
on that data without any supervision.
The main aim of an unsupervised learning algorithm is to group or categorize the unsorted dataset according to similarities,
patterns, and differences. Machines are instructed to find the hidden patterns in the input dataset.

Let's take an example to understand it more precisely. Suppose there is a basket of fruit images, and we input it into
the machine learning model. The images are totally unknown to the model, and the task of the machine is to find the
patterns and categories of the objects.

The machine will discover patterns and differences on its own, such as differences in colour and shape, and use them to group
new images when it is tested with the test dataset.

Categories of Unsupervised Machine Learning

Unsupervised Learning can be further classified into two types, which are given below:

o Clustering

o Association

1) Clustering

The clustering technique is used when we want to find the inherent groups in the data. It is a way to group the
objects into clusters such that the objects with the most similarities remain in one group and have few or no
similarities with the objects of other groups. An example of clustering is grouping customers by their
purchasing behaviour.

Some of the popular clustering algorithms are given below:

o K-Means Clustering algorithm


o Mean-shift algorithm

o DBSCAN Algorithm

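o Principal Component Analysis (strictly a dimensionality-reduction technique rather than a clustering algorithm)

o Independent Component Analysis (likewise a decomposition technique rather than a clustering algorithm)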
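
To make the idea of clustering concrete, here is a minimal K-Means sketch in plain Python. The customer data is invented, and starting from the first k points as centroids is a simplifying assumption; real implementations usually choose the starting centroids more carefully.

```python
import math


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, assignments)."""
    # Simplifying assumption: use the first k points as starting centroids.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Invented customer data: (visits per month, average spend).
customers = [(1, 10), (2, 12), (1, 11), (9, 95), (10, 100), (8, 90)]
centroids, assignments = k_means(customers, k=2)
print(assignments)
```

No labels are supplied anywhere; the two groups of customers emerge only from how close the points are to each other.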

2) Association

Association rule learning is an unsupervised learning technique that finds interesting relations among variables within
a large dataset. The main aim of this learning algorithm is to find the dependency of one data item on another data
item and map those variables accordingly, for example so that a retailer can increase profit. This technique is mainly applied
in market basket analysis, web usage mining, continuous production, etc.

Some popular association rule learning algorithms are Apriori, Eclat, and FP-growth.
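
Association rules are usually scored with two measures, support and confidence. The sketch below computes both for a handful of invented shopping baskets; it illustrates only the measures themselves, not a full algorithm such as Apriori, and the function names are assumptions.

```python
# Invented shopping baskets for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimate P(consequent | antecedent) as support(both) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


print(support({"bread", "butter"}, transactions))
print(confidence({"bread"}, {"butter"}, transactions))
```

A rule such as "bread implies butter" is considered interesting when both numbers are high: the items appear together often, and butter is present in most baskets that contain bread.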

Advantages and Disadvantages of Unsupervised Learning Algorithm

Advantages:

o These algorithms can be used for more complicated tasks than supervised ones because they work on
unlabelled datasets.

o Unsupervised algorithms are preferable for various tasks because an unlabelled dataset is easier to obtain than a
labelled one.

Disadvantages:

o The output of an unsupervised algorithm can be less accurate because the dataset is not labelled and the algorithm is not trained
with the exact output in advance.

o Working with unsupervised learning is more difficult because the unlabelled dataset does not map to an
output.

Applications of Unsupervised Learning


o Network Analysis: Unsupervised learning is used for identifying plagiarism and copyright issues in document network analysis of text
data from scholarly articles.

o Recommendation Systems: Recommendation systems widely use unsupervised learning techniques for building
recommendation applications for different web applications and e-commerce websites.

o Anomaly Detection: Anomaly detection is a popular application of unsupervised learning, which can identify unusual data
points within the dataset. It is used to discover fraudulent transactions.

o Singular Value Decomposition: Singular Value Decomposition (SVD) is used to extract particular information from a
database, for example, information about each user located at a particular location.

3. Semi-Supervised Learning
Semi-supervised learning is a type of machine learning that lies between supervised and unsupervised machine
learning. It represents the intermediate ground between supervised (with labelled training data) and unsupervised
(with no labelled training data) algorithms and uses a combination of labelled and unlabelled datasets during
the training period.

Although semi-supervised learning is the middle ground between supervised and unsupervised learning and operates
on data that contains a few labels, the data is mostly unlabelled. Labels are costly to obtain, so in practice an
organization may have only a few of them. Semi-supervised learning differs from both supervised and unsupervised learning, which are
defined by the presence and absence of labels respectively.
The concept of semi-supervised learning was introduced to overcome the drawbacks of supervised and unsupervised learning
algorithms. Its main aim is to use all the available data effectively, rather than
only labelled data as in supervised learning. Initially, similar data is clustered using an unsupervised learning
algorithm, and the clusters then help to assign labels to the unlabelled data. This matters because labelled data is
comparatively more expensive to acquire than unlabelled data.
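
One simple way to spread a few labels over many unlabelled points is self-training (pseudo-labelling). The sketch below uses that technique rather than the clustering step mentioned above, purely because it is short; the data, the labels "A" and "B", and the function name `self_train` are all invented for illustration.

```python
import math

# Invented data: a few labelled points and several unlabelled ones.
labelled = [((1.0, 1.0), "A"), ((9.0, 9.0), "B")]
unlabelled = [(1.5, 0.5), (2.0, 2.0), (8.0, 9.5), (3.0, 2.5), (7.0, 8.0)]


def self_train(labelled_points, unlabelled_points):
    """Pseudo-label unlabelled points one at a time, closest point first."""
    labelled_points = list(labelled_points)
    remaining = list(unlabelled_points)
    while remaining:
        # Pick the unlabelled point that lies nearest to any labelled point.
        point, (_, label) = min(
            ((u, l) for u in remaining for l in labelled_points),
            key=lambda pair: math.dist(pair[0], pair[1][0]),
        )
        labelled_points.append((point, label))  # treat the guess as a label
        remaining.remove(point)
    return labelled_points


result = self_train(labelled, unlabelled)
for point, label in result:
    print(point, label)
```

Only two points start with labels, yet every point ends up labelled, because each newly labelled point helps to label its neighbours in turn.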

We can picture these algorithms with an example. Supervised learning is where a student is under the supervision of
an instructor at home and at college. If that student analyses the same concept alone, without any help from the
instructor, it comes under unsupervised learning. Under semi-supervised learning, the student revises the concept alone
after first analysing it under the guidance of an instructor at college.

Advantages and disadvantages of Semi-supervised Learning

Advantages:

o The algorithm is simple and easy to understand.

o It is highly efficient.

o It is used to solve drawbacks of Supervised and Unsupervised Learning algorithms.

Disadvantages:

o Iteration results may not be stable.

o We cannot apply these algorithms to network-level data.

o Accuracy can be low.

4. Reinforcement Learning
Reinforcement learning works on a feedback-based process, in which an AI agent (a software component) automatically explores
its surroundings by trial and error, taking actions, learning from experience, and improving its performance. The agent is
rewarded for each good action and punished for each bad action; hence the goal of a reinforcement learning agent is
to maximize the rewards.

Unlike supervised learning, reinforcement learning uses no labelled data; agents learn only from their
experience.

The reinforcement learning process is similar to how a human being learns; for example, a child learns various things from
experience in day-to-day life. An example of reinforcement learning is playing a game, where the game is the
environment, the moves of the agent at each step define states, and the goal of the agent is to get a high score. The agent
receives feedback in the form of punishments and rewards.

Due to its way of working, reinforcement learning is employed in different fields such as game theory, operations
research, information theory, and multi-agent systems.

A reinforcement learning problem can be formalized as a Markov Decision Process (MDP). In an MDP, the agent constantly
interacts with the environment and performs actions; after each action, the environment responds and generates a new
state.
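
The agent-environment loop can be illustrated with tabular Q-learning, one common reinforcement learning algorithm, on a tiny made-up environment. Everything here is an assumption chosen for the example: the five-state line, the reward of 1 at the goal, the function names, and the values of `alpha` and `gamma`.

```python
import random

N_STATES = 5  # states 0..4 on a line; state 4 is the goal
ACTIONS = ["left", "right"]


def step(state, action):
    """Hypothetical environment: move along the line, reward 1 on reaching the goal."""
    if action == "left":
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a purely random behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # explore by trial and error
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate towards reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q = q_learning()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told which move is correct. It only sees states and rewards, and the greedy policy read from the learned table ends up preferring the action that leads towards the goal.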

Categories of Reinforcement Learning

Reinforcement learning is categorized mainly into two types of methods/algorithms:

o Positive Reinforcement Learning: Positive reinforcement increases the tendency that the required behaviour
will occur again by adding something. It strengthens the behaviour of the agent and positively impacts it.

o Negative Reinforcement Learning: Negative reinforcement works in the opposite way to positive RL. It increases the
tendency that the specific behaviour will occur again by removing or avoiding a negative condition.

Real-world Use cases of Reinforcement Learning


o Video Games:

RL algorithms are very popular in gaming applications, where they are used to achieve super-human performance. Well-known
systems that use RL algorithms are AlphaGo and AlphaGo Zero, which play the board game Go.

o Resource Management:

The "Resource Management with Deep Reinforcement Learning" paper showed how to use RL to
automatically learn to allocate and schedule computer resources to waiting jobs in order to minimize average job slowdown.

o Robotics:

RL is widely used in robotics applications. Robots are used in industry and manufacturing, and these robots
are made more capable with reinforcement learning. Various industries have a vision of building
intelligent robots using AI and machine learning technology.

o Text Mining

Text mining, one of the major applications of NLP, is now being implemented with the help of reinforcement learning by
Salesforce.

Advantages and Disadvantages of Reinforcement Learning

Advantages

o It helps in solving complex real-world problems that are difficult to solve with general techniques.

o The learning model of RL is similar to the way human beings learn; hence it can produce very accurate results.

o It helps in achieving long-term results.


Disadvantages

o RL algorithms are not preferred for simple problems.

o RL algorithms require huge data and computations.

o Too much reinforcement learning can lead to an overload of states, which can weaken the results.

o The curse of dimensionality limits reinforcement learning for real physical systems.

Descriptive and Predictive Analysis in Machine Learning


Descriptive and predictive analysis are statistical analysis techniques structured as a
sequence of steps that you take to gain the domain knowledge needed to solve
complex business problems. These techniques give you a clear understanding of the business
problem so that you can make the right decisions. Let’s take a look at descriptive and predictive
analytics in machine learning one by one.
Descriptive Analysis:

Before using a machine learning algorithm, it is very important to acquire a high-level understanding of the
problem. The goal of descriptive analysis is to build an accurate understanding of the problem by
asking questions of historical data. Let’s understand the descriptive analysis process using an
example. Suppose your task is to optimize the supply chain of a department store, and for this task you
have purchase and sales data. After analyzing the data, you form the hypothesis that sales increase
on the day just before the weekend. This means that the machine learning model should account for
periodicity. So, descriptive analysis helps us understand the deeper patterns in the data and uncover
the special features that were overlooked at the initial stage.

In short, the purpose of descriptive analysis is to enable us to understand whether the
machine learning model will perform poorly or whether it is the best model for a particular
problem.
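
The department store example can be sketched as a small descriptive analysis: group historical sales by weekday and compare the averages. The sales figures and the function name `average_by_weekday` are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Invented daily sales records: (weekday, units sold).
sales = [
    ("Mon", 100), ("Tue", 105), ("Wed", 98), ("Thu", 110), ("Fri", 160),
    ("Mon", 95), ("Tue", 102), ("Wed", 101), ("Thu", 108), ("Fri", 170),
]


def average_by_weekday(records):
    """Return {weekday: mean units sold} to expose weekly periodicity."""
    grouped = defaultdict(list)
    for weekday, units in records:
        grouped[weekday].append(units)
    return {weekday: mean(units) for weekday, units in grouped.items()}


averages = average_by_weekday(sales)
busiest_day = max(averages, key=averages.get)
print(averages)
print(busiest_day)
```

A summary like this supports the hypothesis that sales peak just before the weekend, which in turn suggests giving the model a day-of-week feature.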
Predictive Analysis:

Predictive analytics is an important concept in machine learning. Once we
have formed a machine learning model based on descriptive analysis, the next goal is to infer its
future states from some initial conditions. Predictive analytics is used to discover and define
the rules that underlie a process so that a particular condition can be anticipated in time. For example, the
object detector of a self-driving car can be extremely precise at detecting an obstacle in time, but
another model must take the action that minimizes the risk of damage and maximizes the likelihood of
safe movement.
Predictive analytics, therefore, means anticipating a problem in time and taking the most appropriate
action as a prescription to avoid any type of risk.
7 CHARACTERISTICS OF MACHINE LEARNING

1- THE ABILITY TO PERFORM AUTOMATED DATA VISUALIZATION

A massive amount of data is being generated by businesses and common people on a regular basis. By visualizing notable relationships in data, businesses can not
only make better decisions but build confidence as well. Machine learning offers a number of tools that provide rich snippets of data which can be applied to both
unstructured and structured data. With the help of user-friendly automated data visualization platforms in machine learning, businesses can obtain a wealth of
new insights in an effort to increase productivity in their processes.

2- AUTOMATION AT ITS BEST

One of the biggest characteristics of machine learning is its ability to automate repetitive tasks and thus increase productivity. A huge number of organizations
are already using machine learning-powered paperwork and email automation. In the financial sector, for example, a huge number of repetitive, data-heavy and
predictable tasks need to be performed. Because of this, the sector uses different types of machine learning solutions to a great extent. They make accounting
tasks faster, more insightful, and more accurate. Some areas that machine learning already addresses include answering financial queries with the
help of chatbots, making predictions, managing expenses, simplifying invoicing, and automating bank reconciliations.

3- CUSTOMER ENGAGEMENT LIKE NEVER BEFORE


For any business, one of the most crucial ways to drive engagement, promote brand loyalty and establish long-lasting customer relationships is by triggering
meaningful conversations with its target customer base. Machine learning plays a critical role in enabling businesses and brands to spark more valuable
conversations in terms of customer engagement. The technology analyzes particular phrases, words, sentences, idioms, and content formats which resonate
with certain audience members. Pinterest, for example, successfully uses machine learning to personalize suggestions for its users. It uses the
technology to source content in which users will be interested, based on objects which they have already pinned.

4- THE ABILITY TO TAKE EFFICIENCY TO THE NEXT LEVEL WHEN MERGED WITH IOT
Thanks to the huge hype surrounding the IoT, machine learning has experienced a great rise in popularity. Many companies have designated IoT as a strategically
significant area, and many others have launched pilot projects to gauge the potential of IoT in the context of business operations. But
attaining financial benefits through IoT isn’t easy. In order to succeed, companies that offer IoT consulting services and platforms need to
clearly determine the areas that will change with the implementation of IoT strategies. Many of these businesses have failed to do so. In this scenario,
machine learning is probably the best technology that can be used to attain higher levels of efficiency. By merging machine learning with IoT, businesses can
boost the efficiency of their entire production processes.

5- THE ABILITY TO CHANGE THE MORTGAGE MARKET


It’s a fact that building a good credit score usually takes discipline, time, and a lot of financial planning for many consumers. For
lenders, the consumer credit score is one of the biggest measures of creditworthiness and involves a number of factors including payment history, total debt,
length of credit history, etc. But wouldn’t it be great if there were a simpler and better measure? With the help of machine learning, lenders can now obtain a
more comprehensive consumer picture. They can now predict whether the customer is a low spender or a high spender and understand his/her tipping point
of spending. Apart from mortgage lending, financial institutions are using the same techniques for other types of consumer loans.

6- ACCURATE DATA ANALYSIS


Traditionally, data analysis has relied on trial and error, an approach that becomes impractical when working with large
and heterogeneous datasets. Machine learning addresses these issues by offering effective alternatives for analyzing massive volumes
of data. By developing efficient and fast algorithms, as well as data-driven models for processing data in real time, machine learning is able to generate
accurate analysis and results.

7- BUSINESS INTELLIGENCE AT ITS BEST


Machine learning characteristics, when merged with big data analytical work, can generate high levels of business intelligence, with the help of which
several different industries are making strategic initiatives. From retail to financial services to healthcare and many more, machine learning has already
become one of the most effective technologies to boost business operations.

Whether you are convinced or not, the above characteristics of machine learning have contributed heavily toward making it one of the most crucial
technology trends; it underlies a huge number of things we use these days without even thinking about them.

Examples of Machine Learning


Machine learning technology has widely changed the lifestyle of human beings, as we are highly dependent on
this technology. It is a subset of artificial intelligence, and we all use it either knowingly or
unknowingly. For example, we use Google Assistant, which employs ML concepts, and we take help from online customer
support, which is also an example of machine learning, and many more.

Machine learning uses statistical techniques to make a computer more intelligent, which helps it to process
business data and use it automatically as required. There are many examples of machine learning in the
real world, which are as follows:
1. Speech & Image Recognition
Computer Speech Recognition or Automatic Speech Recognition helps to convert speech into text. Many
applications convert live speech into an audio file format and later convert it into a text file.

Voice search, voice dialing, and appliance control are some real-world examples of speech recognition. Alexa and
Google Home are among the most widely used speech recognition products.

Like speech recognition, image recognition is one of the most widely used examples of machine learning
technology; it helps identify any object in a digital image. Some real-world examples of
image recognition are:

Tagging names on photos, as we have seen on Facebook. It is also used in recognizing handwriting by
segmenting a single letter into smaller images.

One of the biggest examples of image recognition is facial recognition. Many of us use new-generation
mobile phones that unlock with facial recognition. Hence, it also helps to increase
the security of the system.

2. Traffic alerts using Google Maps


Google Maps is one of the most widely used applications for finding the way to a destination. The
map helps us find the best or fastest route, traffic conditions, and much more information. But how does it provide this
information to us? Google Maps uses different technologies, including machine learning, which collects information
from different users, analyzes that information, updates it, and makes predictions. With the help of
predictions, it can also tell us the traffic conditions before we start our journey. Machine learning also helps identify the best
and fastest route while we are in traffic. Further, we can answer questions in the app, such as whether
the route still has traffic. This information gets stored automatically in the database, which machine
learning uses to provide accurate information to other people in traffic. Google Maps also helps find locations
such as hotels, malls, restaurants, cinema halls, bus stops, etc.

3. Chatbot (Online Customer Support)


Chatbots are among the most widely used software in industries such as banking, medicine, education, and health. You can
see chatbots in many banking applications, providing quick online support to customers. These chatbots also work on the
concepts of machine learning. The programmers feed in some basic questions and answers based on frequently
asked queries. So, whenever a customer asks a query, the chatbot matches the question's keywords against a
database and then provides an appropriate resolution to the customer. This helps to provide quick
customer service.
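
The keyword lookup described above can be sketched as follows. This is a deliberately simple, rule-based illustration: the FAQ entries, the menu paths in the answers, and the function name `answer` are all hypothetical, and a production chatbot would typically use a trained language model instead of raw keyword overlap.

```python
import re

# Hypothetical FAQ "database": keywords -> canned answer.
FAQ = [
    ({"balance", "account"}, "You can view your balance under Accounts > Summary."),
    ({"card", "lost", "block"}, "To block a lost card, open Cards > Block card."),
    ({"loan", "interest", "rate"}, "Current loan rates are listed on the Loans page."),
]


def answer(query):
    """Return the FAQ answer whose keywords overlap most with the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    keywords, reply = max(FAQ, key=lambda entry: len(entry[0] & words))
    if not keywords & words:
        # No keyword matched at all, so fall back to a human agent.
        return "Sorry, I did not understand. Connecting you to an agent."
    return reply


print(answer("I lost my card, how do I block it?"))
```

Queries that share no keyword with any entry fall through to the fallback message, which mirrors how simple support bots hand over to a person.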

4. Google Translation
Suppose you work on an international banking project with documents in French, German, etc., but you only know English. In
that case, you may panic, because you can't proceed without reviewing the
documents. Google Translate helps to translate text from one language into another. So, in this way,
you can convert French, German, etc., into English, Hindi, or any other language. This makes work in different
sectors much easier, as a user can work on any country's project hassle-free.

Google uses Google Neural Machine Translation to detect the source language and translate it into the desired
language.

5. Prediction
Prediction systems also use machine learning algorithms for making predictions. There are various sectors where
predictions are used. For example, in bank loan systems, the error probability can be determined using predictions
with machine learning. For this, the available data is classified into different groups using a set of rules
provided by analysts, and once the classification is done, the error probability is predicted.

6. Extraction
One of the best examples of machine learning is the extraction of information. In this process, structured data is
extracted from unstructured data and then used in predictive analytics tools. Data is usually found in a
raw or unstructured form that is not directly useful, and the extraction process makes it useful. Some real-world
examples of extraction are:

o Generating a model to predict vocal cord disorders.

o Helping to diagnose and treat problems faster.

7. Statistical Arbitrage
Statistical arbitrage is an automated trading process used in the finance industry to manage a large volume of
securities. The process uses a trading algorithm to analyze a set of securities using economic variables and
correlations. Some examples of statistical arbitrage are as follows:

o Algorithmic trading that analyses market microstructure

o Analyzing large data sets

o Identifying real-time arbitrage opportunities

o Using machine learning to optimize the arbitrage strategy and enhance results.

8. Auto-Friend Tagging Suggestion


One popular example of machine learning is the auto-friend tagging suggestion feature of Facebook.
Whenever we upload a new picture with friends on Facebook, it suggests tagging the friends and automatically
provides their names. Facebook does this using DeepFace, a facial recognition system created by
Facebook, which identifies the faces in images.

9. Self-driving cars
Self-driving cars are widely seen as the future of the automobile industry. These driverless cars are based on concepts
of deep learning and machine learning. Some commonly used machine learning algorithms in self-driving cars
are Scale-Invariant Feature Transform (SIFT), AdaBoost, TextonBoost, and YOLO (You Only Look Once).

10. Ads Recommendation


Nowadays, most people spend multiple hours on Google or surfing the internet. While browsing any
webpage or website, they see multiple ads on each page. But these ads are different for each user, even when
two users are using the same internet connection at the same location. These ad recommendations are made with the
help of machine learning algorithms and are based on the search history of each user.
For example, if a user searches for a shirt on Amazon or any other e-commerce website, that user will start seeing ad
recommendations for shirts after some time.

11. Video Surveillance


Video surveillance is an advanced application of AI and machine learning that can help flag suspicious activity before a crime
happens. It is much more efficient than observation by a human, because it is a difficult and boring task for a human
to keep monitoring multiple videos; that's why machines are the better option. Video surveillance systems are very useful as
they keep looking for specific behaviours of people, such as standing motionless for a long time, stumbling, or napping
on benches. Whenever the surveillance system finds any unusual activity, it alerts the respective team, which
can stop or help avoid a mishap at that place.

Some popular uses of video surveillance are:

o Facility protections

o Operation monitoring

o Parking lots

o Traffic monitoring

o Shopping patterns

12. Email & spam filtering


Emails are filtered automatically whenever we receive a new email, which is also an example of machine learning.
Important mail arrives in our inbox marked with the important symbol and spam emails go to our spam box,
and the technology behind this is machine learning. Below are some spam filters used by Gmail:

o Content Filter

o Header filter

o General blacklists filter

o Rules-based filters

o Permission filters

Some machine learning algorithms that are used in email spam filtering and malware detection are Multi-Layer
Perceptron, Decision tree, and Naïve Bayes classifier.

13. Real-Time Dynamic Pricing

Whenever we book an Uber during peak office hours in the morning or evening, the price differs from
normal hours. Prices are hiked through surge pricing, which companies apply whenever demand is high. But
how are these surge prices determined and applied? The technologies behind this are AI and
machine learning. These technologies answer two main business questions:

o How customers react to surge prices

o What the optimum price is, so that the business does not lose customers

Machine Learning technology also helps in finding discounted prices, best prices, promotional prices, etc., for each
customer.

14. Gaming and Education


Machine learning technology is widely used in gaming and education. Various gaming and learning
apps use AI and machine learning. Among these apps, Duolingo is a free language learning app that
is designed in a fun and interactive way. While using this app, people feel like they are playing a game on the phone.

It collects data from the user's answers and creates a statistical model to determine how long a person can
remember a word, so that it can present the word again before the person needs a refresher.

15. Virtual Assistants


Virtual assistants are very popular in today's world. They are smart software embedded in smartphones or
laptops. These assistants work as personal assistants and help search for information that is requested by
voice. A virtual assistant understands natural language voice commands and performs the
task for the user. Some examples of virtual assistants are Siri, Alexa, Google Assistant, and Cortana. To start working with
these virtual assistants, they first need to be activated, and then we can ask a question and they will answer it,
for example, "What's the date today?" or "Tell me a joke". The technologies behind virtual
assistants are AI, machine learning, natural language processing, etc. Machine learning algorithms collect and
analyze data based on the user's previous interactions and make predictions according to the user's preferences.

Machine Learning Model:
There is no simple way to classify machine learning algorithms. In this section, we present a taxonomy of machine
learning models adapted from the book Machine Learning by Peter Flach. While the structure for classifying
algorithms is based on the book, the explanation presented below is created by us.

For a given problem, the collection of all possible instances represents the instance space.
The basic idea for creating a taxonomy of algorithms is that we divide the instance space in one of three
ways:

• Using a Logical expression.
• Using the Geometry of the instance space.
• Using Probability to classify the instance space.

The outcome of the transformation of the instance space by a machine learning algorithm using the above
techniques should be exhaustive (cover all possible outcomes) and mutually exclusive (non-overlapping).
2. Logical models

2.1 Logical models - Tree models and Rule models

Logical models use a logical expression to divide the instance space into segments and hence construct grouping
models. A logical expression is an expression that returns a Boolean value, i.e., a True or False outcome. Once the
data is grouped using a logical expression, the data is divided into homogeneous groupings for the problem we are
trying to solve. For example, for a classification problem, all the instances in the group belong to one class.
There are mainly two kinds of logical models: Tree models and Rule models.
Rule models consist of a collection of implications or IF-THEN rules. For tree-based models, the ‘if-part’ defines a
segment and the ‘then-part’ defines the behaviour of the model for this segment. Rule models follow the same
reasoning.

Tree models can be seen as a particular type of rule model where the if-parts of the rules are organised in a tree
structure. Both Tree models and Rule models use the same approach to supervised learning. The approach can be
summarised in two strategies: we could first find the body of the rule (the concept) that covers a sufficiently
homogeneous set of examples and then find a label to represent the body. Alternately, we could approach it from
the other direction, i.e., first select a class we want to learn and then find rules that cover examples of the class.

A simple tree-based model is shown below. The tree shows survival numbers of passengers on the Titanic ("sibsp" is
the number of spouses or siblings aboard). The values under the leaves show the probability of survival and the
percentage of observations in the leaf. The model can be summarised as: Your chances of survival were good if you
were (i) a female or (ii) a male younger than 9.5 years with fewer than 2.5 siblings or spouses aboard.
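
The summary above can be read as a small set of nested IF-THEN rules. The following is a minimal sketch of that reading, not the fitted tree itself: the function name and argument names are hypothetical, the order of the splits (sex, then age, then sibsp) is an assumption, and only the two thresholds (9.5 and 2.5) come from the text.

```python
def predict_survival(sex: str, age: float, sibsp: int) -> bool:
    """Return True when the summarised tree predicts a good chance of survival.

    Hypothetical helper: the thresholds 9.5 and 2.5 are taken from the summary,
    the order of the tests is an assumption.
    """
    if sex == "female":      # first test: sex
        return True
    if age >= 9.5:           # males aged 9.5 or older
        return False
    return sibsp < 2.5       # young males: few siblings/spouses aboard


print(predict_survival("female", 30.0, 0))
print(predict_survival("male", 5.0, 1))
print(predict_survival("male", 40.0, 0))
```

Each path from the first test to a return statement corresponds to one rule, which is why a tree can be seen as a particular kind of rule model.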

2.2 Logical models and Concept learning

To understand logical models further, we need to understand the idea of Concept Learning. Concept Learning
involves learning logical expressions or concepts from examples. The idea of Concept Learning fits in well with the
idea of Machine learning, i.e., inferring a general function from specific training examples. Concept learning forms
the basis of both tree-based and rule-based models. More formally, Concept Learning involves acquiring the
definition of a general category from a given set of positive and negative training examples of the category. A
Formal Definition for Concept Learning is “The inferring of a Boolean-valued function from training examples of its
input and output.” In concept learning, we only learn a description for the positive class and label everything that
doesn’t satisfy that description as negative.
The following example explains this idea in more detail.
A Concept Learning task called “Enjoy Sport” is defined by a set of data from some example days.
Each day is described by six attributes. The task is to learn to predict the value of Enjoy Sport for an arbitrary day
based on the values of its attributes. The problem can be represented by a series of hypotheses. Each
hypothesis is described by a conjunction of constraints on the attributes. The training data represents a set of
positive and negative examples of the target function. Here, each hypothesis is a vector of six
constraints, specifying the values of the six attributes: Sky, AirTemp, Humidity, Wind, Water, and Forecast. The
training phase involves learning the set of days (as a conjunction of attributes) for which Enjoy Sport = yes.
Thus, the problem can be formulated as:

• Given instances X which represent a set of all possible days, each described by the attributes:
o Sky – (values: Sunny, Cloudy, Rainy),
o AirTemp – (values: Warm, Cold),
o Humidity – (values: Normal, High),
o Wind – (values: Strong, Weak),
o Water – (values: Warm, Cold),
o Forecast – (values: Same, Change).
Try to identify a function that can predict the target variable Enjoy Sport as yes/no, i.e., 1 or 0.
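
To make the idea of a hypothesis as a conjunction of constraints concrete, here is a minimal sketch. It assumes a wildcard symbol "?" meaning "any value is acceptable" (a common textbook convention, not stated in the text above), and the hypothesis and the two example days are hypothetical.

```python
ATTRIBUTES = ("Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast")
ANY = "?"  # assumed wildcard: any value satisfies this constraint


def matches(hypothesis, day):
    """True when the day satisfies every constraint in the conjunction."""
    return all(h == ANY or h == v for h, v in zip(hypothesis, day))


# Hypothetical hypothesis: Sky = Sunny AND AirTemp = Warm AND Wind = Strong
hypothesis = ("Sunny", "Warm", ANY, "Strong", ANY, ANY)

# Hypothetical days, one value per attribute in ATTRIBUTES order
day_1 = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
day_2 = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")

# Enjoy Sport is predicted as 1 (yes) when the hypothesis matches, else 0 (no)
print(int(matches(hypothesis, day_1)))
print(int(matches(hypothesis, day_2)))
```

A day is labelled positive only if all six constraints hold; everything else is labelled negative.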
2.3 Concept learning as a search problem and as Inductive Learning

We can also formulate Concept Learning as a search problem. We can think of Concept learning as searching
through a predefined space of potential hypotheses to identify the hypothesis that best fits the training
examples. Concept learning is also an example of Inductive Learning. Inductive learning, also known as discovery
learning, is a process where the learner discovers rules by observing examples. Inductive learning is different from
deductive learning, where learners are given rules that they then need to apply. Inductive learning is based on
the inductive learning hypothesis, which postulates that any hypothesis found to
approximate the target function well over a sufficiently large set of training examples is also expected to approximate
the target function well over other unobserved examples. This idea is the fundamental assumption of inductive
learning.
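
The search view can be illustrated with a brute-force sketch: enumerate every conjunctive hypothesis over the six Enjoy Sport attributes and keep those that agree with all training examples. This is only an illustration under stated assumptions: the "?" wildcard and the four training days are not taken from the text above, and practical concept learners search the space far more cleverly than by exhaustive enumeration.

```python
from itertools import product

# Attribute values as listed for the Enjoy Sport task
VALUES = [
    ("Sunny", "Cloudy", "Rainy"),  # Sky
    ("Warm", "Cold"),              # AirTemp
    ("Normal", "High"),            # Humidity
    ("Strong", "Weak"),            # Wind
    ("Warm", "Cold"),              # Water
    ("Same", "Change"),            # Forecast
]
ANY = "?"  # assumed wildcard: any value satisfies this constraint


def matches(hypothesis, day):
    """True when the day satisfies every constraint in the conjunction."""
    return all(h == ANY or h == v for h, v in zip(hypothesis, day))


# Illustrative training days (assumed data): (day, Enjoy Sport label)
TRAINING = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cold", "Change"), True),
]

# The hypothesis space: each attribute is either a specific value or the wildcard
space = list(product(*[values + (ANY,) for values in VALUES]))

# Keep the hypotheses that classify every training day correctly
consistent = [
    h for h in space
    if all(matches(h, day) == label for day, label in TRAINING)
]

print(len(space), "hypotheses searched")
for h in consistent:
    print(h)
```

The inductive learning hypothesis is what justifies using any of the surviving hypotheses on days that were never observed.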
To summarise, in this section, we saw the first class of algorithms where we divided the instance space based on a
logical expression. We also discussed how logical models are based on the theory of concept learning – which in
turn – can be formulated as an inductive learning or a search problem.
3. Geometric models

In the previous section, we have seen that with logical models, such as decision trees, a logical expression is used to
partition the instance space. Two instances are similar when they end up in the same logical segment. In this
section, we consider models that define similarity by considering the geometry of the instance space. In Geometric
models, features could be described as points in two dimensions (x- and y-axis) or a three-dimensional space
(x, y, and z). Even when features are not intrinsically geometric, they could be modelled in a geometric manner (for
example, temperature as a function of time can be modelled in two axes). In geometric models, there are two ways
we could impose similarity.
• We could use geometric concepts like lines or planes to segment (classify) the instance space. These are
called Linear models.
• Alternatively, we can use the geometric notion of distance to represent similarity. In this case, if two points
are close together, they have similar values for features and thus can be classed as similar. We call such
models Distance-based models.
3.1 Linear models

Linear models are relatively simple. In this case, the function is represented as a linear combination of its inputs.
Thus, if x1 and x2 are two scalars or vectors of the same dimension and a and b are arbitrary scalars,
then ax1 + bx2 represents a linear combination of x1 and x2. In the simplest case where f(x) represents a straight line,
we have an equation of the form f(x) = mx + c, where c represents the intercept and m represents the slope.

Linear models are parametric, which means that they have a fixed form with a small number of numeric
parameters that need to be learned from data. For example, in f(x) = mx + c, m and c are the parameters that we are
trying to learn from the data. This is different from tree or rule models, where the structure of the model
(e.g., which features to use in the tree, and where) is not fixed in advance.
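
As a minimal sketch of what "learning m and c from data" can look like, the snippet below fits a straight line by ordinary least squares. Least squares is an assumption here (the text does not name a fitting method), and the function name and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimates of slope m and intercept c for f(x) = mx + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalised)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


# Hypothetical data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

m, c = fit_line(xs, ys)
print(m, c)
```

Whatever the data, the model keeps the same form; only the two numbers m and c change.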
Linear models are stable, i.e., small variations in the training data have only a limited impact on the learned model.
In contrast, tree models tend to vary more with the training data, as the choice of a different split at the root of
the tree typically means that the rest of the tree is different as well. As a result of having relatively few parameters,
Linear models have low variance and high bias. This implies that Linear models are less likely to overfit the
training data than some other models. However, they are more likely to underfit. For example, if we want to learn
the boundaries between countries based on labelled data, then linear models are not likely to give a good
approximation.
Kernel methods, such as the support vector machine (SVM), also belong in this section.
Kernel methods use a kernel function to map data into another (typically higher-dimensional) space where the data can be
separated more easily, for example by a hyperplane in the case of an SVM.
3.2 Distance-based models

Distance-based models are the second class of Geometric models. Like Linear models, distance-based models are
based on the geometry of data. As the name implies, distance-based models work on the concept of distance. In
the context of Machine learning, the concept of distance is not based on merely the physical distance between two
points. Instead, we could think of the distance between two points considering the mode of transport between two
points. Travelling between two cities by plane covers less distance physically than by train because a plane is
unrestricted. Similarly, in chess, the concept of distance depends on the piece used – for example, a Bishop can
move diagonally. Thus, depending on the entity and the mode of travel, the concept of distance can be
experienced differently. The distance metrics commonly used are Euclidean, Minkowski, Manhattan,
and Mahalanobis.
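
Three of the metrics named above can be sketched in a few lines. Manhattan and Euclidean distance are the Minkowski distance of order 1 and 2 respectively; Mahalanobis distance is left out because it also needs the covariance of the data. The function names and the two points are hypothetical.

```python
def minkowski(p, q, r):
    """Minkowski distance of order r between two points of equal dimension."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1 / r)


def manhattan(p, q):
    """Minkowski distance of order 1: sum of absolute coordinate differences."""
    return minkowski(p, q, 1)


def euclidean(p, q):
    """Minkowski distance of order 2: straight-line distance."""
    return minkowski(p, q, 2)


p = (0.0, 0.0)
q = (3.0, 4.0)

print(manhattan(p, q))
print(euclidean(p, q))
```

The same pair of points is 7 units apart under the Manhattan metric but 5 units apart under the Euclidean metric, which mirrors the point that distance depends on how movement is allowed.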

Distance is applied through the concept of neighbours and exemplars. Neighbours are points in proximity with
respect to the distance measure expressed through exemplars. Exemplars are either centroids that find a centre of
mass according to a chosen distance metric or medoids that find the most centrally located data point. The most
commonly used centroid is the arithmetic mean, which minimises squared Euclidean distance to all other points.
Notes:

• The centroid represents the geometric centre of a plane figure, i.e., the arithmetic mean position of all the
points in the figure. This definition extends to any object in n-dimensional space: its
centroid is the mean position of all the points.
• Medoids are similar in concept to means or centroids, but a medoid is always one of the actual data points.
Medoids are most commonly used on data when a mean or centroid cannot be defined. They are also used in contexts
where the centroid is not representative of the dataset, such as in image data.
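
The difference between the two kinds of exemplar can be sketched as follows, assuming Euclidean distance and a small set of hypothetical 2-D points with one far-away point.

```python
import math


def centroid(points):
    """Arithmetic mean position of the points, coordinate by coordinate."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def medoid(points):
    """The data point with the smallest total Euclidean distance to all points."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


# Hypothetical data: three nearby points and one far-away point
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]

print(centroid(points))  # generally not one of the data points
print(medoid(points))    # always one of the data points
```

The far-away point pulls the centroid towards it, while the medoid remains one of the three nearby points.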
Examples of distance-based models include the nearest-neighbour models, which use the training data as
exemplars – for example, in classification. The K-means clustering algorithm also uses exemplars to create clusters
of similar data points.
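
A nearest-neighbour classifier can be sketched in a few lines: the training points themselves are kept as exemplars, and a new point receives the majority label among its k closest exemplars. Euclidean distance, k = 3, and the training data are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training exemplars closest to the query.

    train is a list of (point, label) pairs.
    """
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled exemplars forming two groups
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 6.5), "B"), ((6.5, 7.0), "B"),
]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.4, 6.2)))
```

No parameters are fitted in advance; all the work happens when a query point is compared with the stored exemplars.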
4. Probabilistic models

The third family of machine learning algorithms is the probabilistic models. We have seen before that the k-nearest
neighbour algorithm uses the idea of distance (e.g., Euclidean distance) to classify entities, and logical models use a
logical expression to partition the instance space. In this section, we see how probabilistic models use the
idea of probability to classify new entities.
Probabilistic models see features and target variables as random variables. The process of modelling represents
and manipulates the level of uncertainty with respect to these variables. There are two types of probabilistic
models: Predictive and Generative. Predictive probability models use the idea of a conditional
probability distribution P(Y | X), from which Y can be predicted given X. Generative models estimate the joint
distribution P(Y, X). Once we know the joint distribution, we can derive any conditional
or marginal distribution involving the same variables. Thus, a generative model is capable of creating new data
points and their labels, knowing the joint probability distribution. The joint distribution captures the relationship
between the variables. Once this relationship is inferred, it is possible to generate new data points.
Naïve Bayes is an example of a probabilistic classifier.
Given a set of features (x_0 through x_n) and a set of classes (c_0
through c_k), the goal of any probabilistic classifier is to determine the probability of each class given the features, and to return the most
likely class. Therefore, for each class, we need to calculate P(c_i | x_0, …, x_n).
We can do this using the Bayes rule, defined as

P(c_i | x_0, …, x_n) = P(x_0, …, x_n | c_i) P(c_i) / P(x_0, …, x_n)

The Naïve Bayes algorithm is based on the idea of Conditional Probability. Conditional probability is based on
finding the probability that something will happen, given that something else has already happened. The task of the
algorithm then is to look at the evidence and to determine the likelihood of a specific class and assign a label
accordingly to each entity.
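
The following is a minimal sketch of a Naïve Bayes classifier for categorical features. It applies the Bayes rule with the "naïve" assumption that features are independent given the class, so that P(x_0, …, x_n | c_i) becomes a product of per-feature probabilities. Add-one smoothing, the function names, and the tiny email dataset are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count classes and per-class feature values from the training data."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    values = defaultdict(set)              # feature index -> values seen in training
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(label, i)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def posterior(model, row):
    """Return P(class | features) for every class."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior P(c)
        for i, value in enumerate(row):
            # P(x_i | c) with add-one smoothing (an assumption of this sketch)
            score *= (feature_counts[(c, i)][value] + 1) / (n_c + len(values[i]))
        scores[c] = score
    evidence = sum(scores.values())  # P(x_0, ..., x_n), the normalising constant
    return {c: s / evidence for c, s in scores.items()}


# Hypothetical emails: (contains the word "offer", contains a link)
rows = [("yes", "yes"), ("yes", "no"), ("no", "yes"),
        ("no", "no"), ("no", "no"), ("yes", "yes")]
labels = ["spam", "spam", "ham", "ham", "ham", "spam"]

model = train_naive_bayes(rows, labels)
probabilities = posterior(model, ("yes", "yes"))
print(max(probabilities, key=probabilities.get), probabilities)
```

The label assigned to a new email is simply the class with the highest posterior probability.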
Conclusion

The above discussion presents a way to classify algorithms based on their mathematical foundations. While the
discussion is simplified, it provides a comprehensive way to explore algorithms from first principles.

What is Feature Engineering?


Feature engineering is the pre-processing step of machine learning that extracts features from raw data. It helps to
represent the underlying problem to predictive models in a better way, which, as a result, improves the accuracy of the
model on unseen data. A predictive model contains predictor variables and an outcome variable, and the
feature engineering process selects the most useful predictor variables for the model.
Since 2016, automated feature engineering has also been used in various machine learning software to help
automatically extract features from raw data. Feature engineering in ML consists mainly of four processes: Feature
Creation, Transformations, Feature Extraction, and Feature Selection.

These processes are described below:

1. Feature Creation: Feature creation is finding the most useful variables to be used in a predictive model. The process is
subjective, and it requires human creativity and intervention. New features are created by combining existing features
using addition, subtraction, and ratios, and these new features have great flexibility.

2. Transformations: The transformation step of feature engineering involves adjusting the predictor variables to improve the
accuracy and performance of the model. For example, it ensures that the model is flexible enough to take a variety of
data as input, and it ensures that all the variables are on the same scale, making the model easier to understand. It improves the
model's accuracy and ensures that all the features are within an acceptable range to avoid computational errors.

3. Feature Extraction: Feature extraction is an automated feature engineering process that generates new variables by
extracting them from the raw data. The main aim of this step is to reduce the volume of data so that it can be easily used
and managed for data modelling. Feature extraction methods include cluster analysis, text analytics, edge detection algorithms,
and principal components analysis (PCA).

4. Feature Selection: While developing a machine learning model, only a few variables in the dataset are useful for building
the model, and the rest of the features are either redundant or irrelevant. If we input the dataset with all these redundant and
irrelevant features, it may reduce the overall performance and accuracy of the model. Hence it is very
important to identify and select the most appropriate features from the data and remove the irrelevant or less important
features, which is done with the help of feature selection in machine learning. "Feature selection is a way of selecting the subset
of the most relevant features from the original feature set by removing the redundant, irrelevant, or noisy features."

Below are some benefits of using feature selection in machine learning:

o It helps in avoiding the curse of dimensionality.

o It helps in the simplification of the model so that the researchers can easily interpret it.

o It reduces the training time.

o It reduces overfitting, hence enhancing generalization.

Feature Engineering Techniques


Some of the popular feature engineering techniques include:

1. Imputation
Feature engineering deals with inappropriate data, missing values, human interruption, general errors, insufficient data
sources, etc. Missing values within the dataset strongly affect the performance of the algorithm, and the
"imputation" technique is used to deal with them. Imputation is responsible for handling irregularities within the dataset.

For example, rows or columns with a huge percentage of missing
values can be removed entirely. But at the same time, to maintain the data size, the missing data needs to be imputed, which can be done
as follows:

o For numerical data imputation, a default value can be imputed in a column, or missing values can be filled with the mean or
median of the column.

o For categorical data imputation, missing values can be replaced with the most frequently occurring value in the column.
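
The two imputation rules above can be sketched with the standard library alone. The use of None to mark a missing value, the function name, and the two columns are assumptions made for illustration.

```python
from statistics import mean, median, mode


def impute(column, strategy):
    """Fill missing entries (None) with the mean, median, or mode of the observed values."""
    observed = [v for v in column if v is not None]
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in column]


# Hypothetical columns with missing entries
ages = [20, None, 30, 70, None]
colours = ["red", None, "blue", "red"]

print(impute(ages, "mean"))      # numerical column, filled with the mean
print(impute(ages, "median"))    # numerical column, filled with the median
print(impute(colours, "mode"))   # categorical column, filled with the most frequent value
```

The median is usually the safer choice for a numerical column that contains extreme values, since the mean is pulled towards them.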

2. Handling Outliers
Outliers are deviating values or data points that lie so far away from other data points that
they badly affect the performance of the model. Outliers can be handled with this feature engineering technique. This
technique first identifies the outliers and then removes them.

Standard deviation can be used to identify outliers. For example, each value in a dataset lies at some distance from the
average, and if that distance is greater than a chosen threshold, the value can be considered an outlier. The z-score can
also be used to detect outliers.
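
A minimal sketch of z-score based outlier removal follows. The z-score of a value is its distance from the mean measured in standard deviations. The threshold of 2.0, the function names, and the data are assumptions; in practice the threshold is chosen per dataset.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Distance of each value from the mean, in units of standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # all values identical: nothing deviates
    return [(v - mu) / sigma for v in values]


def remove_outliers(values, threshold=2.0):
    """Keep only the values whose absolute z-score is within the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) <= threshold]


# Hypothetical measurements with one extreme value
data = [10, 12, 11, 13, 12, 11, 10, 95]

print(remove_outliers(data))
```

Note that the extreme value itself inflates the mean and the standard deviation, so on very small datasets a high threshold may fail to flag anything.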

3. Log transform
Logarithm transformation or log transform is one of the commonly used mathematical techniques in machine learning.
Log transform helps in handling skewed data, and it makes the distribution closer to normal after
transformation. It also reduces the effect of outliers on the data: because magnitude
differences are normalized, a model becomes much more robust.

Note: Log transformation is only applicable to positive values; otherwise, it will give an error. To handle zeros in non-negative data, we can add 1 to the data before
transformation.
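
The "add 1 before transforming" idea corresponds to computing log(1 + x), which the standard library provides as math.log1p. The sketch below assumes non-negative input; the function name and the values are hypothetical.

```python
import math


def log_transform(values):
    """Apply log(1 + x) to each value; defined for x > -1, so non-negative data is safe."""
    return [math.log1p(v) for v in values]


# Hypothetical skewed data spanning several orders of magnitude
incomes = [0, 9, 99, 999]

transformed = log_transform(incomes)
print(transformed)
```

After the transform, values that differed by factors of ten are spaced roughly evenly, and the zero maps to zero instead of raising an error.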

4. Binning
In machine learning, overfitting is one of the main issues that degrade the performance of the model; it occurs
due to a large number of parameters and noisy data. One of the popular techniques of feature engineering,
"binning", can be used to reduce the effect of noisy data. This process involves segmenting the values of a feature into bins.

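One simple way to bin a numerical feature is equal-width binning, sketched below. The choice of equal-width bins, the function name, and the data are assumptions; other schemes (for example, bins with equal numbers of observations) are also common.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width (indices 0..n_bins-1)."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_bins
    if width == 0:
        return [0 for _ in values]  # all values identical: a single bin
    bins = []
    for v in values:
        index = int((v - lowest) / width)
        bins.append(min(index, n_bins - 1))  # the maximum value falls in the last bin
    return bins


# Hypothetical ages grouped into four bins
ages = [3, 17, 25, 40, 58, 63, 80]

print(equal_width_bins(ages, 4))
```

Small fluctuations inside a bin no longer change the feature value, which is how binning smooths noise at the cost of some detail.
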
5. Feature Split
As the name suggests, feature split is the process of splitting a feature into two or more parts
to make new features. This technique helps the algorithms to better understand and learn the patterns in the dataset.
The feature splitting process enables the new features to be clustered and binned, which results in extracting useful
information and improving the performance of the data models.
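
As an illustration, a single timestamp feature can be split into several simpler features. The timestamp format, the function name, and the choice of parts are assumptions made for this sketch.

```python
from datetime import datetime


def split_timestamp(text):
    """Split a 'YYYY-MM-DD HH:MM' string into separate numerical features."""
    moment = datetime.strptime(text, "%Y-%m-%d %H:%M")
    return {
        "year": moment.year,
        "month": moment.month,
        "weekday": moment.weekday(),  # Monday is 0, Sunday is 6
        "hour": moment.hour,
    }


print(split_timestamp("2023-03-15 18:45"))
```

A model cannot easily learn from the raw string, but parts such as the hour or the weekday can be binned or clustered directly.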

6. One hot encoding


One hot encoding is a popular encoding technique in machine learning. It converts categorical
data into a form that can be easily understood by machine learning algorithms, which can then make good
predictions. It enables grouping of categorical data without losing any information.
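
A minimal sketch of one hot encoding follows: each category becomes its own 0/1 column, and every row has exactly one 1. The function name, the alphabetical column order, and the data are assumptions made for illustration.

```python
def one_hot_encode(values):
    """Return the category order and one 0/1 vector per input value."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


# Hypothetical categorical column
colours = ["red", "green", "blue", "green"]

categories, encoded = one_hot_encode(colours)
print(categories)
for row in encoded:
    print(row)
```

Unlike numbering the categories 0, 1, 2, this encoding does not suggest any order between them.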
