COGNATE/ ELECTIVE 2
MACHINE LEARNING
• 1. DATA STORAGE
• Facilities for storing and retrieving huge amounts of data are an important component of the learning
process. Humans and computers alike utilize data storage as a foundation for advanced reasoning.
• In a human being, the data is stored in the brain and data is retrieved using electrochemical signals.
• Computers use hard disk drives, flash memory, random access memory and similar devices to store data and
use cables and other technology to retrieve data.
• 2. ABSTRACTION
• The second component of the learning process is known as abstraction.
• Abstraction is the process of extracting knowledge about stored data. This involves creating general
concepts about the data as a whole. The creation of knowledge involves application of known models and
creation of new models. The process of fitting a model to a dataset is known as training. When the model
has been trained, the data is transformed into an abstract form that summarizes the original information.
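• As a minimal sketch of what training means (all data values below are invented for illustration), the code fits a straight line y = a*x + b to a handful of points by least squares; the five raw points end up summarized by just two numbers, which is the abstraction step in miniature:

# "Training" as model fitting: estimate slope a and intercept b of a
# line y = a*x + b by ordinary least squares. The points are made up.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.2, 5.9, 8.1, 9.8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)

a = sxy / sxx              # closed-form least-squares slope
b = mean_y - a * mean_x    # intercept

# Five raw points are now summarized (abstracted) by two numbers.
print(f"trained model: y = {a:.2f}*x + {b:.2f}")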
COMPONENTS OF LEARNING
• 3. GENERALIZATION
• The third component of the learning process is known as generalization. The term
generalization describes the process of turning the knowledge about stored data into a
form that can be utilized for future action. These actions are to be carried out on tasks that
are similar, but not identical, to those that have been seen before. In generalization, the
goal is to discover those properties of the data that will be most relevant to future tasks.
• 4. EVALUATION
• Evaluation is the last component of the learning process. It is the process of giving
feedback to the user to measure the utility of the learned knowledge. This feedback is then
utilized to effect improvements in the whole learning process.
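• As a small sketch of such feedback, the code below evaluates a classifier on held-out examples it has not seen and reports accuracy as the utility measure. The threshold "model" and the data are stand-ins, not any particular method:

# Evaluation: measure learned knowledge on held-out data. Accuracy
# (fraction of correct predictions) is the feedback signal.
def predict(x):
    return 1 if x >= 0.5 else 0   # stand-in for a trained model

test_inputs = [0.1, 0.4, 0.55, 0.9]
test_labels = [0, 0, 0, 1]        # ground truth for the held-out set

correct = sum(predict(x) == y for x, y in zip(test_inputs, test_labels))
accuracy = correct / len(test_labels)
print(f"accuracy = {accuracy:.2f}")  # 0.75: the miss at x=0.55 is the
                                     # feedback used to improve learning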
VIEWS OF MACHINE LEARNING
• Machine learning is a diverse and exciting field, and there are multiple ways of
defining it:
• 1. THE ARTIFICIAL INTELLIGENCE VIEW.
• Learning is central to human knowledge and intelligence, and, likewise, it is also
essential for building intelligent machines. Years of effort in AI have shown that
intelligent computers cannot be built by programming in all the rules by hand;
automatic learning is crucial.
• For example, we humans are not born with the ability to understand language —
we learn it — and it makes sense to try to have computers learn language instead
of trying to program it all in.
• 2. THE SOFTWARE ENGINEERING VIEW.
• Machine learning allows us to program computers by example, which can be
easier than writing code the traditional way.
• 3. THE STATISTICS VIEW.
• Machine learning is the marriage of computer science and statistics:
computational techniques are applied to statistical problems. Machine learning
has been applied to a vast number of problems in many contexts, beyond the
typical statistics problems. Machine learning is often designed with different
considerations than statistics (e.g., speed is often more important than accuracy).
OTHER TYPES OF MACHINE LEARNING
• There are many other types of machine learning as well, for example:
• 1. Semi-supervised learning, in which only a subset of the training data is
labeled
• 2. Time-series forecasting, such as in financial markets
• 3. Anomaly detection, such as is used for fault detection in factories and in
surveillance
• 4. Active learning, in which obtaining data is expensive, so an algorithm
must determine which training data to acquire; and many others.
LEARNING MODELS
• Machine learning is concerned with using the right features to build the right
models that achieve the right tasks.
• The basic idea is that learning models can be divided into three categories.
• For a given problem, the collection of all possible outcomes represents the sample
space or instance space.
• Using a logical expression (logical models)
• Using the geometry of the instance space (geometric models)
• Using probability to classify the instance space (probabilistic models)
• A further distinction, cutting across these three, is between grouping and grading models.
LEARNING MODELS
• There are mainly two kinds of logical models: Tree models and Rule models.
• Rule models consist of a collection of implications or IF-THEN rules. For
tree-based models, the ‘if-part’ defines a segment and the ‘then-part’ defines
the behaviour of the model for this segment. Rule models follow the same
reasoning.
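• As a minimal sketch of a rule model (the attribute names, rules, and default below are invented for illustration), each if-part selects a segment of the instance space and each then-part fixes the model's prediction for that segment:

# A rule model as an ordered list of IF-THEN rules. The attributes
# and rules below are illustrative, not from any particular dataset.
def classify(instance):
    if instance["outlook"] == "sunny" and instance["humidity"] == "high":
        return "no"    # rule 1: its if-part defines one segment
    if instance["outlook"] == "rainy" and instance["wind"] == "strong":
        return "no"    # rule 2: another segment
    return "yes"       # default rule covers the rest of the space

print(classify({"outlook": "sunny", "humidity": "high", "wind": "weak"}))  # no

• A tree model would express the same kind of segments as nested conditions along its root-to-leaf paths.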
LOGICAL MODELS AND CONCEPT
LEARNING
• To understand logical models further, we need to understand the idea of Concept Learning.
Concept Learning involves learning logical expressions or concepts from examples.
• The idea of Concept Learning fits in well with the idea of Machine learning, i.e., inferring a
general function from specific training examples.
• Concept learning forms the basis of both tree-based and rule-based models. More formally,
Concept Learning involves acquiring the definition of a general category from a given set of
positive and negative training examples of the category.
• A Formal Definition for Concept Learning is “The inferring of a Boolean-valued function
from training examples of its input and output.” In concept learning, we only learn a
description for the positive class and label everything that doesn’t satisfy that description as
negative.
LOGICAL LEARNING MODELS
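• The training data below reproduces the classic "Enjoy Sport" example from Tom Mitchell's Machine Learning (Table 2.1):

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  Enjoy Sport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes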
• A Concept Learning Task called “Enjoy Sport”, as shown above, is defined by a set of data
from some example days.
• Each example is described by six attributes. The task is to learn to predict the value of Enjoy Sport
for an arbitrary day based on the values of its attributes. The problem can be represented
by a series of hypotheses.
• Each hypothesis is described by a conjunction of constraints on the attributes. The training
data represents a set of positive and negative examples of the target function.
• In the example, each hypothesis is a vector of six constraints, specifying the values of the six
attributes – Sky, AirTemp, Humidity, Wind, Water, and Forecast.
• The training phase involves learning the set of days (as a conjunction of attributes) for which
Enjoy Sport = yes.
• Thus, the problem can be formulated as:
• Given instances X which represent a set of all possible days, each described by the attributes:
• Sky – (values: Sunny, Cloudy, Rainy),
• AirTemp – (values: Warm, Cold),
• Humidity – (values: Normal, High),
• Wind – (values: Strong, Weak),
• Water – (values: Warm, Cool),
• Forecast – (values: Same, Change).
• Try to identify a function that can predict the target variable Enjoy Sport as yes/no, i.e., 1 or 0.
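• One simple way to learn such a hypothesis is the Find-S approach from concept learning: start with the most specific hypothesis and minimally generalize it on every positive example. The sketch below applies it to the four training rows above ('?' means any value is acceptable); it illustrates the idea and is not the only algorithm for this task:

# Find-S: learn the most specific conjunction of attribute constraints
# consistent with the positive examples of Enjoy Sport = Yes.
train = [  # the four rows of the table above: (attributes, label)
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]

h = None  # most specific hypothesis: matches nothing until a positive is seen
for x, label in train:
    if label != "Yes":
        continue              # Find-S ignores negative examples
    if h is None:
        h = list(x)           # first positive example, taken verbatim
    else:                     # relax every constraint the example violates
        h = [hi if hi == xi else "?" for hi, xi in zip(h, x)]

print(h)  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']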
GEOMETRIC MODELS
• In the previous section, we have seen that with logical models, such as decision
trees, a logical expression is used to partition the instance space. Two instances
are similar when they end up in the same logical segment. In this section, we
consider models that define similarity by considering the geometry of the
instance space.
• In Geometric models, features could be described as points in two dimensions
(x- and y-axis) or a three-dimensional space (x, y, and z). Even when features
are not intrinsically geometric, they could be modelled in a geometric manner
(for example, temperature as a function of time can be modelled in two axes).
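• As a minimal sketch of the geometric view (the 2-D points and labels below are invented), a new instance can be classified by the label of its nearest labeled point under Euclidean distance, i.e., a 1-nearest-neighbour rule:

import math

# Geometric model: instances are points, similarity is distance.
# Classify a query by the label of its nearest neighbour (1-NN).
points = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"),
          ((5.0, 4.5), "B"), ((6.0, 5.0), "B")]   # invented 2-D data

def nearest_label(query):
    return min(points, key=lambda p: math.dist(query, p[0]))[1]

print(nearest_label((2.0, 1.5)))  # "A": falls near the A cluster
print(nearest_label((5.5, 5.0)))  # "B"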
GEOMETRIC LEARNING MODEL
PROBABILISTIC MODELS
• The third family of machine learning algorithms is the probabilistic models.
We have seen that the k-nearest neighbor algorithm uses the idea of
distance (e.g., Euclidean distance) to classify entities, and logical models use a
logical expression to partition the instance space. In this section, we see how
the probabilistic models use the idea of probability to classify new entities.
• Probabilistic models see features and target variables as random variables. The
process of modelling represents and manipulates the level of uncertainty with
respect to these variables. There are two types of probabilistic models:
Predictive and Generative.
• Predictive probability models use the idea of a conditional probability
distribution P(Y | X) from which Y can be predicted from X.
• Generative models estimate the joint distribution P(Y, X).
• Naïve Bayes is an example of a probabilistic classifier.
• The Naïve Bayes algorithm is based on the idea of Conditional Probability.
Conditional probability is based on finding the probability that something will
happen, given that something else has already happened. The task of the
algorithm then is to look at the evidence and to determine the likelihood of a
specific class and to assign a label to each entity accordingly.
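• As a minimal sketch (reusing the four Enjoy Sport rows from earlier, with no smoothing for unseen values), Naïve Bayes scores each class by its prior probability times the product of per-attribute conditional probabilities, and picks the class with the highest score:

from collections import Counter, defaultdict

# Naive Bayes on the Enjoy Sport rows: choose the class c maximizing
# P(c) * product of P(attribute value | c). Counts come from the data;
# this sketch applies no smoothing, so unseen values zero out a class.
train = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]

class_counts = Counter(label for _, label in train)
value_counts = defaultdict(Counter)   # (attribute index, class) -> counts
for x, label in train:
    for i, v in enumerate(x):
        value_counts[(i, label)][v] += 1

def score(x, label):
    p = class_counts[label] / len(train)                 # prior P(c)
    for i, v in enumerate(x):
        p *= value_counts[(i, label)][v] / class_counts[label]  # P(v | c)
    return p

query = ("Sunny", "Warm", "High", "Strong", "Cool", "Same")
print(max(class_counts, key=lambda c: score(query, c)))  # "Yes"

• Because it multiplies a class prior by per-attribute conditionals, Naïve Bayes effectively models the joint distribution P(Y, X) under a conditional-independence assumption, which is why it is usually classed as a generative model.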
DESIGN OF A LEARNING SYSTEM
• We have just looked at the learning process and the goal of learning. When we
want to design a learning system that follows the learning process, we need to
consider a few design choices.
• The design choices will be to decide the following key components:
• 1. Type of training experience
• 2. Choosing the Target Function
• 3. Choosing a representation for the Target Function
• 4. Choosing an approximation algorithm for the Target Function
• 5. The final Design
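• To make choices 3 and 4 concrete, the sketch below follows the classic checkers formulation: represent the target function V(b) as a linear combination of numeric board features, and adjust the weights with an LMS-style update toward a training value. The feature values, initial weights, and learning rate are illustrative assumptions.

# Target function representation: V(b) = w0 + w1*x1 + ... + w6*x6,
# where x1..x6 are board features (e.g., piece and king counts).
weights = [0.5] * 7            # w0..w6, arbitrary starting values
LEARNING_RATE = 0.001

def v_hat(features):           # current approximation of V(b)
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(features, v_train):
    # LMS rule: move each weight in proportion to the prediction error.
    error = v_train - v_hat(features)
    weights[0] += LEARNING_RATE * error
    for i, x in enumerate(features):
        weights[i + 1] += LEARNING_RATE * error * x

board = [12, 12, 0, 0, 1, 2]       # one made-up board's feature values
lms_update(board, v_train=100.0)   # e.g., +100 for a winning position
print(round(v_hat(board), 2))      # 39.28, up from 14.0: moving toward 100.0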
TYPE OF TRAINING EXPERIENCE
• During the design of a checkers learning system (a program that learns to play
checkers), the type of training experience available for a learning system will have a
significant effect on the success or failure of the learning.
• Direct or Indirect training experience — In the case of direct training
experience, individual board states and the correct move for each board state are
given. In the case of indirect training experience, the move sequences for a game
and the final result (win, loss, or draw) are given for a number of games. How
to assign credit or blame to individual moves is the credit assignment problem.
TYPE OF TRAINING EXPERIENCE
• Teacher or Not —
• Supervised — The training experience is labeled, which means all the
board states are labeled with the correct move, so the learning takes place
in the presence of a supervisor or teacher.
• Unsupervised — The training experience is unlabeled, which means the
board states do not come with correct moves, so the learner generates random
games and plays against itself with no supervision or teacher involvement.
• Semi-supervised — Learner generates game states and asks the teacher for
help in finding the correct move if the board state is confusing.
• Is the training experience good — Do the training examples represent the
distribution of examples over which the final system performance will be
measured? Performance is best when training examples and test examples are
from the same/a similar distribution.
PERSPECTIVES AND ISSUES IN
MACHINE LEARNING
• Our checkers example raises a number of generic questions about machine learning. The field
of machine learning is concerned with answering questions such as the
following:
• What algorithms exist for learning general target functions from specific training examples?
In what settings will particular algorithms converge to the desired function, given sufficient
training data? Which algorithms perform best for which types of problems and representations?
• How much training data is sufficient? What general bounds can be found to relate the
confidence in learned hypotheses to the amount of training experience and the character of the
learner's hypothesis space?
• When and how can prior knowledge held by the learner guide the process of generalizing
from examples? Can prior knowledge be helpful even when it is only approximately correct?
ISSUES IN MACHINE LEARNING
• What is the best strategy for choosing a useful next training experience, and
how does the choice of this strategy alter the complexity of the learning
problem?
• What is the best way to reduce the learning task to one or more function
approximation problems? Put another way, what specific functions should the
system attempt to learn? Can this process itself be automated?
• How can the learner automatically alter its representation to improve its
ability to represent and learn the target function?
VERSION SPACES