ai unit 5 part 3
Learning
An agent is learning if it improves its performance on future tasks after making observations about
the world.
Forms Of Learning
Any component of an agent can be improved by learning from data.
The improvements, and the techniques used to make them, depend on four factors:
Which component is to be improved. Learnable components include:
o a direct mapping from conditions on the current state to actions
o a means to infer relevant properties of the world from the percept sequence
o information about the results of possible actions
o action-value information indicating the desirability of actions
o goals that describe classes of states whose achievement maximizes the agent's utility
What prior knowledge the agent already has.
What representation is used for the data and the component.
o logical representations: propositional and first-order logical sentences
o Bayesian networks for the inferential components
o factored representations: a vector of attribute values as input, with an output that is either a continuous numerical value or a discrete value
What feedback is available to learn from. The type of feedback determines the three main types of learning:
o In unsupervised learning the agent learns patterns in the input even though no explicit feedback is supplied.
o In reinforcement learning the agent learns from a series of reinforcements: rewards or punishments.
o In supervised learning the agent observes example input–output pairs and learns a function that maps from input to output.
SUPERVISED LEARNING
Given a training set of N example input–output pairs (x1, y1), (x2, y2), . . . (xN, yN) , where each yj
was generated by an unknown function y = f(x), discover a function h that approximates the true
function f. The function h is a hypothesis. To measure the accuracy of a hypothesis we give it a test
set of examples that are distinct from the training set.
Conditional probability distribution: when the function f is stochastic, so that y is not strictly a function of x, what we have to learn is a conditional probability distribution, P(Y | x).
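As a small illustration, here is a minimal Python sketch (invented here, not from the text) of estimating P(Y | x) by frequency counting; it assumes discrete x values and examples supplied as (x, y) pairs:

from collections import Counter, defaultdict

def estimate_conditional(examples):
    # Count how often each y follows each x, then normalize the counts
    # to frequencies; with enough data this approximates P(Y | x).
    counts = defaultdict(Counter)
    for x, y in examples:
        counts[x][y] += 1
    return {x: {y: c / sum(cnt.values()) for y, c in cnt.items()}
            for x, cnt in counts.items()}

print(estimate_conditional([("rain", "wet"), ("rain", "wet"), ("rain", "dry")]))
# {'rain': {'wet': 0.666..., 'dry': 0.333...}}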
Classification: When the output y is one of a finite set of values the learning problem is called
classification
Regression: When y is a number (such as tomorrow’s temperature), the learning problem is called
regression
Hypothesis space: the set H of functions from which the learner draws its hypotheses. For example, H can be the set of polynomials of a single variable; learning is then a matter of fitting a polynomial to the data points.
Ockham’s razor: how do we choose a function or a polynomial from among multiple consistent
hypotheses? One answer is to prefer the simplest hypothesis consistent with the data. This principle
is called Ockham’s razor
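To make this concrete, here is a small Python sketch on synthetic data (the data and degrees are invented here): both a degree-1 and a degree-7 polynomial fit eight noisy samples of a linear function, but the simpler hypothesis generalizes far better on held-out test points, as Ockham's razor suggests:

import numpy as np

rng = np.random.default_rng(0)
f = lambda x: 2 * x + 1                            # the unknown true function
x_train = rng.uniform(0, 1, 8)
y_train = f(x_train) + rng.normal(0, 0.1, 8)       # noisy training data
x_test = rng.uniform(0, 1, 100)                    # held-out test set

for degree in (1, 7):
    h = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    test_error = np.mean((h(x_test) - f(x_test)) ** 2)
    # The degree-7 polynomial passes through all 8 training points
    # yet typically has a much larger test error than the line.
    print(degree, test_error)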
Realizable: a learning problem is realizable if the hypothesis space contains the true function.
Unfortunately, we cannot always tell whether a given learning problem is realizable, because the true
function is not known.
Supervised learning can be done by choosing the hypothesis h* that is most probable given the data:
h* = argmax h∈H P(h | data) = argmax h∈H P(data | h) P(h),
where the second form follows from Bayes' rule.
There is a trade-off between the expressiveness of a hypothesis space and the complexity of finding a
good hypothesis within that space.
Entropy measures the uncertainty of a random variable V with values vk:
H(V) = −Σk P(vk) log2 P(vk).
We can check that the entropy of a fair coin flip is indeed 1 bit:
H(Fair) = −(0.5 log2 0.5 + 0.5 log2 0.5) = 1.
The information gain from the attribute test on A is the expected reduction in entropy:
Gain(A) = B(p/(p + n)) − Remainder(A),
where B(q) is the entropy of a Boolean variable that is true with probability q, the training set contains p positive and n negative examples, and Remainder(A) = Σk ((pk + nk)/(p + n)) B(pk/(pk + nk)) sums over the d values of attribute A.
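These formulas can be checked with a short Python sketch (the function names are chosen here; the split counts passed to information_gain are assumed to be nonzero):

import math

def entropy(probs):
    # H(V) = -sum_k P(vk) log2 P(vk); terms with probability 0 contribute 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_gain(p, n, splits):
    # Gain(A) = B(p/(p+n)) - Remainder(A), where splits is a list of
    # (pk, nk) counts, one pair per value of the attribute A.
    b = lambda q: entropy([q, 1 - q])
    remainder = sum((pk + nk) / (p + n) * b(pk / (pk + nk))
                    for pk, nk in splits)
    return b(p / (p + n)) - remainder

assert entropy([0.5, 0.5]) == 1.0      # the fair-coin check from the text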
Pruning
In decision trees, a technique called decision tree pruning combats overfitting. Pruning works by eliminating nodes whose attribute tests are not clearly relevant: a test is irrelevant when its information gain is close to zero, and a statistical significance test (such as the χ² test) decides whether the observed gain could have arisen by chance.
Issues in decision trees:
o Missing data
o Multivalued attributes
Current-best-hypothesis search
Current-best-hypothesis search maintains a single hypothesis and adjusts it as new examples arrive. If the hypothesis wrongly classifies a new positive example as negative (a false negative), its extension must be increased to include the example; this is called generalization. If it wrongly classifies a negative example as positive (a false positive), its extension must be decreased to exclude the example; this is called specialization.
Least-commitment search
Backtracking arises because the current-best-hypothesis approach has to choose a particular
hypothesis as its best guess even though it does not have enough data yet to be sure of the choice.
What we can do instead is to keep around all and only those hypotheses that are consistent with all the data so far. Each new example will either have no effect or will get rid of some of the hypotheses. The set of hypotheses remaining is called the version space.
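A minimal Python sketch of this least-commitment idea, using a toy hypothesis space of threshold classifiers invented here for illustration:

def update_version_space(hypotheses, example, label):
    # Keep all and only the hypotheses consistent with the new example.
    return [h for h in hypotheses if h(example) == label]

hypotheses = [lambda x, t=t: x >= t for t in range(10)]   # thresholds 0..9
for example, label in [(7, True), (3, False), (5, True)]:
    hypotheses = update_version_space(hypotheses, example, label)
print([h.__defaults__[0] for h in hypotheses])            # -> [4, 5]

Each example either leaves the set unchanged or shrinks it; no hypothesis ever has to be retracted and re-guessed.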
EXPLANATION-BASED LEARNING
Explanation-based learning is a method for extracting general rules from individual observations.
Memoization
The technique of memoization has long been used in computer science to speed up programs by
saving the results of computation.
The basic idea of memo functions is to accumulate a database of input–output pairs; when the
function is called, it first checks the database to see whether it can avoid solving the problem from
scratch.
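A minimal Python sketch of a memo function, using the standard library's functools.lru_cache on the familiar Fibonacci example:

import functools

@functools.lru_cache(maxsize=None)
def fib(n):
    # Each distinct n is computed once and then served from the cache,
    # turning an exponential-time recursion into a linear-time one.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))   # answers instantly thanks to cached subproblems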
Explanation-based learning takes this a good deal further, by creating general rules that cover an
entire class of cases.
Basic EBL process works as follows:
1. Given an example, construct a proof that the goal predicate applies to the example using the
available background knowledge
2. In parallel, construct a generalized proof tree for the variabilized goal using the same inference
steps as in the original proof.
3. Construct a new rule whose left-hand side consists of the leaves of the proof tree and whose
right-hand side is the variabilized goal (after applying the necessary bindings from the
generalized proof).
4. Drop any conditions from the left-hand side that are true regardless of the values of the variables
in the goal.
For example, from a single proof that the expression 1 × (0 + x) simplifies to x, EBL can extract a general rule such as Primitive(z) ⇒ Simplify(1 × (0 + z), z), which then applies to every expression of that form.
Three factors involved in the analysis of efficiency gains from EBL:
1. Adding large numbers of rules can slow down the reasoning process, because the inference
mechanism must still check those rules even in cases where they do not yield a solution. In other
words, it increases the branching factor in the search space.
2. To compensate for the slowdown in reasoning, the derived rules must offer significant increases
in speed for the cases that they do cover. These increases come about mainly because the
derived rules avoid dead ends that would otherwise be taken, but also because they shorten the
proof itself.
3. Derived rules should be as general as possible, so that they apply to the largest possible set of
cases.
function MINIMAL-CONSISTENT-DET(E, A) returns a set of attributes
   inputs: E, a set of examples; A, a set of attributes, of size n
   for i = 0 to n do
      for each subset Ai of A of size i do
         if CONSISTENT-DET?(Ai, E) then return Ai

function CONSISTENT-DET?(A, E) returns a truth value
   returns true if and only if no two examples in E agree on every attribute in A yet have different classifications
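A minimal Python sketch of the two functions above; the representation of an example as a dict with "values" and "class" keys is an assumption made here for illustration:

from itertools import combinations

def consistent_det(attrs, examples):
    # True iff no two examples agree on every attribute in attrs
    # but carry different classifications.
    seen = {}
    for e in examples:
        key = tuple(e["values"][a] for a in attrs)
        if key in seen and seen[key] != e["class"]:
            return False
        seen[key] = e["class"]
    return True

def minimal_consistent_det(examples, attributes):
    # Trying subsets in order of increasing size guarantees that the
    # first consistent determination returned is a minimal one.
    for i in range(len(attributes) + 1):
        for subset in combinations(sorted(attributes), i):
            if consistent_det(subset, examples):
                return set(subset)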
Given an algorithm for learning determinations, a learning agent has a way to construct a minimal
hypothesis within which to learn the target predicate. For example, we can combine MINIMAL-
CONSISTENT-DET with the DECISION-TREE-LEARNING algorithm.
This yields a relevance-based decision-tree learning algorithm RBDTL that first identifies a minimal
set of relevant attributes and then passes this set to the decision tree algorithm for learning.
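Continuing the sketch above (and reusing its hypothetical minimal_consistent_det and example representation), the RBDTL pipeline might look like this; decision_tree_learning is assumed to be supplied by the caller and is not defined here:

def rbdtl(examples, attributes, decision_tree_learning):
    # First find a minimal set of relevant attributes, then learn a
    # decision tree over the examples restricted to those attributes.
    relevant = minimal_consistent_det(examples, attributes)
    reduced = [{"values": {a: e["values"][a] for a in relevant},
                "class": e["class"]} for e in examples]
    return decision_tree_learning(reduced, relevant)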
The object of an inductive learning program is to come up with a set of sentences for the Hypothesis such that the entailment constraint
Background ∧ Hypothesis ∧ Descriptions |= Classifications
is satisfied.
Suppose, for the moment, that the agent has no background knowledge: Background is empty. Then one possible solution for Hypothesis must define the target predicate purely in terms of the example descriptions; an attribute-based learner cannot even represent such a relational hypothesis, because to express a relation over people as an attribute we would need to make pairs of people into objects.
Top-down inductive learning methods
The first approach to ILP works by starting with a very general rule and gradually specializing it so that it fits the data. For example, to learn Grandfather(x, y), a top-down learner such as FOIL begins with a clause whose body is empty and specializes it by adding literals one at a time (for instance Father(x, z), then Parent(z, y)) until the clause covers only positive examples.