
Find-S algorithm

The Find-S algorithm is a machine learning algorithm that finds a maximally specific
hypothesis consistent with labeled training data. It starts with the most specific
hypothesis and generalizes it step by step so that it covers each positive example;
negative examples are ignored during the learning process.

Symbols used in the Find-S algorithm


•  ∅ (Empty Set) − This symbol represents the most restrictive constraint: no value is acceptable
for the attribute. It is used to initialize the hypothesis as the most specific concept.
•  ? (Don't Care) − The question mark represents a "don't care" value: any value is acceptable for
the attribute. It is used when the hypothesis has to generalize over the different attribute values
that appear in the positive examples.
•  Hypothesis (h) − The variable h denotes the hypothesis, that is, the concept or generalization
learned from the training data. It is refined iteratively as the algorithm processes the examples.
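For instance, with four attributes the most specific hypothesis is h = <∅, ∅, ∅, ∅> and the most
general is h = <?, ?, ?, ?>; a partially generalized hypothesis such as h = <Sunny, Warm, ?, Strong>
(the attribute values here are purely illustrative) accepts any value for the third attribute but
requires exact matches on the other three.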

Find-S Algorithm:
Following are the steps for the Find-S algorithm:

1. Initialize h to the most specific hypothesis in H.
2. For each positive training example x:
      For each attribute constraint ai in h:
         If the constraint ai is satisfied by x,
         then do nothing;
         else replace ai in h by the next more general constraint that is satisfied by x.
3. Output hypothesis h.
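A minimal Python sketch of these steps is given below. It is illustrative only: the function name
find_s, the use of the string '0' for the empty constraint ∅, and the string '?' for the fully
general constraint are assumptions made for this sketch, not part of any fixed specification.

# Minimal sketch of Find-S (illustrative). Each training example is a tuple of
# attribute values; '0' stands for the empty constraint (no value accepted) and
# '?' for the fully general "don't care" constraint.
def find_s(examples, labels, positive_label=True):
    """Return the maximally specific hypothesis covering all positive examples."""
    n_attributes = len(examples[0])
    h = ['0'] * n_attributes           # Step 1: most specific hypothesis <∅, ..., ∅>
    for x, label in zip(examples, labels):
        if label != positive_label:
            continue                   # Negative examples are ignored by Find-S.
        for i, value in enumerate(x):  # Step 2: generalize only as much as needed.
            if h[i] == '0':            # Empty constraint: adopt the example's value.
                h[i] = value
            elif h[i] != value:        # Conflicting values: generalize to "don't care".
                h[i] = '?'
    return h                           # Step 3: output hypothesis h.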

•  Initialization − The algorithm starts with the most specific hypothesis, denoted as h. This initial
hypothesis is the most restrictive concept and classifies every example as negative. It may be
represented as h = <∅, ∅, ..., ∅>, where ∅ denotes that no value is acceptable for the
corresponding attribute.
•  Iterative Process − The algorithm iterates through each training example and refines the hypothesis
based on whether the example is positive or negative.
o For each positive training example (an example labeled with the target class), the algorithm
updates the hypothesis by generalizing it to include the attributes of the example. The
hypothesis becomes more general as it covers more positive examples.
o For each negative training example (an example labeled with a non-target class), the algorithm
ignores it, as the hypothesis should not cover negative examples. The hypothesis remains
unchanged for negative examples.
•  Generalization − After processing all the training examples, the algorithm outputs a final
hypothesis that covers every positive example; provided the training data is noise-free and the
target concept is contained in H, this hypothesis is also consistent with the negative examples. This
final hypothesis represents the generalized concept that the algorithm has learned from the training
data; a worked run on a small sample dataset is sketched below.
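For concreteness, the find_s sketch above could be run on a small, made-up dataset such as the
following (the attribute names, values, and labels are illustrative assumptions, loosely modeled on
the classic EnjoySport example):

# Made-up training data: (Sky, AirTemp, Humidity, Wind); label = positive/negative.
examples = [
    ('Sunny', 'Warm', 'Normal', 'Strong'),   # positive
    ('Sunny', 'Warm', 'High',   'Strong'),   # positive
    ('Rainy', 'Cold', 'High',   'Strong'),   # negative (ignored by Find-S)
    ('Sunny', 'Warm', 'High',   'Strong'),   # positive
]
labels = [True, True, False, True]

print(find_s(examples, labels))
# Expected output: ['Sunny', 'Warm', '?', 'Strong']
# Humidity generalizes to '?' because the positive examples disagree on its value.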
