Fuzzy Logic and Neural Networks notes


Fuzzy Logic and Neural Networks

UNIT: I

Fuzzy Set Theory and Fuzzy Logic Control:

Basic concepts of fuzzy sets-Operations on fuzzy sets-Fuzzy relation equations-Fuzzy logic


control Fuzzification–Defuzzification-Knowledge base-Decision making logic-Membership
functions–Rule base.

UNIT:II
Adaptive Fuzzy Systems

Performance index – Modification of rule base 0 – Modification of membership functions -


Simultaneous modification of rule base and membership functions – Genetic algorithms –
Adaptive fuzzy systems – Neuro-fuzzy systems.

UNIT: III
Artificial Neural Networks:
Introduction-History of neural networks-Multilayer perceptrons-Backpropagation algorithm and
its variants-Different types of learning, examples.

UNIT: IV
Mapping and Recurrent Networks:
Counter propagation–Self-Organizing Map-Cognitron and Neocognitron–Hopfield Net-
Kohonen Nets-Grossberg Nets-ART-I, ART-II-Reinforcement learning.

UNIT:V
Case Studies
Application of fuzzy logic and neural networks to Measurement-Control-Adaptive Neural
Controllers –Signal Processing and Image Processing.

Text Book(s):

1. Valluru B. Rao and Hayagriva V. Rao, C++ Neural Networks and Fuzzy Logic, BPB Publications, New Delhi, 1996.

Reference Book(s):

1. Chennakesava R. Alavala, Fuzzy Logic and Neural Networks, New Age International, 2008.
2. Miller W.T., Sutton R.S. and Werbos P.J., Neural Networks for Control, MIT Press, 1992.
3. Klir G.J. and Yuan B., Fuzzy Sets and Fuzzy Logic, Prentice Hall of India Pvt. Ltd., New Delhi.
4. Kosko B., Neural Networks and Fuzzy Systems, Prentice Hall of India Pvt. Ltd., New Delhi, 1994.
5. Driankov D., Hellendoorn H. and Reinfrank M., An Introduction to Fuzzy Control, Narosa Publishing House, New Delhi, 1996.
6. Zurada J.M., Introduction to Artificial Neural Systems, Jaico Publishing House, New Delhi, 1994.
Fuzzy Logic and Neural Networks
UNIT: I
Fuzzy Set Theory and Fuzzy Logic Control:

Basic Concepts of Fuzzy Sets


Introduction:

Logic deals with true and false. A proposition can be true on one occasion and false on
another. “Apple is a red fruit” is such a proposition. If you are holding a Granny Smith apple that is
green, the proposition that apple is a red fruit is false.

On the other hand, if your apple is of a red delicious variety, it is a red fruit and the proposition in
reference is true. If a proposition is true, it has a truth value of 1; if it is false, its truth value is 0.
These are the only possible truth values. Propositions can be combined to generate other
propositions, by means of logical operations.
When you say it will rain today or that you will have an outdoor picnic today, you are
making statements with certainty. Of course your statements in this case can be either true or false.
The truth values of your statements can be only 1, or 0. Your statements then can be said to be
crisp. On the other hand, there are statements you cannot make with such certainty. You may be
saying that you think it will rain today. If pressed further, you may be able to say with a degree of
certainty in your statement that it will rain today. Your level of certainty, however, is about 0.8,
rather than 1. This type of situation is what fuzzy logic was developed to model. Fuzzy logic deals
with propositions that can be true to a certain degree somewhere from 0 to 1.

Therefore, a proposition’s truth value indicates the degree of certainty about which the
proposition is true. The degree of certainty sounds like a probability (perhaps subjective
probability), but it is not quite the same. Probabilities for mutually exclusive events cannot add up
to more than 1, but their fuzzy values may.

Suppose that the probability of a cup of coffee being hot is 0.8 and the probability of the
cup of coffee being cold is 0.2. These probabilities must add up to 1.0. Fuzzy values do not need to
add up to 1.0. The truth value of a proposition that a cup of coffee is hot is 0.8. The truth value of a
proposition that the cup of coffee is cold can be 0.5. There is no restriction on what these truth
values must add up to.
Fuzzy Sets:

Fuzzy logic is best understood in the context of set membership. Suppose you are assembling a set
of rainy days. Would you put today in the set? When you deal only with crisp statements that are
either true or false, your inclusion of today in the set of rainy days is based on certainty. When
dealing with fuzzy logic, you would include today in the set of rainy days via an ordered pair, such
as (today, 0.8). The first member in such an ordered pair is a candidate for inclusion in the set, and
the second member is a value between 0 and 1, inclusive, called the degree of membership in the
set. The inclusion of the degree of membership in the set makes it convenient for developers to
come up with a set theory based on fuzzy logic, just as regular set theory is developed. Fuzzy sets
are sets in which members are presented as ordered pairs that include information on degree of
membership. A traditional set of, say, k elements, is a special case of a fuzzy set, where each of
those k elements has 1 for the degree of membership, and every other element in the universal
set has a degree of membership 0, for which reason you don’t bother to list it.
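To make the representation concrete, a fuzzy set can be stored in a program as a mapping from elements to membership degrees. The minimal sketch below (Python; the set and degrees are invented for illustration) returns 0 for any element that is not listed, mirroring the convention just described.

# A fuzzy set as a mapping from elements to membership degrees in [0, 1];
# elements absent from the mapping are taken to have degree 0.
rainy_days = {"today": 0.8, "yesterday": 0.3}

def membership(fuzzy_set, element):
    # Degree of membership of element; 0.0 if not listed.
    return fuzzy_set.get(element, 0.0)

print(membership(rainy_days, "today"))        # 0.8
print(membership(rainy_days, "last Monday"))  # 0.0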

Fuzzy Set Operations:

The usual operations you can perform on ordinary sets are union, in which you take all the
elements that are in one set or the other; and intersection, in which you take the elements that are in
both sets. In the case of fuzzy sets, taking a union is finding the degree of membership that an
element should have in the new fuzzy set, which is the union of two fuzzy sets. If a, b, c, and d are
such that their degrees of membership in the fuzzy set A are 0.9, 0.4, 0.5, and 0, respectively, then
the fuzzy set A is given by the fit vector (0.9, 0.4, 0.5, 0). The components of this fit vector are
called fit values of a, b, c, and d.

Union of Fuzzy Sets:
Consider a union of two traditional sets and an element that belongs to only one of those sets. Earlier you saw that if you treat these sets as
fuzzy sets, this element has a degree of membership of 1 in one case and 0 in the other since it
belongs to one set and not the other. Yet you are going to put this element in the union. The
criterion you use in this action has to do with degrees of membership. You need to look at the two
degrees of membership, namely, 0 and 1, and pick the higher value of the two, namely, 1. In other
words, what you want for the degree of membership of an element when listed in the union of two
fuzzy sets, is the maximum value of its degrees of membership within the two fuzzy sets forming a
union. If a, b, c, and d have the respective degrees of membership in fuzzy sets A, B as A = (0.9, 0.4, 0.5, 0) and B = (0.7, 0.6, 0.3, 0.8), then A ∪ B = (0.9, 0.6, 0.5, 0.8).

Intersection and Complement of Two Fuzzy Sets:


Analogously, the degree of membership of an element in the intersection of two fuzzy sets is the
minimum, or the smaller value of its degree of membership individually in the two sets forming
the intersection. For example, if today has 0.8 for degree of membership in the set of rainy days
and 0.5 for degree of membership in the set of days of work completion, then today belongs to the
set of rainy days on which work is completed to a degree of 0.5, the smaller of 0.5 and 0.8.

Recall the fuzzy sets A and B in the previous example: A = (0.9, 0.4, 0.5, 0) and B = (0.7, 0.6, 0.3, 0.8). A ∩ B, which is the intersection of the fuzzy sets A and B, is obtained by taking, in each component, the smaller of the values found in that component in A and in B. Thus A ∩ B = (0.7, 0.4, 0.3, 0).

The idea of a universal set is implicit in dealing with traditional sets. For example, if you talk of
the set of married persons, the universal set is the set of all persons. Every other set you consider
in that context is a subset of the universal set. We bring up this matter of universal set because
when you make the complement of a traditional set A, you need to put in every element in the
universal set that is not in A. The complement of a fuzzy set, however, is obtained as follows. In
the case of fuzzy sets, if the degree of membership is 0.8 for a member, then that member is not in
that set to a degree of 1.0 – 0.8 = 0.2. So you can set the degree of membership in the complement
fuzzy set to the complement with respect to 1. If we return to the scenario of having a degree of
0.8 in the set of rainy days, then today has to have 0.2 membership degree in the set of non-rainy
or clear days.

Continuing with our example of fuzzy sets A and B, and denoting the complement of A by A’, we have A’ = (0.1, 0.6, 0.5, 1) and B’ = (0.3, 0.4, 0.7, 0.2). Note that A’ ∪ B’ = (0.3, 0.6, 0.7, 1), which is also the complement of A ∩ B. You can similarly verify that the complement of A ∪ B is the same as A’ ∩ B’. Furthermore, A ∪ A’ = (0.9, 0.6, 0.5, 1) and A ∩ A’ = (0.1, 0.4, 0.5, 0), which is not a vector of zeros only, as would be the case in conventional sets. In fact, A and A’ will be equal in the sense that their fit vectors are the same if each component in the fit vector is equal to 0.5.
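The max/min/complement rules above translate directly into code. The following sketch (Python) operates on the fit vectors A and B from the text and checks the De Morgan identity just verified by hand.

# Fit-vector operations: union = componentwise max,
# intersection = componentwise min, complement = 1 - degree.
A = (0.9, 0.4, 0.5, 0.0)
B = (0.7, 0.6, 0.3, 0.8)

def f_union(X, Y):
    return tuple(max(x, y) for x, y in zip(X, Y))

def f_intersection(X, Y):
    return tuple(min(x, y) for x, y in zip(X, Y))

def f_complement(X):
    return tuple(round(1.0 - x, 10) for x in X)

print(f_union(A, B))         # (0.9, 0.6, 0.5, 0.8)
print(f_intersection(A, B))  # (0.7, 0.4, 0.3, 0.0)
# De Morgan: (A ∪ B)' equals A' ∩ B'
assert f_complement(f_union(A, B)) == f_intersection(f_complement(A), f_complement(B))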

Applications of Fuzzy Logic:


Applications of fuzzy sets and fuzzy logic are found in many fields, including artificial
intelligence, engineering, computer science, operations research, robotics, and pattern recognition.
These fields are also ripe for applications of neural networks. So it seems natural that fuzziness should be introduced into neural networks themselves. In any area where humans need to make decisions, fuzzy sets can find a place, since the information on which decisions are to be based may not always be complete, and the reliability of the supposed values of the underlying parameters is not always certain.

Examples of Fuzzy Logic:


Let us say five tasks have to be performed in a given period of time, and each task requires one
person dedicated to it. Suppose there are six people capable of doing these tasks. As you have
more than enough people, there is no problem in scheduling this work and getting it done. Of
course who gets assigned to which task depends on some criterion, such as total time for
completion, on which some optimization can be done. But suppose these six people are not
necessarily available during the particular period of time in question. Suddenly, the equation is
seen in less than crisp terms. The availability of the people is fuzzy-valued. Here is an example of
an assignment problem where fuzzy sets can be used.

Commercial Applications:

Many commercial uses of fuzzy logic exist today. A few examples are listed here:
• A subway in Sendai, Japan uses a fuzzy controller to control a subway car. This controller has
outperformed human and conventional controllers in giving a smooth ride to passengers in all
terrain and external conditions.
• Cameras and camcorders use fuzzy logic to adjust autofocus mechanisms and to cancel the jitter
caused by a shaking hand.
• Some automobiles use fuzzy logic for different control applications. Nissan has patents on fuzzy
logic braking systems, transmission controls, and fuel injectors. GM uses a fuzzy transmission
system in its Saturn vehicles.
• FuziWare has developed and patented a fuzzy spreadsheet called FuziCalc that allows users to
incorporate fuzziness in their data.
• Software applications to search and match images for certain pixel regions of interest have been
developed. Avian Systems has a software package called FullPixelSearch.

• A stock market charting and research tool called SuperCharts, from Omega Research, uses fuzzy logic in one of its modules to determine whether the market is bullish, bearish, or neutral.

Fuzzy relation equations


Fuzzy Relations:
A standard relation from set A to set B is a subset of the Cartesian product of A and B, written
as A×B. The elements of A×B are ordered pairs (a, b) where a is an element of A and b is an
element of B. For example, the ordered pair (Joe, Paul) is an element of the Cartesian product of
the set of fathers, which includes Joe, and the set of sons, which includes Paul. Or, you can consider it as an element of the Cartesian product of the set of men with itself. In this case, the ordered pair (Joe, Paul) is in the subset that contains (a, b) if a is the father of b. This subset is a relation on the set of men. You can call this relation “father.”
A fuzzy relation is similar to a standard relation, except that the resulting sets are fuzzy
sets. An example of such a relation is ‘much_more_educated’. This fuzzy set may look something like:

much_more_educated = { ..., 0.2/(Jeff, Steve), 0.7/(Jeff, Mike), ... }

Matrix Representation of a Fuzzy Relation:


A fuzzy relation can be given as a matrix also when the underlying sets, call them domains,
are finite. For example, let the set of men be S = { Jeff, Steve, Mike }, and let us use the same
relation, much_more_educated. For each element of the Cartesian product S×S, we need the
degree of membership in this relation. Writing m for the membership function of much_more_educated, we already have two such values: m(Jeff, Steve) = 0.2 and m(Jeff, Mike) = 0.7. What degree of membership in the set should we assign for the pair (Jeff, Jeff)? It seems reasonable to assign a 0. We will assign a 0 whenever the two members of the ordered pair are the same. Our relation much_more_educated is then given by a matrix that may look like the following:

much_more_educated =
              Jeff    Steve   Mike
    Jeff      0       0.2     0.7
    Steve     0.4     0       0.3
    Mike      0.1     0.6     0
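When the domains are finite, the relation is simply a matrix of membership degrees that a program can hold directly. A small sketch (Python with NumPy; same values as above):

import numpy as np

# much_more_educated over S = {Jeff, Steve, Mike}; entry [i][j] is the
# degree to which person i is much more educated than person j.
names = ["Jeff", "Steve", "Mike"]
R = np.array([[0.0, 0.2, 0.7],
              [0.4, 0.0, 0.3],
              [0.1, 0.6, 0.0]])

print(R[names.index("Jeff"), names.index("Mike")])  # 0.7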

Properties of Fuzzy Relations:


A relation on a set, that is, a subset of the Cartesian product of a set with itself, may have some interesting properties. It may be reflexive. For this you need 1 for the degree of membership of each main diagonal entry. Our example here is evidently not reflexive.
A relation may be symmetric. For this you need the degrees of membership of each pair of entries
symmetrically situated to the main diagonal to be the same value. For example (Jeff, Mike) and
(Mike, Jeff) should have the same degree of membership. Here they do not, so our example of a
relation is not symmetric.
A relation may be antisymmetric. This requires that if a is different from b and the degree of membership of the ordered pair (a, b) is not 0, then its mirror image, the ordered pair (b, a), should have 0 for degree of membership. In our example, both (Steve, Mike) and (Mike, Steve) have positive values for degree of membership; therefore, the relation much_more_educated over the set {Jeff, Steve, Mike} is not antisymmetric either.
A relation may be transitive. For transitivity of a relation, you need the following
condition, illustrated with our set {Jeff, Steve, Mike}. For brevity, let us use r in place of
much_more_educated, the name of the relation:
min (mr(Jeff, Steve), mr(Steve, Mike)) ≤ mr(Jeff, Mike)
min (mr(Jeff, Mike), mr(Mike, Steve)) ≤ mr(Jeff, Steve)
min (mr(Steve, Jeff), mr(Jeff, Mike)) ≤ mr(Steve, Mike)
min (mr(Steve, Mike), mr(Mike, Jeff)) ≤ mr(Steve, Jeff)
min (mr(Mike, Jeff), mr(Jeff, Steve)) ≤ mr(Mike, Steve)
min (mr(Mike, Steve), mr(Steve, Jeff)) ≤ mr(Mike, Jeff)
In the above listings, the ordered pairs on the left-hand side of an occurrence of ≤ are such that
the second member of the first ordered pair matches the first member of the second ordered pair,
and also the right-hand side ordered pair is made up of the two nonmatching elements, in the same
order.
Example:
min (mr(Jeff, Steve) , mr(Steve, Mike) ) = min (0.2, 0.3) = 0.2
mr(Jeff, Mike) = 0.7 > 0.2
For this instance, the required condition is met. But in the following:
min (mr(Jeff, Mike), mr(Mike, Steve) ) = min (0.7, 0.6) = 0.6
mr(Jeff, Steve) = 0.2 < 0.6
The required condition is violated, so the relation much_more_educated is not transitive.
If you think about it, it should be clear that when a relation on a set of more than one element is symmetric and gives some pair of distinct elements a nonzero degree of membership, it cannot be antisymmetric as well, and vice versa. But a relation can be both not symmetric and not antisymmetric at the same time.
Example:
An example of reflexive, symmetric, and transitive relation is given by the following matrix:
1 0.4 0.8
0.4 1 0.4
0.8 0.4 1
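These three properties are easy to test mechanically. The following sketch (Python with NumPy) checks reflexivity, symmetry, and max-min transitivity, applied to the matrix just given:

import numpy as np

def is_reflexive(R):
    return bool(np.all(np.diag(R) == 1.0))

def is_symmetric(R):
    return bool(np.array_equal(R, R.T))

def is_transitive(R):
    # max-min transitivity: min(R[x,y], R[y,z]) <= R[x,z] for all x, y, z
    n = len(R)
    return all(min(R[x, y], R[y, z]) <= R[x, z]
               for x in range(n) for y in range(n) for z in range(n))

R = np.array([[1.0, 0.4, 0.8],
              [0.4, 1.0, 0.4],
              [0.8, 0.4, 1.0]])
print(is_reflexive(R), is_symmetric(R), is_transitive(R))  # True True True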

Similarity Relations
A reflexive, symmetric, and transitive fuzzy relation is said to be a fuzzy equivalence relation.
Such a relation is also called a similarity relation. When you have a similarity relation s, you can
define the similarity class of an element x of the domain as the fuzzy set in which the degree of
membership of y in the domain is ms(x, y). The similarity class of x with the relation s can be
denoted by [x]s.
Resemblance Relations:
Do you think similarity and resemblance are one and the same? If x is similar to y, does it mean
that x resembles y? Or does the answer depend on what sense is used to talk of similarity or of
resemblance? In everyday jargon, Bill may be similar to George in the sense of holding high
office, but does Bill resemble George in financial terms? Does this prompt us to look at a
‘resemblance relation’ and distinguish it from the ‘similarity relation’? Of course.

Recall that a fuzzy relation that is reflexive, symmetric, and also transitive is called a similarity relation. It helps you to create similarity classes. If the relation lacks any one of the three properties, it is not a similarity relation. But if it is reflexive and symmetric, only failing to be transitive, it is a resemblance relation. An example of a
resemblance relation, call it t, is given by the following matrix.
Let the domain have elements a, b, and c:
t =
    1      0.4    0.8
    0.4    1      0.5
    0.8    0.5    1
This fuzzy relation is clearly reflexive and symmetric, but it is not transitive. For example,

min (mt(a, c), mt(c, b)) = min (0.8, 0.5) = 0.5,

but

mt(a, b) = 0.4 < 0.5,

which violates the condition for transitivity. Therefore, t is not a similarity relation, but it certainly is a resemblance relation.

Fuzzy Partial Order:


One last definition is that of a fuzzy partial order. A fuzzy relation that is reflexive, anti
symmetric, and transitive is a fuzzy partial order. It differs from a similarity relation by requiring
anti symmetry instead of symmetry. In the context of crisp sets, an equivalence relation that helps
to generate equivalence classes is also a reflexive, symmetric, and transitive relation. But those
equivalence classes are disjoint, unlike similarity classes with fuzzy relations. With crisp sets, you
can define a partial order, and it serves as a basis for making comparison of elements in the
domain with one another.

Fuzzy logic control: Fuzzification


Fuzzification:
Fuzzification is the process of converting a crisp input value to a fuzzy value, performed using the information in the knowledge base. Although various types of curves can be found in the literature, Gaussian, triangular, and trapezoidal MFs are the most commonly used in the fuzzification process. These types of MFs can easily be implemented by embedded controllers. The MFs are defined mathematically with several parameters. In order to fine-tune the performance of an FLC, these parameters, or the shape of the MFs, can be adapted.
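As an illustration, the sketch below fuzzifies a crisp input with triangular MFs (Python; the linguistic terms and parameters are invented for the example, not taken from any particular controller):

def triangular_mf(x, a, b, c):
    # Triangular MF with feet at a and c and peak at b.
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Illustrative temperature terms (parameters chosen for the sketch).
terms = {"cold": (-10, 0, 15), "warm": (5, 20, 30), "hot": (25, 35, 50)}

def fuzzify(x):
    return {term: triangular_mf(x, *p) for term, p in terms.items()}

print(fuzzify(18))  # {'cold': 0.0, 'warm': 0.866..., 'hot': 0.0}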

Fuzzy Control:

This section discusses the fuzzy logic controller (FLC), its application and design. Fuzzy control is used in a variety of machines and processes today, with widespread application especially in Japan. A few of the applications in use today are listed in the table below, along with the functions the FLC performs.

Application            FLC function(s)
Helicopter control     Determines the best operating actions by judging human instructions and the flying conditions, including wind speed and direction.

Defuzzification:
 Defuzzification is the process of representing a fuzzy set with a crisp number. Internal representations of data in a fuzzy system are usually fuzzy sets.
 The output frequently needs to be a crisp number that can be used to perform a function, such as commanding a valve to a desired position in a control application or indicating a problem risk index, as discussed in the next section.
 The most commonly used defuzzification method is the center of area (COA) method, also commonly referred to as the centroid method. This method determines the center of area of the fuzzy set and returns the corresponding crisp value. The center of sums (COS) method and the mean of maximum method are two alternative defuzzification methods.

This is the complete structure of a fuzzy logic system: once all input variable values are translated into respective linguistic variable values, the fuzzy inference step evaluates the set of fuzzy rules that define the evaluation. The result of this is again a linguistic value. The defuzzification step translates this linguistic result into a numerical value.
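For the center of area method, the crisp output is the membership-weighted average over the output axis. A minimal sketch over a sampled output fuzzy set (Python with NumPy; the valve-position axis and degrees are illustrative):

import numpy as np

def centroid_defuzzify(xs, mu):
    # Center of area (centroid) of a sampled output fuzzy set.
    xs, mu = np.asarray(xs, float), np.asarray(mu, float)
    return float(np.sum(xs * mu) / np.sum(mu))

xs = np.linspace(0, 100, 11)  # candidate valve positions, 0..100 percent
mu = np.array([0, 0, 0.2, 0.5, 0.8, 1.0, 0.8, 0.5, 0.2, 0, 0])
print(centroid_defuzzify(xs, mu))  # 50.0, since the set is symmetric about 50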

Knowledge base:
In fuzzy logic systems, the fuzzy knowledge base represents the facts, the rules, and the linguistic variables based on fuzzy set theory, so that the knowledge base allows approximate reasoning.

Knowledge Representation:
Fuzzy logic, which may be viewed as an extension of classical logical systems, provides an
effective conceptual framework for dealing with the problem of knowledge representation in an
environment of uncertainty and imprecision. Meaning representation in fuzzy logic is based on
test-score semantics.

Decision Making Logic:

1. Definition:
Decision making logic refers to the process or system used to make choices or select actions
based on available information, criteria, and objectives.

2. Components of Decision Making Logic:


Inputs:
Information, data, or variables used as input to the decision-making process.
Criteria:
Standards or requirements against which options are evaluated.
Rules:
Guidelines or conditions used to determine the best course of action based on inputs and
criteria.
Decision Maker:
Individual, group, or automated system responsible for making decisions.
Outputs:
Selected actions or choices resulting from the decision-making process.
3. Types of Decision Making Logic:

Deterministic Logic: Based on clear rules and precise information, leading to unambiguous
decisions.
Probabilistic Logic: Considers probabilities and uncertainties, often using statistical methods to
make decisions.
Heuristic Logic: Relies on rules of thumb, experience, or intuition to guide decision making in
complex or ambiguous situations.
Fuzzy Logic: Handles uncertainty and vagueness by allowing for degrees of truth between true
and false, suitable for systems with imprecise inputs.
4. Decision Making Process:

Identify the Decision: Clearly define the problem or choice to be made.


Gather Information: Collect relevant data and inputs necessary for making informed decisions.
Evaluate Options: Assess available choices against predefined criteria or objectives.
Apply Decision Making Logic: Use appropriate decision-making methods, algorithms, or models
to select the best option.
Make the Decision: Choose the option that best meets the criteria and objectives.
Implement and Monitor: Put the decision into action and monitor outcomes to evaluate
effectiveness.
5. Factors Influencing Decision Making Logic:
Objectives and Goals: Desired outcomes or objectives drive the decision-making process.
Constraints: Limitations or restrictions that affect available options or choices.
Risk Tolerance: Willingness to accept uncertainty or potential negative outcomes.
Preferences: Individual or organizational preferences, values, and priorities.
External Factors: Environmental, economic, social, or political factors that impact decision
making.
6. Applications of Decision Making Logic:

Decision making logic is applied in various domains, including business management,


engineering, healthcare, finance, and artificial intelligence.
Examples include investment decisions, project management, resource allocation, and strategic
planning.
7. Challenges and Considerations:

Balancing competing objectives and priorities.


Dealing with uncertainty and incomplete information.
Ensuring transparency and accountability in decision-making processes.
Adapting to changing conditions and unforeseen events.
Addressing biases or cognitive limitations that may influence decisions.

Steps for Decision Making:

Let us now discuss the steps involved in the decision making process −

Determining the Set of Alternatives − In this step, the alternatives from which the decision has to
be taken must be determined.

Evaluating Alternatives − Here, the alternatives must be evaluated so that the decision can be taken
about one of the alternatives.

Comparison between Alternatives − In this step, a comparison between the evaluated alternatives
is done.

Types of Decision Making
We will now understand the different types of decision making.

Individual Decision Making


In this type of decision making, only a single person is responsible for taking decisions. The
decision making model in this kind can be characterized as −

Set of possible actions A

Set of goals G_i, i = 1, ..., n
Set of constraints C_j, j = 1, ..., m
The goals and constraints stated above are expressed in terms of fuzzy sets.

Now consider the set of actions A. Then, for an action a in A, each goal assigns a degree of attainment G_i(a) and each constraint assigns a degree of satisfaction C_j(a), each a membership value between 0 and 1.

The fuzzy decision in the above case is given by the intersection (componentwise minimum) of all the goals and constraints:

D(a) = min( G_1(a), ..., G_n(a), C_1(a), ..., C_m(a) )

and the best action is one that maximizes D(a).
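A small numeric sketch of this decision rule (Python; the alternatives and membership degrees are invented for illustration):

# Fuzzy decision D(a) = min over all goal and constraint memberships;
# the best alternative maximizes D.
alternatives = ["a1", "a2", "a3"]
goals = [{"a1": 0.8, "a2": 0.6, "a3": 0.9}]        # G_1(a)
constraints = [{"a1": 0.4, "a2": 0.7, "a3": 0.5}]  # C_1(a)

def decision(a):
    return min(min(g[a] for g in goals),
               min(c[a] for c in constraints))

best = max(alternatives, key=decision)
print(best, decision(best))  # a2 0.6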


Multi-person Decision Making
Decision making in this case includes several persons so that the expert knowledge from various
persons is utilized to make decisions.

Calculation for this can be given as follows −

If N(x_i, x_j) denotes the number of persons preferring alternative x_i to alternative x_j, and n is the total number of decision makers, then the degree of group preference of x_i over x_j is given by

S(x_i, x_j) = N(x_i, x_j) / n
Multi-objective Decision Making
Multi-objective decision making occurs when there are several objectives to be realized. There are
following two issues in this type of decision making −

To acquire proper information related to the satisfaction of the objectives by various alternatives.

To weigh the relative importance of each objective.


Mathematically we can define a universe of n alternatives as −

A = {a_1, a_2, ..., a_n}

And the set of “m” objectives as −

O = {o_1, o_2, ..., o_m}
Multi-attribute Decision Making
Multi-attribute decision making takes place when the evaluation of alternatives can be carried out
based on several attributes of the object. The attributes can be numerical data, linguistic data and
qualitative data.

Mathematically, the multi-attribute evaluation is carried out on the basis of a linear equation. In the generic weighted-sum form (the standard linear evaluation model),

Y = w_1 x_1 + w_2 x_2 + ... + w_r x_r

where x_i is the value of the i-th attribute for an alternative and w_i is the relative weight of that attribute.

Membership functions
We already know that fuzzy logic is not logic that is fuzzy but logic that is used to describe
fuzziness. This fuzziness is best characterized by its membership function. In other words, we can
say that membership function represents the degree of truth in fuzzy logic.

Following are a few important points relating to the membership function:


 Membership functions were first introduced in 1965 by Lotfi A. Zadeh in his first research
paper “Fuzzy Sets”.
 Membership functions characterize fuzziness (i.e., all the information in fuzzy set),
whether the elements in fuzzy sets are discrete or continuous.
 Membership functions can be defined as a technique to solve practical problems by
experience rather than knowledge.
 Membership functions are represented by graphical forms.
 Rules for defining fuzziness are fuzzy too.
Mathematical Notation:
We have already studied that a fuzzy set Ã in the universe of information U can be defined as a set of ordered pairs, and it can be represented mathematically as

Ã = { (y, μÃ(y)) | y ∈ U }

where μÃ(y) is the degree of membership of y in Ã, a value in the interval [0, 1].
Rule base:

The following sections are included:

 Framework: Fuzzy Logic and Fuzzy Systems

 Mamdani Fuzzy Rule-Based Systems


o The knowledge base of Mamdani fuzzy rule-based systems
o The inference engine of Mamdani fuzzy rule-based systems
 The fuzzification interface
 The inference system
 The defuzzification interface
o Example of application
o Design of the inference engine
o Advantages and drawbacks of Mamdani-type fuzzy rule-based systems
o Variants of Mamdani fuzzy rule-based systems
 DNF Mamdani fuzzy rule-based systems
 Approximate Mamdani-type fuzzy rule-based systems

 Takagi–Sugeno–Kang Fuzzy Rule-Based Systems

 Generation of the Fuzzy Rule Set


o Design tasks for obtaining the fuzzy rule set
o Kinds of information available to define the fuzzy rule set
o Generation of linguistic rules
o Generation of approximate Mamdani-type fuzzy rules
o Generation of TSK fuzzy rules
o Basic properties of fuzzy rule sets
 Completeness of a fuzzy rule set
 Consistency of a fuzzy rule set
 Low complexity of a fuzzy rule set
 Redundancy of a fuzzy rule set

 Applying Fuzzy Rule-Based Systems


o Fuzzy modelling
 Benefits of using fuzzy rule-based systems for modelling
 Relationship between fuzzy modelling and system identification
 Some applications of fuzzy modelling
o Fuzzy control
 Advantages of fuzzy logic controllers
 Differences between the design of fuzzy logic controllers and fuzzy models
 Some applications of fuzzy control
o Fuzzy classification
 Advantages of using fuzzy rule-based systems for classification
 Components and design of fuzzy rule-based classification systems
 Some applications of fuzzy classification
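To connect the outline above with code, here is a minimal sketch of how a Mamdani-style rule base can be held as data and matched against fuzzified inputs (Python; the variables, linguistic terms, and degrees are invented for illustration):

# Each rule maps antecedent linguistic terms to a consequent term.
rule_base = [
    ({"error": "negative", "delta": "negative"}, ("output", "large_positive")),
    ({"error": "zero",     "delta": "zero"},     ("output", "zero")),
    ({"error": "positive", "delta": "positive"}, ("output", "large_negative")),
]

def firing_strength(antecedent, fuzzified_inputs):
    # AND of the antecedent clauses, taken as the min of their degrees.
    return min(fuzzified_inputs[var][term] for var, term in antecedent.items())

fuzzified = {"error": {"negative": 0.2, "zero": 0.7, "positive": 0.1},
             "delta": {"negative": 0.1, "zero": 0.8, "positive": 0.2}}
for antecedent, consequent in rule_base:
    print(consequent, firing_strength(antecedent, fuzzified))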
UNIT:II
Adaptive Fuzzy Systems

Performance index – Modification of rule base 0 – Modification of membership functions -


Simultaneous modification of rule base and membership functions – Genetic algorithms –
Adaptive fuzzy systems – Neuro-fuzzy systems.

Adaptive fuzzy systems are used to manage complex and nonlinear systems by adapting their
behaviour based on changing conditions. Performance indices for these systems measure how well
they achieve desired outcomes. Here are some key performance indices used to evaluate adaptive
fuzzy systems:
1. Tracking Error:
Mean Squared Error (MSE): Measures the average of the squares of the errors between the
desired output and the system output. Lower MSE indicates better performance.
Root Mean Squared Error (RMSE): The square root of MSE, providing an interpretable scale
of error.
Integral of Absolute Error (IAE): Integrates the absolute error over time, giving a cumulative
measure of performance.
Integral of Squared Error (ISE): Integrates the square of the error over time, penalizing larger errors more heavily.
Integral of Time-weighted Absolute Error (ITAE): Weighs errors that occur later in the process more heavily, which is useful for systems where late errors are more critical. A short sketch computing these indices follows.
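The sketch below (Python with NumPy; the signals and sampling interval are illustrative) computes the five tracking-error indices for uniformly sampled desired and actual outputs:

import numpy as np

def error_indices(desired, actual, dt=1.0):
    # MSE, RMSE, IAE, ISE and ITAE for uniformly sampled signals.
    e = np.asarray(desired, float) - np.asarray(actual, float)
    t = np.arange(len(e)) * dt
    mse = float(np.mean(e ** 2))
    return {"MSE": mse,
            "RMSE": mse ** 0.5,
            "IAE": float(np.sum(np.abs(e)) * dt),
            "ISE": float(np.sum(e ** 2) * dt),
            "ITAE": float(np.sum(t * np.abs(e)) * dt)}

print(error_indices([1, 1, 1, 1], [0.5, 0.8, 0.9, 1.0]))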

2. System Stability:
Lyapunov Stability Criteria: Ensures the system remains stable over time, meaning that it will
not exhibit unbounded behavior.
BIBO Stability (Bounded-Input, Bounded-Output):
Ensures that bounded inputs result in bounded outputs, a crucial property for system reliability.

3. Adaptation and Convergence Speed:


Adaptation Time: The time taken for the system to adapt to a new set of conditions or reach a
new equilibrium.
Rate of Convergence: How quickly the system’s output converges to the desired output after a
change in input or conditions.
4. Robustness:
Robustness to Disturbances: Measures how well the system maintains performance in the
presence of disturbances or uncertainties.
Sensitivity to Parameter Variations: Assesses how changes in system parameters affect
performance.

5. Computational Efficiency:
Computation Time: The time required to compute the control actions. Lower computation
times indicate a more efficient system.
Resource Utilization: The amount of computational resources
(e.g., memory, processing power) required for operation.

6. Control Effort:
Energy Consumption: The amount of energy used by the system to maintain control. Lower
energy consumption is usually preferable.
Control Signal Smoothness: The smoothness of the control signals, which can be important for
mechanical systems to avoid wear and tear.
These performance indices provide a comprehensive evaluation of an adaptive fuzzy system,
ensuring it meets desired criteria for accuracy, stability, robustness, and efficiency.

Modification of rule base 0:

In fuzzy logic, the modification of rule base 0 typically refers to the process of adjusting or
updating the initial set of rules used in a fuzzy inference system. Here are some notes on the
modification of rule base 0 in fuzzy logic:

Initial Rule Base: Rule base 0 represents the starting point of the fuzzy inference system,
consisting of a set of linguistic rules that define the relationships between inputs and outputs in the
system. These rules are typically formulated based on expert knowledge or empirical data.

Rule Base Evaluation: Before modification, the effectiveness of rule base 0 needs to be
evaluated. This evaluation involves assessing how well the rules capture the underlying system
dynamics and whether they produce accurate outputs for a given set of inputs.

Identifying Deficiencies: Deficiencies in rule base 0 may arise due to various reasons, such as
incomplete or inaccurate domain knowledge, insufficient data, or changes in the system dynamics
over time. These deficiencies can lead to poor performance or limited applicability of the fuzzy
system.

Data Gathering: If the deficiencies are related to inadequate data representation or changes in the
system, additional data gathering may be necessary. This could involve collecting new data
samples or updating existing datasets to better reflect the current state of the system.
Rule Base Modification Techniques:

Rule Addition: New rules can be added to rule base 0 to capture previously unrepresented
relationships between inputs and outputs. These rules can be derived from additional domain
knowledge or acquired through data-driven approaches such as machine learning.

Rule Deletion: Rules that are redundant or irrelevant can be removed from the rule base to
simplify the inference process and improve computational efficiency. This can be done based on
the analysis of rule contribution to the overall system performance.

Rule Modification: Existing rules can be modified to better align with the observed system
behavior or to accommodate changes in the input-output mapping. This may involve adjusting
membership functions, rule weights, or linguistic terms to improve the accuracy of the inference
process.
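The three techniques can be pictured as simple list operations on a rule base held as data, as in the sketch below (Python; the rules and contribution scores are invented for illustration):

# (rule, estimated contribution to system performance)
rules = [
    ("if error is negative then output is positive", 0.90),
    ("if error is zero then output is zero",         0.02),
    ("if error is positive then output is negative", 0.85),
]

# Rule deletion: drop rules whose contribution is negligible.
rules = [(r, s) for r, s in rules if s >= 0.05]

# Rule addition: insert a rule derived from new data or expert knowledge.
rules.append(("if error is large_positive then output is large_negative", None))

# Rule modification: adjust an existing rule's consequent.
rules[0] = ("if error is negative then output is large_positive", rules[0][1])

for rule, score in rules:
    print(rule, score)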

Validation and Testing: After modifying rule base 0, it's essential to validate the updated fuzzy
inference system to ensure that it produces reliable outputs across a range of input scenarios. This
validation typically involves testing the system using a separate dataset or through simulation
experiments.

Iterative Process: Rule base modification is often an iterative process, where the system is
refined through multiple rounds of evaluation, modification, and testing. This iterative approach
allows for continuous improvement and adaptation of the fuzzy logic system to changing
requirements or environments.

Modification of membership functions

Modification of membership functions is a crucial aspect in fine-tuning and optimizing fuzzy


logic systems. Here's a breakdown of the process:

Understanding Membership Functions: Membership functions define the degree of


membership of an input variable to a particular linguistic term (e.g., "low," "medium," "high").
These functions map input values to membership degrees on a scale from 0 to 1, indicating the
degree of belongingness of the input to the linguistic term.

Evaluation of Existing Membership Functions: Before modification, it's essential to evaluate


the performance of the existing membership functions. This evaluation includes analyzing how
well they capture the input-output relationships and whether they accurately represent the system
dynamics.

Identifying Areas for Improvement: Areas for improvement in membership functions may arise
due to various factors, such as inaccuracies in modeling, incomplete domain knowledge, or
changes in the system environment. Common issues include under-representation or over-
representation of certain input ranges, lack of sensitivity to changes, or excessive complexity.

Types of Modifications:

Adjustment of Shape: This involves modifying the parameters of the membership functions to
alter their shape. Parameters such as the width, center, and slope of the functions can be adjusted
to better align with the data distribution or the expert knowledge of the system.
Adding/Removing Functions: New membership functions can be added or existing ones
removed to better cover the input space. This can help improve the granularity of representation
and capture subtle variations in the input-output mapping.

Fusion or Splitting: Membership functions can be fused or split to merge overlapping functions
or divide overly broad functions into smaller segments. This can enhance the discrimination
capability of the fuzzy system and improve its accuracy.

Scaling or Shifting: Scaling involves stretching or compressing membership functions along the
input axis, while shifting involves moving the functions horizontally. These adjustments can help
align the functions with changes in the input data distribution or system dynamics.

Fuzzy Clustering: Data-driven techniques such as fuzzy clustering can be used to automatically
identify optimal membership functions from input-output data. Clustering algorithms can partition
the input space into regions with similar characteristics, and membership functions can be derived
from these clusters.

Validation and Testing: After modifying the membership functions, it's essential to validate the
updated fuzzy logic system using appropriate validation techniques. This validation may involve
testing the system on unseen data, conducting simulation experiments, or comparing the system's
performance against established benchmarks.

Iterative Refinement: Similar to rule base modification, the process of modifying membership
functions is often iterative. It may require multiple rounds of evaluation, adjustment, and
validation to achieve the desired level of performance and accuracy.
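One simple data-driven adjustment is a grid search over an MF parameter against observed membership data, as sketched below (Python with NumPy; the data points, bounds, and grid are invented for illustration):

import numpy as np

def tri(x, a, b, c):
    # Vectorized triangular MF with feet a, c and peak b.
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

x_data = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
mu_target = np.array([0.0, 0.5, 1.0, 0.5, 0.0])  # observed degrees

a, c = 0.0, 20.0  # fixed feet; tune only the peak b
best_b = min(np.linspace(2, 18, 81),
             key=lambda b: float(np.mean((tri(x_data, a, b, c) - mu_target) ** 2)))
print(best_b)  # close to 10, the apparent peak of the observed data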

By carefully modifying membership functions, practitioners can improve the representational


power, sensitivity, and robustness of fuzzy logic systems, making them more effective in
modeling complex and uncertain systems.

Simultaneous modification of rule base and membership functions

Simultaneous modification of both the rule base and membership functions in fuzzy logic is a
comprehensive approach to fine-tuning and optimizing fuzzy inference systems. Here's how it can
be done:

Evaluation of Current System: Before making any modifications, assess the performance of the
existing fuzzy logic system. This involves analyzing the accuracy of the system's output across a
range of input scenarios and identifying areas where improvements are needed.

Identifying Deficiencies: Determine whether deficiencies in the system arise from the rule base,
membership functions, or both. Look for inconsistencies, inaccuracies, or gaps in the system's
ability to capture the underlying input-output relationships.

Rule Base Modification:

Rule Addition/Deletion: Evaluate the existing rules and consider adding new rules or removing
redundant ones to better cover the system's behavior. New rules can be derived from expert
knowledge, data analysis, or machine learning techniques.
Rule Modification: Adjust existing rules to better reflect the system dynamics or accommodate
changes in the input-output mapping. This may involve refining linguistic terms, adjusting rule
weights, or updating the logical structure of the rules.

Membership Function Modification:

Adjustment of Shape: Modify the parameters of the membership functions to improve their
alignment with the data distribution or expert knowledge. Adjustments can include changing the
width, center, slope, or shape of the functions.

Adding/Removing Functions: Consider adding new membership functions or removing


redundant ones to better cover the input space and capture the nuances of the system behavior.

Fusion or Splitting: Merge overlapping membership functions or split overly broad functions to
enhance the discrimination capability and accuracy of the fuzzy system.

Scaling or Shifting: Scale or shift membership functions to better align with changes in the input
data distribution or system dynamics.

Simultaneous Optimization:

Modify rules and membership functions iteratively, considering their combined impact on the
system's performance.

Evaluate how changes in membership functions affect the applicability and effectiveness of
the rules, and vice versa.

Ensure that modifications to both components complement each other and lead to overall
improvements in system accuracy and robustness.

Validation and Testing:

Validate the modified fuzzy logic system using appropriate validation techniques. Test the
system on unseen data or conduct simulation experiments to assess its performance and compare
it against established benchmarks.

Iterative Refinement: The process of simultaneous modification is iterative, requiring multiple


rounds of evaluation, adjustment, and validation to achieve the desired level of performance and
accuracy.

Genetic algorithms

Genetic algorithms give an alternative method to determine the global optimum for complicated
problems having several local optima. To determine the optimum they use stochastic search
procedures borrowed from natural evolution.
Basic concepts of natural genetics: Interest in the application of genetic algorithms has grown significantly in recent years. Compared to traditional search and optimization methods, genetic algorithms are robust, global, and can be applied at small cost, especially when little information is available about the examined system. Since genetic algorithms do not require information about the gradient of the objective function to be minimized, the applied search method, arising from its stochastic nature, is capable of searching the whole solution space and finding the optimum with high probability.

The genetic algorithm (GA) is a global stochastic search method that takes its ideas from close observation of natural biological evolution [47]. Genetic algorithms operate on a population of potential solutions and apply the principle of survival of the fittest: (hopefully) better and better individuals come into being, that is, better approximations of the solution are achieved. For every generation a new set of approximations is determined based on their suitability level (fitness) in the problem space, in such a way that the approximations (individuals) are paired using operators taken from natural genetics. The process leads to a new population of individuals which produce better objective function values (adjust better to the environment) than those individuals from which they were created, similarly to natural adaptation. In what follows, first the basic genetic concepts, which form the basis of genetic algorithms, are summarized.
Gene: The material carrying genetic information; a more or less permanent organizational unit of deoxyribonucleic acid (DNA), a delimited section of the polynucleotide double helix, which alone, or cooperating with other genes within the boundaries specified by environmental conditions, determines the appearance of certain properties or property groups. By remaining unchanged in every successor cell or successor organism through repeated self-duplication, it implements genetic continuity in successive cell generations and generation series. One gene consists of a great number of nucleotides.

Chromosome: Carrier of the inherited material, the genes, organized into longer or shorter spring-like bodies. Chromosomes are embedded in the plasma of the cell nucleus. They keep their individual properties throughout the whole life cycle of the cell, but their material condenses only at cell division enough to be visible under microscopic observation. During observation the chromosomes are separated, sorted into a plane, stained, photographed or transformed into a digitized picture. The separated and sorted picture is the chromosome set (karyogram), peculiar to the creature, in which the chromosomes are ranked according to their shape. Regarding their chemical composition, chromosomes can contain DNA, histones, non-histone proteins and metallic ions. According to their morphological properties, chromosomes can be more or less ranked into groups consisting of pairs. Between identically shaped chromosome pairs, in a certain life period of the cell (at the beginning of meiosis), a mutual attraction occurs. Individual members of the chromosome pairs fit lengthways, and single or multiple crossover happens. The paired chromosomes later separate again. The resulting chromosomes usually resemble the ones that made the connection.
Allele: One of the structurally and functionally altered versions of a gene. Different alternative alleles of the same gene come into existence from each other by mutation, and are transformed into each other the same way. A gene is represented by one of its alleles at a place in the chromosome. In natural populations the most frequent form is usually called the original or wild type. A mutant allele, arising by alteration at some point of the gene, can be of varying strength compared to the wild allele: unable to display a visible effect, displaying a weaker or stronger visible effect than the wild type, an effect opposite to that of the wild type, or an effect qualitatively different from the wild type.

Genotype: The sum of the genetic information stored in the chromosomal genes of the organism, which, in interaction with the plasmon and environmental conditions, determines the outer appearance (phenotype) of the organism.
Phenotype: The sum of the perceptible, determinable (describable and measurable) inner and outer properties, the resultant of the inheritable base and the life conditions.

Individual: The organizational, physiological and reproduction-biological (genetic) unit of the living world, in other words a living being or organism. A characteristic property of the individual is that it separates from its environment, exercises metabolism, and propagates. The sum of individuals of identical origin, arisen via the genital process, is the population, while individuals arisen via a non-genital way cannot be considered independent individuals while they remain connected with their parent. Individuals in nature usually form communities (supra-individual organizations) and fill the biosphere, forming populations, associations and ecological systems.

Population: An organizational unit above the individual level, meaning the sum of individuals sharing the same quality of a certain peculiarity or feature within the species. According to the genetic interpretation, a population is a smaller or larger group of living beings belonging to the same species, in which the possibility of propagation between individuals exists. Every species exists in the form of populations. Due to adaptation to the environment, the populations forming a species more or less differ from each other regarding their phenotype and genotype. The ideal population consists of a large number of individuals; the number of individuals remains constant through generations, because it is not subject to the natural selection caused by the environment.

Species: The fundamental unit of the systematization of living beings and of evolution, which usually consists of populations of living beings similar to each other. By permanent or occasional exchange of genes between the individuals of these populations, continuous variation of several properties arises. The emphasis of the biological species concept falls on the possibility of gene exchange. According to this, the species is a propagation community whose members can exchange genes, and which is separated from similar propagation communities by reproductive isolation.

Breed: The fundamental economic or taxonomic (zoological or botanical) unit within species of cultivated plants and domestic animals. It is a group of individuals of a species which is unambiguously distinguished by certain morphological and physiological properties.

Evolution: The production of living material from lifeless material (biogenesis) and the development of diversity. Evolution is a process unequal in space and time, continuous and irreversible. During evolution organisms arise which can balance disadvantageous environmental effects. Evolution genetics deals with the examination of the mechanics and factors of species arising and metamorphosis. Its fields are mutation, selection, isolation and recombination.
Selection: A naturally or artificially initiated biological process which hinders the survival or propagation of certain individuals or groups, while not hindering others. In population genetics it means the non-random propagation of different phenotypes. Selection is able to change only the composition of those groups in which genotype variance exists. The selection coefficient and the selection pressure play an important role in the characterization of selection. The selection coefficient (s) is measurable, at selection for a quantitative character, as the difference between the phenotype mean values of the base population before selection (M) and of the descendant population (Ms):

s = M - Ms

This difference, depending on the value of h² (the heritability), is inherited only to a certain extent in the descendant generation. Selection pressure is the intensity of the change of the gene frequency from generation to generation as the effect of selection. Selection is much more effective in a large population, while smaller populations are dominated rather by genetic drift.

Recombination (crossover): Exchange between genetic materials containing different genotypes, whose result is the recombined genotype of the descendant, differing from both parents; the development of one or more new combinations of genes from two partly different parental gene sets. Within the cell there is always a chance that partly different but homologous genetic materials (chromosomes, DNA molecules) get so close that exchange can occur. The difference between identical gene positions (alleles) can be smaller (point mutation) or larger (gene mutation). Recombination is the basis of constructing the genetic map, because the frequencies can be transformed into relative distances, and with this the gene positions and connections can be computed. The gene order within a chromosome, the cistrons within a gene, and in these the distance between gene positions, can be determined with the analysis of recombinants. For exchange between chromosomes two types of reaction models, break-and-fusion and pattern selection, are known. The process occurs at the DNA level by means of the working of several enzymes.
Mutation: A change in the hereditary material which is not a result of genetic recombination. A mutant is an individual in which, by means of mutation, at least one gene locus has changed. The mutant can be a new phenotype, but can also be a change of a cell component, invisible from outside. A characteristic property of mutants is that they can be well distinguished in morphological nature (height, number of leaves, spike shape). Biochemical mutants have lost that property of the wild type which can synthesize physiologically important chemical compounds (vitamins, amino acids). Depending on the level of change of the inheritable material, one can distinguish genome mutation (the number of chromosomes changes), chromosome mutation, gene mutation (genetic change leading to a new allele, which does not change the structure of the chromosome), plasmon mutation (change of plasmatic genetic components) and plastidome mutation (extrachromosomal inheritance: inheritable change of plastids). Spontaneous mutations (arising without the use of mutagens) and induced mutations (arising by the effect of physical or chemical factors) can be distinguished. The unit of mutation is the muton, the smallest part of the DNA molecule whose change can result in the development of a mutant organism. This is usually a nucleotide pair of the molecule (cistron, recon). The absence, excess or change of the pending base pair transforms the triplet code and with it the information too, and the change is passed down by means of replication. Mutations have an important role in the development of new species. The rate of spontaneous gene mutations is extremely small.

Migration: The spreading of plant and animal species or their totality, flora and fauna. Gene migration is the transmission of genetic information from one population (emigration) to another population (immigration) by migrant individuals or groups (gene drift).
Fitness (competence value): Under given external conditions, the chance of an individual or individuals, incarnating a given genotype, of staying alive and producing viable descendants which will take part in forming further generations. The ratio with which the individuals with the pending genotype take part in forming the next generations has been found appropriate for expressing its numerical value. As a basis of comparison for determining the numerical value of the fitness, the mean of the population or the values characterizing the other genotypes of the population are used.

Genetic algorithms use ideas and rules taken from natural genetics, with significant simplifications, in a global stochastic optimum-searching procedure. They primarily employ the following principles:

- Let there be a population of individuals. Every individual is a string over an alphabet. The individuals are different.
- Let there be genetic operations, which alter the individuals.
- Let there be a function, which defines a fitness (competence) value for every individual.
- After multiple changes, the rearrangement of the population occurs based on the fitness value (reproduction).

The reproduction, which leads to the survival of chromosomes having a high fitness value and the dying out of those having a smaller value, finally acts so that from generation to generation the chromosomes become better from the point of view of the problem to be solved. Recently three schools of evolutionary algorithms can be distinguished: genetic algorithms, evolutionary strategies and genetic programming. Genetic algorithms were originally developed by J.H. Holland and D.E. Goldberg in the USA. Evolutionary strategies employ a different approach; their founders are I. Rechenberg and H.P. Schwefel in Germany. However, both schools are based on the ideas of evolution. Genetic programming generalizes these methods. If, namely, the original parameter-dependent system, which should be optimized as a function of these parameters, is substituted with a theoretical construction such as a computational rule or a computer program, then the rule or program which optimally solves the formulated problem can be searched for. The possibility for the connection is given by the fact that arithmetic or Boolean expressions can be encoded with chromosomes. In the case of programs the same goes for forks and recursions, whose prefix representation can be encoded with chromosomes. In the case of genetic programming various fitness functions are used, which are based on errors occurring during use. Among these the simplest is the raw fitness. According to it, if the j-th chromosome executes the i-th task with value E[i, j] (e.g., this will be the value of a Boolean expression as its effect) and the expected value is f[i], then the raw fitness is

r[j] = Σ_i | E[i, j] - f[i] |
Principles of genetic algorithms Although genetic algorithms can be used in case of multi-
criterion (multiobjective) optimization problems, hereinafter only the optimization of single
objective function is considered. A scalar criterion optimum problem can be always rephrased to a
minimum problem. Usually several local minima are possible, but the problem is the calculation
of the global minimum.
Constraints can be taken into account as penalty functions embedded in the objective function. The problem is, consequently, the calculation of the global minimum of the unconstrained real function f(x1, ..., xNvar): min f(x1, ..., xNvar). It is assumed that for every variable xi the finite smallest and largest values between which it can change are known. The variables xi are typically real numbers (xi ∈ R), but exceptionally it is allowable that they take only integer values (xi ∈ Z). If xi is real, it can be represented as a floating point number, but given a predefined precision (number of bits PRECI) and lower and upper bounds, it can also be represented as a binary combination in two's complement or Gray code. If xi can be integer only, then a suitable base can be selected which spans its domain and in which an integer value codes it.
In this sense real, binary and integer problems can be distinguished, with significant similarity between the handling of the latter two. Hereinafter only the real and binary cases are addressed. The possible combinations of the natural variables x1, ..., xNvar are the individuals. The real form is the phenotype form; the coded form is the genotype form, or chromosome. In the combination, in the place of the segment representing the variable xi stands the actual allele of the gene.
For the sake of uniformity, individuals represented directly as the real combination of x1, ..., xNvar are also considered chromosomes (in this case, the phenotype is identical to the genotype). A certain number of individuals (Nind) can be dynamically stored within the algorithm in coded (chromosome) form; these make up the population. Individuals must be selected for recombination.
The number of selected individuals is defined by GGAP (generation gap), in such a manner that the number of individuals selected for crossover is the number of individuals of the population multiplied by this factor (Nind * GGAP). For determining the place of the optimum, a global stochastic search method is used, which is based on the principles of selection, recombination, reinsertion, mutation and migration. The steps are realized by stochastic (random number generator-based) algorithms.
At the reinsertion of the individuals created by recombination, a certain number of the best individuals existing prior to selection can be kept unchanged (elitist strategy). This number is indirectly determined by the reinsertion rate. Genetic algorithms using one population (simple genetic algorithm, SGA) and those using several sub-populations (multi-population genetic algorithm, MPGA) are distinguished.
In the case of the multi-population genetic algorithm, migration is possible between sub-populations. MATLAB can be effectively used for the implementation of genetic algorithms. It is practical to implement multi-population genetic algorithms so that the functions realizing selection, recombination, reinsertion and mutation remain the same as in the case of a single population. In this chapter we become acquainted with the concepts of the GA toolbox of Chipperfield.
Objective functions should be of unified structure and exchangeable. In the case of binary coding, before calculating the actual value of the objective function, the genotype (chromosome) values of the coded variables must be converted to phenotype (real) values, for which a conversion routine (binary string to real value, bs2rv) is needed.
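To make this genotype-to-phenotype step concrete, the following small Python sketch decodes a standard (non-Gray) binary string to a real value with linear scaling between given bounds. It only mirrors the idea behind bs2rv; it is not the toolbox's actual implementation, and the bit pattern is an illustrative assumption.

def binary_to_real(bits, lb, ub):
    # interpret the bit list as an unsigned integer (most significant bit first)
    value = 0
    for b in bits:
        value = (value << 1) | b
    # scale linearly so that 00...0 maps to lb and 11...1 maps to ub
    return lb + (ub - lb) * value / (2 ** len(bits) - 1)

# a 20-bit chromosome segment decoded into the interval [-512, 512]
print(binary_to_real([1, 0, 1, 1, 0, 1, 0, 0, 1, 1,
                      0, 1, 0, 1, 1, 0, 0, 1, 0, 1], -512.0, 512.0))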
The real value of the objective function (objfun) is usually converted, before selection, reinsertion and migration, to a fitness (competence) value transformed into a limited positive interval, so that sophisticated algorithms need not take the usually signed value of the objective function into consideration. Different kinds of transformations are possible. For practical reasons it is expedient to limit the number of generations (MAXGEN) at which the algorithm stops, to guard against the divergence of the global stochastic optimum search.
It is practical to generate the starting population randomly (create binary population, crtbp; create real-valued population, crtrp). For implementing the calculation of fitness, selection, recombination, reinsertion, mutation and migration it is practical to offer alternative possibilities. It is also practical to automatically derive the parameters which influence running time and memory consumption from the dimension of the problem (DIM = Nvar).
Fig. 2.1 shows the theoretical structure of a simple (one-population) binary genetic algorithm. In the case of a binary population, FieldD has as many columns as there are variables, and for every variable it defines the precision (len), the lower (lb) and upper (ub) bound, the coding (1 = Gray, 0 = standard binary), the scale (1 = logarithmic resolution, 0 = binary resolution), whether the lower bound is part of the represented interval (if lbin = 1 it is part of it, otherwise not), and whether the upper bound is part of the represented interval (if ubin = 1 it is part of it, otherwise not).

The rep function (which did not exist in earlier versions of MATLAB) replicates the matrix given as its first parameter in the horizontal and vertical directions. Naturally, the parameters of the multi-population genetic algorithm are more numerous. Different alternative functions can be selected for the implementation of selection, recombination and mutation. In the real-valued case, FieldD consists only of the upper and lower bounds.
NIND=40;     % number of individuals
MAXGEN=300;  % maximum number of generations
NVAR=20;     % number of variables
PRECI=20;    % precision (bits per variable)
GGAP=0.9;    % generation gap
% Build field descriptor
FieldD=[rep([PRECI],[1,NVAR]); rep([-512;512],[1,NVAR]); rep([1;0;1;1],[1,NVAR])];
% Initialize population
Chrom=crtbp(NIND,NVAR*PRECI);
gen=0;
% Evaluate initial population
ObjV=objfun1(bs2rv(Chrom,FieldD));
% Generational loop
while gen<MAXGEN
  % Assign fitness values to entire population
  FitnV=ranking(ObjV);
  % Select individuals for breeding
  SelCh=select('sus',Chrom,FitnV,GGAP);
  % Recombine individuals (crossover)
  SelCh=recombin('xovsp',SelCh,0.7);
  % Apply mutation
  SelCh=mut(SelCh);
  % Evaluate offspring, call objective function
  ObjVSel=objfun1(bs2rv(SelCh,FieldD));
  % Reinsert offspring into population
  [Chrom, ObjV]=reins(Chrom,SelCh,1,1,ObjV,ObjVSel);
  % Increment generation counter
  gen=gen+1;
end

Figure 1. Structure of a simple genetic algorithm (SGA).
Figure: Structure of the ANFIS regulator.

Through this structure we can see five layers, described as follows:
Layer 1: The function of a node at this layer is identical to the membership function in the fuzzification process: O1,i = μAi(x).

Layer 2: Each node generates the degree of activation (firing strength) of a rule: wi = μAi(x1) * μBi(x2).

Layer 3: Each node of this layer is a circle node denoted by N. The output of the node represents the normalized activation degree according to the ith rule: vi = wi / (w1 + w2).

Layer 4: Each node of this layer is a square node with the function O4,i = vi * fi = vi * (ai*x1 + bi*x2 + ci), where vi is the output of node i of layer 3 and {ai, bi, ci} is the set of update (consequent) parameters.

Layer 5: In this layer, there is only one node, which determines the overall output by summing the incoming signals: y = Σi vi * fi.
Considering that x1 and x2 are the position error e and its derivative Δe: [x1, x2] = [e, Δe], we associate two fuzzy sets with each of the inputs x1 and x2, namely N (Negative) and P (Positive). μN and μP represent the degrees of membership of the variables xi with respect to the fuzzy subsets Ai and Bi, defined by the membership functions shown in Figure 2.

Figure 2. Membership functions.
The following is an algorithmic description of an Adaptive Neuro-Fuzzy Inference System (ANFIS), one of the most popular neuro-fuzzy systems.

Adaptive Neuro-Fuzzy Inference System (ANFIS):
Initialization:
- Initialize the parameters of the fuzzy system, including membership function parameters and rule weights.
- Determine the structure of the fuzzy inference system, such as the number of input variables, linguistic terms, and rules.

Forward Pass (for each training sample):
- Fuzzification: calculate the degree of membership for each input variable using the appropriate membership functions.
- Rule evaluation: compute the firing strength of each rule, which represents the degree to which the rule is activated given the input values.
- Consequent parameters: calculate the output of each rule by multiplying the firing strength by the consequent parameters (output membership function parameters).
- Normalization: normalize the firing strengths of all rules to ensure that they sum up to 1.

Backward Pass (for each training sample):
- Error calculation: compute the error between the actual output and the output generated by the fuzzy system.
- Gradient descent: update the parameters of the membership functions and rule weights using gradient descent optimization, adjusting the parameters to minimize the error between the actual and predicted outputs.
- Learning rate: apply a learning rate to control the step size of parameter updates and prevent overshooting.

Stopping Criterion:
- Check for convergence or apply a stopping criterion to halt the training process.

Repeat Iterations:
- Repeat the forward and backward passes for multiple epochs until the stopping criterion is met or convergence is achieved.

Validation and Testing:
- After training, validate the ANFIS model using a separate validation dataset.
- Assess the performance of the model using appropriate evaluation metrics such as mean squared error, accuracy, or correlation coefficients.
- Test the model on unseen data to evaluate its generalization ability and robustness.

Fine-Tuning:
- Optionally, fine-tune the ANFIS model by adjusting hyperparameters, such as the number of linguistic terms, rules, or the learning rate, to optimize performance further.

Deployment:
- Deploy the trained ANFIS model for real-world applications, such as control systems, pattern recognition, forecasting, or decision support systems.

A minimal numerical sketch of the forward pass is given below.
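As a minimal illustration of the forward pass (layers 1-5) for the two-input, two-label case described above, the following Python/NumPy fragment is a hedged sketch, not a reference implementation: Gaussian membership functions are assumed, and the membership parameters and consequent parameters (a, b, c) are arbitrary illustrative values.

import numpy as np

def gauss_mf(x, c, sigma):
    # Layer 1: Gaussian membership function
    return np.exp(-0.5 * ((x - c) / sigma) ** 2)

def anfis_forward(x1, x2, mf_params, conseq):
    # Layer 1: fuzzification of both inputs (two labels per input: N, P)
    mu1 = [gauss_mf(x1, c, s) for c, s in mf_params["x1"]]
    mu2 = [gauss_mf(x2, c, s) for c, s in mf_params["x2"]]
    # Layer 2: firing strength of each rule (product t-norm), one rule per label pair
    w = np.array([m1 * m2 for m1 in mu1 for m2 in mu2])
    # Layer 3: normalized firing strengths
    v = w / w.sum()
    # Layer 4: weighted first-order consequents f_i = a_i*x1 + b_i*x2 + c_i
    f = np.array([a * x1 + b * x2 + c for a, b, c in conseq])
    # Layer 5: overall output is the sum of the weighted consequents
    return float(np.sum(v * f))

# Illustrative parameters: (center, sigma) for N and P on each input,
# and (a, b, c) for each of the four rules -- all assumed values.
mf_params = {"x1": [(-1.0, 0.7), (1.0, 0.7)], "x2": [(-1.0, 0.7), (1.0, 0.7)]}
conseq = [(0.5, 0.1, 0.0), (0.4, -0.2, 0.1), (-0.3, 0.2, 0.0), (-0.5, -0.1, 0.2)]
print(anfis_forward(0.3, -0.2, mf_params, conseq))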
UNIT: III
Artificial Neural Networks:
Introduction - History of neural networks - Multilayer perceptrons - Backpropagation algorithm and its variants - Different types of learning, examples.
Artificial Neural Networks (ANNs) are computational models inspired by the structure and
function of biological neural networks in the human brain. They consist of interconnected nodes,
called neurons or units, organized into layers. ANNs are capable of learning complex patterns and
relationships from data, making them powerful tools for tasks such as classification, regression,
pattern recognition, and optimization. Here's an overview of artificial neural networks:
History of Artificial Neural Networks:

The history of neural networking arguably began in the late 1800s with scientific endeavors to study the activity of the human brain. In 1890, William James published the first work about brain activity patterns. In 1943, McCulloch and Pitts created a model of the neuron that is still used today in artificial neural networks. This model is segmented into two parts:

- A summation of weighted inputs.
- An output function of the sum.

In 1949, Donald Hebb published "The Organization of Behavior," which illustrated a law for synaptic neuron learning. This law, later known as Hebbian learning in honor of Donald Hebb, is one of the most straightforward and simple learning rules for artificial neural networks.

In 1951, Marvin Minsky made the first Artificial Neural Network (ANN) while working at Princeton.

In 1958, "The Computer and the Brain" was published, a year after John von Neumann's death. In that book, von Neumann proposed numerous extreme changes to how analysts had been modelling the brain.
Components of Artificial Neural Networks:

Neurons/Nodes: Neurons are the basic processing units in ANNs. They receive input signals, perform a computation, and produce an output signal. Each neuron typically applies an activation function to the weighted sum of its inputs to produce its output.

Layers: ANNs are organized into layers, including an input layer, one or more hidden layers, and an output layer. The input layer receives the raw input data, while the output layer produces the final output of the network. Hidden layers perform intermediate computations and extract features from the input data.

Connections/Weights: Connections between neurons are represented by weighted edges. Each connection has an associated weight that determines its strength. During training, these weights are adjusted to minimize the difference between the predicted outputs of the network and the true outputs in the training data.

Activation Functions: Activation functions introduce nonlinearity into the network, allowing it to learn complex relationships in the data. Common activation functions include sigmoid, hyperbolic tangent (tanh), Rectified Linear Unit (ReLU), and softmax. A short sketch of these functions follows.
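For concreteness, these common activation functions can be written in a few lines of NumPy; this is a generic sketch, not tied to any particular library API.

import numpy as np

def sigmoid(z):
    # squashes its input to the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # squashes its input to the interval (-1, 1)
    return np.tanh(z)

def relu(z):
    # passes positive values through, zeroes out negatives
    return np.maximum(0.0, z)

def softmax(z):
    # converts a vector of scores into probabilities that sum to 1
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()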
Training Artificial Neural Networks:

Forward Propagation: During forward propagation, input data is fed through the network, and computations are performed layer by layer to produce the final output. Each neuron computes a weighted sum of its inputs, applies an activation function, and passes the result to the next layer.

Loss/Cost Function: A loss function measures the difference between the predicted outputs of the network and the true outputs in the training data. Common loss functions include mean squared error (MSE) for regression tasks and cross-entropy loss for classification tasks.

Backpropagation: Backpropagation is used to update the weights of the network based on the gradient of the loss function with respect to the weights. The gradient is computed using the chain rule of calculus, and the weights are adjusted using an optimization algorithm such as stochastic gradient descent (SGD) or its variants.

Epochs and Mini-Batches: Training is typically performed iteratively over multiple epochs, where each epoch involves one pass through the entire training dataset. To improve efficiency and stability, training data is often divided into mini-batches, and weight updates are computed based on the average gradient over each mini-batch.
Types of Artificial Neural Networks:

Feedforward Neural Networks (FNNs): In FNNs, information flows in one direction, from the input layer to the output layer, without any feedback loops. They are suitable for tasks such as regression and classification.

Recurrent Neural Networks (RNNs): RNNs have connections that form directed cycles, allowing them to retain information over time. They are well-suited for sequence modeling tasks such as natural language processing (NLP), speech recognition, and time series prediction.

Convolutional Neural Networks (CNNs): CNNs are designed to process structured grid-like data, such as images or audio spectrograms. They use convolutional layers to extract spatial hierarchies of features from the input data.

Generative Adversarial Networks (GANs): GANs consist of two neural networks, a generator and a discriminator, trained adversarially to generate realistic samples from a given distribution. They are used for tasks such as image generation, style transfer, and data augmentation.
Applications of Artificial Neural Networks:

Image Recognition and Computer Vision: CNNs are widely used for tasks such as object detection, image classification, and facial recognition.

Natural Language Processing (NLP): RNNs and transformer-based models such as the Transformer and BERT are used for tasks such as machine translation, text summarization, and sentiment analysis.

Speech Recognition: RNNs and hybrid models combining CNNs and RNNs are used for speech recognition and synthesis.

Healthcare: ANNs are used for medical image analysis, disease diagnosis, drug discovery, and personalized treatment planning.

Finance: ANNs are used for stock market prediction, fraud detection, credit scoring, and algorithmic trading.

Autonomous Vehicles: ANNs are used for perception, decision-making, and control in autonomous vehicles, enabling tasks such as object detection, lane keeping, and path planning.
Multilayer Perceptrons

Multilayer Perceptrons (MLPs) are a type of artificial neural network (ANN) characterized by having multiple layers of nodes, each layer connected to the next. They are also sometimes referred to as feedforward neural networks, or deep neural networks if they have many hidden layers. Here's a breakdown of the components and workings of an MLP:

Input Layer: This layer contains neurons (nodes) corresponding to the input features of the dataset. Each neuron represents a feature, and the number of neurons in this layer depends on the dimensionality of the input data.

Hidden Layers: These are the intermediate layers between the input and output layers. Each hidden layer consists of multiple neurons, and each neuron is connected to every neuron in the previous layer. The number of hidden layers and the number of neurons in each layer are design choices and can vary depending on the complexity of the problem.

Weights and Biases: Each connection between neurons in adjacent layers is associated with a weight, which represents the strength of the connection. Additionally, each neuron has an associated bias, which allows the network to learn more complex functions. The weights and biases are the parameters of the network that are learned during the training process.

Activation Functions: Neurons in each layer (except the input layer) apply an activation function to the weighted sum of their inputs plus the bias. Activation functions introduce non-linearity to the network, enabling it to learn complex relationships in the data. Common activation functions include sigmoid, tanh, ReLU (Rectified Linear Unit), and softmax.

Output Layer: This layer produces the final output of the network. The number of neurons in the output layer depends on the nature of the task. For regression tasks, there may be a single neuron, or multiple neurons if predicting multiple continuous values. For classification tasks, each neuron typically represents a class, and the output is often passed through a softmax function to convert the raw output into probability scores.

Forward Propagation: During the forward pass, input data is passed through the network, layer by layer, with each layer performing a weighted sum of the inputs followed by an activation function. This process continues until the output layer is reached, and the final output of the network is produced.

Backpropagation: After the forward pass, the network's output is compared to the true target values using a loss function, which measures the difference between the predicted and actual values. Backpropagation is then used to compute the gradients of the loss function with respect to the weights and biases of the network. These gradients are used to update the weights and biases through optimization algorithms like gradient descent, with the goal of minimizing the loss function and improving the network's performance.
Back propagation algorithm and its Variants

What is a backpropagation algorithm?

Backpropagation, or backward propagation of errors, is an algorithm that is designed to test for errors working back from output nodes to input nodes. It is an important mathematical tool for improving the accuracy of predictions in data mining and machine learning. Essentially, backpropagation is an algorithm used to quickly calculate derivatives in a neural network, which are the changes in output due to tuning and adjustments.

There are two leading types of backpropagation networks:

- Static backpropagation. Static backpropagation is a network developed to map static inputs to static outputs. Static networks can solve static classification problems, such as optical character recognition (OCR).

- Recurrent backpropagation. The recurrent backpropagation network is used for fixed-point learning. This means that during neural network training, the weights are numerical values that determine how much nodes (also referred to as neurons) influence output values. They are adjusted so that the network can achieve stability by reaching a fixed value.

The key difference here is that static backpropagation offers instant mapping, while recurrent backpropagation does not.
What is a backpropagation algorithm in a neural network?

Artificial neural networks (ANNs) and deep neural networks use backpropagation as a learning algorithm to compute the gradients needed by gradient descent, an optimization algorithm that guides the search toward the minimum (or maximum) of a function.

In a machine learning context, gradient descent helps the system minimize the gap between desired outputs and achieved system outputs. The algorithm tunes the system by adjusting the weight values for various inputs to narrow the difference between outputs. This difference is also known as the error between the two.

More specifically, a gradient descent algorithm uses a gradual process to provide information on how a network's parameters need to be adjusted to reduce the disparity between the desired and achieved outputs. An evaluation metric called a cost function guides this process. The cost function is a mathematical function that measures this error. The algorithm's goal is to determine how the parameters must be adjusted to reduce the cost function and improve overall accuracy.

Backpropagation is a fundamental algorithm used for training artificial neural networks, including multilayer perceptrons (MLPs). It is a supervised learning algorithm that adjusts the weights of the connections in the network to minimize the difference between the predicted output and the actual target output. The backpropagation algorithm consists of two main phases: forward propagation and backpropagation of errors. Here's an overview of each phase:
Forward Propagation:
During forward propagation, input data is fed into the network, and the activations of neurons in each layer are computed sequentially until the output layer is reached. The activations are computed by applying the weighted sum of inputs to each neuron in a layer and then passing the result through an activation function, typically a non-linear function like sigmoid, tanh, or ReLU. The output of the network is then compared to the true target values using a loss function, which measures the difference between the predicted output and the actual output.

Backpropagation of Errors:
In this phase, the error gradient with respect to the network's output is computed using the chain rule of calculus. The error gradient quantifies how much each weight in the network contributes to the overall error. The error gradient is then propagated backward through the network, layer by layer, using the chain rule to compute the gradient of the loss function with respect to the weights of each layer. Finally, the weights of the network are updated using an optimization algorithm like gradient descent, which adjusts the weights in the direction that minimizes the loss function.
Variants of the backpropagation algorithm and improvements to its efficiency and effectiveness have been developed over the years. Some notable variants include:

Stochastic Gradient Descent (SGD): Instead of computing the gradient of the loss function over the entire dataset, SGD computes the gradient using only a single randomly selected data point (or a small batch of data points) at a time. This can lead to faster convergence and is particularly useful for large datasets.

Mini-Batch Gradient Descent: This approach combines the advantages of SGD and batch gradient descent by computing the gradient using a small random subset of the dataset (a mini-batch) at each iteration. It strikes a balance between the efficiency of SGD and the stability of batch gradient descent.

Momentum: Momentum is a technique that accelerates SGD by adding a fraction of the update vector from the previous iteration to the current update vector. This helps SGD to navigate through shallow minima and accelerate convergence.

Adaptive Learning Rate Methods: Algorithms like Adagrad, RMSprop, and Adam adapt the learning rate for each parameter based on past gradients, improving the convergence speed and stability of the training process.

Weight Regularization: Techniques like L1 and L2 regularization add penalty terms to the loss function to prevent overfitting by encouraging smaller weights. This helps the network generalize better to unseen data.

Dropout: Dropout is a regularization technique that randomly drops a fraction of neurons (along with their connections) during training, forcing the network to learn redundant representations and reducing overfitting.

These variants and improvements to the backpropagation algorithm have been instrumental in advancing the field of deep learning and improving the performance of neural network models across various tasks and domains. A small sketch of the momentum update is given below.
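As an illustration of the momentum variant, here is a minimal Python sketch of the update rule; the momentum coefficient 0.9, learning rate 0.01, and the gradient values are arbitrary illustrative choices, not prescribed by the text.

import numpy as np

def momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    # velocity accumulates an exponentially decaying sum of past gradients
    velocity = beta * velocity - lr * grad
    # the weight moves along the accumulated velocity, not just the current gradient
    return w + velocity, velocity

# usage: keep a velocity array the same shape as the weights
w = np.zeros(3)
v = np.zeros(3)
grad = np.array([0.2, -0.1, 0.05])   # gradient from backpropagation (illustrative)
w, v = momentum_step(w, grad, v)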
Let's break down the backpropagation algorithm step by step with a simple example. Suppose we have a small neural network with one hidden layer, as described below. In this example:

- We have one input layer with two neurons (x1 and x2).
- We have one hidden layer with two neurons (h1 and h2).
- We have one output neuron (y).
- The weights of the connections are represented by w1, w2, w3, w4, w5, and w6.
- The biases of the neurons are represented by b1, b2, and b3.

Let's assume our forward pass has already been completed, and we have calculated the output of the network, y_hat. Now we want to use backpropagation to update the weights and biases based on the error between y_hat and the actual target y.
Here are the steps of the backpropagation algorithm:

1. Compute the Error Gradient at the Output Layer:
Compute the error between the predicted output y_hat and the actual target y, and the derivative of the loss function with respect to the output neuron's activation. For example, if we are using mean squared error (MSE) loss, the derivative would be:

δ_output = 2 * (y_hat - y)

2. Backpropagate the Error Gradient to the Hidden Layer:
Use the chain rule to compute the error gradient at the hidden layer neurons: compute the derivative of the activation function of each hidden neuron with respect to its input, and then the error gradient with respect to the inputs to the hidden layer neurons. For each hidden neuron, this involves multiplying the derivative of its activation function by the sum of the weighted errors from the neurons in the next layer. For example, if using the sigmoid activation function:

δ_h1 = δ_output * w5 * sigmoid_derivative(h1)
δ_h2 = δ_output * w6 * sigmoid_derivative(h2)

3. Update Weights and Biases:
Use the error gradients to update the weights and biases of the network: compute the gradient of the loss function with respect to each weight and bias, then update each weight and bias using an optimization algorithm like gradient descent. For example, the update rule for a weight might be:

w_new = w_old - learning_rate * δ * input

where learning_rate is a hyperparameter controlling the size of the updates.

4. Repeat for Each Training Example:
Iterate through the entire training dataset, performing steps 1-3 for each example. This process is typically repeated for multiple epochs until the network's performance converges or reaches a satisfactory level. A runnable version of this 2-2-1 example is sketched below.
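The following is a minimal, self-contained NumPy sketch of this 2-2-1 network trained by the steps above. The training data (simple OR-like targets), initial weights, learning rate, and epoch count are all illustrative assumptions; note that the output neuron here is also sigmoidal, so its activation derivative appears in the output delta.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(a):
    # derivative of the sigmoid expressed in terms of its output a
    return a * (1.0 - a)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 2))   # w1..w4: input -> hidden
b1 = np.zeros(2)               # b1, b2
W2 = rng.normal(size=2)        # w5, w6: hidden -> output
b2 = 0.0                       # b3
lr = 0.1

# illustrative training set: 4 samples of (x1, x2) -> y (OR-like targets)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
Y = np.array([0.0, 1.0, 1.0, 1.0])

for epoch in range(5000):
    for x, y in zip(X, Y):
        # forward pass
        h = sigmoid(W1 @ x + b1)       # hidden activations h1, h2
        y_hat = sigmoid(W2 @ h + b2)   # network output
        # step 1: output gradient for MSE loss (times output activation derivative)
        d_out = 2.0 * (y_hat - y) * sigmoid_derivative(y_hat)
        # step 2: hidden-layer gradients via the chain rule
        d_h = d_out * W2 * sigmoid_derivative(h)
        # step 3: gradient-descent updates of all weights and biases
        W2 -= lr * d_out * h
        b2 -= lr * d_out
        W1 -= lr * np.outer(d_h, x)
        b1 -= lr * d_h

print([round(float(sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)), 2) for x in X])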
Different types of learning, examples

Objective of Learning:
There are many varieties of neural networks. In the final analysis, as we have discussed in network modeling, all neural networks do one or more of the following:

- Pattern classification
- Pattern completion
- Optimization
- Data clustering
- Approximation
- Function evaluation
A neural network, in any of the previous tasks, maps a set of inputs to a set of outputs. This
nonlinear mapping can be thought of as a multidimensional mapping surface.
The objective of learning is to mold the mapping surface according to a desired
response. A network can learn when training is used, or it can also learn in the absence
of training. The difference between supervised and unsupervised training is that, in the
former case, external prototypes are used as target outputs for specific inputs, and the
network is given a learning algorithm to follow and calculate new connection weights that
bring the output closer to the target output. Unsupervised learning is the sort of learning
that takes place without a teacher. For example, when you are finding your way out of a
labyrinth, no teacher is present. You learn from the responses or events that develop as you
try to feel your way through the maze. For neural networks, in the unsupervised case, a
learning algorithm may be given but target outputs are not given. In such a case, data input
to the network gets clustered together; similar input stimuli cause similar responses.
When a neural network model is developed and an appropriate learning algorithm is
proposed, it would be based on the theory supporting the model. Since the dynamics of the
operation of the neural network is under study, the learning equations are initially
formulated in terms of differential equations. After solving the differential equations, and
using any initial conditions that are available, the algorithm could be simplified to consist
of an algebraic equation for the changes in the weights. These simple forms of learning
equations are available for your neural networks.
At this point of our discussion you need to know what learning algorithms are available,
and what they look like. We will now discuss two main rules for learning—Hebbian
learning, used with unsupervised learning and the delta rule, used with supervised
learning. Adaptations of these by simple modifications to suit a particular context generate
many other learning rules in use today. Following the discussion of these two rules, we
present variations for each of the two classes of learning: supervised learning and
unsupervised learning.
Hebb’s Rule:
Learning algorithms are usually referred to as learning rules. The foremost such rule is due
to Donald Hebb. Hebb’s rule is a statement about how the firing of one neuron, which has
a role in the determination of the activation of another neuron, affects the first neuron’s
influence on the activation of the second neuron, especially if it is done in a repetitive
manner. As a learning rule, Hebb’s observation translates into a formula for the difference
in a connection weight between two neurons from one iteration to the next, as a constant
[mu] times the product of activations of the two neurons. How a connection weight is to be
modified is what the learning rule suggests. In the case of Hebb’s rule, it is adding the
quantity [mu]aiaj, where ai is the activation of the ith neuron, and aj is the activation of the jth
neuron to the connection weight between the ith and jth neurons. The constant [mu] itself is
referred to as the learning rate. The following equation using the notation just
described, states it succinctly:
[Delta]wij = [mu]aiaj
As you can see, the learning rule derived from Hebb’s rule is quite simple and is used in
both simple and more involved networks. Some modify this rule by replacing the quantity ai with its deviation from the average of all the a's and, similarly, replacing aj by a corresponding quantity. Such rule variations can yield rules better suited to different situations.
For example, the output of a neural network being the activations of its output layer
neurons, the Hebbian learning rule in the case of a perceptron takes the form of adjusting
the weights by adding [mu] times the difference between the output and the target.
Sometimes a situation arises where some unlearning is required for some neurons. In this
case a reverse Hebbian rule is used in which the quantity [mu]aiaj is subtracted from the
connection weight under question, which in effect is employing a negative learning rate.
In the Hopfield network of Chapter 1, there is a single layer with all neurons fully
interconnected. Suppose each neuron’s output is either a + 1 or a – 1. If we take [mu] = 1
in the Hebbian rule, the resulting modification of the connection weights can be described
as follows: add 1 to the weight, if both neuron outputs match, that is, both are +1 or –1.
And if they do not match (meaning one of them has output +1 and the other has –1), then
subtract 1 from the weight.
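A minimal Python sketch of this Hebbian update for a small fully interconnected layer of +1/-1 units, with [mu] = 1 as in the text (the layer size and pattern are illustrative):

import numpy as np

def hebbian_update(W, a, mu=1.0):
    # Delta w_ij = mu * a_i * a_j: matching outputs (+1/+1 or -1/-1) add mu,
    # mismatched outputs subtract mu
    dW = mu * np.outer(a, a)
    np.fill_diagonal(dW, 0.0)   # no self-connections
    return W + dW

W = np.zeros((4, 4))
pattern = np.array([1, -1, 1, 1])   # illustrative +1/-1 activations
W = hebbian_update(W, pattern)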
Delta Rule:
The delta rule is also known as the least mean squared error rule (LMS). You first calculate the square of the errors between the target or desired values and the computed values, and then take the average to get the mean squared error. This quantity is to be minimized. For this, realize that it is a function of the weights themselves, since the computation of output uses them. The set of values of weights that minimizes the mean squared error is what is needed for the next cycle of operation of the neural network. Having worked this out mathematically, and having compared the weights thus found with the weights actually used, one determines their difference and gives it in the delta rule, each time weights are to be updated. So the delta rule, which is also the rule used first by Widrow and Hoff in the context of learning in neural networks, is stated as an equation defining the change in the weights to be effected.

Suppose you fix your attention on the weight on the connection between the ith neuron in one layer and the jth neuron in the next layer. At time t, this weight is wij(t). After one cycle of operation, this weight becomes wij(t + 1). The difference between the two is wij(t + 1) - wij(t), and is denoted by [Delta]wij. The delta rule then gives [Delta]wij as:

[Delta]wij = 2[mu]xi(desired output value - computed output value)j

Here, [mu] is the learning rate, which is positive and much smaller than 1, and xi is the ith component of the input vector.
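A minimal Python sketch of one delta-rule update, assuming a single linear output neuron; the learning rate and vectors are illustrative values.

import numpy as np

def delta_rule_step(w, x, desired, mu=0.01):
    computed = w @ x   # linear output of the neuron
    # Delta w_i = 2 * mu * x_i * (desired - computed)
    return w + 2.0 * mu * x * (desired - computed)

w = np.zeros(3)
x = np.array([0.5, -1.0, 2.0])   # illustrative input vector
w = delta_rule_step(w, x, desired=1.0)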
Supervised Learning:
Supervised neural network paradigms to be discussed include:

- Perceptron
- Adaline
- Feedforward Backpropagation network
- Statistically trained networks (Boltzmann/Cauchy machines)
- Radial basis function networks

The Perceptron and the Adaline use the delta rule; the only difference is that the Perceptron has binary output, while the Adaline has continuous valued output. The Feedforward Backpropagation network uses the generalized delta rule.
Unsupervised Learning:
In unsupervised learning, the system learns patterns and relationships in the data without explicit supervision. It identifies hidden structures or clusters within the data.
Example: Clustering similar customer profiles in a retail dataset based on purchasing behavior without knowing in advance the categories or labels for the clusters.

Reinforcement Learning:
Reinforcement learning involves learning optimal actions to take in a given situation to maximize a cumulative reward signal. The system learns through trial and error, receiving feedback from the environment.
Example: Teaching a fuzzy controller to navigate a maze by rewarding successful movements towards the goal and penalizing collisions or wrong turns.

Semi-Supervised Learning:
Semi-supervised learning combines elements of supervised and unsupervised learning. It leverages a small amount of labeled data along with a larger pool of unlabeled data to improve learning performance.
Example: Using a small set of labeled images of different fruits along with a larger set of unlabeled images to train a fuzzy classifier to recognize various fruits in images.

Active Learning:
Active learning involves an interactive process where the system selects the most informative data points to query labels for from an oracle (a human expert or another source). This helps maximize learning efficiency.
Example: An autonomous vehicle using fuzzy logic selects uncertain or ambiguous driving scenarios to present to a human operator for clarification, thereby improving its understanding of complex traffic situations.

Online Learning:
Online learning, also known as incremental learning, involves updating the model continuously as new data becomes available. It adapts to changes in the environment over time.
Example: A fuzzy control system for an industrial process continuously adjusts its parameters based on real-time sensor data to maintain optimal performance despite changing operating conditions.
UNIT: IV
Mapping and Recurrent Networks:
Counter propagation - Self-organizing Map - Cognitron and Neocognitron - Hopfield Net - Kohonen Nets - Grossberg Nets - ART-I, ART-II, reinforcement learning.
Counter propagation

The counterpropagation network is a supervised neural network that can be used for multimodal processing but is not trained using the backpropagation rule. This network has been specifically developed to provide bidirectional mapping between input and output training patterns, and for this reason it can be used in multimodal processing to associate different modal data sets. Counterpropagation networks typically converge much more quickly than multilayer perceptron neural networks. Consequently, counterpropagation networks are preferred over multilayer perceptron neural networks when the data sets to be processed are large and time is critical.

The counterpropagation network architecture is an example of the third category of supervised crossmodal neural networks, in which there are no separate modal layers and all processing is carried out by a common multimodal hidden layer. In the counterpropagation network, the hidden layer is trained using Kohonen's self-organising learning rule, and the output layer, which maps the output of the hidden layer to target output values, is trained using Grossberg's outstar learning algorithm. The hidden layer is referred to as the Kohonen layer, whilst the output layer is referred to as the Grossberg layer.

There are two types of counterpropagation networks: full and forward-only. Full counterpropagation networks are designed to learn bidirectional mappings between two sets of vectors, whilst forward-only counterpropagation networks are trained to provide the mapping in only one direction. We consider the training of the forward-only network first, followed later by a description of the full counterpropagation network.
Forward-only counterpropagation

Figure: Forward-only counterpropagation architecture (Kohonen layer followed by Grossberg layer).

To train the forward-only counterpropagation network (Figure), examples of the desired mapping are presented to the network. Each example consists of an input vector x ∈ R^(n×1) and an output vector y ∈ R^(m×1). The weights in the Kohonen and Grossberg layers are trained independently. First, input vectors are applied to the network, and the neurons in the hidden layer are trained through Kohonen's self-organising learning. For each input, the neuron with weights closest to the input pattern wins, and its weight vector wj is updated in accordance with the equation given below.
Full counterpropagation

The full counterpropagation network differs from the forward-only counterpropagation network in that both the hidden layer and the output layer have two sets of weights, one for the x input vectors and the other for the y output vectors.

Figure: Full counterpropagation network (adapted from Ham and Kostanic, 2001).

The Kohonen learning rule is used to update the hidden layer neuron weights: both weight sets of the winning neuron i are moved toward the corresponding x and y training vectors, with separate learning rate parameters for the two sets. As in the forward-only counterpropagation network, the winner generates an output 1, with the outputs of the other neurons being set to 0.
Forward-only counterpropagation training

To train the forward-only counterpropagation network, examples of the desired mapping are presented to the network. Each example consists of an input vector x and an output vector y. The weights in the Kohonen and Grossberg layers are trained independently. First, input vectors are applied to the network, and the neurons in the hidden layer are trained through Kohonen's self-organising learning. For each input, the neuron with weights closest to the input pattern wins, and its weight vector wj is updated in accordance with the equation (given here in its standard form, since the original formula was lost in extraction):

wj(new) = wj(old) + a (x - wj(old))     (1)

where a is the learning rate. Other neurons in the hidden layer do not adjust their weights. After the Kohonen layer has been trained, input-output vector pairs are then applied to the network, and the output layer is trained in accordance with Grossberg's learning rule. For each applied input, winner-take-all competition ensues between the Kohonen layer neurons, and the winner generates an output 1, with the outputs of the other neurons being set to 0. The weights in the output layer are then updated in accordance with the Grossberg learning rule (again in its standard outstar form):

uji(new) = uji(old) + b (yi - uji(old)) zj     (2)

where uji is the weight between the jth second-layer (Kohonen) neuron and the ith neuron in the output layer; b is the learning rate parameter for the output layer, and zj is the output of the jth neuron in the Kohonen layer. A compact code sketch of this two-phase training follows.
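The following is a minimal NumPy sketch of this forward-only training procedure under the simplifications above; the data, the number of hidden neurons, and the fixed learning rates are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
X = rng.random((100, 4))   # illustrative input vectors (n = 4)
Y = rng.random((100, 2))   # illustrative target vectors (m = 2)
H = 6                      # number of Kohonen (hidden) neurons
W = rng.random((H, 4))     # Kohonen layer weights
U = np.zeros((2, H))       # Grossberg layer weights
alpha, beta = 0.1, 0.1

# Phase 1: train the Kohonen layer (winner-take-all, eq. (1))
for x in X:
    j = np.argmin(np.linalg.norm(W - x, axis=1))   # winning neuron
    W[j] += alpha * (x - W[j])

# Phase 2: train the Grossberg layer on the winner's outstar weights (eq. (2))
for x, y in zip(X, Y):
    j = np.argmin(np.linalg.norm(W - x, axis=1))   # winner outputs z_j = 1
    U[:, j] += beta * (y - U[:, j])

# Recall: the output for a new input is the winner's outstar weight vector
x_new = rng.random(4)
j = np.argmin(np.linalg.norm(W - x_new, axis=1))
print(U[:, j])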
Self-Organizing Map

Kohonen Self-Organizing Feature Map:

The Kohonen Self-Organizing feature map (SOM) refers to a neural network which is trained using competitive learning. Basic competitive learning implies that the competition process takes place before the cycle of learning. The competition process suggests that some criterion selects a winning processing element. After the winning processing element is selected, its weight vector is adjusted according to the learning law used (Hecht-Nielsen, 1990).

The self-organizing map makes topologically ordered mappings between input data and the processing elements of the map. Topologically ordered implies that if two inputs have similar characteristics, the most active processing elements answering to them are located close to each other on the map. The weight vectors of the processing elements are organized in ascending or descending order: Wi < Wi+1 for all values of i, or Wi > Wi+1 for all values of i (this definition is valid for a one-dimensional self-organizing map only).

The self-organizing map is typically represented as a two-dimensional sheet of processing elements, described in the figure given below. Each processing element has its own weight vector, and learning of the SOM depends on the adaptation of these vectors. The processing elements of the network are made competitive in a self-organizing process, and a specific criterion picks the winning processing element whose weights are updated. Generally, this criterion is the Euclidean distance between the input vector and the weight vector. The SOM varies from basic competitive learning in that instead of adjusting only the weight vector of the winning processing element, the weight vectors of neighboring processing elements are also adjusted. At first, the size of the neighborhood is large, producing the rough ordering of the SOM; the size is diminished as time goes on. At last, only the winning processing element is adjusted, making the fine-tuning of the SOM possible. The use of a neighborhood makes the topological ordering procedure possible, and together with competitive learning makes the process non-linear.

The SOM was introduced by the Finnish professor and researcher Dr. Teuvo Kohonen in 1982. The self-organizing map is an unsupervised learning model proposed for applications in which maintaining a topology between input and output spaces is required. The notable attribute of this algorithm is that input vectors that are close and similar in the high-dimensional space are also mapped to close-by nodes in the 2D space. It is fundamentally a method for dimensionality reduction, as it maps high-dimensional inputs to a low-dimensional discretized representation while preserving the basic structure of the input space.
Cognitron and Neocognitron

The synaptic strength from cell X to cell Y is reinforced if and only if the following two conditions are true:

1. Cell X (the presynaptic cell) fires.
2. None of the postsynaptic cells present near cell Y fire more strongly than Y.

The model developed by Fukushima was called the cognitron, as a successor to the perceptron, and it can perform cognizance of symbols from any alphabet after training.

Figure 6-6 shows the connection between a presynaptic cell and a postsynaptic cell. The cognitron network is a self-organizing multilayer neural network. Its nodes receive input from the defined areas of the previous layer and also from units within their own area. The input and output neural elements can take the form of positive analog values, which are proportional to the pulse density of firing biological neurons. The cells in the cognitron model use a mechanism of shunting inhibition, i.e., a cell is bounded in terms of maximum and minimum activities and is driven toward these extremities. The area from which the cell receives input is called the connectable area. The area formed by the inhibitory cluster is called the vicinity area.

Figure 6-7 shows the model of a cognitron. Since the connectable areas for cells in the same vicinity are defined to overlap, but are not exactly the same, there will be a slight difference appearing between the cells, which is reinforced so that the gap becomes more apparent. In this way, each cell is allowed to develop its own characteristics. The cognitron network can be used in neurophysiology and psychology. Since this network closely resembles the natural characteristics of a biological neuron, it is best suited for various kinds of visual and auditory information processing systems. However, a major drawback of the cognitron net is that it cannot deal with the problems of orientation or distortion. To overcome this drawback, an improved version called the neocognitron was developed.
Neocognitron Network:

The neocognitron is a multilayer feed-forward network model for visual pattern recognition. It is a hierarchical net comprising many layers, and there is a localized pattern of connectivity between the layers. It is an extension of the cognitron network. The neocognitron net can be used for recognizing hand-written characters.

The algorithm used in the cognitron and neocognitron is the same, except that the neocognitron model can recognize patterns that are position-shifted or shape-distorted.

The cells used in the neocognitron are of two types:

1. S-cell: Cells that are trained suitably to respond to only certain features in the previous layer.
2. C-cell: A C-cell displaces the result of an S-cell in space, i.e., it sort of "spreads" the features recognized by the S-cell.

The neocognitron net consists of many modules with a layered arrangement of S-cells and C-cells. The S-cells receive the input from the previous layer, while C-cells receive the input from the S-layer.

During training, only the inputs to the S-layer are modified.

The S-layer helps in the detection of specific features and their complexities. The feature recognized in the S1 layer may be a horizontal bar or a vertical bar, but the feature in the Sn layer may be more complex.

Each unit in the C-layer corresponds to one relative-position-independent feature. For the independent feature, a C-node receives the inputs from a subset of S-layer nodes. For instance, if one node in the C-layer detects a vertical line and four nodes in the preceding S-layer detect a vertical line, then these four nodes will give the input to the specific node in the C-layer to spatially distribute the extracted features.

Modules present near the input layer (lower in the hierarchy) will be trained before the modules that are higher in the hierarchy, i.e., module 1 will be trained before module 2, and so on.

The users have to fix the "receptive field" of each C-node before training starts, because the inputs to C-nodes cannot be modified. The lower-level modules have smaller receptive fields, while the higher-level modules indicate complex position-independent features present in the hidden layer.
Hopfield Net

Hopfield network:
A Hopfield network is a form of recurrent artificial neural network invented by John Hopfield in 1982. Hopfield nets serve as content-addressable memory systems with binary threshold nodes. A Hopfield network is a class of artificial neural network where connections between units form a directed cycle. This creates an internal state of the network which allows it to exhibit dynamic temporal behaviour. Unlike feedforward neural networks, recurrent networks can use their internal memory to process arbitrary sequences of inputs. This makes them applicable to tasks such as unsegmented connected handwriting recognition, where they have achieved the best known results. Hopfield networks also provide a model for understanding human memory.

CONFIGURATION
The units in Hopfield nets are binary threshold units, i.e. the units only take on two different values for their states, and the value is determined by whether or not the unit's input exceeds its threshold. Hopfield nets normally have units that take on values of 1 or -1, and this convention will be used throughout this section. However, other literature might use units that take values of 0 and 1.

Every pair of units i and j in a Hopfield network has a connection that is described by the connectivity weight wij. In this sense, the Hopfield network can be formally described as a complete undirected graph G = (V, f), where V is a set of McCulloch-Pitts neurons and f is a function that links pairs of units to a real value, the connectivity weight.

The connections in a Hopfield net typically have the following restrictions:
- wii = 0 for all i (no unit has a connection with itself)
- wij = wji for all i, j (connections are symmetric)

The requirement that weights be symmetric is typically used, as it guarantees that the energy function decreases monotonically while following the activation rules; the network may exhibit some periodic or chaotic behaviour if non-symmetric weights are used. However, Hopfield found that this chaotic behavior is confined to relatively small parts of the phase space, and does not impair the network's ability to act as a content-addressable associative memory system.
Updation of Weight:

Updating one unit (node in the graph simulating the artificial neuron) in the Hopfield network is performed using the following rule:

si = +1 if Σj wij sj ≥ θi, and si = -1 otherwise

where:
- wij is the strength of the connection weight from unit j to unit i (the weight of the connection),
- sj is the state of unit j,
- θi is the threshold of unit i.

Updates in the Hopfield network can be performed in two different ways:
- Asynchronous: Only one unit is updated at a time. This unit can be picked at random, or a pre-defined order can be imposed from the very beginning.
- Synchronous: All units are updated at the same time. This requires a central clock to the system in order to maintain synchronization. This method is less realistic, since biological or physical systems lack a global clock that keeps track of time.
Energy:
Hopfield nets have a scalar value associated with each state of the network, referred to as the "energy" E of the network, where:

E = -(1/2) Σi Σj wij si sj + Σi θi si

This value is called the "energy" because the definition ensures that when units are randomly chosen to update, the energy E will either decrease in value or stay the same. Furthermore, under repeated updating the network will eventually converge to a state which is a local minimum in the energy function (which is considered to be a Lyapunov function). Thus, if a state is a local minimum of the energy function, it is a stable state for the network. Note that this energy function belongs to a general class of models in physics under the name of Ising models; these in turn are a special case of Markov networks, since the associated probability measure, the Gibbs measure, has the Markov property.

Figure: Energy landscape of a Hopfield network, highlighting the current state of the network (up the hill), an attractor state to which it will eventually converge, a minimum energy level, and a basin of attraction shaded in green. Note how the update of the Hopfield network always goes down in energy.
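A minimal Python sketch of Hebbian storage and asynchronous recall for a small Hopfield net; the patterns, network size, and iteration count are illustrative, and all thresholds θi are taken as zero.

import numpy as np

rng = np.random.default_rng(2)
patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])   # illustrative +1/-1 memories

# Hebbian storage: sum of outer products of the stored patterns, zero diagonal
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0.0)

def recall(s, W, steps=50):
    s = s.copy()
    for _ in range(steps):
        i = rng.integers(len(s))           # asynchronous: one random unit at a time
        s[i] = 1 if W[i] @ s >= 0 else -1  # threshold rule with theta_i = 0
    return s

probe = np.array([1, -1, -1, -1, 1, -1])   # noisy version of the first pattern
print(recall(probe, W))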
Kohonen Nets

The Self-Organizing Map (or Kohonen Map or SOM) is a type of Artificial Neural Network which is also inspired by biological models of neural systems from the 1970s. It follows an unsupervised learning approach and trains its network through a competitive learning algorithm. SOM is used for clustering and mapping (or dimensionality reduction) techniques to map multidimensional data onto a lower-dimensional space, which allows people to reduce complex problems for easy interpretation. SOM has two layers: one is the input layer and the other is the output layer. The architecture of the Self-Organizing Map with two clusters and n input features of any sample is given below.

How does SOM work?

Let's say the input data is of size (m, n), where m is the number of training examples and n is the number of features in each example. First, the algorithm initializes the weights of size (n, C), where C is the number of clusters. Then, iterating over the input data, for each training example it updates the winning vector (the weight vector with the shortest distance, e.g. Euclidean distance, from the training example). The weight updation rule is given by:

wij = wij(old) + alpha(t) * (xik - wij(old))

where alpha is a learning rate at time t, j denotes the winning vector, i denotes the ith feature of the training example, and k denotes the kth training example from the input data. After training the SOM network, the trained weights are used for clustering new examples. A new example falls into the cluster of its winning vector.

Algorithm

Training:
Step 1: Initialize the weights wij; random values may be assumed. Initialize the learning rate α.
Step 2: Calculate the squared Euclidean distance: D(j) = Σ (wij - xi)^2, where i = 1 to n and j = 1 to m.
Step 3: Find the index J for which D(j) is minimum; that will be considered the winning index.
Step 4: For each unit j within a specific neighborhood of J, and for all i, calculate the new weight: wij(new) = wij(old) + α[xi - wij(old)].
Step 5: Update the learning rate using: α(t+1) = 0.5 * α(t).
Step 6: Test the stopping condition.
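A minimal NumPy sketch of this training loop; the data, cluster count, epoch count, and decay schedule are illustrative assumptions, and the neighborhood is reduced to the winner alone for brevity.

import numpy as np

rng = np.random.default_rng(3)
X = rng.random((200, 5))   # m = 200 training examples, n = 5 features
C = 4                      # number of clusters (output units)
W = rng.random((C, 5))     # one weight vector per cluster
alpha = 0.5

for epoch in range(20):
    for x in X:
        d = np.sum((W - x) ** 2, axis=1)   # squared Euclidean distances D(j)
        J = np.argmin(d)                   # winning index
        W[J] += alpha * (x - W[J])         # move the winner toward the example
    alpha *= 0.5                           # decay the learning rate per epoch

# cluster assignment for a new example: the index of its winning vector
x_new = rng.random(5)
print(np.argmin(np.sum((W - x_new) ** 2, axis=1)))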
Grossberg Nets

Grossberg networks, proposed by Stephen Grossberg, are a type of neural network known for their ability to perform tasks such as pattern recognition, classification, and decision-making. They are often used in cognitive science and neuroscience to model various aspects of human brain function.

Integrating Grossberg networks with fuzzy logic can offer several advantages, particularly in handling uncertainty and imprecision in input data and decision-making processes. Here's how Grossberg networks can be utilized within a fuzzy logic framework:

1. Fuzzy Sets and Membership Functions: Fuzzy logic can be employed to represent input features using fuzzy sets and membership functions. This allows for a more flexible representation of uncertain or imprecise input data within the Grossberg network.

2. Fuzzy Rule-Based Systems: Fuzzy logic rules can guide decision-making processes within the Grossberg network. Fuzzy inference systems can interpret network outputs and make decisions based on fuzzy logic inference, allowing for robust decision-making in uncertain environments.

3. Uncertainty Handling: Fuzzy logic enables the incorporation of uncertainty measures within the network. Fuzzy membership functions can represent the certainty or ambiguity of input features or network outputs, allowing the network to handle uncertainty more effectively.

4. Adaptation and Learning: Fuzzy logic can provide adaptive control mechanisms within the Grossberg network. This allows for dynamic adjustment of network parameters in response to changes in input data or task requirements, improving the network's adaptability and performance.

5. Probabilistic Interpretation: Fuzzy logic can provide a probabilistic interpretation of network outputs. This enables the quantification of uncertainty in predictions and facilitates probabilistic reasoning within the network.

6. Hierarchical Processing: Fuzzy logic can aid hierarchical representation learning within the network. Fuzzy sets and rules can capture relationships between low-level and high-level features in a more flexible and adaptive manner.

7. Hybrid Systems: Combining fuzzy logic with Grossberg networks results in hybrid systems that leverage the strengths of both approaches. Fuzzy logic enhances the network's ability to handle uncertainty and imprecision, while the network learns complex patterns and relationships from data.

8. Pattern Recognition and Classification: Grossberg networks, within a fuzzy logic framework, can improve pattern recognition and classification tasks by incorporating fuzzy sets and rules to handle uncertain or imprecise input data.

Art-I, Art-II

Classification
ART is a family of related neural network architectures.

ART 1:-
The first and most basic architecture is ART1 (Carpenter and Grossberg, 1987). ART1 can learn and recognize binary patterns. It is the simplest variety of ART network, accepting only binary inputs.
ART 2:-
ART2 (Carpenter and Grossberg, 1987) extends the network's capabilities to continuous inputs: it is a class of architectures categorizing arbitrary sequences of analog input patterns. A streamlined variant, ART 2-A, offers a drastically accelerated runtime, with qualitative results only rarely inferior to the full ART-2 implementation.
ART 3 :-
ART3 builds on ART-2 by simulating rudimentary neurotransmitter regulation of synaptic activity: simulated sodium (Na+) and calcium (Ca2+) concentrations are incorporated into the system's equations, resulting in a more physiologically realistic means of partially inhibiting categories that trigger mismatch resets.
Fuzzy ART :-
It incorporates fuzzy logic into ART's pattern recognition, thus enhancing generalizability. An optional (and very useful) feature of fuzzy ART is complement coding, a means of incorporating the absence of features into pattern classifications, which goes a long way towards preventing inefficient and unnecessary category proliferation.
ARTMAP :-
Also known as Predictive ART, ARTMAP combines two slightly modified ART-1 or ART-2 units into a supervised learning structure: the first unit takes the input data and the second unit takes the correct output data, which is then used to make the minimum possible adjustment of the vigilance parameter in the first unit needed to produce the correct classification.

Fuzzy ARTMAP:- It is simply ARTMAP using fuzzy ART units, resulting in a corresponding increase in efficacy.

An ART system consists of two subsystems: an attentional subsystem and an orienting subsystem. The stabilization of learning and activation occurs in the attentional subsystem by matching bottom-up input activation and top-down expectation. The orienting subsystem controls the attentional subsystem when a mismatch occurs there; in other words, the orienting subsystem works like a novelty detector.
Adaptive Resonance Theory (ART)
Introduction:
Grossberg's Adaptive Resonance Theory, developed further by Grossberg and Carpenter, addresses the categorization of patterns using the competitive learning paradigm. It introduces a gain control and a reset mechanism to ensure that learned categories are retained even while new categories are learned, and thereby addresses the plasticity–stability dilemma.

Adaptive Resonance Theory makes much use of the competitive learning paradigm. A criterion is developed to facilitate the winner-take-all phenomenon: the single node with the largest value for the set criterion is declared the winner within its layer, and it is said to classify a pattern class.
If there is a tie for the winning neuron in a layer, an arbitrary rule, such as choosing the first of them in serial order, can be used to pick the winner. The neural network developed for this theory is a system made up of two subsystems: the attentional subsystem, which contains the unit for gain control, and the orienting subsystem, which contains the unit for reset. During the operation of the network, patterns that emerge in the attentional subsystem are called traces of STM (short-term memory); traces of LTM (long-term memory) reside in the connection weights between the input layer and output layer.

The network uses processing with feedback between its two layers, until resonance occurs.
Resonance occurs when the output in the first layer after feedback from the second layer
matches the original pattern used as input for the first layer in that processing cycle.

A match of this type does not have to be perfect. What is required is that the degree of
match, measured suitably, exceeds a predetermined level, termed vigilance parameter. Just as a
photograph matches the likeness of the subject to a greater degree when the granularity is
higher, the pattern match gets finer when the vigilance parameter is closer to 1.

The Network for ART1:


The neural network for the adaptive resonance theory or ART1 model consists of the
following:
 A layer of neurons, called the F1 layer (input layer or comparison layer)
 A node for each layer as a gain control unit
 A layer of neurons, called the F2 layer (output layer or recognition layer)
 A node as a reset unit
 Bottom-up connections from F1 layer to F2 layer
 Top-down connections from F2 layer to F1 layer
 Inhibitory connection (negative weight) from F2 layer to gain control
 Excitatory connection (positive weight) from gain control to its layer
 Inhibitory connection from F1 layer to reset node
 Excitatory connection from reset node to F2 layer.
[Figure: simplified diagram of the neural network for an ART1 model.]
Processing in ART1:

The ART1 paradigm, just like the Kohonen Self-Organizing Map introduced earlier, performs data clustering on input data: like inputs are clustered together into a category. As an example, you can use a data clustering algorithm such as ART1 for Optical Character Recognition (OCR), where you try to match different samples of a letter to its ASCII equivalent. Particular care is taken in the ART1 paradigm to ensure that old information is not thrown away while new information is assimilated.

An input vector, when applied to an ART1 system, is first compared to existing patterns in
the system. If there is a close enough match within a specified tolerance (as indicated by a
vigilance parameter), then that stored pattern is made to resemble the input pattern further and the
classification operation is complete. If the input pattern does not resemble any of the stored
patterns in the system, then a new category is created with a new stored pattern that resembles the
input pattern.
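
A greatly simplified sketch of this match-or-create step, assuming binary patterns stored in a plain Python list; real ART1 adds the bottom-up choice function, gain control, and the reset unit, so this is an illustration of the vigilance test only:

import numpy as np

def art1_classify(x, prototypes, rho=0.7):
    # Accept the first stored pattern whose match ratio |x AND w| / |x|
    # meets the vigilance rho; otherwise create a new category.
    for j, w in enumerate(prototypes):
        match = np.logical_and(x, w).sum() / max(x.sum(), 1)
        if match >= rho:
            prototypes[j] = np.logical_and(x, w)   # stored pattern made to resemble input
            return j
    prototypes.append(x.copy())                    # no close match: new stored pattern
    return len(prototypes) - 1

protos = []
print(art1_classify(np.array([1, 0, 1, 1]), protos))   # -> 0 (new category)
print(art1_classify(np.array([1, 0, 1, 0]), protos))   # -> 0 (matches, refines prototype)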

Special Features of the ART1 Model:

One special feature of an ART1 model is that a two-thirds rule is used to determine the activity of neurons in the F1 layer. There are three input sources to each neuron in layer F1: the external input, the output of gain control, and the outputs of F2 layer neurons. The F1 neurons will not fire unless at least two of the three inputs are active. The gain control unit and the two-thirds rule together ensure a proper response from the input layer neurons. A second feature is that a vigilance parameter is used to determine the activity of the reset unit, which is activated whenever there is no match found among existing patterns during classification.
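
The two-thirds rule itself reduces to a small predicate, sketched here with hypothetical boolean inputs:

def f1_fires(external_input, gain_control, f2_feedback):
    # An F1 neuron fires only when at least two of its three input
    # sources are active (all arguments are hypothetical booleans).
    return (bool(external_input) + bool(gain_control) + bool(f2_feedback)) >= 2

print(f1_fires(True, True, False))   # True: two of three sources active
print(f1_fires(True, False, False))  # False: only one source active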

Reinforcement learning
Reinforcement learning is an area of Machine Learning. It is about taking suitable actions to maximize reward in a particular situation. It is employed by various software systems and machines to find the best possible behavior or path to take in a specific situation. Reinforcement learning differs from supervised learning in that supervised training data comes with an answer key, so the model is trained with the correct answer itself, whereas in reinforcement learning there is no answer key: the reinforcement agent decides what to do to perform the given task. In the absence of a training dataset, it is bound to learn from its own experience.
Reinforcement Learning (RL) is the science of decision making. It is about learning the optimal
behavior in an environment to obtain maximum reward. In RL, the data is accumulated from
machine learning systems that use a trial-and-error method. Data is not part of the input that we
would find in supervised or unsupervised machine learning.
Reinforcement learning uses algorithms that learn from outcomes and decide which action to
take next. After each action, the algorithm receives feedback that helps it determine whether the
choice it made was correct, neutral or incorrect. It is a good technique to use for automated
systems that have to make a lot of small decisions without human guidance.
Reinforcement learning is an autonomous, self-teaching system that essentially learns by trial
and error. It performs actions with the aim of maximizing rewards, or in other words, it is
learning by doing in order to achieve the best outcomes.
Example:
The problem is as follows: we have an agent and a reward, with many hurdles in between. The agent is supposed to find the best possible path to reach the reward. The following scenario illustrates this.

Picture a robot, a diamond, and fire laid out on a grid. The goal of the robot is to reach the reward, the diamond, while avoiding the hurdles, the fire. The robot learns by trying all the possible paths and then choosing the path that reaches the reward with the fewest hurdles. Each right step gives the robot a reward and each wrong step subtracts from it. The total reward is calculated when it reaches the final reward, the diamond.
Main points in Reinforcement learning –

 Input: The input should be an initial state from which the model will start.
 Output: There are many possible outputs, as there are a variety of solutions to a particular problem.
 Training: The training is based upon the input; the model returns a state, and the user decides whether to reward or punish the model based on its output.
 The model continues to learn.
 The best solution is decided based on the maximum reward (a minimal sketch of one such trial-and-error algorithm follows this list).
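
As a concrete sketch, here is tabular Q-learning, one common trial-and-error algorithm (not named in these notes), on an assumed five-cell corridor with fire at one end and the diamond at the other:

import numpy as np

# Assumed world: cell 0 holds fire (-10), cell 4 the diamond (+10);
# every other step costs -1 so shorter paths score better.
n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.2     # learning rate, discount, exploration
rng = np.random.default_rng(0)

for episode in range(500):
    s = 2                                          # the agent starts mid-corridor
    while s not in (0, 4):
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        s2 = s - 1 if a == 0 else s + 1
        r = 10 if s2 == 4 else (-10 if s2 == 0 else -1)
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])   # Q-learning update
        s = s2

print(np.argmax(Q, axis=1))   # greedy action per state (rows 0 and 4 are terminal)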
UNIT:V
Case Studies
Application of fuzzy logic and neural networks to Measurement-Control-Adaptive Neural
Controllers –Signal Processing and Image Processing.
Application of fuzzy logic and neural networks to Measurement

Case Study: Smart Energy Management System

Problem Statement:

A company aims to develop a smart energy management system for optimizing energy
consumption in a manufacturing plant. The system needs to accurately measure and predict
energy usage across different processes to minimize costs and environmental impact.

Solution Approach:

1. Data Collection and Preprocessing:

Collect historical data on energy consumption, production output, environmental conditions, and
other relevant variables.

Preprocess the data to handle missing values, outliers, and noise.

2. Fuzzy Logic for Uncertainty Handling:

Use fuzzy logic to model uncertainty and imprecision in the data, especially in variables like
environmental conditions and equipment efficiency.

Define fuzzy sets and membership functions to represent linguistic variables such as "high",
"medium", and "low" for energy consumption, production demand, etc.

Implement fuzzy inference systems to make decisions based on fuzzy rules and expert knowledge.
For example, fuzzy rules can relate energy consumption to production demand and environmental
factors.
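
A minimal sketch of such a rule, assuming triangular membership functions whose breakpoints, variable names, and values are all hypothetical:

import numpy as np

def tri(x, a, b, c):
    # Triangular membership function; breakpoints a, b, c are assumptions.
    return float(np.clip(min((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0))

demand, temp = 72.0, 31.0   # hypothetical production demand (%) and ambient temp (C)
demand_high = tri(demand, 50.0, 75.0, 100.0)
temp_high = tri(temp, 25.0, 35.0, 45.0)

# Rule: IF demand is high AND temperature is high THEN consumption is high
# (min implements the fuzzy AND, as in a Mamdani-style system).
consumption_high = min(demand_high, temp_high)
print(demand_high, temp_high, consumption_high)   # 0.88 0.6 0.6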

3. Neural Networks for Prediction and Control:

Train neural networks to predict future energy consumption based on historical data and current
conditions.

Use recurrent neural networks (RNNs) or long short-term memory (LSTM) networks to capture
temporal dependencies in energy usage patterns.

Implement neural network controllers to adjust energy usage in real-time based on predicted
demands and environmental conditions.

Optimize neural network architectures and hyperparameters using techniques like cross-validation
and grid search.
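
One plausible setup for the prediction step, sketched with TensorFlow/Keras (an assumed framework, since the notes do not prescribe one); the sine wave stands in for real plant load history:

import numpy as np
import tensorflow as tf

# Hypothetical hourly load series; sliding 24-hour windows predict the next hour.
rng = np.random.default_rng(0)
load = np.sin(np.linspace(0, 60, 2000)) + 0.1 * rng.normal(size=2000)
X = np.stack([load[i:i + 24] for i in range(len(load) - 24)])[..., None]
y = load[24:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(24, 1)),
    tf.keras.layers.LSTM(32),        # captures temporal dependencies in usage
    tf.keras.layers.Dense(1),        # next-hour consumption estimate
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=64, verbose=0)   # real work: cross-validation etc.
print(model.predict(X[:1], verbose=0)[0, 0])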

4. Integration and Optimization:

Integrate fuzzy logic and neural networks into a unified system for smart energy management.
Develop optimization algorithms to find the optimal setpoints for energy-consuming devices (e.g.,
HVAC systems, lighting, machinery) based on predictions from the neural networks and decisions
from the fuzzy logic controllers.

Implement feedback mechanisms to continuously update the models and controllers based on new
data and performance feedback.

Results and Benefits:

The integrated system accurately predicts energy demand and adjusts energy usage in real-time,
leading to significant cost savings and reduced environmental impact.

By incorporating fuzzy logic, the system can handle uncertainty and imprecision in the data and
decision-making process, improving robustness and reliability.

The neural network models continuously learn from new data, leading to improved prediction
accuracy and control performance over time.

Case Study: Smart Water Quality Monitoring System

Problem Statement:

A municipality wants to develop a smart water quality monitoring system to ensure the safety and
purity of its drinking water supply. The system should accurately measure various water quality
parameters and detect anomalies or contamination events in real-time.

Solution Approach:

1. Data Collection and Preprocessing:

Collect real-time data from sensors installed at different points in the water distribution network.
These sensors measure parameters such as pH, turbidity, dissolved oxygen, and conductivity.

Preprocess the data to handle noise, outliers, and missing values. Apply signal processing
techniques to filter and smooth sensor readings.

2. Fuzzy Logic for Anomaly Detection:

Use fuzzy logic to model uncertainty and imprecision in water quality measurements.

Define fuzzy sets and membership functions for linguistic variables representing different levels
of water quality (e.g., "clean", "acceptable", "contaminated").

Implement fuzzy inference systems to detect anomalies or contamination events based on deviations from normal water quality ranges. Fuzzy rules can relate sensor readings to water quality states and trigger alarms when anomalies are detected.
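
A tiny sketch of one such alarm rule, assuming a rising membership in "contaminated" over turbidity; the breakpoints and the alarm threshold are assumptions:

def ramp_up(x, a, b):
    # Membership rising linearly from 0 at a to 1 at b (breakpoints assumed).
    return min(max((x - a) / (b - a), 0.0), 1.0)

turbidity = 6.2                            # hypothetical sensor reading in NTU
mu = ramp_up(turbidity, 4.0, 8.0)          # membership in "contaminated"
if mu > 0.5:                               # illustrative alarm threshold
    print(f"anomaly suspected (membership {mu:.2f})")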

3. Neural Networks for Prediction and Classification:

Train neural networks to predict future water quality based on historical data and current sensor
readings.

Use feedforward neural networks or recurrent neural networks (RNNs) to capture temporal
dependencies in water quality patterns.
Implement neural network classifiers to categorize water quality events (e.g., contamination
types) based on sensor data patterns.

Optimize neural network architectures and parameters using techniques like cross-validation and
hyperparameter tuning.
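
A sketch of the classification step under the same assumed Keras framework; the random vectors and labels stand in for logged sensor histories of real, labeled events:

import numpy as np
import tensorflow as tf

# Stand-in data: 4 features (pH, turbidity, dissolved oxygen, conductivity)
# and 3 assumed event classes (0 = normal, 1 = chemical, 2 = biological).
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 4))
y = rng.integers(0, 3, size=600)   # real labels would come from logged events

clf = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),   # class probabilities
])
clf.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
clf.fit(X, y, epochs=3, verbose=0)   # in practice: cross-validation and tuning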

4. Integration and Deployment:

Integrate fuzzy logic and neural networks into a unified smart water quality monitoring system.

Develop a user interface for water quality engineers to visualize sensor data, monitor system
performance, and receive alerts in case of anomalies or contamination events.

Deploy the system in the water distribution network and continuously monitor its performance.

Results and Benefits:

The smart water quality monitoring system accurately detects anomalies and contamination events
in real-time, allowing prompt responses and mitigation measures to be taken.

By incorporating fuzzy logic, the system can handle uncertainty and imprecision in water quality
measurements, improving reliability and robustness.

The neural network models provide accurate predictions of future water quality and enable
classification of different types of water quality events, enhancing overall system performance.

Control-Adaptive Neural Controllers

Adaptive neural controllers are a class of control systems that utilize neural networks to
adaptively adjust control parameters based on changing operating conditions or system dynamics.
These controllers offer several advantages over traditional control approaches, such as improved
robustness, flexibility, and adaptability to nonlinear and time-varying systems. Here's a
breakdown of adaptive neural controllers:

Components of Adaptive Neural Controllers:

1. Neural Networks:

Function Approximators: Neural networks are used to approximate the system's dynamics or the
control policy.

Universal Approximators: Neural networks have the capability to approximate any continuous
function, making them suitable for modeling complex system behaviors.

2. Adaptation Mechanisms:

Online Learning: Adaptive neural controllers continuously learn from system feedback and adjust
their parameters in real-time.

Backpropagation: Error signals from the control performance are propagated backward through
the neural network layers to update weights and biases.

Reinforcement Learning: Some adaptive controllers use reinforcement learning techniques to learn optimal control policies through trial and error.

3. Control Strategies:
Model-Free Control: Adaptive neural controllers do not require an explicit model of the system
dynamics. Instead, they learn control policies directly from input-output data.

Model-Based Control: In some cases, neural networks are used to learn or approximate the system
model, which is then used for control synthesis.

4. Adaptation Algorithms:

Gradient Descent Methods: Common optimization techniques such as stochastic gradient descent
(SGD) or variants like Adam are used to update neural network parameters.

Kalman Filters: Kalman filters or their variants may be incorporated into the adaptation process
to estimate system states and improve control performance.
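
A toy sketch of online adaptation, assuming a scalar first-order plant and a classic Lyapunov-style gain update; all constants are hypothetical, and a real adaptive neural controller would replace the single gain with a network:

# Toy plant: x' = a*x + u with a unknown to the controller (a > 0, so the
# open loop is unstable). Control u = -theta*x; the adaptation law
# theta' = gamma*x^2 raises the gain until the state is regulated to zero.
a, dt, gamma = 0.8, 0.01, 2.0      # all constants are assumptions
theta, x = 0.0, 1.0

for _ in range(5000):
    u = -theta * x
    x += dt * (a * x + u)          # Euler step of the plant dynamics
    theta += dt * gamma * x * x    # adaptation: gain grows while error persists

print(round(theta, 3), round(x, 6))   # theta settles above a; x decays toward 0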

Applications of Adaptive Neural Controllers:

Robotics: Adaptive neural controllers are used for robot control tasks such as trajectory tracking,
obstacle avoidance, and manipulation of objects in dynamic environments.

Aerospace: In aerospace applications, adaptive neural controllers are employed for aircraft
autopilot systems, spacecraft attitude control, and flight path optimization.

Process Control: Adaptive neural controllers find applications in process industries for
controlling variables such as temperature, pressure, and flow rates in chemical plants, power
plants, and manufacturing facilities.

Power Systems: In power systems, adaptive neural controllers are used for voltage and frequency
regulation, optimal power flow, and stability enhancement.

Automotive: Adaptive neural controllers are applied in vehicle control systems for tasks such as
adaptive cruise control, lane keeping assistance, and collision avoidance.

Advantages of Adaptive Neural Controllers:

Robustness: Adaptive neural controllers can adapt to uncertainties and disturbances in the system,
improving robustness compared to fixed controllers.

Flexibility: They can handle nonlinear and time-varying systems, making them suitable for a
wide range of applications.

Adaptability: Adaptive neural controllers can continuously learn and update their control policies
based on changing system conditions, ensuring optimal performance over time.

Challenges and Considerations:

Training Data: Adequate training data is required to ensure accurate learning and adaptation,
which may be challenging to obtain in some applications.

Computational Complexity: Training and updating neural network parameters may require
significant computational resources, especially for large-scale systems.

Overfitting: There's a risk of overfitting to the training data, which can lead to poor
generalization performance on unseen data.

Case Study: Adaptive Traffic Signal Control System


Problem Statement:

A city faces traffic congestion issues at intersections during peak hours, leading to delays,
inefficiencies, and increased pollution. The city authorities aim to develop an adaptive traffic
signal control system that can dynamically adjust signal timings to optimize traffic flow and
reduce congestion.

Solution Approach:

1. Data Collection and Preprocessing:

Collect real-time data from traffic sensors installed at intersections, including vehicle counts, speeds, queue lengths, and traffic patterns.

Preprocess the data to handle noise, outliers, and missing values. Apply techniques like data fusion to integrate information from multiple sensors.

2. Neural Network Model:

Train a neural network model to predict traffic flow and congestion levels based on historical data and current sensor readings.

Use a recurrent neural network (RNN) or long short-term memory (LSTM) network to capture temporal dependencies in traffic patterns and predict future traffic conditions.

3. Adaptive Control Strategy:

Develop an adaptive control strategy based on the predicted traffic conditions and system feedback.

Use the neural network predictions to dynamically adjust signal timings at each intersection to optimize traffic flow and minimize delays.

Implement reinforcement learning techniques to learn optimal control policies through trial and error, taking into account factors such as traffic volume, vehicle speeds, and congestion levels.

4. Real-time Optimization:

Deploy the adaptive traffic signal control system in a pilot area or selected intersections within the city.

Continuously monitor traffic conditions and system performance in real-time.

Adaptively update the neural network model and control policies based on new data and feedback from the system to improve accuracy and effectiveness.
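
A minimal sketch of one adjustment policy, allocating green time in proportion to predicted queue lengths; the queue numbers are assumptions and would come from the prediction model above:

# Split a fixed cycle's green time across approaches in proportion to
# predicted queue lengths.
cycle_s, min_green_s = 90, 10
predicted_queues = {"north": 12, "south": 7, "east": 25, "west": 4}

total = sum(predicted_queues.values())
green = {leg: max(min_green_s, round(cycle_s * q / total))
         for leg, q in predicted_queues.items()}
print(green)   # congested east approach gets the longest green; a real
               # controller would renormalize so the splits sum to the cycle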

Results and Benefits:

The adaptive traffic signal control system effectively reduces congestion, delays, and pollution at
intersections during peak hours.

By dynamically adjusting signal timings based on real-time traffic conditions, the system
optimizes traffic flow and minimizes travel times for commuters.

The neural network model continuously learns and adapts to changing traffic patterns, leading to
improved prediction accuracy and control performance over time.
Signal Processing and Image Processing.

Fuzzy Logic in Signal Processing:

Noise Filtering:

Fuzzy logic can be used to design adaptive filters that adjust their parameters based on the
characteristics of the input signal and noise.

Fuzzy systems can determine the degree to which each frequency component contributes to the
noise, allowing for more effective noise reduction.

Signal Denoising:

Fuzzy logic-based algorithms can distinguish between signal and noise components by
considering their membership functions.

Fuzzy inference systems can be used to estimate the signal's true values from noisy measurements
while considering the uncertainty associated with each measurement.
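
A sketch covering both ideas above: each neighboring sample is weighted by its membership in "similar to the centre sample", so dissimilar (likely noisy) samples contribute little to the estimate. The Gaussian-shaped membership and its parameters are assumptions:

import numpy as np

def fuzzy_smooth(sig, radius=2, scale=0.5):
    # Membership-weighted smoothing over a sliding window; radius and
    # scale are assumed tuning parameters.
    out = sig.copy()
    for i in range(radius, len(sig) - radius):
        window = sig[i - radius:i + radius + 1]
        mu = np.exp(-((window - sig[i]) / scale) ** 2)   # similarity memberships
        out[i] = (mu * window).sum() / mu.sum()          # membership-weighted mean
    return out

rng = np.random.default_rng(0)
noisy = np.sin(np.linspace(0, 6, 200)) + 0.2 * rng.normal(size=200)
smoothed = fuzzy_smooth(noisy)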

Feature Extraction:

Fuzzy logic can extract features from signals by defining fuzzy sets and membership functions to
represent signal characteristics.

Fuzzy clustering algorithms can partition signals into meaningful clusters based on similarity
measures derived from fuzzy memberships.

Classification and Recognition:

Fuzzy logic can aid in classifying signals into different categories by considering multiple features
and their uncertainties.

Fuzzy rule-based systems can be employed to recognize patterns or events in signals, taking into
account fuzzy relationships between input features and output classes.

Fuzzy Logic in Image Processing:

Image Enhancement:

Fuzzy logic-based contrast enhancement techniques can adjust image brightness and contrast
while preserving image details.

Fuzzy histogram equalization methods can improve the visibility of details in both bright and dark
regions of an image.

Image Denoising:

Fuzzy inference systems can be employed to denoise images by considering spatial and intensity
similarities among neighboring pixels.

Fuzzy median filtering techniques can effectively remove impulse noise while preserving image
details and edges.
Feature Extraction:

Fuzzy clustering algorithms such as Fuzzy C-Means (FCM) can partition images into meaningful
regions based on similarity measures derived from fuzzy memberships.

Fuzzy texture analysis methods can extract texture features from images by considering spatial
and spectral variations in pixel intensities.
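
A minimal NumPy sketch of Fuzzy C-Means as named above; the grey levels are stand-in data, and the cluster count and fuzzifier m are assumptions:

import numpy as np

def fcm(X, c=2, m=2.0, iters=50, seed=0):
    # Minimal Fuzzy C-Means: X is (n, d) feature vectors, c clusters,
    # fuzzifier m > 1. Returns cluster centres and the membership matrix U.
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                    # rows sum to 1
    for _ in range(iters):
        W = U ** m
        centres = (W.T @ X) / W.sum(axis=0)[:, None]     # membership-weighted means
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        U = d ** (-2.0 / (m - 1.0))                      # standard FCM update
        U /= U.sum(axis=1, keepdims=True)
    return centres, U

pixels = np.random.default_rng(1).random((500, 1))       # stand-in grey levels
centres, U = fcm(pixels)                                 # soft two-region split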

Object Recognition and Segmentation:

Fuzzy logic-based segmentation algorithms can partition images into regions based on fuzzy
similarity measures, allowing for more accurate delineation of object boundaries.

Fuzzy rule-based systems can classify image regions into different object classes based on fuzzy
relationships between image features and object categories.

Advantages of Fuzzy Logic in Signal and Image Processing:

Handling Uncertainty: Fuzzy logic can effectively model uncertainty and imprecision present in
signals and images, allowing for more robust processing.

Non-linear Relationships: Fuzzy logic can capture non-linear relationships between input and
output variables, which may not be adequately modeled by traditional methods.

Adaptability: Fuzzy systems can adapt their parameters and rules based on changes in the input
data or application requirements, leading to more flexible processing.

Challenges and Considerations:

Computational Complexity: Fuzzy logic-based algorithms may be computationally intensive, especially for large-scale signal and image processing tasks.

Parameter Tuning: Designing effective fuzzy systems requires careful selection and tuning of
parameters, which may require domain expertise.

Interpretability: While fuzzy logic provides interpretable models, the interpretability of complex
fuzzy systems may be challenging, particularly for large-scale applications.

Case Study: Medical Image Denoising and Enhancement

Problem Statement:

A hospital's radiology department aims to improve the quality of medical images acquired from
various imaging modalities such as X-ray, MRI, and CT scans. The images often suffer from
noise, artifacts, and low contrast, which can affect diagnostic accuracy and patient care.

Solution Approach:

1. Data Collection and Preprocessing:

Gather a dataset of medical images representing different anatomical structures and pathologies.

Preprocess the images to remove artifacts, correct for uneven illumination, and standardize image
resolutions.
2. Signal Processing for Noise Reduction:

Apply signal processing techniques such as median filtering or wavelet denoising to remove noise
from the medical images.

Use adaptive filtering methods, such as Wiener filtering or Gaussian filtering, to selectively
remove noise while preserving image details.
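
A sketch of the named filters with SciPy on a synthetic image; real scans are assumed in the actual pipeline, and the noise level and filter sizes are assumptions:

import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

# Synthetic gradient "scan" corrupted with impulse (salt-and-pepper) noise.
rng = np.random.default_rng(0)
img = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
noisy = img.copy()
mask = rng.random(img.shape) < 0.05
noisy[mask] = rng.choice([0.0, 1.0], size=int(mask.sum()))

den_median = median_filter(noisy, size=3)       # robust to impulses, keeps edges
den_gauss = gaussian_filter(noisy, sigma=1.0)   # smooths noise but blurs detail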

3. Image Enhancement using Fuzzy Logic:

Design a fuzzy logic-based image enhancement system to improve the contrast and visibility of
anatomical structures.

Define fuzzy sets and membership functions to represent image intensities and spatial
relationships.

Develop fuzzy inference rules to adjust image contrast and brightness based on local image
characteristics.
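
One classical way to realize such a rule is fuzzy contrast intensification (the INT operator), sketched here with an assumed fixed crossover of 0.5; real systems would tune the crossover per modality:

import numpy as np

def fuzzy_enhance(img):
    # Normalized grey levels are read as memberships in "bright" and pushed
    # toward 0 or 1, raising contrast around the 0.5 crossover.
    mu = np.clip(img.astype(float), 0.0, 1.0)
    return np.where(mu <= 0.5, 2.0 * mu ** 2, 1.0 - 2.0 * (1.0 - mu) ** 2)

# e.g. enhanced = fuzzy_enhance(den_median) using the denoised scan above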

4. Implementation of Deep Learning Models:

Train convolutional neural networks (CNNs) on a large dataset of medical images to learn
complex image transformations and feature representations.

Utilize pretrained CNN models, such as U-Net or ResNet, for tasks such as image denoising,
enhancement, and segmentation.

Fine-tune the pretrained models on the hospital's specific imaging data to improve performance
and generalization.

5. Validation and Evaluation:

Evaluate the denoising and enhancement algorithms using quantitative metrics such as peak
signal-to-noise ratio (PSNR) and structural similarity index (SSIM).
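
PSNR, the first metric named, reduces to a few lines (shown for images normalized to [0, 1]):

import numpy as np

def psnr(ref, test, peak=1.0):
    # Peak signal-to-noise ratio in dB for images scaled to [0, peak].
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# e.g. psnr(img, den_median) vs psnr(img, noisy) quantifies the improvement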

Validate the performance of the deep learning models using a separate test dataset and compare
results with ground truth annotations provided by expert radiologists.

Conduct clinical validation studies to assess the impact of image quality improvements on
diagnostic accuracy and clinical outcomes.

Results and Benefits:

The developed denoising and enhancement algorithms effectively improve the quality of medical
images, reducing noise and enhancing contrast.

The fuzzy logic-based image enhancement system provides intuitive control over image
appearance, allowing radiologists to customize visualization settings based on diagnostic needs.

The deep learning models demonstrate state-of-the-art performance in image denoising, enhancement, and segmentation, outperforming traditional methods in many cases.

Clinical validation studies show that improved image quality leads to more accurate diagnoses,
better treatment planning, and improved patient outcomes.
