6.0 Fuzzy Sets
$$\mu_A(u) = \begin{cases} 1, & \text{if } u \in A \\ 0, & \text{if } u \notin A \end{cases}$$
Zadeh L A, Fuzzy Sets. Information and Control, 8 (1965), 338-353.
Baikunth Nath 433-679 Evolutionary and Neural Computation
For example, let A = {x | x is a real number between -1 and +1}. Then the following
characteristic function can be used to define the set A (Figure 6.1):

$$\mu_A(x) = \begin{cases} 1, & \text{if } -1 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

Figure 6.1 The characteristic function
From Figure 6.1, we notice that for any real number, its membership in set A is uniquely
determined by the characteristic function and equals either 1 or 0. Therefore, in classical
set theory, the definition of a concept (set) admits no degrees. However, this contradicts
the real world. In fact, many concepts, especially those used by human beings, come in
degrees. Concepts such as mountain, hill, lake and pond are very hard to describe by a
classical set. Many attributes, such as low, high, short, tall, beautiful and ugly, are
ambiguous. If we intend to describe tall men with a classical set, we have to provide a
precise standard, say, 180 cm or taller. Then a person whose height is 180.2 cm will be
classified as a tall man, whereas another who is 179.8 cm will not. This seemingly precise
way of classification brings about an unreasonable result.
The limitation of classical set theory lies in the fact that a characteristic function that
describes a classical (crisp) set can only assume the values 0 or 1. However, if we allow
values between 0 and 1, these difficulties can be removed.
In fuzzy sets an object can belong to a set partially. The degree of membership is defined
through a generalized characteristic function called the membership function:

$$\mu_A(u): U \to [0, 1]$$

For any $u \in U$, $\mu_A(u) \in [0, 1]$ specifies the degree to which element u belongs
to the set A. The set A is called a fuzzy set and the characteristic function $\mu_A(u)$ is
called a membership function. Thus an element belongs to a set with a grade of membership.
The maximal membership is 1 and the minimal is 0. Therefore, a fuzzy set is an extension of
a classical set, and a classical set is a special case of a fuzzy set.
Now we can define tall men with a fuzzy set. Figure 6.2 provides an example.
Figure 6.2 A definition of fuzzy set tall men.
We notice that in this description, a person who is 170 cm tall belongs to the fuzzy set
tall men with a degree of 0.6, whereas another who is 180 cm tall belongs to tall men with
a degree of 0.9. This is no doubt much sounder than saying that the latter is tall and the
former is not.
Notation
Assume U is the universe, a classical set of n elements, $U = \{u_1, u_2, \ldots, u_n\}$. If A is a
fuzzy set on the universe U specified by a membership function $\mu_A(u)$, then the fuzzy set
can be expressed in one of the following three ways:

$$A = \mu_A(u_1)/u_1 + \mu_A(u_2)/u_2 + \cdots + \mu_A(u_n)/u_n$$
Note that the symbol / is not a division and the symbol + is not a plus.

$$A = \{(u_1, \mu_A(u_1)), (u_2, \mu_A(u_2)), \ldots, (u_n, \mu_A(u_n))\}$$
It can be seen that a fuzzy set is described by a series of ordered pairs.

$$A = (\mu_A(u_1), \mu_A(u_2), \ldots, \mu_A(u_n))$$
Note that in this notation, a membership degree cannot be left out from the vector even if
it is zero.
If the universe is continuous and cannot be described by a finite number of
elements, the following notation is used to describe the fuzzy set A:

$$A = \int_U \mu_A(u)/u$$

The symbol $\int$ does not mean integration, but means a collection of.
To illustrate the notations defined above, let U = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10} and let
A be the fuzzy set $A = \{ i \mid i \text{ is close to } 5,\ i \in U \}$. If the membership function is defined as in
Figure 6.3, then the fuzzy set A can be written as:
A = 0/0 + 0.2/1 + 0.4/2 + 0.6/3 + 0.8/4 + 1/5 + 0.8/6 + 0.6/7 + 0.4/8 + 0.2/9 + 0/10
Figure 6.3 A definition of fuzzy set close to 5
If the fuzzy set is expressed by ordered pairs, it looks like:
A = {(0,0), (1,0.2), (2,0.4), (3,0.6), (4,0.8), (5,1.0), (6,0.8), (7,0.6), (8,0.4), (9,0.2), (10,0)}
Finally, if A is written as a vector, it has the following form:
A = (0, 0.2, 0.4, 0.6, 0.8, 1.0, 0.8, 0.6, 0.4, 0.2, 0)
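The three notations can be generated from a single membership function. The following is a minimal Python sketch, assuming that the membership values listed above come from a linear fall-off around 5; the function name `mu_close_to_5` is introduced here for illustration only and does not appear in the notes.

```python
# Universe U = {0, 1, ..., 10}
U = list(range(11))

def mu_close_to_5(i):
    """Assumed membership function: linear fall-off from 1 at i = 5 to 0 at i = 0 and i = 10."""
    return round(max(0.0, 1.0 - abs(i - 5) / 5), 10)

# 1. "Sum" notation: membership/element terms joined by + (neither symbol is arithmetic).
sum_notation = " + ".join(f"{mu_close_to_5(u):g}/{u}" for u in U)

# 2. Ordered pairs (element, membership degree).
pairs = [(u, mu_close_to_5(u)) for u in U]

# 3. Vector of membership degrees; zeros must be kept so positions match U.
vector = [mu_close_to_5(u) for u in U]

print(sum_notation)
print(pairs)
print(vector)
```

The vector form only makes sense together with a fixed ordering of U, which is why zero memberships cannot be dropped from it.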
Membership Functions
Any function $\mu(x) \to [0, 1]$ describes a membership function associated with some fuzzy
set. Which particular membership function is suitable for fuzzy modeling is
determined by the specific context. The most commonly used membership functions are
the triangular and trapezoidal ones, defined as:
Triangular Membership Functions
$$\mu(u) = \begin{cases} 0, & \text{if } u < a \\ \dfrac{u - a}{c - a}, & \text{if } u \in [a, c] \\ \dfrac{b - u}{b - c}, & \text{if } u \in [c, b] \\ 0, & \text{if } u > b \end{cases}$$
Trapezoidal Membership Functions
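$$\mu(u) = \begin{cases} 0, & \text{if } u < a \\ \dfrac{u - a}{m - a}, & \text{if } u \in [a, m] \\ 1, & \text{if } u \in [m, n] \\ \dfrac{b - u}{b - n}, & \text{if } u \in [n, b] \\ 0, & \text{if } u > b \end{cases}$$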
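The two piecewise definitions translate directly into code. The sketch below is one possible Python rendering, assuming strictly increasing parameters (a < c < b for the triangle, and a < m <= n < b for the trapezoid) so that no division by zero occurs; the function names are illustrative.

```python
def triangular(u, a, c, b):
    """Triangular membership: 0 outside [a, b], rising on [a, c], falling on [c, b]."""
    if u < a or u > b:
        return 0.0
    if u <= c:
        return (u - a) / (c - a)
    return (b - u) / (b - c)

def trapezoidal(u, a, m, n, b):
    """Trapezoidal membership: rising on [a, m], equal to 1 on [m, n], falling on [n, b]."""
    if u < a or u > b:
        return 0.0
    if u < m:
        return (u - a) / (m - a)
    if u <= n:
        return 1.0
    return (b - u) / (b - n)

# Example: a triangle with base [0, 10] and peak at 5.
print([triangular(i, 0, 5, 10) for i in range(11)])
print(trapezoidal(5, 0, 2, 4, 6))
```

A triangle is the special case of a trapezoid in which the plateau [m, n] shrinks to a single point.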
The values of the membership function are real numbers in the interval [0, 1], where 0
means that the object is not a member of the set and 1 means that it belongs entirely. Each
value of the function is called a membership degree. One way of defining a membership
function is through an analog function. Figure 6.4 shows three membership functions
representing three fuzzy sets labeled short, medium, and tall, all of them being
fuzzy values of the variable height.
Figure 6.4 Membership functions representing three fuzzy sets for the variable height
In Figure 6.4, the value 170 cm belongs to the fuzzy set medium to a degree of 0.2 and at
the same time to the set tall to a degree of 0.7.
The properties of a fuzzy set are uniquely determined by its membership function.
The elements of the universe whose membership is larger than zero are called the
support of the fuzzy set. If there is only one element in the support of the fuzzy set, then
this fuzzy set is called a fuzzy singleton.

$$supp(A) = \{u \mid u \in U,\ \mu_A(u) > 0\}$$

For a fuzzy set A on the universe U defined by a membership function $\mu_A(u)$, hgt(A) is
called the height of the fuzzy set and is defined by:

$$hgt(A) = \max_{u \in U} \{\mu_A(u)\}$$

If hgt(A) = 1, then the fuzzy set A is said to be normal, and all its elements that have a
membership degree of 1 are called the kernel of the fuzzy set. A fuzzy set is non-normal
if 0 < hgt(A) < 1. A non-normal fuzzy set can be normalized by dividing the membership
function $\mu_A(u)$ by the maximum membership grade, i.e.,

$$\mu_{NORM(A)}(u) = \frac{\mu_A(u)}{hgt(A)}$$
Figure 6.5 Support and kernel of a fuzzy set
For example, in Figure 6.6, the support of the fuzzy set medium temperature is the
interval (10, 30) on the Celsius scale. A fuzzy set A can be formulated entirely by its
support, that is, $A = \{\mu_A(u)/u \mid u \in supp(A)\}$.
Figure 6.6 Crisp and fuzzy set as subsets of a domain U
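Support, height, kernel and normalization are easy to compute for a discrete fuzzy set. The sketch below assumes a fuzzy set stored as a Python dictionary mapping elements to membership degrees; this representation and the sample set `A` are illustrative choices, not part of the notes.

```python
def support(A):
    """Elements with membership strictly greater than zero."""
    return {u for u, m in A.items() if m > 0}

def height(A):
    """Largest membership degree in the set."""
    return max(A.values())

def kernel(A):
    """Elements with membership exactly 1 (non-empty only for a normal set)."""
    return {u for u, m in A.items() if m == 1}

def normalize(A):
    """Divide every membership degree by the height of the set."""
    h = height(A)
    if h == 0:
        raise ValueError("cannot normalize a fuzzy set whose height is 0")
    return {u: m / h for u, m in A.items()}

# A hypothetical non-normal fuzzy set with height 0.5.
A = {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.25, 5: 0.0}
print(support(A), height(A), kernel(A))
print(normalize(A), kernel(normalize(A)))
```

After normalization the height becomes 1, so the kernel is no longer empty.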
Note: Crisp sets use clear-cut boundaries. Fuzzy sets use grades. For example, the
membership degrees to which two values, say 14.999 and 15.001, belong to a fuzzy set are
very close to each other, which reflects their closeness in the universe. If, however, there
is a crisp border between them, say at 15, the two values are assigned to different crisp
sets.
6.2 Fuzzy Operations
Similar to classical sets, there are some basic operations on fuzzy sets. Consider fuzzy sets
A and B on the universe U, whose membership functions are $\mu_A(u)$ and $\mu_B(u)$, with
$\mu_A(u), \mu_B(u) \in [0, 1]$. Then
Intersection, $A \cap B$: $\mu_{A \cap B}(u) = \mu_A(u) \wedge \mu_B(u) = T(A(u), B(u))$ for all u
from U, where $\wedge$ means MIN, and T(.) is a triangular norm, or T-norm for short.
Union, $A \cup B$: $\mu_{A \cup B}(u) = \mu_A(u) \vee \mu_B(u) = T^*(A(u), B(u))$ for all u from U,
where $\vee$ means MAX, and $T^*$(.) is a T-conorm.
Complement of A, not A, $\neg A$: $\mu_{not\,A}(u) = 1 - \mu_A(u)$ for all u from U
The properties that T-operators must satisfy are given below.
Let $T: [0, 1] \times [0, 1] \to [0, 1]$. T is a T-norm if and only if for all $x, y, z \in [0, 1]$
1. $T(x, y) = T(y, x)$ (commutative)
2. $T(x, y) \le T(x, z)$, if $y \le z$ (monotone)
3. $T(x, T(y, z)) = T(T(x, y), z)$ (associative)
4. $T(x, 1) = x$
Similarly, let $T^*: [0, 1] \times [0, 1] \to [0, 1]$. $T^*$ is a T-conorm if and only if for all
$x, y, z \in [0, 1]$
1. $T^*(x, y) = T^*(y, x)$ (commutative)
2. $T^*(x, y) \le T^*(x, z)$, if $y \le z$ (monotone)
3. $T^*(x, T^*(y, z)) = T^*(T^*(x, y), z)$ (associative)
4. $T^*(x, 0) = x$
The most popular pair of T-operators is
$T(x, y) = \min(x, y)$ and $T^*(x, y) = \max(x, y)$
Another pair of T-operators, called probabilistic operators, is
$T(x, y) = x y$ and $T^*(x, y) = x + y - x y$
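The four axioms can be spot-checked numerically for both operator pairs. The following Python sketch tests them on a coarse grid of values in [0, 1]; this is only a sanity check under a floating-point tolerance, not a proof, and the helper name `satisfies_axioms` is hypothetical.

```python
from itertools import product

def t_min(x, y):
    return min(x, y)

def s_max(x, y):
    return max(x, y)

def t_prod(x, y):
    return x * y

def s_prob(x, y):
    return x + y - x * y

def satisfies_axioms(op, identity, grid, tol=1e-9):
    """Check commutativity, monotonicity, associativity and the boundary
    condition op(x, identity) = x on every triple drawn from the grid."""
    for x, y, z in product(grid, repeat=3):
        if abs(op(x, y) - op(y, x)) > tol:
            return False
        if y <= z and op(x, y) > op(x, z) + tol:
            return False
        if abs(op(x, op(y, z)) - op(op(x, y), z)) > tol:
            return False
    return all(abs(op(x, identity) - x) <= tol for x in grid)

grid = [i / 10 for i in range(11)]
print(satisfies_axioms(t_min, 1, grid), satisfies_axioms(s_max, 0, grid))
print(satisfies_axioms(t_prod, 1, grid), satisfies_axioms(s_prob, 0, grid))
```

The identity element distinguishes the two families: 1 for a T-norm and 0 for a T-conorm. An averaging operator such as (x + y) / 2 fails the boundary condition and is therefore neither.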
Other Operations:
Equality, $A = B$: $\mu_A(u) = \mu_B(u)$ for all u from U
Concentration, CON(A): $\mu_{CON(A)}(u) = (\mu_A(u))^2$, for all u from U
Dilation, DIL(A): $\mu_{DIL(A)}(u) = (\mu_A(u))^{0.5}$, for all u from U
Subset, $A \subseteq B$: $\mu_A(u) \le \mu_B(u)$ for all u from U
Algebraic product, A.B: $\mu_{A.B}(u) = \mu_A(u) \cdot \mu_B(u)$, for all u from U
Bounded sum: $\min\{1, \mu_A(u) + \mu_B(u)\}$, for all u from U
Bounded difference: $\max\{0, \mu_A(u) - \mu_B(u)\}$, for all u from U
Bounded product: $\max\{0, \mu_A(u) + \mu_B(u) - 1\}$, for all u from U
Normalization, NORM(A): $\mu_{NORM(A)}(u) = \mu_A(u) / MAX\{\mu_A(u)\}$, for all u from U
Algebraic sum: $\mu_{A+B}(u) = \mu_A(u) + \mu_B(u)$, for all u from U
Algebraic difference: $\mu_{A-B}(u) = \mu_A(u) - \mu_B(u)$, for all u from U
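Several of these operations are one-liners on a discrete fuzzy set. The sketch below assumes fuzzy sets stored as dictionaries over the same universe, and it uses the standard forms of the bounded operations, in which the bounded sum is capped at 1 and the bounded difference and bounded product are floored at 0 so that results stay in [0, 1]. The sample sets are hypothetical.

```python
def con(A):
    """Concentration: square every membership degree."""
    return {u: m ** 2 for u, m in A.items()}

def dil(A):
    """Dilation: take the square root of every membership degree."""
    return {u: m ** 0.5 for u, m in A.items()}

def is_subset(A, B):
    """A is a subset of B if its membership never exceeds that of B."""
    return all(A[u] <= B[u] for u in A)

def bounded_sum(A, B):
    return {u: min(1.0, A[u] + B[u]) for u in A}

def bounded_difference(A, B):
    return {u: max(0.0, A[u] - B[u]) for u in A}

def bounded_product(A, B):
    return {u: max(0.0, A[u] + B[u] - 1.0) for u in A}

# Hypothetical fuzzy sets over the universe {1, 2, 3}.
A = {1: 0.25, 2: 1.0, 3: 0.5}
B = {1: 0.5, 2: 0.75, 3: 0.25}
print(con(A), dil(A))
print(bounded_sum(A, B), bounded_difference(A, B), bounded_product(A, B))
```

Because membership degrees lie in [0, 1], concentration always yields a subset of A and dilation a superset, which matches their usual linguistic reading as "very" and "more or less".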
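Entropy: following Kosko, the fuzziness of a fuzzy set A can be measured by its entropy
$E(A) = M(A \cap \neg A) / M(A \cup \neg A)$ [where $M(A) = \sum \mu_A(u),\ u \in U$]. The greater
the entropy, the greater the fuzziness. Crisp sets have an entropy of 0.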
Another formula for measuring the entropy of a fuzzy set A is
$$E(A) = -k \sum_i \{\mu_A(u_i) \log \mu_A(u_i) + \mu_{\neg A}(u_i) \log \mu_{\neg A}(u_i)\}$$, for all $u_i$ from U
where k > 0 is a constant.
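As an illustration, the logarithmic entropy can be computed as follows. This sketch makes several assumptions that the notes do not state: the formula is read as the sum of m log m + (1 - m) log(1 - m) over all elements with a leading minus sign, the logarithm is natural, and 0 log 0 is taken as 0.

```python
import math

def fuzzy_entropy(memberships, k=1.0):
    """Logarithmic entropy of a discrete fuzzy set (natural log assumed)."""
    def term(m):
        # Convention assumed here: 0 * log(0) = 0, so crisp degrees contribute nothing.
        if m in (0.0, 1.0):
            return 0.0
        return -(m * math.log(m) + (1 - m) * math.log(1 - m))
    return k * sum(term(m) for m in memberships)

print(fuzzy_entropy([0, 1, 1, 0]))       # a crisp set
print(fuzzy_entropy([0.5, 0.5]))         # maximally fuzzy degrees
print(fuzzy_entropy([0.9, 0.1]))         # nearly crisp degrees
```

Under these assumptions a crisp set has entropy 0, and each element contributes most when its membership degree is 0.5.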
Kosko B, Fuzzy Entropy and Conditioning. Information Sciences, 40 (1986), 165-174.
6.3 Fuzziness and Probability
Let us suppose that a particular problem is not well defined; the existing knowledge is
vague, and so forth. How to represent the problem in fuzzy terms? Should we use
probability representation or a fuzzy representation?
In probability theory an event $u \in U$ either happens or not; its probability represents the
chance that the event happens, that is, the chance that a random variable x takes the value
u. A probability p(u) can be represented by the ratio between the number of experiments
in a series in which u happens and the total number of experiments. For example, the
probability that it will rain on 31 August is 0.73, because 73 out of the 100 days on this
date over the last 100 years were recorded as rainy days. The probability density function
P(x) gives the probability of each of the possible values of a random variable x (event $u \in U$).
But 31 August comes and it is not quite clear whether it is rainy or not. The notion of
rainy can be represented as a fuzzy set A on the universe of rainfall, described by
its membership function $\mu_A$. In general, a membership coefficient $\mu_A(u)$ measures the
grade to which an event u, which has already happened, is a member of the concept
labeled A. The membership function represents the degree to which the total set U is
contained in the subset A. For example, 31 August was a rainy day to a degree of 0.6. So
the probability density function P(x) is a completely different concept from the
membership function of a fuzzy set A.
Consider another classical example, throwing a die. Figure 6.7 shows, in tabular and
graphical form, the probability density function of the number obtained in each throw,
and the membership function of the concept that the number is small.
Dice                            1    2    3    4    5    6
Probability density function    1/6  1/6  1/6  1/6  1/6  1/6
$\mu_A$ (small number)          1    0.5  0    0    0    0
Figure 6.7 Probability density function for throwing dice and membership functions of the
concepts small, medium and large
In the above example two events can be distinguished: a crisp event, achieving the number 2,
and a fuzzy event, achieving a small number. The probability of the crisp event is
p(x = 2) = 1/6, but it is also possible to calculate the probability p(A) of a fuzzy event,
here the probability p(x = small) that a small number will be achieved.
In general, the probability of a fuzzy event is calculated as:
$$p(A) = \sum_{u \in U} \mu_A(u) / n$$
for a uniformly distributed variable x having n discrete values, and
$$p(A) = \sum_{u \in U} \mu_A(u) \cdot p(x = u)$$
for the general case.
In the above example, for the membership functions shown in Figure 6.7, we calculate
p(x = small) = 1.5/6; p(x = medium) = 2/6; p(x = large) = 2.5/6
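These three values can be reproduced with the general formula. In the Python sketch below, only the membership values for small come from the table; the values for medium and large are assumptions chosen to agree with the stated results, since the graph of Figure 6.7 is not reproduced here.

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]
p = {u: Fraction(1, 6) for u in faces}   # fair die

half = Fraction(1, 2)
small = {1: 1, 2: half, 3: 0, 4: 0, 5: 0, 6: 0}       # from the table
medium = {1: 0, 2: half, 3: 1, 4: half, 5: 0, 6: 0}   # assumed shape
large = {1: 0, 2: 0, 3: 0, 4: half, 5: 1, 6: 1}       # assumed shape

def prob_fuzzy_event(mu, p):
    """p(A) = sum over u of mu_A(u) * p(x = u)."""
    return sum(mu[u] * p[u] for u in p)

print(prob_fuzzy_event(small, p))    # 1.5/6 = 1/4
print(prob_fuzzy_event(medium, p))   # 2/6 = 1/3
print(prob_fuzzy_event(large, p))    # 2.5/6 = 5/12
```

With these assumed shapes the three memberships add up to 1 on every face, so the three probabilities also add up to 1.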
Probabilities of fuzzy events have the same properties as probabilities of crisp events. For
example, if A and B are two fuzzy events, then the following holds:
$$A \subseteq B \Rightarrow p(A) \le p(B)$$
$$p(\neg A) = 1 - p(A)$$
$$p(A \cup B) = p(A) + p(B) - p(A \cap B)$$
The conditional probability of fuzzy events is defined in the same way as the conditional
probability of crisp events. For example, p(A | B) denotes the conditional probability that
the fuzzy event A happens given that the fuzzy event B has happened. So fuzziness can also
be used to represent future events, either by calculating the probability of the fuzzy events
or by applying existing knowledge, for example, tomorrow's value of a stock index will be
high if today's value is moderate and yesterday's value was low.
Conceptualizing in Fuzzy Terms
One of the most important steps towards using fuzzy logic and fuzzy systems for
problem-solving is representing the problem in fuzzy terms. We often use linguistic terms
in the process of identification and specification of a problem, or in the process of
articulating heuristic rules. For example, the linguistic terms higher, lower, very strong,
slowly, much dependent and less dependent, low, high, good, bad, etc. are fuzzy concepts
representable as fuzzy sets.
The term linguistic variable is used to denote a variable which takes fuzzy values and has
linguistic meaning. For example, the term velocity is a linguistic variable if it takes as
values low, moderate or high. Linguistic values, also called fuzzy labels or fuzzy
concepts, have semantic meaning and can be expressed numerically by their membership
functions. For example, a fuzzy variable score may have as its universe all the numbers
between 50 and 100.
Linguistic variables can be quantitative, for example, temperature (low, high); time (early,
late); spatial (around the corner); or qualitative, for example, truth, certainty, belief,
etc.
The process of representing a linguistic variable as a set of linguistic values is called
fuzzy quantization.
6.4 Fuzzy Relations
Let us consider the notion of ordered pairs. When making pairs of anything, the order of
the elements is usually of great importance, for example, the points (2, 3) and (3, 2) in
(x, y) plane are different. A pair of elements that occur in a specified order is called an
ordered pair. A relation is a set of ordered pairs.
Relations express connections between sets. A crisp relation represents the presence or
absence of association, interaction or interconnectedness between the elements of two or
more sets. If this concept is generalized, allowing various degrees or strengths of relations
between elements, we get fuzzy relations.
As an example, consider two sets X and Y: X = {1, 2, 3}, Y = {2, 3, 4} and the relation
R: x is smaller than y. Then R = {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)} is a set of
pairs and a binary relation. The relational matrix, or membership array, in this crisp case
consists only of 1s and 0s.
x/y   2   3   4
1     1   1   1
2     0   1   1
3     0   0   1
The elements of the relation matrix are degrees of membership $\mu_R(x, y)$, that is,
possibilities or degrees of belonging of a specific pair (x, y). Thus, for example, the
entry in row 3, column 1 of the matrix, which corresponds to the pair (3, 2), belongs with
a degree of 0 to the relation x is smaller than y. This is a typical example of a crisp
relation.
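The relational matrix of this crisp relation can be generated mechanically. A minimal Python sketch, with rows indexed by the elements of X and columns by the elements of Y:

```python
X = [1, 2, 3]
Y = [2, 3, 4]

# Membership is 1 where "x is smaller than y" holds and 0 otherwise.
R = [[1 if x < y else 0 for y in Y] for x in X]

# The same relation written as a set of ordered pairs.
pairs = [(x, y) for x in X for y in Y if x < y]

for x, row in zip(X, R):
    print(x, row)
print(pairs)
```

A fuzzy relation uses the same layout, with entries anywhere in [0, 1] instead of only 0 and 1.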
Consider another example, when relation R is given as an association or interconnection
between fruit color and state. X = {green, yellow, red}, Y = {unripe, semiripe, ripe}.
R unripe semiripe ripe
Green 1 0 0
Yellow 0 1 0
red 0 0 1
This relational matrix can be interpreted as a notation or model of an existing empirical set
of IF-THEN rules:
$R_1$ = IF (the tomato is) green THEN (it is) unripe
$R_2$ = IF yellow THEN semiripe
$R_3$ = IF red THEN ripe
However, this matrix is a crisp one and not in total agreement with our experience. A
better interconnection between fruit color and state may be given by the following fuzzy
relation matrix:
R unripe semiripe ripe
Green 1 0.5 0
Yellow 0.3 1 0.4
red 0 0.2 1
One of the most useful operations on fuzzy relations is composition, which combines
fuzzy relations between different variables. Suppose R and S are fuzzy relations in the
domains $U \times V$ and $V \times W$; then the composition of R and S, denoted by R o S, is a fuzzy
relation in the domain $U \times W$ which is defined by
$$(R \circ S)(u, w) = \bigvee_{v \in V} (R(u, v) \wedge S(v, w))$$
As described earlier, we usually use the minimum for the fuzzy intersection and the maximum
for the fuzzy union. This kind of composition is called the max-min composition. Fuzzy
relation composition is very important in fuzzy modeling and control, because it plays an
essential role in approximate reasoning. Consider the following question:
If x is A, then y is B
If x is A', how is y?
First we define the fuzzy relationship between x and y using fuzzy membership functions:
$$R(x, y) = (A \to B)(x, y) = (A(x) \wedge B(y)) \vee (1 - A(x))$$
where A and B are fuzzy sets on the universes X and Y, $x \in X$, $y \in Y$, and A(x) and B(y) are
the membership functions for A and B.
With the help of fuzzy composition using approximate reasoning, we can get the answer to
the above question. Suppose y is B' if x is A', then
$$B'(y) = A'(x) \circ R(x, y) = \bigvee_{x \in X} [A'(x) \wedge R(x, y)] = \bigvee_{x \in X} [A'(x) \wedge ((A(x) \wedge B(y)) \vee (1 - A(x)))]$$
A simple illustration:
Let X = Y = {1, 2, 3, 4, 5}. Large, Small and quite Small are three fuzzy sets on X and Y,
which are defined by the following membership functions:
Large = (0, 0, 0.1, 0.6, 1)
Small = (1, 0.7, 0.4, 0, 0)
quite Small = (1, 0.6, 0.4, 0.2, 0)
The problem is: If x is Small, then y is Large
If x is quite Small, how is y?
Using the fuzzy implication defined above, we get
$$R(x, y) = (Small(x) \wedge Large(y)) \vee (1 - Small(x)) = \begin{bmatrix} 0 & 0 & 0.1 & 0.6 & 1 \\ 0.3 & 0.3 & 0.3 & 0.6 & 0.7 \\ 0.6 & 0.6 & 0.6 & 0.6 & 0.6 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix}$$
Given a new instance x is quite Small, we can get y with the help of fuzzy composition
according to the above equation
$$y = quiteSmall(x) \circ R(x, y) = (1, 0.6, 0.4, 0.2, 0) \circ \begin{bmatrix} 0 & 0 & 0.1 & 0.6 & 1 \\ 0.3 & 0.3 & 0.3 & 0.6 & 0.7 \\ 0.6 & 0.6 & 0.6 & 0.6 & 0.6 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix} = (0.4, 0.4, 0.4, 0.6, 1)$$
Compared to the definition of fuzzy set Large, y can be interpreted as quite Large. This
result is in good agreement with our intuition.
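The illustration can be checked numerically. The following Python sketch applies the implication R(x, y) = (A(x) AND B(y)) OR (1 - A(x)), with MIN for AND and MAX for OR, and then the max-min composition; the function names `implication` and `compose` are illustrative.

```python
large = [0, 0, 0.1, 0.6, 1]
small = [1, 0.7, 0.4, 0, 0]
quite_small = [1, 0.6, 0.4, 0.2, 0]

def implication(A, B):
    """R(x, y) = max(min(A(x), B(y)), 1 - A(x)); one row per x, one column per y."""
    return [[max(min(a, b), 1 - a) for b in B] for a in A]

def compose(A_prime, R):
    """Max-min composition: B'(y) = max over x of min(A'(x), R(x, y))."""
    n_cols = len(R[0])
    return [max(min(a, row[j]) for a, row in zip(A_prime, R))
            for j in range(n_cols)]

R = implication(small, large)
y = compose(quite_small, R)

for row in R:
    print([round(v, 2) for v in row])
print([round(v, 2) for v in y])
```

The rounding is only there to hide floating-point noise such as 1 - 0.7 evaluating to 0.30000000000000004.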
6.5 Fuzzy Inference
Fuzzy inference is a process of matching, that is, matching a domain space with a solution
space. It is also matching in a narrow sense, that is, matching a new fact, for example, X,
with a set of rules $R_i$ (i = 1, 2, ..., n) and inferring a solution Y, the whole inference
process being a chain of such matches.
The matching in symbolic AI systems is exact matching. If we have $X \to Y$, and X is
present, exactly Y will be inferred. But in fuzzy representation we may not have exact
values of the input variables. Some fuzzy input value X' is supplied instead. What then will
the inferred result Y' be?
If X and X' are similar, will Y and Y' be similar as well, and how similar? In a
special case, if X = X', will Y = Y'?
In traditional logic an expression such as IF A, THEN B is written as $A \to B$, read as A
implies B, or B follows from A. Such an implication is defined by the following truth table:
A   B   A → B
T   T   T
T   F   F
F   T   T
F   F   T
The following identity is used to calculate the truth table:
$$A \to B \equiv A^C \vee B$$
Note the strangeness of the last two rows. Conditional statements or implications sound
paradoxical when the components are not related. For example, the statement
IF 2 × 2 = 5, THEN cows are horses is true (row 4 in the truth table), but IF 2 × 2 = 4, THEN
cows are horses is false (row 2).
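The truth table follows directly from the identity that A implies B is equivalent to (not A) or B. A short Python sketch that enumerates the four rows in the same order as the table:

```python
from itertools import product

def implies(a, b):
    """Material implication computed as (not A) or B."""
    return (not a) or b

table = [(a, b, implies(a, b)) for a, b in product([True, False], repeat=2)]

for a, b, result in table:
    print("T" if a else "F", "T" if b else "F", "T" if result else "F")
```

The only row that evaluates to false is the one with a true premise and a false conclusion, which is why a false premise makes any implication true.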
In traditional (Boolean) logic there does not have to be any real causality between the IF
part and the THEN part. It is different in human reasoning. Our rules express cause-effect
relations; fuzzy logic is a tool for transferring such structured knowledge into workable
algorithms. Thus fuzzy logic cannot be and is not Boolean logic. It must go beyond crisp
logic. This is because there is no effect (output) without a cause (input).
Exercises
1. Construct a fuzzy relation matrix for the relation R: very far, for the two
crisp sets X = {Auckland, London, Tokyo}, Y = {Melbourne, Athens, Paris, Delhi,
New York}.
2. Find the relational matrix of the concept a young tall man. The concept a young
tall man means young AND tall man. Therefore two fuzzy sets, young man
and tall man, are defined, and then the intersection (MIN or any other) operator is
applied to these two sets defined on the different universes of age and height:
$$\mu_R(age, height) = MIN\{\mu_1(age), \mu_2(height)\}$$
$S_1 = \{15, 20, 25, 30, 35\}$ (age in years), $S_2 = \{170, 175, 180, 185, 190\}$ (height in cm)
$\mu_1 = [0 \;\; 0.5 \;\; 1 \;\; 0.5 \;\; 0]^T$
$\mu_2 = [0 \;\; 0.5 \;\; 1 \;\; 1 \;\; 1]^T$
Then $\mu_R = \mu_1 \times \mu_2^T$, where each entry is the MIN of the two membership degrees.
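One way to approach Exercise 2 is sketched below in Python. It assumes the exercise data are read as young = (0, 0.5, 1, 0.5, 0) over the ages 15 to 35 and tall = (0, 0.5, 1, 1, 1) over the heights 170 to 190, and that MIN is used as the intersection operator; other T-norms would give other matrices.

```python
ages = [15, 20, 25, 30, 35]             # S1, in years
heights = [170, 175, 180, 185, 190]     # S2, in cm

mu_young = [0, 0.5, 1, 0.5, 0]          # assumed reading of mu_1 over S1
mu_tall = [0, 0.5, 1, 1, 1]             # assumed reading of mu_2 over S2

# Relational matrix: rows follow ages, columns follow heights.
R = [[min(a, h) for h in mu_tall] for a in mu_young]

for age, row in zip(ages, R):
    print(age, row)
```

Under this reading, a 25-year-old who is 180 cm or taller is the only kind of person who belongs to young tall man with degree 1.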