
Université de Tunis

Institut Supérieur de Gestion

DOCTORAL THESIS
submitted in fulfilment of the requirements for the degree of Doctor in
NAME OF THE SPECIALTY

Option
Name of the Option

TITLE

Subtitle

LAST NAME FIRST NAME

Defended on DAY MONTH YEAR before a jury composed of:

LAST NAME FIRST NAME    Rank, Institution    President

LAST NAME FIRST NAME    Rank, Institution    Reviewer
LAST NAME FIRST NAME    Rank, Institution    Reviewer
LAST NAME FIRST NAME    Rank, Institution    Member
LAST NAME FIRST NAME    Rank, Institution    Thesis supervisor

Laboratory/Research unit:
Acknowledgments

Such descriptions give rise to speculation: are judgment-laden adjectives like
“intense” and “hardheaded” simply a way to describe traits that today might be
categorized under juvenile behavioral disorder? A December 2001 Wired magazine
article titled “The Geek Syndrome” paints the portrait of several scientifically
gifted children diagnosed with high-functioning autism or Asperger Syndrome.

...

Dan Chess, a fellow classmate in the Columbia Science Honors Program, recalls
Richard Stallman seeming a bit weird even among the students who shared
a similar lust for math and science. “We were all geeks and nerds, but he was
unusually poorly adjusted,” recalls Chess, now a mathematics professor at Hunter
College. “He was also smart as shit. I’ve known a lot of smart people, but I think
he was the smartest person I’ve ever known.”

...

At first, Stallman viewed these notices with alarm. Rare was the software
program that didn’t borrow source code from past programs, and yet, with a single
stroke of the president’s pen, Congress had given programmers and companies
the power to assert individual authorship over communally built programs. It also
injected a dose of formality into what had otherwise been an informal system.

Contents

Introduction

Part I: Theoretical Aspects

1 First Chapter
    1.1 Introduction
    1.2 The Dendritic Cell Algorithm
        1.2.1 Biological Background
        1.2.2 Algorithmic Details
    1.3 Conclusion

2 Second Chapter
    2.1 Introduction
    2.2 Rough Set Based Approach for Feature Selection
        2.2.1 Decision and Information Systems
        2.2.2 Indiscernibility Relation
    2.3 Conclusion

Part II: Contributions

3 Chapter 3
    3.1 Introduction
    3.2 Rough DCA Based Methods
    3.3 Conclusion

4 Chapter 4
    4.1 Introduction
    4.2 Why Adopting Fuzzy-Rough Sets for Data Pre-processing?
    4.3 Conclusion

5 Chapter 5
    5.1 Introduction
    5.2 Problem Statement
    5.3 Conclusion

6 Chapter 6
    6.1 Introduction
    6.2 Overview of the Maintenance Database Policies
    6.3 Conclusion

7 Chapter 7
    7.1 Introduction
    7.2 Summary
    7.3 Conclusion

Conclusion

Appendix

A Appendix 1
    A.1 Introduction
    A.2 Conclusion

References

List of Figures

1.1 DCs Behavior
3.1 The Rough DCA Proposed Model

List of Tables

2.1 Information System
2.2 Decision System

List of Algorithms

1.1 The Dendritic Cell Algorithm
Introduction

... or Artificial Immune System (AIS) (Eiben & Smith, 2007).

At the heart of the human immune system is its ability to discriminate between
the body’s own cells, the self elements, and the foreign cells, the non-self elements,
through a process called “self-nonself discrimination”. This fundamental process is
performed by several AIS algorithms. However, it was noted that these algorithms
cannot achieve the same high performance as the human immune system. Hence,
a more recent immunological theory, grounded in more rigorous and up-to-date
immunology and known as the “Danger Theory” (DT), was proposed in (Matzinger,
2001). The DT states that the recognition of a pathogen is based on environmental
context rather than on the simple self-nonself principle.

...

Aim and Scope

As mentioned, the dendritic cell algorithm is a recent addition to the field of
artificial immune systems. As a new paradigm, more work has to be done on
the algorithm in order to check its properties and its performance as a binary
classifier. In this context, the main objective of this Thesis is to investigate a
number of algorithmic properties of the DCA, in order to improve the dendritic
cell algorithm’s applicability and accessibility to future users. With respect to this
objective, the main contributions of this dissertation are listed as follows:


...

• Contribution 1.

• ...

• Contribution n.

Thesis Outline

This Thesis is structured into two main parts.

The first part, Theoretical Aspects, is composed of two main Chapters which
are the following:

• Chapter 1: . . .

• Chapter 2: . . .

The second part of this Thesis presents our contributions. We can decompose
this part into three main blocks. The first block is composed of two Chapters
dealing with newly developed danger theory approaches based on robust data
pre-processing modules:

• Chapter 3: . . .

• Chapter 4: . . .

The second block is composed of two Chapters and deals with the development
of new DCAs handling the problem of the algorithm’s sensitivity to the input
class data order. We hypothesize that the causes of such sensitivity may be tied
either to the DCA context assessment phase (first hypothesis) or to the DCA
detection phase (second hypothesis). In this light, we propose to investigate the
first hypothesis in Chapter 5 and the second hypothesis in Chapter 6.

• Chapter 5: . . .

• Chapter 6: . . .

Lastly, the third block builds on all the previously studied parts. This
block deals with the development of a more general DCA taking all the studied
facts into account. The result is a more robust danger theory classifier based
on dendritic cells within an imprecise framework; it is presented in Chapter 7.

• Chapter 7: . . .

Finally, our dissertation ends with a conclusion and an Appendix. The conclusion
summarizes all the work presented in this report and proposes further work
to be done on the dendritic cell algorithm.

An Appendix presenting the main concepts of fuzzy set theory is provided,
since the latter theory is used to develop our fuzzy DCA versions.
Part I
Theoretical Aspects

Part I presents the theoretical aspects of this Thesis. It provides the
necessary background regarding the basic concepts of the danger
theory and, more precisely, the dendritic cell algorithm. It also
introduces the theories of rough sets and fuzzy-rough sets as the main
techniques adopted in this dissertation.
Chapter 1
First Chapter

1.1 Introduction

Artificial Immune Systems (AISs) can be defined as a computational paradigm
that is inspired by theoretical immunology, observed immune functions, principles
and mechanisms (Zuben & Timmis, 2003). The latest immunological theory in the
AIS field is called the “Danger Theory” (DT) (Matzinger, 2001). The DT, which
has become popular among immunologists, is based on the behavior of special
immune cells called the “Dendritic Cells” (DCs). The behavior of these special
immune cells inspired the development of an immune algorithm termed the
“Dendritic Cell Algorithm” (DCA).

In this Chapter, we start by presenting the artificial immune system field.
Next, we detail the basics of the danger theory. Finally, in Section 1.4, we describe
the dendritic cell algorithm, including its biological principles.

1.2 The Dendritic Cell Algorithm

As previously stated, the most prominent players of the DT are the dendritic cells
(DCs). The DCs’ behavior inspired the development of an immune algorithm
termed the dendritic cell algorithm (DCA) (Greensmith & Aickelin, 2005). Before
describing the functioning of the algorithm, we give a general overview of the
biological principles used by the DCA.


1.2.1 Biological Background

A DC is a type of antigen-presenting cell. DCs are in charge of capturing,
processing and presenting antigens to T-cells. They also express receptors on their
surfaces to receive signals from their neighborhood. The behavior of DCs depends
on the concentration of the signals received. As a result, they differentiate into
three different maturity levels, described as follows (Lutz & Schuler, 2002):

• Immature DCs: On arrival in the tissue, DCs are found in an immature
state. Here, immature DCs (iDCs) collect antigens, which could be “safe”
molecules or something foreign. Furthermore, DCs can collect and sense
the various signals that may be present in the tissue. Receipt of signals
causes changes to the function, morphology and behavior of the iDC. In other
words, the relative proportions and potency of the different signals lead to a
full or partial maturation state of iDCs.

• Mature DCs: For an iDC to become a “mature DC” (mDC), the iDC must
be exposed to a greater quantity of either pathogen-associated molecular
patterns (PAMPs) or danger signals (DSs) than safe signals (SSs). Sufficient
exposure to PAMPs and DSs causes the DC to cease antigen collection and
migrate from the tissue to the lymph node. Most importantly, mDCs produce
an inflammatory cytokine called “interleukin-12” which stimulates T-cell
activation in order to be reactive to antigen presentation. Additionally, mDCs
produce costimulatory molecules (CSMs) which are known to facilitate the
antigen presentation process.

• Semi-mature DCs: Under apoptotic conditions, exposure to
SSs diverts the iDC to become a “semi-mature DC” (smDC). smDCs appear
morphologically very similar to mDCs and can present antigens, yet they do
not have the ability to activate T-cells. The smDC produces “interleukin-10”
which suppresses T-cells matching the presented antigen. Antigens collected
with SSs are presented in a tolerogenic context and lead to unresponsiveness
to those antigens.

Switching from one DC state to another depends upon the signals received during
the initial (iDC) state. This is shown in Figure 1.1. The migration to the mature
state or to the semi-mature state depends on the concentration of the input signals
received from the environment. Immunologically, if the concentration of SSs is
greater than that of the other signal categories, then DCs migrate to the semi-mature
state. However, if the concentration of PAMPs and DSs is greater than the
concentration of SSs, then DCs migrate to the mature context.

Figure 1.1: DCs Behavior
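To make this migration rule concrete, the following minimal Python sketch maps accumulated signal concentrations to a final DC state. It is only an illustrative reading of the rule above (treating “greater than the other categories” as greater than the combined PAMP and danger concentrations); the function name and its inputs are assumptions for this example, not part of the DCA specification.

def dc_final_state(pamp: float, danger: float, safe: float) -> str:
    """Illustrative mapping from accumulated signal concentrations to the
    final maturity state of a dendritic cell."""
    if safe > pamp + danger:
        return "semi-mature"   # tolerogenic context (interleukin-10)
    return "mature"            # inflammatory context (interleukin-12)

print(dc_final_state(pamp=0.8, danger=0.5, safe=0.2))   # -> mature
print(dc_final_state(pamp=0.1, danger=0.1, safe=0.9))   # -> semi-mature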

1.2.2 Algorithmic Details

To transform the abstract model of DC biology into an immune-inspired algorithm,
it must be formalized into the structure of a generic algorithm and into a set
of logical processes. A generic form of the DCA binary classifier is described by
Algorithm 1.1.

For the ease of analysis, the algorithm is divided into the following four main
phases:

1. Pre-Processing & Initialization phase - Line 1 to Line 5;

2. Detection phase - Line 6 to Line 13;

3. Context Assessment phase - Line 14 to Line 19;

4. Classification phase - Line 20 to Line 28;

...
Algorithm 1.1 The Dendritic Cell Algorithm


1: input: signals from all categories and antigens;
2: output: antigens plus context values;
3: for each DC do /* Pre-processing & Initialization phase */
4: initialize DC;
5: end for
6: while CSM output signal < migration threshold do /* Detection phase */
7: get antigens;
8: store antigens;
9: get signals;
10: calculate interim output signals;
11: update cumulative output signals;
12: end while
13: cell location update to lymph node;
14: if semi-mature output > mature output then /*Context Assessment phase */
15: cell context is assigned as 0;
16: else
17: cell context is assigned as 1;
18: end if
19: print collected antigens plus cell contexts;
20: for all antigens in total list do /* Classification phase */
21: increment antigen count for this antigen type;
22: if antigen context equals 1 then
23: increment antigen type mature count;
24: end if
25: end for
26: for all antigen types do
27: MCAV of antigen type = mature count / antigen count;
28: end for

1.3 Conclusion

In this Chapter, we have presented the basic notions of the danger theory and, in
particular, the DCA. We have also discussed the major DCA works, where we have
clarified the strengths and the weaknesses of the algorithm. In the next Chapter,
we will present the basics of rough set and fuzzy-rough set theories as the main
tools to be used for the data pre-processing phase of our newly proposed DCA
versions.
Chapter 2
Second Chapter

2.1 Introduction

Data reduction is a main point of interest across a wide variety of fields. In fact,
focusing on this step is crucial, as it can be a source of significant information
loss. Many techniques were proposed in the literature to achieve the task of data
reduction. However, most of them tend to destroy the underlying semantics of the
features after reduction or require additional information about the given data set
for thresholding. Thus, it seems necessary to think about a technique that can, on
the one hand, reduce data dimensionality using only information contained within
the data set and, on the other hand, preserve the meaning of the features.
Rough Set Theory (RST) and Fuzzy-Rough Set Theory (FRST) can be used as
such tools to discover data dependencies and to reduce the number of attributes
contained in a data set using the data alone, requiring no additional information
(Jensen, 2005).

This Chapter focuses on introducing RST and FRST for feature selection. In
the next Section, a brief review of some data reduction tools is proposed. In
Section 2.3, the basic concepts of RST are highlighted and the fundamentals of
FRST are presented next in Section 2.4.


2.2 Rough Set Based Approach for Feature Selection

In this Section, the rudiments of RST will be outlined and the basic concepts of
the theory will be illustrated by a simple tutorial example.

2.2.1 Decision and Information Systems

Data are represented as a table where each row represents an object and where
each column represents an attribute that can be measured for each object. Such a
table is called an “Information System” (IS) or an “Information Table” (IT). To
fit this definition to our research field, an information table can be seen as a
representation of antigens which are defined via a set of attributes. Formally, an
information system can be defined as a pair IS = (U, A) where U = {x1, x2, . . . , xn}
is a non-empty, finite set of objects called the “universe” and A = {a1, a2, . . . , ak}
is a non-empty, finite set of “condition” attributes. In supervised learning, a special
case of the defined information table is considered, called a “Decision Table”
(DT) or a “Decision System” (DS). A DT is an information system of the form
IS = (U, A ∪ {d}), where d ∉ A is a distinguished attribute called the “decision”.
The value set of d is denoted θ = {d1, d2, . . . , ds}.

Example 2.1 Using the terminology of RST, the data set presented in Table 2.1
can be considered as an information system IS = (U, A) consisting of 4 conditional
features (a, b, c, d) and 8 objects. To illustrate an example of a decision system,
Table 2.2 is used. It consists of 4 conditional features (a, b, c, d), 1 decision
feature (e) and 8 objects.

Table 2.1: Information System

    x ∈ U   a   b   c   d
    x1      1   0   2   2
    x2      0   1   1   1
    x3      2   0   0   1
    x4      1   1   0   2
    x5      1   0   2   0
    x6      2   2   0   1
    x7      2   1   1   1
    x8      0   1   1   0

Table 2.2: Decision System

    x ∈ U   a   b   c   d   e
    x1      1   0   2   2   0
    x2      0   1   1   1   2
    x3      2   0   0   1   1
    x4      1   1   0   2   2
    x5      1   0   2   0   1
    x6      2   2   0   1   1
    x7      2   1   1   1   2
    x8      0   1   1   0   1

2.2.2 Indiscernibility Relation

The indiscernibility relation is the mathematical basis of rough set theory. Every
object of the universe is described by a certain amount of information expressed
by means of some attributes used for object description. Objects characterized
by the same information are indiscernible in view of the available information
about them. For every set of attributes P ⊆ A, an indiscernibility relation, denoted
by IND(P) or U/P, is defined in the following way: two objects, xi and xj, are
indiscernible by the set of attributes P in A if p(xi) = p(xj) for every p ∈ P. In
other words, two objects are considered to be indiscernible or equivalent if and
only if they have the same values for all attributes in the set.

The equivalence class of IND(P) is called an elementary set in P because it
represents the smallest discernible group of objects. For any element xi of U, the
equivalence class of xi in relation IND(P) is represented as [xi]IND(P). For every
object xj ∈ U, we will use ai(xj) to denote the value of a condition attribute ai for
an object xj. Similarly, d(xj) is the value of the decision attribute for an object xj.
We further extend these notations to a set of attributes P ⊆ A by defining P(xj)
to be the value tuple of the attributes in P for an object xj. The indiscernibility
relation based on a subset of the condition attributes P, denoted by IND(P), is
defined as follows:

IND(P) = U/P = {[xj]P | xj ∈ U}                                      (2.1)

where [xj]P = {xi | P(xi) = P(xj)}                                   (2.2)

The indiscernibility relation based on the decision attribute d, denoted by IND{d},
is defined as follows:

IND{d} = U/{d} = {[xj]{d} | xj ∈ U}                                  (2.3)

Example 2.2 In order to illustrate how the decision table from Table 2.2 defines
an indiscernibility relation, we consider the following three non-empty subsets of
the conditional attributes: {a}, {b, c} and {a, b, c}. The relation IND then defines
three partitions of the universe:

IND(a) = {{x1, x4, x5}, {x2, x8}, {x3, x6, x7}}
IND(b, c) = {{x3}, {x1, x5}, {x4}, {x2, x7, x8}, {x6}}
IND(a, b, c) = {{x1, x5}, {x2, x8}, {x3}, {x4}, {x6}, {x7}}

If we consider the set {a}, the objects x1, x4 and x5 belong to the same
equivalence class; they are indiscernible.
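As a quick check of Example 2.2, the short Python sketch below builds the decision system of Table 2.2 as a plain dictionary and computes the partitions U/P for a few attribute subsets. The data layout is an illustrative choice made for this example, not a structure taken from the thesis.

from collections import defaultdict

# Decision system of Table 2.2: condition attributes a, b, c, d and decision e.
U = {
    "x1": {"a": 1, "b": 0, "c": 2, "d": 2, "e": 0},
    "x2": {"a": 0, "b": 1, "c": 1, "d": 1, "e": 2},
    "x3": {"a": 2, "b": 0, "c": 0, "d": 1, "e": 1},
    "x4": {"a": 1, "b": 1, "c": 0, "d": 2, "e": 2},
    "x5": {"a": 1, "b": 0, "c": 2, "d": 0, "e": 1},
    "x6": {"a": 2, "b": 2, "c": 0, "d": 1, "e": 1},
    "x7": {"a": 2, "b": 1, "c": 1, "d": 1, "e": 2},
    "x8": {"a": 0, "b": 1, "c": 1, "d": 0, "e": 1},
}

def ind(universe, attrs):
    """Partition the universe into equivalence classes of IND(attrs): objects
    with identical value tuples on the given attributes fall into one class."""
    classes = defaultdict(set)
    for obj, values in universe.items():
        key = tuple(values[a] for a in attrs)   # P(x), the value tuple on P
        classes[key].add(obj)
    return list(classes.values())

print(ind(U, ["a"]))            # three classes: {x1,x4,x5}, {x2,x8}, {x3,x6,x7}
print(ind(U, ["a", "b", "c"]))  # six classes, matching Example 2.2

Running the script reproduces the partitions listed in Example 2.2.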

2.3 Conclusion

In this Chapter, both rough sets and fuzzy-rough sets for feature selection were
introduced. The application of these theories forms the main contribution of this
Thesis. New hybridized models will be presented in the second part of this
dissertation, where the mentioned theories are combined with the DCA in order
to overcome the DCA limitations linked to its data pre-processing step.
Part II
Contributions

Part II presents the contributions of this Thesis. This part focuses on
presenting our newly developed approaches addressing some of the
mentioned DCA limitations. The first two Chapters present new
data pre-processing modules dedicated to the standard version of the
dendritic cell algorithm. These developed modules are based on the
theory of rough sets and the theory of fuzzy-rough sets. Next, the
following two Chapters detail solutions to overcome the DCA’s
sensitivity to the input class data order. The proposed algorithms are
based on the use of fuzzy set theory, fuzzy clustering techniques and
a database maintenance technique. Finally, the last Chapter deals
with the development of a more general DCA taking into account all
the previously studied parts.
Chapter 3
Chapter 3

3.1 Introduction

In this Chapter, we aim to investigate the DCA data pre-processing phase, including
the step of feature selection/reduction and the step of signal categorization.
We aim to detect and to handle the shortcomings of these two steps. To do so,
we suggest applying Rough Set Theory (RST) as a feature selection and signal
categorization technique in the DCA data pre-processing phase. This results
in different new automated DCA approaches that will be compared to each other
while pointing out their characteristics.

This Chapter is organized as follows: Firstly, criticisms of the standard DCA
data pre-processing module are given in Section 3.2. Secondly, the development
of the proposed rough pre-processing modules is based on a set of hypotheses.
These hypotheses are presented in Section 3.3. Then, the proposed rough DCA
hybrid approaches are explained in detail in Sections 3.4, 3.5 and 3.6.

3.2 Rough DCA Based Methods

To handle the mentioned drawbacks of the standard DCA data pre-processing
module, we propose three automated DCAs based on the theory of rough sets
(RST). The difference between the three proposed rough DCAs lies in a set of
hypotheses that will be tested later in the experimental part of this Chapter. The
drawn hypotheses are constructed as follows:

• H1: Categorizing different features to different signals improves the
performance of the dendritic cell algorithm. We will show that assigning
a specific signal category to each selected feature leads to better performance
than assigning the same attribute to both SSs and PAMPs.

• H2: The use of a rough set heuristic can preserve the DCA’s main
characteristic, namely its lightweight processing time. This will be achieved
by finding a trade-off between generating satisfactory classification results
and preserving the lightweight nature of the algorithm.

H1 focuses on checking whether designating the same attribute to both SSs
and PAMPs is more convenient for the DCA signal categorization substep than
assigning different features to each signal category. By designating the same
attribute to both SSs and PAMPs, we develop our first DCA model based on Rough
Set Theory (RST), named RST-DCA. By attaching different attributes to different
signals, we develop our second rough DCA model based on the “Reduct” and the
“Core” RST concepts. The proposed algorithm is named RC-DCA. In turn, H2
deals with the development of our third rough DCA based on the use of a rough
heuristic, i.e., the QuickReduct algorithm. The proposed rough DCA is named
QR-DCA. By developing the latter algorithm, we try to find a trade-off between
generating satisfactory classification results and preserving the lightweight nature
of the standard DCA.

The three developed rough DCAs, namely RST-DCA, RC-DCA and QR-DCA,
function under four levels of abstraction, as shown in Figure 3.1. We will mainly
discuss the algorithms’ data pre-processing phase, as the rest of the algorithmic
steps are performed in the same way as in the standard DCA, as explained
previously in Chapter 1.
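For readers unfamiliar with the heuristic mentioned for QR-DCA, the following is a minimal Python sketch of the classical QuickReduct algorithm from rough set theory: attributes are added greedily as long as they increase the dependency degree γ of the decision on the selected attributes. The data representation is an illustrative assumption; the way QR-DCA integrates QuickReduct into the DCA is not reproduced here.

from collections import defaultdict

def partition(universe, attrs):
    """Equivalence classes of IND(attrs): same value tuple on attrs."""
    classes = defaultdict(set)
    for obj, values in universe.items():
        classes[tuple(values[a] for a in attrs)].add(obj)
    return list(classes.values())

def gamma(universe, attrs, decision):
    """Dependency degree |POS_attrs(decision)| / |U|: the positive region
    collects the equivalence classes lying entirely in one decision class."""
    pos = 0
    for eq_class in partition(universe, attrs):
        if len({universe[x][decision] for x in eq_class}) == 1:
            pos += len(eq_class)
    return pos / len(universe)

def quickreduct(universe, conditions, decision):
    """Greedy QuickReduct: grow a reduct until it reaches the dependency
    degree of the full condition attribute set."""
    reduct, target = [], gamma(universe, conditions, decision)
    while gamma(universe, reduct, decision) < target:
        best = max((a for a in conditions if a not in reduct),
                   key=lambda a: gamma(universe, reduct + [a], decision))
        reduct.append(best)
    return reduct

# Example on the decision system of Table 2.2 (decision attribute e).
U = {
    "x1": {"a": 1, "b": 0, "c": 2, "d": 2, "e": 0},
    "x2": {"a": 0, "b": 1, "c": 1, "d": 1, "e": 2},
    "x3": {"a": 2, "b": 0, "c": 0, "d": 1, "e": 1},
    "x4": {"a": 1, "b": 1, "c": 0, "d": 2, "e": 2},
    "x5": {"a": 1, "b": 0, "c": 2, "d": 0, "e": 1},
    "x6": {"a": 2, "b": 2, "c": 0, "d": 1, "e": 1},
    "x7": {"a": 2, "b": 1, "c": 1, "d": 1, "e": 2},
    "x8": {"a": 0, "b": 1, "c": 1, "d": 0, "e": 1},
}
print(quickreduct(U, ["a", "b", "c", "d"], "e"))

On this small decision system the greedy search stops after selecting two attributes, since they already discern the decision classes as well as the full attribute set does; this is the kind of lightweight trade-off targeted by H2.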

3.3 Conclusion

In this Chapter, we have elucidated how rough set theory can be used to select
the right features and to assign each selected attribute to its signal category.
We have developed three rough dendritic cell algorithms, compared them and
recorded the relevant observations while testing two crucial hypotheses.

Figure 3.1: The Rough DCA Proposed Model

Despite the noticed advantages of our proposed rough set methods, they
have some limitations when dealing with real-valued attributes. This prompted
our research into the use of fuzzy-rough sets for feature selection and signal
categorization, which will be dealt with in the next Chapter.
Chapter 4
Chapter 4

4.1 Introduction

In this Chapter, we still focus on the DCA data pre-processing phase. We propose
novel approaches to further improve the standard DCA data pre-processing task.
This is achieved through the use of Fuzzy Rough Set Theory (FRST) (Dubois &
Prade, 1992). The proposed algorithms are seen as extensions of the rough set
algorithms which were previously presented in Chapter 3.

The rest of this Chapter is organized as follows: Section 4.2 deals with the
motivation behind the use of FRST for data pre-processing. Our developed DCA
variants are explained in detail in Sections 4.3, 4.4 and 4.5. A comparison of our
proposed algorithms is presented in Section 4.6.

4.2 Why Adopting Fuzzy-Rough Sets for Data Pre-processing?

....


4.3 Conclusion

Throughout this Chapter, we have focused our work on using fuzzy-rough set
theory for the DCA data pre-processing phase. We have developed three
fuzzy-rough DCAs based on different methodologies for the feature selection and
signal categorization processes. These methods were compared and the appropriate
observations were noted.

Up to now, we have made a deep investigation of the DCA data pre-processing
phase and have proposed adequate solutions to build a solid data pre-processing
module. Concerning the remaining DCA algorithmic steps, more investigations
have to be done. To begin with, in the next Chapter, we will focus on solving the
DCA’s sensitivity to the input class data order.
Chapter 5
Chapter 5

5.1 Introduction

In this Chapter, we will investigate and try to solve another limitation of the DCA,
which is its sensitivity to the input class data order. As previously mentioned, the
standard DCA has to be applied to ordered data sets where all class 1 instances
are followed by all class 2 instances, in order to ensure high and satisfactory
classification results. While investigating this DCA limitation, we have noticed
that this problem could be linked to the DCA context assessment phase. Therefore,
we propose in this Chapter to develop an automated fuzzy dendritic cell algorithm
characterized by its stability as a binary classifier.

This Chapter is structured as follows: In Section 5.2, we present the motivation
behind the development of our proposed solution. In Section 5.3, the fuzzy
clustering techniques which are needed to build our new fuzzy DCA are detailed.
Section 5.4 describes our newly developed algorithm. Experiments are outlined in
Section 5.5 and the final Section includes the discussion of the obtained results.

5.2 Problem Statement

A study focusing on the DCA behavior stated that the algorithm gives satisfactory
and interesting classification results only when it is applied to databases having
ordered classes (Greensmith & Aickelin, 2008). In this Chapter, we try to carefully
study the DCA algorithmic phases while trying to figure out the reasons for such
a restriction.

5.3 Conclusion

In this Chapter, we have analyzed the behavior of the standard DCA while trying
to overcome its sensitivity to the class data order. This investigation led us to the
development of the FCDCM algorithm. FCDCM is a fuzzy version of the DCA
and it aims at coping with the crisp separation between the two DC contexts. In
order to fix the parameters of the algorithm, we have used various fuzzy clustering
techniques, compared them and chosen the most appropriate one. The experimental
results demonstrate that our proposed method with the Gustafson-Kessel algorithm
(FCDCMGK) achieves better results than the other methods.
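To illustrate the underlying idea of fuzzifying the crisp context decision, the sketch below replaces the hard comparison between the semi-mature and mature outputs with fuzzy memberships over the (semi-mature, mature) output pairs. Plain fuzzy c-means is used here purely for illustration and the data are invented; FCDCM itself relies on other clustering variants, notably the Gustafson-Kessel algorithm, and its exact formulation is not reproduced here.

import numpy as np

def fuzzy_cmeans(points, c=2, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means: returns cluster centres and a membership matrix
    u[i, k] giving the degree to which point i belongs to cluster k."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(points), c))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(iters):
        w = u ** m
        centres = (w.T @ points) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        u = 1.0 / (dist ** (2 / (m - 1)))
        u /= u.sum(axis=1, keepdims=True)
    return centres, u

# Toy (semi-mature, mature) cumulative output pairs for a few DCs.
outputs = np.array([[8.0, 1.0], [7.5, 2.0], [1.0, 9.0], [2.0, 8.5], [4.5, 5.0]])
centres, u = fuzzy_cmeans(outputs)

# Instead of a crisp "semi > mature" test, each DC now carries a degree of
# membership to a semi-mature-like cluster and to a mature-like cluster.
semi_cluster = int(np.argmax(centres[:, 0]))   # centre with the larger semi output
for dc, memberships in zip(outputs, u):
    print(dc, "semi-mature membership:", round(memberships[semi_cluster], 2))

Borderline cells such as the last pair above end up with memberships close to 0.5 in both clusters, which is precisely the graded behavior that a crisp comparison cannot express.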

In the next Chapter, we will further investigate the causes of the dendritic cell
algorithm’s sensitivity to the input class data order. We will focus mainly on the
dendritic cell algorithm’s detection phase while suggesting new automated and
more adequate solutions for the DCA.
Chapter 6
Chapter 6

6.1 Introduction

Still focusing on the DCA’s sensitivity to the input class data order, we have
noticed that this limitation may not only be linked to the DCA context assessment
phase, as presented in the previous Chapter, but also to the DCA detection phase.
We hypothesize that there is a second possible cause of the DCA’s sensitivity:
the induced signal base generated by the DCA detection phase may contain
disagreeable objects such as noisy, incoherent or redundant instances. Such objects
may influence the DCA behavior and, thus, maintaining the signal base appears
essential. To achieve this, we propose in this Chapter to develop a second fuzzy
DCA version based on a maintenance technique.

This Chapter is structured as follows: In Section 6.2, we give an overview of
the maintenance database policies, while in Section 6.3 we present our new fuzzy
maintained DCA. The experiments and the obtained results are outlined in Section
6.4.

6.2 Overview of the Maintenance Database Policies

The quality of the database is very important for generating accurate results and
for being able to learn models from the presented data. To achieve this, the
disagreeable objects, especially the noisy and redundant instances which negatively
affect the quality of the results, have to be eliminated from the data set. To address
this situation, database maintenance is a feasible approach. The goal of database
maintenance methods is to select the most representative information from a given
data set.

Many works dealing with database maintenance have been proposed in the
literature (Leake & Wilson, 2001). Most of them are based on updating a database
by adding or deleting instances in order to optimize and reduce the initial database.
These policies include different operations: outdated, redundant or inconsistent
instances may be deleted; groups of objects may be merged to eliminate redundancy
and improve reasoning power; objects may be re-described to repair incoherencies;
signs of corruption in the database have to be checked; and any abnormality in the
database which might signal a problem has to be controlled. A database that is
not maintained can become sluggish, and without accurate data, users will make
uninformed decisions. The database maintenance policies may be categorized into
two main families: the selective reduction approaches and the deletion reduction
approaches.
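As a toy illustration of the deletion-style operations described above, the sketch below removes exact duplicates and inconsistent instances (identical signal values with conflicting labels) from a signal base. The representation of the signal base as labelled tuples is an assumption made only for this example; the specific maintenance policy used in the thesis (the technique underlying COID-FCDCMGK) is not reproduced here.

from collections import defaultdict

def maintain_signal_base(instances):
    """Delete exact duplicates and inconsistent instances from a signal base.

    `instances` is a list of (signal_values, label) pairs, where signal_values
    is a tuple of numbers. An instance is inconsistent if the same signal
    values appear with more than one label; such groups are dropped entirely.
    """
    labels_per_signature = defaultdict(set)
    for signals, label in instances:
        labels_per_signature[signals].add(label)

    maintained, seen = [], set()
    for signals, label in instances:
        if len(labels_per_signature[signals]) > 1:
            continue                      # inconsistent: conflicting labels
        if signals in seen:
            continue                      # redundant: exact duplicate
        seen.add(signals)
        maintained.append((signals, label))
    return maintained


base = [((0.1, 0.9, 0.0), "normal"),
        ((0.1, 0.9, 0.0), "normal"),      # duplicate -> removed
        ((0.8, 0.1, 0.7), "anomalous"),
        ((0.5, 0.5, 0.2), "normal"),
        ((0.5, 0.5, 0.2), "anomalous")]   # inconsistent pair -> removed
print(maintain_signal_base(base))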

6.3 Conclusion

In this Chapter, we have presented a new fuzzy hybrid evolutionary algorithm,
COID-FCDCMGK, aiming at solving the DCA’s sensitivity to the input class data
order. COID-FCDCMGK is based on two hypotheses which were studied and
checked. COID-FCDCMGK is a stable classifier which outperforms not only the
state-of-the-art classifiers but also the standard version of the DCA as well as the
proposed DCA hybrid versions.

Up to now, we have made several investigations into the dendritic cell
functioning as well as its algorithmic steps. While investigating the DCA detection
and context assessment phases, we have proposed solutions to overcome the
DCA’s sensitivity to the input class data order. While investigating the DCA data
pre-processing phase, we have proposed new rough and fuzzy-rough methods
leading to more robust DCA classifiers. Note that the proposed rough and
fuzzy-rough DCAs seen in the previous Chapters are applied to ordered data sets
only, as they are sensitive to the input class data order. Therefore, at the end of
this dissertation, we propose a more general fuzzy-rough, maintained and
non-sensitive dendritic cell algorithm. This will be dealt with in the next Chapter.
Chapter 7
Chapter 7

7.1 Introduction

The developed non-sensitive COID-FCDCMGK algorithm has shown promising
results in terms of classification accuracy. It is based on a deep investigation of
its algorithmic steps. Yet, it is important to mention that the COID-FCDCMGK
data pre-processing phase is based on the use of the principal component analysis
(PCA) technique. In (Chelly & Elouedi, 2012a, 2012b), we have criticized the use
of PCA and proposed adequate solutions. Therefore, in this Chapter, we aim to
develop an overall automated fuzzy DCA classifier characterized by its
non-sensitivity to the input class data order and based on a robust data
pre-processing phase.

This Chapter is organized as follows: Section 7.2 describes an analysis of
the COID-FCDCMGK algorithm. Section 7.3 details the hybrid rough automated
fuzzy DCAs. These classifiers are non-sensitive automated algorithms based on
the use of traditional rough set theory for data pre-processing. Section 7.4
presents the hybrid fuzzy-rough automated fuzzy DCAs. These algorithms are
non-sensitive automated classifiers based on the use of the theory of fuzzy-rough
sets for data pre-processing. Section 7.5 proposes our new algorithm. Finally,
Section 7.6 gives a summary of the whole work.


7.2 Summary

In this Section, a summary of the whole work is presented.....

7.3 Conclusion

In this Chapter, we have proposed a hybrid automated and maintained fuzzy
dendritic cell immune classifier. COID-FLA-FCDCMGK can be seen as the output
of several investigations conducted on the standard DCA version. A set of
hypotheses has been checked and, based on the results drawn for the six hypotheses,
COID-FLA-FCDCMGK was developed. The noteworthy characteristics of the
latter algorithm were highlighted in this Chapter as well.
Conclusion

Throughout this dissertation, we have investigated a series of algorithmic
properties of the standard dendritic cell algorithm. The objective was to improve
the applicability and the accessibility of the DCA to future users. The outcome of
these investigations was an overall automated and maintained dendritic cell
classifier characterized by its robustness and stability. Yet, to achieve the latter
algorithm, several developments were proposed in order to draw conclusions and
observations.

More precisely, we have aimed to solve two main limitations of the standard
version of the dendritic cell algorithm. . . .

Indeed, a number of interesting future works have to be mentioned. Firstly . . .

Secondly, . . .

Thirdly, . . .

Finally, . . . .

Appendix

In this Appendix, we will present Fuzzy Set Theory (FST). FST is
used as one of the main techniques for building the several algorithms
proposed throughout this dissertation.
Appendix A
Appendix 1

A.1 Introduction

Fuzzy set theory was introduced in 1965 by Zadeh (Zadeh, 1965). It is considered
a useful theory for modeling and reasoning with imprecise knowledge. Fuzzy set
theory is a mathematical theory in which fuzziness is the ambiguity that can be
found in the definition of a concept or the meaning of a word (Zimmermann, 1996).
Imprecision in expressions like “low frequency”, “high demand” or “small number”
can be called fuzziness. In this Appendix, the basics of fuzzy sets will be introduced.

In Section A.2, the basics of fuzzy set theory will be given. In Section A.3, the
main notions of fuzzy set membership functions will be introduced, while in
Section A.4, the fundamental operations on fuzzy sets will be highlighted. Finally,
in Section A.5, the process of fuzzy logic will be described.
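To make the notion of fuzziness concrete, the small sketch below defines membership functions for fuzzy sets such as “small number”: each value belongs to the set to a degree in [0, 1] rather than crisply. The particular breakpoints used are illustrative assumptions, not values taken from the thesis.

def triangular(x, a, b, c):
    """Triangular membership: 0 outside [a, c], 1 at the peak b, linear in between."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def small_number(x):
    """Left-shoulder membership for the fuzzy set "small number": fully small
    up to 2, not small at all from 10, linear in between (assumed breakpoints)."""
    if x <= 2:
        return 1.0
    if x >= 10:
        return 0.0
    return (10 - x) / (10 - 2)

for value in (1, 4, 8, 12):
    print(value, "-> degree of 'small number':", round(small_number(value), 2))
print("degree of 'around 5' for 6:", triangular(6, a=3, b=5, c=7))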

A.2 Conclusion

In this Appendix, we have elucidated the basics of fuzzy set theory which is a
generalization of the classical set theory. Offering a natural model, fuzzy sets are
used to handle imprecise information.

References

Chelly, Z., & Elouedi, Z. (2012a). RC-DCA: A new feature selection and signal
categorization technique for the dendritic cell algorithm based on rough set
theory. Proceedings of the 11th International Conference on Artificial Immune
Systems, ICARIS 2012, Springer, Lecture Notes in Computer Science, 7597,
152-165.

Chelly, Z., & Elouedi, Z. (2012b). RST-DCA: A dendritic cell algorithm based
on rough set theory. Proceedings of the 19th International Conference on
Neural Information Processing, ICONIP 2012, Springer, Lecture Notes in
Computer Science, 7665, 480-487.

Dubois, D., & Prade, H. (1992). Putting rough sets and fuzzy sets together.
Intelligent Decision Support: Handbook of Applications and Advances of the
Rough Sets Theory, Springer, 11, 203-232.

Eiben, A., & Smith, J. (2007). Introduction to evolutionary computing. Natural
Computing.

Greensmith, J., & Aickelin, U. (2005). Introducing dendritic cells as a novel
immune-inspired algorithm for anomaly detection. Proceedings of the 4th
International Conference on Artificial Immune Systems, ICARIS 2005,
Springer, Lecture Notes in Computer Science, 3627, 153-167.

Greensmith, J., & Aickelin, U. (2008). The deterministic dendritic cell algorithm.
Proceedings of the 7th International Conference on Artificial Immune Systems,
ICARIS 2008, Springer, Lecture Notes in Computer Science, 5132, 291-302.

Jensen, R. (2005). Combining rough and fuzzy sets for feature selection. Doctoral
dissertation, School of Informatics, University of Edinburgh.

Leake, D., & Wilson, D. (2001). Maintaining case-based reasoners: Dimensions
and directions. Computational Intelligence, 17, 196-213.

Lutz, M., & Schuler, G. (2002). Immature, semi-mature and fully mature dendritic
cells: which signals induce tolerance or immunity? Trends in Immunology,
23, 445-449.

Matzinger, P. (2001). The danger model in its historical context. Scandinavian
Journal of Immunology, 54, 4-9.

Zadeh, L. (1965). Fuzzy sets. Information and Control, 8, 338-353.

Zimmermann, H. (1996). Fuzzy set theory and its applications. Kluwer Academic.

Zuben, V., & Timmis, J. (2003). Artificial immune systems as a novel soft
computing paradigm. Soft Computing Journal, 7, 526-544.
