Recently, privacy issues have become important in data mining, especially when data is horizontally or vertically partitioned. For the vertically partitioned case, many data mining problems can be reduced to securely computing a scalar product; association rule mining over vertically partitioned data is one such problem. The efficiency of a secure scalar product can be measured by the communication overhead needed to ensure privacy. Several solutions have been proposed for privacy-preserving association rule mining over vertically partitioned data, but their main drawback is the excessive communication overhead they incur to ensure data privacy. In this paper we propose a new secure scalar product that aims to reduce this communication overhead.
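The idea of hiding individual vector components while still recovering their dot product can be sketched with additive secret sharing. The code below is purely illustrative of that principle, not the protocol proposed in the paper; the `share`/`secure_dot` names and the single-machine simulation of the two parties are assumptions for the sketch.

```python
import random

def share(value, modulus):
    """Split an integer into two additive shares mod `modulus`."""
    r = random.randrange(modulus)
    return r, (value - r) % modulus

def secure_dot(x, y, modulus=2**31 - 1):
    """Toy two-party scalar product via additive secret sharing.

    Each component of Alice's vector x is split into two shares, so a
    party holding only one share learns nothing about x_i; combining
    the two partial sums recovers x . y mod `modulus`.  Both partial
    sums are computed here in one process for illustration only.
    """
    partial_a = 0
    partial_b = 0
    for xi, yi in zip(x, y):
        a, b = share(xi, modulus)              # Alice keeps a, ships b
        partial_a = (partial_a + a * yi) % modulus
        partial_b = (partial_b + b * yi) % modulus
    return (partial_a + partial_b) % modulus

# 1*4 + 2*5 + 3*6 = 32
print(secure_dot([1, 2, 3], [4, 5, 6]))  # → 32
```

A real protocol must also hide the partial sums themselves (e.g. with extra blinding randomness); the point of the sketch is only that correctness survives the masking, since (a + b) * y_i = x_i * y_i mod m.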
Papers presented at the Renpar 2002 conference. Yaya SLIMANI and Denis TRYSTRAM, 2 June 2004. ... The set of operations is ordered, and changing the order of the operations changes the program, just as changing the order of the notes changes the melody. ...
Journal issue edited by TRYSTRAM Denis, SLIMANI Yahia and JEMNI Mohamed. Publication date: 03-2005. Format: journal issue. Language: French. 286 p. ... Resource management (Gestion des ressources) - O. Beaumont, V. Boudet, P.-F. Dutot, Y. Robert, D. Trystram. ...
Distributed query processing is fast becoming a reality. With new emerging applications such as grid applications, distributed data processing becomes a complex undertaking due to changes coming from both the underlying networks and the requirements of grid-enabled databases. Clearly, without considering the network characteristics and heterogeneity, the solution quality of distributed data processing may degrade. In this paper, we propose a generic cost-based query optimization approach that meets these requirements while taking network topology into consideration.
Proceedings of the Workshop on Parallel and Distributed Systems Testing, Analysis, and Debugging - PADTAD '11, 2011
Grids are now regarded as promising platforms for data- and computation-intensive applications like data mining. However, exploiting such large-scale computing resources necessitates the development of new distributed algorithms. The major challenge facing the developers of distributed data mining algorithms is how to adjust the load imbalance that occurs during execution. This load imbalance is due to the dynamic nature of data mining algorithms (i.e. we cannot predict the load before execution) and the heterogeneity of Grid computing systems. In this paper, we propose a dynamic load balancing strategy for distributed association rule mining algorithms in a Grid computing environment. We evaluate the performance of the proposed strategy on Grid'5000, a Grid infrastructure distributed over nine sites in France for research on large-scale parallel and distributed systems.
Distributed Information Retrieval (DIR) is today a thriving area of investigation, due to the importance of ongoing access to relevant information. To define a DIR process in a completely distributed system such as a peer-to-peer system, particular attention must be paid to the phase that combines results coming from autonomous peers, which we call rank aggregation. Classical approaches are not effective given the lack of global statistics, and they are also too generic to adapt to particular needs. These reasons led us to propose a result aggregation model that takes the user's needs into account (called the behavioral model).
2007 International Conference on Multimedia and Ubiquitous Engineering (MUE'07), 2007
ABSTRACT One of the principal motivations for using computing grids and data grids comes from applications that use large data sets, for example in high-energy physics or the life sciences. To improve the overall throughput of the software environments that run these applications on grids, data replicas are deposited on various selected sites. In the grid field, most data replication and job scheduling strategies have been tested by simulation, and several grid simulators have emerged; one of the most interesting for our study is the OptorSim tool. In this paper, we present an extension of the OptorSim simulator with a replica consistency management module for data grids. This extension corresponds to a hybrid consistency approach, inspired by both the pessimistic and the optimistic consistency approaches. The proposed approach has two aims: first, to reduce response times compared with the fully pessimistic approach; second, to give a better quality of service compared with the optimistic approach.
2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM), 2011
With the growing need to analyze large amounts of structured data such as chemical compounds, protein structures, and XML documents, to cite but a few, graph mining has become an attractive track and a real challenge in the data mining field. Among the various kinds of graph patterns, frequent subgraphs seem relevant for characterizing graph sets, discriminating between different groups of sets, and classifying and clustering graphs. Because of the NP-completeness of the subgraph isomorphism test as well as the huge search space, fragment miners are exponential in runtime and/or memory consumption. In this paper we study a new polynomial projection operator named AC-projection, based on a key technique of constraint programming, namely arc consistency (AC), intended to replace the exponential subgraph isomorphism test. We study the relevance of frequent AC-reduced graph patterns for classification and show that we can achieve an important performance gain with little or no loss in the quality of the discovered patterns.
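The paper's AC-projection operator builds on arc consistency; the classic generic algorithm for enforcing it on binary constraints is AC-3. The sketch below shows plain AC-3 on a toy constraint network (it is the underlying CP technique, not the paper's projection operator; the variable names and the dict-based encoding are assumptions).

```python
from collections import deque

def ac3(domains, constraints):
    """AC-3: prune values that have no support under a binary constraint.

    `domains` maps variable -> set of values; `constraints` maps an
    ordered pair (x, y) -> predicate pred(vx, vy).  Returns False if
    some domain is wiped out (the network is inconsistent).
    Runs in polynomial time, unlike subgraph isomorphism testing.
    """
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        pred = constraints[(x, y)]
        # Collect values of x with no supporting value in y's domain.
        removed = {vx for vx in domains[x]
                   if not any(pred(vx, vy) for vy in domains[y])}
        if removed:
            domains[x] -= removed
            if not domains[x]:
                return False
            # Domains shrank, so arcs pointing at x must be revisited.
            queue.extend(arc for arc in constraints if arc[1] == x)
    return True

# Toy network: x < y with x, y both in {1, 2, 3}
doms = {"x": {1, 2, 3}, "y": {1, 2, 3}}
cons = {("x", "y"): lambda vx, vy: vx < vy,
        ("y", "x"): lambda vy, vx: vy > vx}
ac3(doms, cons)
print(doms)  # x loses 3 (no y above it), y loses 1 (no x below it)
```

The pruning step is the essence of the idea: a candidate value survives only if it has at least one compatible partner, which is much cheaper to check than full isomorphism.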
2010 International Conference on Information Retrieval & Knowledge Management (CAMP), 2010
Many terminology extraction approaches make use of contextual information to acquire relations between terms. The quality and quantity of this information influence the accuracy of the terminology extractor. In this paper, we assume that the logical structure of documents constitutes a rich source of contextual information which can be used to infer semantic relations between terms and thus construct
We investigate the problem of optimizing distributed queries by using semijoins in order to minimize the amount of data communication between sites. The problem is reduced to that of finding an optimal semijoin sequence that locally fully reduces the relations referenced in a general query graph before processing the join operations.
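The communication saving behind semijoins can be shown in a few lines: instead of shipping a whole relation between sites, only the projection of one relation on the join attribute is shipped, and it is used to reduce the other relation before the real join. The relation names and dict-of-rows encoding below are illustrative assumptions, not from the paper.

```python
def semijoin(r, s, attr):
    """R semijoin S: keep the tuples of R whose `attr` value appears in S.

    In a distributed setting, only the projection of S on `attr`
    (the set `s_keys`) would cross the network, which is what lets a
    semijoin sequence reduce relations before the joins are processed.
    """
    s_keys = {t[attr] for t in s}          # the only data shipped to R's site
    return [t for t in r if t[attr] in s_keys]

orders = [{"cust": 1, "item": "a"},
          {"cust": 2, "item": "b"},
          {"cust": 3, "item": "c"}]
customers = [{"cust": 1}, {"cust": 3}]
reduced = semijoin(orders, customers, "cust")  # keeps orders of customers 1 and 3
```

A "fully reducing" sequence, as in the abstract, applies such semijoins until every referenced relation contains only tuples that will actually participate in the final join result.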
Abstract—In this paper, we propose an adaptation of the Patricia-Tree for sparse datasets to generate non-redundant association rules. Using this adaptation, we can generate frequent closed itemsets, which are more compact than the frequent itemsets used in the Apriori approach. This adaptation has been evaluated on a set of benchmark datasets. Keywords—Datamining, Frequent itemsets, Frequent closed
To avoid obtaining unmanageably large association rule sets, whose low precision often makes the perusal of knowledge ineffective, the extraction and exploitation of compact and informative generic bases of association rules is becoming a must. Moreover, such bases provide a powerful verification technique for detecting gene mis-annotation or bad clustering in the Unigene library. However, the extracted generic bases are still oversized and their exploitation is impractical. Thus, providing critical nuggets of extra-valued knowledge is a compellingly addressable issue. To tackle this drawback, we propose in this paper a novel approach, called EGEA (Evolutionary Gene Extraction Approach), which aims to considerably reduce the quantity of knowledge, extracted from a gene expression dataset, that is presented to an expert. First, we use a genetic algorithm to select the most predictive set of genes related to patient situations. Once the relevant attributes (genes) have been selected, they serve as input for the second stage of the approach, i.e., extracting generic association rules from this reduced gene set. The notable decrease in the cardinality of the generic association rules extracted from the selected gene set improves the quality of knowledge exploitation. Experiments carried out on a benchmark dataset pointed out that this set contains previously unknown prognosis-associated genes, which may serve as molecular targets for new therapeutic strategies to repress relapse of pediatric acute myeloid leukemia (AML).
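The first EGEA stage, selecting a predictive gene subset with a genetic algorithm, can be sketched generically. This is a minimal GA over 0/1 masks with tournament selection, one-point crossover, and bit-flip mutation; the operators, parameters, and toy fitness are illustrative assumptions, not the authors' exact algorithm.

```python
import random

def ga_select(fitness, n_genes, pop_size=20, gens=40, seed=1):
    """Minimal generational GA for gene (feature) subset selection.

    Individuals are 0/1 masks over the genes; higher fitness means a
    more predictive subset.  Returns the best mask in the final
    population.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]

    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(gens):
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_genes)       # one-point crossover
            child = p1[:cut] + p2[cut:]
            for d in range(n_genes):              # bit-flip mutation
                if rng.random() < 1 / n_genes:
                    child[d] ^= 1
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# Toy fitness: genes 0-2 are "predictive"; selecting others costs a little.
fit = lambda mask: sum(mask[:3]) - 0.25 * sum(mask[3:])
best = ga_select(fit, n_genes=10)
```

In the real pipeline the fitness would score how well the selected genes separate patient situations, and the surviving mask would feed the generic-rule extraction stage.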
The combinatorial nature of the feature selection problem has made the use of heuristic methods indispensable even for moderate dataset dimensions. Recently, several optimization paradigms have emerged as attractive alternatives to classic heuristic-based approaches. In this paper, we propose an adapted Particle Swarm Optimization algorithm for exploring the feature selection problem search space. In spite of the
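Adapting PSO to the 0/1 search space of feature selection is usually done by squashing velocities through a sigmoid and sampling bits, as in the classic binary PSO. The sketch below shows that generic scheme under a toy fitness; the parameter values and fitness are assumptions, and this is not the paper's specific adaptation.

```python
import math
import random

def binary_pso(fitness, n_feats, n_particles=10, iters=30, seed=0):
    """Minimal binary PSO for feature selection (illustrative only).

    Each particle is a 0/1 mask over the features; the sigmoid of the
    velocity gives the probability that a bit is set at each step.
    """
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_feats)] for _ in range(n_particles)]
    vel = [[0.0] * n_feats for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n_feats):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] += (2 * r1 * (pbest[i][d] - pos[i][d])
                              + 2 * r2 * (gbest[d] - pos[i][d]))
                prob = 1 / (1 + math.exp(-vel[i][d]))   # sigmoid transfer
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:                        # update personal best
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:                       # and global best
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Toy fitness: reward the three "informative" features, penalize the rest.
informative = {0, 1, 2}
fit = lambda m: (sum(1 for d, b in enumerate(m) if b and d in informative)
                 - 0.5 * sum(b for d, b in enumerate(m) if d not in informative))
best, score = binary_pso(fit, n_feats=8)
```

With a real classifier, the fitness would typically be cross-validated accuracy minus a penalty on the number of selected features.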
Feature subset selection is an important preprocessing and guiding step for classification. The combinatorial nature of the problem has made the use of evolutionary and heuristic methods indispensable for exploring high-dimensional problem search spaces. In this paper, a set of hybridization schemata of genetic algorithms with local search is investigated through a memetic framework. An empirical study compares
Papers by Y. Slimani