Document image understanding denotes the recognition of semantically relevant components in the layout extracted from a document image. This recognition process is based on some visual models, whose manual specification can be a highly demanding task. In order to ...
... Therefore in order to answer a query it is necessary to collect enough derivations ending with a constrained empty clause such that every model of B satisfies the constraints associated with the final query of ... 2 Semantic Web Mining with AC-QuIn ... 'Arab':MiddleEasternEthnicGroup ...
INGENS is a prototypical GIS which integrates machine learning tools in order to discover geographic knowledge useful for the task of topographic map interpretation. It embeds ATRE, a novel learning system that can induce recursive logic theories from a set ...
Information given in topographic map captions or in GIS models is often insufficient to recognize interesting geographical patterns. Some prototypes of GIS have already been extended with a knowledge-base and some reasoning capabilities to support sophisticated map interpretation processes. Nevertheless, the acquisition of the necessary knowledge is still a demanding task for which machine learning techniques can be of great help. This paper presents INGENS, a prototypical GIS which integrates machine learning tools to assist users in the task of topographic map interpretation. The system can be trained to learn operational definitions of geographical objects that are not explicitly modeled in the database. INGENS has been applied to the task of Apulian map interpretation in order to discover geographic knowledge of interest to town planners.
In this paper we carry on the work on Onto-Relational Learning by investigating the impact of having disjunctive Datalog with default negation either in the language of hypotheses or in the language for the background theory. The inclusion of nonmonotonic features strengthens the ability of our ILP framework to deal with incomplete knowledge by performing some form of commonsense reasoning. Such an ability can prove useful in application domains, such as the Semantic Web, which require that kind of reasoning.
ILP is a major approach to Relational Learning that exploits previous results in concept learning and is characterized by the use of prior conceptual knowledge. An increasing amount of conceptual knowledge is being made available in the form of ontologies, mainly formalized with Description Logics (DLs). In this paper we consider the problem of learning rules from observations that combine relational data and ontologies, and identify the ingredients of an ILP solution to it. Our proposal relies on the expressive and deductive power of the KR framework $\mathcal{DL}$+log that allows for the tight integration of DLs and disjunctive Datalog with negation. More precisely, we adopt an instantiation of this framework which integrates the DL $\mathcal{SHIQ}$ and positive Datalog. We claim that this proposal lays the foundations of an extension of Relational Learning, called Onto-Relational Learning, to account for ontologies.
This paper deals with learning in AL-log, a hybrid language that merges the function-free Horn clause language Datalog and the description logic ALC. Our application context is descriptive data mining. We introduce O-queries, a rule-based form of unary conjunctive queries in AL-log, and a generality order $\succeq_B$ for structuring spaces of O-queries. We define a (downward) refinement operator $\rho_O$ for $\succeq_B$-ordered spaces of O-queries, prove its ideality and discuss an efficient implementation of it in the context of interest.
Recently, there has been growing interest in both the extension of data mining methods and techniques to spatial databases and the application of inductive logic programming (ILP) to knowledge discovery in databases (KDD). In this paper, an ILP application to association rule mining in spatial databases is presented. The discovery method has been implemented in the ILP system SPADA, which benefits from the available prior knowledge on the spatial domain, systematically explores the hierarchical structure of task-relevant geographic layers and deals with numerical aspatial properties of spatial objects. It operates on a deductive relational database set up by selecting and transforming data stored in the underlying spatial database. Preliminary experimental results have been obtained by running SPADA on geo-referenced census data of Stockport, Greater Manchester, UK.
Journal of Intelligent Information Systems, Jan 1, 2000
A paper document processing system is an information system component which transforms information on printed or handwritten documents into a computer-revisable form. In intelligent systems for paper document processing this information capture process is based on knowledge of the specific layout and logical structures of the documents. This article proposes the application of machine learning techniques to acquire the specific knowledge required by an intelligent document processing system, named WISDOM++, that manages printed documents, such as letters and journals. Knowledge is represented by means of decision trees and first-order rules automatically generated from a set of training documents. In particular, an incremental decision tree learning system is applied for the acquisition of decision trees used for the classification of segmented blocks, while a first-order learning system is applied for the induction of rules used for the layout-based classification and understanding of documents. Issues concerning the incremental induction of decision trees and the handling of both numeric and symbolic data in first-order rule learning are discussed, and the validity of the proposed solutions is empirically evaluated by processing a set of real printed documents.
Census data mining has great potential both in business development and in good public policy, but a number of research issues in this field must still be solved. In this paper, problems related to the geo-referencing of census data are considered. In particular, the accommodation of the spatial dimension in census data mining is investigated for the task of discovering spatial association rules, that is, association rules involving spatial relations among (spatial) objects. The formulation of a new method based on a multi-relational data mining approach is proposed. It takes advantage of the representation and inference techniques developed in the field of Inductive Logic Programming (ILP). In particular, the expressive power of predicate logic is profitably used to represent both spatial relations and background knowledge, such as spatial hierarchies and rules for spatial qualitative reasoning. The logical notions of generality order and of the downward refinement operator on the space of patterns are profitably used to define both the search space and the search strategy. The proposed method has been implemented in the ILP system SPADA (Spatial Pattern Discovery Algorithm). SPADA has been interfaced both to a module for the extraction of spatial features from a spatial database and to a module for numerical attribute discretization. The three modules have been used in an application to urban accessibility of a hospital in Stockport, Greater Manchester. Results obtained through a spatial analysis of geo-referenced census data are illustrated.
We consider the interoperability of information systems within a distributed environment, such as across statistical organisations of the Member States of the European Union. Within a logical layer between the physical storage of the data (server) and the presentation of the data to the user (client), we introduce a conceptual data model that can facilitate interoperability between a number of different data models and structures in a distributed environment. For interoperability, we discuss a micro-macrometadata model that can interface readily with data warehouses and other cube object models and can exploit the accommodation of the operational metadata necessary for statistical query processing.
Papers by Lisi Francesca