With the increasing number of ontologies available on the web, the problem of merging ontologies from different sources so that applications can interoperate becomes important. This paper presents a novel approach for merging lightweight ontologies based on answer set programming (ASP) and linguistic background knowledge. ASP provides a declarative execution environment for intuitive merging rules. WordNet provides broad linguistic knowledge that is used to identify corresponding concepts.
Forgetting is an important tool for reducing ontologies by eliminating some concepts and roles while preserving sound and complete reasoning. Attempts have previously been made to address the problem of forgetting in relatively simple description logics (DLs) such as DL-Lite and extended EL. The ontologies used in these attempts were mostly restricted to TBoxes rather than general knowledge bases (KBs). However, the issue of forgetting for general KBs in more expressive description logics, such as ALC and OWL DL, is largely unexplored. In particular, the problem of characterizing and computing forgetting for such logics is still open. In this paper, we first define semantic forgetting about concepts and roles in ALC ontologies and state several important properties of forgetting in this setting. We then define the result of forgetting for concept descriptions in ALC, state the properties of forgetting for concept descriptions, and present algorithms for computing the result of forgetting for concept descriptions. Unlike the case of DL-Lite, the result of forgetting for an ALC ontology does not exist in general, even for the special case of concept forgetting. This makes the problem of how to compute forgetting in ALC more challenging. We address this problem by defining a series of approximations to the result of forgetting for ALC ontologies and studying their properties and their application to reasoning tasks. We use the algorithms for computing forgetting for concept descriptions to compute these approximations. Our algorithms for computing approximations can be embedded into an ontology editor to enhance its ability to manage and reason in (large) ontologies.
The notion of uniform interpolation for the description logic ALC has been introduced in [9]. In this paper, we reformulate uniform interpolation for ALC from the angle of forgetting and show that it satisfies all desired properties of forgetting. Then we introduce an algorithm for computing the result of forgetting in concept descriptions. We present a detailed proof of the correctness of our algorithm using the tableau for ALC. Our results have been used to compute forgetting for ALC knowledge bases.
The language of dl-programs is a recent effort in developing an expressive representation for Web-based ontologies. It allows answer set programming (ASP) to be built on top of description logic, and thus some attractive features of ASP can be employed in the design of the Semantic Web architecture. In this paper we first generalize dl-programs by allowing multiple knowledge bases and then define the answer set semantics for these dl-programs accordingly. A novel technique called forgetting is developed in the setting of dl-programs and applied to ontology merging and aligning.
The problem of rewriting queries using views has important applications in data integration, query optimization, and physical data independence maintenance. Previous researchers have proposed algorithms for rewriting queries using views where the queries and views are Datalog programs or conjunctive queries with built-in predicates such as arithmetic comparisons. Our method also has advantages over previous algorithms when there are no built-in predicates.
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '02, 2002
We study the problem of computing classification rule sets from relational databases so that accurate predictions can be made on test data with missing attribute values. Traditional classifiers perform poorly when test data are not as complete as the training data because they are tailored too closely to the training database. We introduce the concept of one rule set being more robust than another, that is, able to make more accurate predictions on test data with missing attribute values. We show that the optimal class association rule set is as robust as the complete class association rule set. We then introduce the k-optimal rule set, which provides exactly the same predictions as the optimal class association rule set on test data with up to k missing attribute values. This leads to a hierarchy of k-optimal rule sets in which decreasing size corresponds to decreasing robustness, and they are all more robust than a traditional classification rule set. We introduce two methods to find k-optimal rule sets: an optimal association rule mining approach and a heuristic approximate approach. We show experimentally that a k-optimal rule set generated by the optimal association rule mining approach performs better than one generated by the heuristic approximate approach, and that both rule sets perform significantly better than a typical classification rule set (C4.5Rules) on incomplete test data.
Several proposals have been put forward to support distributed agent cooperation in the Semantic Web by allowing concepts and roles in one ontology to be reused in another ontology. In general, these proposals reduce the autonomy of each ontology by defining the semantics of the ontology to depend on the semantics of the other ontologies. We propose a new framework for managing autonomy in a set of cooperating ontologies (or ontology space). In this framework, each language entity (concept/role/individual) in an ontology may have its meaning assigned either locally with respect to the semantics of its own ontology, to preserve the autonomy of the ontology, or globally with respect to the semantics of any neighbouring ontology in which it is defined, thus enabling semantic cooperation between multiple ontologies. In this way, each ontology has a "subjective semantics" based on local interpretation and a "foreign semantics" based on semantic binding to neighbouring ontologies. We study the properties of these two semantics and describe the conditions under which entailment and satisfiability are preserved. We also introduce two reasoning mechanisms under this framework: "cautious reasoning" and "brave reasoning". Cautious reasoning is done with respect to a local ontology and its neighbours (those ontologies in which its entities are defined); brave reasoning is done with respect to the transitive closure of this relationship. This framework is independent of ontology languages. As a case study, for the description logic ALCN we present two tableau-based algorithms, one for each form of reasoning, and prove their correctness.
Forgetting is a useful tool for tailoring ontologies by reducing the number of concepts and roles, while preserving sound and complete reasoning. Some attempts have been made to address the problem of forgetting in some relatively simple description logics (DLs) such as DL-Lite and extended EL. Ontologies in those works are mostly expressed as TBoxes rather than general knowledge bases (KBs). However, the issue of forgetting for general KBs in more expressive ontology languages, such as ALC and OWL DL, is largely unexplored. In particular, the problem of characterizing and computing forgetting is still open. In this paper, we first define semantic forgetting about concepts and roles in ALC ontologies and show several important properties of forgetting in this setting. Unlike the case of DL-Lite, the result of forgetting in an ALC ontology may not exist in general, which makes the problem of how to compute forgetting in ALC more challenging. As a result, we tackle the non-existence ...
We propose a framework for heterogeneous multi-context systems in which a special kind of semantic/implicit bridge rule is introduced. Traditional bridge rules in heterogeneous multi-context systems may make the syntax and the semantics of a context more complex; e.g., in the approach of [3] an agent may have to face a context composed of a description logic system and a logic program with default negation. In this paper we hide the bridge rules by semantic binding on foreign knowledge fragments, and track the semantic property of a belief/knowledge in one context by a mirror image of it in the other context. This framework can manage heterogeneous multi-contexts in a simple way, and it keeps the original reasoning properties of each context so that the original reasoning tools are still useful.
In this study, we investigate and present a new index structure, the Triangular Decomposition Tree (TD-tree), which can efficiently store and query temporal data in modern database applications. The TD-tree is based on a spatial representation of interval data and a recursive triangular decomposition of this space. A bounded number of intervals are stored in each leaf of the tree, which hence may be unbalanced. We describe the algorithms used with this structure. A single query algorithm can be applied uniformly to different query types without the need for dedicated query transformations. In addition to the advantages of using a single query algorithm for different query types and of better space complexity, the empirical performance of the TD-tree is demonstrated to be superior to that of its best known competitors. The presented concept can also be extended to more dimensions and therefore applied to efficiently manage spatio-temporal data.
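The spatial representation of interval data mentioned above can be illustrated with a small sketch. It is not the TD-tree itself: it assumes the common geometric reading in which an interval [s, e] with s <= e becomes the point (s, e), and it replaces the tree with a linear scan. The names `Region`, `intersects`, `during`, `covers`, and `query` are hypothetical and chosen only for this illustration. What it shows is why one query routine can serve several query types: each type is just a different region of the (start, end) plane.

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) with start <= end
INF = float("inf")


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle in (start, end) space (illustrative only)."""
    s_lo: float
    s_hi: float
    e_lo: float
    e_hi: float

    def contains(self, iv: Interval) -> bool:
        s, e = iv
        return self.s_lo <= s <= self.s_hi and self.e_lo <= e <= self.e_hi


def intersects(qs: float, qe: float) -> Region:
    # Stored interval overlaps [qs, qe]: start <= qe and end >= qs.
    return Region(-INF, qe, qs, INF)


def during(qs: float, qe: float) -> Region:
    # Stored interval lies inside [qs, qe]: start >= qs and end <= qe.
    return Region(qs, INF, -INF, qe)


def covers(qs: float, qe: float) -> Region:
    # Stored interval contains [qs, qe]: start <= qs and end >= qe.
    return Region(-INF, qs, qe, INF)


def query(data: List[Interval], region: Region) -> List[Interval]:
    # One routine for every query type; an index would prune here
    # instead of scanning every interval.
    return [iv for iv in data if region.contains(iv)]


data = [(1, 3), (2, 8), (5, 6), (7, 9)]
overlapping = query(data, intersects(4, 6))
inside = query(data, during(4, 9))
covering = query(data, covers(5, 6))
```

An actual index would partition the triangle above the diagonal start = end recursively and visit only the cells that meet the query region; the scan here stands in for that step.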
In extended relational databases, queries and integrity constraints often contain interpreted variables and built-in relations. We extend previous work on semantic query containment for extended relational databases to handle disjunctive constrained tuple-generating dependencies (DCTGDs), which include almost all well-known classes of integrity constraints. After defining this extended class of integrity constraints, we present a method for expanding a query Q, using DCTGDs, to a semantically equivalent set of queries. Our ...
The problem of finding contained rewritings of queries using views is of great importance in mediated data integration systems. In this paper, we first present a general approach for finding contained rewritings of unions of conjunctive queries with arbitrary built-in predicates. Our approach is based on an improved method for testing conjunctive query containment in this context. Although conceptually simple, our approach generalizes previous methods for finding contained rewritings of conjunctive queries and is more powerful in the sense that it finds many rewritings that cannot be found using existing methods. Furthermore, nullity-generating dependencies over the base relations can be easily handled. We then present a simplified approach which is less complete but much faster than the general approach, and which still finds maximum rewritings in several special cases. Our approaches compare favorably with existing methods.
J. Logic Programming 1987:4:331-343. Integrity Constraint Checking in Stratified Databases. J. W. Lloyd, E. A. Sonenberg, and R. W. Topor. We prove the correctness of a simplification method for checking static integrity constraints in stratified deductive ...
To support the reuse and combination of ontologies in Semantic Web applications, it is often necessary to obtain smaller ontologies from existing larger ontologies. In particular, applications may require the omission of many terms, e.g., concept names and role names, from an ontology. However, the task of omitting terms from an ontology is challenging because the omission of some terms may affect the relationships between the remaining terms in complex ways. We present the first solution to this problem by adapting the technique of forgetting, previously used in other domains. Specifically, we present a semantic definition of forgetting for description logics in general, which generalizes the standard definition for classical logic. We then introduce algorithms that implement forgetting in both DL-Lite TBoxes and ABoxes, and in DL-Lite knowledge bases. We prove that the algorithms are correct with respect to the semantic definition of forgetting, and that they run in polynomial time.
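As a point of reference for the standard definition for classical logic mentioned above, the following is a minimal sketch of forgetting in the propositional case only; it does not implement the description logic algorithms of the paper. It assumes the usual propositional formulation, in which forgetting a variable p from a formula phi yields phi[p/true] or phi[p/false]. Formulas are modelled as Python functions over truth assignments, and the names `forget`, `equivalent`, `phi`, and `p_implies_r` are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Iterable

Assignment = Dict[str, bool]
Formula = Callable[[Assignment], bool]


def forget(phi: Formula, var: str) -> Formula:
    """Propositional forgetting: phi[var/True] or phi[var/False]."""
    def result(a: Assignment) -> bool:
        return phi({**a, var: True}) or phi({**a, var: False})
    return result


def equivalent(f: Formula, g: Formula, variables: Iterable[str]) -> bool:
    """Check equivalence by enumerating all assignments to `variables`."""
    vs = list(variables)
    return all(
        f(dict(zip(vs, vals))) == g(dict(zip(vs, vals)))
        for vals in product([False, True], repeat=len(vs))
    )


def phi(a: Assignment) -> bool:
    # (p -> q) and (q -> r)
    return (not a["p"] or a["q"]) and (not a["q"] or a["r"])


def p_implies_r(a: Assignment) -> bool:
    # p -> r, the relationship between p and r that survives forgetting q
    return not a["p"] or a["r"]


forgot_q = forget(phi, "q")
same = equivalent(forgot_q, p_implies_r, ["p", "r"])
```

Forgetting q from (p -> q) and (q -> r) leaves p -> r, which shows in miniature how omitting a term can still preserve the relationships it induced between the remaining terms.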
Temporal and spatio-temporal data are present in many modern application systems, including those that monitor moving objects. Such systems produce enormous volumes of data, and therefore an efficient indexing method is crucial. In this paper, we investigate and present a new concept based on a virtual index structure, which can efficiently query such data. The concept is based on a spatial representation of interval data and a recursive triangular decomposition of that space. The empirical performance of the presented concept is demonstrated to be superior to that of its best known competitors.
The need for efficient access and management of time-dependent data in modern database applications is well recognised and researched. Existing access methods are mostly derived from the family of spatial R-tree indexing techniques. These techniques are particularly unsuitable for handling data involving open-ended intervals, which are common in temporal databases. This is due to overlapping between nodes and the huge dead space found in the database. In this study, we describe a detailed investigation of a new approach called “Triangular Decomposition Tree” (TD-Tree). The underlying idea of the TD-Tree is to manage temporal intervals by virtual index structures relying on geometric interpretations of intervals, and a space partition method that results in an unbalanced binary tree. We demonstrate that the unbalanced binary tree can be efficiently manipulated using a virtual index. We also show that a single query algorithm can be applied uniformly to different query types without the need for dedicated query transformations. In addition to the advantages of using a single query algorithm for different query types and of better space complexity, the empirical performance of the TD-Tree has been found to be superior to that of its best known competitors.
Papers by Rodney Topor