Abstract: We present a decomposition-based approach to managing probabilistic information. We introduce world-set decompositions (WSDs), a space-efficient and complete representation system for finite sets of worlds. We study the problem of efficiently evaluating relational algebra queries on world-sets represented by WSDs. We also evaluate our technique experimentally in a large census data scenario and show that it is both scalable and efficient.
Abstract: We introduce an extension of the XQuery language, FluX, that supports event-based query processing and the conscious handling of main memory buffers. Purely event-based queries of this language can be executed on streaming XML data in a very direct way. We then develop an algorithm that efficiently rewrites XQueries into the event-based FluX language. This algorithm uses order constraints from a DTD to schedule event handlers and thus to minimize the amount of buffering required for evaluating a query.
Tree pattern matching is a fundamental problem that has a wide range of applications in Web data management, XML processing, and selective data dissemination. In this paper we develop efficient algorithms for the tree homeomorphism problem, i.e., the problem of matching a tree pattern with exclusively transitive (descendant) edges. We first prove that deciding whether there is a tree homeomorphism is LOGSPACE-complete, improving on the current LOGCFL upper bound.
DLV is an efficient Answer Set Programming (ASP) system implementing the consistent answer set semantics [5] with various language enhancements, such as support for logic programming with inheritance and queries, integer arithmetic, and various other built-in predicates.
Abstract: This survey gives an overview of formal results on the XML query language XPath. We identify several important fragments of XPath, focusing on subsets of XPath 1.0. We then give results on the expressiveness of XPath and its fragments compared to other formalisms for querying trees, algorithms, and complexity bounds for evaluation of XPath queries, as well as static analysis of XPath queries.
Abstract: Applications ranging from algorithmic trading to scientific data analysis require real-time analytics based on views over databases that change at very high rates. Such views have to be kept fresh at low maintenance cost and latency. At the same time, these views have to support classical SQL, rather than window semantics, to enable applications that combine current with aged or historical data.
Abstract: We discuss the requirements for information integration in large scientific collaborations and conclude that an architecture is needed that follows the declarative paradigm for reasoning completeness, maintainability, and reuse of previously encoded knowledge, but that does not take the classical approach of integrating all sources against a single common “global” information model.
Abstract: A decade of experience with research proposals as well as standardized query languages for the conventional Web and the recent emergence of query languages for the Semantic Web call for a reconsideration of design principles for Web and Semantic Web query languages.
Abstract: This paper introduces U-relations, a succinct and purely relational representation system for uncertain databases. U-relations support attribute-level uncertainty using vertical partitioning. If we consider positive relational algebra extended by an operation for computing possible answers, a query on the logical level can be translated into, and evaluated as, a single relational algebra query on the U-relational representation.
In recent years, much research has been done concerning the semantics and complexity of Disjunctive Deductive Databases (DDDBs). While DDDBs (function-free disjunctive logic programs with negation in rule bodies allowed) are now generally considered a powerful tool for common-sense reasoning and knowledge representation, there has been a shortage of actual (let alone efficient) implementations ([ST94, ADN97]). This paper presents a brief overview of the architecture of the dlv (datalog with disjunction) system currently developed at TU Wien in the FWF project P11580-MAT "A Query System for Disjunctive Deductive Databases", especially focusing on the Model Generator, the "heart" of the dlv system, and the integrated frontends for diagnostic reasoning and SQL3.
Abstract: Managing incomplete information is important in many real-world applications. In this demonstration we present MayBMS, a system for representing and managing finite sets of possible worlds that successfully combines expressiveness and efficiency.
Abstract: MayBMS is a state-of-the-art probabilistic database management system that leverages the strengths of previous database research to achieve scalability. As a proof of concept for its ease of use, we have built on top of MayBMS a Web-based application that offers NBA-related information based on what-if analysis of team dynamics using data available at www.nba.com.
Abstract: Monadic query languages over trees currently receive considerable interest in the database community, as the problem of selecting nodes from a tree is the most basic and widespread database query problem in the context of XML.
Abstract: This is a survey of algorithms, complexity results, and general solution techniques for efficiently processing queries on tree-structured data. I focus on query languages that compute nodes or tuples of nodes: conjunctive queries, first-order queries, datalog, and XPath. I also point out a number of connections among previous results that have not been observed before.
Abstract: Youtopia is a platform for collaborative management and integration of relational data. At the heart of Youtopia is an update exchange abstraction: changes to the data propagate through the system to satisfy user-specified mappings. We present a novel change propagation model that combines a deterministic chase with human intervention. The process is fundamentally cooperative and gives users significant control over how mappings are repaired.
Abstract: We study complexity and approximation of queries in an expressive query language for probabilistic databases. The language studied supports the compositional use of confidence computation. It allows for a wide range of new use cases, such as the computation of conditional probabilities and of selections based on predicates that involve marginal and conditional probabilities. These features have important applications in areas such as data cleaning and the processing of sensor data.
Abstract: Graph simulation (using graph schemata or data guides) has been successfully proposed as a technique for adding structure to semistructured data. Design patterns for description (such as meta-classes and homomorphisms between schema layers), which are prominent in the object-oriented programming community, constitute a generalization of this graph simulation approach.
Abstract: The past decade has seen the rise of $\ell_1$-relaxation methods to promote sparsity for better interpretability and generalization of learning results. However, there are several important learning applications, such as Markowitz portfolio selection and sparse mixture density estimation, that feature simplex constraints, which disallow the application of the standard $\ell_1$-penalty. In this setting, we show how to efficiently obtain sparse projections onto the positive and general simplex with sparsity constraints.
Papers by Christoph Koch