Database Integration
Most cited papers in Database Integration
Peer-to-peer file-sharing networks are currently receiving much attention as a means of sharing and distributing information. However, as recent experience shows, the anonymous, open nature of these networks offers an almost ideal... more
The Gene Ontology (GO) project (http://www.geneontology.org/) provides structured, controlled vocabularies and classifications that cover several domains of molecular and cellular biology and are freely available for community use in the... more
A database integrating 90 years of empirical studies reporting intercorrelations among rated job performance dimensions was used to test the hypothesis of a general factor in job performance. After controlling for halo error and 3 other... more
PlasmoDB (http://PlasmoDB.org) is the official database of the Plasmodium falciparum genome sequencing consortium. This resource incorporates the recently completed P. falciparum genome sequence and annotation, as well as draft sequence... more
The current data explosion is intractable without advanced data management systems. The numerous data sets become really useful when they are interconnected under a uniform interface representing the domain knowledge. The SRS has become... more
The rapid expansion of biomedical knowledge, reduction in computing costs, and spread of internet access have created an ocean of electronic data. The decentralized nature of our scientific community and healthcare system, however, has... more
This paper introduces a new model, based on so-called object-composition filters, that uniformly integrates database-like features into an object-oriented language. The focus is on providing persistent dynamic data structures, data... more
Motivation: Assembling the relevant information needed to interpret the output from high-throughput, genome-scale experiments such as gene expression microarrays is challenging. Analysis reveals genes that show statistically significant... more
The Protein Information Resource (PIR; http://wwwnbrf.georgetown.edu/pir/) supports research on molecular evolution, functional genomics, and computational biology by maintaining a comprehensive, non-redundant, well-organized and freely... more
Uncertainty in categorical data is commonplace in many applications, including data cleaning, database integration, and biological annotation. In such domains, the correct value of an attribute is often unknown, but may be selected from a... more
Database integrity has two complementary components: validity, which guarantees that all false information is excluded from the database, and completeness, which guarantees that all true information is included in the database. This... more
Reasoning on queries is a basic problem both in knowledge representation and databases. A fundamental form of reasoning on queries is checking containment, i.e., verifying whether one query yields necessarily a subset of the result of... more
Background: Systems biologists work with many kinds of data, from many different sources, using a variety of software tools. Each of these tools typically excels at one type of analysis, such as of microarrays, of metabolic networks and... more
Background: Fungi secrete various proteins that have diverse functions. Prediction of secretory proteins using only one program is unsatisfactory. To enhance prediction accuracy, we constructed the Fungal Secretome Database (FSD).
The Polymorphism in microRNA Target Site (PolymiRTS) database is a collection of naturally occurring DNA variations in putative microRNA target sites. PolymiRTSs may affect gene expression and cause variations in complex phenotypes. The... more
Corporate databases are potentially rich sources of new and valuable knowledge. Various approaches to "discovering" or "mining" such knowledge have been proposed. We identify an important and previously ignored discovery task, data... more
This paper reports on current research utilising agent based methodologies in order to provide solutions in autonomous map generalisation. The research is in pursuit of systems able to support the derivation of multi scaled products from... more
Schema integration is important in two contexts, logical database design (in centralized DBMS) and global schema design (in distributed DBMS). Performing an integration on real-life schemas without a tool can be very difficult, tedious... more
Developing intelligent systems that integrate numerous autonomous and heterogeneous data sources in order to give end users a uniform query interface is a very challenging issue. The process of constructing a global schema of the... more
ONTOFUSION is an ontology-based system designed for biomedical database integration. It is based on two processes: mapping and unification. Mapping is a semi-automated process that uses ontologies to link a database schema with a... more
Here we report on recent developments at the EBI SRS server (http://srs.ebi.ac.uk). SRS has become an integration system for both data retrieval and sequence analysis applications. The EBI SRS server is a primary gateway to major... more
In the new and emerging Agile Manufacturing paradigm, where multiple firms cooperate under flexible virtual enterprise structures, there exists much need for a mechanism to manage and control information flow among collaborating partners.... more
The sequencing and analysis of ESTs is for now the only practical approach for large-scale gene discovery and annotation in conifers because their very large genomes are unlikely to be sequenced in the near future. Our objective was to... more
The incidence of extreme precipitation has increased with the exacerbation of worldwide climate disruption. We hypothesize an association between precipitation and the distribution patterns that would affect the endemic burden of 8... more
The problem of merging information from multiple sources is central to many information processing areas, such as database integration, multiple-criteria decision making, expert opinion pooling, etc. Recently, several approaches have... more
Background: Traditional Chinese Medicine (TCM), a complementary and alternative medical system in Western countries, has been used to treat various diseases over thousands of years in East Asian countries. In recent years, many herbal... more
To catalogue data on chromosomal aberrations in cancer derived from emerging molecular cytogenetic techniques and to integrate these data with genome maps, we have established two resources, the NCI and NCBI SKY/M-FISH & CGH Database, and... more
During the process of updating a database, two interrelated problems could arise. On one hand, when an update is applied to the database, integrity constraints could become violated, thus falsifying database consistency. In this case, the... more
Every information system incorporates a database component, and a frequent activity of users of information systems is to present it with queries. These queries reflect the presuppositions of their authors about the system and the... more
The identification of molecular signatures predictive of clinical behavior and outcome in brain tumors has been the focus of many studies in recent years. Despite the wealth of data that are available in the public domain on... more
Inference has been a longstanding issue in database security, and inference control, aiming to curb inference, provides an extra line of defense to the confidentiality of databases by complementing access control. However, in traditional... more
Taking advantage of recent advances in automated theorem proving, we present a new method for determining whether database transactions preserve integrity constraints. We consider check constraints and referential-integrity... more
We discuss recent seasonal and interannual variations of ice cover and lake surface level in the Aral Sea from satellite data for 1992–2006. First, we provide an overview of the evolution of the Aral Sea's environmental conditions,... more
Resolving domain incompatibility among independently developed databases often involves uncertain information. DeMichiel (1989) showed that uncertain information can be generated by the mapping of conflicting attributes to a common... more
In this paper, we describe OntoFusion, a database integration system. This system has been designed to provide unified access to multiple, heterogeneous biological and medical data sources that are publicly available over the Internet. Many... more
Background: The mapping of diverse genomes in recent years has generated huge amounts of biological data, which are currently dispersed across many databases. Integration of the information available in the various databases is required to unveil... more
We show how to formalise different kinds of loop constructs within the refinement calculus, and how to use this formalisation to derive general transformation rules for loop constructs. The emphasis is on using algebraic methods for... more
A key aspect of interoperation among data-intensive systems involves the mediation of metadata and ontologies across database boundaries. One way to achieve such mediation between a local database and a remote database is to fold remote... more
The cell cycle is one of the biological processes most frequently investigated in systems biology studies and it involves the knowledge of a large number of genes and networks of protein interactions. A deep knowledge of the molecular... more
Many government, academic and research institutions collect environmental data that are relevant to understanding the relationship between environmental exposures and human health. Integrating these data with health outcome data presents... more
Computer Methods and Programs in Biomedicine 108 (2012) 90-101. Keywords: Public health information systems; Database integration; Cervical cancer. Abstract: This paper aims to present the integration of... more