Background: Semantically-enriched browsing has enhanced the browsing experience by providing contextualised, dynamically generated Web content and quicker access to searched-for information. However, adoption of Semantic Web technologies is limited, and user perception from the non-IT domain is sceptical. Furthermore, little attention has been given to evaluating semantic browsers with real users to demonstrate the enhancements and obtain valuable feedback. The Sealife project investigates semantic browsing and its application to the life science domain. Sealife's main objective is to develop the notion of context-based information integration by extending three existing Semantic Web browsers (SWBs) to link the existing Web to the eScience infrastructure. ... polished interface was rated higher for usability, and semantic links were used by the users of all three browsers. Conclusion: Confirmation or contradiction of our original hypotheses with relation to SWBs is detailed along with observations of implementation issues.
This paper describes a software architecture designed to enable the evaluation of information processing and retrieval systems. The overall objective of our project is to provide an open technical framework for the integration of tools for collection, processing, analysis and communication of open source information. However, enabling the integration of heterogeneous components does not make sense without a proper way to compare the capabilities of multiple tools.
Background: Ontology term labels can be ambiguous and have multiple senses. While this is no problem for human annotators, it is a challenge for automated methods that identify ontology terms in text. Classical approaches to word sense disambiguation use co-occurring words or terms. However, most treat ontologies as simple terminologies, without making use of the ontology structure or the semantic similarity between terms. Metadata are another useful source of information for disambiguation. Here, we systematically compare three approaches to word sense disambiguation, which use ontologies and metadata, respectively.
A perceived limitation of the current Web is that it consists of static links with no connection to any underlying domain knowledge. The Semantic Web is seen as a potential solution to this problem by delivering semantically related information to users dynamically. However, the benefit to users is rarely questioned and there have been few real-world user evaluations of semantic systems. In this paper we present a user-centred evaluation of three Semantic Web Browsers (SWBs) that have been extended as part of the Sealife project. The results presented are based on analysis of the server logs from each application, in relation to the time taken to perform pre-defined tasks along with the amount of semantic activity carried out. It was found that the user experience was dependent on the SWB used, but there was some indication that users will be able to find information more quickly and that users will explore semantic features if present.
A Semantic Web approach seems promising for supporting content mining of the millions of patents accessible through the Web. In this paper, we describe our approach for generating semantic annotations on patents, relying on the structure and on a semantic representation of patent documents. We use both the structure of the patent documents and their textual contents processed by Natural Language Processing (NLP) tools. This method, primarily aimed at helping biologists use patent information, can be generalized to all ...
In this paper, we describe the SemanticFAQ system, built using a method for semi-automatic creation of an ontology and of semantic annotations from a corpus of e-mails of a community of practice (CoP). Such an e-mail corpus raises several original issues for Natural Language Processing (NLP) techniques. The annotations thus generated will feed a list of frequently asked questions (FAQ). The ontology and the annotations constitute the basis of a semantic portal, SemanticFAQ, that offers the CoP members semantic navigation among the e-mails.
Web usage mining can play an important role in supporting navigation on the future Web. In fact, the detection of common or professional profiles allows browsers and web sites to personalise the user session and to recommend specific resources to interested people. A Semantic Web approach seems promising for this task. In this paper we propose a generic approach for profile detection relying on Semantic Web technologies. It takes advantage of ontologies, semantic annotations on web resources and inference engines.
We present the value of Semantic Web techniques, particularly semantic annotation, in the biochip domain. We propose a semi-automatic method using information extraction (IE) techniques to facilitate the generation of ontology-based annotations for scientific articles. Furthermore, we evaluate and discuss our method by applying it to the annotation of a textual corpus provided by biologists working in the biochip domain. Finally, we argue that ontology-based semantic annotation can improve information retrieval.
The basic principle of the Semantic Web carried by the RDF data model is that many RDF statements coexist and are universally true. However, some case studies imply contextual relevancy and truth; this is well known in the Conceptual Graph community and handled through the notion of contexts. In this paper, we present an approach and a tool for semantic annotation of textual data using graph contexts. We rely on both Natural Language Processing and Semantic Web technologies and propose a model of RDF contexts inspired by nested Conceptual Graphs. Sentences are first analysed, and their grammatical constituents (subject, verb, object) are extracted and mapped to RDF triples. Links between these triples are then established within a semantic scope (i.e., context). The context definition allows us to validate the generated annotations by disambiguating misleading RDF triples. We show how far our approach is applicable to texts in Engineering Design.
Patents are a very rich source of information, since they are the documents used to describe inventions. Online access to patent documents is possible thanks to the efforts of national intellectual property offices. Moreover, owing to differing objectives, the presentation of these documents has taken varied forms that are far from unified. This paper presents a method and a system for patent analysis ("Patent Mining") to generate semantic annotations. The idea ...
... Remy Bars from Bayers Cropscience, Didier Bourigault for providing us the results of Syntex on our corpus, Laurent Alamarguy for his assistance in linguistic domain, Martine Collard, Nhanh le ... Kim S., Alani H., Hall W., Lewis P., Millard D., Shadbolt N. and Weal M. (2002). ...
This paper describes an ontology-based approach that aims to help biologists annotate their documents and to facilitate their information retrieval task. Our approach, based on Semantic Web technologies, relies on formalised ontologies, semantic annotations of scientific articles and knowledge extraction from texts. We propose a method and system for the generation of ontology-based semantic annotations (MeatAnnot) and a system allowing biologists to draw advanced inferences on these annotations (MeatSearch). This approach was proposed to support biologists working on DNA microarray experiments in the validation and interpretation of their results, but it can probably be extended to other massive analyses of biological events (as provided by proteomics, metabolomics, etc.).
... Rose Dieng-Kuntz INRIA Sophia Antipolis 2004, route des lucioles 06902 Sophia Antipolis - FRANCE ... A view on related experiments: trying to identify relations between experiments (local databases, on-line data) and to discover new research ...
We present an ontology-driven word sense disambiguation process. The main idea consists of using the context of the ambiguous word to decide which class can be assigned to it. The disambiguation relies on similarities between classes assigned to the ambiguous word, classes assigned to terms close to it in the text, and on the type of properties that could occur between them. The computation of the similarity uses domain ontologies to provide semantic distances based on definitions in intension. We tested our approach in the extraction of annotations from biomedical texts.
In this paper, we present the Corese-NeLI semantic Web browser, dedicated to navigating resources in the infectious disease domain. We give an overview of the semantic Web browser and outline its functionality and the knowledge organization system used as background knowledge for both the annotation and search processes. The evaluation of the vocabulary-based annotation, essential for the semantic browser, uses the National Electronic Library of Infection as a test bed and demonstrates over 96% correct annotations.
This article describes the MEAT project (Mémoire d'Expériences pour l'Analyse du Transcriptome, an experiment memory for transcriptome analysis), whose goal is to assist biologists working in the DNA microarray domain in interpreting and validating their results. We propose a ...
A Semantic Web approach seems promising for supporting content mining of the millions of patents accessible through the Web. In this paper, we describe our approach for generating semantic annotations on patents, relying on the structure and on a semantic representation of patent documents. We use both the structure of the patent documents and their textual contents processed by Natural Language Processing (NLP) tools. This method, primarily aimed at helping biologists use patent information, can be generalized to all kinds of domains or structured documents.
Papers by Khaled Khelif