Johannes Leveling

Elsevier, Elsevier Labs, Research Technology Director

Dublin City University, CNGL, School of Computing, Research Fellow

I am a Research Fellow at the Centre for Next Generation Localisation (CNGL) at Dublin City University (DCU). My work is centered around information retrieval (IR), query reformulation, and relevance feedback. Other research interests include natural language processing (NLP), geographic information retrieval (GIR) and question answering (QA).
Phone: +353 (0)1 700 6716
Address: Centre for Next Generation Localisation (CNGL),
Dublin City University,
Dublin 9, Ireland

Books by Johannes Leveling

Formale Interpretation von Nutzeranfragen für natürlichsprachliche Interfaces zu Informationsangeboten im Internet

Papers by Johannes Leveling

Simulation of Within-Session Query Variations Using a Text Segmentation Approach

Lecture Notes in Computer Science, 2011

We propose a generative model for automatic query reformulations from an initial query using the underlying subtopic structure of top ranked retrieved documents. We address three types of query reformulations: a) specialization; b) generalization; and c) drift. To test our model we generate the three reformulation variants starting with selected fields from the TREC-8 topics as the initial queries. We use manual judgments from multiple assessors to calculate the accuracy of the reformulated query variants and observe accuracies of 65%, 82% and 69% respectively for specialization, generalization and drift reformulations.

Retrieval of similar chess positions

Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, 2014

We address the problem of retrieving chess game positions similar to a given query position from a collection of archived chess games. We investigate this problem from an information retrieval (IR) perspective. The advantage of our proposed IR-based approach is that it allows using the standard inverted organization of stored chess positions, leading to efficient retrieval. Moreover, in contrast to retrieving exactly identical board positions, the IR-based approach is able to provide approximate search functionality. In order to define the similarity between two chess board positions, we encode each game state with a textual representation. This textual encoding is designed to represent the position, reachability and the connectivity between chess pieces. Due to the absence of a standard IR dataset that can be used for this search task, a new evaluation benchmark dataset was constructed comprising documents (chess positions) from a freely available chess game archive. Experiments conducted on this dataset demonstrate that our proposed method of similarity computation, which takes into account a combination of the mobility and the connectivities between the chess pieces, performs well on the search task, achieving MAP and nDCG values of 0.4233 and 0.6922 respectively.

Query recovery of short user queries

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval, 2010

User queries to search engines are observed to predominantly contain inflected content words but lack stopwords and capitalization. Thus, they often resemble natural language queries after case folding and stopword removal. Query recovery aims to generate a linguistically well-formed query from a given user query, to serve as input for natural language processing tasks and cross-language information retrieval (CLIR). The evaluation of query translation shows that translation scores (NIST and BLEU) decrease after case folding, stopword removal, and stemming. A baseline method for query recovery reconstructs capitalization and stopwords, which considerably increases translation scores and significantly increases mean average precision for a standard CLIR task.

An investigation of decompounding for cross-language patent search

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, 2011

Decompounding has been found to improve information retrieval (IR) effectiveness in general domains for languages such as German or Dutch. We investigate if cross-language patent retrieval can profit from decompounding. This poses two challenges: i) There may be few resources such as parallel corpora available for training a machine translation system for a compounding language. ii) Patents have a specific writing style and vocabulary ("patentese"), which may affect the performance of decompounding and translation methods. Experiments on data from the CLEF-IP 2010 task show that decompounding patents for translation can overcome out-of-vocabulary (OOV) problems and that decompounding improves IR performance significantly for small training corpora.

Towards evaluation of personalized and collaborative information retrieval

Proceedings of the First Workshop on Personalised Multilingual Hypertext Retrieval, 2011

We propose to extend standard information retrieval (IR) ad-hoc test collection design to facilitate research on personalized and collaborative IR by gathering additional meta-information during the topic (query) development process. We propose a controlled query generation process with activity logging for each topic developer. The standard ad-hoc collection will thus be accompanied by a new set of thematically related topics and the associated log information, and has the potential to simulate a real-world search scenario to encourage retrieval systems to mine user information from the logs to improve IR effectiveness. The methodology described in this paper will be applied in a pilot task which is scheduled to run in the FIRE 2011 evaluation campaign. The task aims at investigating the research question of whether personalized and collaborative IR experiments and evaluation can be pursued by enriching a standard ad-hoc collection with such meta-information.

Exploring Accumulative Query Expansion for Relevance Feedback

Lecture Notes in Computer Science, 2011

For the participation of Dublin City University (DCU) in the Relevance Feedback (RF) track of INEX 2010, we investigated the relation between the length of relevant text passages and the number of RF terms. In our experiments, relevant passages are segmented into non-overlapping windows of fixed length which are sorted by similarity with the query. In each retrieval iteration, we extend the current query with the most frequent terms extracted from these word windows. The number of feedback terms corresponds to a constant number, a number proportional to the length of relevant passages, and a number inversely proportional to the length of relevant passages, respectively. Retrieval experiments show a significant increase in MAP for INEX 2008 training data and improved precision at early recall levels for the 2010 topics as compared to the baseline Rocchio feedback.
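The windowing step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the term-overlap similarity, the function names, and the parameters `window_size`, `top_windows`, and `num_terms` are assumptions made for illustration, and only the constant-number-of-terms variant is shown.

```python
from collections import Counter


def windows(tokens, size):
    """Split a token list into non-overlapping windows of fixed length."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def overlap(window, query_terms):
    """Toy similarity (an assumption): count of window tokens that are query terms."""
    return sum(1 for tok in window if tok in query_terms)


def expand_query(query, passage_tokens, window_size=4, top_windows=2, num_terms=2):
    """Extend the query with the most frequent terms from the best-matching windows."""
    query_terms = set(query)
    # Rank windows by similarity with the current query, best first.
    ranked = sorted(windows(passage_tokens, window_size),
                    key=lambda w: overlap(w, query_terms), reverse=True)
    # Count candidate terms in the top windows, skipping terms already in the query.
    counts = Counter(tok for w in ranked[:top_windows] for tok in w
                     if tok not in query_terms)
    return list(query) + [term for term, _ in counts.most_common(num_terms)]


passage = "patent claims cite prior patent claims and prior art while chess uses boards".split()
expanded = expand_query(["patent"], passage)
```

Calling `expand_query` again with `expanded` as the query would accumulate further terms across iterations, which is the sense of "accumulative" assumed here.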

An architecture for rating and controlling text readability

Proceedings of KONVENS 2006, 2006

In the so-called information society with its strong tendency towards individualization, it becomes more and more important to have all sorts of textual information available in a simple and easy to understand language. We present an approach that automatically rates the readability of German texts and also provides suggestions on how to make a given text more readable. Our system, called DeLite, employs a powerful NLP component that supports the syntactic and semantic analysis of German texts.

DCU@ TRECMed 2012: Using ad-hoc baselines for domain-specific retrieval

This paper describes the first participation of DCU in the TREC Medical Records Track (TRECMed) 2012. We performed initial experiments on the 2011 TRECMed data based on the BM25 retrieval model. Surprisingly, we found that the standard BM25 model with default parameters performs comparably to the best automatic runs submitted to TRECMed 2011 and our experiments would have ranked among the top four out of 29 participating groups. We expected that some form of domain adaptation would increase performance. However, results on the 2011 data proved otherwise: query expansion decreased performance, and filtering and reranking by term proximity also decreased performance slightly. We submitted four runs based on the BM25 retrieval model to TRECMed 2012 using standard BM25, standard query expansion, result filtering, and concept-based query expansion. Official results for 2012 confirm that domain-specific knowledge, as applied by us, does not increase performance compared to the BM25 baseline.
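For readers unfamiliar with the baseline, the following is a minimal sketch of BM25 scoring. The abstract does not state which default parameters or IDF variant were used; `k1=1.2`, `b=0.75`, the smoothed IDF form, and whitespace tokenization are commonly used choices assumed here, and the example documents are made up.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a common BM25 formulation."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Smoothed IDF (assumed variant); always positive.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = ["patient with diabetes admitted",
        "chess opening theory",
        "diabetes diabetes treatment plan"]
scores = bm25_scores("diabetes", docs)
```

In this toy collection the third document, which mentions the query term twice, scores highest, and the second, which does not mention it, scores zero.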

Building a domain-specific document collection for evaluating metadata effects on information retrieval

This paper describes the development of a structured document collection containing user-generated text and numerical metadata for exploring the exploitation of metadata in information retrieval (IR). The collection consists of more than 61,000 documents extracted from YouTube video pages on basketball in general and NBA (National Basketball Association) in particular, together with a set of 40 topics and their relevance judgements. In addition, a collection of nearly 250,000 user profiles related to the NBA collection is available. Several baseline IR experiments report the effect of using video-associated metadata on retrieval effectiveness. The results surprisingly show that searching the video titles only performs significantly better than searching additional metadata text fields of the videos such as the tags or the description.

HITS and misses: combining BM25 with HITS for expert search

This paper describes the participation of Dublin City University in the CriES (Cross-Lingual Expert Search) pilot challenge. To realize expert search, we combine traditional information retrieval (IR) using the BM25 model with reranking of results using the HITS algorithm. The experiments were performed on two indexes, one containing all questions and one containing all answers. Two runs were submitted. The first one contains the combination of results from IR on the questions with authority values from HITS; the second contains the reranked results from IR on answers with authority values. To investigate the impact of multilinguality, additional experiments were conducted on the English topic subset and on all topics translated into English with Google Translate. The overall performance is moderate and leaves much room for improvement. However, reranking results with authority values from HITS typically improved results and more than doubled the number of relevant and retrieved results and precision at 10 documents in many experiments.
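The authority values mentioned in this abstract come from the HITS algorithm, which can be sketched as a short power iteration. The graph below (edges pointing from a question asker to an answerer) and all node names are made-up assumptions, and the sketch does not show how the paper combined authority values with BM25 scores.

```python
def hits(edges, iterations=50):
    """Compute HITS hub and authority scores for a directed graph given as (source, target) pairs."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum of hub scores of nodes linking to n.
        auth = {n: sum(hub[s] for s, t in edges if t == n) for n in nodes}
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub update: sum of authority scores of nodes that n links to.
        hub = {n: sum(auth[t] for s, t in edges if s == n) for n in nodes}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth


# Hypothetical graph: three askers point to the users who answered them.
edges = [("u1", "e1"), ("u2", "e1"), ("u3", "e1"), ("u3", "e2")]
hub, auth = hits(edges)
```

Here `e1`, who answered three askers, receives a higher authority value than `e2`, while users who only ask questions receive an authority of zero.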

Same query-different results? A study of repeat queries in search sessions

Typically, three main query reformulation types in sessions are considered: generalization, specification, and drift. We show that given the full context of user interactions, repeat queries represent an important reformulation type which should also be addressed in session retrieval evaluation. We investigate different query reformulation patterns in logs from The European Library. Using an automatic classification for query reformulations, we found that the most frequent (and presumably the most important) reformulation pattern corresponds to repeat queries. We aim to find possible explanations for repeat queries in sessions and try to uncover implications for session retrieval evaluation.

DCU@ FIRE2010: Term conflation, blind relevance feedback, and cross-language IR with manual and automatic query translation

For the first participation of Dublin City University (DCU) in the FIRE 2010 evaluation campaign, information retrieval (IR) experiments on English, Bengali, Hindi, and Marathi documents were performed to investigate term conflation (different stemming approaches and indexing word prefixes), blind relevance feedback, and manual and automatic query translation. The experiments are based on BM25 and on language modeling (LM) for IR. Results show that term conflation always improves mean average precision (MAP) compared to indexing unprocessed word forms, but different approaches seem to work best for different languages. For example, in monolingual Marathi experiments indexing 5-prefixes outperforms our corpus-based stemmer; in Hindi, the corpus-based stemmer achieves a higher MAP. For Bengali, the LM retrieval model achieves a much higher MAP than BM25 (0.4944 vs. 0.4526). In all experiments using BM25, blind relevance feedback yields considerably higher MAP in comparison to experiments without it. Bilingual IR experiments (English→Bengali and English→Hindi) are based on query translations obtained from native speakers and the Google Translate web service. For the automatically translated queries, MAP is slightly (but not significantly) lower compared to experiments with manual query translations. The bilingual English→Bengali (English→Hindi) experiments achieve 81.7%-83.3% (78.0%-80.6%) of the best corresponding monolingual experiments.
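Indexing n-prefixes, as mentioned in this abstract, conflates word forms by truncating each token to its first n characters. A minimal sketch follows; the function name, lowercasing, and whitespace tokenization are assumptions, and Python slicing counts Unicode code points, which may not match how characters were counted for Indian scripts in the paper.

```python
def prefix_terms(text, n=5):
    """Truncate each whitespace-separated token to its first n characters."""
    return [token[:n] for token in text.lower().split()]


terms = prefix_terms("Retrieval retrieved retrieving IR")
```

The three inflected forms collapse to the single index term `retri`, while tokens shorter than n characters are kept unchanged.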

DCU@ CLEF-IP 2009: Exploring standard IR techniques on patent retrieval

This paper presents the experiments and results for our participation in CLEF-IP 2009, which was newly launched this year. Our work applied standard information retrieval (IR) techniques to patent search. Different experiments tested various methods for patent retrieval, including query formulation, structured index, weighted fields, filtering, and relevance feedback. Some methods, such as blind relevance feedback, did not show the expected retrieval effectiveness; other experiments showed acceptable performance. Query formulation was the key task for achieving better retrieval effectiveness, and this was performed by giving higher weights to the text in certain fields. For the best runs, the retrieval effectiveness is still lower than that of IR applications in other domains, illustrating the difficulty of patent search. The official results have shown that among fifteen participants we achieved the seventh and the fourth ranks from the mean average precision (MAP) and recall point of view, respectively.

DCU's experiments in NTCIR-8 IR4QA task

DCU at WikipediaMM 2009: Document expansion from wikipedia abstracts

In this paper, we describe and analyze our participation in the WikipediaMM task at CLEF 2009. Our main efforts concern the expansion of the image metadata from the Wikipedia abstracts collection (DBpedia). In our experiments, we use the Okapi feedback algorithm for document expansion. Compared with our text retrieval baseline, our best document expansion run improves MAP by 17.89%. As one of our conclusions, document expansion from an external resource can be an effective factor in the image metadata retrieval task.

DCU-TCD@ LogCLEF 2010: Re-ranking document collections and query performance estimation

This paper describes the collaborative participation of Dublin City University and Trinity College Dublin in LogCLEF 2010. Two sets of experiments were conducted. First, different aspects of the TEL query logs were analysed after extracting user sessions of consecutive queries on a topic. The relation between the queries and their length (number of terms) and position (first query or further reformulations) was examined in a session with respect to query performance estimators such as query scope, IDF-based measures, simplified query clarity score, and average inverse document collection frequency. Results of this analysis suggest that only some estimator values show a correlation with query length or position in the TEL logs (e.g. similarity score between collection and query). Second, the relation between three attributes was investigated: the user's country (detected from IP address), the query language, and the interface language. The investigation aimed to explore the influence of the three attributes on the user's collection selection. Moreover, the investigation involved assigning different weights to the three attributes in a scoring function that was used to re-rank the collections displayed to the user according to the language and country. The results of the collection re-ranking show a significant improvement in Mean Average Precision (MAP) over the original collection ranking of TEL. The results also indicate that the query language and interface language have more influence than the user's country on the collections selected by the users.

Classifying and filtering blind feedback terms to improve information retrieval effectiveness

The classification of blind relevance feedback (BRF) terms described in this paper aims at increasing precision or recall by determining which terms decrease, increase or do not change the corresponding information retrieval (IR) performance metric. Classification and IR experiments are performed on the German and English GIRT data, using the BM25 retrieval model. Several basic memory-based classifiers are trained on different feature sets, grouping together features from different query expansion (QE) approaches. Combined classifiers employ the results of the basic classifiers and correctness predictions as features. The best combined classifiers for German (English) yield 22.9% (26.4%) and 5.8% (1.9%) improvement for term classification with respect to precision and recall compared to the best basic classifiers. IR experiments based on this term classification have also been performed. Filtering out different types of BRF terms shows that selecting feedback terms predicted to increase precis...

Sub-Word Indexing and Blind Relevance Feedback for English, Bengali, Hindi, and Marathi IR

ACM Transactions on Asian Language Information Processing, 2010

The Forum for Information Retrieval Evaluation (FIRE) provides document collections, topics, and relevance assessments for information retrieval (IR) experiments on Indian languages. Several research questions are explored in this article: 1) How to create a simple, language-independent corpus-based stemmer, 2) How to identify sub-words and which types of sub-words are suitable as indexing units, and 3) How to apply blind relevance feedback on sub-words and how feedback term selection is affected by the type of the indexing unit. More than 140 IR experiments are conducted using the BM25 retrieval model on the topic titles and descriptions (TD) for the FIRE 2008 English, Bengali, Hindi, and Marathi document collections. The major findings are: The corpus-based stemming approach is effective as a knowledge-light term conflation step and useful in the case of few language-specific resources. For English, the corpus-based stemmer performs nearly as well as the Porter stemmer and ...

DCU and ISI@INEX 2010: Adhoc and Data-Centric Tracks

by Gareth Jones and Johannes Leveling

Lecture Notes in Computer Science, 2011

We describe the participation of Dublin City University (DCU) and the Indian Statistical Institute (ISI) in INEX 2010. The main contributions of this paper are: i) a simplified version of Hierarchical Language Model (HLM) which involves scoring XML elements with a combined probability of generating the given query from itself and the top level article node, is shown to outperform the baselines of Language Model (LM) and Vector Space Model (VSM) scoring of XML elements; ii) the Expectation Maximization (EM) feedback in LM is shown to be the most effective on the domain specific collection of IMDB; iii) automated removal of sentences indicating aspects of irrelevance from the narratives of INEX ad-hoc topics is shown to improve retrieval effectiveness.

Formale Interpretation von Nutzeranfragen für natürlichsprachliche Interfaces zu Informationsangeboten im Internet

Simulation of Within-Session Query Variations Using a Text Segmentation Approach

Lecture Notes in Computer Science, 2011

We propose a generative model for automatic query reformulations from an initial query using the ... more We propose a generative model for automatic query reformulations from an initial query using the underlying subtopic structure of top ranked retrieved documents. We address three types of query reformulations a) specialization; b) generalization; and c) drift. To test our model we generate the three reformulation variants starting with selected fields from the TREC-8 topics as the initial queries. We use manual judgments from multiple assessors to calculate the accuracy of the reformulated query variants and observe accuracies of 65%, 82% and 69% respectively for specialization, generalization and drift reformulations.

Retrieval of similar chess positions

Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, 2014

We address the problem of retrieving chess game positions similar to a given query position from ... more We address the problem of retrieving chess game positions similar to a given query position from a collection of archived chess games. We investigate this problem from an information retrieval (IR) perspective. The advantage of our proposed IR-based approach is that it allows using the standard inverted organization of stored chess positions, leading to an efficient retrieval. Moreover, in contrast to retrieving exactly identical board positions, the IR-based approach is able to provide approximate search functionality. In order to define the similarity between two chess board positions, we encode each game state with a textual representation. This textual encoding is designed to represent the position, reachability and the connectivity between chess pieces. Due to the absence of a standard IR dataset that can be used for this search task, a new evaluation benchmark dataset was constructed comprising of documents (chess positions) from a freely available chess game archive. Experiments conducted on this dataset demonstrate that our proposed method of similarity computation, which takes into account a combination of the mobility and the connectivities between the chess pieces, performs well on the search task, achieving MAP and nDCG values of 0.4233 and 0.6922 respectively.

Query recovery of short user queries

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval, 2010

User queries to search engines are observed to predominantly contain inflected content words but ... more User queries to search engines are observed to predominantly contain inflected content words but lack stopwords and capitalization. Thus, they often resemble natural language queries after case folding and stopword removal. Query recovery aims to generate a linguistically well-formed query from a given user query as input to provide natural language processing tasks and cross-language information retrieval (CLIR). The evaluation of query translation shows that translation scores (NIST and BLEU) decrease after case folding, stopword removal, and stemming. A baseline method for query recovery reconstructs capitalization and stopwords, which considerably increases translation scores and significantly increases mean average precision for a standard CLIR task.

An investigation of decompounding for cross-language patent search

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, 2011

Decompounding has been found to improve information retrieval (IR) effectiveness in general domai... more Decompounding has been found to improve information retrieval (IR) effectiveness in general domains for languages such as German or Dutch. We investigate if cross-language patent retrieval can profit from decompounding. This poses two challenges: i) There may be few resources such as parallel corpora available for training an machine translation system for a compounding language. ii) Patents have a specific writing style and vocabulary ("patentese"), which may affect the performance of decompounding and translation methods. Experiments on data from the CLEF-IP 2010 task show that decompounding patents for translation can overcome out-of-vocabulary problems (OOV) and that decompounding improves IR performance significantly for small training corpora.

Towards evaluation of personalized and collaborative information retrieval

Proceedings of the First Workshop on Personalised Multilingual Hypertext Retrieval, 2011

We propose to extend standard information retrieval (IR) ad-hoc test collection design to facilit... more We propose to extend standard information retrieval (IR) ad-hoc test collection design to facilitate research on personalized and collaborative IR by gathering additional meta-information during the topic (query) development process. We propose a controlled query generation process with activity logging for each topic developer. The standard ad-hoc collection will thus be accompanied by a new set of thematically related topics and the associated log information, and has the potential to simulate a real-world search scenario to encourage retrieval systems to mine user information from the logs to improve IR effectiveness. The proposed methodology described in this paper will be applied in a pilot task which is scheduled to run in the FIRE 2011 evaluation campaign. The task aims at investigating the research question of whether personalized and collaborative IR retrieval experiments and evaluation can be pursued by enriching a standard ad-hoc collection with such meta-information.

Exploring Accumulative Query Expansion for Relevance Feedback

Lecture Notes in Computer Science, 2011

For the participation of Dublin City University (DCU) in the Relevance Feedback (RF) track of INE... more For the participation of Dublin City University (DCU) in the Relevance Feedback (RF) track of INEX 2010, we investigated the relation between the length of relevant text passages and the number of RF terms. In our experiments, relevant passages are segmented into non-overlapping windows of fixed length which are sorted by similarity with the query. In each retrieval iteration, we extend the current query with the most frequent terms extracted from these word windows. The number of feedback terms corresponds to a constant number, a number proportional to the length of relevant passages, and a number inversely proportional to the length of relevant passages, respectively. Retrieval experiments show a significant increase in MAP for INEX 2008 training data and improved precisions at early recall levels for the 2010 topics as compared to the baseline Rocchio feedback.

An architecture for rating and controlling text readability

Proceedings of KONVENS 2006, 2006

In the so-called information society with its strong tendency towards individualization, it becom... more In the so-called information society with its strong tendency towards individualization, it becomes more and more important to have all sorts of textual information available in a simple and easy to understand language. We present an approach that allows to automatically rate the readability of German texts and also provides suggestions how to make a given text more readable. Our system, called DeLite, employs a powerful NLP component that supports the syntactic and semantic analysis of German texts.

DCU@ TRECMed 2012: Using ad-hoc baselines for domain-specific retrieval

This paper describes the first participation of DCU in the TREC Medical Records Track (TRECMed) 2... more This paper describes the first participation of DCU in the TREC Medical Records Track (TRECMed) 2012. We performed initial experiments on the the 2011 TRECMed data based on the BM25 retrieval model. Surprisingly, we found that the standard BM25 model with default parameters performs comparable to the best automatic runs submitted to TRECMed 2011 and our experiments would have ranked among the top four out of 29 participating groups. We expected that some form of domain adaptation would increase performance. However, results on the 2011 data proved otherwise: query expansion decreased performance, and filtering and reranking by term proximity also decreased performance slightly. We submitted four runs based on the BM25 retrieval model to TRECMed 2012 using standard BM25, standard query expansion, result filtering, and concept-based query expansion. Official results for 2012 confirm that domain-specific knowledge, as applied by us, does not increase performance compared to the BM25 baseline.

Building a domain-specific document collection for evaluating metadata effects on information retrieval

This paper describes the development of a structured document collection containing user-generate... more This paper describes the development of a structured document collection containing user-generated text and numerical metadata for exploring the exploitation of metadata in information retrieval (IR). The collection consists of more than 61,000 documents extracted from YouTube video pages on basketball in general and NBA (National Basketball Association) in particular, together with a set of 40 topics and their relevance judgements. In addition, a collection of nearly 250,000 user profiles related to the NBA collection is available. Several baseline IR experiments report the effect of using video-associated metadata on retrieval effectiveness. The results surprisingly show that searching the videos titles only performs significantly better than searching additional metadata text fields of the videos such as the tags or the description.

HITS and misses: combining BM25 with HITS for expert search

This paper describes the participation of Dublin City University in the CriES (Cross-Lingual Expe... more This paper describes the participation of Dublin City University in the CriES (Cross-Lingual Expert Search) pilot challenge. To realize expert search, we combine traditional information retrieval (IR) using the BM25 model with reranking of results using the HITS algorithm. The experiments were performed on two indexes, one containing all questions and one containing all answers. Two runs were submitted. The first one contains the combination of results from IR on the questions with authority values from HITS; the second contains the reranked results from IR on answers with authority values. To investigate the impact of multilinguality, additional experiments were conducted on the English topic subset and on all topics translated into English with Google Translate. The overall performance is moderate and leaves much room for improvement. However, reranking results with authority values from HITS typically improved results and more than doubled the number of relevant and retrieved results and precision at 10 documents in many experiments.

Same query-different results? A study of repeat queries in search sessions

Typically, three main query reformulation types in sessions are considered: generalization, specification, and drift. We show that given the full context of user interactions, repeat queries represent an important reformulation type which should also be addressed in session retrieval evaluation. We investigate different query reformulation patterns in logs from The European Library. Using an automatic classification for query reformulations, we found that the most frequent (and presumably the most important) reformulation pattern corresponds to repeat queries. We aim to find possible explanations for repeat queries in sessions and try to uncover implications for session retrieval evaluation.

DCU@ FIRE2010: Term conflation, blind relevance feedback, and cross-language IR with manual and automatic query translation

For the first participation of Dublin City University (DCU) in the FIRE 2010 evaluation campaign, information retrieval (IR) experiments on English, Bengali, Hindi, and Marathi documents were performed to investigate term conflation (different stemming approaches and indexing word prefixes), blind relevance feedback, and manual and automatic query translation. The experiments are based on BM25 and on language modeling (LM) for IR. Results show that term conflation always improves mean average precision (MAP) compared to indexing unprocessed word forms, but different approaches seem to work best for different languages. For example, in monolingual Marathi experiments indexing 5-prefixes outperforms our corpus-based stemmer; in Hindi, the corpus-based stemmer achieves a higher MAP. For Bengali, the LM retrieval model achieves a much higher MAP than BM25 (0.4944 vs. 0.4526). In all experiments using BM25, blind relevance feedback yields considerably higher MAP in comparison to experiments without it. Bilingual IR experiments (English→Bengali and English→Hindi) are based on query translations obtained from native speakers and the Google Translate web service. For the automatically translated queries, MAP is slightly (but not significantly) lower compared to experiments with manual query translations. The bilingual English→Bengali (English→Hindi) experiments achieve 81.7%-83.3% (78.0%-80.6%) of the best corresponding monolingual experiments.
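
To make the idea of indexing word prefixes concrete, here is a minimal Python sketch of prefix-based term conflation, where every token is cut to its first n characters before indexing. The function name `prefix_conflate` and the English sample words are assumptions for illustration only; the abstract above does not describe the actual tokenization or indexing pipeline.

```python
def prefix_conflate(tokens, n=5):
    """Map each token to its first n characters (an "n-prefix").

    Tokens shorter than n characters are kept unchanged.
    """
    return [token[:n] for token in tokens]


# Morphological variants collapse onto one index term.
tokens = ["retrieval", "retrieving", "retrieved", "map"]
print(prefix_conflate(tokens))  # ['retri', 'retri', 'retri', 'map']
```

Prefix truncation needs no language-specific resources, which is one reason it is attractive when no stemmer is available for a language.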

DCU@ CLEF-IP 2009: Exploring standard IR techniques on patent retrieval

This paper presents the experiments and results for our participation in CLEF-IP 2009, which is newly launched this year. Our work applied standard information retrieval (IR) techniques to patent search. Different experiments tested various methods for patent retrieval, including query formulation, structured indexing, weighted fields, filtering, and relevance feedback. Some methods, such as blind relevance feedback, did not show the expected retrieval effectiveness; other experiments showed acceptable performance. Query formulation was the key task for achieving better retrieval effectiveness, and this was performed by giving higher weights to the text in certain fields. Even for the best runs, the retrieval effectiveness is still lower than for IR applications in other domains, which illustrates the difficulty of patent search. The official results show that among fifteen participants we ranked seventh by mean average precision (MAP) and fourth by recall.

DCU's experiments in NTCIR-8 IR4QA task

DCU at WikipediaMM 2009: Document expansion from wikipedia abstracts

In this paper, we describe and analyze our participation in the WikipediaMM task at CLEF 2009. Our main efforts concern the expansion of the image metadata from the Wikipedia abstracts collection, DBpedia. In our experiments, we use the Okapi feedback algorithm for document expansion. Compared with our text retrieval baseline, our best document expansion run improves MAP by 17.89%. As one of our conclusions, document expansion from an external resource can be an effective factor in the image metadata retrieval task.

DCU-TCD@ LogCLEF 2010: Re-ranking document collections and query performance estimation

This paper describes the collaborative participation of Dublin City University and Trinity College Dublin in LogCLEF 2010. Two sets of experiments were conducted. First, different aspects of the TEL query logs were analysed after extracting user sessions of consecutive queries on a topic. The relation between the queries and their length (number of terms) and position (first query or further reformulations) was examined in a session with respect to query performance estimators such as query scope, IDF-based measures, simplified query clarity score, and average inverse document collection frequency. Results of this analysis suggest that only some estimator values show a correlation with query length or position in the TEL logs (e.g. similarity score between collection and query). Second, the relation between three attributes was investigated: the user's country (detected from IP address), the query language, and the interface language. The investigation aimed to explore the influence of the three attributes on the user's collection selection. Moreover, the investigation involved assigning different weights to the three attributes in a scoring function that was used to re-rank the collections displayed to the user according to the language and country. The results of the collection re-ranking show a significant improvement in Mean Average Precision (MAP) over the original collection ranking of TEL. The results also indicate that the query language and interface language have more influence than the user's country on the collections selected by the users.

Classifying and filtering blind feedback terms to improve information retrieval effectiveness

The classification of blind relevance feedback (BRF) terms described in this paper aims at increasing precision or recall by determining which terms decrease, increase or do not change the corresponding information retrieval (IR) performance metric. Classification and IR experiments are performed on the German and English GIRT data, using the BM25 retrieval model. Several basic memory-based classifiers are trained on different feature sets, grouping together features from different query expansion (QE) approaches. Combined classifiers employ the results of the basic classifiers and correctness predictions as features. The best combined classifiers for German (English) yield 22.9% (26.4%) and 5.8% (1.9%) improvement for term classification with respect to precision and recall compared to the best basic classifiers. IR experiments based on this term classification have also been performed. Filtering out different types of BRF terms shows that selecting feedback terms predicted to increase precis...

Sub-Word Indexing and Blind Relevance Feedback for English, Bengali, Hindi, and Marathi IR

ACM Transactions on Asian Language Information Processing, 2010

The Forum for Information Retrieval Evaluation (FIRE) provides document collections, topics, and relevance assessments for information retrieval (IR) experiments on Indian languages. Several research questions are explored in this article: 1) How to create a simple, language-independent corpus-based stemmer, 2) How to identify sub-words and which types of sub-words are suitable as indexing units, and 3) How to apply blind relevance feedback on sub-words and how feedback term selection is affected by the type of the indexing unit. More than 140 IR experiments are conducted using the BM25 retrieval model on the topic titles and descriptions (TD) for the FIRE 2008 English, Bengali, Hindi, and Marathi document collections. The major findings are: The corpus-based stemming approach is effective as a knowledge-light term conflation step and useful in the case of few language-specific resources. For English, the corpus-based stemmer performs nearly as well as the Porter stemmer and ...

DCU and ISI@INEX 2010: Adhoc and Data-Centric Tracks

by Gareth Jones and Johannes Leveling

Lecture Notes in Computer Science, 2011

We describe the participation of Dublin City University (DCU) and the Indian Statistical Institute (ISI) in INEX 2010. The main contributions of this paper are: i) a simplified version of the Hierarchical Language Model (HLM), which scores XML elements with a combined probability of generating the given query from the element itself and from the top-level article node, is shown to outperform the baselines of Language Model (LM) and Vector Space Model (VSM) scoring of XML elements; ii) Expectation Maximization (EM) feedback in LM is shown to be the most effective on the domain-specific IMDB collection; iii) automated removal of sentences indicating aspects of irrelevance from the narratives of INEX ad-hoc topics is shown to improve retrieval effectiveness.

Document Expansion for Text-Based Image Retrieval at CLEF 2009

by Gareth Jones and Johannes Leveling

Lecture Notes in Computer Science, 2010

We describe and analyze our participation in the WikipediaMM task at ImageCLEF 2010. Our approach is based on text-based image retrieval using information retrieval techniques on the metadata documents of the images. We submitted two English monolingual runs and one multilingual run. The monolingual runs used the query to retrieve the metadata document with the query and document in the same language; the multilingual run used queries in one language to search the metadata provided in three languages. The main focus of our work was using the English query to retrieve images based on the English metadata. For these experiments the English metadata was expanded using an external resource, DBpedia. This study expanded on our application of document expansion in our previous participation in ImageCLEF 2009. In 2010 we combined document expansion with a document reduction technique which aimed to include only topically important words in the metadata. Our experiments used the Okapi feedback algorithm for document expansion and the Okapi BM25 model for retrieval. Experimental results show that combining document expansion with the document reduction method gives the best overall retrieval results.