Web Information Retrieval
Recent papers in Web Information Retrieval
With growing numbers of non-English-language web searchers, the problems of efficient handling of non-English Web documents and user queries are becoming major issues for search engines. The main aim of this review paper is to...
Conference (SIGIR'07) aiming at bringing together researchers interested in non-English web searching.
Due to the numerous health documents available on the Web, information retrieval remains problematic with existing tools. This paper is positioned within the context of the CISMeF project (acronym of Catalogue and Index of French-speaking...
The objective of this paper is to analyze web logs using data mining so as to present users with more personalized web content. We classify users based on their internet usage patterns and for each class, maintain a cache of web...
As more information becomes available on the World Wide Web (there are currently over 4 billion pages covering most areas of human endeavor), it becomes more difficult to provide effective search tools for information access. Today,...
Vertical search engines and web portals are gaining ground over the general-purpose engines due to their limited size and their high precision for the domain they cover. The number of vertical portals has rapidly increased over the last...
The set of information spaces collectively referred to as the Internet poses serious problems for information retrieval tasks. Content evolution of Internet spaces and documents is reviewed and distinctive features of web documents are...
Metasearch engines are increasingly becoming a very useful tool for Web information retrieval. Their success depends mainly on their rank aggregation (fusion) method, their interface and their total “sustainability”, meaning...
With the advent of the Internet, a new era of digital information exchange has begun. Currently, the Internet encompasses more than five billion online sites, and this number is increasing exponentially every day. Fundamentally,...
Content-Based Image Retrieval (CBIR) locates, retrieves and displays images similar to one given as a query, using a set of features. It demands accessible data in medical archives and from medical equipment, to infer meaning after some...
The information era has brought with it the well-known problem of 'Information Explosion'. There are many and varied search engines on the Internet but it is still hard to locate and concentrate only on materials relevant to a specific...
Queries and click-through data taken from search engine transaction logs are an attractive alternative to traditional test collections, due to their volume and their direct relation to end-user querying. The overall aim of this paper is to...
SARE-Bi is a comprehensive management system for multilingual documents, based on metadata description schemes that come from the annotation of textual corpora (TEI), from computer-assisted translation (TMX), and from...
Mechanisms for retrieving information from the web are urgently needed as the amount of content grows every day. Many algorithms have been proposed in the literature, mainly relying on the output of the...
With the advent of cloud computing, web servers, as the major channel in cloud computing, need to be redesigned to meet performance and power constraints. Considerable efforts have been invested in distributed web servers and web...
Development of methods for Information Retrieval based on conceptual aspects is vital to reduce the quantity of unimportant documents retrieved by the search engines. In this paper, a method for expanding user queries is presented, such...
In (Computing with Words, Wiley, New York, 2001, p. 251; Soft Comput. 6 (2002) 320; Fuzzy Logic and The Internet, Physica-Verlag, Springer, Wurzburg, Berlin, 2003) we presented different fuzzy linguistic multi-agent models for helping...
Language identification is an important task for web information retrieval. This paper presents the implementation of a tool for language identification in mono- and multi-lingual documents. The tool implements four algorithms for language...
Purpose: To measure the exact size of the World Wide Web (i.e., a census). The measure used is the number of publicly accessible web servers on port 80.
Today, the Internet has become the most important source of information. People are highly accustomed to using the Internet to acquire the information they need. Often, the information seeker does not get...
The success of the semantic web depends largely on how well ontologies can be utilized and formulated. Interoperability between systems using different versions of the same ontology is essential, and this implies the need for a regulated...
We give an overview of our experience in utilizing several open source packages and composing them into sophisticated applications to solve several challenging problems as part of some of the research projects at the Knowledge Discovery &...
Ambiguous queries constitute a significant fraction of search instances and pose real challenges to web search engines. With current approaches the top results for these queries tend to be homogeneous, making it difficult for users...
Link analysis is the most important application of web structure mining and serves as a source of new knowledge in web information retrieval. Designing an effective link structure for customer interfaces is critical for the success of...
We present a dialogue system that enables natural-language access to a web information retrieval system. We use a Web Semantic Language to model the knowledge conveyed by the texts. In this way we are able to obtain the associated...
In this paper we propose a multi-agent architecture for web information retrieval using a fuzzy-logic-based result fusion mechanism. The model is designed in the JADE framework and takes advantage of the JXTA agent communication method to allow...
In an attempt to solve some contemporary web information retrieval problems, the construction of a cooperative multiagent system is proposed. This paper introduces the system and presents the use of a unique aggregation and fusion technique...
Contextual search tries to better capture a user's information need by augmenting the user's query with contextual information extracted from the search context (for example, terms from the web page the user is currently reading or a file...
In this paper we investigate the use of stylistic features of Web texts in Portuguese to classify web pages according to users' needs, in order to improve Web Information Retrieval. We first describe a seven-category classification of...
There is growing interest in accessing, relating, and combining data from multiple sources on the Web. Enormous amounts of heterogeneous information have been accumulated within corporations, government organizations, and universities. Such...
A common task in both Webmetrics and Web information retrieval is to identify a set of Web pages or sites that are similar in content. In this paper we assess the extent to which links, colinks and couplings can be used to identify...
Information retrieval on the Web is very different from retrieval in traditional indexed databases. This difference arises from: the high degree of dynamism of the Web; its hyper-linked character; the absence of a controlled indexing...