Topic Detection and Tracking Research Papers

The Linguistic Data Consortium at the University of Pennsylvania has recently been engaged in the creation of large-scale annotated corpora of broadcast news materials in support of the ongoing Topic Detection and Tracking (TDT) research... more

In this paper, two clustering algorithms called dynamic hierarchical compact and dynamic hierarchical star are presented. Both methods aim to construct a cluster hierarchy, dealing with dynamic data sets. The first creates disjoint... more

This work addresses thematic visualization in information retrieval. In a context increasingly dominated by the circulation of information, and faced with large data flows, it is necessary to synthesize... more

A crucial step in processing speech audio data for information extraction, topic detection, or browsing/playback is to segment the input into sentence and topic units. Speech segmentation is challenging, since the cues typically present... more

Topic detection and tracking (TDT) applications aim to organize the temporally ordered stories of a news stream according to the events. Two major problems in TDT are new event detection (NED) and topic tracking (TT). These problems focus... more
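A minimal sketch of the new event detection (NED) idea described in the abstract above, assuming a simple single-pass baseline: each incoming story is compared with all earlier stories by cosine similarity over raw term counts, and it is flagged as a new event when its best match falls below a threshold. The function names, the whitespace tokenizer, and the threshold of 0.2 are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two term-count vectors (Counters).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_new_events(stories, threshold=0.2):
    # Process stories in temporal order; return one boolean per story,
    # True when no earlier story is at least `threshold` similar.
    seen = []
    flags = []
    for text in stories:
        vec = Counter(tokenize(text))
        best = max((cosine(vec, old) for old in seen), default=0.0)
        flags.append(best < threshold)
        seen.append(vec)
    return flags
```

Topic tracking can reuse the same similarity function: instead of comparing against every earlier story, a story would be compared against a small set of seed stories for a known topic. Comparing against all earlier stories makes each decision slower as the stream grows, which is the main limitation of this baseline.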

Web mining is the application of data mining techniques to discover patterns from the Web. Topic tracking is one of the technologies that have been developed and can be used in the text mining process. The main purpose of topic tracking... more

- by IJEETE Journals
- 2
  Text Mining, Topic Detection and Tracking

A methodology for automatically identifying and clustering semantic features or topics in a heterogeneous text collection is presented. Textual data is encoded using a low rank nonnegative matrix factorization algorithm to retain natural... more
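To illustrate the kind of low-rank nonnegative matrix factorization mentioned in the abstract above, here is a small sketch that factors a term-by-document matrix V into nonnegative factors W (terms by topics) and H (topics by documents). It uses the well-known multiplicative update rules for the squared-error objective, which is an assumption made for illustration; the abstract does not say which NMF algorithm the authors use, and all function names here are hypothetical.

```python
import random


def matmul(A, B):
    # Plain-Python matrix product of two lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    # Squared Frobenius norm of V - WH.
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))


def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    # Multiplicative updates keep W and H nonnegative at every step.
    rng = random.Random(seed)
    n_terms, n_docs = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n_terms)]
    H = [[rng.random() + eps for _ in range(n_docs)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H
```

Each column of H can then be read as a document's weights over the learned topics, so assigning every document to its largest entry gives a simple clustering. For real collections a numerical library would be a more practical choice than nested lists.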

Abstract: In recent years, information visualization and data mining technologies have become key tools for analyzing large digital repositories of scientific documentation. One example of these... more

This paper presents a keyword extraction technique that can be used for tracking topics over time. In our work, keywords are a set of significant words in an article that give a high-level description of its contents to readers.... more

- by Sungjick Lee
- 12
  Information Retrieval, Data Mining, Text Mining, Search Engines

Near-duplicate keyframes (NDK) play a unique role in large-scale video search, news topic detection and tracking. In this paper, we propose a novel NDK retrieval approach by exploring both visual and textual cues from the visual... more

- by NDK 13
- 7
  Video Search, Image Retrieval, Nearest Neighbor, Performance Improvement

How to cope with information overload is becoming an increasingly important problem even for scientists. Search engines such as Scholar, CiteSeer, SmealSearch, Google, MSN and Yahoo try to solve this problem by indexing (variable size)... more

This study presents SocialStories, a system based on incremental clustering of streaming tweets that identifies fine-grained stories within a broader trending topic on Twitter. The contributions include a novel tf-metric, called the... more

In this paper, we propose a topic tracking and visualization method using Independent Topic Analysis. Independent Topic Analysis is a method for extracting mutually independent topics from document data by using the Independent Component... more

In recent years, the rapid growth of the Internet has changed the way people interact globally. Internet usage is quite diverse; one use is as a medium for collecting user-generated content, including online reviews. Public sentiment... more

This paper presents the TNO tracking system which was evaluated at the 2000 Topic Detection and Tracking evaluation project (TDT2000). The objective of the TDT tracking task is to track events of interest over time. We built a baseline... more

- by Chirag Shah
- Topic Detection and Tracking

In this paper, we carry out a study about the main themes treated by the International Journal of Information Technology & Decision Making during its first 10 years (2002-2011). The themes are detected, quantified and visualized using an... more

A microblog is a social network service that aggregates messages for exploring new knowledge. Nowadays, more and more users share what they have found and what they think by posting short messages. This phenomenon makes people... more

- by Lucas Wei
- 6
  Social Networks, Data Mining, Text Mining, Social Media

In this work, we present a new semantic language modeling approach to model news stories in the Topic Detection and Tracking (TDT) task. In the new approach, we build a unigram language model for each semantic class in a news story. We... more

The rapid proliferation of the World Wide Web has led to an enormous increase in the availability of textual corpora. In this paper, the problem of topic detection and tracking is considered with application to news items. The proposed approach... more

Web content clustering is a very important part of the topic detection and tracking problem. In our paper we focus on the pre-processing phase of web content clustering, specifically for blog articles published in the Slovak language. We evaluate the impact... more

- by Pavol Navrat
- 6
  Natural Language Processing, Data Mining, Text Mining, Categorization

The Center for Intelligent Information Retrieval at UMass Amherst submitted runs for all four tasks, namely, Hierarchical Topic Detection, Topic Tracking, New Event Detection and Link Detection. In this paper, we describe our models,... more

- by Chirag Shah
- Topic Detection and Tracking

This paper introduces Topic Tracking for the Punjabi language. Text mining is a field that automatically extracts previously unknown and useful information from unstructured textual data. It has strong connections with natural language... more

Information Retrieval (IR) aims at modelling, designing and implementing systems able to provide fast and effective content-based access to a large amount of information. Information can be of any kind: textual, visual, or auditory. The... more

We describe a new probabilistic Sentence Tree Language Modeling approach that captures term dependency patterns in Topic Detection and Tracking's (TDT) Story Link Detection task. New features of the approach include modeling the... more

Story clustering is a critical step for news retrieval, topic mining, and summarization. Nonetheless, the task remains highly challenging owing to the fact that news topics exhibit clusters of varying densities, shapes, and sizes.... more

- by Xiao Wu
- 5
  Engineering, Data Mining, Topic Detection and Tracking, Visual Cues

Topic Detection and Tracking (TDT) tasks are evaluated using a cost function. The standard TDT cost function assumes a constant probability of relevance P(rel) across all topics. In practice, P(rel) varies widely across topics. We argue... more

- by AO Feng
- 2
  Topic Detection and Tracking, Cost Function
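For readers unfamiliar with the cost function mentioned in the abstract above, the sketch below computes the detection cost in the form commonly used in TDT evaluations: a weighted sum of the miss rate and the false alarm rate, normalized by the cost of the better of the two trivial systems (always answer "yes" or always answer "no"). The default constants (C_miss = 1.0, C_fa = 0.1, P_target = 0.02) are the values usually quoted for those evaluations and should be treated as assumptions here. Reading the abstract's constant P(rel) as the `p_target` prior is likewise an interpretation, not something the truncated abstract states.

```python
def detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    # Expected cost: misses weighted by the prior of an on-topic story,
    # false alarms weighted by the prior of an off-topic story.
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


def normalized_detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    # Divide by the cost of the cheaper trivial system, so that a score
    # of 1.0 means "no better than always saying yes or always saying no".
    raw = detection_cost(p_miss, p_fa, c_miss, c_fa, p_target)
    baseline = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return raw / baseline
```

Because `p_target` is a single fixed prior, the same weighting is applied to every topic regardless of how many on-topic stories it actually has, which is the assumption the abstract questions.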

We present an algorithm that allows for indexing music by topic. The application scenario is an information retrieval system into which any song with known lyrics can be inserted and indexed so as to make a music collection browseable by... more

Topic detection and tracking and topic segmentation play an important role in capturing the local and sequential information of documents. Previous work in this area usually focuses on single documents, although similar multiple documents... more

In this paper we introduce a probabilistic framework to exploit hierarchy, structure sharing and duration information for topic transition detection in videos. Our probabilistic detection framework is a combination of a shot... more

Tracking topics on social media streams is non-trivial as the number of topics mentioned grows without bound. This complexity is compounded when we want to track such topics against other fast moving streams. We go beyond traditional... more

Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework called joint... more

- by Richard Everson
- 11
  Data Mining, Sentiment Analysis, Motion Pictures, Opinion Mining

The technologies for single- and multi-document summarization that are described and evaluated in this article can be used on heterogeneous texts for different summarization tasks. They refer to the extraction of important sentences from... more

Over the last few decades, social network platforms such as Twitter and other microblogging systems, which contain huge amounts of timely generated data, have come into widespread use. Twitter is the fastest means of information sharing where user... more

This paper presents several methods for topic detection on newspaper articles, using either a general vocabulary or topic-specific vocabularies. Specific vocabularies are determined manually or statistically. In both cases, we aim at... more

Continuing progress in the automatic transcription of broadcast speech via speech recognition has raised the possibility of applying information retrieval techniques to the resulting (errorful) text. In this paper we describe a general... more

Information retrieval (IR) research has reached a point where it is appropriate to assess progress and to define a research agenda for the next five to ten years. This report summarizes a discussion of IR research challenges that took... more

Information retrieval is moving beyond the stage where users simply type one or more keywords and retrieve a ranked list of documents. In such a scenario users have to go through the returned documents in order to find what they are... more

Topics in situated and task-oriented communication depend heavily on the given, often changing environment, which in many cases makes the detection of predetermined topics useless. Detection of non-predefined topics can enhance... more

Twitter is a user-generated content system that allows its users to share short text messages, called tweets, for a variety of purposes, including daily conversations, URL sharing and news. Considering its world-wide... more

- by Luigi Di Caro
- 5
  User Generated Content, Text Analysis, Life Cycle, Real Time

The TDT-3 Text and Speech Corpus expands on previous phases of Topic Detection and Tracking data collections, by increasing the number of news sources being sampled, by including Mandarin Chinese as well as English news data, and by... more

The LDC began its first Broadcast News (BN) speech collection in the spring of 1996, facing a host of challenges including IPR negotiations with broadcasters, establishment of new transcription conventions and tools, and a compressed... more

First Story Detection is hard because the most accurate systems become progressively slower with each document processed. We present a novel approach to FSD, which operates in constant time/space and scales to very high volume streams. We... more

In this article we present a sub-sentential translation memory that is sensitive to the translation domain, a first step toward integrating context. The system is able to reuse translations already "seen" by the... more

As part of MITRE's work under the DARPA TIDES (Translingual Information Detection, Extraction and Summarization) program, we are preparing a series of demonstrations to showcase the TIDES Integrated Feasibility Experiment on Bio-Security... more

We introduce the relative rank differential statistic which is a non-parametric approach to document and dialog analysis based on word frequency rank-statistics. We also present a simple method to establish semantic saliency in dialog,... more
