The Linguistic Data Consortium at the University of Pennsylvania has recently been engaged in the creation of large-scale annotated corpora of broadcast news materials in support of the ongoing Topic Detection and Tracking (TDT) research...
      Computer Science; Quality Control; Quality Assurance; Mandarin Chinese
In this paper, two clustering algorithms called dynamic hierarchical compact and dynamic hierarchical star are presented. Both methods aim to construct a cluster hierarchy, dealing with dynamic data sets. The first creates disjoint...
      Cognitive Science; Information Organization; Document Clustering; Topic Detection and Tracking
This work addresses thematic visualization in information retrieval. As information circulates ever more pervasively and data flows grow large, it becomes necessary to synthesize...
      Information Retrieval; Named Entity Recognition; Data Visualisation; Named Entity Extraction
A crucial step in processing speech audio data for information extraction, topic detection, or browsing/playback is to segment the input into sentence and topic units. Speech segmentation is challenging, since the cues typically present...
      Cognitive Science; Information Extraction; Automatic Speech Recognition; Linguistics
Topic detection and tracking (TDT) applications aim to organize the temporally ordered stories of a news stream according to the events they describe. Two major problems in TDT are new event detection (NED) and topic tracking (TT). These problems focus...
      Information Systems; Information Retrieval; Library and Information Studies; The
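A minimal sketch of new event detection as named in the entry above, assuming a single-pass scheme over bag-of-words vectors with cosine similarity: a story is flagged as a new event when it is not sufficiently similar to any earlier story. The function names, the whitespace tokenization, and the 0.3 threshold are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed representation: lowercase, whitespace split)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_new_events(stories, threshold=0.3):
    """Process stories in temporal order; flag a story as a new event when its
    best similarity to every earlier story falls below `threshold`
    (a hypothetical value, normally tuned on held-out data)."""
    seen = []
    flags = []
    for text in stories:
        vec = vectorize(text)
        best = max((cosine(vec, old) for old in seen), default=0.0)
        flags.append(best < threshold)
        seen.append(vec)
    return flags


stories = [
    "Earthquake hits coastal city",
    "Strong earthquake hits coastal city overnight",
    "Central bank raises interest rates",
]
print(detect_new_events(stories))  # [True, False, True]
```

Topic tracking can be sketched the same way by comparing each incoming story against the seed stories of a known topic instead of against all earlier stories.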
Web mining is the application of data mining techniques to discover patterns from the Web. Topic tracking is one of the technologies that have been developed and can be used in the text mining process. The main purpose of topic tracking...
      Text Mining; Topic Detection and Tracking
A methodology for automatically identifying and clustering semantic features or topics in a heterogeneous text collection is presented. Textual data is encoded using a low rank nonnegative matrix factorization algorithm to retain natural...
      Information Systems; Principal Component Analysis; Text Mining; Information Processing
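A rough sketch of low-rank nonnegative matrix factorization applied to a term-by-document matrix, as mentioned in the entry above. The abstract does not say which NMF algorithm is used, so this assumes the widely known multiplicative update rules for the squared-error objective; the toy matrix, the function names, and the iteration count are all illustrative.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a nonnegative terms-by-documents matrix V into W (terms x rank)
    and H (rank x documents) using multiplicative updates (an assumed choice)."""
    rng = random.Random(seed)
    n_terms, n_docs = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_terms)]
    h = [[rng.random() + eps for _ in range(n_docs)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_docs)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_terms)]
    return w, h


def dominant_topics(h):
    """Assign each document to the factor with the largest weight in its column of H."""
    return [max(range(len(h)), key=lambda k: h[k][j]) for j in range(len(h[0]))]


# Toy term-by-document counts: rows are terms, columns are documents.
term_doc = [
    [2, 3, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 2, 2],
]
w, h = nmf(term_doc, rank=2)
print(dominant_topics(h))
```

Because both factors stay nonnegative, each column of W can be read as a weighted set of terms and each column of H as a document's mixture over those term sets, which is what makes the factors usable as topics.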
Abstract: In recent years, information visualization and data mining technologies have established themselves as key tools for analyzing large digital repositories of scientific documentation. One example of these...
      Topic Detection and Tracking
      Information Retrieval; Natural Language Processing; Data Mining; Frequent Itemset Mining
This paper presents a keyword extraction technique that can be used for tracking topics over time. In our work, keywords are a set of significant words in an article that give readers a high-level description of its contents....
      Information Retrieval; Data Mining; Text Mining; Search Engines
Near-duplicate keyframes (NDK) play a unique role in large-scale video search, news topic detection and tracking. In this paper, we propose a novel NDK retrieval approach by exploring both visual and textual cues from the visual...
      Video Search; Image Retrieval; Nearest Neighbor; Performance Improvement
How to cope with information overload is becoming an increasingly important problem even for scientists. Search engines such as Scholar, CiteSeer, SmealSearch, Google, MSN and Yahoo try to solve this problem by indexing (variable size)...
      Natural Language Processing; Topic Detection and Tracking
This study presents SocialStories, a system based on incremental clustering of streaming tweets that identifies fine-grained stories within a broader trending topic on Twitter. The contributions include a novel tf-metric, called the...
      Automatic Text Summarization; Twitter; Topic Detection and Tracking; Social Media Analytics
In this paper, we propose a topic tracking and visualization method using Independent Topic Analysis. Independent Topic Analysis is a method for extracting mutually independent topics from document data by using Independent Component...
      Data Mining; Text Mining; Topic Detection and Tracking
In recent years, the rapid growth of the Internet has changed the way people interact globally. Internet usage is quite diverse; one use is as a medium for collecting user-generated content, including online reviews. Public sentiment...
      Topic Detection and Tracking; Naive Bayes; Sentiment Analysis and Opinion Mining
This paper presents the TNO tracking system, which was evaluated at the 2000 Topic Detection and Tracking evaluation project (TDT2000). The objective of the TDT tracking task is to track events of interest over time. We built a baseline...
      Adaptive Filter; Topic Detection and Tracking; Language Model; Tracking system
      Topic Detection and Tracking
In this paper, we carry out a study about the main themes treated by the International Journal of Information Technology & Decision Making during its first 10 years (2002-2011). The themes are detected, quantified and visualized using an...
      Information Technology; Decision Making; Performance Analysis; Business and Management
Microblogging is a social network service that aggregates messages, which can be explored for new knowledge. More and more users share what they find and think by posting short messages. This phenomenon makes people...
      Social Networks; Data Mining; Text Mining; Social Media
In this work, we present a new semantic language modeling approach to model news stories in the Topic Detection and Tracking (TDT) task. In the new approach, we build a unigram language model for each semantic class in a news story. We...
      Topic Detection and Tracking; Language Model; log likelihood ratio; learning algorithm
The rapid proliferation of the World Wide Web has led to an enormous increase in the availability of textual corpora. In this paper, the problem of topic detection and tracking is considered with application to news items. The proposed approach...
      Text Mining; Topic Models; Topic Detection and Tracking
Web content clustering is a very important part of the topic detection and tracking problem. In this paper we focus on the pre-processing phase of web content clustering, specifically for blog articles published in the Slovak language. We evaluate the impact...
      Natural Language Processing; Data Mining; Text Mining; Categorization
The Center for Intelligent Information Retrieval at UMass Amherst submitted runs for all four tasks, namely, Hierarchical Topic Detection, Topic Tracking, New Event Detection and Link Detection. In this paper, we describe our models,...
      Topic Detection and Tracking
This paper introduces Topic Tracking for the Punjabi language. Text mining is a field that automatically extracts previously unknown and useful information from unstructured textual data. It has strong connections with natural language...
      Text Mining; NLP; Keyword Extraction; Topic Detection and Tracking
The Linguistic Data Consortium at the University of Pennsylvania has recently been engaged in the creation of large-scale annotated corpora of broadcast news materials in support of the ongoing Topic Detection and Tracking (TDT) research...
      Quality Control; Quality Assurance; Mandarin Chinese; Topic Detection and Tracking
Information Retrieval (IR) aims at modelling, designing and implementing systems able to provide fast and effective content-based access to a large amount of information. Information can be of any kind: textual, visual, or auditory. The...
      Information Retrieval; Information Filtering; User Interface; Data Visualisation
We describe a new probabilistic Sentence Tree Language Modeling approach that captures term dependency patterns in the Story Link Detection task of Topic Detection and Tracking (TDT). New features of the approach include modeling the...
      Computer Science; Tracking; Efficient Algorithm for ECG Coding; Topic Detection and Tracking
Story clustering is a critical step for news retrieval, topic mining, and summarization. Nonetheless, the task remains highly challenging owing to the fact that news topics exhibit clusters of varying densities, shapes, and sizes....
      Engineering; Data Mining; Topic Detection and Tracking; Visual Cues
Topic Detection and Tracking (TDT) tasks are evaluated using a cost function. The standard TDT cost function assumes a constant probability of relevance P(rel) across all topics. In practice, P(rel) varies widely across topics. We argue...
      Topic Detection and Tracking; Cost Function
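The cost function referred to above is usually written as a weighted sum of miss and false-alarm rates, C_det = C_miss * P_miss * P_target + C_fa * P_fa * (1 - P_target), where P_target plays the role of the constant P(rel). The sketch below assumes that form; the default constants (C_miss = 1.0, C_fa = 0.1, P_target = 0.02) are commonly cited TDT evaluation settings and should be treated as assumptions here, since the abstract does not state them.

```python
def detection_cost(p_miss, p_fa, p_target=0.02, c_miss=1.0, c_fa=0.1):
    """Weighted detection cost for given miss and false-alarm rates.
    p_target is the assumed prior probability that a story is on topic."""
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


def normalized_detection_cost(p_miss, p_fa, p_target=0.02, c_miss=1.0, c_fa=0.1):
    """Cost divided by the cost of the better trivial system
    (always answer 'no' or always answer 'yes')."""
    baseline = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return detection_cost(p_miss, p_fa, p_target, c_miss, c_fa) / baseline


# The same system (10% misses, 1% false alarms) scored under two priors.
for prior in (0.02, 0.2):
    print(prior, detection_cost(0.10, 0.01, p_target=prior))
```

Scoring identical error rates under different priors shows why a single fixed P(rel) matters: the raw cost changes with the prior alone, so a topic whose true relevance rate differs from the assumed constant is effectively scored against the wrong weighting.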
We present an algorithm that allows for indexing music by topic. The application scenario is an information retrieval system into which any song with known lyrics can be inserted and indexed so as to make a music collection browseable by...
      Text Mining; Vector Space Model; Topic Detection and Tracking; Non-negative matrix factorization
Topic detection and tracking and topic segmentation play an important role in capturing the local and sequential information of documents. Previous work in this area usually focuses on single documents, although similar multiple documents...
      Mutual Information; Topic Detection and Tracking; Perforation
In this paper we introduce a probabilistic framework to exploit hierarchy, structure sharing and duration information for topic transition detection in videos. Our probabilistic detection framework is a combination of a shot...
      Topic Detection and Tracking; Education and Training; Hierarchical Hidden Markov Model; Sample Complexity
Tracking topics on social media streams is non-trivial as the number of topics mentioned grows without bound. This complexity is compounded when we want to track such topics against other fast-moving streams. We go beyond traditional...
      Information Retrieval; Data Stream Mining; Information Retrieval, Topic Detection and Tracking; Topic Detection and Tracking
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework called joint...
      Data Mining; Sentiment Analysis; Motion Pictures; Opinion Mining
The technologies for single- and multi-document summarization that are described and evaluated in this article can be used on heterogeneous texts for different summarization tasks. They refer to the extraction of important sentences from...
      Information Systems; Multi-Document Summarization; Information Processing; Library and Information Studies
Over the last few decades, social network platforms such as Twitter and other microblogging systems, which contain huge amounts of timely data, have come into widespread use. Twitter is the fastest means of information sharing where user...
      Topic Detection and Tracking
This paper presents several methods for topic detection on newspaper articles, using either a general vocabulary or topic-specific vocabularies. Specific vocabularies are determined manually or statistically. In both cases, we aim at...
      Topic Detection and Tracking; Language Model
Continuing progress in the automatic transcription of broadcast speech via speech recognition has raised the possibility of applying information retrieval techniques to the resulting (errorful) text. In this paper we describe a general...
      Information Retrieval; Speech Recognition; hidden Markov model; Text Segmentation
Information retrieval (IR) research has reached a point where it is appropriate to assess progress and to define a research agenda for the next five to ten years. This report summarizes a discussion of IR research challenges that took...
      Information Retrieval; Web search; Information Extraction; Research Agenda
Information retrieval is moving beyond the stage where users simply type one or more keywords and retrieve a ranked list of documents. In such a scenario, users have to go through the returned documents in order to find what they are...
      Information Retrieval; Semantics; Multimedia; Topic Detection and Tracking
Topics in situated and task-oriented communication depend heavily on the given, often changing environment, which in many cases makes the detection of predetermined topics useless. Detection of non-predefined topics can enhance...
      Latent Semantic Analysis; Human Robot Interaction; Mobile Robot; Topic Detection and Tracking
Twitter is a user-generated content system that allows its users to share short text messages, called tweets, for a variety of purposes, including daily conversations, URL sharing and news. Considering its world-wide...
      User Generated Content; Text Analysis; Life Cycle; Real Time
The TDT-3 Text and Speech Corpus expands on previous phases of Topic Detection and Tracking data collections by increasing the number of news sources being sampled, by including Mandarin Chinese as well as English news data, and by...
      Data Collection; Topic Detection and Tracking; Boolean Satisfiability
The LDC began its first Broadcast News (BN) speech collection in the spring of 1996, facing a host of challenges including IPR negotiations with broadcasters, establishment of new transcription conventions and tools, and a compressed...
      Cognitive Science; Linguistics; Speech Communication; Level Of Detail (LOD)
First Story Detection (FSD) is hard because the most accurate systems become progressively slower with each document processed. We present a novel approach to FSD, which operates in constant time/space and scales to very high volume streams. We...
      Information Retrieval; Data Stream Mining; Information Retrieval, Topic Detection and Tracking; Topic Detection and Tracking
In this article we present a sub-sentential translation memory that is sensitive to the translation domain, a first step toward integrating context. This system is able to recycle translations already "seen" by the...
      Machine Translation; Word alignment; Translation Memory; Topic Detection and Tracking
As part of MITRE's work under the DARPA TIDES (Translingual Information Detection, Extraction and Summarization) program, we are preparing a series of demonstrations to showcase the TIDES Integrated Feasibility Experiment on Bio-Security...
      Information Systems; Information Retrieval; Computational Linguistics; Metadata
We introduce the relative rank differential statistic, a non-parametric approach to document and dialog analysis based on word frequency rank-statistics. We also present a simple method to establish semantic saliency in dialog,...
      Word Frequency; Topic Detection and Tracking; Automatic code generation; Term Frequency