      Learning, Document Classification, Chi Square Test
Interest in the area of pattern recognition has been renewed recently due to emerging applications which are not only challenging but also computationally more demanding. These applications include data mining (identifying a "pattern",... more
      Computer Science, Data Mining, Pattern Recognition, Neural Network
Legal texts play an essential role in any organisation, public or private, where each actor must be aware of, and comply with, regulations. However, because of the difficulties of the legal domain, the actors prefer to rely on the... more
      Annotation, Arabic Natural Language Processing, Document Classification, Legal text
In recent years, XML has been established as a major means for information management, and has been broadly utilized for complex data representation (e.g. multimedia objects). Owing to the unparalleled growth in use of the XML standard,... more
      Computer Science, Information Retrieval, Information Management, Data Warehousing
To meet the growing qualitative and quantitative demands for information from the WWW, efficient automatic Web page classifiers are urgently needed. However, a classifier applied to the WWW faces a huge-scale dimensionality problem since... more
      Decision Making, Feature Selection, Mathematical Sciences, Document Classification
The web contains a wealth of product reviews, but sifting through them is a daunting task. Ideally, an opinion mining tool would process a set of search results for a given item, generating a list of product attributes (quality, features,... more
      Information Retrieval, Machine Learning, Web search, Feature Extraction
The widespread use of information technologies for construction is considerably increasing the number of electronic text documents stored in construction management information systems. Consequently, automated methods for organizing and... more
      Information Systems, Engineering, Information Technology, Information Management
Document image classification is an important step in Office Automation, Digital Libraries, and other document image analysis applications. There is great diversity in document image classifiers: they differ in the problems they solve, in... more
      Artificial Intelligence, Image Processing, Fuzzy Logic, Modeling
Frequent itemset mining (FIM), which has been extensively studied over the last decades, is a core operation for several data mining applications, such as association rule computation, correlations, document classification, and many others.... more
      Data Mining, Frequent Itemset Mining, Shared memory, Document Classification
With the increasing availability of electronic documents and the rapid growth of the World Wide Web, the task of automatic categorization of documents has become the key method for organizing information and for knowledge discovery. Proper... more
      Information Systems, Information Retrieval, Information Technology, Machine Learning
"This manual arose from the need for standardization and for more detailed normalization instructions for the entry of indexing terms. It aims for information to be retrieved in a uniform and appropriate manner in information systems... more
      Information Science, Archival Studies, Information Literacy, Information Society
TWLT is an acronym of Twente Workshop(s) on Language Technology. These workshops on natural language theory and technology are organised by the Parlevink Project, a language theory and technology project of the . For each workshop... more
      Natural Language Generation, Machine Translation, Information Extraction, Cross Language Information Retrieval
Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child's? If this were then subjected to an appropriate course of education one would obtain the adult brain. A mis... more
      Text Mining, Linked Data, Knowledge Representation, Bayesian Networks
Background: Open-source clinical natural-language-processing (NLP) systems have lowered the barrier to the development of effective clinical document classification systems. Clinical natural-language-processing systems annotate the syntax... more
      Engineering, Natural Language Processing, Machine Learning, Data Mining
Integrating Different Strategies for Cross-Language Information Retrieval in the MIETTA Project. Paul Buitelaar, Klaus Netter, Feiyu Xu. DFKI Language Technology Lab, Stuhlsatzenhausweg 3, 66123 Saarbrücken, Germany. {paulb, netter, feiyu}@... more
      Natural Language Generation, Machine Translation, Information Extraction, Cross Language Information Retrieval
Automatic classification has become an important research area due to the rapid increase of digital information. Manual classification of documents is difficult because of the vocabulary ambiguities of classification... more
      Automatic Classification (Machine Learning), Semi-Automatic Classification, Ontology Based Semantic Information Retrieval, Document Classification
      Data Analysis, Computer Networks, Principal Component Analysis, Pattern Recognition
The use of ontologies to enable machine reasoning has increased steadily over the last few years. This paper suggests an automated method for document classification using an ontology, which expresses... more
      Ontology, Machine Learning, Data Mining, Document Classification
An increasing and overwhelming amount of biomedical information is available in the research literature, mainly in the form of free text. Biologists need tools that automate their information search and deal with the high volume and... more
      Information Retrieval, Information Processing, Subject headings, Document Classification
Automatic document classification, owing to its various applications in data mining and information technology, is one of the important topics in computer science. Classification plays a vital role in many information management and retrieval... more
      Information Retrieval, Information Technology, Natural Language Processing, Information Management
      Machine Learning, Text Mining, Modular Systems (Architecture), Feature Selection
A method of document comparison based on a hierarchical dictionary of topics (concepts) is described. The hierarchical links in the dictionary are supplied with the weights that are used for detecting the main topics of a document and for... more
      Information Retrieval, Natural Language Processing, Statistical Analysis, Hierarchy
It is well known that links are an important source of information when dealing with Web collections. However, the question remains whether the same techniques that are used on the Web can be applied to collections of documents... more
      Computer Science, Information Retrieval, Digital Library, Text Classification
The automatic classification of legal case documents has become very important, owing to the justice denials, delays and failures observed in the judicial case management systems. Our hybrid text classification model employed extensive... more
      Machine Learning, Text Mining, Support Vector Machines, Statistical machine learning
      Text Classification, Thesaurus, Document Classification, Thesauri
Automated document classification is a fundamental machine learning task that refers to automatically assigning categories to scanned images of documents. It has reached a state-of-the-art stage, but it still needs to verify the performance and... more
      Algorithms, Artificial Intelligence, Natural Language Processing, Machine Learning
The combination of multiple features or views when representing documents or other kinds of objects usually leads to improved results in classification (and retrieval) tasks. Most systems assume that those views will be available both at... more
      Computer Science, Visualization, Principal Component Analysis, Image Classification
Pattern classification has been successfully applied in many problem domains, such as biometric recognition, document classification or medical diagnosis. Missing or unknown data are a common drawback that pattern recognition techniques... more
      Cognitive Science, Machine Learning, Pattern Recognition, Statistical Learning Theory
The amount of narrative clinical text documents stored in Electronic Patient Records (EPR) of Hospital Information Systems is increasing. Physicians spend a lot of time finding relevant patient-related information for medical decision... more
      Information Retrieval, Data Mining, Data Analysis, Medical Decision Making
With the increased use of the Internet, a large number of consumers first consult online resources for their healthcare decisions. The problem of the existing information structure primarily lies in the fact that the vocabulary used in... more
      Text Classification, Thesaurus, Document Classification, Thesauri
In this work, we jointly apply several text mining methods to a corpus of legal documents in order to compare the separation quality of two inherently different document classification schemes. The classification schemes are compared with... more
      Active Learning, Principal Component Analysis, Text Mining, Document Classification
The amount of narrative clinical text documents stored in Electronic Patient Records (EPR) of Hospital Information Systems is increasing. Physicians spend a lot of time finding relevant patient-related information for medical decision... more
      Information Retrieval, Natural Language Processing, Data Mining, Documentation
The goal of the reported research is the development of a computational approach that could help a cognitive scientist to interactively represent a learner's mental models, and to automatically validate their coherence with respect to the... more
      Cognitive Science, Machine Learning, Causal reasoning, Knowledge Representation
Classifier for the Internet Resource Discovery (ACIRD), which uses machine learning techniques to organize and retrieve Internet documents. ACIRD consists of a knowledge acquisition process, document classifier and two-phase search... more
      Information Retrieval, Machine Learning, Data Mining, Search Engines
0-7803-7868-7/03/$17.00 © 2003 IEEE.
      Data Mining, Image Analysis, Writing, Image Classification
Quantifying the concept of co-occurrence and iterated co-occurrence yields indices of similarity between words or between documents. These similarities are associated with a reversible Markov transition matrix, the formal properties of... more
      Cognitive Science, Linguistics, Correspondence Analysis, Quantitative linguistics
We propose a simple Bayesian network-based text classifier, which may be considered as a discriminative counterpart of the generative multinomial naive Bayes classifier. The method relies on the use of a fixed network topology with the... more
      Bayesian Networks, Probabilistic Graphical Models, Text Classification, Document Classification
We propose a method which, given a document to be classified, automatically generates an ordered set of appropriate descriptors extracted from a thesaurus. The method creates a Bayesian network to model the thesaurus and uses... more
      Supervised Learning Techniques, Bayesian Networks, Probabilistic Graphical Models, Text Classification
This paper uses Systemic Functional Linguistic (SFL) theory as a basis for extracting semantic features of documents. We focus on the pronominal and determination system and the role it plays in constructing interpersonal distance. By... more
      Discourse Analysis, Psychology, Geography, Computer Science
Email has become an important means of electronic communication, but the viability of its usage is marred by Unsolicited Bulk Email (UBE) messages. UBE poses technical and socio-economic challenges to the usage of email. Besides, the... more
      Machine Learning, Classification (Machine Learning), Clustering and Classification Methods, Applications of Machine Learning
A new algorithm based on the learning vector quantisation classifier is presented. It uses a modified proximity measure, which enforces a predetermined correct classification level in training, while using a sliding-mode approach for stable... more
      Machine Learning, Support Vector Machines, Neural Networks, Text Classification
      Information Systems, End User License Agreement, Meta-model, Document Classification
ABSTRACT: Improvements in hardware, communication technology and databases have led to the explosion of multimedia information repositories. In order to ensure the quality of information retrieval and the quality of services, it is... more
      Information Retrieval, Key words, Document Classification
Feature selection is of paramount concern in the document classification process, as it improves the efficiency and accuracy of the text classifier. The Vector Space Model is used to represent the "Bag of Words" (BoW) of the documents with term weighting... more
      Ontology, Machine Learning, Data Mining, Feature Selection
In this paper we propose a matching algorithm for measuring the structural similarity between an XML document and a DTD. The matching algorithm, by comparing the document structure against the one the DTD requires, is able to identify... more
      Information Systems, Document Classification, Document Structure, Structural Similarity Index
We present a novel approach for classifying documents that combines different pieces of evidence (e.g., textual features of documents, links, and citations) transparently, through a data mining technique which generates rules associating... more
      Data Mining, Digital Library, Classification, Quality Criteria
This paper reports the results of an experiment in which an attempt is made to determine whether word length and sentence length can be considered as the two indispensable parameters in the identification of Bangla medical text documents,... more
      Text Categorization, Document Classification
In this paper we present the Dual Support Apriori for Temporal data (DSAT) algorithm. This is a novel technique for discovering Jumping Emerging Patterns (JEPs) from time series data using a sliding window technique. Our approach is... more
      Time Series, Temporal Data Mining, Text Classification, Graph Mining