In this paper, a new approach based on multiple instance learning is proposed to predict students' performance and to improve on the results obtained using classical single instance learning. Multiple instance learning provides a more...
      Computer Science, Artificial Intelligence, Machine Learning, Multiple Instance Learning
This article examined the relationships among race-related stress, quality of life indicators, and life satisfaction among elderly African Americans. A sample of 127 elderly African Americans, consisting of 87 women and 26 men (and 14...
      Mental Health, Quality of life, Cultural Diversity, Life Satisfaction
Neutropenia and its complications, including febrile neutropenia, are major dose-limiting toxicities of systemic cancer chemotherapy. A number of studies have attempted to identify risk factors for neutropenia and its consequences to...
      Risk assessment, Multivariate Analysis, Blood Pressure, Cancer Chemotherapy
A. Abraham et al. (Eds.), Soft Computing Systems: Design, Management and Applications, IOS Press, 2002, p. 251. A Study of K-Nearest Neighbour as an Imputation Method. Gustavo E. A. P. A. Batista and Maria Carolina Monard, University of Sao Paulo-USP...
      Machine Learning, Hybrid Intelligent Systems, His, Missing Data
The objective of this research is to implement a method for estimating the real missing data in heart disease datasets and to show how it affects the resulting knowledge. Missing data is a common problem in knowledge discovery from database...
      Comparative Study, Neural Network, Missing Data, Heart Disease
Information related to land surface phenology is important for a variety of applications. For example, phenology is widely used as a diagnostic of ecosystem response to global change. In addition, phenology influences seasonal scale...
      Remote Sensing, Atmospheric Aerosols, Temporal dynamics, Global change
      Psychology, Missing Data, Multiple Imputation, Missing Values
      Statistics, Machine Learning, Biometrics, Public Health
Cloud cover usually prevents the generation of land surface temperature (LST) time series over large areas from thermal infrared (TIR) data sensed by satellites. If cloud cover is brief (< 4 h) and only partial on the scale of a METEOSAT...
      Time Series, A Priori Knowledge, Temporal Resolution, Global change
Modeling the activated sludge wastewater treatment plant plays an important role in improving its performance. However, there are many limitations of the available data for model identification, calibration, and verification, such as the...
      Environmental Engineering, Chemical Engineering, Civil Engineering, Cartography
Imputation of missing values is one of the major tasks for data pre-processing in many areas. Whenever imputation of data from official statistics comes to mind, several (additional) challenges almost always arise, like large data sets,...
      Econometrics, Statistics, Large Data Sets, Missing Values
Background and Objective: Epidemiologic studies commonly estimate associations between predictors (risk factors) and outcome. Most software automatically excludes subjects with missing values. This commonly causes bias because missing...
      Prediction, Clinical Epidemiology, Bias, Computer Simulation
Copyright © 2002, 2004, 2008 by StataCorp LP. All rights reserved. First edition 2002. Revised edition 2004. Second edition 2008. Published by Stata Press, 4905 Lakeway Drive, College Station, Texas 77845. Typeset in LaTeX2ε. Printed in the...
      Survival Analysis, Cox Regression, Statistical software, Multiple Imputation
Cross-sectional data from a survey of Danish firms are used to examine branding behavior in 2002 and its change between 1997 and 2002. Summary data from the survey are presented. Branding behavior is defined and relevant literature is...
      Marketing, Agribusiness, Food Industry, Cross Section
      Time Series, Multidisciplinary, Low Frequency, Exponential Smoothing
The increasing scientific and industrial interest in metabonomics takes advantage of the high qualitative and quantitative information level of nuclear magnetic resonance (NMR) spectroscopy. However, several chemical and physical...
      Engineering, Algorithms, Data Analysis, Nuclear Magnetic Resonance
An Introduction to Survival Analysis Using Stata, Third Edition is the ideal tutorial for professional data analysts who want to learn survival analysis for the first time or who are well versed in survival analysis but are not as...
      Statistics, Survival Analysis, Cox Regression, Statistical software
      Psychology, Evaluation Research, Statistical Analysis, Syntax
Classification and regression trees are prediction models constructed by recursively partitioning a data set and fitting a simple model to each partition. Their name derives from the usual practice of describing the partitioning process...
      Statistics, Machine Learning, Biometrics, Public Health
Environmental time series are often affected by the “presence” of missing data, but when dealing statistically with data, the need to fill in the gaps by estimating the missing values must be considered. At present, a large number of...
      Time Series, Environmental Monitoring, Rivers, Multidisciplinary
records with missing values. This work evaluates the performance of several statistical and machine learning imputation methods that were used to predict recurrence in patients in an extensive real breast cancer data set. Materials and...
      Engineering, Demography, Algorithms, Artificial Intelligence
Hydrognomon is a software tool for the processing of hydrological data. It is an open source application running on standard Microsoft Windows platforms, and it is part of the openmeteo.org framework. Data are imported through standard...
      Monte Carlo Simulation, Time Series, Open Source Software, Open Source
Gene expression profiling plays an important role in a broad range of areas in biology. The raw gene expression data may contain missing values. It is an important preprocessing step to accurately estimate missing values in microarray...
      Bioinformatics, Genetics, Fuzzy set theory, Cell Cycle
The software package AMDIS performs gas chromatography-mass spectrometry (GC-MS) peak deconvolution but tends to produce false positives and leaves missing values where peaks are found in only a proportion of a set of chromatograms. We...
      Analytical Chemistry, Software, Data Quality, Gas Chromatography/mass Spectrometry
Several models for longitudinal data with nonrandom missingness are available. The selection model of Diggle and Kenward is one of these models. Many authors have noted that this model depends on untested modelling...
      Statistics, Breast Cancer, Quality of life, Sensitivity Analysis
An efficient methodology for dealing with missing values and outlying observations simultaneously in principal component analysis (PCA) is proposed. The concept described in the paper consists of using a robust technique to obtain robust...
      Analytical Chemistry, Principal Component Analysis, Expectation Maximization, Simulation Study
The use of Patient-reported Outcomes (PROs) as secondary endpoints in the development of new antidepressants has grown in recent years. The objective of this study was to assess the psychometric properties of the 9-item,...
      Psychiatry, Psychometrics, Drug Discovery, Treatment Outcome
Objective: To develop and validate a clinically feasible measure of communication effectiveness for people with any type of communication problem following stroke. Design: Cross-sectional, interview-based, psychometric study, building on...
      Communication Disorders, Clinical Practice, Stroke, Clinical
There were 19 submissions from 11 countries. Each paper was rigorously reviewed by at least two international experts, most of whom were members of the program committee. The reviewing process of each paper was coordinated by the...
      Cognitive Science, Machine Learning, Supervised Learning, Missing Data
This impressive book contains formulae for computing sample size in a wide range of settings. One-sample studies and two-sample comparisons for quantitative, binary and time-to-event outcomes are covered comprehensively, with separate...
      Econometrics, Statistics, Time Series, Time series analysis
      Molecular Biology, Genomics, Data Analysis, Principal Component Analysis
Missing data are quite common in practical applications of statistical methods. Imputation is a general statistical method for the analysis of incomplete data sets. The goal of the paper is to review selected imputation techniques. Special...
      Missing Values
A web-based application has been designed from a genetic-epidemiology point of view to analyze association studies. Main capabilities include: descriptive analysis, test for Hardy-Weinberg equilibrium and linkage disequilibrium. Analysis...
      Bioinformatics, Genetic Epidemiology, Population Genetics, Logistic Regression
A new framework for removing impulse noise from images is presented in which the nature of the filtering operation is conditioned on a state variable defined as the output of a classifier that operates on the differences between the input...
      Cognitive Science, Computational Complexity, Computational Modeling, Image segmentation
In real life situations, we often encounter data sets containing missing observations. Statistical methods that address missingness have been extensively studied in recent years. One of the more popular approaches involves imputation of...
      Statistics, Applied Economics, Parameter estimation, Random Forest
This paper presents a new robust method to estimate the parameters of ARMA models. This method makes use of autocorrelation estimates based on the ratio of medians together with a robust filter cleaner able to reject a large fraction...
      Statistics, Signal Processing, Time Series, Modeling
The statistical analysis of compositional data based on logratios of parts is not suitable when zeros are present in a data set. Nevertheless, if there is interest in using this modeling approach, several strategies have been published in...
      Statistical Analysis, Mathematical Geology, Mathematical, Missing Values
We review second- and third-order multivariate calibration, based on the growing literature in this field, the variety of data being produced by modern instruments, and the proliferation of algorithms capable of dealing with higher-order...
      Analytical Chemistry, Fluorescence Spectroscopy, Kinetics, Higher Order Thinking
Incomplete data is a common drawback in many pattern classification applications. A classical way to deal with unknown values is missing data estimation. Most machine learning techniques work well with missing values, but they do not...
      Machine Learning, Neural Network, Multi-Task Learning, Missing Data
      Social Security, Job Search, Spectrum, Random sampling
Intelligent data analysis has gained increasing attention in business and industry environments. Many applications are looking not only for solutions that can automate and de-skill the data analysis process, but also for methods that can deal...
      Cognitive Science, Applied Mathematics, Fuzzy Logic, Data Analysis
The growing demand for geospatial services requires an intensified study of geoinformation from various sources covering the same geographic space. Data matching is one of the fundamental measures that help make different datasets...
      Urban And Regional Planning, Geomatic Engineering, ROAD NETWORK, Large Data Sets
Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in...
      Demography, Algorithms, Machine Learning, Ecology
Homogeneity analysis combines the idea of maximizing the correlations between variables of a multivariate data set with that of optimal scaling. In this article we present methodological and practical issues of the R package homals which...
      Correspondence Analysis, Optimal Scaling, Discriminant Analysis, Statistical software
This paper tests the sensitivity of poverty indexes to the choice of adult equivalence scales, assumptions about the existence of economies of scale in consumption, methods for treating missing and zero incomes, and different adjustments...
      Human Geography, Poverty, Sensitivity Analysis, Applied Economics
Environmental data often contain lengthy records of sequential missing values. A practical problem arose in the analysis of adverse health effects of sulphur dioxide (SO2) levels and asthma hospital admissions for Sydney, Nova Scotia, Canada....
      Environmetrics, Environmental Sciences, Mathematical Sciences, Time Series Data