Variable Selection
Recent papers in Variable Selection
This study is concerned with understanding the formation of ore deposits (precious and base metals) and contributes to the exploration and discovery of new occurrences using artificial neural networks. From the different digital data...
Recently, variable selection by penalized likelihood has attracted much research interest. In this paper, we propose adaptive Lasso quantile regression (BALQR) from a Bayesian perspective. The method extends the Bayesian Lasso quantile...
Telemarketers of online job advertising firms face significant challenges understanding the advertising demands of small-sized enterprises. The effective use of a data mining approach can offer e-recruitment companies an improved...
A variable selection problem is analysed for use in Principal Component Analysis (PCA). In this case, the set of original variables is divided into disjoint groups. The problem resides in the selection of variables, but with the...
Part I of this paper (Adjiman et al., 1997) described the theoretical foundations of a global optimization algorithm, the αBB algorithm, which can be used to solve problems belonging to the broad class of twice-differentiable NLPs. For any...
, restructuring banks and strengthening the resources of credit institutions. The objective of this Fund is to manage bank restructuring processes and help strengthen their resources. The initial funding provided for this Fund is 9.000...
LASSO (Least Absolute Shrinkage and Selection Operator) is a useful tool to achieve shrinkage and variable selection simultaneously. Since LASSO uses the L1 penalty, the optimization should rely on the quadratic program (QP) or...
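As a rough illustration of how an L1 penalty shrinks and selects at the same time, here is a minimal pure-Python sketch. It does not use the quadratic-programming route mentioned in the abstract; it swaps in cyclic coordinate descent with soft-thresholding, another commonly used solver. The function names, the toy data and the penalty value `lam` are all hypothetical and chosen only for the example.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p features each; y is a list of n responses.
    No intercept is fitted, so the data are assumed to be centred.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared length of feature j
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return beta


# Toy data (illustrative only): y = 3 * x1 + 0.1 * x2, both features centred.
X = [[-2.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [2.0, 1.0]]
y = [-5.9, -3.1, 2.9, 6.1]

unpenalised = lasso_coordinate_descent(X, y, lam=0.0)
penalised = lasso_coordinate_descent(X, y, lam=0.5)
print("lam = 0.0:", unpenalised)
print("lam = 0.5:", penalised)
```

With the penalty switched on, the weak second coefficient is set exactly to zero while the strong first coefficient is only shrunk, which is the combined shrinkage and selection behaviour the abstract refers to.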
This study examined baseline characteristics associated with abstinence from tobacco 6 months after treatment for nicotine dependence. A total of 1224 cigarette smokers (619 females, 605 males) receiving clinical services for treatment of...
The problem of variable selection in neural network regression models with dependent data is considered. In this framework, a test procedure based on the introduction of a measure for the variable relevance to the model is discussed. The...
In the presented research three measurement strategies of Fourier transform infrared (FT-IR) spectrometry (horizontal- and micro-attenuated total reflection (HATR and μATR, respectively) and a novel high-throughput transmission (HTT)) in...
Keywords: Diagnosis; Limiting factor; Model mixing; Model selection; Stepwise; Parameter estimation; Wheat
The study examines the effectiveness of different neural networks in predicting bankruptcy filing. Two approaches for training neural networks, Back-Propagation and Optimal Estimation Theory, are considered. Within the back-propagation...
Species distribution modelling is central to both fundamental and applied research in biogeography. Despite widespread use of models, there are still important conceptual ambiguities as well as biotic and algorithmic uncertainties that...
Extensive optimisation of a mathematical model's fit to a relatively small set of empirical data may lead to over-optimistic validation results. If the assessment of the final, optimised model is based on the same validation method and...
The relationship between absorption in the near-infrared (NIR) spectral region and the target analytical parameter is frequently of the non-linear type. The origin of the non-linearity can be widely varied and difficult to identify. In...
The problem of predicting a future value of a time series is considered in this paper. If the series follows a stationary Markov process, this can be done by nonparametric estimation of the autoregression function. Two forecasting...
Physically based models are commonly used as an integral step in landslide hazard assessment. Geomorphic principles can be applied to a broad area, resulting in first order assessment of landslide susceptibility. New techniques are now...
Hypothesis testing is a model selection problem for which the solutions proposed by the two main statistical streams of thought, frequentists and Bayesians, differ substantially. One may think that this fact might be due to the prior...
In this paper we introduce two procedures for variable selection in cluster analysis and classification rules. One is mainly oriented to detect the "noisy" non-informative variables, while the other deals also with multicollinearity. A...
A problem of supervised learning from the multivariate time series (MTS) data where the target variable is potentially a highly complex function of MTS features is considered. This paper focuses on finding a compressed representation of...
The goal of the presented study is two-fold. First, we want to emphasize the power of Near Infrared Reflectance (NIR) spectroscopy for discrimination between mayonnaise samples containing different vegetable oils. Secondly, we want to use...
Dimension reduction and variable selection are two types of effective methods that deal with high-dimensional data. In particular, variable selection techniques are of wide-spread use and essentially consist of individual selection...
In this presentation a methodology is suggested for site assessment and site selection based on a data analysis framework. This framework consists of three steps using different data analysis methods from cluster analysis, classification...
The application of a genetic algorithm (GA) to the selection of principal components (PCs) is proposed as an efficient method to determine the optimal multivariate regression model. This stochastic method was compared with other...
Statistics in Medicine, 38: 558-582.
We propose a data mining approach to predict human wine taste preferences that is based on easily available analytical tests at the certification step. A large dataset (when compared to other studies in this domain) is considered, with...
This work deals with the use of multiple correspondence analysis (MCA) and a weighted Euclidean distance (the tolerance distance) as an exploratory tool in developing predictive logistic models. The method was applied to a living-donor...
Wind energy is having an increasing influence on the energy supply in many countries, but in contrast to conventional power plants, it is a fluctuating energy source. For its integration into the electricity supply structure, it is...
Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally expensive and ignore stochastic...
We provide an overview of latent variable methods used in pharmaceutics and integrated with advanced characterization techniques such as vibrational spectroscopy. The basics of the most common latent variable methods, principal component...
Selecting the proper phenotypic trait among the various traits related to the performance of interest plays an important role in genetic evaluation. In this study, principal components analysis (PCA) was adapted to generate a new index as a...
Mixture designs and corresponding analysis techniques are of considerable importance in food science and industry. Mixture data are generally challenging to model, since the mixture restrictions lead to both exact and near collinearity....
Least Angle Regression is a promising technique for variable selection applications, offering a nice alternative to stepwise regression. It provides an explanation for the similar behavior of LASSO (ℓ1-penalized regression) and forward...
A new approach for variable influence on projection (VIP) is described, which takes full advantage of the orthogonal projections to latent structures (OPLS) model formalism for enhanced model interpretability. This means that it will...
The archives of species range polygons developed under comprehensive assessments of the conservation status of species, such as the IUCN's Global Assessments, are a significant resource in the analysis of biodiversity for conservation...
We present a new Stata program, vselect, that helps users perform variable selection after performing a linear regression. Options for stepwise methods such as forward selection and backward elimination are provided. The user may...
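To make the idea of forward selection concrete, the following is a small pure-Python sketch, not the vselect program itself and not Stata code. It greedily adds the predictor that lowers the residual sum of squares (RSS) the most and stops when the relative improvement falls below a threshold. The helper names, the `min_gain` stopping rule and the toy data are assumptions made for illustration; real tools typically stop on criteria such as significance tests or information criteria. The solver assumes the selected columns are not perfectly collinear.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def rss(X, y, cols):
    """RSS of the least-squares fit of y on an intercept plus the given columns."""
    Z = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(Z[0])
    # Normal equations: (Z'Z) beta = Z'y
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    g = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    beta = solve(A, g)
    return sum((yi - sum(bk * zk for bk, zk in zip(beta, z))) ** 2
               for z, yi in zip(Z, y))


def forward_select(X, y, min_gain=0.05):
    """Add columns one at a time; stop when the relative RSS drop is at most min_gain."""
    selected, remaining = [], list(range(len(X[0])))
    current = rss(X, y, selected)
    while remaining:
        best = min(remaining, key=lambda c: rss(X, y, selected + [c]))
        new = rss(X, y, selected + [best])
        if current - new <= min_gain * current:
            break
        selected.append(best)
        remaining.remove(best)
        current = new
    return selected


# Toy data (illustrative only): y = 2 * x0 - x2 + small noise; x1 is irrelevant.
X = [[1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1],
     [-1, 1, 1], [-1, 1, -1], [-1, -1, 1], [-1, -1, -1]]
y = [1.1, 2.9, 0.9, 3.1, -3.1, -0.9, -2.9, -1.1]

print("selected columns:", forward_select(X, y))
```

On this toy design the procedure picks columns 0 and 2 and leaves out the irrelevant column 1. Backward elimination would run the same loop in reverse, starting from all columns and dropping the one whose removal raises the RSS least.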
In this paper, we describe an extension of the backtracking with look-ahead forward checking method that adopts weighted partial satisfaction of soft constraints and that has been applied to the development of an automated...
Penalized regression methods for simultaneous variable selection and coefficient estimation, especially those based on the lasso, have received a great deal of attention in recent years, mostly through frequentist models. Properties...
Similar to variable selection in the linear regression model, selecting significant components in the popular additive regression model is of great interest. However, such components are unknown smooth functions of independent variables,...
Eight different variable selection techniques for model-based and non-model-based clustering are evaluated across a wide range of cluster structures. It is shown that several methods have difficulties when non-informative variables (i.e.,...
Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or...
The relationship between climatic change and human evolution can be framed in terms of three major hypotheses. A modern version of the long-held savanna hypothesis posits that the expansion of grassland ecosystems in Africa was driven by...
A web-based application has been designed from a genetic-epidemiology point of view to analyze association studies. Main capabilities include: descriptive analysis, test for Hardy-Weinberg equilibrium and linkage disequilibrium. Analysis...
It is frequently the case with Big Data that you have a gazillion exogenous predictor time series that you need to analyze so you can build a dynamic regression time series model. The %ARIMA_SELECT macro can help you to determine which...