Binary Classification
Recent papers in Binary Classification
One of the central problems in modern biology is to identify the complete set of interactions among the proteins in a cell. The structural interaction of proteins and their domains in networks is one of the most basic molecular mechanisms... more
Telemarketers at online job advertising firms face significant challenges in understanding the advertising demands of small enterprises. The effective use of a data mining approach can offer e-recruitment companies an improved... more
Retinal blood vessel assessment plays an important role in the diagnosis of ophthalmic pathologies. The use of digital images for this purpose enables the application of a computerized approach and has fostered the development of multiple... more
Cohen's kappa index is reformulated for multiple classifications based on exchangeable random variables. It is found that kappa is between 0 and 1 inclusive. Two characterizations for kappa are stated in terms of the relationship between... more
The authors describe a model-based kappa statistic for binary classifications which is interpretable in the same manner as Scott's pi and Cohen's kappa, yet does not suffer from the same flaws. They compare this statistic with the... more
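For readers unfamiliar with the baseline statistic these two abstracts refer to, the following is a minimal sketch of the standard two-rater Cohen's kappa, (observed agreement - chance agreement) / (1 - chance agreement). It is not the reformulated or model-based statistic proposed in either paper; the function name `cohens_kappa` and the example labels are assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Standard Cohen's kappa for two raters labeling the same items.

    Illustrative helper (hypothetical name); not the model-based
    statistic described in the abstracts above.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("need two non-empty label sequences of equal length")

    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: computed from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Two raters assigning binary labels to four items.
    print(cohens_kappa([1, 1, 0, 0], [1, 0, 1, 0]))
```

In this standard form, agreement no better than chance gives a value of 0, and systematic disagreement gives a negative value.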
Study notes summarizing the basics of Support Vector Machines (SVM) for binary classification. Some pseudocode is provided as an introduction to SVM programming.
This paper investigates the influence of different page features on the ranking of search engine results. We use Google (via its API) as our testbed and analyze the result rankings for several queries of different categories using... more
Due to the enormous amount of data and opinions being produced, shared and transferred every day across the internet and other media, sentiment analysis has become vital for developing opinion mining systems. This paper introduces a... more
The purpose of this paper is to propose a new binary classification method for predicting corporate failure based on a genetic algorithm, and to validate its predictive power through empirical analysis. Establishing virtual companies... more
Several real problems involve the classification of data into categories or classes. Given a data set containing data whose classes are known, Machine Learning algorithms can be employed for the induction of a classifier able to predict... more
Healthcare expenditure is a growing concern in the US. In 2012, the total US annual healthcare expenditure reached $2.8 trillion. As a percentage of GDP, it is the highest among all the nations worldwide. In this study we investigate the... more
Convolutional Neural Networks (CNNs) are a subset of the supervised learning class of algorithms. They are very similar to regular neural networks and aim to find an optimal predictive model that assigns the input variable to the correct... more
The problem of testing equivalence of two ROC curves is addressed. A transformation of corresponding ROC curves, which motivates a test statistic based on a distance of two empirical quantile processes, is suggested, its asymptotic... more
The primary purpose of binary classification is performance optimization, since even the slightest prediction improvements can have significant implications for each field of application. Finding the most effective separation of classes is the... more
This tutorial provides a concise overview of support vector machines and different closely related techniques for pattern classification. The tutorial starts with the formulation of support vector machines for classification. The method... more
In an era of strong customer relationship management (CRM) emphasis, firms strive to build valuable relationships with their existing customer base. In this study, we attempt to better understand three important measures of customer... more
Background: Apraxia in patients with stroke may be overlooked, as clumsiness and deficient gestural communication are often attributed to frequently coexisting sensorimotor deficits and aphasia. Early and reliable detection of apraxia by a... more
This paper addresses the prediction of epileptic seizures from the online analysis of EEG data. This problem is of paramount importance for the realization of monitoring/control units to be implanted on drug-resistant epileptic patients.... more
A worm is a standalone malware that replicates itself and does not need a host to propagate. The prevailing detection approach uses a fixed-size sequence of characters extracted from the worm as a signature. Although very fast, the... more
Bankruptcy prediction has been a topic of research for decades, both within the financial and the academic world. The implementations of international financial and accounting standards, such as Basel II and IFRS, as well as the recent... more
Two enhancements are proposed to the application and theory of support vector machines. The first is a method of multicategory classification based on the binary classification version of the support vector machine (SVM). The method,... more
The support vector machine (SVM) is a sound learning method and a robust classification procedure. Choosing a suitable kernel function in SVM is crucial for obtaining good performance; the difficulty is how to choose a suitable data... more
Support vector machines (SVMs) were originally formulated for the solution of binary classification problems. In multiclass problems, a decomposition approach is often employed, in which the multiclass problem is divided... more
Learning algorithms can suffer a performance bias when data sets only have a small number of training examples for one or more classes. In this scenario learning methods can produce the deceptive appearance of "good looking" results even... more
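The deceptive "good looking" results mentioned in this abstract can be illustrated with a small sketch. It assumes a hypothetical data set of 95 negative and 5 positive examples and a classifier that always predicts the majority class; the choice of accuracy, minority-class recall and the Matthews Correlation Coefficient as metrics is an assumption for illustration, not the method of the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all examples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def minority_recall(tp, fn):
    """Fraction of positive (minority) examples that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; 0.0 by convention when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical unbalanced data: 5 positives, 95 negatives.
    y_true = [1] * 5 + [0] * 95
    # A trivial classifier that always predicts the majority class.
    y_pred = [0] * 100

    counts = confusion_counts(y_true, y_pred)
    print("accuracy:", accuracy(*counts))
    print("minority recall:", minority_recall(counts[0], counts[3]))
    print("MCC:", mcc(*counts))
```

Under these assumptions the accuracy is high even though no minority example is detected, which is why a class-sensitive metric is usually reported alongside accuracy for unbalanced data.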
Words and phrases acquire meaning from the way they are used in society, from their relative semantics to other words and phrases. For computers the equivalent of 'society' is 'database,' and the equivalent of 'use' is 'way to search the... more
This paper investigates the feature subset selection problem for binary classification using a logistic regression model. We developed a modified discrete particle swarm optimization (PSO) algorithm for the feature subset... more
In this paper we present a novel approach and a new machine learning problem, called Supervised Novelty Detection (SND). This problem extends the One-Class Support Vector Machine setting for binary classification while keeping the nice... more
This paper presents an approach to the joint optimization of neural network structure and weights which can take advantage of backpropagation as a specialized decoder. The approach is applied to binary classification of brain waves in the... more
We show that the Confusion Entropy, a measure of performance in multiclass problems has a strong (monotone) relation with the multiclass generalization of a classical metric, the Matthews Correlation Coefficient. Analytical results are... more
Signalized intersections are accident-prone areas, especially for rear-end crashes, because the diversity of drivers' braking behaviors increases during the signal change. The objective of this article is to improve... more
We participated (as Team 9) in the Article Classification Task of the Biocreative II.5 Challenge: binary classification of fulltext documents relevant for protein-protein interaction. We used two distinct classifiers for the online and... more
Misbehavior detection in vehicular ad hoc networks (VANETs) is performed to improve traffic safety and driving accuracy. All the nodes in a VANET communicate with each other through message logs. Malicious nodes in the VANETs can... more
We define voice activity detection (VAD) as a binary classification problem and solve it using the support vector machine (SVM). Challenges in SVM-based approach include selection of representative training segments, selection of... more
We explored the use of a fiber-optic probe for in vivo fluorescence spectroscopy of breast tissues during percutaneous image-guided breast biopsy. A total of 121 biopsy samples with accompanying histological diagnosis were obtained... more
We have proposed a hybrid SVM-based decision tree to speed up SVMs in their testing phase for binary classification tasks. While most existing methods addressing this task aim at reducing the number of support vectors, we have focused... more
The properties of acoustic speech have previously been investigated as possible cues for depression in adults. However, these studies were restricted to small populations of patients and the speech recordings were made during patients'... more
Ensembles that combine the decisions of classifiers generated by using perturbed versions of the training set where the classes of the training examples are randomly switched can produce a significant error reduction, provided that large... more
In many real-world binary classification tasks (e.g. detection of certain objects in images), the available dataset is imbalanced, i.e., it has far fewer representatives of one class (the minority class) than of the other. Generally,... more
INTRODUCTION: One limitation for the study of Uromyces appendiculatus (Pers.: Pers.) Unger (syn. U. phaseoli (Reben) Wint.), the causal agent of rust in common bean (Phaseolus vulgaris L.), was the system used for the classification of... more
Forecasting applications on the stock market attract much interest from researchers in the artificial intelligence field. The problem tackled in this study concerns predicting the direction of change of stock price indices, formulated in... more
In several computer-aided diagnosis (CAD) applications of image processing, there is no sufficiently sensitive and specific method for determining what constitutes a normal versus an abnormal classification of a chest radiograph. In the... more
The paper proposes a different approach to data modeling. Analogous to the rejection method, where the misclassifications are removed and manually evaluated, we focus here on difficult-to-distinguish cases for binary classification. Such... more
We consider two on-line learning frameworks: binary classification through linear threshold functions and linear regression. We study a family of on-line algorithms, called p-norm algorithms, introduced by Grove, Littlestone and... more
Machine learning algorithms such as genetic programming (GP) can evolve biased classifiers when data sets are unbalanced. Data sets are unbalanced when at least one class is represented by only a small number of training examples (called... more