Questions tagged [classification]
Statistical classification is the problem of identifying the sub-population to which new observations belong, where the identity of the sub-population is unknown, on the basis of a training set of data containing observations whose sub-population is known. Because the resulting classifications are subject to error, their behavior is variable and can be studied with statistical methods.
6,947 questions
1
vote
0
answers
7
views
Density ratio estimation based classification
Consider the classification problem $p(y \mid x)$ where $y$ is the label and $x$ is the feature vector, e.g. an image. Normally, we would fit, e.g. a convolutional network, and predict $y$ given $x$. ...
1
vote
0
answers
32
views
Preprocessing and model selection strategies
I am working on a fault detection problem where each sample is a time series labeled with a specific type of fault. I am using a CNN model and a validation set for hyperparameter tuning. Currently, I ...
1
vote
1
answer
12
views
Varying sequence lengths between classes in LSTM
I am working on a project where the goal is to predict whether students in an online course will drop out of the course. The course is divided into 20 course weeks. For each week, I have certain kinds ...
6
votes
2
answers
683
views
Building a Statistically Sound ML Model
Silent reader here in the statistics substack. One thing I've learned is that many "default" machine learning practices are being challenged due to fundamental statistical mistakes. This has ...
1
vote
1
answer
111
views
How to compute confidence and uncertainty of model without ground truth from softmax output?
Suppose I have 3 classes A,B,C.
Performing:
y_pred = model.predict(X) # suppose X contains only two samples
Returning vector with length ...
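A minimal sketch of one way to summarize a softmax row without ground truth: report the top-class probability as a confidence score and the normalized Shannon entropy as an uncertainty score. It assumes `y_pred` holds one probability row per sample over the classes A, B, C; the numbers and the helper `softmax_summary` are hypothetical, and raw softmax probabilities are not necessarily calibrated.

```python
import math

def softmax_summary(probs):
    """Return (predicted index, top probability, normalized entropy) for one softmax row."""
    top = max(range(len(probs)), key=lambda i: probs[i])
    # Shannon entropy in nats; zero-probability terms contribute nothing.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Dividing by log(n_classes) maps the entropy to [0, 1].
    return top, probs[top], entropy / math.log(len(probs))

# Hypothetical softmax output for two samples (not taken from the question).
y_pred = [[0.90, 0.05, 0.05], [0.34, 0.33, 0.33]]
classes = ["A", "B", "C"]

for row in y_pred:
    idx, confidence, uncertainty = softmax_summary(row)
    print(classes[idx], round(confidence, 2), round(uncertainty, 3))
```

A normalized entropy near 0 means the mass sits on one class; a value near 1 means the row is close to uniform.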
0
votes
0
answers
14
views
How to assign the true classes in a column to proceed with the analysis [closed]
I want to do a two-way ANOVA based on two independent variables question_type and string_lenght and one dependent variable response_time.
But I have this dataset where I have a problem in ...
0
votes
0
answers
15
views
Comparing probabilities of two models
Consider a dataset and two binary classes CLASS_A and CLASS_B, with different proportions of 0 and 1. Suppose we train a model such as XGBClassifier for both ...
-1
votes
0
answers
27
views
Every regression problem can be cast as a classification problem. What about vice versa? [closed]
If regression is about estimating the continuous value of ground truth and predicted values, and continuous values can be cast into discrete values, then discrete values can be labeled as ...
0
votes
0
answers
24
views
Develop credit rating based on binary variable (in R preferably)
I work with credit data in R. I have a loan dataset where I have borrower and credit specific variables and a binary indicator default/notdefault. As a first step I fit a logit model and get the probability of ...
0
votes
0
answers
18
views
XGBoost F1 score improvement methods for multi-class classification [duplicate]
I am building a multiclass classifier (5 classes) using XGBoost. I am currently using 56 features for a customer base of 1.6 million with balanced classes.
The overall accuracy is 83%, F1 score is 0.81, ...
0
votes
0
answers
18
views
ROC curve threshold coincides with mean of the predicted variable
I was fitting a bunch of logistic regression models to some dataset, where the variables to predict were all binary. After I fit the models, I then ran some simple code to use the ROC curve to find ...
0
votes
2
answers
51
views
Is it possible that false-positive rate decreases with increasing prevalence?
I am interested in the effect of prevalence on prediction performance. Chouldechova (2016) states that:
[w]hen using a test-fair [recidivism prediction instrument] in populations where recidivism ...
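A small sketch of how prevalence ties the error rates together. From the definitions of the confusion-matrix rates, FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR), where p is the prevalence. The function name `implied_fpr` and the numeric values are hypothetical and only illustrate the identity; they are not taken from the question or the paper.

```python
def implied_fpr(prevalence, ppv, fnr):
    """False-positive rate implied by prevalence, PPV and FNR.

    Follows from PPV = TP / (TP + FP) with TP = p * (1 - FNR)
    and FP = (1 - p) * FPR, solved for FPR.
    """
    return prevalence / (1 - prevalence) * (1 - ppv) / ppv * (1 - fnr)

# Hold PPV and FNR fixed (hypothetical values) and vary the prevalence.
for p in (0.2, 0.4, 0.6):
    print(p, round(implied_fpr(p, ppv=0.7, fnr=0.3), 3))
```

With PPV and FNR held fixed, the implied FPR grows with prevalence, so a decreasing FPR requires PPV or FNR to change as well.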
1
vote
0
answers
23
views
Difference between conditional probability and naive bayes classifier
I have this dataset
a  b  c  K
1  0  1  1
1  1  1  1
0  1  1  0
1  1  0  0
1  0  1  0
0  0  0  1
0  0  0  1
0  0  1  0
I'm trying to calculate the probability of
P(K=1 ∣ a=1 ∧ b=1 ∧ c=0) using a Naive Bayes Classifier
...
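A worked sketch of the two quantities in the title, assuming the values listed in the question are read row by row as (a, b, c, K), with no smoothing. The function names are hypothetical. The Naive Bayes posterior multiplies per-feature conditionals within each class, while the direct conditional probability only counts the rows that match the query exactly.

```python
from fractions import Fraction

# Rows are (a, b, c, K); the row-by-row reading of the listing is an assumption.
rows = [
    (1, 0, 1, 1), (1, 1, 1, 1), (0, 1, 1, 0), (1, 1, 0, 0),
    (1, 0, 1, 0), (0, 0, 0, 1), (0, 0, 0, 1), (0, 0, 1, 0),
]
query = (1, 1, 0)  # a=1, b=1, c=0

def naive_bayes_posterior(rows, query, k=1):
    """P(K=k | query) under the naive conditional-independence assumption."""
    scores = {}
    for label in {r[3] for r in rows}:
        subset = [r for r in rows if r[3] == label]
        score = Fraction(len(subset), len(rows))  # class prior P(K=label)
        for j, value in enumerate(query):
            # Per-feature likelihood P(feature_j = value | K=label)
            score *= Fraction(sum(r[j] == value for r in subset), len(subset))
        scores[label] = score
    return scores[k] / sum(scores.values())

def empirical_conditional(rows, query, k=1):
    """P(K=k | query) estimated from the rows matching the query exactly."""
    matches = [r for r in rows if r[:3] == query]
    return Fraction(sum(r[3] == k for r in matches), len(matches))

print(naive_bayes_posterior(rows, query))  # 1/2
print(empirical_conditional(rows, query))  # 0
```

Under this reading, each class gets an unnormalized score of 1/32, so Naive Bayes returns 1/2, whereas the only row with a=1, b=1, c=0 has K=0, so the direct estimate is 0.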
2
votes
0
answers
15
views
Forecast optimality for categorical dependent variable
I am familiar with several criteria of forecast optimality for variables on a ratio scale. E.g. Diebold Forecasting in Economics, Business, Finance and Beyond introduces the unforecastability ...
2
votes
2
answers
55
views
How do I make a classifier that's robust to variation in data imbalance?
I'm writing a binary classifier for data that is often pretty unbalanced, e.g. 99% in the majority class (using gradient boosting), but I'd like to have a classifier that is at least somewhat robust ...
2
votes
1
answer
56
views
Why is it ok NOT to use soft labels in classification? [duplicate]
I have, in some sense, an opposite question to Is it okay to use cross entropy loss function with soft labels? which is why is it ok NOT to use soft labels in classification?
Let's say you have a ...
1
vote
1
answer
27
views
Reason for softmax approximation in Ian Goodfellow's deep learning book
In section 6.2.2.2 (equation 6.31) they state:
Overall, unregularized maximum likelihood will drive the model to learn parameters that drive the softmax to predict the fraction of counts of each ...
4
votes
1
answer
144
views
Why am I getting worse results when using a CNN for feature extraction and an SVM for classification?
I am using a 3D CNN for feature extraction and an SVM for classification, but I got worse results than when using the 3D CNN for both feature extraction and classification. Is that normal?
1
vote
0
answers
19
views
Stacking Vs Voting Vs Blending
I am working on an experiment with a dataset, where I compared the performance of stacking, blending, and voting using base models and logistic regression as my meta model. Although, voting seems to ...
0
votes
0
answers
37
views
Stacking for a very highly imbalanced class problem
Background: I'm facing a 1 : 40 000 class imbalance problem.
It's a binary classification problem with around 500-700 instances in the positive class and tens of millions of instances in the negative class.
I ...
0
votes
0
answers
24
views
Model Stacking - Out of Fold Procedures
I am attempting to use a model stacking procedure where I am using a time-series split on a set of data I have (around 5000 entries). The goal is binary classification.
After obtaining hyper ...
1
vote
0
answers
14
views
How to Evaluate a Model in a Train Size Ablation Study with Small Datasets?
I’m working on a supervised classification model with a small dataset (~200 samples). My goal is to create a model that performs well with as few training samples as possible. To test this, I’m ...
2
votes
1
answer
36
views
Model Stacking Train Test Split Methods
I am trying to validate my processes in terms of how I am engaging in model stacking for binary classification. Say I have two models as my base models, models A and B both with different classifiers ...
0
votes
0
answers
92
views
Using classifier prediction as independent variable with within-subjects data?
In my work I've stumbled upon an interesting result when a classifier is applied to within-subjects data. My question is whether this is a known result, and if so, does it have a name? I can't find ...
4
votes
1
answer
54
views
Why is my conditional inference tree so different from my random forest?
I am analyzing a data set using conditional inference trees and random forests, using the R package partykit (v. 1.2.20). For my dependent variable (likelihood of responding), the tree and the forest ...
2
votes
1
answer
31
views
Low AUC value for XGBoost binary classification model
I am using XGBoost to construct a binary classification model to identify individuals with a diagnosis based on things like weight, average steps per day, age, etc. My model is quite small (only 200 ...
1
vote
1
answer
30
views
How to evaluate performance of classification model for different subsets of classes?
Consider a classification problem where there are N classes. While this may seem strange, I have a model that processes features and, essentially, evaluates which classes are impossible (or near ...
0
votes
0
answers
24
views
Derivation of LDA estimates for 1 dimensional case
In An Introduction to Statistical Learning Section 4.4.1, where it discusses LDA for classification, which is just a Bayes classifier with an assumption that the probability density function of each ...
2
votes
2
answers
70
views
Applying Shapley values to classification
According to the definition by Štrumbelj and Kononenko (2013), Shapley values are defined for regression predictions.
They should, however, also be applicable to classifications when the classifier ...
2
votes
0
answers
84
views
How can minimalist models achieve even higher accuracy on MNIST?
I recently came across a fascinating discussion on how a simple logistic regression model can achieve around 92% classification accuracy on the MNIST dataset (reference: How does a simple logistic ...
0
votes
0
answers
13
views
Understanding the role of prior probabilities on the misclassification rate of a DD Classifier
So I'm analyzing this paper for a research project:
DD Classifier
I'm having trouble understanding the role of the prior probabilities in the misclassification rate formula.
My doubt is:
Having this answer ...
0
votes
0
answers
15
views
Assessing classifier accuracy when class presence is scarce
What can I do to assess a classifier's accuracy when class presence is scarce?
Setup 1: I have 1000 boxes, 500 contain gold. I build an automated tool to find the gold.
The recommended approach would ...
1
vote
1
answer
51
views
What do you call the error model of swapping observations not labels?
I am working on a classification problem and have reason to believe that the only error the classifier makes is due to swapping the predictions of some cases with those of other, unobserved cases. ...
1
vote
0
answers
39
views
Is my XGBoost Model Still Overfitting (Binary Classification)?
I am trying to build a binary classification model with XGBoost. I made sure to split my data into the training, validation and test sets. I performed feature selection, early stopping and ...
0
votes
0
answers
22
views
Imbalanced dataset and binary classification [duplicate]
In my dataset there are 31657 observations, each with a binary output (1 or 0).
The problem is that the output 1 occurs only 886/31657 times, while 0 occurs 30771/31657 times.
Since my dataset is imbalanced, ...
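For a sense of scale, the counts quoted above can be turned into a prevalence and into the common "balanced" class-weight heuristic, n_samples / (n_classes * class_count). This is only a sketch of the arithmetic; the variable names are hypothetical, and whether reweighting is appropriate depends on the goal of the analysis, which the excerpt does not state.

```python
# Counts taken from the question text.
n_pos, n_neg = 886, 30771
n_total = n_pos + n_neg

# Share of observations with output 1.
prevalence = n_pos / n_total

# "Balanced" heuristic: each class receives the same total weight.
class_weights = {
    1: n_total / (2 * n_pos),
    0: n_total / (2 * n_neg),
}

# Negative-to-positive ratio, sometimes used as a positive-class weight.
neg_pos_ratio = n_neg / n_pos

print(n_total)
print(round(prevalence, 4))
print({k: round(v, 3) for k, v in class_weights.items()})
print(round(neg_pos_ratio, 2))
```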
4
votes
1
answer
30
views
Report classifier accuracy with missing data
Suppose we are reading blood pressure. Some of the readings are corrupt and unusable. We then train a binary classifier to detect high blood pressure. What does the theory of experimental design say ...
0
votes
0
answers
9
views
Methods to analyze biomarker data to detect pathologies (and co-pathologies)
I'm trying to do analysis to determine what biomarkers are able to detect specific pathologies. I am looking at 4 separate pathologies, however the problem is they often co-occur with each other. So if ...
0
votes
0
answers
14
views
Scaling for binary data features
My goal is a binary classification of a sports match between two players, particularly I am concerned with the probabilities of each player winning.
My current dataframe has feature values of [...
0
votes
0
answers
23
views
False Positive Rates at a fixed False Negative Rate
When reporting False Positive Rates (FPRs) at a fixed False Negative Rate (FNR), does the choice of threshold still matter for comparison purposes, or is it sufficient to compare FPRs directly since ...
1
vote
0
answers
43
views
Correct way to select data for (probabilistic) churn model?
Context: The team I work for would like to model the probability that a customer will churn given the specific covariates of the customer. We have two main ideas: a model using survival analysis and a ...
3
votes
1
answer
124
views
How is a PR curve plotted?
Whilst reading Machine Learning by Zhi-Hua Zhou (pg 34-35), I was a little confused on the method used to plot a PR curve and was hoping you could help me become a little less confused.
In the book it ...
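One common construction of a PR curve, sketched under the assumption that the classifier outputs a score per sample and that there are no tied scores: sort the samples by decreasing score, move the cutoff down one sample at a time, and record recall and precision at each cutoff. The function `pr_curve` and the data are hypothetical, and this is not necessarily the exact procedure the book describes.

```python
def pr_curve(scores, labels):
    """Return (recall, precision) pairs, one per cutoff, from highest score down."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(labels)
    true_pos = 0
    points = []
    for rank, i in enumerate(order, start=1):
        # The top `rank` samples are predicted positive at this cutoff.
        true_pos += labels[i]
        points.append((true_pos / total_pos, true_pos / rank))
    return points

# Hypothetical scores and 0/1 labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 1, 0]

for recall, precision in pr_curve(scores, labels):
    print(round(recall, 3), round(precision, 3))
```

Plotting recall on the horizontal axis against precision on the vertical axis for these points gives the curve.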
0
votes
0
answers
20
views
Can a neural network learn element wise relationships of some matrix?
Suppose I have a dataset of matrices which I'd like to classify based on a hidden relationship between their elements. For example, let's have $\Pi_{ij} A_{ij} = B$ and if $B$ is the same then they ...
0
votes
0
answers
29
views
Advice on fine-tuning an email classifier for a Pharma company
I'm an intern working on implementing a binary email classifier for a client (Pharmaceutical company) and I need some advice on fine-tuning the model.
The model I'm using is Longformer (because it has ...
2
votes
1
answer
40
views
How do I reverse engineer an unknown classifier?
Let's say I have a set of binary predictions along with ground truths for a binary classification problem. The predictions are made by an unknown classifier. My task is to reverse engineer the ...
0
votes
0
answers
11
views
What Hugging Face model is recommended for finding terms (or synonyms) in texts?
I would like to use a Hugging Face model to find, for example, the concept "pedestrian" in a description like "In this image there is a person driving a minibike and another on a ...
0
votes
1
answer
80
views
How to handle an imbalanced dataset to predict university admission using a predictive model in R?
I'm working on a project where I need to predict university admission (binary variable: 0 = not admitted, 1 = admitted) based on a set of candidate features (e.g., academic scores, age, gender, etc.). ...
0
votes
0
answers
53
views
How to fine-tune an ensemble-level hyper-parameter in the following problem?
I have a relatively small dataset of 4000 instances. I aim to perform classification using an ensemble approach. My ensemble consists of 5 different classifiers and I have an ensemble-level hyper-...
1
vote
0
answers
42
views
How to evaluate the performance of a class based on a binary target while removing the effect of other dependent features?
Here is the problem I'm facing:
I have a dataset of maintenance performed with two categorical variables (or more) (e.g. supplier and customer). Each have hundreds of different classes. These features ...
0
votes
0
answers
34
views
Flattening a likelihood
Background
Let $y_1,y_2,\dots,y_K$ be a sequence of measurements.
I've derived a likelihood $\mathcal{L}(y|i)$ to solve a classification problem via the Bayesian classifier
\begin{equation}
p_k(i)=\...
1
vote
0
answers
22
views
Discriminative vs generative & Bayes risk
In the context of binary classification with features denoted $X$ and labels $y$, one could specify the joint distribution of $(X, y)$ in two equivalent but different ways:
The generative way ...