
Questions tagged [classification]

Statistical classification is the problem of identifying the sub-population to which new observations belong, where the identity of the sub-population is unknown, on the basis of a training set of data containing observations whose sub-population is known. Because the assignment is inferred from a finite training sample, the resulting classifications exhibit variability that can itself be studied with statistics.

1 vote
0 answers
7 views

Density ratio estimation based classification

Consider the classification problem $p(y \mid x)$ where $y$ is the label and $x$ is the feature vector, e.g. an image. Normally, we would fit, e.g. a convolutional network, and predict $y$ given $x$. ...
Kaiwen · 211
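
A hedged aside on the question above: writing the class priors as $\pi_1, \pi_0$ and the density ratio as $r(x) = p(x \mid y=1)/p(x \mid y=0)$ (my notation, not the asker's), Bayes' rule gives

$$
p(y=1 \mid x) \;=\; \frac{\pi_1\, p(x \mid y=1)}{\pi_1\, p(x \mid y=1) + \pi_0\, p(x \mid y=0)} \;=\; \frac{1}{1 + \frac{\pi_0}{\pi_1}\, r(x)^{-1}},
$$

so estimating the density ratio $r(x)$ is, up to the priors, equivalent to estimating the discriminative model $p(y \mid x)$.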
1 vote
0 answers
32 views

Preprocessing and model selection strategies

I am working on a fault detection problem where each sample is a time series labeled with a specific type of fault. I am using a CNN model and a validation set for hyperparameter tuning. Currently, I ...
S.H.W · 77
1 vote
1 answer
12 views

Varying sequence lengths between classes in LSTM

I am working on a project where the goal is to predict whether students in an online course will drop out of the course. The course is divided into 20 course weeks. For each week, I have certain kinds ...
Computeraar
6 votes
2 answers
683 views

Building a Statistically Sound ML Model

Silent reader here in the statistics substack. One thing I've learned is that many "default" machine learning practices are being challenged due to fundamental statistical mistakes. This has ...
easymoneysniper
1 vote
1 answer
111 views

How to compute confidence and uncertainty of a model without ground truth from softmax output?

Suppose I have 3 classes A, B, C. Performing y_pred = model.predict(X)  # suppose X contains only two samples returns a vector with length ...
Muhammad Ikhwan Perwira
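
A minimal sketch for the question above (made-up softmax outputs standing in for y_pred = model.predict(X); not the asker's data) of two common label-free summaries, the maximum softmax probability and the predictive entropy:

    import numpy as np

    # Hypothetical softmax output for two samples over classes A, B, C.
    y_pred = np.array([[0.70, 0.20, 0.10],
                       [0.40, 0.35, 0.25]])

    confidence = y_pred.max(axis=1)                   # max softmax probability per sample
    entropy = -(y_pred * np.log(y_pred)).sum(axis=1)  # predictive entropy (in nats)
    print(confidence)  # higher = more confident
    print(entropy)     # higher = more uncertain

Neither quantity needs ground-truth labels, but both only reflect the model's own, possibly miscalibrated, beliefs.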
0 votes
0 answers
14 views

How to assign the true classes in a column to proceed with the analysis [closed]

I want to do a two-way ANOVA based on two independent variables question_type and string_lenght and one dependent variable response_time. But I have this dataset where I have a problem in ...
MexcelsiorB
0 votes
0 answers
15 views

Comparing probabilities of two models

Consider a dataset and two binary classes CLASS_A and CLASS_B, with different proportions of 0 and 1. Suppose we train a model such as XGBClassifier for both ...
Ale · 1,690
-1 votes
0 answers
27 views

Every regression problem can be cast into a classification problem. What about the reverse? [closed]

If regression is about estimating the continuous values of the ground truth and the predictions, and continuous values can be cast into discrete values, then discrete values can be labeled as ...
Muhammad Ikhwan Perwira
0 votes
0 answers
24 views

Develop credit rating based on binary variable (in R preferably)

I work with credit data in R. I have a loan dataset with borrower- and credit-specific variables and a binary indicator default/notdefault. As a first step I fit a logit model and get the probability of ...
Danylo Krasovytskyi
0 votes
0 answers
18 views

XGBoost F1 score improvement methods for multi-class classification [duplicate]

I am building a multiclass classification (5 classes) using XGBoost. Currently using 56 features for a 1.6 million customer base with balanced classes. The overall accuracy is 83%, F1 score is 0.81, ...
saurabh15
0 votes
0 answers
18 views

ROC curve threshold coincides with mean of the predicted variable

I was fitting a bunch of logistic regression models to some dataset, where the variables to predict were all binary. After I fit the models, I then ran some simple code to use the ROC curve to find ...
Juan Felipe Salamanca Lozano
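
A small sketch related to the question above (synthetic scores; scikit-learn assumed) of the usual Youden's J way of picking the "best" ROC threshold, which can then be compared with the mean of the predictions:

    import numpy as np
    from sklearn.metrics import roc_curve

    rng = np.random.default_rng(0)
    y_true = rng.integers(0, 2, size=500)                               # synthetic binary labels
    y_score = np.clip(0.3 * y_true + rng.normal(0.4, 0.2, 500), 0, 1)   # synthetic predicted probabilities

    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    best = thresholds[np.argmax(tpr - fpr)]   # threshold maximizing Youden's J = TPR - FPR
    print(best, y_score.mean())               # compare with the mean predicted probability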
0 votes
2 answers
51 views

Is it possible that false-positive rate decreases with increasing prevalence?

I am interested in the effect of prevalence on prediction performance. Chouldechova (2016) states that: [w]hen using a test-fair [recidivism prediction instrument] in populations where recidivism ...
Max J. · 113
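
For reference, the confusion-matrix identity behind the quoted result (stated here from the standard definitions rather than quoted from the paper), with prevalence $p$:

$$
\mathrm{FPR} \;=\; \frac{p}{1-p}\,\frac{1-\mathrm{PPV}}{\mathrm{PPV}}\,(1-\mathrm{FNR}),
$$

so if PPV and FNR are held fixed across groups, the FPR must move as the prevalence $p$ changes.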
1 vote
0 answers
23 views

Difference between conditional probability and naive bayes classifier

I have this dataset:

    a b c K
    1 0 1 1
    1 1 1 1
    0 1 1 0
    1 1 0 0
    1 0 1 0
    0 0 0 1
    0 0 0 1
    0 0 1 0

I'm trying to calculate the probability of P(K=1 ∣ a=1 ∧ b=1 ∧ c=0) using a Naive Bayes Classifier ...
imcakmak
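
A short note on the distinction the question above is after (symbols only): the exact conditional probability is estimated by counting the rows that match the condition, whereas naive Bayes replaces it with

$$
P(K=1 \mid a, b, c) \;\propto\; P(K=1)\,P(a \mid K=1)\,P(b \mid K=1)\,P(c \mid K=1),
$$

which assumes $a$, $b$, $c$ are conditionally independent given $K$, so the two answers will generally differ on a small table like this.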
2 votes
0 answers
15 views

Forecast optimality for categorical dependent variable

I am familiar with several criteria of forecast optimality for variables on a ratio scale. E.g. Diebold Forecasting in Economics, Business, Finance and Beyond introduces the unforecastability ...
Richard Hardy
2 votes
2 answers
55 views

How do I make a classifier that's robust to variation in data imbalance?

I'm writing a binary classifier for data that is often pretty unbalanced, e.g. 99% in the majority class (using gradient boosting), but I'd like to have a classifier that is at least somewhat robust ...
mrz123456
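
One common starting point for the question above, sketched under the assumption that the gradient booster is XGBoost (the asker only says "gradient boosting"): reweight the rare class in proportion to the imbalance so the loss is not dominated by the majority class.

    import numpy as np
    from xgboost import XGBClassifier

    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(1000, 5))            # hypothetical features
    y_train = np.array([0] * 990 + [1] * 10)        # hypothetical 99:1 imbalanced labels

    # scale_pos_weight ~ n_negative / n_positive upweights errors on the rare positive class.
    spw = (y_train == 0).sum() / (y_train == 1).sum()
    clf = XGBClassifier(scale_pos_weight=spw, n_estimators=200)
    clf.fit(X_train, y_train)

Recomputing the weight from each training set is what lets the same pipeline track varying degrees of imbalance.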
2 votes
1 answer
56 views

Why is it ok NOT to use soft labels in classification? [duplicate]

I have, in some sense, the opposite question to "Is it okay to use cross entropy loss function with soft labels?", which is: why is it OK NOT to use soft labels in classification? Let's say you have a ...
YuseqYaseq
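
The quantity at stake in the question above, written out: with predicted probabilities $p_i$ and target distribution $q_i$, cross-entropy is

$$
H(q, p) \;=\; -\sum_i q_i \log p_i ,
$$

and hard labels are the special case where $q$ is one-hot, so the sum collapses to $-\log p_{\text{true class}}$; the question is about when the extra information in a non-degenerate $q$ actually matters.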
1 vote
1 answer
27 views

Reason for softmax approximation in Ian Goodfellow's deep learning book

In section 6.2.2.2 (equation 6.31) they state: Overall, unregularized maximum likelihood will drive the model to learn parameters that drive the softmax to predict the fraction of counts of each ...
Philipp · 11
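
For context (the standard definition, not a quote from the book): the softmax in question is

$$
\operatorname{softmax}(z)_i \;=\; \frac{\exp(z_i)}{\sum_j \exp(z_j)},
$$

and the quoted passage says that, without regularization, maximum likelihood pushes these outputs toward the empirical fraction of training outcomes observed in each class.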
4 votes
1 answer
144 views

Why am I getting worse results when using a CNN for feature extraction and an SVM for classification?

I am using a 3D CNN for feature extraction and an SVM for classification, but I got worse results than using the 3D CNN for both feature extraction and classification. Is that normal?
anya · 41
1 vote
0 answers
19 views

Stacking Vs Voting Vs Blending

I am working on an experiment with a dataset, where I compared the performance of stacking, blending, and voting using base models and logistic regression as my meta-model. Although voting seems to ...
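
A compact sketch of the comparison in the question above (synthetic data; scikit-learn's built-in ensembles rather than whatever custom setup the asker used, with logistic regression as the meta-model):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, random_state=0)   # synthetic data
    base = [("rf", RandomForestClassifier(random_state=0)),
            ("dt", DecisionTreeClassifier(random_state=0))]

    stack = StackingClassifier(estimators=base, final_estimator=LogisticRegression())
    vote = VotingClassifier(estimators=base, voting="soft")

    for name, model in [("stacking", stack), ("soft voting", vote)]:
        print(name, cross_val_score(model, X, y, cv=5).mean())

Blending differs only in that the meta-model is fit on a single held-out split instead of cross-validated out-of-fold predictions.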
0 votes
0 answers
37 views

Stacking for very high imbalanced class problem

Background: I'm facing a 1:40,000 class imbalance problem. It's a binary classification problem with a positive class of around 500-700 instances and a negative class in the tens of millions of instances. I ...
user24758287
0 votes
0 answers
24 views

Model Stacking - Out of Fold Procedures

I am attempting to use a model stacking procedure where I am using a time-series split on a set of data I have (around 5000 entries). The goal is binary classification. After obtaining hyper ...
user54565
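
A rough sketch of the out-of-fold step under a time-series split (synthetic data and placeholder models, not the asker's): note that with TimeSeriesSplit the earliest observations never receive out-of-fold predictions, which is one of the usual sticking points.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import TimeSeriesSplit

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 10))        # hypothetical time-ordered features
    y = rng.integers(0, 2, size=5000)      # hypothetical binary target

    base = GradientBoostingClassifier(random_state=0)
    oof = np.full(len(y), np.nan)          # out-of-fold predictions from the base model

    for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
        base.fit(X[train_idx], y[train_idx])
        oof[test_idx] = base.predict_proba(X[test_idx])[:, 1]

    mask = ~np.isnan(oof)                  # rows that never fell in a test fold are dropped
    meta = LogisticRegression().fit(oof[mask].reshape(-1, 1), y[mask])

With several base models, each contributes one such out-of-fold column to the meta-model's design matrix.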
1 vote
0 answers
14 views

How to Evaluate a Model in a Train Size Ablation Study with Small Datasets?

I’m working on a supervised classification model with a small dataset (~200 samples). My goal is to create a model that performs well with as few training samples as possible. To test this, I’m ...
Philipp · 11
2 votes
1 answer
36 views

Model Stacking Train/Test Split Methods

I am trying to validate my processes in terms of how I am engaging in model stacking for binary classification. Say I have two models as my base models, models A and B both with different classifiers ...
user54565
0 votes
0 answers
92 views

Using classifier prediction as independent variable with within-subjects data?

In my work I've stumbled upon an interesting result when a classifier is applied to within-subjects data. My question is whether this is a known result, and if so, does it have a name? I can't find ...
David B · 1,944
4 votes
1 answer
54 views

Why is my conditional inference tree so different from my random forest?

I am analyzing a data set using conditional inference trees and random forests, using the R package partykit (v. 1.2.20). For my dependent variable (likelihood of responding), the tree and the forest ...
user3499014
2 votes
1 answer
31 views

Low AUC value for XGBoost binary classification model

I am using XGBoost to construct a binary classification model to identify individuals with a diagnosis based on things like weight, average steps per day, age, etc. My model is quite small (only 200 ...
xgboost · 21
1 vote
1 answer
30 views

How to evaluate performance of classification model for different subsets of classes?

Consider a classification problem where there are N classes. While this may seem strange, I have a model that processes features and, essentially, evaluates which classes are impossible (or near ...
Ralff · 252
0 votes
0 answers
24 views

Derivation of LDA estimates for 1 dimensional case

In An Introduction to Statistical Learning, Section 4.4.1 discusses LDA for classification, which is just a Bayes classifier with the assumption that the probability density function of each ...
user442779
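
For orientation, the result that section derives: with shared variance $\sigma^2$, class means $\mu_k$, and priors $\pi_k$, plugging the Gaussian densities into Bayes' theorem and taking logs shows that the Bayes classifier assigns $x$ to the class with the largest discriminant

$$
\delta_k(x) \;=\; \frac{x\,\mu_k}{\sigma^2} \;-\; \frac{\mu_k^2}{2\sigma^2} \;+\; \log \pi_k ,
$$

and the LDA estimates are obtained by substituting the sample means, the pooled variance, and the class proportions.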
2 votes
2 answers
70 views

Applying Shapley values to classification

According to the definition by Štrumbelj and Kononenko (2013), Shapley values are defined for regression predictions. They should, however, also be applicable to classifications when the classifier ...
cdalitz · 5,730
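
For reference, the Shapley value of feature $i$ for a value function $v$ over the feature set $N$:

$$
\phi_i \;=\; \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,\bigl(|N|-|S|-1\bigr)!}{|N|!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr),
$$

and the point of the question is that nothing stops $v(S)$ from being a predicted class probability rather than a regression output.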
2 votes
0 answers
84 views

How can minimalist models achieve even higher accuracy on MNIST?

I recently came across a fascinating discussion on how a simple logistic regression model can achieve around 92% classification accuracy on the MNIST dataset (reference: How does a simple logistic ...
Mountain · 121
0 votes
0 answers
13 views

Understanding the role of prior probabilities in the misclassification rate of a DD Classifier

I'm analyzing this paper for a research project: DD Classifier. I'm having trouble understanding the role of the prior probabilities in the misclassification rate formula. My doubt is: having this answer ...
0 votes
0 answers
15 views

Assessing classifier accuracy when class presence is scarce

What can I do to assess a classifier's accuracy when class presence is scarce? Setup 1: I have 1000 boxes, 500 contain gold. I build an automated tool to find the gold. The recommended approach would ...
Klops · 156
1 vote
1 answer
51 views

What do you call the error model of swapping observations not labels?

I am working on a classification problem and have reason to believe that the only error the classifier makes is due to swapping the predictions of some cases with those of other, unobserved cases. ...
jan-glx · 379
1 vote
0 answers
39 views

Is my XGBoost Model Still Overfitting (Binary Classification)?

I am trying to build a binary classification model with XGBoost. I made sure to split my data into training, validation and test sets. I performed feature selection, early stopping and ...
Shak Jivraj
0 votes
0 answers
22 views

Imbalanced dataset and binary classification [duplicate]

In my dataset there are 31657 observations and each has a binary output (1 or 0). The problem is that output 1 occurs only 886/31657 times, while 0 occurs 30771/31657 times. Since my dataset is imbalanced, ...
Daniela
4 votes
1 answer
30 views

Report classifier accuracy with missing data

Suppose we are reading blood pressure. Some of the readings are corrupt and unusable. We then train a binary classifier to detect high blood pressure. What does the theory of experimental design say ...
Alex · 403
0 votes
0 answers
9 views

Methods to analyze biomarker data to detect pathologies (and co-pathologies)

I'm trying to do analysis to determine which biomarkers are able to detect specific pathologies. I am looking at 4 separate pathologies; however, the problem is that they often co-occur with each other. So if ...
jcoop · 1
0 votes
0 answers
14 views

Scaling for binary data features

My goal is a binary classification of a sports match between two players, particularly I am concerned with the probabilities of each player winning. My current dataframe has feature values of [...
user54565
0 votes
0 answers
23 views

False Positive Rates at a fixed False Negative Rate

When reporting False Positive Rates (FPRs) at a fixed False Negative Rate (FNR), does the choice of threshold still matter for comparison purposes, or is it sufficient to compare FPRs directly since ...
babygould
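
A small sketch for the question above (synthetic scores; scikit-learn assumed) showing how fixing the FNR pins down a threshold implicitly, since FNR = 1 − TPR:

    import numpy as np
    from sklearn.metrics import roc_curve

    rng = np.random.default_rng(0)
    y_true = rng.integers(0, 2, size=2000)                                # synthetic labels
    y_score = np.clip(0.3 * y_true + rng.normal(0.4, 0.2, 2000), 0, 1)    # synthetic scores

    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    fnr = 1 - tpr
    idx = np.argmin(np.abs(fnr - 0.05))    # operating point closest to FNR = 5%
    print(thresholds[idx], fpr[idx])       # the implied threshold and the FPR reported at it

Comparing FPRs at the same target FNR therefore already encodes a (possibly different) threshold for each system.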
1 vote
0 answers
43 views

Correct way to select data for (probabilistic) churn model?

Context: The team I work for would like to model the probability that a customer will churn given the specific covariates of the customer. We have two main ideas: a model using survival analysis and a ...
ImpactGuide
3 votes
1 answer
124 views

How is a PR curve plotted?

Whilst reading Machine Learning by Zhi-Hua Zhou (pp. 34-35), I was a little confused about the method used to plot a PR curve and was hoping you could help me become a little less confused. In the book it ...
Thomas Stokes
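
A minimal sketch of the construction the book describes (synthetic data; scikit-learn performs the same sweep of thresholds over the ranked scores):

    import numpy as np
    from sklearn.metrics import precision_recall_curve
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    y_true = rng.integers(0, 2, size=1000)                               # synthetic labels
    y_score = np.clip(0.3 * y_true + rng.normal(0.4, 0.2, 1000), 0, 1)   # synthetic scores

    # Each threshold treats the examples scored above it as positives;
    # one (recall, precision) pair per threshold traces out the curve.
    precision, recall, _ = precision_recall_curve(y_true, y_score)
    plt.plot(recall, precision)
    plt.xlabel("recall"); plt.ylabel("precision")
    plt.show()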
0 votes
0 answers
20 views

Can a neural network learn element-wise relationships of some matrix?

Suppose I have a dataset of matrices which I'd like to classify based on a hidden relationship between their elements. For example, let's have $\prod_{ij} A_{ij} = B$ and if $B$ is the same then they ...
DjM · 1
0 votes
0 answers
29 views

Advice on fine-tuning an email classifier for a Pharma company

I'm an intern working on implementing a binary email classifier for a client (Pharmaceutical company) and I need some advice on fine-tuning the model. The model I'm using is Longformer (because it has ...
Bhashwar Sengupta
2 votes
1 answer
40 views

How do I reverse engineer an unknown classifier?

Let's say I have a set of binary predictions along with ground truths for a binary classification problem. The predictions are made by an unknown classifier. My task is to reverse engineer the ...
ProfessionalUsername
0 votes
0 answers
11 views

What Hugging Face model is recommended for finding terms in texts (or synonyms)?

I would like to use a hugging face model in order to for example find the concept "pedestrian" in a description like "In this image there is a person driving a minibike and another on a ...
KansaiRobot
0 votes
1 answer
80 views

How to handle an imbalanced dataset to predict university admission using a predictive model in R?

I'm working on a project where I need to predict university admission (binary variable: 0 = not admitted, 1 = admitted) based on a set of candidate features (e.g., academic scores, age, gender, etc.). ...
0 votes
0 answers
53 views

How to fine-tune an ensemble-level hyper-parameter in the following problem?

I have a relatively small dataset of 4000 instances. I aim to perform classification using an ensemble approach. My ensemble consists of 5 different classifiers and I have an ensemble-level hyper-...
E-O · 1
1 vote
0 answers
42 views

How to evaluate the performance of a class based on a binary target while removing the effect of other dependent features?

Here is the problem I'm facing: I have a dataset of performed maintenance with two (or more) categorical variables (e.g. supplier and customer). Each has hundreds of different classes. These features ...
Sayanel · 11
0 votes
0 answers
34 views

Flattening a likelihood

Background: Let $y_1,y_2,\dots,y_K$ be a sequence of measurements. I've derived a likelihood $\mathcal{L}(y|i)$ to solve a classification problem via the Bayesian classifier \begin{equation} p_k(i)=\...
matteogost
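
A guess at the standard form behind the truncated excerpt above (my reconstruction, not the asker's exact expression): with priors $\pi_i$,

$$
p_k(i) \;=\; \frac{\pi_i\,\mathcal{L}(y \mid i)}{\sum_j \pi_j\,\mathcal{L}(y \mid j)},
$$

and "flattening" the likelihood is then commonly done by tempering, i.e. raising $\mathcal{L}$ to a power $0<\beta<1$ before normalizing.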
1 vote
0 answers
22 views

Discriminative vs generative & Bayes risk

In the context of binary classification with features denoted $X$ and labels $y$, one could specify the joint distribution of $(X, y)$ in two equivalent but different ways: the generative way ...
Ramufasa
