Newest 'classification' Questions

1 vote

0 answers

7 views

Density ratio estimation based classification

Consider the classification problem $p(y \mid x)$ where $y$ is the label and $x$ is the feature vector, e.g. an image. Normally, we would fit, e.g. a convolutional network, and predict $y$ given $x$. ...

Kaiwen

211

asked 2 days ago

1 vote

0 answers

32 views

Preprocessing and model selection strategies

I am working on a fault detection problem where each sample is a time series labeled with a specific type of fault. I am using a CNN model and a validation set for hyperparameter tuning. Currently, I ...

S.H.W

77

asked Dec 12 at 18:02

1 vote

1 answer

12 views

Varying sequence lengths between classes in LSTM

I am working on a project where the goal is to predict whether students in an online course will drop out of the course. The course is divided into 20 course weeks. For each week, I have certain kinds ...

Computeraar

11

asked Dec 12 at 13:38

6 votes

2 answers

683 views

Building a Statistically Sound ML Model

Silent reader here in the statistics substack. One thing I've learned is that many "default" machine learning practices are being challenged due to fundamental statistical mistakes. This has ...

easymoneysniper

69

asked Dec 11 at 9:06

1 vote

1 answer

111 views

How to compute confidence and uncertainty of a model without ground truth from softmax output?

Suppose I have 3 classes A, B, C. Performing: y_pred = model.predict(X) # suppose X contains only two samples Returning vector with length ...

Muhammad Ikhwan Perwira

151

asked Dec 9 at 2:06
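
For the softmax question above, two label-free summaries that are commonly read off a softmax row are the largest probability (a confidence score) and the Shannon entropy (an uncertainty score). The sketch below is only an illustration: the `y_pred` values are hypothetical, the helper name `confidence_and_entropy` is made up here, and softmax outputs are not guaranteed to be calibrated, so neither number should be treated as a true probability of being correct.

```python
import math

def confidence_and_entropy(probs):
    """Return (max probability, normalized Shannon entropy) for one softmax row."""
    confidence = max(probs)
    # Shannon entropy; terms with p == 0 contribute 0 by convention.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Divide by log(n_classes) so the result lies in [0, 1].
    normalized = entropy / math.log(len(probs))
    return confidence, normalized

# Hypothetical softmax outputs for two samples over classes A, B, C.
y_pred = [
    [0.90, 0.05, 0.05],  # peaked: high confidence, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform: low confidence, high entropy
]
results = [confidence_and_entropy(row) for row in y_pred]
for row, (conf, ent) in zip(y_pred, results):
    print(row, "confidence=%.2f" % conf, "normalized entropy=%.3f" % ent)
```

Normalizing by the log of the number of classes makes the entropy comparable across problems with different class counts.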

0 votes

0 answers

14 views

How to assign the true classes in a column to proceed with the analysis [closed]

I want to do a two-way ANOVA based on two independent variables, question_type and string_lenght, and one dependent variable, response_time. But I have this dataset where I have a problem in ...

MexcelsiorB

11

asked Dec 8 at 13:59

0 votes

0 answers

15 views

Comparing probabilities of two models

Consider a dataset and two binary classes CLASS_A and CLASS_B, with different proportions of 0 and 1. Suppose we train a model such as XGBClassifier for both ...

Ale

1,690

asked Dec 7 at 20:08

-1 votes

0 answers

27 views

Every regression problem can be cast as a classification problem. How about vice versa? [closed]

If regression is about estimating the continuous value of ground truth and predicted values, and continuous values can be cast into discrete values, then discrete values can be labeled as ...

Muhammad Ikhwan Perwira

151

asked Dec 7 at 17:09

0 votes

0 answers

24 views

Develop credit rating based on binary variable (in R preferably)

I work with credit data in R. I have a loan dataset where I have borrower- and credit-specific variables and a binary indicator default/notdefault. As a first step I fit a logit model and get the probability of ...

Danylo Krasovytskyi

1

asked Dec 6 at 8:49

0 votes

0 answers

18 views

XGBoost F1 score improvement methods for multiclass classification [duplicate]

I am building a multiclass classifier (5 classes) using XGBoost. I am currently using 56 features for a customer base of 1.6 million with balanced classes. The overall accuracy is 83%, F1 score is 0.81, ...

saurabh15

1

asked Dec 4 at 12:02

0 votes

0 answers

18 views

ROC curve threshold coincides with mean of the predicted variable

I was fitting a bunch of logistic regression models to some dataset, where the variables to predict were all binary. After I fit the models, I then ran some simple code to use the ROC curve to find ...

Juan Felipe Salamanca Lozano

29

asked Dec 3 at 19:43

0 votes

2 answers

51 views

Is it possible that false-positive rate decreases with increasing prevalence?

I am interested in the effect of prevalence on prediction performance. Chouldechova (2016) states that: [w]hen using a test-fair [recidivism prediction instrument] in populations where recidivism ...

Max J.

113

asked Dec 3 at 11:09

1 vote

0 answers

23 views

Difference between conditional probability and naive bayes classifier

I have this dataset:

a b c K
1 0 1 1
1 1 1 1
0 1 1 0
1 1 0 0
1 0 1 0
0 0 0 1
0 0 0 1
0 0 1 0

I'm trying to calculate the probability P(K=1 ∣ a=1 ∧ b=1 ∧ c=0) using a Naive Bayes Classifier ...

imcakmak

11

asked Dec 2 at 15:33
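
For the Naive Bayes question above, the difference between the two quantities can be checked numerically. The sketch below reads the table in the question as eight rows of four columns (a, b, c, K), which is an assumption about how the excerpt was laid out; it uses no smoothing, and the function names are made up for illustration.

```python
from fractions import Fraction

# Rows (a, b, c, K), assuming the question's table has four columns and eight rows.
rows = [
    (1, 0, 1, 1), (1, 1, 1, 1), (0, 1, 1, 0), (1, 1, 0, 0),
    (1, 0, 1, 0), (0, 0, 0, 1), (0, 0, 0, 1), (0, 0, 1, 0),
]
query = (1, 1, 0)  # a=1, b=1, c=0

def naive_bayes_posterior(rows, query, k):
    """P(K=k | query) assuming features are independent given K (no smoothing)."""
    scores = {}
    for label in {r[3] for r in rows}:
        in_class = [r for r in rows if r[3] == label]
        score = Fraction(len(in_class), len(rows))  # prior P(K=label)
        for j, value in enumerate(query):
            matches = sum(1 for r in in_class if r[j] == value)
            score *= Fraction(matches, len(in_class))  # P(feature_j=value | K=label)
        scores[label] = score
    return scores[k] / sum(scores.values())

def empirical_conditional(rows, query, k):
    """P(K=k | query) by counting only the rows that match the query exactly."""
    matching = [r for r in rows if r[:3] == query]
    return Fraction(sum(1 for r in matching if r[3] == k), len(matching))

nb = naive_bayes_posterior(rows, query, 1)
direct = empirical_conditional(rows, query, 1)
print("naive Bayes:", nb, "direct count:", direct)
```

Under this reading of the data, both classes get the same unnormalized score of 1/32, so Naive Bayes returns 1/2, while the only row matching a=1, b=1, c=0 has K=0, so the direct count returns 0. The two differ because Naive Bayes multiplies per-feature conditionals instead of conditioning on the joint feature pattern.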

2 votes

0 answers

15 views

Forecast optimality for categorical dependent variable

I am familiar with several criteria of forecast optimality for variables on a ratio scale. E.g. Diebold Forecasting in Economics, Business, Finance and Beyond introduces the unforecastability ...

Richard Hardy

69.5k

asked Nov 27 at 12:19

2 votes

2 answers

55 views

How do I make a classifier that's robust to variation in data imbalance?

I'm writing a binary classifier for data that is often pretty unbalanced, e.g. 99% in the majority class (using gradient boosting), but I'd like to have a classifier that is at least somewhat robust ...

mrz123456

43

asked Nov 26 at 20:26

2 votes

1 answer

56 views

Why is it ok NOT to use soft labels in classification? [duplicate]

I have, in some sense, an opposite question to Is it okay to use cross entropy loss function with soft labels? which is why is it ok NOT to use soft labels in classification? Let's say you have a ...

YuseqYaseq

161

asked Nov 26 at 7:50

1 vote

1 answer

27 views

Reason for softmax approximation in Ian Goodfellow's deep learning book

In section 6.2.2.2 (equation 6.31) they state: Overall, unregularized maximum likelihood will drive the model to learn parameters that drive the softmax to predict the fraction of counts of each ...

Philipp

11

asked Nov 22 at 14:02

4 votes

1 answer

144 views

Why am I getting worse results when using a CNN for feature extraction and an SVM for classification?

I am using a 3D CNN for feature extraction and an SVM for classification, but I got worse results than when using the 3D CNN for both feature extraction and classification. Is that normal?

anya

41

asked Nov 19 at 8:43

1 vote

0 answers

19 views

Stacking Vs Voting Vs Blending

I am working on an experiment with a dataset, where I compared the performance of stacking, blending, and voting using base models and logistic regression as my meta model. Although, voting seems to ...

Community wiki

2 revs, 2 users 100%
user54565

0 votes

0 answers

37 views

Stacking for a very highly imbalanced class problem

Background: I'm facing a 1 : 40 000 class imbalance problem. It's a binary classification problem with a positive class of around 500-700 instances and a negative class in the tens of millions of instances. I ...

user24758287

1

asked Nov 14 at 9:35

0 votes

0 answers

24 views

Model Stacking - Out of Fold Procedures

I am attempting to use a model stacking procedure where I am using a time-series split on a set of data I have (around 5000 entries). The goal is binary classification. After obtaining hyper ...

user54565

47

asked Nov 14 at 1:07

1 vote

0 answers

14 views

How to Evaluate a Model in a Train Size Ablation Study with Small Datasets?

I’m working on a supervised classification model with a small dataset (~200 samples). My goal is to create a model that performs well with as few training samples as possible. To test this, I’m ...

Philipp

11

asked Nov 13 at 14:50

2 votes

1 answer

36 views

Model Stacking Train Test Split Methods

I am trying to validate my processes in terms of how I am engaging in model stacking for binary classification. Say I have two models as my base models, models A and B both with different classifiers ...

user54565

47

asked Nov 13 at 1:58

0 votes

0 answers

92 views

Using classifier prediction as independent variable with within-subjects data?

In my work I've stumbled upon an interesting result when a classifier is applied to within-subjects data. My question is whether this is a known result, and if so, does it have a name? I can't find ...

David B

1,944

asked Nov 8 at 17:05

4 votes

1 answer

54 views

Why is my conditional inference tree so different from my random forest?

I am analyzing a data set using conditional inference trees and random forests, using the R package partykit (v. 1.2.20). For my dependent variable (likelihood of responding), the tree and the forest ...

user3499014

41

asked Nov 8 at 2:26

2 votes

1 answer

31 views

Low AUC value for XGBoost binary classification model

I am using XGBoost to construct a binary classification model to identify individuals with a diagnosis based on things like weight, average steps per day, age, etc. My model is quite small (only 200 ...

xgboost

21

asked Nov 6 at 0:06

1 vote

1 answer

30 views

How to evaluate performance of classification model for different subsets of classes?

Consider a classification problem where there are N classes. While this may seem strange, I have a model that processes features and essentially evaluates which classes are impossible (or near ...

Ralff

252

asked Nov 2 at 2:52

0 votes

0 answers

24 views

Derivation of LDA estimates for 1 dimensional case

In An Introduction to Statistical Learning Section 4.4.1, where it discusses LDA for classification, which is just a Bayes classifier with an assumption that the probability density function of each ...

user442779

1

asked Nov 2 at 2:19

2 votes

2 answers

70 views

Applying Shapley values to classification

According to the definition by Štrumbelj and Kononenko (2013), Shapley values are defined for regression predictions. They should, however, also be applicable to classifications when the classifier ...

cdalitz

5,730

asked Oct 30 at 16:22

2 votes

0 answers

84 views

How can minimalist models achieve even higher accuracy on MNIST?

I recently came across a fascinating discussion on how a simple logistic regression model can achieve around 92% classification accuracy on the MNIST dataset (reference: How does a simple logistic ...

Mountain

121

asked Oct 28 at 15:24

0 votes

0 answers

13 views

Understanding the role of prior probabilities on the misclassification rate of a DD Classifier

So I'm analyzing this paper for research: DD Classifier. I'm having trouble understanding the role of the prior probabilities in the misclassification rate formula. My doubt is: Having this answer ...

Kronk

asked Oct 26 at 9:36

0 votes

0 answers

15 views

Assessing classifier accuracy when class presence is scarce

What can I do to assess a classifier's accuracy when class presence is scarce? Setup 1: I have 1000 boxes, 500 contain gold. I build an automated tool to find the gold. The recommended approach would ...

Klops

156

asked Oct 25 at 9:58

1 vote

1 answer

51 views

What do you call the error model of swapping observations not labels?

I am working on a classification problem and have reason to believe that the only error the classifier makes is due to swapping the predictions of some cases with those of other, unobserved cases. ...

jan-glx

379

asked Oct 25 at 8:32

1 vote

0 answers

39 views

Is my XGBoost Model Still Overfitting (Binary Classification)?

I am trying to build a binary classification model with XGBoost. I made sure to split my data into the training, validation and test sets. I performed feature selection, early stoppage and ...

Shak Jivraj

11

asked Oct 17 at 2:49

0 votes

0 answers

22 views

Imbalanced dataset and binary classification [duplicate]

In my dataset there are 31657 data points, and each has a binary output (1 or 0). The problem is that the output 1 occurs only 886/31657 times, while 0 occurs 30771/31657 times. Since my dataset is imbalanced, ...

Daniela

1

asked Oct 12 at 18:04

4 votes

1 answer

30 views

Report classifier accuracy with missing data

Suppose we are reading blood pressure. Some of the readings are corrupt and unusable. We then train a binary classifier to detect high blood pressure. What does the theory of experimental design say ...

Alex

403

asked Oct 12 at 0:40

0 votes

0 answers

9 views

Methods to analyze biomarker data to detect pathologies (and co-pathologies)

I'm trying to do analysis to determine what biomarkers are able to detect specific pathologies. I am looking at 4 separate pathologies; however, the problem is they often co-occur with each other. So if ...

jcoop

1

asked Oct 10 at 22:51

0 votes

0 answers

14 views

Scaling for binary data features

My goal is a binary classification of a sports match between two players, particularly I am concerned with the probabilities of each player winning. My current dataframe has feature values of [...

user54565

47

asked Oct 10 at 0:22

0 votes

0 answers

23 views

False Positive Rates at a fixed False Negative Rate

When reporting False Positive Rates (FPRs) at a fixed False Negative Rate (FNR), does the choice of threshold still matter for comparison purposes, or is it sufficient to compare FPRs directly since ...

babygould

11

asked Oct 9 at 12:24

1 vote

0 answers

43 views

Correct way to select data for (probabilistic) churn model?

Context: The team I work for would like to model the probability that a customer will churn given the specific covariates of the customer. We have two main ideas: a model using survival analysis and a ...

ImpactGuide

103

asked Oct 8 at 15:32

3 votes

1 answer

124 views

How is a PR curve plotted?

Whilst reading Machine Learning by Zhi-Hua Zhou (pg 34-35), I was a little confused on the method used to plot a PR curve and was hoping you could help me become a little less confused. In the book it ...

Thomas Stokes

33

asked Oct 8 at 11:20
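
For the PR curve question above, one common construction is to sort the samples by predicted score in descending order and treat each position in turn as the cutoff, recording precision and recall at every step. The sketch below illustrates that construction only; the scores and labels are hypothetical, the function name `pr_points` is made up, tied scores are not handled specially, and it is not claimed to match the book's exact procedure.

```python
def pr_points(scores, labels):
    """Return (recall, precision) pairs, one per cutoff, after sorting by score descending.

    labels are 1 for positive and 0 for negative; at least one positive is assumed.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(labels)
    tp = 0
    points = []
    for rank, i in enumerate(order, start=1):
        tp += labels[i]  # everything up to this rank is predicted positive
        recall = tp / total_pos
        precision = tp / rank
        points.append((recall, precision))
    return points

# Hypothetical classifier scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
points = pr_points(scores, labels)
for recall, precision in points:
    print("recall=%.2f precision=%.2f" % (recall, precision))
```

Plotting precision against recall for these points gives the curve; the last point always has recall 1 and precision equal to the fraction of positives in the data.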

0 votes

0 answers

20 views

Can a neural network learn element-wise relationships of some matrix?

Suppose I have a dataset of matrices which I'd like to classify based on a hidden relationship between their elements. For example, let's have $\prod_{ij} A_{ij} = B$ and if $B$ is the same then they ...

DjM

1

asked Oct 4 at 10:13

0 votes

0 answers

29 views

Advice on fine-tuning an email classifier for a Pharma company

I'm an intern working on implementing a binary email classifier for a client (Pharmaceutical company) and I need some advice on fine-tuning the model. The model I'm using is Longformer (because it has ...

Bhashwar Sengupta

1

asked Sep 24 at 16:03

2 votes

1 answer

40 views

How do I reverse engineer an unknown classifier?

Let's say I have a set of binary predictions along with ground truths for a binary classification problem. The predictions are made by an unknown classifier. My task is to reverse engineer the ...

ProfessionalUsername

23

asked Sep 24 at 7:44

0 votes

0 answers

11 views

What Hugging Face model is recommended for finding terms in texts (or synonyms)?

I would like to use a Hugging Face model to find, for example, the concept "pedestrian" in a description like "In this image there is a person driving a minibike and another on a ...

KansaiRobot

257

asked Sep 24 at 5:01

0 votes

1 answer

80 views

How to handle an imbalanced dataset to predict university admission using a predictive model in R?

I'm working on a project where I need to predict university admission (binary variable: 0 = not admitted, 1 = admitted) based on a set of candidate features (e.g., academic scores, age, gender, etc.). ...

Carla Bruni

asked Sep 20 at 15:28

0 votes

0 answers

53 views

How to fine-tune an ensemble-level hyper-parameter in the following problem?

I have a relatively small dataset of 4000 instances. I aim to perform classification using an ensemble approach. My ensemble consists of 5 different classifiers and I have an ensemble-level hyper-...

E-O

1

asked Sep 18 at 14:19

1 vote

0 answers

42 views

How to evaluate the performance of a class based on a binary target while removing the effect of other dependent features?

Here is the problem I'm facing: I have a dataset of maintenance performed with two (or more) categorical variables (e.g. supplier and customer). Each has hundreds of different classes. These features ...

Sayanel

11

asked Sep 17 at 8:32

0 votes

0 answers

34 views

Flattening a likelihood

Background: Let $y_1,y_2,\dots,y_K$ be a sequence of measurements. I've derived a likelihood $\mathcal{L}(y|i)$ to solve a classification problem via the Bayesian classifier \begin{equation} p_k(i)=\...

matteogost

473

asked Sep 10 at 13:15

1 vote

0 answers

22 views

Discriminative vs generative & Bayes risk

In the context of binary classification with features denoted $X$ and labels $y$, one could specify the joint distribution of $(X, y)$ in two equivalent but different ways: the generative way ...

Ramufasa

11

asked Sep 8 at 16:40
