
Questions tagged [precision-recall]

Precision and recall (P&R) are a way to measure the relevance of a set of retrieved instances. Precision is the % of retrieved instances that are correct; recall is the % of true instances that are retrieved. The harmonic mean of P&R is the F1-score. P&R are used in data mining to evaluate classifiers.
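
As a concrete illustration of the definitions above, here is a minimal Python sketch that computes precision, recall, and the F1-score from raw confusion-matrix counts (the counts are invented for illustration):

```python
# Invented confusion-matrix counts, purely for illustration.
tp, fp, fn = 40, 10, 20

precision = tp / (tp + fp)   # share of retrieved instances that are correct
recall = tp / (tp + fn)      # share of true instances that were retrieved
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of P & R

print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```
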

0 votes
0 answers
7 views

Drawbacks of stratified test set bootstrapping for metric UQ

I am using test-set (percentile) bootstrapping to quantify the uncertainty of various model performance metrics, such as AUROC, AUPR, etc. To avoid any confusion, the approach is simply: bootstrap ...
Eike P. • 3,098
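
For the bootstrap question above, a minimal percentile-bootstrap sketch (not necessarily the asker's exact setup; y_true and y_score are placeholder arrays, and the plain, non-stratified variant is shown):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)   # placeholder labels
y_score = rng.random(500)               # placeholder model scores

def percentile_bootstrap(metric, y, s, n_boot=2000, alpha=0.05):
    """Plain (non-stratified) test-set bootstrap of a metric."""
    stats = []
    n = len(y)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample the test set with replacement
        if len(np.unique(y[idx])) < 2:          # skip degenerate resamples with one class
            continue
        stats.append(metric(y[idx], s[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

print(percentile_bootstrap(roc_auc_score, y_true, y_score))           # CI for AUROC
print(percentile_bootstrap(average_precision_score, y_true, y_score)) # CI for AUPR
```
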
0 votes
1 answer
35 views

Time-dependent area under the precision-recall curve

How do I compare time-dependent precision-recall (PR) curve values, analogous to time-dependent receiver operating characteristic (ROC) values, for two Cox regression models at multiple time points? To compare two time-dependent AUC values, I would ...
obruzzi • 101
0 votes
0 answers
19 views

Recursive Random Search and Categorical Cost Functions

I'm currently working on a project that involves optimizing the default Spark-submit configurations to minimize execution time. I've developed two models to aid in this process: Binary Classification ...
Hijaw • 175
0 votes
0 answers
56 views

Average precision vs Average recall in object detection

There are two popular metrics for object detection: average precision (AP) and average recall (AR). Can you explain, with examples, in which cases to use AP and in which cases to use AR? I agree that ...
Ars ML • 31
5 votes
3 answers
566 views

Judging a model through the TP, TN, FP, and FN values

I am evaluating a model that predicts the existence or non-existence of a "characteristic" (for example, "there is a dog in this image") using several datasets. The system outputs ...
KansaiRobot
1 vote
0 answers
38 views

How to estimate precision and recall without taking a huge random sample when the positive class is relatively rare

I have a binary text classification model, and I would like to test how well it works, in terms of precision and recall, on a new dataset of 2 million text documents that have not been annotated yet. ...
Alex • 467
2 votes
1 answer
130 views

We have sensitivity-specificity space (ROC curves) and precision-recall space (PR curves, $F_1$ score). What work has been done with PPV-NPV space?

Receiver-operator characteristic (ROC) curves display the balance between sensitivity and specificity: how good you are at detecting category $1$ (sensitivity) while not falsely identifying category $...
Dave • 67.2k
2 votes
0 answers
29 views

Re-calculate accuracy, precision and recall after treatment effect in a model

I am working on a churn-prediction model where the goal is to detect players who have a high chance of churning from the site and to send those players an offer to keep them on the site. In the initial training ...
ELTono • 21
2 votes
3 answers
234 views

Is relying on just the confusion matrix for highly imbalanced test sets to evaluate model performance a bad idea?

I have a binary classification model with a test set that is highly skewed: the majority class 0 is 22 times larger than the minority class 1. This causes my precision to be low and recall to be high,...
statsnoob
1 vote
1 answer
42 views

Is it possible to estimate the number of positives from precision and recall values?

Let's say I have a binary predictor, and its performance in precision and recall is known from a previous study. Now, we apply the predictor to a new (unknown) dataset with 1000 samples, and got ...
ysakamoto • 113
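
For the question above: if precision and recall are assumed to carry over unchanged to the new data, a back-of-the-envelope estimate follows from the number of samples the predictor flags as positive (the values below are made up):

```python
precision, recall = 0.80, 0.60   # assumed to transfer from the previous study
k_pred = 150                     # hypothetical count of predicted positives in the 1000 new samples

tp_est = precision * k_pred      # expected true positives among the flagged samples
pos_est = tp_est / recall        # recall = TP / (all positives)  =>  positives ~ TP / recall
print(round(pos_est))            # estimated number of positives in the new dataset
```
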
1 vote
1 answer
127 views

Choosing the correct evaluation metric between F1-score and Area under the Precision-Recall Curve (AUPRC)

We're currently working on detecting specific objects (e.g. poultry farms, hospitals) from satellite images. We've modeled the problem as a binary image classification task (i.e. classifying images ...
meraxes • 739
3 votes
2 answers
467 views

Is F-score the same as accuracy when there are only two classes of equal size?

The title says it all: Is F-score the same as accuracy when there are only two classes of equal sizes? For my specific case, I have measurements of a group of people under two different situations and ...
user1596274
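
A quick numeric counterexample for the equal-class-size question above (the confusion-matrix counts are invented): even with 50 positives and 50 negatives, accuracy and F1 differ, because accuracy rewards true negatives while F1 ignores them.

```python
# Balanced classes: 50 positives (tp + fn), 50 negatives (tn + fp).
tp, fn = 40, 10
tn, fp = 30, 20

accuracy = (tp + tn) / (tp + tn + fp + fn)            # 0.70
precision = tp / (tp + fp)                            # ~0.667
recall = tp / (tp + fn)                               # 0.80
f1 = 2 * precision * recall / (precision + recall)    # ~0.727

print(accuracy, f1)   # not equal, despite equal class sizes
```
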
2 votes
1 answer
68 views

What is the best method to calculate confidence intervals on precision and recall which are not independent?

I have a classification model which classifies user's bank transactions into two categories. From this model I produce precision and recall metrics. I would like to understand the confidence around ...
Barnaby Cooper
2 votes
1 answer
225 views

How to define Precision when we have multiple predictions for each ground truth instance?

In my problem, it is possible to have multiple predictions for a ground-truth instance. How do we define precision in such scenarios? For further clarification, consider the following example. We have 1 ...
Meysam Sadeghi
1 vote
0 answers
23 views

Precision and Recall of a combined classifier

I have two classifiers, trained on the same dataset, each predicting a different variable; let's call these X1 and X2. They have their respective precision and recall measures, X1-p,r and X2-p,r. I ...
Bi Act • 123
0 votes
0 answers
28 views

Why would area under the PR curve include points off of the Pareto front?

(Let's set aside thoughts about whether we should be calculating PR curves or areas under them at all.) A precision-recall curve for a "classification" model can contain points that should not be ...
Dave • 67.2k
2 votes
1 answer
222 views

Comparing probability threshold graphs for F1 score for different models

Below are two plots, side by side, for an imbalanced dataset. We have a very large imbalanced dataset that we are processing/transforming in different ways. After each transformation, we run an ...
Ashok K Harnal
0 votes
0 answers
29 views

How to calculate AUC for a P-R curve with unusual starting point

I am working with a binary classifier that is outputting scores between 0 and 1, indicating probabilities of class membership, according to the model. I produced a P-R curve and the first point (i.e., ...
CopyOfA • 187
8 votes
1 answer
603 views

Lack of rigor when describing prediction metrics

I constantly see metrics that measure the quality of a classifier's predictions, such as TPR, FPR, Precision, etc., being described as probabilities (see Wikipedia, for example). As far as I know, ...
synack • 371
0 votes
0 answers
22 views

Can a log plot of the PR curve be of any use?

I was doing some tests regarding my PR curves for 2 different models (first image), and I got the idea of plotting the log of those curves (second image) to see if there were any insights that I could ...
GabrielPast
1 vote
1 answer
152 views

Why does my PR Curve look like this?

These are my recall and precision stats for the model I built. The curve does not look good where recall is 0; I am not sure why there are so many points there. Can anyone help and explain why the curve ...
ibarbo • 65
1 vote
1 answer
295 views

Should we use train, validation, or test data when creating PR/AUC curves to optimize the decision threshold?

It makes sense to me that we can use the ROC-AUC and PR-AP scores of the validation sets during CV to tune our model hyperparameter selection. And when reporting the model's final performance, it makes ...
another_student
3 votes
2 answers
198 views

Calculate area under precision-recall curve from area under ROC curve and the prevalence

I am reading material that reports the area under a ROC curve. I am curious to know what the performance would be in precision-recall space. From the sensitivity and specificity values in the ROC ...
Dave • 67.2k
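
On the conversion question above: the AUROC alone does not determine the PR curve, but given the full set of (FPR, TPR) points and the prevalence $\pi$, each ROC point maps to a PR point via recall $=$ TPR and precision $= \mathrm{TPR}\,\pi / (\mathrm{TPR}\,\pi + \mathrm{FPR}\,(1-\pi))$. A sketch with placeholder values:

```python
import numpy as np

# Placeholder ROC points and prevalence; substitute the real curve and class prior.
fpr = np.array([0.0, 0.1, 0.3, 0.6, 1.0])
tpr = np.array([0.0, 0.5, 0.8, 0.95, 1.0])
prevalence = 0.2

recall = tpr
with np.errstate(divide="ignore", invalid="ignore"):
    precision = tpr * prevalence / (tpr * prevalence + fpr * (1 - prevalence))
precision[0] = 1.0   # the TPR = FPR = 0 corner is 0/0; set by the usual convention

aupr = np.trapz(precision, recall)   # trapezoidal area under the mapped PR points
print(aupr)
```
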
0 votes
0 answers
208 views

Is Hyperparameter Tuning for Maximized Recall a Bad Thing?

I have a somewhat theoretical question: I work in an area that requires a number of anomaly detection solutions. When we approach these problems, we cross-validate and for each fold, we oversample ...
Branden Keck
2 votes
1 answer
340 views

Poor balanced accuracy and minority recall but perfect calibration of probabilities? Imbalanced dataset

I have a dataset with a class imbalance in favour of the positive class (85% occurrence). I'm getting a fantastically calibrated probability profile, but balanced accuracy is 0.65 and minority recall ...
Kat • 21
1 vote
0 answers
33 views

How to choose k for MAP@K?

Scenario: We want to evaluate our recommender system, which recommends items to potential customers when visiting a product detail page. Here are actual relevant items: ...
etang • 1,027
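
For the MAP@K question above, one common convention (implementations differ, so this is a sketch rather than the definitive formula): for each user, sum precision@i over the ranks i ≤ K that hit a relevant item, divide by min(K, number of relevant items), and then average over users.

```python
def average_precision_at_k(recommended, relevant, k):
    """AP@K for one user under one common convention; `recommended` is ranked."""
    relevant = set(relevant)
    hits, score = 0, 0.0
    for i, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / i          # precision@i at each hit
    denom = min(k, len(relevant))
    return score / denom if denom else 0.0

def map_at_k(all_recommended, all_relevant, k):
    return sum(average_precision_at_k(r, t, k)
               for r, t in zip(all_recommended, all_relevant)) / len(all_relevant)

# Hypothetical example: 2 users, K = 3.
print(map_at_k([["a", "b", "c"], ["x", "y", "z"]],
               [["a", "c"], ["y"]], k=3))
```
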
2 votes
1 answer
668 views

Sudden drop to zero for precision recall curve

I am training a neural network classifier with 250k training samples and 54k validation samples. The output activation is sigmoid. I noticed a sudden drop in the precision for the very top probability ...
Florian • 21
1 vote
0 answers
56 views

Confidence intervals for Object Detection metrics

I would like to come back to the question "When do we require to calculate the confidence Interval?", since a reviewer recently asked me to provide confidence intervals for metrics regarding my work ...
rok • 111
1 vote
1 answer
522 views

Why use average_precision_score from sklearn? [duplicate]

I have precision and recall values and want to measure an estimator's performance: ...
Ars ML • 31
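
On the average_precision_score question above: sklearn's average precision is the step-wise sum $\sum_n (R_n - R_{n-1}) P_n$ over threshold changes, which is generally not the same number as the trapezoidal area under the PR points. A small comparison on placeholder data:

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve, auc

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)   # placeholder labels
y_score = rng.random(200)               # placeholder scores

ap = average_precision_score(y_true, y_score)            # step-wise summation
precision, recall, _ = precision_recall_curve(y_true, y_score)
trapezoid = auc(recall, precision)                       # linear interpolation between points

print(ap, trapezoid)   # close, but generally not identical; AP avoids optimistic interpolation
```
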
5 votes
2 answers
713 views

Understanding Precision Recall in business context

So, I know this is yet another precision/recall question of the kind that has been asked umpteen times now. I wanted to ask some specific business-related questions. Imagine you are building a classifier to predict ...
Baktaawar • 1,115
2 votes
1 answer
118 views

Binary classification metrics - Combining sensitivity and specificity?

The harmonic mean of precision and recall (the F1 score) is a common metric for evaluating binary classification. It is useful because it strikes a balance between precision (penalizing FP) and recall (penalizing FN). For ...
usual me • 1,257
1 vote
0 answers
29 views

How to start GNN optimization to get higher precision?

I'm developing a GNN for missing-link prediction following this blog post for the PyG library. I'm using almost the same GNN with a different dataset. Although my dataset is similar to the MovieLens ...
James • 135
1 vote
2 answers
298 views

Are there any difference using scores or probabilities for roc_auc_score and precision_recall_curve functions?

I'm working with a GNN model for link prediction and using precision_recall_curve and roc_auc_score from the ...
James • 135
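
On the scores-vs-probabilities question above: both roc_auc_score and precision_recall_curve depend only on the ranking of the scores, so any strictly monotone transform (e.g. a sigmoid mapping raw scores to probabilities) leaves the results unchanged. A minimal check on placeholder data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, precision_recall_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)        # placeholder labels
raw_scores = rng.normal(size=200)            # placeholder raw link-prediction scores
probs = 1.0 / (1.0 + np.exp(-raw_scores))    # strictly monotone transform (sigmoid)

print(roc_auc_score(y_true, raw_scores), roc_auc_score(y_true, probs))   # identical

p1, r1, _ = precision_recall_curve(y_true, raw_scores)
p2, r2, _ = precision_recall_curve(y_true, probs)
print(np.allclose(p1, p2) and np.allclose(r1, r2))                       # True: same curve
```
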
3 votes
1 answer
47 views

Precision and recall reported in classification model

I have one question about the evaluation metrics of classification models. I see many people report precision and recall values for their classification models. Do they choose a threshold to ...
Salty Gold Fish
1 vote
0 answers
389 views

Interpretation of area under the precision-recall curve

The area under the receiver-operator characteristic curve has an interpretation of how well the predictions of two categories are separated. This post gives the area under the precision-recall curve as ...
Dave • 67.2k
2 votes
0 answers
40 views

Is there a way to affect the shape of the precision-recall curve?

As far as I know, for both ROC and PR curves, the classifier performance is usually measured by the AUC. This might indicate that classifiers with equivalent performance might have different ROC/PR ...
Gideon Kogan
1 vote
1 answer
49 views

combine specificity and

I am performing classification on an imbalanced dataset (70% negatives). If a prediction is negative I take a specific action, otherwise the opposite one. As both cases imply some costs, I want ...
shamalaia • 295
1 vote
4 answers
239 views

Why don't we use the harmonic mean of sensitivity and specificity?

There is this question on the F1 score, asking why we compute the harmonic mean of precision and recall rather than their arithmetic mean. There were good arguments in the answers in favor of the ...
user209974
4 votes
1 answer
518 views

When to use a ROC Curve vs. a Precision Recall Curve?

Looking for the circumstances in which we should use a ROC curve vs. a precision-recall curve. Example of answers I am looking for: Use a ROC curve when: you have a balanced or imbalanced dataset (...
Katsu • 1,021
3 votes
1 answer
696 views

ROC AUC has $0.5$ as random performance. Does PR AUC have a similar notion?

In considering ROC AUC, there is a sense in which $0.5$ is the performance of a random model. Conveniently, this is true, no matter the data or the prior probability of class membership; the ROC AUC ...
Dave • 67.2k
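
On the baseline question above: the PR-space analogue of ROC AUC's 0.5 is the positive-class prevalence, since a classifier that scores at random has expected precision equal to the prevalence at every recall level; unlike 0.5, this baseline therefore changes from dataset to dataset. A quick empirical check on synthetic data:

```python
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
y_true = (rng.random(100_000) < 0.1).astype(int)   # ~10% positives
random_scores = rng.random(100_000)                # uninformative scores

print(y_true.mean())                                   # prevalence, about 0.10
print(average_precision_score(y_true, random_scores))  # also about 0.10, in expectation
```
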
11 votes
8 answers
9k views

My machine learning model has precision of 30%. Can this model be useful?

I've encountered an interesting discussion at work on the interpretation of precision (from the confusion matrix) within a machine learning model. The interpretation of precision is where there is a difference of ...
wmmwmm • 121
0 votes
0 answers
19 views

Flipping inputs in multilabel classification

I have framed a classification problem as follows: I have $N$ items, and wish to predict a set of relevant tags for each out of $M$ tags. An item can have anywhere from 0 to $M$ applicable tags. To ...
John • 1
1 vote
2 answers
392 views

Confidence interval for Accuracy, Precision and Recall

Classification accuracy or classification error is a proportion or a ratio. It describes the proportion of correct or incorrect predictions made by the model. Each prediction is a binary decision that ...
dokondr • 287
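
For the confidence-interval question above, a common first pass treats each metric as a binomial proportion over its own denominator (all predictions for accuracy, predicted positives for precision, actual positives for recall), e.g. with Wilson intervals; note that precision's denominator is itself random, so a bootstrap over the test set is the more careful alternative. A sketch with invented counts:

```python
from statsmodels.stats.proportion import proportion_confint

# Invented confusion-matrix counts, purely for illustration.
tp, fp, fn, tn = 80, 20, 40, 860

acc_ci = proportion_confint(tp + tn, tp + tn + fp + fn, alpha=0.05, method="wilson")
prec_ci = proportion_confint(tp, tp + fp, alpha=0.05, method="wilson")
rec_ci = proportion_confint(tp, tp + fn, alpha=0.05, method="wilson")

print(acc_ci, prec_ci, rec_ci)
```
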
1 vote
0 answers
80 views

A better linear model has less precision (relative to the worse model) at a larger threshold

I trained two models using the same algorithm - logistic regression (LogisticRegression(max_iter=180, C=1.05) for ~27 features and ~330K observations). I used the ...
konstantin_doncov
0 votes
0 answers
100 views

Is weighting still needed when using undersampling?

I have a model that I want to test with my existing data to calculate precision, recall, etc. The data is actually an unbalanced dataset: Class A 70% / Class B 30%. I created a dataset by undersampling ...
Kar781Lopsdsds
1 vote
0 answers
107 views

Accounting for overrepresentation of positives in binary classification test set for calculation of precision and recall

I have a binary classification task with highly imbalanced data, since the class to be detected (in the following referred to as the positives) is very rare. For data limitation reasons my test set ...
user15774062
0 votes
0 answers
28 views

Weighted precision & recall - With class weights vs oversampling

If weighted precision is calculated as ...
Kaushik J • 101
0 votes
0 answers
416 views

Comparing AUC-PR between groups with different baselines

So I know that the area under the precision-recall curve is often a more useful metric than AUROC when dealing with highly imbalanced datasets. However, while AUROC can easily be used to compare ...
Eike P. • 3,098
0 votes
0 answers
43 views

Does the Precision-Recall AUC approach the ROC AUC as the data becomes balanced?

I am working on a Machine Learning classifier. It is a binary response and most predictor variables are categorical. I have several years of data and for some years, the response is imbalanced (more ...
wisamb • 161
1 vote
0 answers
61 views

F1 Score vs PR Curve

If I understood correctly, the PR curve is just the mean of the F1 score computed multiple times with different thresholds. In the task of outlier detection, these are two suggested metrics, given the fact ...
Loris • 23
