
All Questions

1 vote · 2 answers · 1k views

High Precision and High Recall issue - Random Forest Classification

I am building a classification model with the Random Forest technique, using GridSearchCV. The target variable is binary, where class 1 makes up 7.5% of the total population. I have used several values in the GridSearch ...
asked by totalsurfer_v1
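For a target this imbalanced (7.5% positives), plain accuracy is a poor grid-search criterion. A minimal sketch of one common approach, on synthetic data from `make_classification` (not the asker's dataset), scoring the search on average precision instead:

```python
# Hypothetical sketch: grid search for a Random Forest on an imbalanced
# binary target (~7.5% positives), scored on average precision, since
# accuracy is dominated by the majority class at this ratio.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the asker's data: ~7.5% positives.
X, y = make_classification(n_samples=2000, weights=[0.925, 0.075], random_state=0)

grid = GridSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [5, None]},
    scoring="average_precision",  # PR-based metric instead of accuracy
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

`class_weight="balanced"` and the parameter grid here are illustrative choices, not the asker's settings.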
0 votes · 0 answers · 65 views

Low model performance on imbalanced data - is there any hope of improving the metrics?

I am working with imbalanced data: 70k samples of class 0 and 1k of class 1, with 12 features. I would like to perform classification to choose the important features. So far, I have done under-sampling, over-sampling, ...
asked by ricecooker
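One of the techniques mentioned, under-sampling the majority class, can be done with plain sklearn before ranking features by importance. A sketch under assumed synthetic data at roughly the same 70:1 ratio (the real dataset, features, and model settings are unknown):

```python
# Hypothetical sketch: random under-sampling of the majority class via
# sklearn.utils.resample, then Random Forest feature importances as a
# simple feature-ranking signal.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils import resample

# Synthetic stand-in: ~70:1 class ratio, 12 features.
X, y = make_classification(n_samples=7100, n_features=12, weights=[70 / 71],
                           random_state=0)
maj, mino = X[y == 0], X[y == 1]
maj_down = resample(maj, n_samples=len(mino), replace=False, random_state=0)
X_bal = np.vstack([maj_down, mino])              # balanced training set
y_bal = np.array([0] * len(maj_down) + [1] * len(mino))

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_bal, y_bal)
ranked = np.argsort(rf.feature_importances_)[::-1]
print("features ranked by importance:", ranked)
```

Under-sampling discards majority-class information, so this is usually compared against `class_weight="balanced"` or over-sampling before committing to it.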
1 vote · 1 answer · 861 views

k-fold cross-validation results much better than on unseen data

This is my first "real" project and I am not understanding a certain behaviour. My dataset spans from 2017 up to today. What I did was clean the data, get rid of missing values, etc. There are mixed ...
asked by tuxmania
3 votes · 1 answer · 1k views

Why does a class weight fraction improve precision compared to undersampling approach where precision drops?

I have imbalanced data where the ratio of negative to positive samples is 1:3 (positive samples are 3 times more numerous than negative). For my case it is important to have a higher precision (...
asked by David
1 vote · 1 answer · 1k views

Is it possible to have recall and precision of 0 while having an area under PR ~0.5?

As the title suggests, I am running a Random Forest classifier using Scala. To evaluate this classifier (and since I am handling highly imbalanced classes), I used the ...
asked by Toutsos
1 vote · 0 answers · 104 views

Poor P-R curve for binary classifier trained on balanced data, with imbalanced test data

I have a very imbalanced dataset (9:1), for which I have performed under-sampling and achieved a balanced training set (~130k samples total post balancing). I am performing classification using ...
asked by Anakimi
0 votes · 0 answers · 98 views

Average Precision or FBeta & Decision Threshold Tuning for Binary Classifier [duplicate]

I'm working with an imbalanced binary classifier data set (3% positive) in sklearn. The cost of a false negative is extremely high so recall is much more important than precision. To baseline my ...
asked by Nahyyz
1 vote · 2 answers · 294 views

Classification: Random Forest vs. Decision tree

Suppose you are given a dataset with 4 attributes (F1, F2, F3, and F4). The class label is contained in attribute F4. Now you build a random forest classification model and you test its performance ...
asked by kalyani Bethi
0 votes · 0 answers · 106 views

Why does precision go down as the testing sample increases?

I trained a random forest machine learning model on 100k records (out of 700k overall), with 297 features and 100 trees ...
asked by vinaykva
1 vote · 0 answers · 2k views

ROC curves from cross-validation are identical/overlaid and AUC is the same for each fold

UPDATE (confidence intervals): I have an imbalanced dataset with around 200k instances and 50 predictors. The imbalance is a 4:1 ratio in favour of the negative class (i.e. class 0). In other words, the negative ...
asked by RMS
2 votes · 0 answers · 4k views

What is the difference between oob (out of bag) error and (1 - accuracy) in RandomForest?

In a Random Forest, I know that the out-of-bag error is described as the fraction of incorrect classifications over the number of out-of-bag samples. Accuracy is defined as the number of correct ...
asked by makansij
14 votes · 1 answer · 21k views

How to reduce number of false positives?

I'm trying to solve a task called pedestrian detection, and I train a binary classifier on two categories: positives - people, negatives - background. I have a dataset: number of positives = 3752, number of ...
asked by mrgloom
10 votes · 1 answer · 13k views

Classification threshold in RandomForest (sklearn)

1) How can I change the classification threshold (I think it is 0.5 by default) in RandomForest in sklearn? 2) How can I under-sample in sklearn? 3) I have the following result from RandomForest ...
asked by Big Data Lover
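On point 1: sklearn's `RandomForestClassifier` has no threshold parameter; for a binary problem, `predict()` amounts to a 0.5 cut-off on `predict_proba`. A custom threshold is applied manually to the probabilities. A sketch on synthetic data (training and predicting on the same array purely for illustration):

```python
# Hypothetical sketch: applying a custom decision threshold to a Random
# Forest by thresholding predict_proba instead of calling predict().
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
rf = RandomForestClassifier(random_state=0).fit(X, y)

proba = rf.predict_proba(X)[:, 1]           # probability of the positive class
y_pred_030 = (proba >= 0.30).astype(int)    # lower threshold -> more positives flagged
y_pred_050 = (proba >= 0.50).astype(int)    # roughly what predict() does for binary
print(y_pred_030.sum(), ">=", y_pred_050.sum())
```

Lowering the threshold trades precision for recall, which is why it is often paired with the under-sampling asked about in point 2.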