
I know this is yet another precision/recall question of the kind that has been asked umpteen times.

I wanted to ask some specific business related questions.

Imagine you are building a classifier to predict which patients diagnosed with a certain disease are most likely to become eligible for treatment in the next 3 months.

You build a binary classifier.

Here are the numbers the model gets in production:

Precision : 0.0032 or 0.32% 
Recall : 50% or 0.5 

Same stats during development: 

Precision: 24% or 0.24 
Recall: 68% or 0.68

My Question:

  • How good is this model?

From what I understand, if a model has precision this close to zero, it is producing almost nothing but false positives: it is claiming that a lot of patients will become eligible for treatment when they won't.

But does the recall of 50% mean anything? I believe that since FP is so high, any hit rate is mostly due to chance, because the model is calling so many patients positive.

So is it right to say that this model is no good, and that the recall of 50% doesn't actually mean the model correctly picked out 50% of the positives through any real skill?

Am I thinking on the right track, or what am I missing?

I believe the data is also highly imbalanced.

Scenario where the above can happen:

Suppose there is a patient pool of 10k. The model predicts 1k as positive (eligible for treatment). The total number of actually eligible patients is 6, and the model catches 3 of them, so TP = 3, FN = 3, and FP = 997. Then Precision = $3/1000 = 0.003$, while Recall = $3/(3+3) = 0.5$.
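
For concreteness, here is a minimal Python sketch that reproduces this arithmetic (all counts are the hypothetical ones above, not real production data):

    # Hypothetical scenario: 10k patients, 1k flagged positive, 6 truly eligible.
    tp = 3             # truly eligible patients the model flagged
    fn = 3             # truly eligible patients the model missed
    flagged = 1000     # total patients the model called positive
    fp = flagged - tp  # 997 flagged patients who are not actually eligible

    precision = tp / (tp + fp)  # 3 / 1000 = 0.003
    recall = tp / (tp + fn)     # 3 / 6    = 0.5
    print(f"precision = {precision:.4f}, recall = {recall:.2f}")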
  • You have 6 positive cases and 9,994 negatives, is that right? That is very unbalanced, and a tricky case. In those cases, I'd recommend changing the question so it is easier to solve, while still answering the business problem.
    – Davidmh
    Commented Jun 22, 2023 at 7:29

2 Answers


Recall is the probability of catching a positive case: given that a case is positive, recall is how likely you are to identify it as positive. Synonyms for recall are sensitivity and true positive rate.

Precision is the probability that a case identified as positive truly is positive. A synonym for precision is the positive predictive value.

Without context, it is difficult to say whether a particular measure of performance is good; it depends on the application's needs. If you need to catch almost every positive case and also want high confidence that your predicted positives really are positive, the numbers you give are not consistent with that standard. Your models seem to miss many positive cases yet “cry wolf” frequently, but it’s really for you to put those numbers in context.

You might think in terms of a smoke detector. High recall means the device catches most of the fires, while low recall means the device misses many fires. High precision means that, when the alarm goes off, you should believe there to be a fire, while low precision means that you might not be so inclined to believe there is a fire. “Ignore it, rings all the time without there being a fire,” you might say.

The Boy Who Cried Wolf
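
To make the “cry wolf” trade-off concrete, here is a small Python sketch (using the hypothetical counts from the question, not any real data) comparing the question's model to a smoke detector that always alarms. Labeling everyone positive drives recall to 100%, but its precision collapses to the prevalence of the condition:

    # Hypothetical counts from the question: 10,000 patients, 6 truly eligible.
    n, positives = 10_000, 6

    # Always-alarm baseline: every case is flagged positive.
    baseline_recall = positives / positives  # 1.0 -- catches everything
    baseline_precision = positives / n       # 0.0006 -- equals the prevalence

    # The question's model: flags 1,000 patients, 3 of them truly eligible.
    model_recall = 3 / 6                     # 0.5
    model_precision = 3 / 1_000              # 0.003

    print(f"always-alarm: recall={baseline_recall:.1f}, precision={baseline_precision:.4f}")
    print(f"model:        recall={model_recall:.1f}, precision={model_precision:.4f}")

An always-alarm detector's precision equals the base rate, so precision above the prevalence is the minimum bar for an alarm to carry any information at all.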

  • Hello, thanks for the response. I understand what recall and precision are. My question is more basic: just because recall = 50%, it shouldn't mean the model is able to catch 50% of the true positives through any skill. If a model has precision near zero, it is calling almost everyone positive, so it ends up catching some true positives by chance through the sheer act of flagging everyone. For example, if I simply declare every patient positive (eligible for treatment), I get a recall of 1, or 100%, but the model didn't do anything there; you just got zero FN because of the huge FP.
    – Baktaawar
    Commented Jun 22, 2023 at 5:57
  • @Baktaawar Go to the definitions. If recall is 50%, it means that the probability of identifying a case as positive, given that it is positive, is 0.5. If you label every case as positive, then, yes, you wind up with a recall of 100%. However, you will wind up with a precision that is poor by having so many false positives. This is why precision and recall often get discussed together (or graphed together in a precision-recall curve).
    – Dave
    Commented Jun 22, 2023 at 12:48
  • That is what I am asking. If my model has 0.3% precision while recall is 50%, that model doesn't mean anything; that is what I am saying.
    – Baktaawar
    Commented Jun 23, 2023 at 2:37
  • @Baktaawar It means that it catches half of the positive cases and that, when a case is identified as positive, there is a $0.3\%$ chance the case truly is positive. That might be poor performance, perhaps unacceptable, sure.
    – Dave
    Commented Jun 23, 2023 at 2:49

Recall = Sensitivity = True Positive Rate (TPR) is given by $$ TPR=\frac{TP}{TP+FN}, $$ which is the proportion of correct diagnoses among all the patients who actually have the disease.

Precision = Positive Predictive Value (PPV) is $$ PPV=\frac{TP}{TP+FP}, $$ which is the proportion of correct diagnoses among all the patients diagnosed with the disease.

A recall of 0.5 means that we capture only half of the patients who are actually affected by the disease. Low precision means that there are lots of false positives, i.e., we would be treating lots of people who are not really sick.

In practice, at the development stage there is usually a threshold that controls whether the value of the biomarker (or another disease indicator) for a given patient is classed as positive or negative. Every threshold value yields a corresponding (TPR, PPV) pair, which traces out the precision-recall curve, and the area under this curve (AUPR) can be used to estimate the quality of the model. (Although its properties are not as good as those of the ROC curve and AUC, the precision-recall curve is sometimes preferable.) This allows balancing the cost of the treatment against the risk of not treating sick people (image source):

[Figure: example precision-recall curve]
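
For readers who want to trace such a curve themselves, here is an illustrative sketch using scikit-learn's precision_recall_curve; the dataset, model, and class balance are synthetic assumptions, not anything from this question:

    # Illustrative only: synthetic imbalanced data, not the figure's source.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import average_precision_score, precision_recall_curve
    from sklearn.model_selection import train_test_split

    # ~1% positives, loosely mimicking a rare-condition setting.
    X, y = make_classification(n_samples=10_000, weights=[0.99, 0.01],
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    scores = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

    # One (precision, recall) point per threshold; AUPR summarizes the curve.
    precision, recall, thresholds = precision_recall_curve(y_te, scores)
    print(f"AUPR (average precision) = {average_precision_score(y_te, scores):.3f}")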

It seems that in this case the model was already quite bad at the development stage. However, it is much worse in production, which suggests that the model is really bad (the biomarker is not actually indicative of the disease).

  • Hello, thanks for the response. I understand what recall and precision are. My question is more basic: just because recall = 50%, it shouldn't mean the model is able to catch 50% of the true positives through any skill. If a model has precision near zero, it is calling almost everyone positive, so it ends up catching some true positives by chance through the sheer act of flagging everyone. For example, if I simply declare every patient positive (eligible for treatment), I get a recall of 1, or 100%, but the model didn't do anything there; you just got zero FN because of the huge FP.
    – Baktaawar
    Commented Jun 22, 2023 at 5:57
  • @Baktaawar Yes, this is correct. Precision near 0% basically means that most people you treat are not really sick (in your case you would be treating roughly 300 healthy people for every sick person). However, in your case recall is 50%, so you do identify a sick person with probability 50%. Hypothetically, if we are speaking about a lethal condition, treating 300 people who are not sick might still turn out to be less expensive than missing one sick person; it depends on how the cost of treatment compares with the cost of funerals, malpractice claims, etc.
    – Roger V.
    Commented Jun 22, 2023 at 10:51
  • Well, think of an oncology setting. Missing one patient who could be eligible for a drug is a few hundred thousand dollars of revenue loss for the company, and possibly a few years of lost life extension for the patient. So I understand the cost-benefit side. My question is that any cost-benefit argument still can't be attributed to such a model when its precision is near zero. I am not asking which option is more expensive for the business; I am simply saying this model doesn't tell us anything because the model is useless. Is that correct or not?
    – Baktaawar
    Commented Jun 23, 2023 at 2:40
  • @Baktaawar Let me give you two real-world examples where we test the totality of a (sub)population. 1) Anyone reaching a certain age (e.g., 50) might be tested for cancer and cardiac disease with a cheap test; those who are positive are then sent for more precise (and more expensive) testing. 2) A few years ago it was common to measure the temperature of anyone arriving on a plane from China, in the hope of weeding out those infected with COVID and delaying the pandemic. This is a very low-precision test. By the way, in either case recall might be less than 100%; we fail to detect some cases.
    – Roger V.
    Commented Jun 23, 2023 at 5:34
  • I am not sure how that example helps here. If a model has 0.3% precision, then forget the cost-benefit analysis: a high recall gives the sense that you are optimizing against false negatives, which you don't want in your business setting, but that "optimization" isn't because of the model if the model's precision is near zero. It's just luck, because in the end the model is basically calling everyone eligible for treatment and raising a ton of false alarms. Even if the false alarms themselves aren't bad for the business from a cost perspective, the recall is also not due to the model.
    – Baktaawar
    Commented Jun 25, 2023 at 1:20
