I constantly see metrics that measure the quality of a classifier's predictions, such as TPR, FPR, and Precision, described as probabilities (see Wikipedia, for example).
As far as I know, these are not probabilities but empirical measurements of a classifier's predictive behavior, estimated on a finite test set: for instance, they measure the observed rates of True Positives, False Positives, and so on.
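To make that concrete, here is a minimal sketch (the function name and counts are my own illustration, not from any particular library) showing that these metrics are just ratios of counts observed on a test set:

```python
def empirical_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Observed rates computed from confusion-matrix counts on a finite test set."""
    return {
        "TPR": tp / (tp + fn),        # observed true-positive rate (recall)
        "FPR": fp / (fp + tn),        # observed false-positive rate
        "Precision": tp / (tp + fp),  # observed precision
    }

print(empirical_rates(tp=80, fp=10, tn=90, fn=20))
# {'TPR': 0.8, 'FPR': 0.1, 'Precision': 0.888...}
```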
Wouldn't it be more correct to distinguish these measurements from the actual probabilities? That is, to say that, by the Law of Large Numbers, these measurements converge to the underlying probabilities; for instance, Precision (the measurement) converges to the Bayesian detection rate (the actual probability, $P(Y = 1 \mid \hat{Y} = 1)$).
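A quick simulation sketch of that convergence (all parameter values are illustrative assumptions): as the test set grows, the observed Precision approaches the probability $P(Y = 1 \mid \hat{Y} = 1)$ given by Bayes' rule.

```python
import random

random.seed(0)

prevalence  = 0.1   # P(Y = 1), assumed
sensitivity = 0.9   # P(Yhat = 1 | Y = 1), assumed
specificity = 0.8   # P(Yhat = 0 | Y = 0), assumed

# The actual probability P(Y = 1 | Yhat = 1) from Bayes' rule
# (the "Bayesian detection rate"):
true_precision = (prevalence * sensitivity) / (
    prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
)

for n in (100, 10_000, 1_000_000):
    tp = fp = 0
    for _ in range(n):
        y = random.random() < prevalence
        # Positive prediction with prob. sensitivity if y, else 1 - specificity
        yhat = (random.random() < sensitivity) if y else (random.random() >= specificity)
        if yhat:
            tp += y
            fp += not y
    print(f"n={n:>9}: observed Precision = {tp / (tp + fp):.4f}")

print(f"true P(Y=1 | Yhat=1)  = {true_precision:.4f}")
```

On my understanding, the observed Precision printed for each $n$ fluctuates around the true value ($\approx 0.3333$ with these assumed parameters) and settles onto it as $n$ grows, which is exactly the measurement-versus-probability distinction I am asking about.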