I have a model that I want to test against my existing data to calculate precision, recall, etc. The actual data is unbalanced: Class A 70% / Class B 30%. I created a data set by undersampling class A, so that both classes are equally represented: Class A 50% / Class B 50%. When calculating the evaluation metrics, do I have to weight the results? That is, should a false positive carry a higher weight because of the imbalance in the actual population?
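To make the question concrete, here is a minimal sketch of the kind of weighting I have in mind. It assumes class B is the positive class (so a false positive is a class A sample predicted as B), that the undersampling of class A was random, and it uses made-up confusion-matrix counts; the function name `weighted_precision_recall` is only illustrative.

```python
def weighted_precision_recall(tp, fp, fn, neg_weight):
    """Precision and recall for positive class B, with each class A
    (negative) error scaled by neg_weight."""
    fp_weighted = fp * neg_weight          # false positives come from class A
    precision = tp / (tp + fp_weighted)
    recall = tp / (tp + fn)                # only uses class B samples
    return precision, recall


# Hypothetical counts on the balanced 50/50 test set (not real results).
tp, fn = 800, 200   # class B samples: correctly / incorrectly classified
fp = 100            # class A samples wrongly predicted as B

# Population odds A:B are 0.7/0.3, sample odds are 0.5/0.5, so each
# retained class A sample stands in for 7/3 samples of the population.
neg_weight = (0.7 / 0.3) / (0.5 / 0.5)

balanced = weighted_precision_recall(tp, fp, fn, neg_weight=1.0)
reweighted = weighted_precision_recall(tp, fp, fn, neg_weight=neg_weight)

print("unweighted precision, recall:", balanced)
print("reweighted precision, recall:", reweighted)
```

In this sketch recall is unchanged by the weighting, because it depends only on class B samples, while precision drops once the false positives are scaled back up to the 70/30 population.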
- In short, your metrics are not appropriate for your data; change the metrics, not the data. – user2974951, commented Jan 25, 2023 at 7:24
- Please take some time to read what's already written about these topics on Cross Validated. Good places to start: "When is unbalanced data really a problem in Machine Learning?" and "Why is accuracy not the best measure for assessing classification models?". – dipetkov, commented Jan 25, 2023 at 8:05