$\begingroup$

I have data sets of symptoms and diseases, from which I estimate conditional distributions such as P(Disease X | Symptom A, Symptom H, Age > 20). I use these for classification and diagnosis.

Now a domain expert says that the data do not reflect reality: Disease X does not really occur that often with Symptom A. Or the expert says that the combination of Symptoms H and A can also lead to Disease Y, which was never observed in the data.

What is the modern approach to combining this new knowledge from domain experts with the data, either to "tune" the classifiers or to augment the original data with the expert inputs? I want to avoid a purely rule-based system, which would not help the model generalize.

$\endgroup$

1 Answer

$\begingroup$

The short answer to your question is Bayesian modelling.

Beta priors and Dirichlet priors are good places to start when you want to combine statistics computed from data with expert knowledge of the expected distributions. Bayesian modelling is a whole subfield of statistics in itself.
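As a minimal sketch of the idea, assuming the expert's belief can be written as pseudo-counts per disease, a Dirichlet prior can be combined with observed counts by simple addition. The class names, the counts, and the helper `dirichlet_posterior_mean` are all hypothetical and only illustrate the mechanics; note how Disease Y, never seen in the data, still gets non-zero probability.

```python
def dirichlet_posterior_mean(prior_pseudo_counts, observed_counts):
    """Posterior mean of a Dirichlet-multinomial model.

    For each class k: (alpha_k + n_k) / sum_j (alpha_j + n_j),
    where alpha_k is the prior pseudo-count and n_k the observed count.
    """
    posterior = {
        k: alpha + observed_counts.get(k, 0)
        for k, alpha in prior_pseudo_counts.items()
    }
    total = sum(posterior.values())
    return {k: v / total for k, v in posterior.items()}


# Hypothetical counts of diagnoses among patients with Symptoms A and H.
# Disease Y was never observed in the data.
observed = {"X": 45, "other": 5}

# Hypothetical expert belief, expressed as pseudo-counts (assumption):
# the expert thinks Y is plausible and X is over-represented in the data.
expert_prior = {"X": 2.0, "Y": 6.0, "other": 2.0}

posterior = dirichlet_posterior_mean(expert_prior, observed)
for disease, prob in posterior.items():
    print(f"P({disease} | A, H) = {prob:.3f}")
```

The sum of the pseudo-counts controls how strongly the expert's opinion weighs against the data: larger values pull the estimate further toward the prior.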

$\endgroup$
  • $\begingroup$ So does the expert need to choose the prior distribution family, or the exact mean/variance of the distribution? $\endgroup$
    – Latent
    Commented Jul 21, 2020 at 17:36
  • $\begingroup$ The expert provides the prior parameter values. The Beta distribution is for variables/outcomes with two possible values, the Dirichlet for more than two possible values. The choice of the type of distribution itself is given by your application domain. $\endgroup$ Commented Jul 21, 2020 at 17:47
  • $\begingroup$ The expert won't give you the parameter values, only confidence in rules. That is why I thought about augmentation that would change the distribution somehow. $\endgroup$
    – Latent
    Commented Jul 21, 2020 at 18:05
  • $\begingroup$ In the past, I used to work with researchers who elicited such probabilities from physicians through various gaming and estimation techniques. That is where such numbers can be obtained. $\endgroup$ Commented Jul 21, 2020 at 18:17
  • $\begingroup$ Take a look at LC van der Gaag et al. (2002), Probabilities for a Probabilistic Network: A Case-study in Oesophageal Carcinoma: dspace.library.uu.nl/bitstream/handle/1874/2018/… $\endgroup$ Commented Jul 22, 2020 at 7:55
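
To make the exchange in the comments concrete, here is a rough sketch of how an elicited estimate might be turned into Beta prior parameters. It assumes the expert can give a best-guess probability plus a confidence expressed as an "equivalent number of patients"; that parameterization, the helper names, and all numbers are illustrative assumptions, not something taken from the paper cited above.

```python
def beta_prior_from_expert(mean, strength):
    """Convert an expert's best guess and confidence into Beta(alpha, beta).

    mean:     expert's estimated probability, strictly between 0 and 1
    strength: equivalent sample size (assumption: how many observed
              patients the expert's opinion should be worth)
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must be strictly between 0 and 1")
    if strength <= 0:
        raise ValueError("strength must be positive")
    return mean * strength, (1.0 - mean) * strength


def beta_posterior(alpha, beta, successes, failures):
    """Conjugate update: add observed counts to the prior parameters."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical: the expert believes P(Disease X | Symptom A) is about 0.2
# and is as confident as if they had seen 50 patients.
alpha0, beta0 = beta_prior_from_expert(mean=0.2, strength=50)

# Hypothetical data: 30 of 50 patients with Symptom A had Disease X.
alpha1, beta1 = beta_posterior(alpha0, beta0, successes=30, failures=20)

posterior_mean = beta_mean(alpha1, beta1)
print(f"Posterior estimate of P(X | A): {posterior_mean:.2f}")
```

With equal weight on prior and data, the estimate lands midway between the expert's 0.2 and the observed frequency of 0.6. A less confident expert would be given a smaller `strength`, letting the data dominate.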
