Estimating the strength of causal effects from observational data is a common problem in scientific research. A popular approach is based on exploiting observed conditional independences between variables. It is well known that this approach relies on the assumption of faithfulness. In our opinion, an even more important practical limitation of this approach is that it relies on the ability to distinguish independences from (arbitrarily weak) dependences. We present a simple analysis, based on purely algebraic and geometrical arguments, of how the estimation of the causal effect strength, based on conditional independence tests and background knowledge, can have an arbitrarily large error due to the uncontrollable type II error of a single conditional independence test. The scenario we study here is related to the LCD algorithm by Cooper [1] and to the instrumental variable setting that is popular in epidemiology and econometrics. It is one of the simplest settings in which causal discovery and prediction methods based on conditional independences arrive at non-trivial conclusions, yet for which the lack of uniform consistency can result in arbitrarily large prediction errors.
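The failure mode described in the abstract can be illustrated with a minimal numerical sketch. This is not the paper's construction; it assumes a hypothetical linear-Gaussian model with observed variables X1, X2, X3 and a hidden confounder H of X2 and X3, and all coefficient names (`a`, `b`, `c`, `d`, `s3`) are chosen here for illustration. It computes population quantities only, so no statistical test is run: the point is that the partial correlation of X1 and X3 given X2 can be very small (easy to mistake for an independence) while the regression coefficient that an LCD-style argument would report as the causal effect of X2 on X3 differs substantially from the true effect `b`.

```python
import numpy as np

def population_quantities(a, b, c, d, s3):
    """Population statistics for an assumed linear-Gaussian model:
         X1 = e1
         H  = eH                      (hidden confounder)
         X2 = a*X1 + c*H + e2
         X3 = b*X2 + d*H + e3         (true causal effect of X2 on X3 is b)
       with unit noise variances, except Var(e3) = s3**2.
    """
    # Variable order: X1, H, X2, X3. B[i, j] is the coefficient of j in i.
    B = np.zeros((4, 4))
    B[2, 0] = a
    B[2, 1] = c
    B[3, 2] = b
    B[3, 1] = d
    noise_var = np.diag([1.0, 1.0, 1.0, s3 ** 2])

    # X = B X + e  =>  X = (I - B)^{-1} e  =>  Cov = A D A^T
    A = np.linalg.inv(np.eye(4) - B)
    cov = A @ noise_var @ A.T

    # Keep only the observed variables X1, X2, X3.
    obs = cov[np.ix_([0, 2, 3], [0, 2, 3])]

    # Regression coefficient of X3 on X2: the value reported as the causal
    # effect if X1 _||_ X3 | X2 is (wrongly) accepted.
    beta = obs[1, 2] / obs[1, 1]

    # Partial correlation of X1 and X3 given X2 from the precision matrix.
    prec = np.linalg.inv(obs)
    rho_13_2 = -prec[0, 2] / np.sqrt(prec[0, 0] * prec[2, 2])

    corr_12 = obs[0, 1] / np.sqrt(obs[0, 0] * obs[1, 1])
    corr_23 = obs[1, 2] / np.sqrt(obs[1, 1] * obs[2, 2])
    return {"beta": beta, "rho_13_2": rho_13_2,
            "corr_12": corr_12, "corr_23": corr_23}

# Illustrative parameter choice (an assumption): no causal effect at all (b = 0).
q = population_quantities(a=0.1, b=0.0, c=1.0, d=1.0, s3=3.0)
for name, value in q.items():
    print(f"{name}: {value:.4f}")
```

With these assumed parameters the true effect is zero, yet the regression coefficient is about 0.5, while the partial correlation that would reveal the confounding is several times smaller in magnitude than either marginal correlation. Changing `d` or `s3` varies the size of the error and of the partial correlation in this toy model.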
Papers by Joris Mooij