I'm just checking that I understand things correctly. If you want Cohen's d-style effect sizes and you have estimates from a GLM -- say, a logit model -- there is no such thing, right? That's because there is no error variance to use as a reference.
Nonetheless, I came to the startling and upsetting realization that it's easy for users of the emmeans package to think they are computing Cohen's d in such situations, just by following examples in the documentation. For example:
require(emmeans)
## Loading required package: emmeans
neuralgia.glm = glm(formula = Pain ~ Treatment * Sex + Age,
                    family = binomial(), data = neuralgia)
EMM = emmeans(neuralgia.glm, ~ Treatment|Sex)
eff_size(EMM, sigma = sigma(neuralgia.glm), edf = df.residual(neuralgia.glm))
## Sex = F:
##  contrast effect.size   SE  df asymp.LCL asymp.UCL
##  A - B          0.963 1.71 Inf     -2.38     4.306
##  A - P         -2.951 1.38 Inf     -5.66    -0.238
##  B - P         -3.915 1.57 Inf     -7.00    -0.829
##
## Sex = M:
##  contrast effect.size   SE  df asymp.LCL asymp.UCL
##  A - B          0.430 1.14 Inf     -1.80     2.663
##  A - P         -3.705 1.59 Inf     -6.83    -0.584
##  B - P         -4.135 1.69 Inf     -7.44    -0.829
##
## sigma used for effect sizes: 0.9578
## Confidence level used: 0.95
Created on 2023-02-22 with reprex v2.0.2
This "works" because sigma() and df.residual() do return answers, albeit inappropriate ones. sigma(neuralgia.glm) returns the square root of the residual deviance divided by its d.f. -- and that is a measure of model fit, not anything like an appropriate reference for the effects in question.
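As a quick check of that claim, here is a minimal sketch on simulated data. The data frame `dat` and the model `fit` are made up for illustration (they are not the neuralgia example), and only base R's stats functions are assumed. For a full-rank binomial GLM like this one, sigma() should simply reproduce the square root of the residual deviance over its d.f.:

```r
# Simulated binary data; purely illustrative, not the neuralgia data
set.seed(1)
dat <- data.frame(x = rnorm(60),
                  g = gl(3, 20, labels = c("A", "B", "P")))
dat$y <- rbinom(60, size = 1, prob = plogis(-0.5 + 0.8 * dat$x))

fit <- glm(y ~ g + x, family = binomial(), data = dat)

# What sigma() reports for the GLM ...
s_reported <- sigma(fit)

# ... versus the deviance-based fit summary computed by hand
s_by_hand <- sqrt(deviance(fit) / df.residual(fit))

# Compare the two values
all.equal(s_reported, s_by_hand)
```

Nothing in that quantity refers to the spread of a response around its mean on the scale of the contrasts, which is why it makes no sense as the denominator of a Cohen's d.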
Since I'm the emmeans developer, I think I can do something to keep users from doing this; it's just a shock that I have unwittingly enabled the production of meaningless results. And it sickens me to imagine how many times this has already been done.
And in fact I am kind of anti-effect-size in general. I provide the eff_size() function mostly because I figured out a way to incorporate the uncertainty in both the numerator and the denominator of the ratio. I figured that if people would look at those intervals (which they don't, but if they did...), they'd be amazed at how huge a range of Cohen's d values they have, and thus be persuaded (correctly, in my view) to decline to report them.
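To give a feel for how wide that range can be, here is a rough simulation sketch for an ordinary two-group normal model, where a residual SD does exist. It is not the calculation that eff_size() performs; the group size, true difference, true SD, and all variable names are assumptions chosen for illustration. It draws the mean difference and the pooled SD from their sampling distributions and looks at the spread of the resulting d values:

```r
set.seed(42)
n          <- 10           # per-group sample size (assumed)
delta      <- 0.5          # true mean difference (assumed)
sigma_true <- 1            # true residual SD (assumed)
edf        <- 2 * (n - 1)  # error d.f. with two groups

n_sim <- 10000

# Numerator: sampling distribution of the difference in group means
diff_hat <- rnorm(n_sim, mean = delta, sd = sigma_true * sqrt(2 / n))

# Denominator: sampling distribution of the pooled SD
sigma_hat <- sigma_true * sqrt(rchisq(n_sim, df = edf) / edf)

# Estimated Cohen's d in each simulated study
d_hat <- diff_hat / sigma_hat

# Middle 95% of the simulated d values
quantile(d_hat, c(0.025, 0.975))
```

With a true d of 0.5 and ten observations per group, the simulated estimates cover a very wide range, which is the kind of thing the intervals are meant to make visible.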
Please note that I am not asking what the appropriate effect-size measures for a GLM are. I am sure there are some, but that's of little interest to me. I do see, in the answer to another posting on a similar subject, a suggestion that one use the odds ratios as effect-size measures. That can certainly be done:
confint(contrast(EMM, "pairwise", type = "response"))
## Sex = F:
##  contrast odds.ratio     SE  df asymp.LCL asymp.UCL
##  A / B        2.5154 4.1026 Inf  0.055020   115.002
##  A / P        0.0592 0.0782 Inf  0.002680     1.308
##  B / P        0.0235 0.0351 Inf  0.000711     0.779
##
## Sex = M:
##  contrast odds.ratio     SE  df asymp.LCL asymp.UCL
##  A / B        1.5089 1.6461 Inf  0.117026    19.456
##  A / P        0.0288 0.0434 Inf  0.000839     0.987
##  B / P        0.0191 0.0306 Inf  0.000443     0.819
##
## Confidence level used: 0.95
## Conf-level adjustment: tukey method for comparing a family of 3 estimates
## Intervals are back-transformed from the log odds ratio scale
To understand how useful these results are, take a look at those confidence intervals!
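For anyone wondering what "back-transformed from the log odds ratio scale" amounts to, here is a minimal base-R sketch on simulated data. The objects `dat` and `fit` are hypothetical, and the interval is a plain unadjusted Wald interval rather than the Tukey-adjusted ones shown above. The interval is built symmetrically on the log odds ratio scale and then exponentiated, which is why it ends up lopsided around the estimate:

```r
# Simulated two-arm binary outcome; purely illustrative
set.seed(2)
dat <- data.frame(trt = gl(2, 30, labels = c("A", "P")))
dat$y <- rbinom(60, size = 1, prob = ifelse(dat$trt == "A", 0.3, 0.7))

fit <- glm(y ~ trt, family = binomial(), data = dat)

est <- unname(coef(fit)["trtP"])                 # log odds ratio, P vs. A
se  <- unname(sqrt(diag(vcov(fit)))["trtP"])     # its standard error
z   <- qnorm(0.975)

log_ci <- est + c(-1, 1) * z * se   # symmetric on the log scale

odds_ratio <- exp(est)
or_ci      <- exp(log_ci)           # asymmetric on the ratio scale

c(odds.ratio = odds_ratio, lower = or_ci[1], upper = or_ci[2])
```

Even a modest standard error on the log scale turns into a wide, skewed interval once exponentiated.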