
I am interested in comparing the performance of two different techniques (i.e., Method A vs. Method B) on a continuous outcome of interest. The performance of each technique is measured through the participants' test scores (e.g., Participant 1 passed 30 questions out of 50 with Method A and 32 with Method B, Participant 2 passed 22 questions out of 50 with Method A and 20 with Method B, etc.). Most studies report the mean number of questions passed with each technique; only rarely do they report the standard deviation. Different studies use different numbers of questions and different methods for measuring the outcome.

I managed to back-compute the standard deviations from the studies' box plots, so I could run a meta-analysis with Cohen's d.
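For reference, here is a minimal sketch of that computation (it assumes independent groups for the variance formula, which ignores the within-subject pairing; all names and numbers are placeholders):

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    # Pooled standard deviation across the two groups
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    # Standard large-sample approximation of the sampling variance of d
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return d, var_d

# Placeholder summary statistics for one hypothetical study
d, var_d = cohens_d(m1=30.0, s1=5.0, n1=20, m2=27.0, s2=6.0, n2=20)
print(d, var_d)
```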

I also thought about running a meta-analysis with response ratios (i.e., mean(A)/mean(B)) to check the robustness of results to the effect size used. However, Borenstein et al. say: "The response ratio is not meaningful for studies (such as most social science studies) that measure outcomes such as test scores, attitude measures, or judgments, since these have no natural scale units and no natural zero points".
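For comparison, the (log) response ratio I had in mind would be computed roughly like this (a sketch using the usual delta-method variance; it requires strictly positive means, and the numbers are again placeholders):

```python
import math

def log_response_ratio(mA, sA, nA, mB, sB, nB):
    # Log of the ratio of means; only defined for strictly positive means
    lnrr = math.log(mA / mB)
    # Delta-method approximation of the sampling variance
    var_lnrr = sA**2 / (nA * mA**2) + sB**2 / (nB * mB**2)
    return lnrr, var_lnrr

# Placeholder summary statistics for one hypothetical study
lnrr, var_lnrr = log_response_ratio(mA=30.0, sA=5.0, nA=20, mB=27.0, sB=6.0, nB=20)
print(lnrr, var_lnrr)
```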

I do not understand why. To me, the ratio between the means makes complete sense (i.e., Method A outperforms Method B by 200% if its mean is twice as high as Method B's, and vice versa), regardless of the scales or number of questions used in the studies.

What is wrong with response ratios in this case?

Borenstein, M., Hedges, L. V., Higgins, J. P., & Rothstein, H. R. (2011). Introduction to meta-analysis. John Wiley & Sons.

  • Comment by mdewey (Nov 26, 2018 at 14:02): Surely proportions have a true zero? And it could be argued they have a natural scale unit - one percentage point is one percentage point everywhere.

1 Answer


Hmm... even in a meta-analysis, you should compare apples with apples and oranges with oranges. In your setting, it seems that you are comparing vastly different studies, which use different numbers of questions and different measurement methods.

Let's say that in study #1, Method B outperforms Method A by an additive amount $x_1$ or by a multiplicative factor $f_1$, while in study #2, Method B outperforms Method A by an additive amount $x_2$ or by a multiplicative factor $f_2$ (depending on how you define your effect). Let's also say that $x_2 > x_1$ and $f_2 > f_1$. How are you going to interpret this?

Does this happen because Method B is genuinely better than Method A in both studies? Or does it happen because study #1 used fewer questions or poorer methods than study #2, which gave an unfair advantage to study #2?

For interpretation purposes, because the two studies are not similar in terms of the number of questions and the methods, you simply can't untangle the effect of Method B relative to Method A from the effects of the number of questions and of the methods being used. So, if you want a meaningful interpretation of the study-specific effects across studies (which in turn would lead to a meaningful interpretation of the overall effect you report), you will perhaps have to control in your meta-analysis for the number of questions and the method used in each study (presuming you have enough data to do so).

As it is right now, defining your study-specific effect differently (difference vs. ratio) will not change the fact that you have to control for those elements of study design which muddy the waters and make interpretation difficult. This will place you in a meta-regression setting.
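To make that concrete, such a meta-regression amounts to an inverse-variance weighted regression of the study-specific effects on study-level moderators. Here is a minimal fixed-effect sketch (the effect estimates, variances, and the "number of questions" moderator are purely illustrative; in practice you would likely fit a random-effects model with a dedicated meta-analysis package):

```python
import numpy as np
import statsmodels.api as sm

# Illustrative study-level inputs: effect estimates (e.g., Cohen's d),
# their sampling variances, and a study-level moderator (number of questions)
effects = np.array([0.45, 0.20, 0.60, 0.35])
variances = np.array([0.04, 0.06, 0.05, 0.03])
n_questions = np.array([50.0, 20.0, 80.0, 40.0])

# Inverse-variance weights and a design matrix with an intercept
weights = 1.0 / variances
X = sm.add_constant(n_questions)

# Fixed-effect meta-regression as weighted least squares
fit = sm.WLS(effects, X, weights=weights).fit()
print(fit.params)  # intercept and slope for the moderator
```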

In terms of what study-specific effect to feed into your meta-analysis/meta-regression, the overall study effect you settle on should be conceptually related to the effects that were reported in each study. If each study reported their effect as a difference (e.g., mean difference), it makes sense to report an overall effect across studies in your meta-analysis that is a (standardized?) mean difference. This way, when a researcher who plans a similar study looks at your meta-analysis results, they can easily map them back to their own study setting.

  • Comment (Nov 26, 2018 at 14:51): In each study, Method A and Method B are compared under identical circumstances; Method A never uses fewer or poorer questions than Method B. Different questions are used in each study. There are no muddy waters design-wise - just a pure statistical question.
  • Comment (Nov 26, 2018 at 15:40): Thanks for your clarifications, Adrian! I corrected my answer - please see my corrections above. My last paragraph explains why using a ratio wouldn't reflect the original intent of the study authors, who reported effects in terms of a difference, not a ratio.
