
Journal of Clinical Epidemiology 87 (2017) 14-22

GRADE guidelines 17: assessing the risk of bias associated with missing
participant outcome data in a body of evidence
Gordon H. Guyatt (a,b), Shanil Ebrahim (a,c), Pablo Alonso-Coello (a,d), Bradley C. Johnston (a,c,e,f),
Alexander G. Mathioudakis (d), Matthias Briel (a,g), Reem A. Mustafa (a,h), Xin Sun (i),
Stephen D. Walter (a), Diane Heels-Ansdell (a), Ignacio Neumann (j), Lara A. Kahale (k), Alfonso Iorio (a,b),
Joerg Meerpohl (l,m), Holger J. Schünemann (a,b), Elie A. Akl (a,k,*)
(a) Department of Health Research Methods, Evidence and Impact, McMaster University, 1200 Main St. West, Hamilton L8S 4K1, Canada
(b) Department of Medicine, McMaster University, 1200 Main St. West, Hamilton L8S 4K1, Canada
(c) Systematic Overviews through Advancing Research Technology (SORT), Child Health Evaluative Sciences, The Hospital for Sick Children Research Institute, 555 University Ave, Toronto, ON M5G 1X8, Canada
(d) Iberoamerican Cochrane Centre, CIBERESP-IIB Sant Pau, Casa de Convalescencia, 4th floor, C. Sant Antoni Maria Claret 171, Barcelona 08041, Spain
(e) Department of Anesthesia and Pain Medicine, The Hospital for Sick Children, University of Toronto, 555 University Ave, Toronto, ON M5G 1X8, Canada
(f) Institute for Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, 155 College St, Toronto, ON M5T 3M7, Canada
(g) Department of Clinical Research, Basel Institute for Clinical Epidemiology and Biostatistics, University Hospital Basel, Hebelstrasse 10, Basel 4056, Switzerland
(h) Department of Internal Medicine, Kansas University Medical Center, 3901 Rainbow Blvd, Kansas City, KS MS3002, USA
(i) Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu 610041, China
(j) Department of Internal Medicine, Pontificia Universidad Catolica de Chile, Av Libertador Bernardo O'Higgins 340, Santiago, Region Metropolitana, Chile
(k) Department of Internal Medicine, American University of Beirut, Riad-El-Solh Beirut, Beirut 1107 2020, Lebanon
(l) Cochrane Germany, Medical Center-University of Freiburg, Breisacher Strasse 153, Freiburg 79110, Germany
(m) Centre de Recherche Épidémiologie et Statistique Sorbonne Paris Cité-U1153, Inserm/Université Paris Descartes, Cochrane France, Hôpital Hôtel-Dieu, 1 place du Parvis Notre Dame, Paris Cedex 04 75181, France
Accepted 2 May 2017; Published online 18 May 2017

Abstract
Objective: To provide GRADE guidance for assessing risk of bias across an entire body of evidence consequent on missing
data for systematic reviews of both binary and continuous outcomes.
Study Design and Setting: Systematic survey of published methodological research, iterative discussions, testing in systematic
reviews, and feedback from the GRADE Working Group.
Results: Approaches begin with a primary meta-analysis using a complete case analysis followed by sensitivity meta-analyses
imputing, in each study, data for those with missing data, and then pooling across studies. For binary outcomes, we suggest
use of "plausible worst case" in which review authors assume that those with missing data in treatment arms have
proportionally higher event rates than those followed successfully. For continuous outcomes, imputed mean values come from
other studies within the systematic review and the standard deviation (SD) from the median SDs of the control arms of all
studies.
Conclusions: If the results of the primary meta-analysis are robust to the most extreme assumptions viewed as plausible, one
does not rate down certainty in the evidence for risk of bias due to missing participant outcome data. If the results prove not
robust to plausible assumptions, one would rate down certainty in the evidence for risk of bias. © 2017 Elsevier Inc. All rights
reserved.

Keywords: GRADE; Missing participant data; Risk of bias; Systematic reviews; Trials

Conflict of interest: All the authors have completed the ICMJE uniform disclosure form and declare no support from any organization for the submitted work and no financial relationships with any organizations that might have an interest in the submitted work in the previous 3 years. They declare being involved in previous publications making recommendations on the topic missing participant outcome data.
Funding: This study is part of a project on addressing missing trial participant data in systematic reviews funded by the Cochrane Collaboration. P.A.-C. was funded by a Miguel Servet research contract from the Instituto de Salud Carlos III (CP16/00137). A.G.M. was funded by a Fellowship in Guidelines Methodology by European Respiratory Society (MTF 2015-01). The funders were not involved in study design and the collection, analysis, and interpretation of data and the writing of the article and the decision to submit it for publication. The researchers are independent from funders and had full access to all the data.
* Corresponding author. Department of Internal Medicine, Clinical Epidemiology Unit, American University of Beirut Medical Center, P.O. Box: 11-0236, Riad-El-Solh Beirut 1107, 2020 Beirut, Lebanon.
E-mail address: [email protected] (E.A. Akl).

http://dx.doi.org/10.1016/j.jclinepi.2017.05.005
0895-4356/© 2017 Elsevier Inc. All rights reserved.

1. Introduction

The extent to which risk of bias associated with missing participant outcome data (hereafter, missing data) reduces confidence in results represents a key issue for all systematic reviews [1,2]. Currently, the Cochrane Collaboration Handbook [3] focuses on determining whether individual studies are at low or high risk of bias with respect to missing data. When considering whether to rate down for risk of bias across an entire body of evidence, this approach suffers limitations. Assume, for instance, that one sets a threshold of 10% missing data for high risk of bias, and of six studies in a meta-analysis, three have no missing data and three have 12% missing data. How is one to decide whether, across the entire body of evidence, one should, or should not, rate down for risk of bias due to missing participant data?

Sensitivity meta-analyses based on different assumptions can address these issues, particularly if such analyses consider issues beyond simply the frequency of missing data, such as the event rate in the intervention and control groups, the distribution of missing data in intervention and control groups, and the reasons for missingness. The Cochrane Handbook encourages such analyses but, with respect to missing data, does not provide specific guidance regarding how to proceed.

Three prior publications have filled this gap by presenting approaches for systematic reviews of randomized trials to address missing data for binary [4] and continuous outcomes [5,6]. With some modifications, the GRADE Working Group has endorsed these approaches as GRADE guidance to assess the risk of bias associated with missing data in systematic reviews. In this article, we summarize our modified approaches, providing sufficient detail for their application, and provide several illustrative examples.

We present approaches for three situations: binary outcomes; continuous outcomes in which all studies have used the same instruments; and continuous outcomes in which studies have used different instruments to measure the same construct. In each case, the goal is to make inferences for the entire body of evidence for a particular outcome with respect to risk of bias. Within the GRADE framework, the issue is whether reviewers should rate down certainty in the evidence (quality of evidence, or confidence in evidence) for risk of bias due to missing data.

2. Development of methods

In developing our approaches, we formed a group consisting of clinical epidemiologists, methodologists, and biostatisticians, all with extensive experience in systematic reviews. We conducted a systematic survey of the literature addressing possible approaches to handling missing data when conducting a meta-analysis [7-9]. Iterative discussions among the investigators and testing our approaches in a number of systematic reviews completed the process.

The GRADE Working Group reviewed the approaches at a meeting in Vienna in October 2015, providing feedback that led to modifications from what had been previously published. The Working Group reviewed the resulting modifications, and a draft of this study, at a subsequent meeting in May 2016 and there approved the approaches as GRADE guidance.

3. Scope and definitions

This guide is for meta-analyses of trial-level data and does not address methods for meta-analyses of individual participant data that may be available to investigators. We deal only with missing data and not other elements of risk of bias in a body of evidence (e.g., allocation concealment, blinding) that systematic review authors must address.

We define participant outcome data as "missing" if they are unavailable to the reviewers; that is, unavailable to investigators of the primary studies, or available to the primary study investigators but not included in published reports and not provided after inquiry. A common problem when dealing with missing data is identifying whether a group of participants (e.g., those who withdrew consent or violated the protocol) have missing data or not [10-12]. Another problem is that the trial authors are sometimes not clear about how they dealt with participants with missing data in their analysis (e.g., excluded them, or made assumptions) [10-13]. Before applying our approach, we recommend making all possible efforts to obtain unreported but potentially available outcome data from primary study authors, or at least to understand how they dealt with missing data.

For conceptual clarity, we distinguish the issue of handling of missing participant outcome data from that of intention to treat (ITT) analysis [14]. The basic principle of ITT involves analyzing participants with available data in the arm to which they were randomized. A methodological survey found a large variation in the definition of ITT: some suggest ITT is only possible with complete follow-up; some demand imputation of missing data for an ITT analysis; and some take our position that ITT should be restricted to how one handles participants with available data, and that dealing with missing data should be treated as a separate issue [7]. Thus, what follows begins with a complete case analysis and deals with missing data as a separate issue best addressed in sensitivity analyses.
What is new?

Key findings
• When assessing risk of bias associated with participant outcome data across an entire body of evidence, we propose using a complete case analysis for the primary meta-analysis.
• When the results of the primary meta-analysis suggest a statistically significant treatment effect, conduct sensitivity meta-analyses using plausible assumptions to impute events in participants with missing outcome data in each study, and then pool across studies.

What is the implication and what should change now?
• If the results of the primary meta-analysis are robust to the most extreme plausible assumptions, one does not rate down certainty in the evidence for risk of bias due to missing participant outcome data.
• If the results are not robust to plausible assumptions, one would rate down certainty in the evidence for risk of bias.

4. Common elements of the approaches

We recommend, as do other authors who have written about the issue of missing data in the context of meta-analyses, that systematic review authors' primary analysis include only those for whom data are available (complete case analysis) [7]. An alternative is to use imputation approaches for the primary analysis, an option that is particularly attractive if investigators have strong hypotheses regarding the direction and magnitude of bias associated with missing data. Generating these alternative estimates requires considering the uncertainty associated with imputation, and this consideration demands sophisticated statistical approaches. Such approaches are now available for both binary [15] and continuous variables [16].

For outcomes of putative benefit of an experimental intervention, we recommend the approaches primarily, if not exclusively, for meta-analyses in which the results suggest a statistically significant treatment effect. The purpose of the analyses is to challenge the robustness of the inference that a benefit with respect to a particular outcome does indeed exist. The approaches involve a series of progressively more stringent imputations of data in primary studies, postulating that results from participants with missing data are less favorable to the intervention than results from participants for whom the data are available. One then pools across studies to determine the impact on the point estimate and confidence interval.

For outcomes of harm (i.e., that suggest treated patients are worse off), one may challenge in a similar way the inference that apparent harm with respect to a particular outcome does indeed represent a real effect. To do so, one imputes data attributing a lower rate of adverse events in the treatment group. Alternatively, or in addition, one may attribute a higher rate of adverse events in the control group to participants with missing data than in those in whom the data are available.

In addition, one may be interested in the robustness of inferences that an intervention is not harmful. To address this issue, one would impute data suggesting a higher rate of adverse events in the treatment group. Alternatively, or in addition, one may attribute a lower rate in the control group among participants with missing data than in those in whom the data are available.

Finally, one may challenge failure to establish benefit. This would involve imputing a higher success rate in treatment group patients with missing data than in those followed and/or a lower success rate in control patients with missing data than in those followed.

5. Binary outcomes

5.1. Traditional imputations

There are many possible ways to impute missing data in individual primary studies. One might assume that all participants with missing data in either group had events, that no participants with missing data had events, or a worst-case scenario in which all participants with missing data in the intervention group suffered adverse events but none of the participants in the control group suffered such events. That worst-case scenario calculation assumes that the results of the primary analysis are suggesting the intervention reduces the incidence of the outcome of interest.

5.2. Imputations using ratios

Our suggested imputation strategy is based on making assumptions regarding the events in those with missing data as a ratio relative to those with available data in the same arm. Three such ratios have been proposed: the incidence of outcome events in participants with missing data relative to those with complete follow-up (RI_MPD/FU) [17], the informative missingness odds ratio (IMOR) [15,18,19], and the Bayesian version of the IMOR [20]. In this study, we use RI_MPD/FU when providing illustrative examples. In positive trials, one might challenge the robustness of the results by imputing RI_MPD/FU > 1 in the intervention group and RI_MPD/FU < 1 in the control group. For instance, an event rate of 10% in participants with available data and a RI_MPD/FU of 1.5 would result in an imputed event rate in
those with missing data of 15%. An event rate of 20% in control participants with available data with a RI_MPD/FU of 0.5 would result in an imputed event rate in those with missing data of 10%. Similarly to the RI_MPD/FU, the IMOR describes the relationship between the unknown odds among participants with missing data and the known odds among participants with available data [15]. It differs in the use of odds instead of risks.

In trials suggesting an apparent benefit, for the sake of simplicity, we suggest a constant RI_MPD/FU of 1.0 for control group missing participants (i.e., assume the same event rate in those with missing as those with available data). For treatment group participants with missing data, one might start with the least stringent assumptions (for instance a RI_MPD/FU of 1.5) and repeat the meta-analysis with the associated individual primary study results. If imputed data do not materially affect the results (in particular, confidence intervals continue to exclude a null effect), one might then examine the impact of progressively more stringent but less plausible assumptions (RI_MPD/FU of up to 3.0, or possibly 5.0).

We have used 5.0 as the most stringent but still plausible RI_MPD/FU because we identified one study in which participants lost to follow-up were subsequently found to have had five times the rate of events of followed-up participants, but none that reported a higher ratio [21]. We refer to the meta-analysis using the plausible most stringent RI_MPD/FU as the "plausible worst case." The reviewers should ideally select the value of the plausible most stringent RI_MPD/FU a priori. The choice will be based on factors such as the clinical scenario (e.g., higher value of RI_MPD/FU in a trial of cardiac transplant in which participants are more likely to have suffered a bad outcome if lost to follow-up), and the baseline prognostic profile of participants with missing participant outcome data, when reported.

To the extent that pooled estimates remain similar when making progressively more stringent assumptions (and in particular, results remain statistically significant), one would conclude that the results are robust to the missing data and, in the GRADE framework, not rate down certainty in the evidence for risk of bias. If results change materially, and particularly if statistical significance is lost, one would rate down certainty in the evidence for risk of bias due to missing data. In general, one would be more willing to rate down if significance is lost with the less stringent assumptions.

5.3. Illustrative examples

We have used this approach in several recently published Cochrane and non-Cochrane reviews [22-32]. One of these studies assessed probiotics for the prevention of Clostridium difficile infection (CDI) [28]. In 13 of 20 included randomized trials, data on CDI were missing for 5% to 45% of participants. We assumed that the event rate was the same among control group participants with missing data and participants that were successfully followed. For the probiotic group, we recalculated pooled treatment effects by using our assumed RI in participants with missing data compared with those who were successfully followed using the following assumptions: RI_MPD/FU 1.5, 2.0, 3.0, and 5.0. Our results proved robust to each of the RI assumptions, and even with the 5.0 ratio, the probiotic effect remained large and the 95% CI narrow (relative risk, 0.50 [0.34-0.76]) (Appendix Fig. 1 at www.jclinepi.com).

The Appendix at www.jclinepi.com provides another example of applying the method to a benefit outcome with binary data and shows how the decision to rate down certainty in the evidence for risk of bias due to missing data can vary across outcomes within the same study (Example 2, Figures 2 and 3 in Appendix at www.jclinepi.com).

5.4. Application to harms

One could apply a similar approach to outcomes for which the results suggest harm with the experimental treatment but, in this case, impute a RI_MPD/FU of less than 1.0 to treatment and 1.0 to control. Our suggestion, in parallel to that for benefit outcomes, is to assume RI_MPD/FU of 1.0 for control, and a value as low as 0.20 in the intervention group. Alternatively, one could impute a RI_MPD/FU for the intervention group and RI_MPD/FU of >1.0 for the control group. Example 3 in the Appendix at www.jclinepi.com provides an illustration of use of both options.

5.5. Application to nonstatistically significant results

One could also apply the approach to determine if findings of no increase in harm are robust. This would involve the same approach as in the benefit setting: assume a RI_MPD/FU of 1.0 in control participants with missing data and >1.0 in treatment group participants with missing data, possibly as high as 5.0. Again, one would examine whether results change appreciably and in particular whether previous results that were not significant become significant. Appendix Example 4 at www.jclinepi.com provides an illustration.

We have created a freely downloadable Excel document that allows a systematic review author to determine the numerators and denominators to be used for each trial included in the meta-analysis according to the selected assumptions: www.dropbox.com/s/opstwgm45qiq57k/Assumptions%20about%20MPD%20v5.xls?dl=0.

6. Binary outcomes: choosing the stringency of the imputations

Investigators using our approaches will need to decide how extreme a RI_MPD/FU they are willing to consider plausible. The choice will be based on factors such as the clinical scenario (e.g., higher value of RI_MPD/FU in a trial of cardiac transplant in which participants are more likely to have suffered a bad outcome if lost to follow-up).
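
The RI_MPD/FU imputation for binary outcomes can be sketched in a few lines of Python. This is a minimal illustration and not the authors' Excel tool: the function names, the trial counts, the use of fractional imputed events, and the cap of the imputed rate at 100% are all assumptions made for the example.

```python
def impute_arm(events_followed, n_followed, n_missing, ri):
    """Return (events, total) for one arm after imputing missing participants.

    ri is the assumed RI_MPD/FU: event incidence in participants with missing
    data relative to those followed up in the same arm.
    """
    observed_rate = events_followed / n_followed
    # Assumption: an imputed event rate cannot exceed 100%.
    imputed_rate = min(ri * observed_rate, 1.0)
    imputed_events = imputed_rate * n_missing  # may be fractional
    return events_followed + imputed_events, n_followed + n_missing


def risk_ratio(events_t, n_t, events_c, n_c):
    """Risk ratio of treatment versus control for a single trial."""
    return (events_t / n_t) / (events_c / n_c)


# Hypothetical trial: 10/100 events in followed treated participants and
# 20/100 in followed controls, with 20 participants missing in each arm.
for ri in (1.5, 2.0, 3.0, 5.0):                   # progressively more stringent
    ev_t, n_t = impute_arm(10, 100, 20, ri)       # treatment arm: RI > 1
    ev_c, n_c = impute_arm(20, 100, 20, 1.0)      # control arm: RI fixed at 1.0
    print(ri, round(risk_ratio(ev_t, n_t, ev_c, n_c), 3))
```

With an observed rate of 10% and a ratio of 1.5, the imputed rate is 15%, matching the worked figure in Section 5.2. The resulting numerators and denominators for each trial would then be entered into the usual meta-analysis software for pooling.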
Another consideration will be the frequency of the event of interest. If it is infrequent (say, 5%), it may be reasonable to assume a maximum RI_MPD/FU of 5, and thus an event rate in those with missing data of 25%. If it is frequent (say 40%), a RI_MPD/FU of even 3 results in a 100% event rate in those lost. One may conclude that a rate of 100% is not plausible, in which case a maximum RI_MPD/FU of only 2 may be appropriate.

7. Continuous outcomes: all studies using the same measure

Addressing risk of bias consequent on missing data in systematic reviews addressing continuous outcomes provides additional challenges, including the necessity of imputing both means and standard deviations (SDs). Once again, we suggest the primary meta-analysis use only participants with available outcome data (complete case). When pooled estimates are statistically significant, we suggest sensitivity meta-analyses imputing outcome data that are missing, to challenge the robustness of these pooled estimates.

To impute means, we consider five possible sources of data. In characterizing these sources, we use "best" to describe the most desirable health state (which could be a high or low score) and "worst" to describe the least desirable health state.

A. The best mean score among the intervention arms of the eligible trials.
B. The best mean score among the control arms of the eligible trials.
C. The mean score from the control arm of the trial under consideration.
D. The worst mean score among the intervention arms of the eligible trials.
E. The worst mean score among the control arms of the eligible trials.

To test the robustness of a pooled estimate showing an apparent benefit, using the five suggested sources of data mentioned previously, we recommend four imputation strategies that will almost always be progressively more stringent. Table 1 provides a matrix describing the four strategies:

• Strategy 1 uses source C for missing data in both the intervention and control arms.
• Strategy 2 uses source D for missing data in the intervention arm, and source B for missing data in the control arm.
• Strategy 3 uses source E for missing data in the intervention arm, and source B for missing data in the control arm.
• Strategy 4 uses source E for those with missing data in the intervention arm, and source A for missing data in the control arm.

We tested a number of sources of measures of variability (SDs) for the imputed data and found they yielded very similar results. We therefore suggest the simplest and most plausible source of data, the median SD in the control group of all included trials.

To generate a pooled estimate across trials using the imputed data, we suggest, for each arm in each trial, pooling the observed means and SDs of the participants with available data with the imputed means and SDs for participants with missing data using the following formulas:

1: $M_{XTi} = \dfrac{(M_{FTi} \times n_{FTi}) + (M_{LTi} \times n_{LTi})}{n_{FTi} + n_{LTi}}$
2: $M_{XCi} = \dfrac{(M_{FCi} \times n_{FCi}) + (M_{LCi} \times n_{LCi})}{n_{FCi} + n_{LCi}}$
3: $SD_{XTi} = \sqrt{\dfrac{(n_{FTi} - 1)\,SD_{FTi}^{2} + (n_{LTi} - 1)\,SD_{LTi}^{2}}{n_{FTi} + n_{LTi} - 2}}$
4: $SD_{XCi} = \sqrt{\dfrac{(n_{FCi} - 1)\,SD_{FCi}^{2} + (n_{LCi} - 1)\,SD_{LCi}^{2}}{n_{FCi} + n_{LCi} - 2}}$
5: $n_{XTi} = n_{FTi} + n_{LTi}$
6: $n_{XCi} = n_{FCi} + n_{LCi}$

where "M" represents the mean, "SD" the standard deviation, "n" the group size, "X" the combined estimates, "F" the followed-up group, "L" the lost to follow-up group, "T" the treatment group, "C" the control group, and "i" the trial.

For each study, one can then calculate the treatment effect (a mean difference) by combining means and SDs from the treatment and control arms using a fixed-effects model. One can then pool treatment effects across studies using, according to one's preference, either a standard fixed-effect or random-effects meta-analysis, to generate the mean difference across all included studies.

As was the case for the approach to binary data, if results were robust (statistical significance maintained even with the most stringent assumptions one considers plausible), one would not, within the GRADE framework, rate down certainty in the evidence for risk of bias. If statistical significance were lost for any of the more stringent plausible approaches, one would rate down. Our prior papers [4-6] provide examples of use of the approach to challenging the robustness of findings of apparent benefit, as do Examples 5 and 6 in the Appendix at www.jclinepi.com.

One could apply a similar approach to harm outcomes in which the results suggest harm with the experimental treatment. In this case, the approach would involve imputing more favorable results (less harm) to those in the intervention group with missing data, and less favorable results to control group participants with missing data. The most extreme challenge would be to attribute the best mean available from either group to intervention participants with missing data, and the worst intervention group mean to control participants with missing data.
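
Formulas 1 to 6 of this section, together with the suggested source for the imputed SD, can be sketched as follows. The code is an illustration under stated assumptions rather than the authors' spreadsheet: the function name, the numbers, and the guard for an arm with no missing participants are hypothetical additions.

```python
import math
from statistics import median

# Strategy number -> (source for intervention arm, source for control arm),
# following the four strategies and sources A to E listed in this section.
STRATEGIES = {1: ("C", "C"), 2: ("D", "B"), 3: ("E", "B"), 4: ("E", "A")}


def combine_arm(mean_f, sd_f, n_f, mean_l, sd_l, n_l):
    """Merge followed-up (F) and lost-to-follow-up (L) groups of one arm.

    Implements the combined mean, SD, and group size as given in the text.
    """
    if n_l == 0:
        # Assumption: with nobody missing, keep the observed data unchanged.
        return mean_f, sd_f, n_f
    n_x = n_f + n_l
    mean_x = (mean_f * n_f + mean_l * n_l) / n_x
    sd_x = math.sqrt(((n_f - 1) * sd_f ** 2 + (n_l - 1) * sd_l ** 2) / (n_x - 2))
    return mean_x, sd_x, n_x


# Hypothetical numbers for illustration only.
control_sds = [9.0, 12.0, 15.0]     # SDs of the control arms of all included trials
imputed_sd = median(control_sds)    # median control-arm SD, used for imputed data
imputed_mean = 40.0                 # mean taken from the source chosen by the strategy

# Treatment arm of one trial: 90 followed (mean 50, SD 10), 10 missing.
mean_x, sd_x, n_x = combine_arm(50.0, 10.0, 90, imputed_mean, imputed_sd, 10)
print(mean_x, round(sd_x, 3), n_x)
```

The same function is applied to the control arm with the source assigned by the chosen strategy, after which the combined means, SDs, and group sizes of both arms feed the per-study mean difference.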
Table 1. Matrix of assumptions for participants with missing data for continuous outcomes in intervention and control arms

One could also apply the approach to determine if findings of no (statistically significant) increase in harm are robust. In this case, the approach would involve imputing unfavorable results (greater harm) to those in the intervention group with missing data, and favorable results to control group participants with missing data. The most extreme challenge would be to attribute the worst mean (whether it comes from intervention or control) to intervention participants with missing data, and the best mean (whether from intervention or control) to control participants with missing data. Example 7 in the Appendix at www.jclinepi.com provides an illustration.

We have created a freely downloadable Excel document that allows a systematic review author to determine the means and SDs to be used for each trial included in the meta-analysis according to the selected assumptions per strategy: https://www.dropbox.com/s/3ie12qfwjnfwx0z/MPD%20for%20continuous%20outcomes_Template.xlsx?dl=0.

8. Continuous outcomes: studies using different measures

For certain continuous outcomes and in particular participant-important outcomes focusing on issues such as health-related quality of life (HRQL), clinical trial investigators often choose alternative measures of the same underlying construct. For example, there are at least five instruments available to measure HRQL in participants with chronic obstructive pulmonary disease (COPD) (Chronic Respiratory Questionnaire, Clinical COPD Questionnaire, Pulmonary Functional Status and Dyspnea Questionnaire, Seattle Obstructive Lung Disease Questionnaire, and the St. George's Respiratory Questionnaire) [33]. The use of different instruments requires a modification of the methods described in the previous section.

We suggest, for this modified approach, choosing a single reference measurement instrument, converting scores from different instruments to the units of the reference instrument, and then proceeding with imputation of missing values, combining the available data with estimates from the missing data for each study, and then pooling across studies.

Alternatively, one might proceed exactly as in the example when all studies use the same instrument, but instead of natural units use the standardized mean difference (SMD). Because of limitations of the SMD both with respect to vulnerability to varying between-study heterogeneity, and its interpretability [34], we prefer to base calculations on choosing a single reference instrument as described in the following.

We suggest two key criteria when choosing the reference instrument. The first is its frequency of use, and thus its familiarity to the target audience. The second criterion is the measurement properties of the instrument. In the context of clinical trials, the key measurement properties are instrument longitudinal validity (correlations of change with other related measures), responsiveness (ability to detect important change over time, even if that change is small), and interpretability (typically, an established anchor-based minimally important difference) [35]. Details of the application of the approach follow.

Once one has chosen the reference instrument, one must convert all results into the units of that instrument. Let us say that A represents the reference instrument and B represents an alternative instrument. To convert B units to A
20 G.H. Guyatt et al. / Journal of Clinical Epidemiology 87 (2017) 14e22

units, one first converts the means and SDs of the scores from instrument B to the units of instrument A [36] using the following formulas:

$$M_{Ai} = (M_{Bi} - L_B) \times \frac{R_A}{R_B} + L_A \quad \text{and} \quad SD_{Ai} = SD_{Bi} \times \frac{R_A}{R_B},$$

where M represents the mean, $L_A$ and $L_B$ represent the worst possible outcome score of instruments A and B, respectively, $R_A$ and $R_B$ the ranges (the highest possible outcome score minus the lowest possible outcome score) for instruments A and B, respectively, and i the trial. One applies these formulas separately to the intervention group and the control group of each trial [6]. One then proceeds exactly as in the previous section using the converted score.

9. Alternative threshold for rating down for risk of bias: the context of health care guidelines

In the discussion thus far, we have suggested an approach to rating down using only one threshold: the 95% confidence interval includes a relative effect of 1.0, or an absolute difference of 0. This threshold corresponds to the P-value crossing the traditional boundary of 0.05.
This is not the only threshold one might use. Instead, one might choose the smallest effect that patients are likely to consider important and apply the approach to that threshold.
For instance, consider the outcome of prevention of a myocardial infarction. Even for an intervention associated with small burden and toxicity, patients are unlikely to choose the treatment if effects were very small (e.g., a reduction in infarction of only 1, or perhaps even 5 in 1,000). If, however, the intervention is associated with large burden and toxicity, the threshold would be much higher (10, or perhaps even 20 or more in 1,000).
Suppose one applies this logic to the latter situation and chooses a threshold of 20 in 1,000. If the boundary of the confidence interval closest to no effect remained greater than 20 in 1,000 for even the most stringent imputation, one would not rate down certainty in the evidence for risk of bias. If, however, the confidence interval in an imputation considered plausible included the threshold of benefit of 20 in 1,000 (i.e., included reductions in infarction of less than 20 in 1,000, even if it remained above an effect of 0), one would rate down certainty in the evidence for risk of bias.
Because choosing a threshold other than no effect involves a value judgment (the choice depends on the importance placed on the target outcome, in the example myocardial infarction, and the importance placed on the burden and toxicity), this approach may be best applied in the context of a meta-analysis associated with a health care guideline. It will also be restricted to consideration of absolute rather than relative effects. We have applied this approach, as presented in one of our prior articles [6].

10. Dealing with limitations in reporting

Systematic review authors will face challenges when authors of primary studies fail to adequately report missing data [10]. For example, trial authors may not clearly report whether they imputed outcomes for participants with missing data. Consequently, a sensitivity analysis making imputations for participants with missing data risks double counting. Elsewhere, we have described in detail the solutions for a number of these challenges [10]. For trials in which authors do not report the frequency of missing data, we suggest using the median missing data rate from all trials included in the review. If one perceives this assumption is too stringent, alternatives include a sensitivity analysis using a missing participant data rate of zero in both arms.
For trials in which authors fail to report missing data for each study arm and report total missing data only, we suggest assuming the same rate of missing data in both intervention and control groups. For trials in which the authors report a single imputed analysis only, we suggest using the imputed results for both primary and sensitivity analyses. Reviewers should acknowledge such limitations when discussing the results of sensitivity analyses related to missing data.

11. Discussion

We have developed structured and transparent approaches to determine the extent to which missing data across an entire body of evidence introduce risk of bias and thus threaten the certainty in the evidence in systematic reviews. Our approaches to binary outcomes, and to continuous data when all studies use the same outcome measure, do not require a high level of statistical sophistication, and can be carried out relatively easily in many statistical programs, including RevMan, the software developed by the Cochrane Collaboration for preparing and maintaining Cochrane Reviews, including text, characteristics of studies, comparison tables, and meta-analysis. Our approach to continuous data when studies use different outcome measures begins with converting all instruments to the units of a common instrument; it requires greater statistical sophistication but is nevertheless straightforward. The approaches have received GRADE working group endorsement, and their use in any systematic review using GRADE approaches would be desirable.
The analyses that we describe are sensitivity analyses designed to facilitate inferences regarding risk of bias, rather than to generate alternative best estimates of intervention effects. Thus, the approaches do not need to deal with the uncertainty associated with the imputed values.
The approaches assume that investigators have little idea about the direction that bias as a result of missing data may take, hence the use of a complete case approach in the primary meta-analysis. If investigators opt to make imputations in the primary analyses (as discussed earlier), they
should consider the uncertainty associated with imputation using the appropriate statistical approaches for both binary [15] and continuous variables [16].
Our approaches all require judgment regarding what is and is not plausible, judgments some may find arbitrary. Our approaches do, however, permit multiple progressively more stringent sensitivity analyses. This allows investigators, and users of meta-analyses, to choose the most extreme threshold that they consider plausible and then determine whether results are robust to that threshold.
We make specific suggestions for thresholds of plausibility; thresholds other than those we suggest may be more appropriate in individual meta-analyses. For instance, for continuous variables, one might not choose the most extreme results from other studies, but results adjacent to or near the extremes. Investigators concerned about confidence intervals being excessively narrow as a result of not taking into account uncertainty in imputations may choose more stringent thresholds. In general, for any particular meta-analysis, those who consider extremes more plausible will be more likely to rate down the certainty of the evidence for risk of bias due to missing data than those who do not.
Deciding on specific imputation strategies also entailed some degree of arbitrariness: we opted, where possible, for simplicity. For example, to address apparently beneficial treatment effects in binary outcomes we suggest, for the control group, assuming that event rates in missing participants do not differ from those in participants with complete data. Thus, the only variation is the increase in event rates imputed to missing data, relative to those with complete data, in the intervention group. It is possible, of course, for investigators to reasonably deviate from our guidance and to also vary the event rates imputed to missing participants in control groups. Systematic review authors might consider similar reasonable alternatives regarding our suggestions for how to deal with harm outcomes, and with continuous variables.
In our presentation, we have focused on rating down certainty in the evidence for risk of bias only when meta-analyses that include plausible imputations for missing data result in loss of statistical significance. We have also pointed out, however, that one could be even more stringent: one could rate down if the boundary of the confidence interval closest to no effect crosses a threshold of patient importance.
In summary, this GRADE guidance includes structured, transparent, and relatively easily implementable strategies to determine whether the extent of missing data warrants rating down the certainty in a body of evidence for a particular outcome for risk of bias. Ongoing work involves examining the impact of the approaches on a large sample of meta-analyses and may inform future updates of this guidance [37].

Acknowledgments

Authors’ contributions: G.H.G., S.E., P.A.-C., B.C.J., R.A.M., S.D.W., and E.A.A. contributed to the conception and design of the study. G.H.G., S.E., P.A.-C., B.C.J., A.G.M., M.B., R.A.M., X.S., S.D.W., D.H.-A., I.N., L.A.K., A.I., J.M., H.J.S., and E.A.A. contributed to the analysis and interpretation of the data. S.D.W. and D.H.-A. contributed statistical expertise. G.H.G. contributed to drafting of the article. G.H.G., S.E., P.A.-C., B.C.J., A.G.M., M.B., R.A.M., X.S., S.D.W., D.H.-A., I.N., L.A.K., A.I., J.M., H.J.S., and E.A.A. contributed to the critical revision of the article for important intellectual content. G.H.G., S.E., P.A.-C., B.C.J., A.G.M., M.B., R.A.M., X.S., S.D.W., D.H.-A., I.N., L.A.K., A.I., J.M., H.J.S., and E.A.A. contributed to the final approval of the article.

Supplementary data

Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.jclinepi.2017.05.005.

References

[1] Guyatt GH, Oxman AD, Vist G, Kunz R, Brozek J, Alonso-Coello P, et al. GRADE guidelines: 4. Rating the quality of evidence - study limitations (risk of bias). J Clin Epidemiol 2011;64:407-15.
[2] Balshem H, Helfand M, Schünemann HJ, Oxman AD, Kunz R, Brozek J, et al. GRADE guidelines: 3. Rating the quality of evidence. J Clin Epidemiol 2011;64:401-6.
[3] Higgins JPT, Green S, editors. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0. The Cochrane Collaboration. 2011. Available from www.handbook.cochrane.org.
[4] Akl EA, Johnston BC, Alonso-Coello P, Neumann I, Ebrahim S, Briel M, et al. Addressing dichotomous data for participants excluded from trial analysis: a guide for systematic reviewers. PLoS One 2013;8:e57132.
[5] Ebrahim S, Akl EA, Mustafa RA, Sun X, Walter SD, Heels-Ansdell D, et al. Addressing continuous data for participants excluded from trial analysis: a guide for systematic reviewers. J Clin Epidemiol 2013;66:1014-1021.e1.
[6] Ebrahim S, Johnston BC, Akl EA, Mustafa RA, Sun X, Walter SD, et al. Addressing continuous data measured with different instruments for participants excluded from trial analysis: a guide for systematic reviewers. J Clin Epidemiol 2014;67:560-70.
[7] Akl EA, Kahale LA, Agoritsas T, Brignardello-Petersen R, Busse JW, Carrasco-Labra A, et al. Handling trial participants with missing outcome data when conducting a meta-analysis: a systematic survey of proposed approaches. Syst Rev 2015;4:98.
[8] Akl EA, Carrasco-Labra A, Brignardello-Petersen R, Neumann I, Johnston BC, Sun X, et al. Reporting, handling and assessing the risk of bias associated with missing participant data in systematic reviews: a methodological survey. BMJ Open 2015;5(9):e009368.
[9] Akl EA, Shawwa K, Kahale LA, Agoritsas T, Brignardello-Petersen R, Busse JW, et al. Reporting missing participant data in randomised trials: systematic survey of the methodological literature and a proposed guide. BMJ Open 2015;5(12):e008431.
[10] Akl EA, Kahale LA, Ebrahim S, Alonso-Coello P, Schünemann HJ, Guyatt GH. Three challenges described for identifying participants with missing data in trials reports, and potential solutions suggested to systematic reviewers. J Clin Epidemiol 2016;76:147-54.
[11] Abraha I, Montedori A. Modified intention to treat reporting in randomised controlled trials: systematic review. BMJ 2010;340:c2697.
[12] Abraha I, Cherubini A, Cozzolino F, De Florio R, Luchetta ML, Rimland JM, et al. Deviation from intention to treat analysis in
randomised trials and treatment effect estimates: meta-epidemiological study. BMJ 2015;350:h2445.
[13] Schulz KF. Assessing allocation concealment and blinding in randomised controlled trials: why bother? Evid Based Nurs 2001;4(1):4-6.
[14] Alshurafa M, Briel M, Akl EA, Haines T, Moayyedi P, Gentles SJ, et al. Inconsistent definitions for intention-to-treat in relation to missing outcome data: systematic review of the methods literature. PLoS One 2012;7:e49163.
[15] Higgins JP, White IR, Wood AM. Imputation methods for missing outcome data in meta-analysis of clinical trials. Clin Trials 2008;5:225-39.
[16] Mavridis D, White IR, Higgins JP, Cipriani A, Salanti G. Allowing for uncertainty due to missing continuous outcome data in pairwise and network meta-analysis. Stat Med 2015;34:721-41.
[17] Akl EA, Briel M, You JJ, Sun W, Johnston BC, Busse JW, et al. Potential impact on estimated treatment effects of information lost to follow-up in randomised controlled trials (LOST-IT): systematic review. BMJ 2012;344:e2809.
[18] White IR, Higgins JP, Wood AM. Allowing for uncertainty due to missing data in meta-analysis - part 1: two-stage methods. Stat Med 2008;27:711-27.
[19] White IR, Welton NJ, Wood AM, Ades AE, Higgins J. Allowing for uncertainty due to missing data in meta-analysis - part 2: hierarchical models. Stat Med 2008;27:728-45.
[20] Turner NL, Dias S, Ades AE, Welton NJ. A Bayesian framework to account for uncertainty due to missing binary outcome data in pairwise meta-analysis. Stat Med 2015;34:2062-80.
[21] Geng EH, Emenyonu N, Bwana MB, Glidden DV, Martin JN. Sampling-based approach to determining outcomes of patients lost to follow-up in antiretroviral therapy scale-up programs in Africa. JAMA 2008;300:506-7.
[22] Akl EA, Kahale L, Barba M, Neumann I, Labedi N, Terrenato I, et al. Anticoagulation for the long-term treatment of venous thromboembolism in patients with cancer. Cochrane Database Syst Rev 2014;7:CD006650.
[23] Akl EA, Kahale L, Sperati F, Neumann I, Labedi N, Terrenato I, et al. Low molecular weight heparin versus unfractionated heparin for perioperative thromboprophylaxis in patients with cancer. Cochrane Database Syst Rev 2014;6:CD009447.
[24] Akl EA, Kahale L, Terrenato I, Neumann I, Yosuico VE, Barba M, et al. Oral anticoagulation in patients with cancer who have no therapeutic or prophylactic indication for anticoagulation. Cochrane Database Syst Rev 2014;7:CD006466.
[25] Akl EA, Kahale LA, Ballout RA, Barba M, Yosuico VE, van Doormaal FF, et al. Parenteral anticoagulation in ambulatory patients with cancer. Cochrane Database Syst Rev 2014;12:CD006652.
[26] Akl EA, Ramly EP, Kahale LA, Yosuico VE, Barba M, Sperati F, et al. Anticoagulation for people with cancer and central venous catheters. Cochrane Database Syst Rev 2014;10:CD006468.
[27] Lytvyn L, Quach K, Banfield L, Johnston BC, Mertz D. Probiotics and synbiotics for the prevention of postoperative infections following abdominal surgery: a systematic review and meta-analysis of randomized controlled trials. J Hosp Infect 2016;92(2):130-9.
[28] Johnston BC, Ma SS, Goldenberg JZ, Thorlund K, Vandvik PO, Loeb M, et al. Probiotics for the prevention of Clostridium difficile-associated diarrhea: a systematic review and meta-analysis. Ann Intern Med 2012;157:878-88.
[29] Spencer FA, Sekercioglu N, Prasad M, Lopes LC, Guyatt GH. Culprit vessel versus immediate complete revascularization in patients with ST-segment myocardial infarction-a systematic review. Am Heart J 2015;170:1133-9.
[30] Spencer FA, Lopes LC, Kennedy SA, Guyatt GH. Systematic review of percutaneous closure versus medical therapy in patients with cryptogenic stroke and patent foramen ovale. BMJ Open 2014;4(3):e004282.
[31] Spencer FA, Prasad M, Vandvik PO, Chetan D, Zhou Q, Guyatt GH. Longer- versus shorter-duration dual-antiplatelet therapy after drug-eluting stent placement: a systematic review and meta-analysis. Ann Intern Med 2015;163:118-26.
[32] How to read clinical journals: V: to distinguish useful from useless or even harmful therapy. Can Med Assoc J 1981;124(9):1156-62.
[33] Johnston BC, Thorlund K, Schünemann HJ, Xie F, Murad MH, Montori VM, et al. Improving the interpretation of quality of life evidence in meta-analyses: the application of minimal important difference units. Health Qual Life Outcomes 2010;8:116.
[34] Guyatt GH, Thorlund K, Oxman AD, Walter SD, Patrick D, Furukawa TA, et al. GRADE guidelines: 13. Preparing summary of findings tables and evidence profiles-continuous outcomes. J Clin Epidemiol 2013;66:173-83.
[35] Guyatt GH, Feeny DH, Patrick DL. Measuring health-related quality of life. Ann Intern Med 1993;118:622-9.
[36] Thorlund K, Walter SD, Johnston BC, Furukawa TA, Guyatt GH. Pooling health-related quality of life outcomes in meta-analysis-a tutorial and review of methods for enhancing interpretability. Res Synth Methods 2011;2:188-203.
[37] Akl EA, Kahale LA, Agarwal A, Al-Matari N, Ebrahim S, Alexander PE, et al. Impact of missing participant data for dichotomous outcomes on pooled effect estimates in systematic reviews: a protocol for a methodological study. Syst Rev 2014;3:137.
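
Illustrative sketch (not part of the original guidance). The conversion of results from an alternative instrument B to the units of a reference instrument A, and the threshold-based rule for rating down, might be sketched in Python as follows. The class Instrument, the functions convert_mean, convert_sd, and rate_down, and all numbers are hypothetical and chosen only for illustration. The sketch assumes that higher scores are better on both instruments (so the lowest possible score is the worst), and that effects are expressed as absolute reductions per 1,000, with positive values indicating benefit.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Instrument:
    """Scale limits of an instrument (hypothetical helper for illustration)."""
    lowest: float   # L: worst possible score, assuming higher scores are better
    highest: float  # best possible score under the same assumption

    @property
    def score_range(self) -> float:
        # R: highest possible score minus lowest possible score
        return self.highest - self.lowest


def convert_mean(mean_b: float, a: Instrument, b: Instrument) -> float:
    """M_A = (M_B - L_B) * (R_A / R_B) + L_A."""
    return (mean_b - b.lowest) * a.score_range / b.score_range + a.lowest


def convert_sd(sd_b: float, a: Instrument, b: Instrument) -> float:
    """SD_A = SD_B * (R_A / R_B)."""
    return sd_b * a.score_range / b.score_range


def rate_down(bounds_nearest_no_effect: Iterable[float],
              threshold: float = 0.0) -> bool:
    """Rate down if, under any plausible imputation, the confidence-interval
    bound closest to no effect does not remain above the chosen threshold."""
    return any(bound <= threshold for bound in bounds_nearest_no_effect)


if __name__ == "__main__":
    ref = Instrument(lowest=1, highest=7)    # hypothetical 1-7 reference scale (A)
    alt = Instrument(lowest=0, highest=100)  # hypothetical 0-100 alternative scale (B)
    # Applied separately to each arm of each trial; a single arm is shown here.
    print(convert_mean(50, ref, alt), convert_sd(20, ref, alt))
    # Bounds (reductions per 1,000) from progressively more stringent imputations.
    print(rate_down([35, 28, 22], threshold=20))
    print(rate_down([35, 28, 15], threshold=20))
```

With the default threshold of 0, rate_down corresponds to the primary rule of asking whether any plausible imputation yields a confidence interval that includes no effect; a larger threshold corresponds to the guideline context, where the choice of threshold is a value judgment.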
