Grubbs2014
testing, have high interobserver biases, lack standardization among ophthalmologist providers, and correlate poorly with subjective symptoms.4,6,7,15 Consequently, physician and patient ratings of dry eye status and its effect on QOL are often discordant. It is essential that physicians take this discordance into account when attempting to treat a patient’s dry eye because clinical signs alone may not fully capture the patient’s experience with DED. If a busy clinical practice does not allow a clinician the necessary time to administer the OSDI or IDEEL, however, this discordance may not be detected and a patient’s symptoms may go untreated.1,5,7,13 Thus, there is a pressing need to produce a more efficient and reliable patient-driven instrument that physicians can use to help guide therapy and management of DED.3 In response to this need, our research team developed a single-item dry eye questionnaire called the University of North Carolina Dry Eye Management Scale (UNC DEMS). Following the Patient-Reported Outcome Measurement Information System (PROMIS) guidelines for instrument development as outlined below, we created the UNC DEMS to simultaneously assess both DED symptoms and their effect on QOL in dry eye patients.

METHODS

PROMIS Standards

The US Food and Drug Administration’s publication of guidelines for instrument development for patient-reported outcomes measures (PROs) in 2006 spurred the paradigm shift favoring the proper development and implementation of PROs in policy and health systems.16,17 The PROMIS, a National Institutes of Health–funded initiative, has established a set of 9 standards for instrument development, refinement, and field-testing of PROs for research and clinical use.18 The UNC DEMS (Fig. 1), a 1-item, graded scale (1–10), was developed using these PROMIS standards as a guide.

Instrument Development and Psychometric Evaluation of the UNC DEMS

To date, in this ongoing effort, we have followed the first 7 PROMIS standards for UNC DEMS instrument development. The first 5 PROMIS standards: (1) defining target concept and conceptual model, (2) generating and design of individual items, (3) constructing item pool, (4) determining item bank properties, and (5) field-testing and instrument format18 are detailed in a previous publication by the authors.19 In brief, we used a comprehensive literature search of dry eye symptoms and disease effect on QOL combined with direct consultations with multiple ophthalmologists and dry eye patients to create an initial dry eye questionnaire. This questionnaire was administered to 18 patients with DED (International Classification of Diseases, 9th Revision, code: 375.15) followed by a 15-minute cognitive interviewing session. The UNC DEMS was then refined using feedback obtained from the cognitive interviews (see Appendix, Supplemental Digital Content 1 and 2, http://links.lww.com/ICO/A250; http://links.lww.com/ICO/A251). A final version of the UNC DEMS was produced (Fig. 1).

Validity and Reliability Testing

Following the sixth and seventh PROMIS standards (validity and reliability testing),18 we began field-testing on a larger cohort of patients to determine the validity and test–retest reliability of the UNC DEMS compared with the current gold standard, the OSDI. Using the initial validation study of the OSDI as a guide, we determined that 50 patients would be an adequate sample size to estimate the intraclass correlation to within 0.06 using a 1-sided 95% confidence interval (CI) if the true correlation is 0.9 (to determine reliability of the questionnaire); in addition, we determined that this sample size would provide 98% power to detect a 0.50 correlation with the OSDI and 89% power to detect a 0.40 correlation with artificial tear use.11 Therefore, we aimed to recruit 75 patients (with the goal of at least two thirds of those patients fully completing the study). Ultimately, a total of 66 patients were recruited into the study, of whom 46 had dry eye (ICD-9 code: 375.15) and 20 were controls without DED. To be included in the dry eye group, patients had to be 18 years of age or older with a known diagnosis of dry eye. These patients must have experienced dry eye symptoms within the last 3 months before enrolling in the study. Control patients were healthy and without ocular surface disease or vision correction surgery. All study participants were English speaking as the UNC DEMS was available only in English. The exclusion criteria for both dry eye and control patients were: intraocular surgery within the past 90 days, history of corneal transplant or neurotrophic keratitis, dry eye secondary to Stevens–Johnson syndrome and/or cicatricial pemphigoid, severe conjunctival goblet cell loss or scarring conditions, congenitally absent meibomian or lacrimal glands, or active ocular infection such as blepharitis or lid margin inflammation. This study was approved by the University of North Carolina Institutional Review Board before enrolling patients.

During a regular clinic visit, consent was obtained from qualified study patients who were then asked to complete the DEMS (the word “UNC” was omitted from the questionnaire in this field test to prevent bias), OSDI, and a short survey. In this survey, patients were asked to report whether they had used artificial tears, the frequency of artificial tear use, and their subjective rating of their DED status as either normal, mild-to-moderate, or severe. The study investigators also obtained TBUT and fluorescein corneoconjunctival staining (FUL-GLO, Akorn, MD). The average TBUT of both eyes was determined by taking 3 consecutive measurements of TBUT in each eye and then using the mean of these 6 measurements for our analyses. A fluorescein corneoconjunctival staining score was determined for each eye using the Oxford Grading Scale; the average of the OD and OS score for each patient was used for our statistical analyses. An attending ophthalmologist then performed a complete slit-lamp examination and provided his or her assessment of the patient’s DED status as being normal, mild-to-moderate, or severe. Of note, the ophthalmologist performing this assessment was masked to the UNC DEMS and OSDI patient-reported outcomes. Using his or her clinical judgment and experience, the ophthalmologist also recorded the presence or absence of ocular conditions commonly associated with DED, including chalasis, meibomian gland dysfunction, superficial punctate keratitis
(SPK), lid wiper epitheliopathy, and poor lid apposition. After at least a week had passed since the clinic visit, the study participants were asked to complete the DEMS and OSDI a second time. The post-1-week forms were sent to the patients either by regular mail or by a Web-link in an e-mail. Participants were able to submit their responses either by mail or online, depending on their preferences. All participants were compensated with a redeemable $25 gift card.

Statistical Analysis

To assess criterion-related validity of the UNC DEMS, we estimated the Pearson correlation coefficients between the DEMS, the OSDI, and other DED measures, along with their 95% confidence intervals using the Fisher z transformation. Additionally, we plotted the DEMS against the OSDI at the clinic visit and fit a simple linear regression with 95% prediction intervals. To assess construct validity, we compared the mean DEMS scores between the DED and control patients as well as between groups based on patient and physician ratings of DED status. For these comparisons, we used nonparametric Kruskal–Wallis tests. For the comparisons across the ratings groups, we first conducted an overall test to compare across the 3 groups, and, only if the overall test was significant at the 0.05 level, we then conducted all pairwise comparisons. For test–retest reliability, we used only data from patients who provided follow-up measurements, and we applied linear mixed models to estimate the reliability coefficient along with 95% bootstrapped confidence intervals. Additionally, we created a limits of agreement plot20 for repeated measurements of the DEMS, including only dry eye patients to avoid artificially inflating the number of zero differences. Finally, we compared the DEMS scores across groups of patients based on the presence or absence of ocular conditions using Kruskal–Wallis tests, and we fit a simple linear regression to assess the association of the DEMS score with the total number of ocular conditions. All analyses were conducted in SAS, version 9.3 (SAS Institute, Cary, NC).

RESULTS

Population Demographics

A total of 66 patients participated in the UNC DEMS validity and reliability study; 46 were dry eye patients and 20 were control participants free of ocular disease. Patients in the dry eye and control groups were similar in age range and numbers of men and women (Table 1). Artificial tear use was higher in the dry eye group than in the control group (mean: 2.53 times per day vs. 0.15 times per day). Physicians and patients both rated the DED disease status as normal, mild-to-moderate, or severe (Table 2). Physicians were less likely than patients to rate the patient’s DED status as severe.

TABLE 1. Patient Demographics

                      Control Patients,   Dry Eye Patients,   Total Patients,
                      N = 20              N = 46              N = 66
Ethnicity, n (%)
  African American    2 (10.0)            10 (21.7)           12 (18.1)
  White               18 (90.0)           33 (71.7)           51 (77.2)
  Other/unknown       0 (0.0)             3 (6.5)             3 (4.5)
Sex, n (%)
  Female              15 (75.0)           37 (80.4)           52 (78.7)
  Male                5 (25.0)            9 (19.5)            14 (21.2)
Age, mean (SD)        62.8 (13.1)         61.6 (13.1)         62.4 (13.0)

TABLE 2. Patient and Physician Rating of Disease Severity

                      Control Patients,   Dry Eye Patients,   Total Patients,
                      N = 20, n (%)       N = 46, n (%)       N = 66, n (%)
Patient rating of disease severity
  Normal              18 (90.0)           4 (8.6)             22 (33.3)
  Mild-to-moderate    2 (10.0)            26 (56.5)           28 (42.4)
  Severe              0 (0.0)             14 (30.4)           14 (21.2)
  No rating provided  0 (0.0)             2 (4.3)             2 (3.0)
Physician rating of disease severity
  Normal              18 (90.0)           5 (10.8)            23 (34.8)
  Mild-to-moderate    2 (10.0)            39 (84.7)           41 (62.1)
  Severe              0 (0.0)             2 (4.3)             2 (3.0)

Validity Testing

Figure 2 presents a scatter plot of the DEMS against the OSDI at the clinic visit. The DEMS was correlated with the OSDI across all study participants at an estimated coefficient of 0.80 (95% CI, 0.69–0.87; P < 0.001). The DEMS was correlated with the OSDI in the dry eye group at an estimated coefficient of 0.69 (95% CI, 0.49–0.81; P < 0.001). Table 3 presents the correlations of the DEMS and OSDI with DED clinical measures. Overall, the DEMS and OSDI have similar moderate, but significant, correlations with artificial tear usage, fluorescein staining, and TBUT.

In the control group, the mean DEMS score was 1.85 ± 1.72, whereas in the dry eye group, it was 5.73 ± 2.15 (P < 0.001). In the control group, the mean OSDI score was 9.49 ± 14.18 as compared with 37.18 ± 23.24 in the dry eye group (P < 0.001). The average DEMS and OSDI scores both varied significantly (P < 0.001 for each) across patients with normal, mild-to-moderate, and severe DED (Table 4).

Test–Retest Reliability

Fifty-seven of 66 patients (86.4%) completed the post-1-week follow-up DEMS; 55 patients (83.3%) also completed the post-1-week follow-up OSDI. The test–retest reliability coefficient of the UNC DEMS was estimated to be 0.90 (95% CI, 0.84–0.95). By comparison, the OSDI’s reliability coefficient was 0.81 (95% CI, 0.70–0.91). Among dry eye patients, the mean difference between the post-1-week and clinic DEMS score was −0.24 units, and agreement was very good with only 1 value being outside the limits of agreement (Fig. 3).

Other Findings

On average, patients who were noted to have SPK on examination had a DEMS score of 2.80 points higher than did those without SPK (P < 0.001). Patients with poor lid apposition had an average DEMS score of 2.56 points higher than did those without poor lid apposition, but this difference was not statistically significant (P = 0.090). There was no statistically significant evidence of association between DEMS score and the presence of chalasis, meibomian gland dysfunction, or lid

TABLE 3. Pearson Correlations of DEMS and OSDI With DED Parameters (Overall Study Participants, N = 66)

                                    DEMS (P)          OSDI (P)
OSDI                                0.80 (<0.001)     1.00
Frequency of artificial tear use    0.43 (<0.001)     0.39 (0.001)
Average Oxford Score OU             0.39 (0.001)      0.42 (<0.001)
Average TBUT OU                     −0.26 (0.032)     −0.38 (0.002)

OU, both eyes.
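
As a rough check on the Fisher z intervals reported with the correlations above, the following Python sketch recomputes a 95% CI from a correlation coefficient and a sample size. It is not the authors' SAS code. The function names fisher_ci and approx_power are hypothetical, and the power calculation assumes a 1-sided test of zero correlation at the 0.05 level under the normal approximation, which the text does not state. Under these assumptions, r = 0.80 with N = 66 gives an interval of about 0.69 to 0.87, and N = 50 gives power of roughly 0.98 and 0.90 for true correlations of 0.50 and 0.40, close to the 98% and 89% quoted in the Methods.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def fisher_ci(r, n, level=0.95):
    """Approximate CI for a Pearson correlation via the Fisher z transformation.

    r: sample correlation, n: number of paired observations (n > 3).
    """
    z = math.atanh(r)                      # Fisher z transform of r
    se = 1.0 / math.sqrt(n - 3)            # approximate standard error of z
    crit = STD_NORMAL.inv_cdf(0.5 + level / 2.0)
    # Back-transform the interval limits to the correlation scale.
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


def approx_power(r_true, n, alpha=0.05):
    """Approximate power to detect a positive correlation r_true.

    ASSUMPTION (not stated in the article): 1-sided test of H0: rho = 0
    at level alpha, using the Fisher z normal approximation.
    """
    se = 1.0 / math.sqrt(n - 3)
    crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    return STD_NORMAL.cdf(math.atanh(r_true) / se - crit)


# DEMS vs. OSDI across all participants (values taken from the text).
lower, upper = fisher_ci(0.80, 66)
print(f"r = 0.80, N = 66: 95% CI {lower:.2f} to {upper:.2f}")

# Planned sample size of 50 with the two target correlations from the Methods.
for rho in (0.50, 0.40):
    print(f"rho = {rho:.2f}, N = 50: approximate power {approx_power(rho, 50):.2f}")
```

Small differences from the published values (for example, in the dry eye subgroup with N = 46) are expected because the inputs here are rounded and the exact SAS procedure is not described.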
17. Reeve BB, Burke LB, Chiang YP, et al. Enhancing measurement in health outcomes research supported by Agencies within the US Department of Health and Human Services. Qual Life Res. 2007;16(suppl 1):175–186.
18. PROMIS: Instrument Development and Validation Scientific Standards: Version 2.0. May, 2013. Available at: http://www.nihpromis.org/Documents/PROMISStandards_Vers2.0_Final.pdf. Accessed April 25, 2014.
19. Grubbs J. Instrument Development of the UNC Dry Eye Management Scale. Chapel Hill, NC, UNC University Libraries; 2013:1–34. Available at: http://dc.lib.unc.edu/cdm/ref/collection/s_papers/id/2652. Accessed May 1, 2014.
20. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8:135–160.
21. Chalmers RL, Begley CG, Edrington T, et al. The agreement between self-assessment and clinician assessment of dry eye severity. Cornea. 2005;24:804–810.
22. Guillemin I, Begley C, Chalmers R, et al. Appraisal of patient-reported outcome instruments available for randomized clinical trials in dry eye: revisiting the standards. Ocul Surf. 2012;10:84–99.
23. Taylor R. Interpretation of the correlation coefficient: a basic review. J Diagn Med Sonogr. 1990;1:35–39.