

Interrater Reliability of a Clinical Scale of Rigidity

LINDA R. VAN DILLEN and KATHRYN E. ROACH

The purposes of this study were 1) to describe a clinical scale of rigidity and testing procedure for use in patients with Parkinson's disease and 2) to examine the scale's interrater reliability. Twenty subjects (3 women, 17 men; age = 64 years, s = 16.3) participated in the study. Criteria for participation were 1) diagnosis of Parkinson's disease, 2) physician-documented rigidity, 3) ability to follow one-step verbal directions, and 4) ability to attain at least 75% of the standard passive-range-of-motion measurements of the elbow, forearm, and wrist of the tested upper extremity. Each of two raters used a standardized set of instructions and test procedures. The degree of rigidity was assessed using a four-point scale ranging from 0 (absent) to 3 (severe). The observed agreement between raters was 16 out of 20 trials. A Cohen's weighted Kappa was used to analyze the data (Kw = .636, p = .20). Factors were identified that may have contributed to the discrepancy between the observed agreement and the agreement beyond chance.

Key Words: Parkinson disease; Tests and measurements, range of motion; Upper extremity, general.

L. Van Dillen, MHS-PT, is Instructor, Program in Physical Therapy, Washington University, PO Box 8083, 6060 S Euclid Ave, St. Louis, MO 63110, and a doctoral candidate, Department of Psychology, Washington University. Address correspondence to 4355 Maryland Ave, #127, St. Louis, MO 63108 (USA). K. Roach, MHS-PT, is Research Coordinator, Health Services Research Division, Edward J. Hines Veterans Administration Hospital, Fifth Ave and Roosevelt Rd, Hines, IL 60141, and a doctoral student, Department of Epidemiology, School of Public Health, University of Illinois at the Medical Center, Chicago, IL. Ms. Van Dillen and Ms. Roach were graduate students at Washington University Medical School, St. Louis, MO, when this study was conducted. This study was completed in partial fulfillment of the requirements for Ms. Van Dillen's and Ms. Roach's Master of Health Science in Physical Therapy degree, Washington University Medical School. This article was adapted from a presentation at the Sixtieth Annual Conference of the American Physical Therapy Association, Las Vegas, NV, June 17-21, 1984. This article was submitted March 25, 1988, and was accepted May 23, 1988. Potential Conflict of Interest: 5.

Volume 68 / Number 11, November 1988

Accurate examination of signs, symptoms, and functional disability of patients with Parkinson's disease is essential in monitoring the progression of the disease, determining the effects of medication regimens, and directing physical and surgical treatment.1-6 Two approaches have been developed and used to assess the clinical features frequently exhibited in the patient with Parkinson's disease: 1) qualitative observer rating scales for use by the clinician and 2) quantitative assessment with the use of electronic instrumentation.1-4 Observer rating scales typically define specific criteria operationally for each of the clinical features to be assessed and identify the conditions under which the features will be tested. The clinical scales most frequently used in Parkinson's disease studies are the Columbia University Rating Scale,7 the Northwestern University Disability Scale,5 Hoehn and Yahr's staging for Parkinson's disease,8 and Webster's Parkinson's disease rating scale.9

Two instruments include the measurement of rigidity: 1) the Columbia Scale7 and 2) Webster's Parkinson's disease rating scale.9 Both scales define criteria for clinically assessing the degree of rigidity in a patient's trunk or limbs at any given point in time. The Columbia Scale defines criteria for five degrees of rigidity: absent, slight, mild, marked, and severe.7 Webster's scale defines criteria for four degrees of rigidity: absent, mild, moderate, and severe.9

Researchers have stated that the advantages of clinical methods of assessing parkinsonian signs and symptoms are specificity, sensitivity, and low cost.1,6 The disadvantage of clinical methods can be the lack of consistency between observers or within an observer using the rating scale.1,6,10 This disadvantage is particularly important if clinical decisions about patient treatment and management are to be determined from the findings of a clinical scale.11,12(pp53-68)

Of the above-mentioned scales that measure signs and symptoms of Parkinson's disease, interrater reliability has been reported for selected clinical features of the Columbia Scale.10 The clinical features evaluated were bradykinesia, gait disturbance, postural abnormality, and tremor. Rigidity, although part of the Columbia Scale, was not evaluated. The same study also reported interrater reliability for Hoehn and Yahr's disability staging.10 The Northwestern University Disability Scale5 was examined for interrater reliability, but the study did not include the measurement of rigidity. No reliability studies have been reported for Webster's Parkinson's disease rating scale.9

Because of the minimal attention given to the reliability of clinical rating scales for Parkinson's disease, we decided to test the interrater reliability of the rigidity scale used in our clinic. We developed our own clinical rating scale to identify more precisely the operational criteria and testing procedures for the rating of rigidity that we felt were lacking in the above-mentioned instruments. We defined the scale based on the description of rigidity by Adams and Victor,13 who defined rigidity as a hypertonus characterized by a sustained involuntary muscle contraction that affects both flexor and extensor muscles. Our rigidity scale was developed as part of a composite assessment used with patients with Parkinson's disease. The composite assessment is currently being used to document the baseline status of selected clinical features and to monitor changes of status throughout a day as medications are adjusted.14

The purposes of this study were 1) to describe the clinical rating scale and testing procedure we use to assess rigidity in patients with Parkinson's disease and 2) to examine the interrater reliability of the rigidity scale when used with patients with Parkinson's disease. We expected that the two clinicians participating in the study would be reliable in their measurement of rigidity using the clinical scale described.

METHOD

Subjects
Twenty patients (3 women, 17 men) treated in the Movement Disorders Clinic and acute care service of a major university-based hospital gave their informed consent to participate in this study. Subjects' mean age was 64 years (s = 16.3). Each subject was diagnosed by a neurologist as having Parkinson's disease with rigidity as part of the clinical syndrome. Criteria for study participation were 1) physician-documented rigidity; 2) ability to follow simple one-step verbal directions; 3) ability to attain at least 75% of the standard passive-range-of-motion (PROM) measurements of the elbow, forearm, and wrist of the tested upper extremity (UE); and 4) diagnosis of Parkinson's disease.

Rating Scale
The degree of rigidity was assessed by a defined measurement scale we developed. The scale was based on a review of the literature and on our own clinical experience. The four-point rating scale was defined as follows:

0 (Absent): Normal muscle tone is present. No resistance to passive movement is detected.
1 (Slight): Resistance to passive movement can be detected, but the resistance is mild and inconsistent throughout the PROM.
2 (Moderate): Resistance to passive movement is detected consistently throughout the PROM, but full available PROM is easily obtained.
3 (Severe): Resistance to passive movement requires maximal effort by the rater to obtain the full PROM, or full PROM cannot be attained.
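As an illustration only, the four ordinal grades defined above could be encoded for data recording along the following lines. The names `RigidityGrade` and `level_difference` are hypothetical and are not part of the study protocol; the example ratings are invented, not study data.

```python
from enum import IntEnum


class RigidityGrade(IntEnum):
    """Four-point ordinal rigidity scale, following the definitions in the text."""
    ABSENT = 0    # normal tone, no resistance to passive movement
    SLIGHT = 1    # mild, inconsistent resistance through the PROM
    MODERATE = 2  # consistent resistance, full available PROM easily obtained
    SEVERE = 3    # maximal rater effort needed, or full PROM not attained


def level_difference(a: RigidityGrade, b: RigidityGrade) -> int:
    """Number of scale levels separating two ratings of the same subject."""
    return abs(int(a) - int(b))


# Hypothetical pair of ratings for one subject (not study data).
first, second = RigidityGrade.SLIGHT, RigidityGrade.MODERATE
print(first.name, second.name, level_difference(first, second))
```

Treating the grades as ordered integers makes it straightforward to check how many levels apart two raters are, which is the kind of comparison an ordinal agreement analysis relies on.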

Procedure
Two physical therapists (L.V.D. and K.E.R.) each used a standardized set of instructions and test movements that they had developed. After the raters agreed on the procedure for rating and the operational definitions for rigidity, a practice session was conducted. During this session, the raters became familiar with the use of the rating scale and practiced the study protocol on healthy individuals and patients with Parkinson's disease.

Each subject's UE PROM was assessed in the supine position before the degree of rigidity was rated. Each subject was then rated in the sitting position with the back supported. The UE to be rated was chosen on a random basis. Identical verbal instructions were given to each subject and included the following: The first rater would be moving the subject's arm, and the subject was to relax, allowing the tester to hold the arm completely, and neither resist nor assist in the movement. The rater then grasped the subject's elbow with one hand so that the subject's upper arm was supported by the rater's forearm. With her other hand, the rater grasped around the dorsum of the subject's hand and then performed a series of random movements of the extremity. The movements consisted of varying degrees of elbow flexion and extension, forearm pronation and supination, and wrist flexion and extension. Each joint was taken through the full available PROM at least once during a test. The random movements were performed at varying speeds within an individual test, with each test lasting no longer than one minute. The degree of rigidity detected was then recorded by the rater on a data sheet.

The same procedure was repeated by the second rater immediately after the first rater. Neither rater was allowed to observe the other rater during testing. Raters alternated being first and second in the testing procedure until all subjects were assessed. Data remained confidential until completion of the study.

Data Analysis
We analyzed agreement between the two raters with Cohen's weighted Kappa.15

RESULTS
The Table shows the agreement between raters. The raters agreed in 16 out of 20 trials (80%) and never disagreed by more than a single level on the rating scale. Of the 4 disagreements, neither rater consistently graded lower than the other rater. A weighted Kappa value of .636 (p = .20) was obtained.

TABLE
Interrater Agreement^a Between Therapists Rating Degree of Rigidity in Upper Extremities of Subjects with Parkinson's Disease (N = 20)

                                  Rater 2
Scale Category^b        0       1       2       3      Sum
Rater 1     0          0(0)     0       0       0        0
            1           2      7(4)     1       0       10
            2           0       1      9(5)     0       10
            3           0       0       0      0(0)      0
Sum                     2       8      10       0       20

^a Kw = .636, p = .20; number in parentheses indicates number of agreements expected on the hypothesis of chance association.
^b 0 = absent, 1 = slight, 2 = moderate, 3 = severe.

PHYSICAL THERAPY

DISCUSSION
The results of this study indicate a high percentage of observed agreement (80%) between raters in estimating the degree of rigidity detected in the subjects' UE. The results of the statistical analysis with Cohen's weighted Kappa,15 however, showed that agreement beyond chance was .636, which was not significant at the .05 level.

The discrepancy between the percentage of observed agreement and the obtained weighted Kappa value can partially be explained by the nonuniform distribution of our sample and the relatively small sample size. As stated by M. J. Strubbe (unpublished report), "The consequence of deviating from a uniform distribution is to increase the instability of the Kappa statistic." The majority of our subjects (90%) were classified in the 1 or 2 rigidity category (Table). The effect of such a nonuniform distribution is to increase the variance of the weighted Kappa, thus diminishing its strength as an estimate of reliability.

The small sample size (N = 20) we used may also have been a contributing factor to the discrepancy observed. Because the sample size was small and the number of subjects was used in the calculation of the probability value, an increase in sample size alone may improve the level of significance. An increase in the sample size, if selected to represent the actual spectrum of patient types, should also improve the uniformity of the distribution and potentially improve the stability of the weighted Kappa as an indication of reliability.12(pp638-640)

We attribute the observed agreement in this study to the raters' involvement in the development of the scale and to their practice in using the rating scale with patients.16 We also attribute the high degree of reliability to our attempt to operationally identify the testing procedure and define the rating scale as precisely as possible. Although both the Columbia Scale7 and Webster's scale9 include the rating of rigidity, we found both scales to be lacking in definition and standardization.

As discussed previously, testing was performed at the wrist, forearm, and elbow. Both raters noted during the study that, in some instances, the degree of rigidity varied among these joints, making rigidity difficult to estimate for the entire UE. Upon completion of the study, this difficulty was discussed. We decided that our rating in these instances was based most frequently on the greatest amount of resistance encountered.

Schwab stated, "It is well known that hypertonia varies from day to day in each patient and even from hour to hour. A host of variable and unknown factors (anxiety, tension, volitional effort, altered alertness, and relaxation) changes the level of the hypertonia, thereby confounding our efforts to measure it quantitatively and estimate it clinically."2 Although testing was performed in a standard method and within a short time frame, the actual degree of rigidity may have varied somewhat between rating attempts. This variance may have contributed to some of the disagreement between the raters in rating subjects in this study.
The definition of "moderate" rigidity in the scale we used encompassed a wide range of hypertonia. If the grade of moderate could be separated into two distinct ratings, the sensitivity of the scale could potentially be increased. This change could increase the overall reliability of the scale but would require further testing.

CONCLUSION
Rigidity is a sign frequently identified in patients with Parkinson's disease. Along with other clinical signs and symptoms of Parkinson's disease, rigidity is monitored to determine the natural history of the disease and to establish the efficacy of treatment and its effect on function. Standardized and reliable measurement of clinical signs and symptoms, including rigidity, therefore, is necessary. We developed and tested a clinical scale of rigidity and found that our observed agreement was 80%. When statistical analysis was performed, we found that our agreement beyond chance was 63.6% and was not significant. We have discussed the factors that may have contributed to this discrepancy. Further clinical investigations of the rigidity scale should increase the number and spectrum of subjects used.
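The agreement statistics summarized in this article can be recomputed from the cell counts in the Table. The following is a minimal sketch, not the authors' analysis code: the function and variable names are illustrative assumptions, and the article does not state which disagreement weights were used. With these counts, weighting every disagreement equally (the unweighted form of Cohen's Kappa) gives approximately .636, whereas linear weights (one unit per scale level) give approximately .667.

```python
# Counts from the Table: rows = Rater 1, columns = Rater 2, categories 0..3.
TABLE = [
    [0, 0, 0, 0],
    [2, 7, 1, 0],
    [0, 1, 9, 0],
    [0, 0, 0, 0],
]


def observed_agreement(table):
    """Proportion of subjects placed in the same category by both raters."""
    n = sum(sum(row) for row in table)
    return sum(table[i][i] for i in range(len(table))) / n


def weighted_kappa(table, weight):
    """Cohen's Kappa with disagreement weights weight(i, j); weight(i, i) must be 0."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Weighted disagreement actually observed.
    observed = sum(weight(i, j) * table[i][j]
                   for i in range(k) for j in range(k)) / n
    # Weighted disagreement expected if the raters were independent.
    expected = sum(weight(i, j) * row_tot[i] * col_tot[j]
                   for i in range(k) for j in range(k)) / n ** 2
    return 1 - observed / expected


def equal_weights(i, j):
    """Every disagreement counts the same (unweighted Kappa)."""
    return 0 if i == j else 1


def linear_weights(i, j):
    """Disagreement counts in proportion to the number of scale levels apart."""
    return abs(i - j)


print(round(observed_agreement(TABLE), 3))             # 0.8
print(round(weighted_kappa(TABLE, equal_weights), 3))   # 0.636
print(round(weighted_kappa(TABLE, linear_weights), 3))  # 0.667
```

Because every disagreement in the Table is exactly one level apart, the choice of weights changes only the chance-expected disagreement, which is why the two variants differ modestly here.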
REFERENCES
1. Marsden CD, Schachter M: Assessment of extrapyramidal disorders. Br J Clin Pharmacol 11:129-151, 1981
2. Schwab RS: Problems in the clinical estimation of rigidity (hypertonia). Clin Pharmacol Ther 5:942-946, 1964
3. Brumlik J, Boshes B: Quantitation of muscle tone in normals and Parkinsonism. Arch Neurol 4:399-406, 1961
4. Larsen TA, Calne S, Calne DB: Assessment of Parkinson's disease. Clin Neuropharmacol 7:165-169, 1984
5. Canter GJ, La Torre R, Mier M: Method for evaluating disability in patients with Parkinson's disease. J Nerv Ment Dis 122:143-147, 1961
6. Ward CD, Sanes JN, Dambrosia JM, et al: Methods for evaluating treatment in Parkinson's disease. In Fahn S, et al (eds): Experimental Therapeutics of Movement Disorders. New York, NY, Raven Press, 1983, vol 37, pp 1-7
7. Montgomery GK: Parkinson's disease and the Columbia scale. Neurology 34:557-558, 1984
8. Hoehn MM, Yahr MD: Parkinsonism: Onset, progression and mortality. Neurology 5:427-442, 1967
9. Webster DD: Critical analysis of the disability in Parkinson's disease. Modern Treatment 5:257-282, 1968
10. Montgomery GK, Reynolds NC, Warren RM: Qualitative assessment of Parkinson's disease: Study of reliability and data reduction with an Abbreviated Columbia Scale. Clin Neuropharmacol 8:83-92, 1985
11. Rothstein JM: Measurement and clinical practice: Theory and application. In Rothstein JM (ed): Measurement in Physical Therapy: Clinics in Physical Therapy. New York, NY, Churchill Livingstone Inc, 1985, vol 7, pp 1-46
12. Feinstein AR: Clinical Epidemiology: The Architecture of Clinical Research. Philadelphia, PA, W B Saunders Co, 1984, pp 53-68, 638-640
13. Adams RD, Victor M: Abnormalities of movement and posture due to disease of the extrapyramidal motor system. In Adams RD, Victor M: Principles of Neurology, ed 2. New York, NY, McGraw-Hill Book Co, 1981, p 53
14. Van Dillen LR, Nuessen J, Montgomery E, et al: A description of an all-day Parkinson's evaluation. Abstract. Phys Ther 68:864, 1988
15. Cohen J: A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20(1):37-46, 1960
16. Garraway WM, Gore AS, Prescott RJ, et al: Observer variation in the clinical assessment of stroke. Age Ageing 5:233-240, 1976
