
This article was downloaded by: [the Bodleian Libraries of the University of Oxford] On: 01 April 2012, At: 06:43

Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Clinical Child & Adolescent Psychology


Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hcap20

Evidence-Based Assessment of Attention Deficit Hyperactivity Disorder in Children and Adolescents


William E. Pelham, Jr., Gregory A. Fabiano & Greta M. Massetti Available online: 07 Jun 2010

To cite this article: William E. Pelham, Jr., Gregory A. Fabiano & Greta M. Massetti (2005): Evidence-Based Assessment of Attention Deficit Hyperactivity Disorder in Children and Adolescents, Journal of Clinical Child & Adolescent Psychology, 34:3, 449-476 To link to this article: http://dx.doi.org/10.1207/s15374424jccp3403_5


Journal of Clinical Child and Adolescent Psychology 2005, Vol. 34, No. 3, 449-476

Copyright 2005 by Lawrence Erlbaum Associates, Inc.

Evidence-Based Assessment of Attention Deficit Hyperactivity Disorder in Children and Adolescents


William E. Pelham, Jr., Gregory A. Fabiano, and Greta M. Massetti
Department of Psychology, State University of New York at Buffalo

This article examines evidence-based assessment practices for attention deficit hyperactivity disorder (ADHD). The nature, symptoms, associated features, and comorbidity of ADHD are briefly described, followed by a selective review of the literature on the reliability and validity of ADHD assessment methods. It is concluded that symptom rating scales based on the Diagnostic and Statistical Manual of Mental Disorders (4th ed. [DSM-IV]; American Psychiatric Association, 1994), empirically and rationally derived ADHD rating scales, structured interviews, global impairment measures, and behavioral observations are evidence-based ADHD assessment methods. The most efficient assessment method is obtaining information through parent and teacher rating scales; both parent and teacher ratings are needed for clinical purposes. Brief, non-DSM-based rating scales are highly correlated with DSM scales but are much more efficient and just as effective at diagnosing ADHD. No incremental validity or utility is conferred by structured interviews when parent and teacher ratings are utilized. Observational procedures are empirically valid but not practical for clinical use. However, individualized assessments of specific target behaviors approximate observations and have both validity and treatment utility. Measures of impairment that report functioning in key domains (peer, family, school) as well as globally have more treatment utility than nonspecific global measures of impairment. DSM diagnosis per se has not been demonstrated to have treatment utility, so the diagnostic phase of assessment should be completed with minimal time and expense so that resources can be focused on other aspects of assessment, particularly treatment planning. We argue that the main focus of assessment should be on target behavior selection, contextual factors, functional analyses, treatment planning, and outcome monitoring.
Attention deficit hyperactivity disorder (ADHD) is one of the most common mental health disorders of childhood, with prevalence rates ranging from 2% to 9% (American Academy of Child and Adolescent Psychiatry, 1997). By definition, children with ADHD have deficits compared to other children of the same age in attending to and completing tasks such as schoolwork, impulse control, and activity-level modulation. In addition, they have a host of impairments in multiple domains of functioning, including adult relationships (e.g., noncompliance with adult requests), school functioning (e.g., classroom disruption, poor achievement), and peer and sibling relationships (e.g., annoying, intrusive, overbearing, and aggressive behaviors). These difficulties continue into adolescence and adulthood even though core symptoms may improve with age (e.g., Barkley, Fischer, Smallish, & Fletcher, 2004; Mannuzza & Klein, 1999).

A great deal of research has examined ADHD over the past three decades, with most of it focused on psychopathology and treatment (Barkley, in press). Far less research has been directed toward assessment of ADHD beyond symptom cut points for a diagnosis. Few articles have attempted to delineate, for example, the best ways of combining information across informants and methods and whether the purpose of assessment influences the answers to questions about best assessment practices. Our goal is not to exhaustively review the literature on instruments that have been used to assess children


During the preparation of this article, William E. Pelham, Jr., was supported by grants from the National Institute of Alcohol Abuse and Alcoholism (AA11873), the National Institute on Drug Abuse (DA12414), the National Institute of Mental Health (MH53554, MH62946, MH065899), the Institute of Education Sciences (LO3000665A), and the National Institute of Neurological Disorders and Stroke (NS39087). Gregory A. Fabiano was supported by a National Institutes of Mental Health National Research Service Award (NRSA 1 F31 MH6424301A1). Greta M. Massetti was supported by American Psychological Association/Institute of Education Sciences Postdoctoral Education Research Training fellowship under Department of Education, Institute of Education Sciences Grant R305U030004. Requests for reprints should be sent to William E. Pelham, Jr., Department of Psychology, State University of New York at Buffalo, 3435 Main Street, 318 Diefendorf Hall, Buffalo, NY 14214. E-mail: [email protected]


with ADHD and their families; such a task would fill a textbook and would span the range from standardized tests of intelligence and achievement to measures of marital functioning in parents. Instead, our focus is on three fundamental aspects of assessment for ADHD: (a) the reliability and validity (e.g., concurrent, convergent, discriminant, and incremental) of approaches to assessing the two requisite features of ADHD, namely symptoms and associated impairment in daily life functioning; (b) how different instruments, procedures, and informants are best integrated to produce evidence-based assessment procedures for clinical practice; and (c) recommendations for evidence-based procedures with the best treatment utility and cost effectiveness.

BACKGROUND

Purposes of Assessment

The evidence base for assessment may vary as a function of the purpose of assessment (Mash & Hunsley, 2005): diagnosis, prognosis, treatment planning, or evaluation of outcome (Mash & Terdal, 1997). These purposes may vary depending on the setting in which assessment occurs. In research settings (for example, epidemiological or clinical studies), diagnostic assessments based on the Diagnostic and Statistical Manual of Mental Disorders (4th ed. [DSM-IV]; American Psychiatric Association, 1994) are conducted to determine whether a child's behavior deviates from normative child behavior sufficiently to meet criteria for a diagnosis. The goal is to maximize diagnostic homogeneity and thus etiological and prognostic homogeneity (though it is not clear that the former assures the latter). If the study has a longitudinal component, prognosis is also a purpose of the diagnosis. In basic research studies, assessment typically ends with the diagnosis. In clinical settings, diagnosis is also a purpose of assessment, although its main use is typically administrative (e.g., eligibility for services, reimbursement).
The underlying, fundamental reason for clinical assessment goes well beyond diagnosis: to determine the need for treatment, conceptualize the case, specify treatment goals, develop treatment targets, and monitor progress and outcome. Consider, for example, the primary care setting, where the majority of children with ADHD will receive initial evaluations (Hoagwood, Kelleher, Feil, & Comer, 2000). Recognizing the importance of accurately assessing and treating ADHD, the American Academy of Pediatrics (AAP; 2000, 2001) developed practice guidelines for the diagnosis, evaluation, and treatment of ADHD. The diagnostic guidelines state that primary care physicians should (a) screen for ADHD when core symptoms are present, (b) employ DSM-IV criteria, (c) gather information about DSM-IV symptoms directly from parents and teachers, (d) assess for functioning and coexisting conditions, and (e) not use diagnostic tests other than DSM-IV-based rating scales. The AAP (2001) treatment guidelines expand greatly on assessment, using the term "target outcomes" rather than "symptoms" to ensure that practitioners would broadly conceptualize the targets of treatment for ADHD. Clinicians are told to work with parents and schools to target domains for intervention and to measure resulting target outcomes over time to evaluate the effectiveness of treatment. Thus, the purpose of assessment in the AAP guidelines is relatively narrowly defined when the goal is diagnosis but more broadly conceptualized when treatment is the focus.

As in primary care, the main initial purpose of assessment in mental health and educational settings is to provide a diagnosis. After diagnosis, the main questions left to address involve treatment: whether the child is sufficiently impaired to need medication (primary care question), therapy (mental health), or special services (education), and then how to evaluate treatment outcomes. Thus, all purposes of assessment after diagnosis in clinical settings involve planning for or monitoring treatment, and assessment for ADHD must be designed to easily incorporate these functions.

There are two classes of logical targets for intervention in ADHD: the DSM-IV symptoms, and the various impairments in daily life functioning and deficient adaptive skills for which children are referred. These domains are not mutually exclusive, but neither do they overlap completely. Target behaviors could include (a) attention, impulsivity, and hyperactivity; (b) peer relationship difficulties, academic achievement, and dysfunctional parenting skills; or both.
We take the position that symptoms of ADHD are not socially valid targets for intervention and that beyond their use to make DSM-IV diagnoses, there is little reason to focus on them in a comprehensive, cost-effective approach to assessment. We argue instead that the common domains of impaired functioning discussed in this article are the reasons that children with ADHD are referred, the mediators of their long-term outcomes, and the most appropriate targets for initial assessment, intervention, and monitoring. Thus the main focus of assessment in ADHD should be impairment and adaptive skills. Further, beyond identification of target outcomes in these domains, assessment should focus on functional behavioral assessments (FBA) of these target domains with the goals of treatment planning and monitoring (Gresham, Watson, & Skinner, 2001). As we discuss here, the literature on assessment of ADHD has focused almost exclusively (but not entirely) on measuring the core symptoms of ADHD to yield DSM-IV diagnoses. There is thus a disconnection between the activities and tools with which clinicians should be familiar and the activities and tools they typically use to assess ADHD. We discuss this in more depth later.


Nature of ADHD

For the past 40 years, it has been widely accepted that the core symptoms of ADHD are inattention, impulsivity, and hyperactivity. As the DSM has gone through three editions and as theories of ADHD have been modified, the symptoms considered preeminent, and the ways they are best combined to yield an ADHD diagnosis or diagnostic subtype, have changed. Early theorists focused mostly (though not exclusively) on inattention, whereas more recently impulsivity and disinhibition have been considered the core cognitive deficit (Barkley, 1997). The precise nature of the attentional and impulse control deficits has been widely studied and debated (Douglas, 1999; Huang-Pollock & Nigg, 2003; Sergeant, Oosterlaan, & Van Der Meere, 1999). Most recently, the constructs of executive dysfunction and frontal lobe functioning have played prominent roles in discussions of the core cognitive deficit in ADHD (Barkley, in press). Research has been extended to the use of functional MRI and molecular genetics to examine biological differences between ADHD and comparison children (e.g., Castellanos et al., 2003; Castellanos & Swanson, 2002), though reliable markers of ADHD are yet to be demonstrated. Our lack of understanding of the biological or cognitive basis of ADHD may well be one reason that tests and procedures that tap those domains are not currently useful in diagnosis or assessment (AAP, 2000; Barkley, in press; Epstein et al., 2003; Matier-Sharma, Perachio, Newcorn, Sharma, & Halperin, 1995; Rapport, Chung, Shore, Denney, & Isaacs, 2000). Because the definition of ADHD is currently a behavioral one based on the individual's functioning in daily life (APA, 1994), assessment procedures must focus on the observable behavior as reported by adults or otherwise measured in natural (home and classroom) and laboratory (clinic, analogue classroom) settings.

DSM-IV Symptoms of ADHD

Despite the shifts in emphasis and the debate about the fundamental cognitive deficit, the core characteristics of ADHD have remained consistent for three decades. Thus, the current DSM-IV definition lists nine behavioral descriptors of inattention and nine descriptors of impulsivity/hyperactivity (APA, 1994). A diagnosis is given if (a) six symptoms from either or both lists are met (inattentive type, hyperactive-impulsive type, or combined type), (b) the symptoms are maladaptive and inconsistent with developmental level, (c) symptoms began before the age of 7 years, and (d) symptoms have associated clinically significant impairment in two or more settings. Most of the current symptoms listed in DSM-IV came from the two previous editions and before that from existing empirically developed rating scales (e.g., Conners, 1969; Quay & Peterson, 1983). Some of the symptoms (e.g., "often loses things necessary for tasks or activities") resulted directly from empirical work (Atkins, Pelham, & Licht, 1985). Others (e.g., "is often easily distracted by extraneous stimuli") have persisted because they are especially salient as reported by parents and teachers (e.g., Milich, Widiger, & Landau, 1987) despite fairly consistent failure in controlled studies to find supportive evidence (Huang-Pollock & Nigg, 2003). Other symptoms (e.g., "often engages in physically dangerous activities without considering the possible consequences"; "often acts without thinking") have been dropped from the DSM list despite evidence that they are among the most discriminating items or have high face validity (Frick et al., 1994; Pelham, Gnagy, Greenslade, & Milich, 1992). The fact that impulsivity items are embedded within, and arguably diluted by, twice the number of hyperactivity items may contribute to counterintuitive findings regarding the relative weight of inattention versus impulsivity in predicting outcomes among ADHD samples (e.g., Molina & Pelham, 2003). Indeed, subsets of items from the DSM-IV, including those that are worded slightly differently and appear on empirically derived scales, appear to be just as reliable, valid, and discriminating as the entire DSM-IV symptom list (see following discussion). To the extent that almost all of the work on assessing ADHD has had as its goal assessing the DSM-IV symptoms, the literature is by definition limited by the nature and grouping of those items. As the symptom criteria have changed in subsequent editions of the DSMs, so too has subtyping of ADHD.
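The four-part DSM-IV rule summarized earlier in this section can be written out as a short decision procedure. The following Python sketch is illustrative only and is not a diagnostic instrument: the class name `SymptomSummary`, the function `classify_subtype`, and all field names are hypothetical, and the sketch assumes that "six symptoms" means at least six endorsed symptoms within a list and that the clinician has already judged criteria (b) through (d).

```python
from dataclasses import dataclass
from typing import Optional

SYMPTOMS_PER_LIST = 9      # nine descriptors per list in DSM-IV
SYMPTOM_THRESHOLD = 6      # assumption: "six symptoms" read as six or more
ONSET_AGE_CUTOFF = 7       # symptoms must begin before age 7
MIN_IMPAIRED_SETTINGS = 2  # impairment in two or more settings


@dataclass
class SymptomSummary:
    """Hypothetical container for information gathered from informants."""
    inattention_count: int             # endorsed inattention symptoms (0-9)
    hyperactive_impulsive_count: int   # endorsed hyperactivity/impulsivity symptoms (0-9)
    developmentally_inappropriate: bool
    onset_age_years: float
    impaired_settings: int


def classify_subtype(s: SymptomSummary) -> Optional[str]:
    """Return the subtype label implied by the four criteria, or None."""
    for count in (s.inattention_count, s.hyperactive_impulsive_count):
        if not 0 <= count <= SYMPTOMS_PER_LIST:
            raise ValueError("symptom counts must be between 0 and 9")

    # Criteria (b), (c), and (d) must all hold before subtyping.
    if not s.developmentally_inappropriate:
        return None
    if s.onset_age_years >= ONSET_AGE_CUTOFF:
        return None
    if s.impaired_settings < MIN_IMPAIRED_SETTINGS:
        return None

    # Criterion (a): which symptom list(s) reach the threshold.
    inattentive = s.inattention_count >= SYMPTOM_THRESHOLD
    hyperactive = s.hyperactive_impulsive_count >= SYMPTOM_THRESHOLD
    if inattentive and hyperactive:
        return "combined"
    if inattentive:
        return "inattentive"
    if hyperactive:
        return "hyperactive-impulsive"
    return None
```

Writing the rule this way makes plain that the symptom count determines only the subtype label; the developmental, onset, and impairment criteria act as gates that no symptom count can override.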
The fact that the symptoms were listed in three (Diagnostic and Statistical Manual of Mental Disorders, 3rd ed.; American Psychiatric Association, 1980), one (Diagnostic and Statistical Manual of Mental Disorders, 3rd ed., rev.; American Psychiatric Association, 1987), and two (DSM-IV) clusters reflected changing thinking among researchers based mainly on factor analytic studies (e.g., Lahey et al., 1988). Children diagnosed with the DSM-IV hyperactive-impulsive subtype are relatively rare and consist of predominantly young (e.g., kindergarten-age) children (Lahey et al., 1994) who will become diagnosed as combined type when they reach the age in school where sustained attention to task is required (Lahey, Pelham, Loney, Lee, & Willcutt, in press). The inattentive subtype has been shown to be distinct from the combined type not only by definition (not having a sufficient number of hyperactive-impulsive symptoms) but also by virtue of differential associated features including severity, impairment, familial history, and outcomes (Milich, Balentine, & Lynam, 2001). However, with respect to assessment, the same procedures need to be employed for assessing subtypes of ADHD, because the subtypes have a common set of symptoms that result in similar functional impairments (Pelham, 2001).

Impairment in ADHD

Although both are required, researchers, clinicians, and school personnel often emphasize the importance of obtaining an accurate assessment of DSM-IV symptoms with relatively less emphasis on the assessment of impairment. This emphasis may be misplaced for some of the purposes of assessment. For example, DSM-IV symptoms of ADHD do not predict long-term outcome (e.g., Mannuzza & Klein, 1999) and are not the basis of referrals for treatment (Angold, Costello, Farmer, Burns, & Erkanli, 1999). In contrast, three areas of psychosocial impairment common in ADHD children (difficulties in family functioning, peer relationships, and academic functioning) are predictive of negative long-term outcome, are typically the basis of referral, and arguably are the target behaviors that must be modified to improve both current and long-term functioning (e.g., Angold et al., 1999; Chamberlain & Patterson, 1995; Huesmann, Eron, Lefkowitz, & Walder, 1984; Jimerson, Egeland, Sroufe, & Carlson, 2000). We discuss each of these three domains in the following.

Children with ADHD live in families with a host of problems that both impact and are impacted by the child. Relative to comparison groups, parents of children with ADHD report more frequent and severe interparental discord and child-rearing disagreements, more negative parenting practices, greater parenting stress and caregiver strain, and more psychopathology themselves (Johnston & Mash, 2001). These factors both contribute to and are exacerbated by the child's ADHD (e.g., Lang, Pelham, Atkeson, & Murphy, 1999; Pelham et al., 1998) and highlight the importance of assessing the familial context of children with ADHD, including the nature of parenting skills, which both predict and mediate long-term outcomes and are thus key targets for intervention.

The source of most complaints about ADHD is the classroom teacher.
Indeed, studies of children with ADHD in classroom settings have routinely documented that they are more off-task, complete less assigned work with less accuracy, are more disruptive and break more classroom rules, and are less likely to comply with adults compared to other children (e.g., Atkins et al., 1985). These behaviors contribute to lower levels of academic achievement and higher rates of disciplinary referrals, retention, and later dropout (e.g., DuPaul & Stoner, 2003). Comprehensive assessment of ADHD must therefore include evaluation of classroom behavior and academic functioning.

It has long been known that most children with ADHD have severe difficulties interacting with other children (Milich & Landau, 1982; Nangle & Erdley, 2001). They are bossy, intrusive, immature, boisterous, boastful, less aware of social cues, and aggressive, both physically and verbally, relative to other children. Numerous studies have documented these difficulties in multiple settings (Cunningham & Siegel, 1987; Hinshaw & Melnick, 1995). Peer nomination procedures reveal that children with ADHD or high levels of ADHD behaviors are dramatically more disliked and less well liked than comparison children (Pelham & Bender, 1982). Although not all children with ADHD have problems with peers, disturbances in peer relationships are among the best predictors and mediators of a variety of adverse adult outcomes (Coie & Dodge, 1998; Huesmann et al., 1984) and are a key focus of assessment in ADHD.

There are many well-validated, standardized instruments used to assess functioning in the domains of family relationships, peer relationships, and school functioning. These include parent, teacher, and peer reports and direct assessments of IQ and academic achievement. In contrast, the child's general lack of insight into problem areas and concomitant overestimation of abilities and skills in specific functional domains (e.g., academic and social performance; Hoza, Pelham, Dobbs, Owens, & Pillow, 2002), along with the low reliability of child report (e.g., Shaffer, Fisher, Lucas, Dulcan, & Schwab-Stone, 2000), underscore the lack of support for child report in clinical assessments for ADHD (Hart, Lahey, Loeber, & Hanson, 1994; Loeber, Green, & Lahey, 1990). Because of these factors, child report is not a valid method of obtaining an ADHD diagnosis or planning for treatment.

Comorbidity

Epidemiological surveys (e.g., August, Realmuto, MacDonald, Nugent, & Crosby, 1996; Pelham, Evans, Gnagy, & Greenslade, 1992) and clinical studies (e.g., MTA Cooperative Group, 1999a, 1999b) evidence high concentrations of comorbid DSM disorders in samples of children with ADHD.
The highest comorbidity is between ADHD and disorders related to aggression (i.e., oppositional defiant disorder and conduct disorder; Lahey, Miller, Gordon, & Riley, 1999) and learning problems, with much lower rates of comorbid internalizing problems (MTA Cooperative Group, 1999a). The procedures for making DSM-IV diagnoses of comorbidities are similar to the procedures for ADHD with a few exceptions (see other articles in this special section), and we do not discuss them here. However, a critical question in the assessment of comorbid disorders is identical to that in assessments for ADHD: should a DSM-IV diagnosis be the main focus? Though many children with ADHD meet criteria for comorbid diagnoses, the diagnosis is of little use beyond studies of epidemiology. This is because comorbid diagnoses per se typically have small to no effect on treatment outcome or approach (e.g., Kolko, Bukstein, & Barron, 1999; MTA Cooperative Group, 1999b). The nature of the associated impairment (that is, what behaviors will be targeted in treatment) is of much greater value for clinicians (Pelham & Fabiano, 2001). For example, it is widely thought that a comorbid diagnosis of conduct disorder confers increased risk for antisocial outcomes among children. However, the bulk of the extant longitudinal literature actually employs parent, teacher, and peer ratings of aggressive and disruptive behavior and peer nominations of disliking and rejection (that is, measures of impairment), not DSM-IV conduct disorder diagnoses (Coie & Dodge, 1998). The most parsimonious approach to diagnosing comorbidities is arguably to rely on rating scales that have broad coverage of symptoms rather than employing a full diagnostic structured interview, which would be costly and time intensive. In keeping with the theme of our approach, the assessment of comorbidity is integrated into the broader assessment goals of using an FBA to operationalize the presenting problems and gathering information on setting events and antecedents, as well as environmental contingencies that may precipitate, maintain, or exacerbate the behavior. These targeted behaviors are then tracked along with outcomes associated with ADHD to determine whether meaningful change has been obtained.

Review of the Evidence for ADHD Assessment Methods

In the following we discuss each of the major techniques involved in assessment and the evidence for the respective techniques. As recommended in most guidelines for diagnosis and assessment (e.g., American Academy of Child and Adolescent Psychiatry, 1997; AAP, 2001), we focus on gathering information from adults in the child's life via ratings and interviews and from observational data in the natural setting, and we deal with both symptoms of ADHD and ADHD-related impairments. Numerous, comprehensive reviews are available on assessment strategies for ADHD (e.g., Anastopoulos & Shelton, 2001; Barkley, in press; Collett, Ohan, & Meyers, 2003; Hinshaw & Nigg, 1999), and there are a number of well-developed, commercially available rating scales for assessing ADHD and related impairment, both narrow and broadband. Our purpose in this section is to selectively review the literature and determine the evidence base for some of the more common assessment instruments utilized for children with ADHD in research and clinical practice. This review is therefore not exhaustive and is limited by the measures chosen for inclusion. At the same time, most instruments within a method of assessment (e.g., rating scales) have similar levels of evidence and psychometrics, so some degree of generalization to instruments not reviewed would be appropriate. In some cases, the method of assessment is generic (e.g., individualized daily target behavior frequencies), and we have selected studies of which we are aware that presumably accurately reflect a much larger universe of use in practice.

Table 1 displays information on the reliability and validity of common ADHD assessment measures. When available, we present information on internal consistency (i.e., the relation among the individual items on the scale), test-retest reliability (i.e., temporal stability), and interrater reliability (i.e., the relation among independent raters' scores). The second section of the table displays validity information for each measure. When available, we present information on concurrent validity (i.e., the relation between the measure and similar measures), predictive validity (i.e., the ability of the measure to accurately discriminate between groups), and convergent and discriminant validity (i.e., the relation of the measure to others, correlating with measures purported to assess the same construct and not correlating with measures purported to assess a different construct; Campbell & Fiske, 1959). We also note whether the measure is sensitive to behavioral and pharmacological treatment effects.

ADHD Symptom Rating Scales

Rating scales of ADHD-like behavior have been used since the late 1960s to describe and diagnose participants in research studies and to measure treatment outcomes (e.g., Conners, 1969; Goyette, Conners, & Ulrich, 1978; Quay & Peterson, 1983). The first DSM symptom-based rating scale of ADHD, the Swanson, Nolan, and Pelham Rating Scale (Atkins et al., 1985), was constructed because no parent or teacher rating scale of the Diagnostic and Statistical Manual of Mental Disorders (3rd ed.) attention deficit disorder symptoms existed. Swanson, Nolan, and Pelham wanted to make a diagnosis using the new attention deficit disorder category to enroll individuals in a study, so they listed the Diagnostic and Statistical Manual of Mental Disorders (3rd ed.) symptoms of attention deficit disorder using the response format of the widely used Conners Rating Scales. Since then, many rating scales have been developed based on the Swanson, Nolan, and Pelham Rating Scale. There are few differences among these scales beyond the range of items assessed. For example, the Disruptive Behavior Disorders rating scale (Pelham et al., 1992) includes Diagnostic and Statistical Manual of Mental Disorders (3rd ed., rev.) and DSM-IV ADHD symptoms, making it useful in research settings where comparisons to studies that used prior DSM algorithms are desired. It also includes the DSM-IV symptoms of oppositional defiant disorder and conduct disorder.

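The reliability coefficients reported in Table 1 fall into two families: internal consistency, usually expressed as coefficient alpha, and test-retest or interrater reliability, usually expressed as a Pearson correlation. As a rough illustration of how such coefficients are obtained from rating-scale data, the following Python sketch computes both. It is a minimal example and not the procedure used in any study cited here; the function names and the small data sets in the usage note are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a list of respondents, each a list of k item ratings.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    n_respondents = len(item_scores)
    k = len(item_scores[0])
    if n_respondents < 2 or k < 2:
        raise ValueError("need at least two respondents and two items")
    if any(len(row) != k for row in item_scores):
        raise ValueError("every respondent must rate the same number of items")

    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists,
    e.g., time 1 versus time 2 ratings, or parent versus teacher ratings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

For example, with made-up ratings from four respondents on three items scored 0 to 3, `cronbach_alpha([[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 3, 3]])` gives the internal consistency of the three-item scale, and `pearson_r(time1_totals, time2_totals)` gives a test-retest coefficient when the same informants rate the same children on two occasions.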

Table 1. Summary of Reliability and Validity Information for Rating Scales of ADHD Symptoms, Broadband Behavior Rating Scale Subscales, Structured and Semistructured Interviews, and Ratings of Psychosocial Impairment
Reliability Type of Measurea DSMIV ADHD rating scale Internal Consistency Not reported Validity Convergent and Discriminant Convergent: r = .23 to .35 for measures of academics; r = .31 to .37 for observations of verbal intrusion. Discriminant: Academic measures and group play observational measure discriminated between teacher ratings of H-I/I and peer relationships Not reported Sensitive Measure of Treatment Outcome Sensitive to behavioral and pharmacological treatment effects in multiple studies

Downloaded by [the Bodleian Libraries of the University of Oxford] at 06:43 01 April 2012

Type of Rating Scale Swanson, Nolan, & Pelham rating scale (SNAP; Atkins et al., 1985; Atkins et al., 1988; Gaub & Carlson, 1997; MTA Cooperative Group, 1999a; Pelham & Bender, 1982)

TestRetest r = .77 to .80 (T)

Interrater Not reported

Concurrent r (SNAP Impulsivity score, SNAP Peer interaction items) = .73

Predictive Different subtypes of ADHD exhibit differential patterns of impairment relative to a control group

Disruptive Behavior Disorders rating scale (DBD; Massetti et al., 2003; Pelham Gnagy, et al., 1992, Pelham, Evans, et al., 1992; Pelham & Hoza, 1996) ADHD rating scale (DuPaul, 1991; DuPaul, Anastopoulos, et al., 1998; DuPaul et al., 1997; DuPaul, Power, et al., 1998; Gomez et al., 1999; Power et al., 1998)

= .91 to .96 (T) = .82 to .85 (P)

r = .49 to .61 (P, before treatment/after treatment)

r = .14 to .26 (P, T)

r (P DBD rating scale, DISC) = .38 to .62

Not reported

Sensitive to behavioral and pharmacological treatment effects in multiple studies Not established (though items are identical to other symptom rating scales used to measure treatment effects)

= .88 to .95 (T) = .86 to .92 (P)

r = .55 to .90 (T); r = .70 to .86 (P)

r = .40 to .59 (P, T)

Vanderbilt rating scale (Wolraich et al., 1998; Wolraich et al., 2003)

= .90 to .94 (T). = .94 to .95 (P).

Not reported

r = .32 (P, Identified child with ADHD; T, Identified child with ADHD).

r (ADHD rating scale, T Conners) = .80 to .88 (T). r (ADHD rating scale, Abbrev Conners T scale) = .77 to .90. r (ADHD rating scale, P Conners) = .68 to .84 (P) r (P Vanderbilt rating scale, CDISC) = .79

Statistically significant differences in mean scores between ADHD and control groups (P, T); Effective in predicting ADHD subgroup and ADHD versus control group membership. Not reported

Some evidence for discriminant validity of H/I and I scales. Moderate correlations with observations of on-task behavior. For I factor, moderate correlations with academic productivity (P, T)

r (T rating I, measures of impairment) = .50 to .66. r (T rating H/I, measures of impairment) = .23 to .63. T rating scale H-I/IA correlated with learning problem status to a much lesser extent.

Not established (though items are very similar to other symptom rating scales used to measure treatment effects)

Broad band rating scales

ADHD Symptom Checklist-4 (Gadow & Nolan, 2002; Gadow & Sprafkin, 1997; Gadow et al., 2001; Mattison et al., 2003; Sprafkin et al., 2001; Sprafkin et al., 2002)
CBCL/TRF (Achenbach, 1991; Achenbach & Edelbrock, 1981; Achenbach & Rescorla, 2001; Anastopoulos et al., 1993; Barkley et al., 2000; Ostrander et al., 1998)
Child Attention Problems (Barkley, 1990; Barkley et al., 1990; Power, Andrews, et al., 1998)

= .87 to .92 (P). = .94 to .95 (T).

r > .65 (P). r = .75 to .85 (T)

r = .23 to .51 (P, T). r = .50 to .54 (T, teacher aide).

r ADHD diagnoses, Attention problems(ADHD Symptom Checklist 4, TRF) = .35 to .78. r (CBCL attention problem scale, Conners) = .59. r (CBCL, Behavior problem checklist) = .66 to .77. r (TRF, Conners) = .80. r (other ADHD symptom rating scales) >>.90

When P and T ratings are combined, Sensitivity = .91

Convergent/discriminant validity with CBCL attention problems and delinquency factors.

(CBCL) = .84. (TRF) = .94.

Split-half r = .84

r (CBCL) = .90 (1-week), .71 to .77 (1-year), .75 (2-year). r (TRF) = .96 (15-day), .77 (2-month), .73 (4-month). Test-retest r = .96 (2 weeks), .76 (2 months), .70 (4 months)

r (CBCL) = .93 to .96 (interviewer), .79 (interparent). r = .61 (interteacher), .62 (teacher-aide). r (interteacher) = .77

CBCL, TRF competency scales lower and problem scales higher in clinically referred youth versus nonreferred youth.

Multitrait method matrix in CBCL/TRF manuals illustrating convergent validity of the scale.

Not established (though items are identical to other symptom rating scales used to measure treatment effects) Sensitive to behavioral treatment effects in multiple studies

The Behavior Assessment System for Children (BASC; Ostrander et al., 1998; Reynolds & Kamphaus, 2002)

Attention problem Composite = .85 to .87 (T), .76 to .81 (P)

Attention problem Composite r = .83 to .92 (T), .78 to .92 (P)

r (teacher 1, teacher 2) = .63 to .69; r(mother, father) = .56 to .73

BASC scales correlate highly with corresponding CBCL/TRF scales and the Conners Rating scale. r (CBCL attention problem scale, Conners) = .59. r (CBCL, Behavior problem checklist) = .66 to .77. r (TRF, Conners) = .80. Not reported

Conners Parent and Teacher rating scales (I and HI subtests; Achenbach, 1991; Conners et al., 1998a, 1998b, Goyette, et al., 1978; Roberts et al., 1981)

= .73 to .95 (T). = .75 to .94 (P)

r = .47 to .87 (T). r = .71 to .78 (P).

r (Mother, Father) = .55. r (Parent, Teacher) = .49.

Discriminated between children referred for services and those not referred; Discriminated between children with H/I and those without Groups of children with preexisting diagnoses exhibit distinct BASC profiles; differentiates between attention problems and hyperactivity impulsivity. Overall correct classification rate = 93.4% (P) and 84.7% (T).

Not reported

Sensitive to pharmacological treatment effects

Not reported

Not established

Convergent and discriminant validity supported by multitrait, multimethod matrix

Sensitive to behavioral and pharmacological treatment effects in multiple studies

IOWA Conners Rating scale (Atkins et al., 1989; Kolko et al., 1999; Loney & Milich, 1982; Milich et al., 1982; Pelham et al., 1989; Pelham et al., 1993)

(T I/O) = .80 to .89. (T O/D) = .85 to .92. r (I/O, O/D) = .60 to .63

r (T I/O) = .89. r (T I/O, 2-year) = .42. r(T O/D) = .86

r (teacher, teacher) = .35 to .49

Differentially identifies children with hyperactive behavior versus aggressive behavior. Discriminated ADHD children from controls.

Academic variables, peer relationship variables, neat desktop, and behavioral observations of disruptive behavior differentiate I/O from O/D factor.

Sensitive to behavioral and pharmacological treatment effects in multiple studies

Table 1. (Continued)

Reliability Type of Measurea Structured Interviews Internal Consistency r > .90, = .60, ICC = .75

Validity Convergent and Discriminant Not reported Sensitive Measure of Treatment Outcome Not established

Type of Rating Scale
Diagnostic Interview for Children and Adolescents-Revised (DICA-R; August, Braswell, & Thuras, 1998; Boyle et al., 1993; Reich, 2000; Reich, Shayka, & Taibleson, 1991)
Diagnostic Interview Schedule for Children Version IV (DISC-IV; Jensen et al., 1996; Lewczyk et al., 2003; MTA Cooperative Group, 1999a; Shaffer et al., 2000)
Kiddie Schedule for Affective Disorders and Schizophrenia (K-SADS; Biederman et al., 1993; Orvaschel, 1985; Tillman et al., 2003)

Test-Retest = .78 to .86 for parents, .24 to .43 for self-reports

Interrater = .01 to .34 (parent vs. self-report); .71 (lay vs. psychiatrist reports)

Concurrent Not reported

Predictive Good concordance, specificity, and sensitivity of assessment and diagnosis reported

Semistructured interviews

Child and Adolescent Psychiatric Assessment (CAPA; Angold & Costello, 2000)

= .60 (P), .10(Y), .48(P+Y); ICC = .84(P), .65(Y), .79(P+Y); = .22 Not reported for ADHD; moderate to strong for affective and conduct disorders Not reported for ADHD

.79 (P), .42 (Y), .62 (P+Y)

Not reported for ADHD

.70 (P), .10 (Y), .48 (P+Y) for symptom counts; .65 (P), .19 (Y), .56 (P+Y) for criteria + impairment = .56

.72 (P), .27 (Y), .70 (P+Y) (r with clinician scores); however, only minor diagnostic agreement with clinician ratings Demonstrated excellent convergence with CBCL Attention Problems scale

Exhibited excellent ability to discriminate probands relative to the CBCL

Classifications using DISC at higher risk for indexes of impairment (i.e., school dysfunction, family distress) for parent but not child reports

Sensitive to behavioral and pharmacological treatment effects

Discriminated between children with ADHD and children with bipolar disorder

Not reported

Not established

Not reported for ADHD

Concordance with interviewer ratings of inattention and H/I

Significant relations with CBCL and TRF scores

Not reported

CAPA interviews reflect clinical and research experience and prevalence data, comorbidity patterns; children diagnosed using CAPA are at increased risk for impaired functioning

Not established

Impairment rating scales

Columbia Impairment Rating scale (CIR; Bird et al., 1993; Bird et al., 1996)

r = .82 to .89 (P); r = .70 to .78 (Child). Not reported

r = .89 (P); r = .63 (Child)

Not reported

Children's Global Assessment of Functioning (CGAS; Bird et al., 1987; Bird et al., 1990; Shaffer et al., 1983)

r = .74 to .84 (Cl)

r = .69 to .87 (Cl)

Impairment Rating Scale (IRS; Evans et al., under review; Fabiano et al., under review; Pelham & Hoza, 1996)

Not reported

Vanderbilt Rating Scale-Teacher Version (Wolraich et al., 1998)

Not reported

r (1-year) = .40 to .67 (T). r (1-year) = .54 to .76 (P). T r = .64 to .89 (6-month); .57 to .84 (4-month); .66 to .98 (2-month). P r = .60 to .89 (6-month); .76 to .91 (4-month); .82 to .95 (2-month). Not reported

r (P,T) = .33 to .83

r (P CIR, P CGAS) = .35 to .73. r (Child CIR, Child CGAS) = .48 to .50. r (CGAS, GAF) = .87 to .92; r (CGAS, severity scale) = .80 to .90; r (CGAS, P Abbreviated Conners) = .25; r (CGAS, CBCL) = .40 to .65. r (IRS, P CGAS) = .62 to .77 (T). r (IRS, Interviewer CGAS) = .55 to .73 (P).

Mean scores higher in clinical participants compared to community respondents Scores associated with service use and need, and behavior problem ratings. Significant mean score difference between clinical cases and controls and accurate classification rates. Positive predictive power = .85 to .92 (T). Negative predictive power = .82 to .91 (T). Positive predictive power = .91 to .97 (P). Negative predictive power = .80 to 1.00 (P); Predicts use of mental health or school services

Correlations with other measures of psychological dysfunction = .32 to .71 (P), .08 to .41 (Child). Not reported

Not established

Not established

Evidence for convergent validity, the IRS correlated moderately with behavioral observations and frequency counts of behavior.

Sensitive to behavioral and pharmacological treatment effects

Child and Adolescent Functional Assessment Scale (CAFAS; Hodges et al., 1999; Hodges & Wong, 1996)

r = .73 to .78

Not reported

r (Classroom Behavior Performance) = .94; r (Academic Performance) = .95 r (Total) = .92 to .96

Not reported

Not reported

r (Classroom Behavior Performance, ADHD symptoms) = .53 to .66; r (Academic Performance, ADHD symptoms) = .23 to .50 Higher impairment associated with behavioral indices of impairment (i.e., poor grades and school attendance, contact with the law).

Not established

Not reported

Higher impairment ratings associated with more severe disorders.

Sensitive to behavioral treatment effects

Table 1. (Continued)

Reliability Type of Measurea Observations Internal Consistency Not reported

Validity Convergent and Discriminant Convergent or discriminant validity with teacher ratings of inattention and aggression Sensitive Measure of Treatment Outcome Sensitive to behavioral and pharmacological treatment effects in multiple studies

Type of Rating Scale Classroom Observations of Conduct and Attention Deficit Disorders (COCADD; Atkins et al., 1985; Atkins et al., 1988; Atkins et al., 1989)

Test-Retest Not reported

Interrater Phi coefficient = .52 to .95 (all but one > .60). (classroom) = .67 to 1.00. (playground) = .79 to 1.00. (Desk observations) = .60 to .79 Mean Phi coefficient for interobserver agreement = .76 to .82

Concurrent Not reported

Predictive Discriminant function reliably identified children with ADHD and those without.

Classroom behavior code (Abikoff et al., 1977, 1980; Klein & Abikoff, 1997; MTA Cooperative Group, 1999a)

Not reported

Response Class Matrix (Barkley & Cunningham, 1979; Cunningham & Barkley, 1979; Cunningham & Siegel, 1987; Mash & Johnston, 1982; Mash, Terdal, & Anderson, 1973) Playroom Observations (Milich et al., 1982, 1986)

Not reported

As expected, ADHD group had greater variability across categories over successive observations. Not reported

Not reported

Interobserver agreement = .76 to .97

Not reported

Accurately discriminated between hyperactive and comparison children and code accurately classifies children in groups (80% correctly classified) Observation code differentiated between children with ADHD and those without.

Not reported

Sensitive to behavioral and pharmacological treatment effects in multiple studies

Not reported

Sensitive to pharmacological treatment effects in multiple studies

Not reported

r (Free play, 2-year) = .08 to .53. r (Restricted academic, 2-year) = .09 to .57.

Interobserver agreement = .87 to .95

Not reported

Not reported

Observations of behaviors in both settings uniquely accounted for variance on the Conners Hyperactivity factor; Observations differentiated between Hyperactivity and Aggression factors

Not established

Barkley (Barkley, 1990; Barkley et al., 2000; DuPaul, 1991)

Not reported

Not reported

Interobserver agreement = .77 to .85

Not reported

Not reported

STP point system (Chronis et al., 2004; Pelham et al., 1993; Pelham, Greiner, & Gnagy, 1998; Pelham & Hoza, 1996; Pelham et al., 2001; Pelham et al., 2005; Pelham et al., 2000) Individualized target behavior evaluation (ITBE; Pelham, unpublished data; Pelham et al., 2001; Pelham et al., 2002; Pelham et al., in press)

Not reported

r (Placebo condition, Low MPH condition) = .57

= .77 to .88

r (even days, odd days) = .62

Interobserver agreement = .66 to .98; (following rules) = .65 to .89 across sites in MTA Not reported

r (STP point system, Observational measure) = .43 to 1.00; M = .72.

Discriminates between ADHD and comparison children

r = .25 to .30 for parent ADHD ratings and observations of on-task class behavior; r = .46 to .57 for teacher ADHD ratings and observations of on-task class behavior Not reported

Sensitive to behavioral treatment effects

Sensitive to behavioral and pharmacological treatment effects in multiple studies Sensitive to behavioral and pharmacological treatment effects in multiple studies

Not reported

Discriminates between different treatment manipulations (e.g., medicated versus unmedicated days)

r (ITBE, Teacher I/O) = .58 to .74. r (ITBE, Teacher O/D) = .51 to .72. r (ITBE, Counselor I/O) = .51. r (ITBE, Counselor O/D) = .64. r (ITBE, STP Point system measures) = .47 to .84.

Note: DSM-IV = Diagnostic and Statistical Manual of Mental Disorders (4th ed.; APA, 1994); ADHD = Attention-deficit/hyperactivity disorder; T = Teacher; P = Parent; Y = Youth; H/I = Hyperactive-Impulsive; I = Inattentive; TRF = Teacher Report Form; CBCL = Child Behavior Checklist; I/O = Inattentive/Overactive IOWA Conners factor; O/D = Oppositional/Defiant IOWA Conners factor; GAF = Global Assessment of Functioning; STP = Summer Treatment Program.
a The table does not include an exhaustive review of measures or studies.

The Vanderbilt rating scale (Wolraich, Feurer, Hannah, Baumgaertel, & Pinnock, 1998; Wolraich et al., 2003), the current version of the Swanson, Nolan, and Pelham Rating Scale, and the Child Symptom Inventory (Sprafkin, Gadow, & Nolan, 2001) also include symptoms of other, commonly comorbid disorders. Standardized ADHD rating scales are currently recommended by the American Medical Association (Goldman, Genel, Bezman, & Slanetz, 1998), the AAP (2000), the American Academy of Child and Adolescent Psychiatry (1997), and expert consensus (Lahey & Wilcutt, 2002).

Table 1 lists common ADHD rating scales. The scales listed have parent and teacher versions (or are appropriate for either rater). They are all clearly reliable, although the stability of scale scores is less consistent when intervention occurs in the interim or when the retest interval lengthens. Cross-informant reliabilities are low, ranging from .14 to .59, and show that raters differ in their evaluations of ADHD behavior. The rating scales are effective at discriminating between clinical and nonclinical groups (AAP, 2000) and among subgroups of children with ADHD (e.g., Power et al., 1998). Finally, these rating scales have a long history of use as measures of treatment outcome and are clearly sensitive to both behavioral and pharmacological treatment effects (e.g., MTA Cooperative Group, 1999a).

Broadband Rating Scale Subscales

Broadband scales include items that span the range of child psychopathologies and include both rationally and empirically derived items. Although they are not currently recommended for the diagnosis of ADHD in clinical practice (AAP, 2000) because the broad domain factors (e.g., externalizing) do not accurately identify children with ADHD (Brown et al., 2001), many studies have investigated the psychometric properties of subscales of these measures in relation to ADHD.
The Child Behavior Checklist (CBCL) and Teacher Report Form (Achenbach & Rescorla, 2001) and the Behavior Assessment System for Children (Reynolds & Kamphaus, 2002) are two widely used broadband assessments. These scales differ in that the CBCL is empirically derived, whereas the Behavior Assessment System for Children is rationally derived. A strength of the measures is that both have detailed manuals that list normative information across gender and developmental levels for many disorders in addition to ADHD. Both scales include an Attention Problems subscale (Achenbach & Rescorla, 2001; Edelbrock & Costello, 1988; Ostrander, Weinfurt, Yarnold, & August, 1998), and such subscales are frequently used as a proxy for ADHD diagnosis in research studies (e.g., the Conduct Problems Prevention Research Group, 2002; Hartman, Stage, & Webster-Stratton, 2003). Other non-DSM-IV-based ADHD rating scales include the Conners Parent and Teacher Rating Scales (Conners, Sitarenios, Parker, & Epstein, 1998a, 1998b), the Inattentive/Overactive factor of the IOWA Conners Rating Scale (Loney & Milich, 1982), and the Child Attention Problems Rating Scale (Barkley, 1990; Edelbrock & Costello, 1988), which is rationally derived from the Teacher Report Form.

Table 1 lists reliability and validity information on these broadband subscales. The measures are highly correlated with each other. Across days or months, the subscales are also reliable. The Attention Problem subscales on the broadband measures are also highly related to DSM-IV diagnoses of ADHD. Ostrander et al. (1998) reported that the Behavior Assessment System for Children correctly classified 97.7% of ADHD cases diagnosed using the parent version of the Diagnostic Interview for Children and Adolescents (Boyle et al., 1993). They further reported that the CBCL was effective in identifying children with ADHD, primarily inattentive type. Similarly, Chen, Faraone, Biederman, and Tsuang (1994) found that the CBCL Attention subscale was highly accurate in identifying children with ADHD. It is important to note, however, that both studies used a clinical sample with a high base rate of ADHD, and these studies must be replicated in more diverse samples. The Conners Rating Scale and its short forms are also well validated. When the Conners is compared with other measures of symptoms (e.g., the Diagnostic Interview Schedule for Children; Shaffer et al., 2000), there is concurrent validity, with correlations ranging from .68 to .80. The IOWA Conners Rating Scale differentially identifies children with hyperactive and aggressive behavior (Loney & Milich, 1982), and it is significantly related to objective measures of behavior such as peer sociometrics and academic achievement (Atkins, Pelham, & Licht, 1989).
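The caution above about samples with a high base rate of ADHD can be made concrete with a short calculation. The sketch below is only an illustration: the function name and the sensitivity, specificity, and base-rate values are hypothetical and are not taken from Ostrander et al. (1998), Chen et al. (1994), or Table 1. It applies Bayes' rule to show how the same cutoff yields very different positive predictive power in a clinic sample and in a community sample.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (positive, negative) predictive power via Bayes' rule.

    All three arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    false_neg = (1 - sensitivity) * base_rate
    true_neg = specificity * (1 - base_rate)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical scale with sensitivity = specificity = .90, evaluated in a
# clinic sample (assumed base rate .50) and a community sample (assumed .05).
for base_rate in (0.50, 0.05):
    ppv, npv = predictive_values(0.90, 0.90, base_rate)
    print(f"base rate {base_rate:.2f}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under these assumed values, positive predictive power falls from .90 at a base rate of .50 to roughly .32 at a base rate of .05, which is why classification rates obtained in clinical samples need replication in more diverse samples.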
The Child Attention Problems Rating Scale is useful for discriminating between children with inattention who have hyperactivity and those who do not (Barkley, DuPaul, & McMurray, 1990). Overall, these scales, using empirically or rationally derived behavioral descriptors rather than explicit DSM-IV symptoms, exhibit very good reliability and validity. The measures are used ubiquitously for assessing outcomes across childhood treatment outcome studies (e.g., Barkley et al., 2000; Kolko et al., 1999; Pelham et al., 1993) and are sensitive to both behavioral and pharmacological treatment effects.

Structured Interviews

Interviews (typically conducted with the child's mother) for the assessment of psychopathology in children and adolescents have evolved primarily within the context of research applications (Hodges, 1993). Both epidemiological studies and other broad-based research approaches have made extensive use of interview methods for collecting diagnostic information from large samples (Boyle et al., 1993; Shaffer et al., 2000), and the use of structured interviews to assess for diagnostic criteria is often required by grant and manuscript reviewers. Interview questions are standardized to reduce information variance, and multiple disorders can be assessed with the same instrument. The use of structured or semistructured interviews is recommended by experts as part of ADHD assessments (Lahey & Wilcutt, 2002).

Table 1 provides reliability and validity data on the ADHD module of two common structured interviews and two semistructured interviews. Reliability scores for the Diagnostic Interview for Children and Adolescents-Revised (Boyle et al., 1993; Reich, Shayka, & Taibleson, 1991) and the Diagnostic Interview Schedule for Children (Shaffer et al., 2000) are high for the parent versions; the internal consistency and test-retest reliability coefficients range from .60 to .90. Parent assessments of ADHD tend to be more reliable for older children (Boyle et al., 1993). Stability of diagnosis has been demonstrated over 1 to 3 years for both the Diagnostic Interview Schedule for Children and the Diagnostic Interview for Children and Adolescents-Revised. It is also important to note that test-retest reliability information for structured interviews is largely based on clinic-based populations rather than general population samples (Bravo et al., 2001; Jensen et al., 1995; Shaffer et al., 1988; Sylvester, Hyde, & Reichler, 1987). Consistent with the absence of validity of child reports of ADHD, agreement between parents and children is quite low (kappas for the Diagnostic Interview for Children and Adolescents-Revised between .01 and .34).
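To illustrate why kappas in the range just cited indicate poor parent-child agreement even when raw agreement looks respectable, the following sketch computes Cohen's kappa for two informants. The function and the ten parent and child ratings are hypothetical and invented for illustration; they do not come from any study cited here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical data: 1 = ADHD criteria met on interview, 0 = not met.
parent = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
child = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

raw = sum(p == c for p, c in zip(parent, child)) / len(parent)
print(f"raw agreement = {raw:.2f}, kappa = {cohens_kappa(parent, child):.2f}")
```

In this invented example the two informants agree on 60% of cases, yet kappa is only about .09, because most of that agreement is expected by chance given how rarely the child endorses criteria.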
The semistructured interviews we reviewed, the Kiddie Schedule for Affective Disorders and Schizophrenia (Orvaschel, 1985) and Child and Adolescent Psychiatric Assessment (Angold & Costello, 2000), have not published reliability data for children with ADHD. Although internal consistency and test-retest reliability information reported for other disorders is moderate to strong, it is not possible to generalize these findings to ADHD, especially given variability across diagnostic categories found with other measures (Jensen et al., 1996; Schwab-Stone et al., 1996). Interview measures have some validity with respect to diagnostic classifications (Angold & Costello, 2000; Boyle et al., 1993; Carlson & Rapport, 1989; Lewczyk, Garland, Hurlburt, Gearity, & Hough, 2003; Reich, Shayka, & Taibleson, 1991). With respect to diagnostic categories, classifications demonstrate both sensitivity and specificity, indicating strong convergent and discriminant validity. Relatively little information is available with respect to concurrent validity, as rating scales or other measures are not frequently compared to interviews in diagnostic batteries. Available reports do suggest concurrent validity (e.g., the parent Vanderbilt rating scale and Diagnostic Interview Schedule for Children-Version 4 correlation = .79; Wolraich et al., 2003).

Measures of Impairment

Evaluation of impairment is typically conducted by a clinician rating the child's current level of functioning based on information collected during an intake (e.g., clinical interview, parent and teacher ratings, review of records) or by having the parent or teacher rate impairment directly. As noted previously, there are numerous methods currently used to evaluate psychosocial impairment in specific domains of functioning. Measures that document global or overall impairment (e.g., the Children's Global Assessment of Functioning; Bird et al., 1990) or provide a multidimensional assessment of impairment (e.g., the Child and Adolescent Functional Assessment Scale; Hodges & Wong, 1996) are also commonly employed. A review of all measures of domain-specific impairment would require its own article; we therefore limit our review to global and multidimensional impairment measures that have been used with ADHD samples (Table 1).

The impairment ratings reviewed show good temporal stability and interrater reliability, and there is evidence of concurrent and convergent validity. In addition, the scales are highly efficient in classifying clinical and nonclinical cases. The Child and Adolescent Functional Assessment Scale (Hodges, Doucette-Gates, & Liao, 1999; Hodges & Wong, 1996) and the Children's Global Assessment of Functioning (Shaffer et al., 1983) are slightly different in content and domains assessed, but both are clinician-completed and exhibit good psychometric properties. The Columbia Impairment Rating (Bird et al., 1993, 1996), the Impairment Rating Scale (IRS; Fabiano et al., 2005), and the Vanderbilt (Wolraich et al., 2003) ask parents to rate the child's level of impairment, and all scales evidence adequate reliability and validity.
The IRS and the Vanderbilt have teacher versions, which also exhibit adequate psychometric properties. Thus, for impairment ratings completed by clinicians, parents, and teachers, there is substantial evidence for the validity of measures. Although these global impairment ratings are effective in identifying impaired areas of functioning, they have not yet been widely used as measures of treatment outcome. Impairment is typically measured using discrete measures of individual domains of impairment rather than global functioning measures or measures of global functioning within specific domains (e.g., MTA Cooperative Group, 1999a, 1999b). The information in Table 1 suggests that any one of the global impairment ratings may be a viable alternative to such multimeasure approaches, as the global measures correlate favorably with more extensive, domain-specific measures but are much shorter and less costly. Clinicians in applied settings may find impairment measures that include domain-specific indexes of impairment (e.g., Child and Adolescent Functional Assessment Scale or the IRS) more useful than those with a single global rating (e.g., Children's Global Assessment of Functioning) because such measures can yield information on key impaired functional domains that should be targeted in further assessment and treatment.

Observational Measures

There is a long tradition of using behavioral observations with children described as disruptive, conduct disordered, and hyperactive (M. W. Roberts, 2001). In Table 1 we have listed a heterogeneous grouping of observational methods. Five of the measures involve having an independent observer evaluate the child's behavior in an analog (e.g., clinic) or natural (i.e., classroom) setting to code the presence of behaviors such as time on task, out-of-seat behavior, and verbal intrusion (Abikoff, Gittelman-Klein, & Klein, 1977, 1980; Atkins et al., 1985; Atkins, Pelham, & Licht, 1988; Atkins et al., 1989; Barkley, 1990; Mash, Terdal, & Anderson, 1973; Milich, Loney, & Landau, 1982). In both clinical settings (e.g., Mash & Johnston, 1982; Milich et al., 1982) and natural settings (e.g., Abikoff et al., 1977; Atkins et al., 1985), the observational codes listed in Table 1 exhibit acceptable reliability and validity. In addition, there are numerous examples of these or similar observational systems discriminating between ADHD and comparison children and subgroups of ADHD children, as well as evidence of sensitivity to the effects of behavioral and pharmacological treatment (e.g., Abramowitz, Eckstrand, O'Leary, & Dulcan, 1992; Chronis et al., 2004; Fabiano et al., 2004; Klein & Abikoff, 1997; Murphy, Pelham, & Lang, 1992; Northup et al., 1999; Pelham et al., 1993; Pelham et al., 2000; Pelham, Greiner, & Gnagy, 1998; Pelham, Wheeler, & Chronis, 1998; Rapport, Murphy, & Bailey, 1982).

The Individualized Target Behavior Evaluation (ITBE) is a very simple observational scheme that uses teacher- or parent-implemented frequency counts as proxies for more extensive observations by independent observers. It thus does not require a high degree of training, a special setting, or independent observers. Idiosyncratic problem behavior ratings have long been used and have demonstrated sensitivity to treatment (e.g., Patterson, 1974; Pelham, Schnedler, Bologna, & Contreras, 1980).
Such ratings and the ITBE are different from a standardized problem behavior checklist because they include only the categories of behavior relevant for a particular child. However, the former are parent and teacher ratings, whereas the ITBE is a measure of whether a collection of events has occurred. An ITBE operationalizes the idiosyncratic target behaviors within the child's areas of impairment and sets a criterion for each behavior evaluated (e.g., interrupts three or fewer times during dinner; has no instances of aggression during recess). The teacher or parent evaluates whether the child has met each behavioral goal in the time specified (e.g., during each class period), and the overall percentage of targets met is calculated (e.g., 65% of the goals were met on a given day). In clinical use, the ITBE has been most widely used as part of a daily report card in the school setting (e.g., O'Leary, Pelham, Rosenbaum, & Price, 1976). The psychometric properties of the ITBE are also listed in Table 1 (Pelham et al., 2001; Pelham et al., in press). The ITBE is reliable, with internal consistency coefficients ranging from .77 to .88, and acceptable temporal stability. The ITBE also correlates moderately to highly with standard paper-and-pencil measures of ADHD behavior (i.e., IOWA Conners subscales) and with observational measures. In addition, it is an idiographic measure of treatment outcome that is sensitive to the effects of medication and behavior modification (e.g., Chronis et al., 2004; Pelham, Burrows-MacLean, et al., 2005; Pelham et al., 2001; Pelham et al., 2002).

Advantages and Limitations of Current Assessment Methods

Given that the psychometric properties of all of the methods reviewed here are sound, our consideration of the advantages and limitations of each method emphasizes the utility of the approach. Considering first rating scales as a group, characteristics that likely contribute to their ubiquitous use include that they are easy to administer and score, take little rater or clinician time, and are cost-efficient, allowing the clinician to obtain information from multiple raters across settings. They have become the sine qua non of methods for diagnosing ADHD.
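As a concrete illustration of the ITBE scoring rule described under Observational Measures (the percentage of behavioral targets met in the specified periods), a minimal sketch follows. The data layout, the function name, and the last two target labels are assumptions made for this example; only the first two targets echo the examples given in the text, and the yes/no values are invented.

```python
def percent_targets_met(record):
    """Percentage of target-by-period evaluations marked as met.

    `record` maps each target behavior to a list of True/False values,
    one per evaluation period (e.g., one per class period).
    """
    evaluations = [met for periods in record.values() for met in periods]
    if not evaluations:
        raise ValueError("no evaluations recorded")
    return 100 * sum(evaluations) / len(evaluations)


# Hypothetical daily record: four targets, each judged in five class periods.
daily_record = {
    "interrupts three or fewer times": [True, True, False, True, True],
    "has no instances of aggression": [True, True, True, True, True],
    "completes assigned seatwork": [False, True, False, True, False],
    "follows classroom rules": [True, False, False, True, False],
}

print(f"{percent_targets_met(daily_record):.0f}% of goals met today")

# Per-target percentages show which goals drive the daily total.
for target, periods in daily_record.items():
    print(f"  {target}: {percent_targets_met({target: periods}):.0f}%")
```

With this invented record, 13 of 20 evaluations are met, which gives the 65% daily figure used as an example in the text; tracking the same percentage across days is what allows the measure to register treatment effects.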
Limitations of ADHD-specific rating scales include a lack of information regarding impairment (but see Waschbusch, Sparkes, & Northern Partners in Action for Child and Youth Services, 2003, and Wolraich et al., 1998, 2003, for exceptions), thus requiring the administration of additional measures. This limitation could be dealt with by adding sections to current ADHD-specific rating scales similar to the first section of the CBCL or the last section of the Vanderbilt. The scales also typically do not assess for other important diagnostic information such as age of onset or chronicity of the symptoms. Again, this limitation could be remedied by including the relevant questions. Rating scales may also be insensitive to low base rate or covert behaviors that may be underestimated or unknown to the rater and can be better assessed through an observational system or nonobtrusive measures (see Fabiano et al., 2004; Hinshaw, Simmel, & Heller, 1995, for examples). Although the standardized symptom rating scales are psychometrically sound and sensitive to treatment effects, the DSM-IV symptoms of ADHD themselves are poor descriptors of clinically important treatment outcomes (i.e., social validity; Foster & Mash, 1999; Pelham & Fabiano, 2001) and have limited treatment utility (Scotti, Morris, McNeil, & Hawkins, 1996) and predictive validity (Mannuzza & Klein, 1999; Pelham, Lahey, Gnagy, Kipp, & Roy, 2005). Finally, there is the possibility of bias in parent ratings of ADHD. For example, maternal depression is common in families that include a child with ADHD, and it has been argued that parental depression may influence ratings (Chi & Hinshaw, 2002), making children appear to have ADHD even though they do not. On the other hand, if mothers have a history of depression but are not actively depressed, bias may not be an issue (Baumann, Pelham, Lang, Jacob, & Blumenthal, 2004). The clear implication for both researchers and clinicians is that evaluations from teachers or other sources (observations) are needed in addition to those from parents. There are few studies on this topic, and more are needed across multiple parental characteristics.

Although they are valid for initial screening and identifying ADHD cases, structured interviews may be impractical for situations where repeated measurements are required and for measuring specific domains of impairment. Structured interviews require a significant amount of clinician or parent time, making them too costly for use in most clinical settings. Computer administration can reduce staff time, but it also reduces structured interviews to the same information set as rating scales (e.g., a yes/no check on a computer versus a paper form), eliminating a putative advantage of interviews. In addition, if structured interviews with parents are administered without the concurrent administration of teacher ratings, critical information on the child's functioning in the school setting will be lost.
Despite their costs, as well as the fact that they have not been validated against direct observations or objective records, DSM-IV-based structured interviews have been accepted by many as the gold standard in psychology and psychiatry.

Global impairment ratings are efficient, well-validated measures for obtaining information on the degree to which the child is experiencing problems in daily life functioning. Advantages of these measures include their ease of administration and scoring. Multidimensional measures of impairment such as the IRS and the Child and Adolescent Functional Assessment Scale provide domain-specific information on functioning as well as an overall global rating. Such multidimensional ratings can be conceptualized as the focused global ratings suggested by Mash and Foster (2001) to simplify more complex assessment schemes. In contrast, non-domain-specific ratings of impairment (e.g., the Children's Global Assessment of Functioning) have limited treatment utility; a clinician who collects such a rating would then have to follow up with questions about functioning across specific domains of impairment to plan effectively for treatment.

Observational measures may yield objective information that is often viewed as the gold standard in research, particularly as measures of treatment effects. However, traditional observational measures have limitations, particularly for clinical application, including high cost, the need to train observers, and the need to conduct multiple ratings across days and settings to obtain stable and representative estimates of behavior. The observational codes all use a time-sampling approach or an analog situation (e.g., parent-child interactions in a clinic) as a proxy for behavior in natural settings. Time sampling is problematic because it is difficult to measure low base rate behaviors (e.g., aggression). Observations in clinic analogue settings are costly and difficult to employ in clinical practice, and they do not provide a representative example of the child's behavior in the natural environment (e.g., Mash & Foster, 2001; M. W. Roberts, 2001). Finally, even gathering on-task information through observations may be less efficient than employing nonobtrusive measures such as asking teachers to track how much class work the child produced. Because observations of on-task behavior are a proxy for how productive the child is in the classroom, Atkins and colleagues (1985, 1988, 1989) had teachers save children's assignments and scored their completion and accuracy, and this measure contributed to a discriminant function that separated ADHD and control children. Such measures have also been widely used in both regular school and analogue classroom settings to evaluate treatment effects (e.g., Pelham et al., 1993).
Atkins and colleagues also checked children's desks once per day and evaluated whether the child was prepared for class (e.g., had pencil and eraser in desk). Not only did these unobtrusive measures cost little and discriminate accurately, but they also constitute logical targets for intervention and can be easily monitored to evaluate improvement.

A common thread in commentaries on assessments for child disruptive behavior disorders concerns the importance of identifying measures not only with an evidence base, but also with a high degree of utility in clinical settings (Nelson-Gray, 2003), where time-intensive and therefore expensive measures such as structured diagnostic interviews and behavioral observations are not viewed as practical or cost-effective and are not routinely used (Mash & Foster, 2001; Meyer et al., 2001; Mori & Armendariz, 2001; M. W. Roberts, 2001). The ITBE may provide a solution to this dilemma. It is best conceptualized as a combination of a simple objective observation and behavioral rating of impaired areas of functioning that addresses many of the limitations inherent in other assessment measures. First, it is an idiographic measure of functioning; instead of asking parents or teachers to complete lengthy rating scales, or observers to code for numerous behaviors, the ITBE only includes target behaviors relevant to the particular child being observed. Second, the ITBE provides continuity throughout clinical contact with a child across both time and settings (e.g., home, school, peer and recreational). Third, the ITBE is particularly useful for low base-rate behaviors (e.g., stealing, fighting) because it is more likely to capture such behaviors than is a time-sample approach, and it can replace observations of on-task behavior by targeting teacher-recorded work completion as discussed earlier. Fourth, measures such as the ITBE are parent recorded, teacher recorded, or both, rather than observer or clinician recorded. This minimizes cost and maximizes parent and teacher involvement in the assessment process. In addition, procedures for developing ITBEs (the target behavior operationalization in an FBA) are widely available in textbooks and on the Internet and do not involve expensive training or recurring costs for copyrighted materials. Finally, the ITBE targets are the socially and empirically valid targets of treatment, namely the problematic behaviors for which the child is initially referred and therefore the natural targets of intervention (Foster & Mash, 1999; Pelham & Fabiano, 2001). In the case of behavioral intervention, the ITBE constructed during assessment becomes the daily report card that is the backbone of clinical behavior therapy for a child with ADHD (O'Leary et al., 1976). The target behaviors it contains are the focus of intervention and are monitored continuously to evaluate progress during treatment, seamlessly connecting initial assessment, treatment, and outcome monitoring. The ITBE is another example of a measure that approximates Mash and Foster's (2001) suggestion for simplified observational schemes that have focused global ratings.
All of the assessment methods we have reviewed have the limitation of shared method and source variance. Psychometricians have long known that the source of ratings and method of measurement contribute a significant portion of the variance in correlations between measures and criteria (e.g., Campbell & Fiske, 1959; Gomez, Burns, Walsh, & de Moura, 2003; Langhorne, Loney, Paternite, & Bechtoldt, 1976; Meyer et al., 2001). This is illustrated in correlations between the Attention Problems subscale of the Teacher Report Form and the teacher Conners (r = .80). However, the high degree of shared variance appears to be limited to situations where the same rater uses the same method (see Table 1). Shared variance is far more modest in measures using different sources or methods (Achenbach, McConaughy, & Howell, 1987; De Los Reyes & Kazdin, 2004; Langhorne et al., 1976; Meyer et al., 2001). Cross-informant agreement with rating scales is expected to be low because (a) raters have different tolerances for and interpretations of a child's behavior and (b) children behave differently across situations and therefore informants typically have access to nonoverlapping information. The implication for assessment is clear: Raters from a single source or setting do not provide a comprehensive picture of the current levels of functioning for a child with ADHD; ratings from both parents and teachers are always indicated for comprehensive ADHD assessment (see also Meyer et al., 2001; Power, Costigan, Leff, Eiraldi, & Landau, 2001).

Nearly all the studies included used school-age, Caucasian boys as participants. Converging evidence suggests that normative ratings for boys and girls on these rating scales are different, with boys' average ratings being more deviant than girls' ratings in clinical and community samples (e.g., DuPaul, 1991; Fabiano et al., in press; Newcorn et al., 2001; Pelham, Milich, Murphy, & Murphy, 1989); some authors thus report normative information by gender (e.g., Achenbach & Rescorla, 2001). However, even studies such as the DSM-IV field trial included 76% boys, making it unclear whether the girls included in the study are representative of all the girls in the population with ADHD or whether they represent only the severe end of the continuum of ADHD in girls (Frick et al., 1994). Because the ADHD literature to date focuses mostly on boys, more research in this area is needed.

Another understudied area is the impact of racial and ethnic differences on the measurement of ADHD. It is difficult to draw conclusions regarding the extent of differences between groups given the small number of studies that report normative information by racial or ethnic group (e.g., Samuel et al., 1997).
Studies that have explicitly investigated racial or ethnic differences on standardized rating scales are suggestive of differences between groups (e.g., Epstein, March, Conners, & Jackson, 1998; Epstein et al., in press; Reid, Casat, Norton, Anastopoulos, & Temple, 2001; Reid et al., 1998), with African American children generally rated with higher scores than European American children. These results indicate that ethnically and culturally appropriate norms should be utilized in assessments and screenings to prevent a high rate of false positive identifications of African American children. Beyond these studies, little information is available on other racial or ethnic groups, and more research in this area is needed. Similarly, raters with lower verbal functioning, such as mothers with lower educational attainment, may have substantial difficulties with the language used in ratings, and these measures have not been validated with such populations.

Finally, most ADHD measures are validated in samples of school-age children. With the recent professional consensus conceptualizing ADHD as a chronic disorder (AAP, 2001), measures of preschool-age children, adolescents, and adults with ADHD are needed. Broadband measures currently include factor and normative information for preschoolers and adolescents (Achenbach & Rescorla, 2001; Reynolds & Kamphaus, 2002); however, we are aware of only one study of the validity of parent and teacher ADHD-specific rating scales and a structured interview in preschool children (Lahey et al., 1998). At this time, the availability of well-validated ADHD measures for young children is limited. For example, although the DSM-IV diagnosis appears to accurately identify preschoolers with the impairments and symptoms characteristic of older ADHD children (Lahey et al., 1998), it is difficult to adhere to DSM-IV criteria, which mandate cross-situational impairment, for children who are not in a structured school setting. A few studies have validated or explored the usefulness of ADHD measures in adolescents and concluded that the two-factor structure of inattentive and hyperactive-impulsive symptoms is consistent with that obtained in childhood (Conners et al., 1998a, 1998b; Molina, Smith, & Pelham, 2001). In addition, consistent with studies of school-age children, studies of adolescent self-report indicate adolescents are poor reporters of ADHD symptoms, providing no unique contribution beyond parent and teacher ratings (Smith, Pelham, Gnagy, Molina, & Evans, 2000). For both preschoolers and adolescents, few studies have explicitly investigated whether the number of symptoms needed for diagnosis differs depending on age (as might be expected given the development of attention and impulse control, but see Frick et al., 1994, for a study that found few age differences). Areas in need of further study include the incremental validity, predictive power, and appropriateness of the current DSM-IV criteria for preschoolers and adolescents.

Impairment ratings, in contrast, may be less influenced by factors such as gender, race or ethnicity, or age, and they are not constrained by DSM-IV criteria (e.g., Angold et al., 1999; Fabiano et al., 2005).
Therefore, for clinical purposes (for example, deciding about a need for treatment), emphasizing impairment more than DSM-IV symptoms is a way to avoid making erroneous decisions based on the paucity of data for younger and older children and for girls and non-Caucasian children. Overdiagnosis in preschool and non-Caucasian children and underdiagnosis in adolescents and girls could be minimized, with treatment provided to children whose functional impairments justify it.

Incremental Validity

Although the standard assessment techniques used for ADHD have a clear evidence base (Table 1) when used independently, an important question is: What are the minimum strategies and tools necessary for an efficient and effective assessment for ADHD? Central to this question is the incremental validity of adding raters or methods to an initial strategy for assessment. For example, if parent and teacher ratings are taken as a standard start to assessment, a prudent decision given their reliability, validity, and cost efficiency, the incremental validity question can take many forms, including (a) whether a subset of items on a measure would be as effective as the full measure; (b) whether both parent and teacher ratings provide incremental information, given the other; (c) whether structured parent interviews provide additional information, given parent ratings; (d) whether observational or laboratory methods provide incremental information, given rating scales; (e) whether comorbid diagnoses add incremental validity to ADHD assessments; (f) whether assessment of non-ADHD symptom domains of impairment adds incrementally to assessment; and (g) whether FBAs add incremental validity to parent and teacher diagnostic rating scales. Interestingly, although these are central questions in the assessment of ADHD, few studies have explicitly explored these issues (Johnston & Murray, 2003).

In discussing the incremental validity of assessment measures, we use standardized DSM-IV rating scales as a standard. DSM-IV symptom-based rating scales are lengthy (ranging from 18 to more than 100 items) and therefore somewhat difficult for teachers to complete. This is especially true if the CBCL is also being administered to screen for additional disorders. Smaller item sets reduce assessment cost and make large-scale screenings and clinical assessments more feasible, particularly in school settings. A relevant question, therefore, is whether incremental validity is improved by using all of the DSM-IV items rather than a small subset. Some researchers have investigated the utility of single DSM-IV items or item combinations to either identify children with ADHD or rule out those without the disorder (Frick et al., 1994; Milich et al., 1987; Pelham, Gnagy, et al., 1992; Power, Andrews, et al., 1998; Power et al., 2001; Power, Doherty, et al., 1998). As Power et al.
cogently note, the DSM-IV items have differing predictive powers, yet the DSM-IV weights all symptoms equally. Certainly identifying the symptoms with the greatest predictive power would allow for more efficient, less costly assessments. An item's ability to identify a child with ADHD is typically assessed by calculating the item's positive predictive power (PPP), whereas an item's ability to identify a child without ADHD is assessed via its negative predictive power (NPP). The researchers cited earlier have examined PPP and NPP of individual DSM-IV symptoms and have found that one or two ADHD symptoms (e.g., one inattention item and one impulsivity item) may be useful for either identifying ADHD cases or children appropriate for further screening or ruling out children who should not be subjected to further screening. Unfortunately, no studies of which we are aware have created a brief scale from the items that appear to have high PPP and NPP to examine its ability to discriminate between ADHD and comparison samples.

Across the studies described previously, items with high PPP generally had low base rates. That is, the symptoms were rare but, if endorsed, it was very likely that the child was identified as ADHD. In contrast, the items with high NPP generally had higher base rates. Thus, if the more common behavior was not endorsed, it was very unlikely the child would be identified as an ADHD case. However, the impact of base rates on identification of ADHD cases has not been widely studied. For example, in a clinical setting, where frequent reports of ADHD symptoms are expected, cut points on measures may need to be different than in a whole-school sample, where ADHD symptoms may be relatively less common.

Power, Andrews, et al. (1998) and Power, Doherty, et al. (1998) have also investigated the incremental validity of multiple raters and the number of items required by each rater or combinations of raters to efficiently identify children with ADHD (Power et al., 2001). They demonstrated that, when combined, parent and teacher items with the greatest PPPs are highly effective in identifying ADHD cases. In fact, for ruling out cases of ADHD, two informants were not necessarily required; if either the parent or teacher did not endorse an item with a high NPP, the child was usually a child without ADHD. There has been relatively little research on the need for both parent and teacher raters and how best to combine them. We argue in the following that for treatment purposes both are necessary, but whether both are necessary for diagnosis is a question worthy of additional research.

An alternative to searching for a subset of DSM-IV items that predict diagnosis is to examine whether incremental validity is added by the longer DSM-IV-based scales compared to briefer, empirically derived scales.
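The relation between an item's endorsement base rate and its PPP and NPP, described above, can be made concrete with a small calculation. The sketch below is illustrative only: the counts for the two items are hypothetical and are not taken from any of the studies cited, and the names `ItemCounts`, `ppp`, `npp`, `endorsement_base_rate`, and `ppp_at_prevalence` are ours. It assumes the usual definitions, with PPP as the proportion of children with the item endorsed who meet criteria for ADHD and NPP as the proportion of children without the item endorsed who do not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemCounts:
    """Hypothetical 2x2 counts for one symptom item versus diagnostic status."""
    endorsed_adhd: int            # item endorsed, child meets ADHD criteria
    endorsed_comparison: int      # item endorsed, child does not meet criteria
    not_endorsed_adhd: int        # item not endorsed, child meets ADHD criteria
    not_endorsed_comparison: int  # item not endorsed, child does not meet criteria

    @property
    def total(self) -> int:
        return (self.endorsed_adhd + self.endorsed_comparison
                + self.not_endorsed_adhd + self.not_endorsed_comparison)


def ppp(c: ItemCounts) -> float:
    """Positive predictive power: P(ADHD | item endorsed)."""
    return c.endorsed_adhd / (c.endorsed_adhd + c.endorsed_comparison)


def npp(c: ItemCounts) -> float:
    """Negative predictive power: P(no ADHD | item not endorsed)."""
    return c.not_endorsed_comparison / (c.not_endorsed_adhd + c.not_endorsed_comparison)


def endorsement_base_rate(c: ItemCounts) -> float:
    """Proportion of the whole sample for whom the item is endorsed."""
    return (c.endorsed_adhd + c.endorsed_comparison) / c.total


def ppp_at_prevalence(c: ItemCounts, prevalence: float) -> float:
    """PPP the same item would show if ADHD prevalence in the setting differed.

    Holds the item's sensitivity and false-positive rate fixed (an assumption)
    and applies Bayes' rule with the new prevalence.
    """
    sensitivity = c.endorsed_adhd / (c.endorsed_adhd + c.not_endorsed_adhd)
    false_positive_rate = c.endorsed_comparison / (
        c.endorsed_comparison + c.not_endorsed_comparison)
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Invented sample of 200 children: 50 with ADHD, 150 comparison.
RARE_ITEM = ItemCounts(18, 2, 32, 148)    # seldom endorsed
COMMON_ITEM = ItemCounts(48, 52, 2, 98)   # frequently endorsed

for name, item in (("rare item", RARE_ITEM), ("common item", COMMON_ITEM)):
    print(f"{name}: base rate={endorsement_base_rate(item):.2f}, "
          f"PPP={ppp(item):.2f}, NPP={npp(item):.2f}")
```

With these invented counts, the rarely endorsed item has the higher PPP and the commonly endorsed item the higher NPP, which is the pattern the reviewed studies report. `ppp_at_prevalence` illustrates why cut points may need to differ between a clinic and a whole-school sample: for the same item, PPP falls as ADHD becomes less common in the setting.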
Not surprisingly, because the empirically derived scales were used long before DSM-IV and were the basis for most of the DSM-IV items, there is a considerable amount of overlap among item content, and they are highly correlated (Table 1). The five items that comprise the Inattentive/Overactive scale on the IOWA Conners were developed from and are highly associated with longer sets of items on the full Conners Rating Scales (Conners, 1969; Loney & Milich, 1982). The 10-item IOWA is as sensitive to treatment effects and group comparisons as longer sets of items such as the 45-item, DSM-IV-based Disruptive Behavior Disorders rating scale. Similarly, the Child Attention Problems Rating Scale (cf. Barkley, 1990) is highly related to DSM-IV-based rating scales at only two thirds the length (i.e., Power et al., 1998). This literature contrasts with the assessment recommendations of the AAP, which emphasize using DSM-IV criteria and obtaining information on those criteria from parents and teachers and include a DSM-IV symptom-based scale in their Toolkit for ADHD (American Academy of Pediatrics and National Initiative for Children's Healthcare Quality, 2002); the guidelines do not recommend the use of empirically derived, non-DSM-IV-based scales in diagnosis (AAP, 2000). Further, the AAP guidelines state that parent or teacher rating scales are an option, not a requirement, in making the DSM-IV diagnosis. This recommendation appears to need qualification, as the current literature supports the use of ADHD subscales on broadband measures in classifying children with ADHD.

A key question from the view of cost of services is whether structured interviews provide incremental diagnostic validity beyond parent and teacher rating scales. There has been relatively little research on this point; most researchers use both structured interviews and rating scales. The existing research suggests that no incremental validity is conferred from the use of structured interviews. Table 1 indicates that rating scales correlate with structured parent interviews. Furthermore, groups of children identified by structured interviews are also nearly perfectly classified using symptom rating scales (e.g., DuPaul, Power, McGoey, Ikeda, & Anastopoulos, 1998; Ostrander et al., 1998; Power et al., 2001). There is no evidence that combining rating scales with structured interviews will result in incremental benefit for diagnosis of ADHD (e.g., Wolraich et al., 2003). These results contradict the universal recommendation to researchers and clinicians that structured interviews are necessary for diagnosis and suggest that substantial savings in diagnostic costs could be obtained by relying more on rating scales.

Whether direct observations provide incremental validity above parent and teacher ratings is also an important question.
We are not aware of any research on the incremental diagnostic validity of observational schemes for diagnosis; however, we can address whether an entire complex observational scheme must be used or if a subset of codes is adequate. There is evidence that verbally intrusive behaviors in the classroom are the best predictors of ADHD status, with other categories providing little incremental validity (Abikoff et al., 1977, 1980; Atkins et al., 1985, 1989). This result suggests that if a teacher or aide reliably recorded instances of verbally intrusive behavior (e.g., completed via the ITBE as a target behavior), complex observational measures and the expense of an independent observer could be eliminated. This key behavior could be targeted in treatment and monitored as a socially and empirically validated index of treatment response. Similarly, Atkins et al. (1985) showed that routine teacher-recorded seatwork completion and accuracy discriminated between ADHD and comparison cases better than observations of on-task behavior.
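For readers who want the notion of incremental validity in quantitative terms, one common formalization is the gain in explained criterion variance when a candidate measure is added to measures already in hand. The expression below is an illustrative sketch under that assumption and is not a formula given in the studies reviewed; the symbols are ours. Here R^2_base is the proportion of variance in the criterion explained by the baseline measures (for example, parent and teacher ratings), and R^2_full is the proportion explained once the candidate measure (for example, an observational code) is added.

```latex
\Delta R^2 = R^2_{\text{full}} - R^2_{\text{base}}
```

A candidate measure whose \Delta R^2 is near zero adds little beyond the baseline measures, however valid it may be when considered on its own.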


Interestingly, direct observations are the gold standard for evaluating treatment effects in controlled trials. Because the observational schemes in Table 1 correlate only modestly with parent and teacher ratings, they clearly provide unique information for these purposes (e.g., Atkins et al., 1989). In studies of treatment outcome, observational measures have revealed patterns of results that are not apparent when ratings alone are obtained and therefore provide incremental validity in assessing treatment effects. Further, direct observations avoid the biases that are inherent in rating scales, especially in treatment studies in which rater blinding is not possible or is easily compromised.

A related point can be made about laboratory measures of attention and impulsivity. Even though we have not discussed these measures because it is widely agreed that they are not valid for the purpose of diagnosis (AAP, 2000) or measuring ecologically valid treatment response (e.g., Nigg, Hinshaw, & Halperin, 1996), cognitive performance cannot be well studied with any other method. Thus, there is a role to play for laboratory measures in research with the goal of understanding the nature of cognition in ADHD, despite the fact that laboratory measures do not contribute to diagnosis or clinical assessment.

Is there evidence that incremental validity is gained by assessing comorbid diagnoses? Although emphasized in research and practice, as we discussed at the beginning of this article, there is little research showing that information regarding comorbidity influences the utility of assessment in ADHD. The few studies addressing this point show that comorbid diagnoses do not influence response to treatment and therefore treatment planning (e.g., MTA Cooperative Group, 1999b; Pelham et al., 1993). The approach to and evaluation of treatment for ADHD-related problems are identical: pharmacological or behavioral treatment or both.
This is true even when the decision is whether a child meets full ADHD criteria or has a subthreshold number of symptoms. When this situation exists in medicine or psychopathology, diagnoses, both primary and comorbid, have no treatment utility and should be made as efficiently and cost-effectively as possible (Meyer et al., 2001; Nelson-Gray, 2003; Pelham, 2001). Target behaviors may be added when a child has a comorbid diagnosis (e.g., peer-directed aggression), but that information comes from assessment of impairment in key domains rather than from the comorbid diagnosis.

Impaired functioning is required for ADHD diagnosis and must be assessed, but is there evidence that it adds incremental validity? Because the correlation between ADHD symptoms and impairment is modest (Fabiano et al., 2005), because there is variability in expression of ADHD-related impairment across domains (Lahey et al., 1998), and because measures of impairment and symptoms account for unique variance in predicting outcomes (Pelham, Lahey, et al., 2005), it is clear that measures of impairment add incremental validity beyond an ADHD diagnosis. How comprehensive such assessments need to be is another question. Consider the MTA study, which employed a broad set of assessment instruments to capture the baseline functioning and treatment outcomes of the children treated in that study (Hinshaw et al., 1997). For treatment outcome in particular, different domains and methods of assessment yielded outcomes for the four treatment conditions that were very different from the parent and teacher Swanson, Nolan, and Pelham Rating Scale DSM-IV symptom ratings (MTA Cooperative Group, 1999a). For example, there was no effect of stimulant medication on academic achievement but a large effect on parent and teacher ratings. At follow-up, behavioral and pharmacological treatments differed on adult-rated DSM-IV symptoms but not in any key functional domain (achievement, peer relations, parenting; MTA Cooperative Group, 2004). A secondary outcome article involved combining all of the measures into a single scale and yielded stronger evidence of psychosocial and combined treatment effects, relative to medication, than any other set of analyses (Conners et al., 2001). Although unique information was obtained regarding treatment effects employing multiple-outcome domains, the MTA group did not conduct a systematic study of which outcome variables contributed unique variance to evaluation of treatment effectiveness. We are not aware of any study that has conducted such an evaluation, and one is needed, as the incremental validity or treatment utility of large sets of outcome measures for both research and clinical practice has not been well studied.
Further, our previous discussion of measures of impairment argued that inexpensive, simple domain-specific measures of impairment (e.g., IRS, CBCL) are sufficiently highly correlated with more comprehensive measures (e.g., achievement) that they can be used instead at very large cost savings.

Finally, will incremental validity be gained beyond parent and teacher rating scales by conducting FBAs that focus on functioning rather than DSM-IV symptoms of ADHD? In other words, is there value beyond identifying key domains of impairment in conducting assessments that evaluate the contexts in which the problems occur with a focus on setting events and maintaining variables, all focused on case conceptualization and treatment planning (Gresham et al., 2001)? Surprisingly, relatively little research has addressed this question either for ADHD or in general (Ervin et al., 2001), and more research is needed. At a clinical level, however, conducting such an assessment has clear face validity: other than pharmacological interventions, an evidence-based treatment for ADHD (all of which are behavioral) could not be developed for a child with ADHD without conducting an FBA.

Implications for Clinical Practice

Our review has clear implications for clinical practice. Effective screenings for ADHD may be made quickly and economically using only a few items completed by parent and teacher respondents (e.g., August et al., 1996). Across studies and measures, brief measures such as the Inattentive/Overactive scale of the IOWA Conners, the Child Attention Problems Rating Scale, individual and pairs of DSM-IV items both within and across raters, and the Behavior Assessment System for Children Attention Problems subscale all reliably classify children diagnosed with ADHD. Lengthy and expensive structured interviews and complete DSM-IV-symptom-based rating scales add little incremental information in formulating an ADHD diagnosis; and a few observational measures are as useful as complete complex observation systems for effectively discriminating ADHD from comparison children. A brief rating scale or a combination of a small number of DSM-IV items from parent and teacher ratings, ITBE records of verbal intrusion, routine teacher records of seatwork completion and accuracy, and nonobtrusive evaluations of whether the child has the required supplies in his or her desk at school would appear to be an efficient and parsimonious way of diagnosing ADHD in clinical practice, with more objectivity than rating scales alone but at a very low cost relative to structured psychiatric interviews. The kind of large-scale research needed to validate this approach to assessment has not yet been conducted, but clinicians can be confident based on extant research that it has some empirical support. Such an approach would have considerable treatment utility because it would be feasible and cost-effective in primary care and educational settings, where training in mental health assessment is limited and cost and time issues are paramount.
Less time spent on making a diagnosis leaves the clinician, physician, or school counselor with more time to focus on treatment. Once a diagnosis is established, the clinician needs to conduct the rest of the assessment process, including (a) identifying impaired domains of functioning; (b) operationalizing target behaviors within these domains; (c) conducting a functional analysis of the antecedents, settings, and consequences of the target behavior(s); and (d) implementing treatment and constructing measures such as the ITBE to monitor and evaluate treatment progress. In other words, after diagnosis, all assessment focuses on the child's specific impaired areas of functioning or target behaviors and the treatment of these behaviors, not DSM-IV symptoms.

Consider, for example, the symptom "often does not seem to listen when spoken to directly." A child who has this item endorsed on a structured interview or rating scale would have the item count toward a diagnosis of ADHD. However, the item in and of itself provides no information on the extent to which this behavior is a problem for the child and what causes, maintains, or exacerbates the behavior. For one child, the function of the behavior could be to avoid tasks he or she dislikes and is limited to situations where a demand is placed on the child. For another child, the behavior may be caused by poor adult commands and instructions that fail to make the desired behavior clear to the child. A third child may have a clear attention problem and may not be processing the instructions. An FBA of the problematic behavior would result in three different treatment approaches based on the three different hypothesized functions (Gresham et al., 2001; Nelson-Gray, 2003). In other words, the symptom is not informative for treatment without knowledge of the impaired functioning that it reflects and its context.

Regarding the samples in which relatively less is known about ADHD diagnosis (low socioeconomic status, ethnicity, preschoolers, adolescents), a focus on impairment and FBA should minimize the consequences of potential diagnostic errors. If an African American child or a preschooler is identified through inappropriately elevated parent or teacher ratings of DSM-IV symptoms, a careful determination of level of impairment and a good FBA should discount the symptom ratings if there truly is a false positive diagnosis. The opposite problem could occur with a girl or an adolescent: low symptom scores from teachers ruling out the diagnosis but clear impairment documenting a need for intervention. Thus, available instruments should be used with caution in these samples with a focus on impairment rather than symptom levels.
This approach (de-emphasizing the relative importance of a DSM-IV diagnosis and the traditional approaches to assessment and focusing instead on functional behavioral analysis of impairment) may make psychologists uneasy for several reasons. Some may fear that reliance on simple parent and teacher rating scales for diagnosis (as opposed to structured interviews and neuropsychological batteries) will increase the number of children identified with ADHD and therefore the number treated with medication. As we noted previously, however, most children are referred for treatment based on functioning in daily life rather than symptoms. If the clinician's focus is on functional impairments and adaptive skills, children so identified need treatment, and elevated rates of identification are not a concern. Moreover, if psychological practitioners increase their collaboration with primary care physicians, who prescribe most of the medication for ADHD, to implement established treatment guidelines (AAP, 2001), an increase in medication use is neither inevitable nor likely. Alternatively, some may be concerned that minimizing the traditional role filled by psychologists (conducting comprehensive, multi-instrument assessments and integrating and interpreting diagnostic information) might mean that psychologists' roles in diagnosis and treatment of ADHD will be diminished. To the contrary, assigning a DSM-IV diagnosis has become almost a cookbook activity that can be performed by many different professionals. However, few if any professionals other than psychologists are trained in procedures for conducting FBAs and employing the results to design and monitor treatment, and this unique role is preserved in our approach. An emphasis on functional outcomes rather than DSM-IV diagnosis does not mean that professionals who adopt this approach will be less involved in assessment and treatment of ADHD, only that they will spend their time engaged in different activities.

Conclusions
This review highlights the evidence for assessment methods for children with ADHD but also reveals limitations and clear directions for needed research. Below we enumerate conclusions that inform practitioners and researchers on the evidence-based assessment of ADHD.
1. Using traditional psychometric criteria, there is substantial evidence for the reliability and validity of many measures commonly used to diagnose ADHD and measure treatment outcomes, including DSM-IV-based as well as empirically and rationally derived ADHD rating scales, DSM-IV-based structured interviews, measures of global impairment, and observational methods. These all exhibit the reliability and validity estimates required for evidence-based assessments. There is currently no evidence supporting the validity of child self-report of ADHD symptoms. Because virtually all use of rating scales and structured interviews involves mother report, there is no evidence on the validity of father report.
2. Diagnosing ADHD is most efficiently accomplished with parent and teacher rating scales. All three forms of rating scales (DSM-IV-based, rationally derived, and empirically derived) are valid and agree equally well with other methods of diagnosis. Because rationally and empirically derived scales have fewer ADHD items (e.g., the IOWA Conners) and often screen for other disorders and impairments (e.g., the CBCL), they are more efficient than DSM-IV-based scales. There is evidence that only a few DSM-IV items with high PPP or NPP, chosen according to the purpose of the assessment, are as effective as complete symptom lists, but measures using only these items have not been systematically evaluated. The number of raters needed to identify children with ADHD depends on the purpose of the assessment. For large-scale screenings, a single teacher rating with a few items that have high NPP and PPP may be sufficient. However, because of the modest agreement between parents and teachers, because of the DSM-IV requirement for information on cross-situational impairment, and because target behavior identification and treatment planning are setting-specific, information from both parents and teachers is necessary for clinical purposes.
3. Symptom rating scales must be combined with a clinical interview or additional paper-and-pencil questions to obtain information about onset and rule out other disorders (e.g., low IQ, autism). However, DSM-IV-based structured diagnostic interviews do not add incremental validity to parent and teacher rating scales. The current practice of requiring structured DSM-IV-based interviews in an attempt to increase diagnostic precision is not supported by research. There are few studies examining incremental validity of other methods compared to rating scales.
4. Observational methods that include only a few categories (e.g., verbal intrusions) and utilize a few nonobtrusive measures (e.g., teacher-recorded work completion) are as effective as more comprehensive observations with multiple categories or aggregates thereof for identifying children with ADHD and are far more efficient. Additionally, there is growing evidence that simple proxies for complex observations (e.g., target behavior probabilities) may be useful. Systematic investigation of the incremental validity of such approaches added to rating scales is needed.
5. Systematic assessments for other disorders that are comorbid with ADHD should follow guidelines in other articles in this special section; however, it is worth emphasizing that diagnostic information does not inform one's treatment approach regardless of the ADHD-related diagnosis (e.g., subtype, comorbidity).
6. Because of the modest correlation between ADHD symptoms and impaired functioning, assessment of ADHD must include evaluation of the child's functioning in the key domains of relationships with peers, siblings, parents, and teachers; academic progress; and the classroom and family. There are numerous measures available that have been validated for each of these domains. Very brief assessments of those domains (i.e., those included on measures of global impairment) may be sufficient for both research and clinical purposes, but additional research is needed. Objective assessment of functioning can be efficiently accomplished with idiographic measures of daily behaviors (both problematic and adaptive). Studies of the incremental validity of such idiographic measures beyond global ratings of impairment have not yet been conducted and are needed.
7. There is currently a paucity of information on most ADHD assessment measures across gender, race or ethnicity, age before and after elementary school, and developmental levels (but see Achenbach & Rescorla, 2001, for a good example of comprehensive normative information). Future research needs to address this lack of information.
8. For treatment planning, as well as for studies of the nature of ADHD (e.g., cognitive deficits), information on the context (i.e., antecedents, consequences, and settings) of symptoms and on the impact of those symptoms on functioning should be collected routinely. The primary focus of assessment in ADHD should be on an FBA of impairment. Such an assessment identifies environmental contexts and socially valid target behaviors (not DSM-IV symptoms of ADHD) and facilitates treatment planning.
9. We have emphasized the need for additional research on the incremental validity of combinations of assessment approaches (Johnston & Murray, 2003). Implicit in this recommendation is a focus on the tradeoffs between incremental validity and the cost of assessment, that is, cost-effectiveness and cost-benefit analysis (Yates & Taub, 2003). For example, do the day-long, clinic-based evaluations of ADHD that are common in some settings provide incremental validity for treatment planning beyond parent or teacher rating scales, and, if they do, does the incremental benefit outweigh the additional costs? We strongly suspect the answer is no, but we are not aware of any research on such questions.
In conclusion, since the advent of the DSM-IV classification system, significant professional and research energy has been devoted to constructing ever more complex, time-intensive, and costly measures to accurately identify DSM-IV ADHD. This emphasis has reduced the use of functional behavioral approaches that integrate objective assessments and treatments and that were pioneered and ubiquitously used in the 1960s and 1970s. Because the methods and scales for measuring ADHD symptoms are so straightforward and require so little time and cost, it is our hope that clinicians will begin to refocus on the treatment-relevant aspects of assessment that we have outlined herein. We believe that such an approach will best serve children with ADHD and their families.
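The second conclusion refers to items with high PPP or NPP. As a minimal sketch, assuming the usual definitions (PPP is the proportion of children with an endorsed item who have the diagnosis; NPP is the proportion of children without the item who do not have the diagnosis), the two values can be computed from a 2 x 2 table as follows. The class name, field names, and the counts in the example are hypothetical and are not data from any study cited here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemCounts:
    """2 x 2 counts for one symptom item against a reference diagnosis."""
    endorsed_with_dx: int         # item endorsed, diagnosis present
    endorsed_without_dx: int      # item endorsed, diagnosis absent
    not_endorsed_with_dx: int     # item not endorsed, diagnosis present
    not_endorsed_without_dx: int  # item not endorsed, diagnosis absent


def positive_predictive_power(c: ItemCounts) -> float:
    """Proportion of children with the item endorsed who have the diagnosis."""
    endorsed = c.endorsed_with_dx + c.endorsed_without_dx
    return c.endorsed_with_dx / endorsed if endorsed else float("nan")


def negative_predictive_power(c: ItemCounts) -> float:
    """Proportion of children without the item who do not have the diagnosis."""
    not_endorsed = c.not_endorsed_with_dx + c.not_endorsed_without_dx
    return c.not_endorsed_without_dx / not_endorsed if not_endorsed else float("nan")


# Hypothetical counts for a single item in a sample of 200 children.
example = ItemCounts(
    endorsed_with_dx=40,
    endorsed_without_dx=10,
    not_endorsed_with_dx=20,
    not_endorsed_without_dx=130,
)

ppp = positive_predictive_power(example)  # 40 / (40 + 10)
npp = negative_predictive_power(example)  # 130 / (130 + 20)

if __name__ == "__main__":
    print(f"PPP = {ppp:.2f}, NPP = {npp:.2f}")
```

Both quantities depend on the base rate of the diagnosis in the sample, so values estimated in a clinic sample would not be expected to carry over unchanged to a school-wide screening.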

References
American Academy of Child and Adolescent Psychiatry Work Group on Quality Issues. (1997). Practice parameters for the assessment and treatment of children, adolescents, and adults with attention-deficit/hyperactivity disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 36(Suppl.), 85S-121S.
Abikoff, H., Gittelman-Klein, R., & Klein, D. F. (1977). Validation of a classroom observation code for hyperactive children. Journal of Consulting and Clinical Psychology, 45, 772-783.
Abikoff, H., Gittelman-Klein, R., & Klein, D. F. (1980). Classroom observation code for hyperactive children: A replication of validity. Journal of Consulting and Clinical Psychology, 48, 555-565.
Abramowitz, A. J., Eckstrand, D., O'Leary, S. G., & Dulcan, M. K. (1992). ADHD children's responses to stimulant medication and two intensities of a behavioral intervention. Behavior Modification, 16, 193-203.
Achenbach, T. M. (1991). Integrative guide for the 1991 CBCL/4-18, YSR, and TRF profiles. Burlington: University of Vermont, Department of Psychiatry.
Achenbach, T. M., & Edelbrock, C. S. (1981). Behavioral problems and competences reported by parents of normal and disturbed children aged 4 through 16. Monographs of the Society for Research in Child Development, 46(1, Serial No. 188).
Achenbach, T. M., McConaughy, S. H., & Howell, C. T. (1987). Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin, 101, 213-232.
Achenbach, T. M., & Rescorla, L. A. (2001). Manual for ASEBA school-age forms and profiles. Burlington: University of Vermont, Research Center for Children, Youth, and Families.
American Academy of Child and Adolescent Psychiatry. (1997). Practice parameters for the assessment and treatment of children, adolescents, and adults with attention-deficit/hyperactivity disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 36(Suppl.), 85-121.
American Academy of Pediatrics. (2000). Clinical practice guideline: Diagnosis and evaluation of the child with attention-deficit/hyperactivity disorder. Pediatrics, 105, 1158-1170.
American Academy of Pediatrics. (2001). Clinical practice guideline: Treatment of the school-aged child with attention-deficit/hyperactivity disorder. Pediatrics, 108, 1033-1044.
American Academy of Pediatrics and National Initiative for Children's Healthcare Quality. (2002). Caring for children with ADHD: A resource toolkit for clinicians. Chicago: Author.
American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: Author.
American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author.
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
Anastopoulos, A. D., & Shelton, T. L. (2001). Assessing attention-deficit/hyperactivity disorder. New York: Kluwer Academic/Plenum.
Anastopoulos, A. D., Shelton, T. L., DuPaul, G. J., & Guevremont, D. C. (1993). Parent training for attention-deficit hyperactivity disorder. Journal of Abnormal Child Psychology, 21, 581-596.
Angold, A., & Costello, E. J. (2000). The Child and Adolescent Psychiatric Assessment (CAPA). Journal of the American Academy of Child & Adolescent Psychiatry, 39, 39-48.
Angold, A., Costello, E. J., Farmer, E. M. Z., Burns, B. J., & Erkanli, A. (1999). Impaired but undiagnosed. Journal of the American Academy of Child & Adolescent Psychiatry, 38, 129-137.
Atkins, M. S., Pelham, W. E., & Licht, M. H. (1985). A comparison of objective classroom measures and teacher ratings of attention deficit disorder. Journal of Abnormal Child Psychology, 13, 155-167.
Atkins, M. S., Pelham, W. E., & Licht, M. (1988). The development and validation of objective classroom measures for the assessment of conduct and attention deficit disorders. In R. J. Prinz (Ed.), Advances in behavioral assessment of children and families (Vol. 4, pp. 3-31). Greenwich, CT: JAI.
Atkins, M. S., Pelham, W. E., & Licht, M. H. (1989). The differential validity of teacher ratings of inattention/overactivity and aggression. Journal of Abnormal Child Psychology, 17, 423-435.


August, G. J., Braswell, L., & Thuras, P. (1998). Diagnostic stability of ADHD in a community sample of school-aged children screened for disruptive behavior. Journal of Abnormal Child Psychology, 26, 345-356.
August, G. J., Realmuto, G. M., MacDonald, A. W., Nugent, S. M., & Crosby, R. (1996). Prevalence of ADHD and comorbid disorders among elementary school children screened for disruptive behavior. Journal of Abnormal Child Psychology, 24, 571-595.
Barkley, R. A. (1990). Attention-deficit hyperactivity disorder: A handbook for diagnosis and treatment. New York: Guilford.
Barkley, R. A. (1997). Inhibition, sustained attention, and executive functions: Constructing a unified theory of ADHD. Psychological Bulletin, 121, 65-94.
Barkley, R. A. (in press). Attention-deficit hyperactivity disorder: A handbook for diagnosis and treatment (3rd ed.). New York: Guilford.
Barkley, R. A., & Cunningham, C. E. (1979). The effects of methylphenidate on the mother-child interactions of hyperactive children. Archives of General Psychiatry, 36, 201-208.
Barkley, R. A., DuPaul, G. J., & McMurray, M. B. (1990). Comprehensive evaluation of attention deficit disorder with and without hyperactivity as defined by research criteria. Journal of Consulting and Clinical Psychology, 58, 775-789.
Barkley, R. A., Fischer, M., Smallish, L., & Fletcher, K. (2004). Young adult follow-up of hyperactive children: Antisocial activities and drug use. Journal of Child Psychology and Psychiatry, 45, 195-211.
Barkley, R. A., Shelton, T. L., Crosswait, C., Moorehouse, M., Fletcher, K., Barrett, S., et al. (2000). Multi-method psycho-educational intervention for preschool children with disruptive behavior: Preliminary results at post-treatment. Journal of Child Psychology and Psychiatry & Allied Disciplines, 41, 319-332.
Baumann, B. L., Pelham, W. E., Lang, A. R., Jacob, R. G., & Blumenthal, J. D. (2004). The impact of maternal depressive symptomatology on ratings of children with ADHD and child confederates. Journal of Emotional and Behavioral Disorders, 12, 90-98.
Biederman, J., Faraone, S. V., Doyle, A., Lehman, B. K., Kraus, I., Perrin, J., et al. (1993). Convergence of the Child Behavior Checklist with structured interview-based psychiatric diagnoses of ADHD children with and without comorbidity. Journal of Child Psychology and Psychiatry, 34, 1241-1251.
Bird, H. R., Andrews, H., Schwab-Stone, M., Goodman, S., Dulcan, M., Richters, J., et al. (1996). Global measures of impairment for epidemiologic and clinical use with children and adolescents. International Journal of Methods in Psychiatric Research, 6, 295-307.
Bird, H. R., Canino, G., Rubio-Stipec, M., & Ribera, J. C. (1987). Further measures of the psychometric properties of the Children's Global Assessment Scale. Archives of General Psychiatry, 44, 821-824.
Bird, H. R., Shaffer, D., Fisher, P., Gould, M. S., Staghezza, B., Chen, J. Y., et al. (1993). The Columbia Impairment Scale (CIS): Pilot findings on a measure of global impairment for children and adolescents. International Journal of Methods in Psychiatric Research, 3, 167-176.
Bird, H. R., Yager, T. J., Staghezza, B., Gould, M. S., Canino, G., & Rubio-Stipec, M. (1990). Impairment in the epidemiological measurement of childhood psychopathology in the community. Journal of the American Academy of Child & Adolescent Psychiatry, 29, 796-803.
Boyle, M. H., Offord, D. R., Racine, Y., Sanford, M., Szatmari, P., Fleming, J. E., et al. (1993). Evaluation of the Diagnostic Interview for Children and Adolescents for use in general population samples. Journal of Abnormal Child Psychology, 21, 663-681.
Bravo, M., Ribera, J., Rubio-Stipec, M., Canino, G., Shrout, P., Ramirez, R., et al. (2001). Test-retest reliability of the Spanish version of the Diagnostic Interview Schedule for Children (DISC-IV). Journal of Abnormal Child Psychology, 29, 433-444.
Brown, R. T., Freeman, W. S., Perrin, J. M., Stein, M. T., Amler, R. W., Feldman, H. M., et al. (2001). Prevalence and assessment of attention-deficit/hyperactivity disorder in primary care settings. Pediatrics, 107. Retrieved March 10, 2005, from the World Wide Web: http://www.pediatrics.org/cgi/content/full/107/3/e43
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.
Carlson, G. A., & Rapport, M. D. (1989). Diagnostic classification issues in attention-deficit hyperactivity disorder. Psychiatric Annals, 19, 576-583.
Castellanos, F. X., Sharp, W. S., Gottesman, R. F., Greenstein, D. K., Giedd, J. N., & Rapoport, J. L. (2003). Anatomic brain abnormalities in monozygotic twins discordant for attention-deficit/hyperactivity disorder. American Journal of Psychiatry, 160, 1693-1696.
Castellanos, F. X., & Swanson, J. (2002). Biological underpinnings of ADHD. In S. Sandberg (Ed.), Hyperactivity and attention disorders of childhood (2nd ed., pp. 336-366). Cambridge, England: Cambridge University Press.
Chamberlain, P., & Patterson, G. R. (1995). Discipline and child compliance in parenting. In M. Bornstein (Ed.), Handbook of parenting: Vol. 4. Applied and practical parenting (pp. 205-225). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Chen, W. J., Faraone, S. V., Biederman, J., & Tsuang, M. T. (1994). Diagnostic accuracy of the Child Behavior Checklist scales for attention-deficit hyperactivity disorder: A receiver-operating characteristic analysis. Journal of Consulting and Clinical Psychology, 62, 1017-1025.
Chi, T. C., & Hinshaw, S. P. (2002). Mother-child relationships of children with ADHD: The role of maternal depressive symptoms and depression-related distortions. Journal of Abnormal Child Psychology, 30, 387-400.
Chronis, A. M., Fabiano, G. A., Gnagy, E. M., Onyango, A. N., Pelham, W. E., Williams, A., et al. (2004). An evaluation of the Summer Treatment Program for children with attention-deficit/hyperactivity disorder using a treatment withdrawal design. Behavior Therapy, 35, 561-585.
Coie, J. D., & Dodge, K. A. (1998). Aggression and antisocial behavior. In W. Damon (Series Ed.) & N. Eisenberg (Vol. Ed.), Handbook of child psychology: Vol. 3. Social, emotional, and personality development (5th ed., pp. 779-862). New York: Wiley.
Collett, B. R., Ohan, J. L., & Myers, K. M. (2003). Ten-year review of rating scales: V. Scales assessing attention-deficit/hyperactivity disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 42, 1015-1037.
Conduct Problems Prevention Research Group. (2002). Predictor variables associated with positive Fast Track outcomes at the end of third grade. Journal of Abnormal Child Psychology, 30, 37-52.
Conners, C. K. (1969). A teacher rating scale for use in drug studies with children. American Journal of Psychiatry, 126, 884-888.
Conners, C. K., Epstein, J. N., March, J. S., Angold, A., Wells, K. C., Klaric, J., et al. (2001). Multimodal treatment of ADHD (MTA): An alternative outcome analysis. Journal of the American Academy of Child & Adolescent Psychiatry, 40, 159-167.
Conners, C. K., Sitarenios, G., Parker, J. D. A., & Epstein, J. N. (1998a). Revision and restandardization of the Conners Teacher Rating Scale (CTRS-R): Factor structure, reliability, and criterion validity. Journal of Abnormal Child Psychology, 26, 279-291.


Conners, C. K., Sitarenios, G., Parker, J. D. A., & Epstein, J. N. (1998b). The Revised Conners Parent Rating Scale (CPRS-R): Factor structure, reliability, and criterion validity. Journal of Abnormal Child Psychology, 26, 257-268.
Cunningham, C. E., & Barkley, R. A. (1979). The interactions of normal and hyperactive children with their mothers in free play and structured tasks. Child Development, 50, 217-224.
Cunningham, C. E., & Siegel, L. S. (1987). Peer interactions of normal and attention-deficit-disordered boys during free-play, cooperative task, and simulated classroom situations. Journal of Abnormal Child Psychology, 15, 247-268.
De Los Reyes, A., & Kazdin, A. E. (2004). Measuring informant discrepancies in clinical child research. Psychological Assessment, 16, 330-334.
Douglas, V. I. (1999). Cognitive control processes in attention-deficit/hyperactivity disorder. In H. C. Quay & A. E. Hogan (Eds.), Handbook of disruptive behavior disorders (pp. 105-138). New York: Kluwer Academic/Plenum.
DuPaul, G. J. (1991). Parent and teacher ratings of ADHD symptoms: Psychometric properties in a community-based sample. Journal of Clinical Child Psychology, 20, 245-253.
DuPaul, G. J., Anastopoulos, A. D., Power, T. J., Reid, R., Ikeda, M. J., & McGoey, K. E. (1998). Parent ratings of attention-deficit/hyperactivity disorder symptoms: Factor structure and normative data. Journal of Psychopathology and Behavioral Assessment, 20, 83-102.
DuPaul, G. J., Power, T. J., Anastopoulos, A. D., Reid, R., McGoey, K. E., & Ikeda, M. J. (1997). Teacher ratings of attention deficit hyperactivity disorder symptoms: Factor structure and normative data. Psychological Assessment, 9, 436-444.
DuPaul, G. J., Power, T. J., McGoey, K. E., Ikeda, M. J., & Anastopoulos, A. D. (1998). Reliability and validity of the parent and teacher ratings of attention-deficit/hyperactivity disorder symptoms. Journal of Psychoeducational Assessment, 16, 55-68.
DuPaul, G. J., & Stoner, G. (2003). ADHD in the schools: Assessment and intervention strategies. New York: Guilford.
Edelbrock, C., & Costello, A. J. (1988). Convergence between statistically derived behavior problem syndromes and child psychiatric diagnoses. Journal of Abnormal Child Psychology, 16, 219-231.
Epstein, J. N., Erkanli, A., Conners, C. K., Klaric, J., Costello, J. E., & Angold, A. (2003). Relations between continuous performance test performance measures and ADHD behaviors. Journal of Abnormal Child Psychology, 31, 543-554.
Epstein, J. N., March, J. S., Conners, C. K., & Jackson, D. L. (1998). Racial differences on the Conners Teacher Rating Scale. Journal of Abnormal Child Psychology, 26, 109-118.
Epstein, J. N., Willoughby, M., Valencia, E. Y., Tonev, S. T., Abikoff, H. B., Arnold, L. E., et al. (in press). The role of children's ethnicity in the relationship between children's ADHD and observed classroom behavior. Journal of Consulting and Clinical Psychology.
Ervin, R. A., Radford, P. M., Bertsch, K., Piper, A. L., Ehrhardt, K. E., & Poling, A. (2001). A descriptive analysis and critique of the empirical literature on school-based functional assessment. School Psychology Review, 30, 193-210.
Evans, S. W., Allen, J., Moore, S., & Strauss, V. (in press). Measuring symptoms and functioning of youth with ADHD in middle schools. Journal of Abnormal Child Psychology.
Fabiano, G. A., Pelham, W. E., Gnagy, E. M., Waschbusch, D., Lahey, B. B., Chronis, A. M., et al. (2005). A practical impairment measure: Psychometric properties of the Impairment Rating Scale in three samples of children with attention-deficit/hyperactivity disorder. Manuscript submitted for publication.
Fabiano, G. A., Pelham, W. E., Manos, M., Gnagy, E. M., Chronis, A. M., Onyango, A. N., et al. (2004). An evaluation of three time out procedures for children with attention-deficit/hyperactivity disorder. Behavior Therapy, 35, 449-469.
Foster, S., & Mash, E. (1999). Assessing social validity in clinical treatment procedures: Issues and procedures. Journal of Consulting and Clinical Psychology, 67, 308-319.
Frick, P. J., Lahey, B. B., Applegate, B., Kerdyck, L., Ollendick, T., Hynd, G. W., et al. (1994). DSM-IV field trials for the disruptive behavior disorders: Symptom utility estimates. Journal of the American Academy of Child & Adolescent Psychiatry, 33, 529-539.
Gadow, K. D., & Nolan, E. E. (2002). Differences between preschool children with ODD, ADHD, and ODD+ADHD symptoms. Journal of Child Psychology and Psychiatry, 43, 191-201.
Gadow, K. D., & Sprafkin, J. (1997). ADHD Symptom Checklist-4 manual. Stony Brook, NY: Checkmate Plus.
Gadow, K. D., Sprafkin, J., & Nolan, E. E. (2001). DSM-IV symptoms in community and clinic preschool children. Journal of the American Academy of Child & Adolescent Psychiatry, 40, 1383-1392.
Gaub, M., & Carlson, C. L. (1997). Behavioral characteristics of DSM-IV ADHD subtypes in a school-based population. Journal of Abnormal Child Psychology, 25, 103-111.
Goldman, L. S., Genel, M., Bezman, R. J., & Slanetz, P. J. (1998). Diagnosis and treatment of attention-deficit/hyperactivity disorder in children and adolescents. Journal of the American Medical Association, 279, 1100-1107.
Gomez, R., Burns, G. L., Walsh, J. A., & de Moura, A. (2003). A multitrait-multisource confirmatory factor analytic approach to the construct validity of ADHD rating scales. Psychological Assessment, 15, 3-16.
Gomez, R., Harvey, J., Quick, C., Scharer, I., & Harris, G. (1999). DSM-IV AD/HD: Confirmatory factor models, prevalence, and gender and age differences based on parent and teacher ratings of Australian primary school children. Journal of Child Psychology and Psychiatry & Allied Disciplines, 40, 265-274.
Goyette, C., Conners, C., & Ulrich, R. (1978). Normative data on revised Conners parent and teacher rating scale. Journal of Abnormal Child Psychology, 6, 221-236.
Gresham, F. M., Watson, T. S., & Skinner, C. H. (2001). Functional behavioral assessment: Principles, procedures, and future directions. School Psychology Review, 30, 156-172.
Hart, E. L., Lahey, B. B., Loeber, R., & Hanson, K. S. (1994). Criterion validity of informants in the diagnosis of disruptive behavior disorders in children: A preliminary study. Journal of Consulting and Clinical Psychology, 62, 410-414.
Hartman, R. R., Stage, S. A., & Webster-Stratton, C. (2003). A growth curve analysis of parent training outcomes: Examining the influence of child risk factors (inattention, impulsivity, and hyperactivity problems), parental, and family risk factors. Journal of Child Psychology and Psychiatry, 44, 388-398.
Hinshaw, S. P., March, J., Abikoff, H. B., Arnold, L. E., Cantwell, D. P., Conners, C. K., et al. (1997). Comprehensive assessment of childhood attention-deficit hyperactivity disorder in the context of a multisite, multimodal clinical trial. Journal of Attention Disorders, 1, 217-234.
Hinshaw, S. P., & Melnick, S. M. (1995). Peer relationships in boys with attention-deficit hyperactivity disorder with and without comorbid aggression. Development and Psychopathology, 7, 627-647.
Hinshaw, S. P., & Nigg, J. T. (1999). Behavior rating scales in the assessment of disruptive behavior problems in childhood. In D. Shaffer, C. P. Lucas, & J. E. Richters (Eds.), Diagnostic assessment in child and adolescent psychopathology (pp. 91-126). New York: Guilford.
Hinshaw, S. P., Simmel, C., & Heller, T. L. (1995). Multimethod assessment of covert antisocial behavior in children: Laboratory


observations, adult ratings, and child self-report. Psychological Assessment, 7, 209-219.
Hoagwood, K., Kelleher, K. J., Feil, M., & Comer, D. (2000). Treatment services for children with ADHD: A national perspective. Journal of the American Academy of Child & Adolescent Psychiatry, 39, 198-206.
Hodges, K. (1993). Structured interviews for assessing children. Journal of Child Psychology and Psychiatry, 34, 49-68.
Hodges, K., Doucette-Gates, A., & Liao, Q. (1999). The relationship between the Child and Adolescent Functional Assessment Scale (CAFAS) and indicators of functioning. Journal of Child and Family Studies, 8, 109-122.
Hodges, K., & Wong, M. M. (1996). Psychometric characteristics of a multidimensional measure to assess impairment: The Child and Adolescent Functional Assessment Scale. Journal of Child and Family Studies, 5, 445-467.
Hoza, B., Pelham, W. E., Dobbs, J., Owens, J. S., & Pillow, D. R. (2002). Do boys with attention-deficit/hyperactivity disorder have positive illusory self concepts? Journal of Abnormal Psychology, 111, 268-278.
Huang-Pollock, C. L., & Nigg, J. T. (2003). Searching for the attention deficit in attention-deficit/hyperactivity disorder: The case of visuospatial orienting. Clinical Psychology Review, 23, 801-830.
Huesmann, L. R., Eron, L. D., Lefkowitz, M. M., & Walder, L. O. (1984). Stability of aggression over time and generations. Developmental Psychology, 20, 1120-1134.
Jensen, P., Roper, M., Fisher, P., Piacentini, J., Canino, G., Richters, J., et al. (1995). Test-retest reliability of the Diagnostic Interview Schedule for Children (DISC 2.1): Parent, child, and combined algorithms. Archives of General Psychiatry, 52, 61-71.
Jensen, P. S., Watanabe, H. K., Richters, J. E., Roper, M., Hibbs, E. D., Salzberg, A. D., et al. (1996). Scales, diagnoses, and child psychopathology: II. Comparing the CBCL and the DISC against external validators. Journal of Abnormal Child Psychology, 24, 151-168.
Johnston, C., & Mash, E. J. (2001). Families of children with attention-deficit/hyperactivity disorder: Review and recommendations for future research. Clinical Child and Family Psychology Review, 4, 183-207.
Johnston, C., & Murray, C. (2003). Incremental validity in the psychological assessment of children and adolescents. Psychological Assessment, 15, 496-507.
Klein, R. G., & Abikoff, H. (1997). Behavior therapy and methylphenidate treatment of children with ADHD. Journal of Attention Disorders, 2, 89-114.
Kolko, D. J., Bukstein, O. G., & Barron, J. (1999). Methylphenidate and behavior modification in children with ADHD and comorbid ODD or CD: Main and incremental effects across settings. Journal of the American Academy of Child & Adolescent Psychiatry, 38, 578-586.
Lahey, B. B., Applegate, B., McBurnett, K., Biederman, J., Greenhill, L., Hynd, G. W., et al. (1994). DSM-IV field trials for attention deficit hyperactivity disorder in children and adolescents. American Journal of Psychiatry, 151, 1673-1685.
Lahey, B. B., Miller, T. I., Gordon, R. A., & Riley, A. W. (1999). Developmental epidemiology of the disruptive behavior disorders. In H. C. Quay & A. E. Hogan (Eds.), Handbook of disruptive behavior disorders (pp. 23-48). New York: Kluwer Academic/Plenum.
Lahey, B. B., Pelham, W. E., Loney, J., Lee, S. S., & Willcutt, E. (in press). Instability of the DSM-IV subtypes of ADHD from preschool through elementary school. Archives of General Psychiatry.
Lahey, B. B., Pelham, W. E., Schaughency, E. A., Atkins, M. S., Murphy, H. A., Hynd, G., et al. (1988). Dimensions and types of attention deficit disorder. Journal of the American Academy of Child & Adolescent Psychiatry, 27, 330-335.
Lahey, B. B., Pelham, W. E., Stein, M. A., Loney, J., Trapani, C., Nugent, K., et al. (1998). Validity of DSM-IV attention-deficit/hyperactivity disorder for younger children. Journal of the American Academy of Child & Adolescent Psychiatry, 37, 695-702.
Lahey, B. B., & Willcutt, E. G. (2002). Validity of the diagnosis and dimensions of attention-deficit hyperactivity disorder. In P. S. Jensen & J. R. Cooper (Eds.), Attention deficit hyperactivity disorder: State of the science, best practices (pp. 1-1 to 1-23). Kingston, NJ: Civic Research Institute.
Lang, A. R., Pelham, W. E., Atkeson, B. M., & Murphy, D. A. (1999). Effects of alcohol intoxication on parenting in interactions with child confederates exhibiting normal or deviant behaviors. Journal of Abnormal Child Psychology, 27, 177-189.
Langhorne, J. E., Loney, J., Paternite, C. E., & Bechtoldt, H. P. (1976). Childhood hyperkinesis: A return to the source. Journal of Abnormal Psychology, 85, 201-209.
Lewczyk, C. M., Garland, A. F., Hurlburt, M. S., Gearity, J., & Hough, R. L. (2003). Comparing DISC-IV and clinician diagnoses among youths receiving public mental health services. Journal of the American Academy of Child & Adolescent Psychiatry, 42, 349-356.
Loeber, R., Green, S. M., & Lahey, B. B. (1990). Mental health professionals' perception of the utility of children, mothers, and teachers as informants on childhood psychopathology. Journal of Clinical Child Psychology, 19, 136-143.
Loney, J., & Milich, R. (1982). Hyperactivity, inattention, and aggression in clinical practice. Advances in Developmental and Behavioral Pediatrics, 3, 113-147.
Mannuzza, S., & Klein, R. G. (1999). Adolescent and adult outcomes in attention-deficit/hyperactivity disorder. In H. C. Quay & A. E. Hogan (Eds.), Handbook of disruptive behavior disorders (pp. 279-294). New York: Kluwer Academic/Plenum.
Mash, E. J., & Foster, S. L. (2001). Exporting analogue behavioral observation from research to clinical practice: Useful or cost-defective? Psychological Assessment, 13, 86-98.
Mash, E. J., & Johnston, C. (1982). A comparison of the mother-child interactions of younger and older hyperactive and normal children. Child Development, 53, 1371-1381.
Mash, E. J., & Terdal, L. G. (Eds.). (1997). Assessment of childhood disorders (3rd ed.). New York: Guilford.
Mash, E. J., Terdal, L. G., & Anderson, K. (1973). The response-class matrix: A procedure for recording parent-child interactions. Journal of Consulting and Clinical Psychology, 40, 163-164.
Massetti, G. M., Pelham, W. E., Chacko, A., Walker, K. S., Arnold, F. W., Coles, E. K., et al. (2003, November). Situational variability of ADHD, ODD and CD: Psychometric properties of the DBD interview and rating scale. Poster presented at the Association for Advancement of Behavior Therapy Conference, Boston.
Matier-Sharma, K., Perachio, N., Newcorn, J. H., Sharma, V., & Halperin, J. M. (1995). Differential diagnosis of ADHD: Are objective measures of attention, impulsivity, and activity levels helpful? Child Neuropsychology, 1, 118-127.
Mattison, R. E., Gadow, K. D., Sprafkin, J., Nolan, E. E., & Schneider, J. (2003). A DSM-IV-referenced teacher rating scale for use in clinical management. Journal of the American Academy of Child & Adolescent Psychiatry, 42, 442-449.
Meyer, G. J., Finn, S. E., Eyde, L. D., Kay, G. G., Moreland, K. L., Dies, R. R., et al. (2001). Psychological testing and psychological assessment: A review of the evidence and issues. American Psychologist, 56, 128-165.
Milich, R., Balentine, A. C., & Lynam, D. R. (2001). ADHD combined type and ADHD predominantly inattentive type are dis-

Downloaded by [the Bodleian Libraries of the University of Oxford] at 06:43 01 April 2012

473

PELHAM, FABIANO, MASSETTI tinct and unrelated disorders. Clinical Psychology: Science and Practice, 8, 463488. Milich, R., & Landau, S. (1982). Socialization and peer relations in hyperactive children. Advances in Learning and Behavioral Disabilities, 1, 283339. Milich, R., Loney, J., & Landau, S. (1982). Independent dimensions of hyperactivity and aggression: A validation with playroom observation data. Journal of Abnormal Psychology, 91, 183198. Milich, R., Loney, J., & Roberts, M. A. (1986). Playroom observations of activity level and sustained attention: Two-year stability. Journal of Consulting and Clinical Psychology, 54, 272274. Milich, R., Widiger, T. A., & Landau, S. (1987). Differential diagnoses of attention deficit disorders and conduct disorders using conditional probabilities. Journal of Consulting and Clinical Psychology, 55, 762767. Molina, B. S. G., & Pelham, W. E. (2003). Childhood predictors of adolescent substance use in a longitudinal study of children with ADHD. Journal of Abnormal Psychology, 112, 497507. Molina, B. S. G., Smith, B. H., & Pelham, W. E. (2001). Factor structure and criterion validity of secondary school teacher ratings of ADHD and ODD. Journal of Abnormal Child Psychology, 29, 7182. Mori, L. T., & Armendariz, G. M. (2001). Analogue assessment of child behavior problems. Psychological Assessment, 13, 3645. MTA Cooperative Group. (1999a). 14-month randomized clinical trial of treatment strategies for attention deficit hyperactivity disorder. Archives of General Psychiatry, 56, 10731086. MTA Cooperative Group. (1999b). Moderators and mediators of treatment response for children with attention-deficit/hyperactivity disorder. Archives of General Psychiatry, 56, 10881096. MTA Cooperative Group. (2004). National Institute of Mental Health multimodal treatment study of ADHD follow-up: 24-month outcomes of treatment strategies for attention-deficit/hyperactivity disorder (ADHD). Pediatrics, 113, 754761. Murphy, D. A., Pelham, W. 
E., & Lang, A. R. (1992). Aggression in boys with attention-deficit hyperactivity disorder: Methylphenidate effects on naturalistically observed aggression, response to provocation in the laboratory, and social information processing. Journal of Abnormal Child Psychology, 20, 451466. Nangle, D. W., & Erdley, C. A. (Eds.). (2001). The role of friendship in psychological adjustment. San Francisco: Jossey-Bass. Nelson-Gray, R. O. (2003). Treatment utility of psychological assessment. Psychological Assessment, 15, 521531. Newcorn, J. H., Halperin, J. M., Jensen, P. S., Abikoff, H. B., Arnold, L. E., Cantwell, D. P., et al. (2001). Symptom profiles in children with ADHD: Effects of comorbidity and gender. Journal of the American Academy of Child & Adolescent Psychiatry, 40, 137146. Nigg, J. T., Hinshaw, S. P., & Halperin, J. M. (1996). Continuous performance test in boys with attention deficit hyperactivity disorder: Methylphenidate dose response and relations with observed behaviors. Journal of Clinical Child Psychology, 25, 330340. Northup, J., Jones, K., Broussard, C., DiGiovanni, G., Herring, M., Fusilier I., et al. (1999). A preliminary analysis of interactive effects between common classroom contingencies and methylphenidate. Journal of Applied Behavior Analysis, 30, 121125. OLeary, K. D., Pelham, W. E., Rosenbaum, A., & Price, G. H. (1976). Behavioral treatment of hyperkinetic children. Clinical Pediatrics, 15, 510515. Orvaschel, H. (1985). Psychiatric interviews suitable for use in research with children and adolescents. Psychopharmacology Bulletin, 21, 737745. Ostrander, R., Weinfurt, K. P., Yarnold, P. R., & August, G. J. (1998). Diagnosing attention deficit disorders with the Behavioral Assessment System for Children and the Child Behavior Checklist: Test and construct validity analyses using optimal discriminant classification trees. Journal of Consulting and Clinical Psychology, 66, 660672. Patterson, G. R. (1974). 
Interventions for boys with conduct problems: Multiple settings, treatment and criteria. Journal of Consulting and Clinical Psychology, 42, 471481. Pelham, W. E. (2001). Are ADHD/I and ADHD/C the same or different? Does it matter? Clinical Psychology: Science and Practice, 8, 502506. Pelham W. E., & Bender M. E. (1982). Peer relationships in hyperactive children. In K. Gadow & I. Bailer (Eds.), Advances in learning and behavioral disabilities. (Vol. 1, pp. 366436). Greenwich, CT: JAI. Pelham, W. E., Burrows-MacLean, L., Gnagy, E. M., Fabiano, G. A., Coles, E. K., Tresco, K. E., et al. (2005). Transdermal methylphenidate, behavioral, and combined treatmentfor children with ADHD. Experimental and Clinical Psychopharmacology, 13, 111126. Pelham, W. E., Carlson, C., Sams, S. E., Vallano, G., Dixon, M. J., & Hoza, B. (1993). Separate and combined effects of methylphenidate and behavior modification on boys with attentiondeficit hyperactivity disorder in the classroom. Journal of Consulting and Clinical Psychology, 61, 506515. Pelham, W. E., Evans, S. W., Gnagy, E. M., & Greenslade, K. E. (1992). Teacher ratings of DSMIIIR symptoms for the disruptive behavior disorders: Prevalence, factor analyses, and conditional probabilities in a special education sample. School Psychology Review, 21, 285299. Pelham, W. E., & Fabiano, G. A. (2001). Treatment of attention-deficit hyperactivity disorder: The impact of comorbidity. Journal of Clinical Psychology and Psychotherapy, 8, 315329. Pelham, W. E., Fabiano, G. A., Gnagy, E. M., Greiner, A. R., Hoza, B., Manos, M., et al. (2005). Comprehensive psychosocial treatment for ADHD. In E. Hibbs & P. Jensen (Eds.), Psychosocial treatments for child and adolescent disorders: Empirically based strategies for clinical practice (pp. 377410). Washington, DC: American Psychological Association Press. Pelham, W. E., Gnagy, E. M., Burrows-Maclean, L., Williams, A., Fabiano, G. A., Morrissey, S. M., et al. (2001). 
Once-a-day Concerta methylphenidate versus t.i.d. methylphenidate in laboratory and natural settings. Pediatrics, 107. Retrieved June 4, 1999, from the World Wide Web: http://www.pediatrics.org/ cgi/content/full/107/6/e105 Pelham, W. E., Gnagy, E. M., Greenslade, K. E., & Milich, R. (1992). Teacher ratings of DSMIIIR symptoms for the disruptive behavior disorders. Journal of the American Academy of Child & Adolescent Psychiatry, 31, 210218. Pelham, W. E., Gnagy, E. M., Greiner, A. R., Hoza, B., Hinshaw, S. P., Swanson, J. M., et al. (2000). Behavioral versus behavioral and pharmacological treatment in ADHD children attending a summer treatment program. Journal of Abnormal Child Psychology, 28, 507525. Pelham, W. E., Greiner, A., & Gnagy, E. M. (1998). Summer Treatment Program manual. Buffalo, NY: Comprehensive Treatment for Attention Deficit Disorder. Pelham, W. E., & Hoza, B. (1996). Comprehensive treatment for ADHD: Intensive summer treatment programs and follow-up. In E. D. Hibbs & P. S. Jensen (Eds.), Psychosocial treatments for child and adolescent disorders (pp. 311340). Washington, DC: American Psychological Association. Pelham, W. E., Hoza, B., Pillow, D. R., Gnagy, E. M., Kipp, H. L., Greiner, A. R., et al. (2002). Effects of methylphenidate and expectancy on children with ADHD: Behavior, academic performance, and attributions in a summer treatment program and


regular classroom settings. Journal of Consulting and Clinical Psychology, 70, 320–335.
Pelham, W. E., Lahey, B., Gnagy, E., Kipp, H., & Roy, A. (2005, June). Predictive validity of ADHD symptoms vs. impairment on functional outcomes. Poster to be presented at the annual meeting of the International Society for Research on Child and Adolescent Psychopathology, New York.
Pelham, W. E., Lang, A. R., Atkeson, B., Murphy, D. A., Gnagy, E. M., Greiner, A. R., et al. (1998). Effects of deviant child behavior on parental alcohol consumption. American Journal on Addictions, 7, 103–114.
Pelham, W. E., Milich, R., Murphy, D. A., & Murphy, H. A. (1989). Normative data on the IOWA Conners teacher rating scale. Journal of Clinical Child Psychology, 18, 259–262.
Pelham, W. E., Schnedler, R. W., Bologna, N. C., & Contreras, J. A. (1980). Behavioral and stimulant treatment of hyperactive children: A therapy study with methylphenidate probes in a within-subject design. Journal of Applied Behavior Analysis, 13, 221–236.
Pelham, W. E., Wheeler, T., & Chronis, A. (1998). Empirically supported psychosocial treatments for attention deficit hyperactivity disorder. Journal of Clinical Child Psychology, 27, 190–205.
Power, T. J., Andrews, T. J., Eiraldi, R. B., Doherty, B. J., Ikeda, M. J., DuPaul, G. J., et al. (1998). Evaluating attention deficit hyperactivity disorder using multiple informants: The incremental utility of combining teacher with parent reports. Psychological Assessment, 10, 250–260.
Power, T. J., Costigan, T. E., Leff, S. S., Eiraldi, R. B., & Landau, S. (2001). Assessing ADHD across settings: Contributions of behavioral assessment to categorical decision making. Journal of Clinical Child Psychology, 30, 399–412.
Power, T. J., Doherty, B. J., Panichelli-Mindel, S. M., Karustis, J. L., Eiraldi, R. B., Anastopoulos, A. D., et al. (1998). The predictive validity of parent and teacher reports of ADHD symptoms. Journal of Psychopathology and Behavioral Assessment, 20, 57–81.
Quay, H. C., & Peterson, D. R. (1983). Interim manual for the Revised Behavior Problem Checklist. Unpublished manuscript, University of Miami, Coral Gables.
Rapport, M. D., Chung, K. M., Shore, D., Denney, D. B., & Isaacs, P. (2000). Upgrading the science and technology of assessment and diagnosis: Laboratory and clinic-based assessment of children with ADHD. Journal of Clinical Child Psychology, 29, 555–568.
Rapport, M. D., Murphy, H. A., & Bailey, J. S. (1982). Ritalin vs. response cost in the control of hyperactive children: A within-subject comparison. Journal of Applied Behavior Analysis, 15, 205–216.
Reich, W. (2000). Diagnostic Interview for Children and Adolescents (DICA). Journal of the American Academy of Child and Adolescent Psychiatry, 39, 59–66.
Reich, W., Shayka, J. J., & Taibleson, C. (1991). Diagnostic Interview for Children and Adolescents–DSM–III–R version (parent form). St. Louis, MO: Washington University, Division of Child Psychiatry.
Reid, R., Casat, C. D., Norton, J. H., Anastopoulos, A. D., & Temple, E. P. (2001). Using behavior rating scales for ADHD across ethnic groups: The IOWA Conners. Journal of Emotional and Behavioral Disorders, 9, 210–218.
Reid, R., DuPaul, G. J., Power, T. J., Anastopoulos, A. D., Rogers-Adkinson, D., Noll, M., et al. (1998). Assessing culturally different students for attention-deficit hyperactivity disorder using behavior rating scales. Journal of Abnormal Child Psychology, 26, 187–198.
Reynolds, C. R., & Kamphaus, R. W. (2002). The clinician's guide to the Behavior Assessment Scale for Children. New York: Guilford.
Roberts, M. A., Milich, R., Loney, J., & Caputo, J. (1981). A multi-trait, multi-method analysis of variance of teachers' ratings of aggression, hyperactivity, and inattention. Journal of Abnormal Child Psychology, 9, 371–380.
Roberts, M. W. (2001). Clinic observations of structured parent–child interaction designed to evaluate externalizing disorders. Psychological Assessment, 13, 46–58.
Samuel, V. J., Thornell, A., George, P., Taylor, A., Brome, D. R., Biederman, J., et al. (1997). The unexplored void of ADHD and African-American research: A review of the literature. Journal of Attention Disorders, 1, 197–207.
Schwab-Stone, M. E., Shaffer, D., Dulcan, M. K., Jensen, P. S., Fisher, P., Bird, H. R., et al. (1996). Criterion validity of the NIMH Diagnostic Interview Schedule for children version 2.3 (DISC–2.3). Journal of the American Academy of Child & Adolescent Psychiatry, 35, 878–888.
Scotti, J. R., Morris, T. L., McNeil, C. B., & Hawkins, R. P. (1996). DSM–IV and disorders of childhood and adolescence: Can structural criteria be functional? Journal of Consulting and Clinical Psychology, 64, 1177–1191.
Sergeant, J. A., Oosterlaan, J., & Van Der Meere, J. (1999). Information processing and energetic factors in attention-deficit/hyperactivity disorder. In H. C. Quay & A. E. Hogan (Eds.), Handbook of disruptive behavior disorders (pp. 75–104). New York: Kluwer Academic/Plenum.
Shaffer, D., Fisher, P., Lucas, C. P., Dulcan, M. K., & Schwab-Stone, M. E. (2000). NIMH Diagnostic Interview Schedule for Children Version IV (NIMH DISC–IV): Description, differences from previous versions, and reliability of some common diagnoses. Journal of the American Academy of Child & Adolescent Psychiatry, 39, 28–38.
Shaffer, D., Gould, M. S., Brasic, J., Ambrosini, P., Fisher, P., Bird, H., et al. (1983). A Children's Global Assessment Scale (CGAS). Archives of General Psychiatry, 40, 1228–1231.
Shaffer, D., Schwab-Stone, M., Fisher, P., Davies, M., Piacentini, J., & Gioia, P. (1988). A revised version of the Diagnostic Interview Schedule for Children (DISC–R). New York: New York State Psychiatric Institute, Columbia University College of Physicians and Surgeons, Division of Child Psychiatry.
Smith, B. H., Pelham, W. E., Gnagy, E., Molina, B., & Evans, S. (2000). The reliability, validity, and unique contributions of self-report by adolescents receiving treatment for attention-deficit/hyperactivity disorder. Journal of Consulting and Clinical Psychology, 68, 489–499.
Sprafkin, J., Gadow, K. D., & Nolan, E. E. (2001). The utility of a DSM–IV-referenced screening instrument for attention-deficit/hyperactivity disorder. Journal of Emotional and Behavioral Disorders, 9, 182–191.
Sprafkin, J., Gadow, K. D., Salisbury, H., Schneider, J., & Loney, J. (2002). Further evidence of reliability and validity of the Child Symptom Inventory–4: Parent Checklist in clinically referred boys. Journal of Clinical Child and Adolescent Psychology, 31, 513–524.
Sylvester, C. E., Hyde, T. S., & Reichler, R. J. (1987). The Diagnostic Interview for Children and Personality Inventory for Children in studies of children at risk for anxiety disorders or depression. Journal of the American Academy of Child & Adolescent Psychiatry, 26, 668–675.
Tillman, R., Geller, B., Craney, J. L., Bolhofner, K., Williams, M., Zimerman, B., Frazier, J., et al. (2003). Temperament and character factors in a prepubertal and early adolescent bipolar disorder phenotype compared to attention deficit hyperactive and normal controls. Journal of Child and Adolescent Psychopharmacology, 13, 531–543.
Waschbusch, D. A., Sparkes, S. J., & Northern Partners in Action for Child and Youth Services. (2003). Rating scale assessment of attention-deficit/hyperactivity disorder (ADHD) and oppositional defiant disorder (ODD): Is there a normal distribution and does it matter? Journal of Psychoeducational Assessment, 21, 261–281.
Wolraich, M. L., Feurer, I. D., Hannah, J. N., Baumgaertel, A., & Pinnock, T. Y. (1998). Obtaining systematic teacher reports of disruptive behavior disorders utilizing DSM–IV. Journal of Abnormal Child Psychology, 26, 141–152.
Wolraich, M. L., Lambert, W., Doffing, M. A., Bickman, L., Simmons, T., & Worley, K. (2003). Psychometric properties of the Vanderbilt ADHD diagnostic parent rating scale in a referred population. Journal of Pediatric Psychology, 28, 559–568.
Yates, B. T., & Taub, J. (2003). Assessing the costs, benefits, cost-effectiveness, and cost-benefit of psychological assessment: We should, we can, and here's how. Psychological Assessment, 15, 478–495.

Received July 31, 2004
Accepted March 26, 2005
