Assessment of Depression The Depression Inventory
Pharmacopsychiat., vol. 7, pp. 151-169, ed. P. Pichot, Paris (Karger, Basel 1974)
Assessment of Depression:
The Depression Inventory
AARON T. BECK and ALICE BEAMESDERFER
Introduction
tributed to one of these causes: fluctuations in the clinical state of the patient,
5 %; inconsistencies by one of the psychiatrists, 37 %; inadequacies in the
nosological system, 58 %.
factors, only one of which is consistent with the clinical definition of depres-
sion. Another study [O'CONNOR et al., 1957] isolated five clusters. Again,
since only one of these clusters was related to the clinical concept of depres-
sion, the authors questioned the attribution of unitary significance to the
D-Scale. Other studies suggest that the MMPI is sensitive to response sets
such as the social desirability response set and the acquiescence response set
[MESSICK, 1960].
HAMILTON'S [1960] rating scale for depression required administration
by experienced diagnosticians. More recently, adjective checklists have been
developed to measure depression and other affects [CLYDE, 1961; ZUCKER-
MAN and LUBIN, 1965]. Nevertheless, subjective feelings are only one aspect
of depressive illness, and failure to account for other dimensions in depression
impairs the usefulness of these tests. Depression is much more than an
unpleasant feeling state; it is a complex disorder involving cognitive,
motivational, behavioral, and affective components [BECK, 1972].
Development of the DI
cally excluded from the study. The percentages among major diagnostic cate-
gories were: psychotic disorder, 41 %; psychoneurotic disorder, 43 %; per-
sonality disorder, 16 %. The three largest subgroups were: schizophrenic re-
action, 28.2 %; psychoneurotic depressive reaction, 25.3 %; anxiety reaction,
15.5 %.
The psychiatrists in our study had several preliminary meetings during
which they reached an agreement regarding the criteria for each of the noso-
logical categories and focused special attention on the various types of de-
pression. The American Psychiatric Association's Diagnostic and statistical
manual of mental disorders [1952] was used, but considerable amplification of
the diagnostic descriptions was necessary.
The psychiatrists also established specific indices for use in making a
clinical evaluation of the depth of depression. For each specified sign and
symptom the psychiatrists made a rating on a four-point scale of none,
mild, moderate, and severe. These indices were used to increase uniform-
ity among the psychiatrists. However, in rating the depth of depression,
they made a global judgment and were not confined by the ratings in each
index.
The psychiatrists also rated the patient on the degree of overt anxiety
and agitation, and filled out a checklist to indicate the presence of other spe-
cific psychosomatic and psychiatric symptoms and disturbances in concen-
tration, recall, memory, reality-testing, and judgment. They also rated the
severity of the present illness on a four-point scale.
Each patient was seen by two psychiatrists who made independent judg-
ments of the depth of depression and the diagnosis. After the second inter-
view, the psychiatrists met and discussed the case to ascertain the reasons for
any disagreement. The DI was administered independently by a trained tech-
nician.
Reliability of the DI
606 cases showed that each item had a significant positive correlation with
the total DI score [BECK, 1972].
The second method of evaluating internal consistency was the determi-
nation of the split-half reliability. 97 cases were selected for this analysis. The
Pearson r between the odd and even categories was computed and yielded a
reliability coefficient of 0.86; a Spearman-Brown correction for attenuation
raised the coefficient to 0.93.
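The step from 0.86 to 0.93 can be checked against the Spearman-Brown formula for a test of doubled length. The following is a minimal sketch, not the authors' computation; the function name is hypothetical, and the input is the rounded half-test correlation quoted above.

```python
def spearman_brown(half_r: float, factor: float = 2.0) -> float:
    """Estimate the reliability of a test `factor` times as long as the
    one that produced `half_r` (Spearman-Brown formula)."""
    return factor * half_r / (1.0 + (factor - 1.0) * half_r)


# Odd-even correlation reported in the text (already rounded to 0.86).
corrected = spearman_brown(0.86)
print(round(corrected, 3))  # 0.925
```

With the rounded input the formula gives about 0.925, close to the published 0.93; the small difference presumably reflects rounding of the half-test correlation.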
Traditional methods of assessing the stability and consistency of inven-
tories were not appropriate for the evaluation of the DI. Test-retest did not
seem proper because of the possible influence of memory on the scores. If a
long interval were provided, the score would be influenced by fluctuations in
the intensity of depression. The inter-rater reliability method was not used
for the same reasons, i.e. two successive technicians would have to administer
the test.
Because of these considerations, we used two indirect methods of eval-
uating the stability of the inventory. The first was a variation of the test-retest
method. The inventory was administered to a group of 38 patients by a tech-
nician at two different times, with a mean interval of 4 weeks between the two
tests. Each time, a clinical rating of the depth of depression was made by a
psychiatrist. We found that changes in the DI scores paralleled changes in the
clinical ratings of the depth of depression.
An indirect measure of the inter-rater reliability was achieved by com-
paring the scores obtained by each of the three participating technicians with
the clinical ratings. The mean scores, respectively, obtained at each level of
depression were virtually identical among the interviewers. When the DI
scores were plotted against the depth of depression, the curves were notably
similar, indicating a high level of agreement among those who administered
the inventory.
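The comparison described above amounts to computing the mean DI score for each technician at each clinical level of depression and checking how far those means diverge. The sketch below illustrates this under stated assumptions: the function names, the record layout, and all data values are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_technician_and_depth(records):
    """records: iterable of (technician, depth_rating, di_score) tuples.
    Returns {(technician, depth_rating): mean DI score}."""
    groups = defaultdict(list)
    for technician, depth, score in records:
        groups[(technician, depth)].append(score)
    return {key: mean(values) for key, values in groups.items()}


def spread_per_depth(means):
    """Largest difference between technicians' mean scores at each depth."""
    by_depth = defaultdict(list)
    for (_technician, depth), value in means.items():
        by_depth[depth].append(value)
    return {depth: max(v) - min(v) for depth, v in by_depth.items()}


# Hypothetical records, for illustration only.
records = [
    ("T1", "mild", 12), ("T1", "mild", 14), ("T2", "mild", 13),
    ("T1", "severe", 30), ("T2", "severe", 31),
]
means = mean_score_by_technician_and_depth(records)
print(spread_per_depth(means))
```

A small spread at every depth level corresponds to the "virtually identical" means reported in the text.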
Validity of the DI
tests and diagnostic techniques [1954] recommends the use of concurrent va-
lidity and construct validity criteria in evaluating personality tests.
Concurrent Validity
Construct Validity
As pointed out by CRONBACH and MEEHL [1955], the most relevant in-
formation dealing with personality variables is obtained from an assessment
of the construct validity of the test. In short, the construct validity of a mea-
sure is determined by setting up a number of hypotheses regarding the per-
sonality variable (depression in our case). If the hypothesis is confirmed in an
experiment which uses the test as a criterion measure, the validity of the in-
strument is supported.
The hypotheses that we tested in our own investigations of depression
were: (1) depressed patients are likely to have a certain kind of dream charac-
terized by 'masochistic' content; (2) they are likely to have a negative self-
concept; (3) they identify with the 'loser' on projective tests dealing with suc-
cess and failure; (4) they have a history of deprivation that sensitized them to
depression in later life; (5) they respond to experimentally induced failure
with a disproportionate drop in self-esteem and increase in hopelessness; (6)
following a success experience, depressed patients will show a significant
subjective and objective improvement, and (7) they show a high correlation
between intensity of depression and suicidal intent.
Using the DI as the criterion measure, we found that these predictions
were largely supported. BECK and WARD [1961] found a significant relation-
ship between depression and 'masochistic' dreams. BECK and STEIN [1960]
found that depressives score highly on a self-concept test, with high scores in-
dicating negative self-concept. BECK [1961] found that depressed patients
identify with the 'loser' when presented with a series of pictorial stimuli.
BECK et al. [1963] obtained a significant relationship between childhood bereavement
and adult depression. LOEB et al. [1964] demonstrated that depressed
patients make excessively pessimistic predictions after inferior task
ratings (r = -0.026). While the reason for this discrepancy is not immediate-
ly apparent, it is possibly the result of a response set in less-educated patients.
Response Sets
Several factor analyses of the DI have been reported. DELAY et al. [1963]
administered the DI to 79 depressed patients in France. Since each individual
item of the DI correlated positively with the total DI score, the authors con-
cluded that there was evidence of a 'general factor' of depression. In a subse-
quent study, PICHOT and LEMPERIERE [1964] added 56 cases of depression to
their initial sample, making a total of 135 patients tested with the DI. Four
factors were then extracted from the data through factor analysis; factor A
consisted of the physiological signs of depression ('vital depression'); factor
B consisted of items relevant to the sense of self-derogation ('self-debasement');
factor C contained items related to hopelessness and suicide ('pessimism-suicide');
and factor D revolved around two motivational symptoms
('indecision-inhibition').
PICHOT et al. [1966] performed a factor analysis utilizing a modified ver-
sion of the DI (20 of the original 21 items, plus 13 additional items). In addi-
tion to a general factor of depression, these authors extracted 10 factors
which were submitted to Varimax rotation and grouped into categories. They
found 3 of the 10 factors (lethargy, intrapunitive, and affect) to be highly reli-
able measures of subjective symptomatology. Four other factors (somatic,
loss of libido, sleeping trouble, and anxiety) were 'interesting but weak'
measures. The remaining three factors were uninterpretable.
CROPLEY and WECKOWICZ [1966] and WECKOWICZ et al. [1967] also per-
formed a factor analysis using the DI. These authors reported three signifi-
cant factors: The first, called 'guilty depression', was heavily loaded on guilt
feelings, sense of punishment, self-accusation, sense of failure, self-punitive
wishes, self-hate, depressed mood, indecisiveness, and pessimism. The second
was identified as 'retarded depression', with high loadings on work inhibi-
tion, fatigue, lack of satisfaction, depressed mood, somatic preoccupation,
and indecisiveness. The third factor, 'somatic disturbance', was defined by
high loadings on weight loss, loss of appetite, and sleep disturbance. One oth-
er factor, called 'tearful depression', approached the authors' criterion of sig-
nificance. This factor loaded highly on body image, crying spells, and loss of
libido.
A more recent factor analytic study by WECKOWICZ et al. [1971] at-
tempted to relate the factors obtained from clinical evaluations, symptoms,
and complaints to physiological measures. Subjects were administered the
DI, psychomotor tests, the Shagass sedation threshold test, salivation tests,
and autonomic nervous system activity tests. The following six significant factors
were obtained: (1) 'somatic factor of retarded depression'; (2) 'atypical'
schizoid and involutional depression; (3) 'typical guilty depression'; (4) 'bodily
fatigue and neurasthenic exhaustion'; (5) 'somatization'; and (6) 'hypochondriasis'.
Cut-Off Scores
There is no arbitrary score that can be used for all purposes as a cut-off
point in the DI. The specific cut-off point depends upon the characteristics
of the patients in the sample and on the purposes for which the inventory is
being used.
The crux of the problem is: How many false-positives and false-nega-
tives occur at a particular cutting score? For identifying a relatively pure
group of depressed patients for research purposes, the investigator wants to
minimize false positives (i.e. high scorers who are not really depressed). He
may not be concerned about false negatives, who would be excluded from his
study. A high cutting score, therefore, should be used (a score of more than
21 on the original DI).
As a screening device to detect depression among psychiatric patients, we
found a cut-off point of 13 is appropriate. This score gives fewer false-nega-
tives, but more false-positives than the higher cut-off point. For screening de-
pression among medical patients, SCHWAB et al. [1967a] found that a cutting
score of 10 was appropriate.
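The trade-off between false positives and false negatives at different cutting scores can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function name, the rule that a score at or above the cutting score counts as a positive, and the scores and clinical judgments themselves, which are invented and not drawn from the study.

```python
def misclassifications(scores, depressed, cutoff):
    """Return (false_positives, false_negatives) when every score at or
    above `cutoff` is flagged as depressed (assumed decision rule)."""
    pairs = list(zip(scores, depressed))
    false_pos = sum(1 for s, d in pairs if s >= cutoff and not d)
    false_neg = sum(1 for s, d in pairs if s < cutoff and d)
    return false_pos, false_neg


# Hypothetical DI scores and clinical judgments, for illustration only.
scores = [4, 10, 11, 14, 15, 18, 22, 25, 30]
depressed = [False, False, True, False, True, True, False, True, True]

# 10 and 13 are the screening cut-offs named in the text; 22 stands for
# "a score of more than 21".
for cutoff in (10, 13, 22):
    print(cutoff, misclassifications(scores, depressed, cutoff))
# 10 (3, 0)
# 13 (2, 1)
# 22 (1, 3)
```

In this invented sample, raising the cutting score removes false positives at the cost of more false negatives, which is the pattern the text describes for research selection versus screening.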
Abridged, self-administered DI
A new, short form of the DI has recently been developed to aid general
practitioners, as well as researchers, in the rapid screening of depressed pa-
tients [BECK and BECK, 1972]. Because depression may be masked by physi-
cal symptoms, the diagnosis is likely to be missed by a general practitioner,
and the depression often goes untreated. SALKIND'S [1969] data indicated
that 48 % of office practice patients manifested depression ranging from mild
The DI has been recommended for all general practitioners in the British
Health Service by RAWNSLEY [1968]. In order to facilitate its use by family
physicians, a shorter, simplified version of the scale was developed to help
identify depressed patients. The new form requires approximately 5 min to
complete, and is suitable for self-administration by the patient.
We set the following criteria for the abridged version of the DI: (1) maximum
correlation with clinicians' ratings of depth of depression, and (2) a 10- to
15-item scale that would correlate better than 0.90 with the long form. In
our previous validation and reliability study, each of the 21 items was corre-
lated with the total DI score and with clinicians' depth of depression
ratings. The items were ranked for each of the two correlations, and the
two ranks were consolidated into a final rank.
Initially, the item that had the best correlation with the total DI score
was selected for the abridged form; then the sum of the best two items, then
the best three, and so on until the cumulative correlations levelled off. While
the correlation with the total DI score reached its criterion after seven items,
that with the clinicians' ratings did so after 13 items. Thus, we obtained a 13-
item questionnaire correlating 0.96 with the total DI score and 0.61 with the
clinicians' ratings of depression.
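The cumulative procedure described above can be sketched as follows: items are taken in rank order, their scores are summed one item at a time, and each running sum is correlated with a criterion (the total DI score or the clinicians' ratings). This is a minimal illustration rather than the authors' program; the function names, the item labels, and the data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences.
    Assumes neither sequence is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cumulative_correlations(item_scores, ranked_items, criterion):
    """Correlate the sum of the best 1, 2, 3, ... items with `criterion`.
    item_scores maps item label -> list of per-patient scores."""
    running = [0] * len(criterion)
    result = []
    for item in ranked_items:
        running = [r + s for r, s in zip(running, item_scores[item])]
        result.append(pearson_r(running, criterion))
    return result


def items_needed(correlations, threshold):
    """Number of items after which the correlation first exceeds
    `threshold`, or None if it never does."""
    for count, r in enumerate(correlations, start=1):
        if r > threshold:
            return count
    return None


# Hypothetical scores for three items and five patients.
items = {"A": [0, 1, 2, 3, 3], "B": [0, 0, 1, 2, 3], "C": [1, 0, 2, 1, 3]}
total = [a + b + c for a, b, c in zip(items["A"], items["B"], items["C"])]
corrs = cumulative_correlations(items, ["A", "B", "C"], total)
print(items_needed(corrs, 0.90))
```

Run once against the total score and once against the clinicians' ratings, the same routine would yield the two stopping points the text reports (seven and 13 items), given the original data.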
After rescoring the DIs of our original 599 patients on the basis of these
13 items, we computed the standard deviations and means for groups catego-
rized by clinicians' ratings of depth of depression and established cut-off
points for each category. While the clinician or investigator may have to
probe deeper if he wants a more complete estimate of the severity of depression,
our cut-off points can alert him to the probable degree of severity without
subjecting the patient to a lengthy psychiatric interview and mental status
examination. The ranges of scores for the abridged DI are: 0-4, none or
minimal; 5-7, mild; 8-15, moderate; 16+, severe.
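The score ranges can be expressed as a simple lookup. The sketch below is illustrative; the function name is hypothetical. It treats the bands as non-overlapping, with a score of 4 falling in "none or minimal", and it assumes 13 items each scored 0 to 3, so that totals run from 0 to 39. Both points are assumptions, not statements from the text.

```python
def abridged_di_severity(total: int) -> str:
    """Map a total score on the 13-item abridged DI to a severity band."""
    # Assumed valid range: 13 items, each scored 0-3.
    if not 0 <= total <= 39:
        raise ValueError("total must be between 0 and 39")
    if total <= 4:
        return "none or minimal"
    if total <= 7:
        return "mild"
    if total <= 15:
        return "moderate"
    return "severe"


for score in (3, 6, 12, 20):
    print(score, abridged_di_severity(score))
```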
References
WILLIAMS, J. G.; BARLOW, D. H., and AGRAS, W. S.: Behavioral measurement of severe
depression (in press, 1972).
ZUCKERMAN, M. and LUBIN, B.: Manual for the multiple affect adjective check list
(Educational and Industrial Testing Service, San Diego 1965).
ZUNG, W. W. K.: A self-rating depression scale. Arch. gen. Psychiat. 12: 63-70 (1965).
ZUNG, W. W. K.: A cross-cultural study of symptoms in depression. Amer. J. Psychiat.
126: 154-159 (1969).
Appendix
Instructions
Be sure to read all the statements in each group before making your choice.
A. (Sadness)
0 I do not feel sad
1 I feel sad or blue
2 I am blue or sad all the time and I can't snap out of it
3 I am so sad or unhappy that I can't stand it
B. (Pessimism)
0 I am not particularly pessimistic or discouraged about the future
1 I feel discouraged about the future
C. (Sense of Failure)
0 I do not feel like a failure
1 I feel I have failed more than the average person
2 As I look back on my life, all I can see is a lot of failures
3 I feel I am a complete failure as a person (parent, husband, wife)
D. (Dissatisfaction)
0 I am not particularly dissatisfied
1 I don't enjoy things the way I used to
2 I don't get satisfaction out of anything anymore
3 I am dissatisfied with everything
E. (Guilt)
0 I don't feel particularly guilty
1 I feel bad or unworthy a good part of the time
2 I feel quite guilty
3 I feel as though I am very bad or worthless
F. (Self-Dislike)
0 I don't feel disappointed in myself
1 I am disappointed in myself
2 I am disgusted with myself
3 I hate myself
G. (Self-Harm)
0 I don't have any thoughts of harming myself
1 I feel I would be better off dead
2 I have definite plans about committing suicide
3 I would kill myself if I had the chance
H. (Social Withdrawal)
0 I have not lost interest in other people
1 I am less interested in other people than I used to be
2 I have lost most of my interest in other people and have little feeling for them
3 I have lost all of my interest in other people and don't care about them at all
I. (Indecisiveness)
0 I make decisions about as well as ever
1 I try to put off making decisions
2 I have great difficulty in making decisions
3 I can't make any decisions at all any more
J. (Self-Image Change)
0 I don't feel I look any worse than I used to
1 I am worried that I am looking old or unattractive
2 I feel that there are permanent changes in my appearance and they make me
look unattractive
3 I feel that I am ugly or repulsive looking
K. (Work Difficulty)
0 I can work about as well as before
1 It takes extra effort to get started at doing something
L. (Fatigability)
0 I don't get any more tired than usual
1 I get tired more easily than I used to
2 I get tired from doing anything
3 I get too tired to do anything
M. (Anorexia)
0 My appetite is no worse than usual
1 My appetite is not as good as it used to be
2 My appetite is much worse now
3 I have no appetite at all any more