CHAPTER 2

Signal Detection, Information Theory, and Absolute Judgment
OVERVIEW
Information processing in most systems begins with the detection of some environ-
mental event. In a major catastrophe, the event is so noticeable that immediate detec-
tion is assured. The information-processing problems in these circumstances are those
of recognition and diagnosis. However, there are many other circumstances in which
detection itself represents a source of uncertainty or a potential bottleneck in perfor-
mance because it is necessary to detect events that are near the threshold of percep-
tion. Will the security guard monitoring a bank of television pictures detect the
abnormal movement on one of them? Will the radiologist detect the abnormal x-ray
as it is scanned? Will the industrial inspector detect the flaw in the product? Will the
van driver described in Chapter 1 be able to detect a vehicle in front of him in poor
visibility conditions?
This chapter first will deal with the situation in which an observer classifies the
world into one of two states: a signal is present or it is absent. The detection process is
modeled within the framework of signal detection theory, and we show how the model
can assist engineering psychologists in understanding the complexities of the detection
process, in diagnosing what goes wrong when detection fails, and in recommending cor-
rective solutions.
The process of detection may involve more than two states of categorization. It
may, for example, require the human operator to choose between three or four levels
of uncertainty about the presence of a signal or to detect more than one kind of sig-
nal. At this point, we introduce information theory, and then use it to describe the sim-
plest form of multilevel categorization, the absolute judgment task. Finally, we consider
the more complex multidimensional stimulus judgment. We again use a signal detec-
tion approach to account for performance in both multilevel and multidimensional
categorization judgments.
SIGNAL DETECTION THEORY
The Signal Detection Paradigm
Signal detection theory is applicable in any situation in which there are two discrete states
of the world (signal and noise) that cannot easily be discriminated. Signals must be de-
tected by the human operator, and in the process two response categories are produced:
Yes (I detect a signal) and no (I do not). This situation may describe activities such as the
detection of a concealed weapon by an airport security guard, the detection of a contact
on a radar scope (N. H. Mackworth, 1948), a malignant tumor on an x-ray plate by a ra-
diologist (Parasuraman, 1985; Swets & Pickett, 1982), a malfunction of an abnormal sys-
tem by a nuclear plant supervisor (Lees & Sayers, 1976), a critical event in air traffic
control (Bisseret, 1981), typesetting errors by a proofreader (Anderson & Revelle, 1982),
an untruthful statement from a polygraph (Szucko & Kleinmuntz, 1981), a crack on the
body of an aircraft, or a communications signal from intelligent life in the bombard-
ment of electromagnetic radiation from outer space (Blake & Baird, 1980).
The combination of two states of the world and two response categories produces
the 2 × 2 matrix shown in Figure 2.1, generating four classes of joint events, labeled hits,
misses, false alarms, and correct rejections. It is apparent that perfect performance is that
in which no misses or false alarms occur. However, since the signals are not very intense
in the typical signal detection task, misses and false alarms do occur, and so there are
normally data in all four cells. In signal detection theory (SDT) these values are typically
expressed as probabilities, by dividing the number of occurrences in a cell by the total
number of occurrences in a column. Thus if 20 signals were presented, and there were
5 hits and 15 misses, we would write P(hit) = 5/20 = 0.25.

                          State of the world
                      Signal              Noise
           Yes        Hit                 False alarm
Response
           No         Miss                Correct rejection

Figure 2.1 The four outcomes of signal detection theory.
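These calculations are simple enough to express directly in code. The sketch below is our illustration, not part of the original text; the signal-trial numbers follow the worked example above, while the noise-trial counts are hypothetical values added to complete the matrix.

```python
# Illustrative sketch: converting the four cell counts of Figure 2.1 into
# probabilities by dividing each cell by its column total.
signal_trials = 20
noise_trials = 80                            # hypothetical count
hits, misses = 5, 15                         # from the worked example
false_alarms, correct_rejections = 8, 72     # hypothetical counts

p_hit = hits / signal_trials                 # P(hit) = 5/20 = 0.25
p_miss = misses / signal_trials              # P(miss) = 15/20 = 0.75
p_fa = false_alarms / noise_trials           # P(false alarm) = 0.10
p_cr = correct_rejections / noise_trials     # P(correct rejection) = 0.90

# Each column of the matrix divides into two complementary probabilities.
assert abs(p_hit + p_miss - 1.0) < 1e-9
assert abs(p_fa + p_cr - 1.0) < 1e-9
```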
The SDT model (Green & Swets, 1966) assumes that there are two stages of infor-
mation processing in the task of detection: (1) Sensory evidence is aggregated con-
cerning the presence or absence of the signal, and (2) a decision is made about whether
this evidence indicates a signal or not. According to the theory, external stimuli gener-
ate neural activity in the brain. Therefore, on the average there will be more sensory or
neural evidence in the brain when a signal is present than when it is absent. This neural
evidence, X, may be conceived as the rate of firing of neurons at a hypothetical “detec-
tion center.” The rate increases with stimulus intensity. We refer to the quantity X as
the evidence variable. Therefore, if there is enough neural activity, X exceeds a crit-
ical threshold Xc, and the operator decides “yes.” If there is too little, the operator
decides “no.”
Because the amount of energy in the signal is typically low, the average amount of
X generated by signals in the environment is not much greater than the average gener-
ated when no signals are present (noise). Furthermore, the quantity of X varies contin-
uously even in the absence of a signal because of random variations in the environment
and in the operator's own “baseline” level of neural firing (e.g., the neural “noise” in the
sensory channels and the brain). This variation is shown in Figure 2.2. Therefore, even
when no signal is present, X will sometimes exceed the criterion Xc as a result of ran-
dom variations alone, and the operator will say “yes” (generating a false alarm at point
A of Figure 2.2). Correspondingly, even with a signal present, the random level of ac-
tivity may be low, causing X to be less than the criterion, and the operator will say “no”
(generating a miss at point B of Figure 2.2). The smaller the difference in intensity be-
tween signals and noise, the greater these error probabilities become because the amount
of variation in X resulting from randomness increases relative to the amount of energy
in the signal. In Figure 2.2, the average level of X is increased slightly in the presence of
a weak signal and greatly when a strong signal is presented.
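The two-stage account lends itself to a small simulation. The following sketch is ours, with arbitrary parameter values: the evidence variable X is drawn as Gaussian noise, a fixed increment is added on signal trials, and the decision rule says “yes” whenever X exceeds the criterion.

```python
# Illustrative simulation of the evidence variable X and the threshold
# decision rule. All parameter values are assumptions for demonstration.
import numpy as np

rng = np.random.default_rng(seed=1)
n_trials = 10_000
signal_present = rng.random(n_trials) < 0.5    # half the trials contain a signal
noise = rng.normal(0.0, 1.0, n_trials)         # random internal/external variation
signal_strength = 1.0                          # a "weak" signal; try 3.0 for "strong"
X = noise + signal_strength * signal_present   # evidence variable

Xc = 0.5                                       # the operator's criterion
say_yes = X > Xc

p_hit = say_yes[signal_present].mean()         # P(H)
p_fa = say_yes[~signal_present].mean()         # P(FA)
print(f"P(H) = {p_hit:.3f}, P(FA) = {p_fa:.3f}")
```

With a weak signal the two conditional distributions of X overlap heavily, so the simulation produces both false alarms and misses no matter where the criterion is placed.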
For example, consider a person monitoring a noisy radar screen. Somewhere in
the midst of the random variations in stimulus intensity caused by reflections from
clouds and rain, there is an extra increase in intensity that represents the presence of the
signal—an aircraft. The amount of noise will not be constant over time but will fluctu-
ate; sometimes it will be high, completely masking the stimulus, and sometimes low, al-
lowing the plane to stand out. In this example, “noise” varies in the environment.
Similarly, for the van driver described in Chapter 1, random variation produced noise
that made it more difficult for him to observe relevant signals—vehicles in the road
ahead. Suppose, instead, you were stand-
ing watch on a ship, searching the horizon on a dark night for a faint light. It becomes
difficult to distinguish the flashes that might be real lights from those that are just
“visual noise” in your own sensory system. In this case, the random noise is internal.
Thus “noise” in signal detection theory is a combination of noise from external and
internal sources. The van driver described in Chapter 1 was subject to inclement weather,
which produced random variations in the reflections from sleet and spray.
Figure 2.2 The change in the evidence variable X caused by a weak and a strong signal. Notice that with the weak signal, there can sometimes be less evidence when the signal is present (point B) than when the signal is absent (point A).

The relations between the presence or absence of the signal, random variability of
X, and Xc can be seen in Figure 2.3. The figure plots the probability of observing a spe-
cific value of X, given that a noise trial (left curve) or signal trial (right curve) in fact oc-
curred. These data might have been tabulated (from the graph of the evidence variable
at the top of Figure 2.2) by counting the relative frequency of different X values during the
intervals when the signal was off, creating the probability curve on the left of Figure 2.3,
and making a separate count of the probability of different X values while the weak signal
was on, generating the curve on the right of Figure 2.3. As the value of X increases, it is
more likely to have been generated while a signal was present.
When the absolute probability that X was produced by the signal equals the proba-
bility that it was produced by only noise, the signal and noise curves intersect. The crite-
rion value Xc chosen by the operator is shown by the vertical line. All X values to the right
(X > Xc) will cause the operator to respond “yes.” All to the left generate “no” responses.
The different shaded areas represent the occurrences of hits, misses, false alarms, and cor-
rect rejections. Since the total area within each curve is one, the two shaded regions within
each curve must sum to one. That is, P(H) + P(M) = 1 and P(FA) + P(CR) = 1.
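If the two curves are taken to be unit-variance normal distributions, with the noise curve centered at 0 and the signal curve centered at d' (the assumption the chapter adopts later for measuring sensitivity), the four probabilities are simply areas on either side of the criterion. A brief sketch of ours, with assumed values:

```python
# Illustrative sketch: the four outcome probabilities as areas under
# unit-variance normal curves (noise mean 0, signal mean d').
from scipy.stats import norm

d_prime = 1.0   # assumed separation of the curves
Xc = 0.5        # assumed criterion placement

p_fa = norm.sf(Xc, loc=0.0)          # noise area right of the criterion
p_cr = norm.cdf(Xc, loc=0.0)         # noise area left of the criterion
p_hit = norm.sf(Xc, loc=d_prime)     # signal area right of the criterion
p_miss = norm.cdf(Xc, loc=d_prime)   # signal area left of the criterion

# The two shaded regions within each curve sum to one:
assert abs(p_hit + p_miss - 1) < 1e-9 and abs(p_fa + p_cr - 1) < 1e-9
```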
Setting the Response Criterion: Optimality in SDT
In any signal detection task, observers may vary in their response bias. For example, they
may be “liberal” or “risky”: prone to saying yes, and therefore detecting most of the signals
that occur but making many false alarms. Alternatively, they may be “conservative”: say-
ing no most of the time and making few false alarms but missing many of the signals.

Figure 2.3 Hypothetical distributions underlying signal detection theory: (a) high sensitivity; (b) low sensitivity.
Sometimes circumstances dictate whether a conservative or a risky strategy is best.
For example, when the radiologist scans the x-ray of a patient who has been referred be-
cause of other symptoms of illness, it is better to be biased to say yes (i.e., “you have a
tumor”) than when examining the x-ray of a healthy patient, for whom there is no rea-
son to suspect any malignancy (Swets & Pickett, 1982). Consider, on the other hand, the
monitor of the power-generating station who has been cautioned repeatedly by the super-
visor not to unnecessarily shut down a turbine, because of the resulting loss of revenue to
the company. The operator will probably become conservative in monitoring the dials
and meters for malfunction and may be prone to miss (or delay responding to) a mal-
function when it does occur.
As can be seen in Figure 2.3, an operator's conservative or risky behavior is de-
termined by placing the decision criterion Xc. If Xc is placed to the right, much ev-
idence is required for it to be exceeded, and most responses will be “no” (conservative
responding). If it is placed to the left, little evidence is required, most responses will
be “yes,” and the strategy is risky. An important variable that is positively correlated
with Xc is beta, which is the ratio of neural activity produced by signal and noise:

beta = P(X|S) / P(X|N)    (2.1)

This is the ratio of the heights of the two curves in Figure 2.3, for a given level of Xc.
Imagine shifting the value of Xc to the right. This will produce a beta value greater than
one. When this occurs, there will be fewer yes responses; therefore, there will be fewer
hits, but also fewer false alarms. Next imagine shifting Xc to the left. Now beta is less
than one, and there will be more yes responses and more hits, but also more false alarms.
Thus both beta and Xc define the response bias or response criterion.
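Under the equal-variance normal assumption, Equation 2.1 can be evaluated at any criterion placement. This sketch is ours; d' = 1 is an assumed value. A criterion at the intersection of the two curves (d'/2) yields beta = 1, and shifts to the right or left produce the conservative and risky values just described.

```python
# Illustrative sketch of Equation 2.1: beta as the ratio of the heights
# of the signal and noise curves at the criterion Xc.
from scipy.stats import norm

def beta_at(Xc, d_prime=1.0):
    return norm.pdf(Xc, loc=d_prime) / norm.pdf(Xc, loc=0.0)

print(beta_at(0.5))    # 1.0: criterion at the intersection of the curves
print(beta_at(1.5))    # > 1: shifted right, conservative responding
print(beta_at(-0.5))   # < 1: shifted left, risky responding
```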
More formally, the actual probability values for each of the cells in Figure 2.1 would
be calculated from obtained data. These data would describe the areas under the two prob-
ability distribution functions of unit area shown in Figure 2.3, to the left and right of the
criterion. Thus, for example, the probability of a hit, with the criterion shown, is the rel-
ative area under the “signal” curve (a signal was presented) to the right of the criterion
(the subject said yes). One can determine by inspection that if the two distributions are of
equal size and shape, then the setting of beta = 1 occurs where the two curves intersect as
shown in Figure 2.3 and will provide data in which P(H) = P(CR) and P(M) = P(FA), that
is, a truly “neutral” criterion setting.
Signal detection theory is able to prescribe exactly where the optimum beta should
fall, given (1) the likelihood of observing a signal and (2) the costs and benefits (payoffs)
of the four possible outcomes (Green & Swets, 1966; Swets & Pickett, 1982). We will first
consider the influence of signal probability, then payoffs, on the optimal setting of beta,
and finally, human performance in setting beta.
Signal Probability First consider the situation in which signals occur just as often
as they do not, and there is neither a different cost to the two bad outcomes nor a
different benefit to the two good outcomes of Figure 2.1. In this case optimal perfor-
mance minimizes the number of errors (misses and false alarms). Optimal performance
will occur when X, is placed at the intersection of the two curves in Figure 2.3: that
is, when beta = 1. Any other placement, in the long run, would reduce the probabil-
ity of being correct.
However, if a signal is more likely, the criterion should be lowered. For example, if
traffic is busy on the freeway, increasing the likelihood of collision with another vehi-
cle, our van driver should be more likely to apply the brakes than if the road ahead were
empty. If the radiologist has other information to suggest that a patient is likely to have
a malignant tumor, or the physician has received the patient on referral, the physician
should be more likely to categorize an abnormality on the x-ray as a tumor than to ig-
nore it as mere noise in the x-ray process. Conversely, if signal probability is reduced,
beta should be adjusted conservatively. For example, suppose an inspector searching for
defects in computer microchips is told that the current batch has a low estimated fault
frequency, because the manufacturing equipment has just received maintenance. In this
case, the inspector should be more conservative in searching for defects. Formally,
this adjustment of the optimal beta in response to changes in signal and noise proba-
bility is represented by the prescription

beta_opt = P(N) / P(S)    (2.2)
This quantity will be reduced (made riskier) as P(S) increases, thereby moving the value
of Xc producing optimal beta to the left of Figure 2.3. If this setting is adhered to, per-
formance will maximize the number of correct responses (hits and correct rejections).
Note that the setting of optimal beta will not produce perfect performance. There will
still be false alarms and misses as long as the two curves overlap. However, optimal beta
is the best that can be expected for a given signal strength and a given level of sensitivity.
The formula for beta (Equation 2.1) and the formula for optimal beta (Equation 2.2)
are sometimes confused. The quantity beta_opt defines where beta should be set, and it
is entirely determined by the ratio of the probabilities with which noise and signals occur
in the environment. In contrast, where beta is actually set is determined by the observer
and must be derived from empirical data.
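The distinction is easy to keep straight in code. In this sketch of ours, optimal beta follows Equation 2.2 from the environmental probabilities alone, while obtained beta is recovered from hypothetical hit and false-alarm rates under the equal-variance normal assumptions.

```python
# Illustrative sketch: optimal beta (Equation 2.2) versus obtained beta.
from scipy.stats import norm

def beta_optimal(p_signal):
    """Equation 2.2: beta_opt = P(N) / P(S)."""
    return (1 - p_signal) / p_signal

def beta_obtained(p_hit, p_fa):
    """Beta implied by an observer's P(H) and P(FA): the ratio of normal
    densities at the implied criterion (equal-variance assumptions)."""
    return norm.pdf(norm.ppf(p_hit)) / norm.pdf(norm.ppf(p_fa))

print(beta_optimal(0.25))          # rare signals: the optimum is 3.0
print(beta_obtained(0.80, 0.15))   # what a hypothetical observer actually did
```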
Payoffs The optimal setting of beta is also influenced by payoffs. In this case, optimal is
no longer defined in terms of minimizing errors but is now maximizing the total expected
financial gains (or minimizing expected losses). If it were important for signals never to
be missed, the operator might be given high rewards for hits and high penalties for misses,
leading to a low setting of beta. This payoff would be in effect for a quality control in-
spector who is admonished by the supervisor that severe costs in company profits (and
the monitor's own paycheck) will result if faulty microchips pass through the inspection
station. The monitor would therefore be more likely to discard good chips (a false alarm) in
order to catch all the faulty ones. Conversely, in different circumstances, if false alarms are
to be avoided, they should be heavily penalized. These costs and benefits can be translated
into a prescription for the optimum setting of beta by expanding Equation 2.2 to

beta_opt = [P(N) / P(S)] × [V(CR) + C(FA)] / [V(H) + C(M)]    (2.3)
where V is the value of desirable events (hit, H, or correct rejection, CR), and C is the
cost of undesirable events (miss, M, or false alarm, FA). An increase in denominator val-
ues will decrease the optimal beta and should lead to risky responding. Conversely, an
increase in numerator values should lead to conservative responding. Notice also that
the value and probability portions of the function combine independently. An event
like the malfunction of a turbine may occur only very rarely, thereby raising the opti-
mal beta; however, the cost of a miss in detecting it might be severe, and thus optimal
beta should be set to a relatively low value.
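A sketch of Equation 2.3 (ours, with hypothetical payoff values) shows how the probability and payoff terms combine independently, using the turbine example just given:

```python
# Illustrative sketch of Equation 2.3: optimal beta with payoffs.
def beta_optimal(p_signal, v_cr=1.0, c_fa=1.0, v_hit=1.0, c_miss=1.0):
    """beta_opt = [P(N)/P(S)] * [(V(CR) + C(FA)) / (V(H) + C(M))]."""
    return ((1 - p_signal) / p_signal) * (v_cr + c_fa) / (v_hit + c_miss)

# A rare event (P(S) = 0.01) pushes the optimum high, but a heavy miss
# cost pulls it back toward risky responding.
print(beta_optimal(p_signal=0.01))               # 99.0 with neutral payoffs
print(beta_optimal(p_signal=0.01, c_miss=50.0))  # about 3.9 when misses are costly
```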
Human Performance in Setting Beta The actual value of beta that an operator uses
can be computed from the number of hits and false alarms obtained from a series of de-
tection trials. Therefore, we may ask how well people set their criteria in response to
changes in payoffs and probabilities, relative to optimal beta. Humans do adjust beta as
dictated by changes in these quantities. However, laboratory experiments have shown
that beta is not adjusted as much as it should be. That is, subjects demonstrate a slug-
gish beta, as shown in Figure 2.4. They are less risky than they should be if the ideal beta
is high, and less conservative than they should be if the ideal beta is low. As shown in
Figure 2.4, the sluggishness is found to be more pronounced when beta is manipulated
by probabilities than by payoffs (Green & Swets, 1966).
"A number of explanations have been proposed to account for why beta is sluggish
in response to probability manipulations. It may be a reflection of the operator's need
to respond “creatively” by introducing the rare response more often than is optimal,
since extreme values of beta dictate long strings of either yes (low beta) or no (high beta)
responses. Another explanation may be that the operator misperceives probabilistic data,
There is evidence that people tend to overestimate the probability of rare events and un-
derestimate that of frequent events (Peterson & Beach, 1967; Sheridan & Ferrell, 1974)
This behavior, to be discussed in more detail in Chapter 8, would produce the observed
shift of beta toward unity.
Figure 2.4 Relationship between obtained and optimal beta, when beta is manipulated by probabilities (x) and by payoffs (o).
The sluggish beta phenomenon can be demonstrated most clearly in the laboratory,
where precise probabilities and values can be specified to the subjects. There is also ev-
idence for sluggish beta in real-world environments. For example, Harris and Chaney
(1969), who examined the performance of inspectors in a Kodak plant, reported that as
the defect rate fell below about 5 percent, inspectors failed to raise beta accordingly.
Bisseret (1981) applied signal detection theory to the air traffic controller's task of judg-
ing whether an impending conflict between two aircraft will (signal) or will not (noise)
require a course correction. He found that controllers were more willing to detect a con-
flict (i.e., more likely to lower beta), and therefore command a correction, as the diffi-
culty of the problem and therefore the uncertainty of the future increased. Bisseret also
compared the performance of experts and trainees and found that experts used lower
beta settings, being more willing to call for a correction. Bisseret suggested that trainees
were more uncertain about how to carry out the correction and therefore more reluc-
tant to take the action, and argued that a portion of training should be devoted to the
issue of criterion placement.
In medicine, Lusted (1976) reported evidence that physicians adjust their response
criterion in diagnosis according to how frequently the disease occurs in the popula-
tion—essentially an estimate of P(signal)—but they adjust less than the optimal amount
specified by the changing probabilities. However, the difficulty of specifying precisely
the costs or benefits of all four of the joint events in medical diagnosis, as in air traffic
control, makes it difficult to determine the exact level of optimal beta (as opposed to
the direction of its change). How, for example, can the physician specify the precise cost
of an undetected malignancy (Lusted, 1976), or the air traffic controller specify the
costs of an undetected conflict that might produce a midair collision? The problems
associated with specifying costs define the limits of applying signal detection theory to
determine optimal response criteria.
Sensitivity
Signal detection theory has made an important conceptual and analytical distinction
between the response bias parameters described above and the measure of the oper
ator's sensitivity, the keenness or resolution of the detection mechanisms. We have
seen that the operator may fail to detect a signal if response bias is conservative
Alternatively, the signal may be missed because the detection process is poor at dis
Criminating signals from noise,
Sensitivity refers to the separation of the noise and signal distributions along the X axis
of Figure 2.3. If the separation is large (Figure 2.3a), sensitivity is high, and a given value
of X is quite likely to be generated by either S or N but not both. If the separation is small
(Figure 2.3b), sensitivity is low. Since the curves represent neural activation, their sep-
aration could be reduced either by physical properties of the signal (e.g., a reduction in
its intensity or salience) or by properties of the observer (e.g., a loss of hearing for an
auditory detection task or a lack of training of a medical student for the task of detect-
ing tumor patterns on an x-ray).
In the theory of signal detection, the sensitivity measure is called d' and corre-
sponds to the separation of the means of the two distributions in Figure 2.3, expressed in
units of their standard deviations. For most situations d' varies between 0.5 and 2.0.
Tables of d' and beta values generated by different hit and false-alarm rates can be found
in Swets (1964).
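Rather than consulting tables, d' can also be computed directly from a hit and false-alarm rate under the equal-variance normal assumptions, as in this sketch of ours (the example rates are hypothetical):

```python
# Illustrative sketch: d' from a single hit and false-alarm rate.
from scipy.stats import norm

def d_prime(p_hit, p_fa):
    """d' = Z(H) - Z(FA), in standard-score units (see Figure 2.6)."""
    return norm.ppf(p_hit) - norm.ppf(p_fa)

print(d_prime(0.80, 0.15))   # about 1.88, within the typical 0.5-2.0 range
```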
Like bias, sensitivity also has an optimal value (which is not perfect). The com-
putation of this optimum is more complex and is based on an ability to characterize
precisely the statistical properties of the physical energy in signal and no-signal tri-
als. Although this can be done in carefully controlled laboratory studies with acoustic
signals and a white-noise background, it is difficult to do in more complex environ-
ments. Nevertheless, data from auditory signal detection investigations suggest that
the major cause of departure from optimal sensitivity is the operator's poor memory for the
precise physical characteristics of the signal. When memory aids are provided to re-
mind the operator of what the signal looks or sounds like, d’ approaches optimal lev-
els. This point will be important when we consider the nature of vigilance tasks later
in the chapter.
THE ROC CURVE
Theoretical Representation
All detection performance that has the same sensitivity is in some sense equivalent, re-
gardless of bias. A graphic method of representation known as the receiver operating
characteristic (ROC) is used to portray this equivalence of sensitivity across changing
levels of bias. The ROC curve is useful for obtaining an understanding of the joint ef-
fects of sensitivity and response bias on the data from a signal detection analysis. In this
section we will describe the ROC curve and note its relation to the 2 × 2 matrix of Fig-
ure 2.1 and the theoretical signal and noise curves of Figure 2.3.
Figure 2.1 presents the raw data that might be obtained from a signal detection the-
ory (SDT) experiment. Of the four values, only two are critical. These are normally P(H)
and P(FA), since P(M) and P(CR) are then completely specified as 1 − P(H) and 1 −
P(FA), respectively. Figure 2.3 shows the theoretical representation of the neural mech-
anism within the brain that generated the matrix of Figure 2.1. As the criterion is set at
different locations along the X axis of Figure 2.3, a different set of values will be gener-
ated in the matrix of Figure 2.1. Figure 2.5 also shows the ROC curve, which plots P(H)
against P(FA) for different settings of the response criterion.
Each signal detection condition generates one point on the ROC. If the signal
strength and the observer’s sensitivity remain constant, changing beta from one condi-
tion to another (either through changing payoffs or varying signal probability) will pro-
duce a curved set of points. Points in the lower left of Figure 2.5 represent conservative
responding; points in the upper right represent risky responding. When connected, these
points make the ROC curve. Figure 2.5 shows the relationship among the raw data, the
ROC curve, and the theoretical distributions, collected for three different beta values.
One can see that sweeping the criterion placement from left (low beta or “risky” respond-
ing) to right (high beta or “conservative” responding) across the theoretical distributions
produces progressively more “no” responses and moves points on the ROC curve from
upper right to lower left.

Figure 2.5 The ROC curve. The figure shows how three points on an ROC curve of high sensitivity relate to the raw data and underlying signal and noise curves as beta is shifted. At the right, the figure also shows one point of lower sensitivity.
It can be time-consuming to carry out the same signal detection experiment sev-
eral times, each time changing only the response criterion by a different payoff or sig-
nal probability. A more efficient means of collecting data from several criterion sets is
to have the subject provide a rating of confidence that a signal was present (Green &
Swets, 1966). If three confidence levels are employed (e.g., “1” = confident that no signal
was present, “2” = uncertain, and “3” = confident that a signal was present), the data
may be analyzed twice in different ways, as shown in Table 2.1. During the first analy-
sis, levels 1 and 2 would be considered a “no” response and level 3 a “yes” response. This
classification corresponds to a conservative beta setting, since roughly two-thirds of the
responses would be called “no.” In the second analysis, level 1 would be considered a
“no” response, and levels 2 and 3 would be considered a “yes” response. This classifica-
tion corresponds to a risky beta setting. Thus, two beta settings are available from only
one set of detection trials. Economy of data collection is realized because the subject is
asked to convey more information on each trial.
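The collapsing procedure is mechanical, as this sketch of ours shows using the counts of Table 2.1 (presented next): each confidence cutoff defines one “yes” criterion and therefore one (P(FA), P(H)) point.

```python
# Illustrative sketch: collapsing a three-level confidence scale into two
# ROC points. Counts are those of Table 2.1 (8 noise and 8 signal trials).
noise_counts = {1: 4, 2: 3, 3: 1}     # rating given on each noise trial
signal_counts = {1: 2, 2: 2, 3: 4}    # rating given on each signal trial

def roc_point(cutoff):
    """Treat ratings >= cutoff as "yes"; return (P(FA), P(H))."""
    p_fa = sum(n for r, n in noise_counts.items() if r >= cutoff) / 8
    p_hit = sum(n for r, n in signal_counts.items() if r >= cutoff) / 8
    return p_fa, p_hit

print(roc_point(3))   # conservative criterion: (0.125, 0.5)
print(roc_point(2))   # risky criterion: (0.5, 0.75)
```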
TABLE 2.1 Analysis of Confidence Ratings in Signal Detection Tasks

                            Stimulus Presented     How Responses Are Judged
Subject's Response          Noise     Signal       Conservative     Risky
                                                   Criterion        Criterion
1 = “No signal”             4         2            No               No
2 = “Uncertain”             3         2            No               Yes
3 = “Signal”                1         4            Yes              Yes
Total no. of trials         8         8
                                                   P(FA) = 1/8      P(FA) = 4/8
                                                   P(HIT) = 4/8     P(HIT) = 6/8

Note: The table shows how data with three levels of confidence can be collapsed to derive two points on the ROC curve. Entries within the table indicate the number of times the subject gave the response on the left to the stimulus (signal or noise) presented.

Formally, the value of beta (the ratio of curve heights in Figure 2.3) at any given point
along the ROC curve is equal to the slope of a tangent drawn to the curve at that point. This
slope (and therefore beta) will be equal to 1 at points that fall along the negative diagonal
(shown by the dotted line in Figure 2.5). If the hit and false-alarm values of these points are
determined, we will find that P(H) = 1 − P(FA), as can be seen for the two points on
the negative diagonal of Figure 2.5. Performance here is equivalent to performance
at the point of intersection of the two distributions in Figure 2.3. Note also that
points on the positive diagonal of Figure 2.5, running from lower left to upper right,
represent chance performance: No matter how the criterion is set, P(H) always equals
P(FA), and the signal cannot be discriminated at all from the noise. A representation of
Figure 2.3 that gives rise to such chance performance would be one in which the signal
and noise distributions were perfectly superimposed. Finally, points in the lower right
region of the ROC space represent worse than chance performance. Here, the subject
says “signal” when no signal is perceived and vice versa, implying that the subject is mis-
interpreting the task.
We will now discuss how the ROC curve represents sensitivity. Figure 2.5 shows that the
ROC curve for a more sensitive observer is more bowed, being located closer to the upper
left. Note that the ROC space in Figure 2.5 is plotted on a linear probability scale and there-
fore shows a typically bowed curve. An alternative way of plotting the curve is to use z-scores
(Figure 2.6). Constant units of distance along each axis represent constant numbers of stan-
dard scores of the normal distribution. This representation has the advantage that the bowed
lines of Figure 2.5 now become straight lines parallel to the chance diagonal. For a given
point, d' is then equal to Z(H) − Z(FA), reflecting the number of standardized scores that
the point lies to the upper left of the chance diagonal. A measure of response bias that cor-
relates very closely with beta, and is easy to derive from Figure 2.6, is simply the z-score of
the false-alarm probability for a particular point (Swets & Pickett, 1982).

Figure 2.6 The ROC curve on probability paper.
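A short sketch of ours makes the geometry concrete: for a fixed d' under the equal-variance normal model, sweeping the criterion changes P(H) and P(FA) together, but Z(H) − Z(FA) stays constant, which is why equal-sensitivity points fall on a straight line of unit slope in Figure 2.6.

```python
# Illustrative sketch: constant d' across criterion settings in z-space.
from scipy.stats import norm

d_prime = 1.5
for Xc in (-0.5, 0.25, 1.0, 1.75):
    p_hit = norm.sf(Xc - d_prime)   # signal area right of the criterion
    p_fa = norm.sf(Xc)              # noise area right of the criterion
    print(f"Xc={Xc:+.2f}  P(H)={p_hit:.2f}  P(FA)={p_fa:.2f}  "
          f"Z(H)-Z(FA)={norm.ppf(p_hit) - norm.ppf(p_fa):.2f}")   # always 1.50
```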
Empirical Data
It is important to realize the distinction between the theoretical, idealized curves in
Figures 2.3, 2.5, and 2.6 and the actual empirical data collected in a signal detection
experiment or a field investigation of detection performance. The most obvious contrast
is that the representations in Figures 2.5 and 2.6 are continuous, smooth curves, whereas
empirical data would consist of a set of discrete points. More important, empirical re-
sults in which data are collected from a subject as the criterion is varied often provide
points that do not fall precisely along a line of constant bowedness (Figure 2.5) or a 45-
degree slope (Figure 2.6). Often the slope is slightly shallower. This situation arises be-
cause the distributions of noise and signal-plus-noise energy are not in fact precisely
normal and of equal variance, as the idealized curves of Figure 2.3 portray, particularly
if there is variability in the signal itself. This tilting of the ROC curve away from the ideal
presents some difficulties for the use of d' as a measure of sensitivity. If d' is measured as
the distance of the ROC curve of Figure 2.6 from the chance axis, and this distance varies
as a function of the criterion setting, what is the appropriate setting for measuring d’?
One approach is to measure the distance at unit beta arbitrarily (i.e., where the ROC
curve intersects the negative diagonal). This measure is referred to as d_e and may be em-
ployed if data at two or more different beta settings are available so that a straight-line
ROC can be constructed on the probability plot of Figure 2.6 (Green & Swets, 1966).
Although it is therefore desirable to generate two or more points on the ROC curve,
there are some circumstances in which it may be impossible to do so, particularly when
evaluating detection data in many real-world contexts. In such cases, the experimenter often
cannot manipulate beta or use rating scales and must use the data available from only a sin-
gle stimulus-response matrix. This does not always present a problem. Collection of a full
set of ROC data may not be necessary if bias is minimal (Macmillan & Creelman, 1991).
Nonetheless, if there are only one or two points in the ROC space and there is evidence for
strong risky or conservative bias, another measure of sensitivity should be used.
Under these circumstances, the measure P(A), or the area under the ROC curve, is an
alternative measure of sensitivity (Calderia, 1980; Craig, 1979; Green & Swets, 1966). The
measure represents the area to the right and below the line segments connecting the lower
left and upper right corners of the ROC space to the measured data point (Figure 2.7).
Craig (1979) and Calderia (1980) have argued that the advantage of this measure is that
it is “parameter free.” That is, its value does not depend on any assumptions concerning
the shape or form of the underlying signal and noise distributions. For this reason, it
is a measure that may be usefully employed even if two or more points in the ROC30 Chapter 2_Signal Detection, Information Theory and Absolute Judgement
space are available but do not fall along a 45-degree line. (This suggests that the data do
not meet the equal variance assumptions.) The measure P(A) may be calculated from
the formula

P(A) = [P(H) + (1 − P(FA))] / 2    (2.4)
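In code, Equation 2.4 is a single line; the example rates in this sketch of ours are hypothetical:

```python
# Illustrative sketch of Equation 2.4: area under the one-point ROC.
def p_a(p_hit, p_fa):
    return (p_hit + (1 - p_fa)) / 2

print(p_a(0.80, 0.15))   # 0.825; 0.5 is chance, 1.0 is perfect
```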
Alternative measures of bias also exist. For example, the measure C locates the
criterion relative to the intersection of the two distributions. The intersection point
is the zero point, and the distance of the criterion from this point is measured in z-units.
Thus, C = −0.5[Z(H) + Z(FA)]. Conservative biases produce positive C values; risky biases
produce negative values. Recent summaries of bias measures suggest that C is a better
measure of bias than beta, because it is less sensitive to changes in d' (See, Warm, Dem-
ber, & Howe, 1997; Snodgrass & Corwin, 1988). Nonparametric measures of bias are
also available and are described in See et al. (1997).
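A sketch of the measure C (ours, following the Snodgrass and Corwin convention in which positive values indicate conservative bias):

```python
# Illustrative sketch: the bias measure C, the distance of the criterion
# from the intersection of the two distributions, in z-units.
from scipy.stats import norm

def c_bias(p_hit, p_fa):
    return -0.5 * (norm.ppf(p_hit) + norm.ppf(p_fa))

print(c_bias(0.80, 0.15))   # about +0.10: slightly conservative
print(c_bias(0.95, 0.40))   # about -0.70: risky
```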
The reader is referred to Calderia (1980), Green and Swets (1966), Macmillan and
Creelman (1991), and Snodgrass and Corwin (1988) for further discussion of the rela-
tive merits of different sensitivity and bias measures.
APPLICATIONS OF SIGNAL DETECTION THEORY
Signal detection theory has had a large impact on experimental psychology, and its con-
cepts are highly applicable to many problems of human factors as well. It has two gen-
eral benefits: (1) It provides the ability to compare sensitivity and therefore the quality
of performance between conditions or between operators that may differ in response
bias. (2) By partitioning performance into bias and sensitivity components, it provides
a diagnostic tool that recommends different corrective actions, depending on whether
change in performance results from a loss of sensitivity or a shift in response bias.
The implications of the first benefit are clear. Suppose the performance of two op-
erators or the hit rate obtained from two different pieces of inspection equipment are
compared. If A has a higher hit rate but also a higher false-alarm rate than B, which is
superior? Unless the explicit mechanism for separating sensitivity from bias is available,
this comparison is impossible. Signal detection theory provides the mechanism.

Figure 2.7 Example of the sensitivity measure P(A), the area under the ROC curve, derived from one point.
The importance of the second benefit—the diagnostic value of signal detection
theory—will be evident as we consider some actual examples of applications of signal
detection theory to real-world tasks. In the many possible environments where the op-
erator must detect an event and does so imperfectly, the existence of these errors pre-
sents a challenge for the engineering psychologist: Why do they occur, and what
corrective actions can prevent them? Three areas of application (medical diagnosis,
eyewitness testimony, and industrial inspection) will be considered, leading to a more
extensive discussion of vigilance.
Medical Diagnosis
Medical diagnosis is a fruitful realm for the application of signal detection theory
(Lusted, 1971, 1976; Parasuraman, 1985). Abnormalities (diseases, tumors) are either
present in the patient or they are not, and the physician must decide “present” or “ab-
sent.” Sensitivity is related to factors such as the salience of the abnormality or the num-
ber of converging symptoms, as well as the training of the physician to focus on relevant
cues. Response bias, meanwhile, can be influenced by both signal probability and pay-
offs. In the former category, influences include the disease prevalence rate and whether
the patient is examined in initial screening (probability of disease low, beta high) or re-
ferral (probability higher, beta lower). Lusted (1976) has argued that physicians' deci-
sions generally tend to be less responsive to variation in the disease prevalence rate than
optimal. Parasuraman (1985) found that radiology residents were not responsive
enough to differences between screening and referral in changing beta. Both results
illustrate the sluggish beta phenomenon.
Although payoffs may influence decisions, it is difficult to quantify the consequences of
hits (e.g., a detected malignancy leads to its surgical removal with associated hospital
costs and possible consequences), false alarms (an unnecessary operation), and misses.
Placing values on these events based on financial costs of surgery, malpractice suits, and
the intangible costs of human life and suffering is clearly difficult. Yet there is little doubt
that they influence the physician's detection rate. Lusted (1976) and Swets and Pickett
(1982) have shown how diagnostic performance can be quantified with an ROC curve.
In a thorough treatment, Swets and Pickett describe the appropriate methodology for
using signal detection theory to examine performance in medical diagnosis.
Several investigations have examined the more restricted domain of tumor diagno-
sis by radiologists. Rhea, Potsaid, and DeLuca (1979) have estimated the rate of omis-
sion in the detection of abnormalities to run between 20 and 40 percent. In comparing
the detection performance of staff radiologists and residents, Parasuraman (1985) found
differences in sensitivity (favoring the radiologists) and bias (radiologists showing a more
conservative criterion).
Swensson, Hessel, and Herman (1977) examined the effect of directing the radiol-
ogist's attention to a particular area of an x-ray plate where an abnormality was likely
to occur. They found that this increased the likelihood of the tumor's detection, but did
so by reducing beta rather than increasing sensitivity. In related work, Swensson (1980)
conducted an ROC analysis of the performance of radiologists searching for tumors
in chest x-rays. Swensson found the counterintuitive result that examining the entire
radiograph produced greater sensitivity than a control condition in which only one
part of the radiograph was examined. Swensson proposed a model having two de-
tection components. The first identifies likely candidate locations, and the second
identifies tumors in those locations. By using different sensitivities in the two stages,
Swensson's model accounted for the obtained results.
A signal-detection approach has been used to distinguish between the relative merits
of two different imaging techniques for examining brain lesions: computerized tomogra-
phy (CT) and radionuclide (RN) scans (Swets et al., 1979). A confidence-level procedure
was used such that the radiologists made judgments on a 5-point scale. Hit and false-alarm
rates were computed from these data using the procedure described above. Plots of the
ROC space showed that higher levels of sensitivity were evident with the CT method rel-
ative to the RN method, with little difference in terms of bias.
Schwartz, Dans, and Kinosian (1988) performed an ROC analysis on data from a
study by Nishanian et al. (1987; see Swets, 1992). Nishanian et al. compared the accu-
racy of three diagnostic tests for HIV. The analysis showed that two of the tests pro-
duced nearly the same sensitivity value, higher than that produced by the third test.
However, the two tests having greater sensitivity differed in terms of bias; one test was
much more likely to make a false positive decision (a false alarm—that is, diagnosing
someone with HIV when they did not have it) for a slight increase in the number of
hits. Signal detection analysis of this type can be very important in helping physicians
determine which diagnostic test to use in a given situation. For example, it may be ap-
propriate to use the more liberal test (and run the increased risk of a false alarm) when
other factors suggest that the patient is more likely to be HIV positive.
Recognition Memory and Eyewitness Testimony
The domain of recognition memory represents a somewhat different application of sig-
nal detection theory. Here the observer is not assessing whether or not a physical signal
is present but rather decides whether or not a physical stimulus (the person or name to
be recognized) was seen or heard at an earlier time.
One important application of signal detection theory to memory is found in the
study of eyewitness testimony (e.g., Ellison & Buckhout, 1981; Wells, 1993; Wright
& Davies, 1999). The witness to a crime may be asked to recognize or identify a suspect
as the perpetrator. The four kinds of joint events in Figure 2.1 can readily be specified.
The suspect examined by the witness either is (signal) or is not (noise) the same indi-
vidual actually perceived at the scene of the crime. The witness in turn can either say
“That's the one” (Y) or “No, it’s not” (N).
In this case, the joint interests of criminal justice and protection of society are served
by maintaining a high level of sensitivity while keeping beta neither too high (many
misses, with criminals more likely to go free) nor too low (a high rate of false alarms, with
an increased likelihood that innocent individuals will be prosecuted). Signal detection
theory has been most directly applied to a witness's identification of suspects in police
lineups (e.g., Ellison & Buckhout, 1981). In this case, the witness is shown a lineup of
six or so individuals, one of whom is the suspect detained by the police, and the oth-
ers are “foils.” Hence, the lineup decision may be considered a two-stage process: Is the
suspect in the lineup, and if so, which one is it?
Ellison and Buckhout (1981) have expressed concern that witnesses generally
have a low response criterion in lineup identifications and will therefore often say yes
to the first question. This bias would present no difficulty if their recognition mem-
ory was also accurate, enabling them to identify the suspect accurately. However, con-
siderable research on staged crimes shows that visual recognition of brief events is
notoriously poor (e.g., Loftus, 1979). Poor recognition memory coupled with the
risky response bias allows those conducting the lineup to use techniques that will cap-
italize on witness bias to ensure a positive identification. These techniques include
ensuring that the suspect is differently dressed, is seen in handcuffs by the witness be-
fore the lineup, or is quite different in appearance from the foils. In short, techniques
are used that would lead even a person who had not seen the crime to select the sus-
pect from the foils or would lead the witness to make a positive identification from
a lineup that did not contain the suspect (Ellison & Buckhout, 1981; Wells & Bradfield,
1998). This process is not testing the sensitivity of recognition memory but is em-
phasizing response bias.
How could the bias in the lineup process be reduced? Ellison and Buckhout (1981)
suggest a simple procedure: inform the witness that the suspect may not be in the lineup.
As does reducing the probability of a signal, this procedure will drive beta upward to-
ward a more optimal setting. Ellison and Buckhout further argue that people in the
lineup should be equally similar to one another (such that a nonwitness would have an
equal chance of picking any of them). Although greater similarity will reduce the hit
rate slightly, it will reduce the false-alarm rate considerably more. The result will be a
net increase in sensitivity.
Wells (1993) offers two approaches to improving the eyewitness sensitivity in the
lineup situation. The first, called a blank lineup control, requires the witness to view
two lineups, the first being a blank lineup containing no suspect, and the second being
a lineup with the suspect. (The witness does not realize there is a second lineup while
observing the first.) This is similar to the signal-absent trial in the signal detection sit-
uation. Wells (1984) found that those witnesses who do not make an identification with
the blank lineup are more likely to make an accurate identification in the real lineup
than witnesses who go directly to the real lineup. The second approach noted by Wells
(1993) is called a mock witness control. Here, a lineup is shown to a set of people who
were not witnesses, and who are given only limited information about the crime. If the
mock witnesses identify the suspect at a rate greater than chance, then it suggests that
something in the limited information is leading them to choose the suspect. Both ap-
proaches are akin to adding a control group to an experiment, which allows for com-
parison with the real situation (the experimental group).
Gonzales, Ellsworth, and Pembroke (1993) contrasted recognition memory perfor-
mance in the lineup situation with recognition memory in the showup. A showup occurs
when the witness is shown one suspect and is asked whether the suspect is the person who
committed the crime. It is widely believed that the showup is biased toward saying “yes”
(and thereby increasing false alarms) relative to the lineup. However, Gonzales et al. found34 Chapter 2_ Signal Detection, Information Theory and Absolute Judgement
that witnesses were more likely to say “yes” with a lineup than a showup, even when the
suspect was absent, demonstrating a risky bias for the lineup. Gonzales et al. speculated
that the initial process of finding a best match within the lineup may bias the witness to
make an identification. Therefore, the result of the initial decision (the suspect is in the
lineup) serves to lower the criterion for subsequent decisions (which one is the suspect).
This would serve to increase both hits and false alarms, as observed.
A further danger is that eyewitnesses tend to become more certain of their judgment
after being told they selected the suspect (Wells & Bradfield, 1998). In the United States,
there is nothing to stop an investigator from telling an eyewitness that they selected the
suspect from a lineup after the choice has been made. In the Wells and Bradfield study,
subjects were shown a security video and then later asked to identify the gunman from a
set of photographs. All subjects made identifications, although the actual gunman was not
in the set. Following the identification, subjects were told whether or not they had iden-
tified the suspect. If they were told that they had identified the suspect, they were later
more certain that they had identified the gunman. However, since the gunman was not in
the set, these were all false identifications. In signal detection terms, this is an increase in
the false-alarm rate resulting from a reduction in beta. The problem is that eyewitnesses
appear at trial convinced they have identified the criminal, and juries are in turn more eas-
ily convinced by the eyewitness’s testimony (Wells & Bradfield, 1998). For these reasons,
researchers have recommended that investigators should not reveal any information about
the outcome of the identification until a clear statement of confidence has been obtained
from the eyewitness (Wells & Seelau, 1995).
Industrial Inspection
In many industries, there is a need to check on the quality of manufactured products
or product parts. An industrial inspector might check on the quality of welds in an au-
tomobile factory, for example. Since the inspector may not know the location of the
flaw, industrial inspection often involves a visual search (to be discussed further in
Chapter 3). For example, Swets (1992) examined technicians inspecting metal fatigue
in airplanes. A large number of metal specimens, some with and some without flaws,
were inspected using ultrasound and eddy-current methods. The performance of each
technician was specified by a hit rate and false-alarm rate and plotted as a point in an
ROC space. The ROC plot for the inspectors using the ultrasound showed the points
randomly distributed above the positive diagonal. In contrast, the points in the eddy-
current ROC space were more tightly clustered in the extreme upper left corner of the
space. Thus, inspectors were better able to detect a flaw using the eddy-current method
than the ultrasound method (greater sensitivity) and there was less variability in the
criterion setting. Despite the large variability among observers, the general advantage
of the eddy-current method was evident.
VIGILANCE
In the vigilance task an operator is required to detect signals over a long period of time
(referred to as the watch), and the signals are intermittent, unpredictable, and infrequent.
Examples include the radar monitor, who must observe infrequent contacts; the airport
security inspector, who examines x-rayed carry-on luggage; the supervisory monitor of
complex systems, who must detect the infrequent malfunctions of system components;
and the quality control inspector, who examines a stream of products (sheet metal, cir-
cuit boards, microchips, fruit) to detect and remove defective or flawed items.
Two general conclusions emerge from the analysis of operators' performance in the
vigilance situation. First, the steady-state level of vigilance performance is known as the
vigilance level, and operators often show lower vigilance levels than desirable. Second,
the vigilance level sometimes declines steeply during the first half hour or so of the
watch. This phenomenon was initially noted in radar monitors during World War II (N.
H. Mackworth, 1948), has been experimentally replicated numerous times, and has been
observed in industrial inspectors (Harris & Chaney, 1969; Parasuraman, 1986). This de-
crease in vigilance level over time is known as the vigilance decrement.
Vigilance Paradigms
It is important to distinguish two classes of vigilance situations, or paradigms. The free-
response paradigm is one in which a target event occurs at any time and nonevents are
not defined. This is analogous to the task confronting the power plant monitor. In con-
trast, with the inspection paradigm, events occur at fairly regular intervals. A few of these
events are targets (defects), but most are nontargets (normal items). This is the task faced
by a circuit board inspector, for example. In the free-response paradigm, event rate is de-
fined by the number of targets per unit time. In the inspection paradigm, event rate is
ambiguous because it may be defined either by the number of targets per unit time or
(incorrectly) by the ratio of targets to total events (targets and nontargets). The latter
measure (which we shall refer to as target probability) will stay constant even if the num-
ber of targets per unit time is increased—the result of speeding up a conveyor belt, for
example. In typical industrial inspection tasks, event rate may be fairly high, but in other
tasks, such as that of the airport security inspector, it will be much lower.
It is also important to distinguish between a successive and a simultaneous vigilance
paradigm. In a successive paradigm or task, observers must remember the target stim-
ulus and compare successively presented stimulus configurations against the remem-
bered representation. For example, an inspector might be asked to detect if the color of
a garment is darker than usual. In a simultaneous paradigm, all the information needed
to make the discrimination is present for each event. For example, each garment could
be compared to a standard piece of fabric.
Finally, we should distinguish between sensory and cognitive paradigms (See, Howe,
Warm, and Dember, 1995). In a sensory task, signals represent changes in auditory
or visual intensity. In a cognitive task, like proofreading a final manuscript, symbolic or
alphanumeric stimuli are used. Since cognitive stimuli such as letters or numbers are
often familiar, some researchers claim this distinction is really between familiar and un-
familiar stimuli (e.g., Koelega, Brinkman, Hendriks, & Verbaten, 1989). Nonetheless,
there appear to be important differences in the results obtained in the different para-
digms (See et al., 1995), as described below.
Measuring Vigilance Performance
A large number of investigations of factors affecting the vigilance level and the vigilance
decrement have been conducted over the last five decades with a myriad of experimental
variables in various paradigms. An exhaustive listing of all of the experimental results
of vigilance studies is beyond the scope of this chapter. Readers interested in more ex-
tensive treatments are referred to the following sources: Davies and Parasuraman (1982);
Parasuraman (1986); See et al. (1995); Warm (1984); Warm and Dember (1998). Before
summarizing the most important results, we will discuss how vigilance performance
should be measured.
Specifying vigilance performance in terms of the number of targets detected is analogous
to gauging performance in a signal detection task using hit rate [P(H)]. Specifying the vig-
ilance decrement this way is misleading because P(H) could decline through an increase
in beta with no decline in sensitivity (i.e., P(FA) is also decreasing). Hence, vigilance per-
formance is better understood by applying a signal detection approach.
Indeed, it has been shown repeatedly that the vigilance decrement can arise either
as a result of a decrease in sensitivity (e.g., J. F. Mackworth & Taylor, 1963) or as a shift
to a more conservative criterion (e.g., Broadbent & Gregory, 1965), depending on the task
and experimental situation. Therefore, rather than trying to account for factors affecting
the vigilance decrement, it is more informative to describe those factors that affect the
sensitivity decrements or beta increments that underlie the vigilance decrement.
In inspection tasks, when nontarget events are clearly defined, a signal detection
analysis is straightforward, since the false-alarm rate may be easily computed. In the
free-response paradigm, however, when nontarget events are not well defined, further
assumptions must be made to compute a false-alarm rate. In the laboratory, this is typ-
ically accomplished by defining an appropriate response interval after each signal within
which a subject’s response will be designated a hit. The remaining time during a watch
is partitioned into a number of false-alarm intervals, equal in duration to the response
intervals. P(FA) is simply the number of false alarms divided by the number of false-
alarm intervals (Parasuraman, 1986; Watson & Nichols, 1976). Although beyond the
scope of this treatment, there are caveats that should be considered when applying sig-
nal detection theory to vigilance phenomena, particularly when the false-alarm rates
are low (Craig, 1977; Long & Waag, 1981).
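The interval-scoring convention can be sketched as follows (our illustration; all counts and durations are hypothetical):

```python
# Illustrative sketch: P(FA) for the free-response vigilance paradigm.
watch_minutes = 60.0
response_window_s = 10.0      # window after each signal scored as a hit
n_signals = 20                # signals presented during the watch
n_false_alarms = 6            # "yes" responses outside any signal window

n_intervals = watch_minutes * 60.0 / response_window_s   # 360 intervals
n_fa_intervals = n_intervals - n_signals                 # 340 noise intervals
p_fa = n_false_alarms / n_fa_intervals
print(f"P(FA) = {p_fa:.3f}")                             # about 0.018
```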
Factors Affecting Sensitivity Level and Sensitivity Decrement
The following factors affect sensitivity in a vigilance task:
1. Sensitivity decreases, and the sensitivity decrement increases, as a target’s sig-
nal strength is reduced, which occurs when the intensity or duration of a target
is reduced or otherwise made more similar to nontarget events (J. F. Mackworth
& Taylor, 1963; ‘Teichner, 1974).
2. Sensitivity decreases when there is uncertainty about the time or location at
which the target signal will appear. This uncertainty is particularly great if there
are long intervals between signals (J. F. Mackworth & Taylor, 1963; Milosevic,
1974; Warm, Dember, Murphy, & Dittmar, 1992).
3. For inspection tasks, which have defined nontarget events, the sensitivity level
decreases and the decrement increases when the event rate is increased (Badde-
ley & Colquhoun, 1969; See et al., 1995). Event rate is defined as the number of
events per unit time; an example of increasing the event rate would be speeding
up the conveyer belt in an inspection situation. Note that this keeps the ratio
of targets to nontargets constant, and therefore event rate should not be con-
fused with target probability, which affects bias (see below).
4. The sensitivity level is higher for simultaneous tasks than for successive tasks
(Parasuraman, 1979). A sensitivity decrement occurs for successive tasks at high
event rates but does not occur at low event rates, or for simultaneous tasks at
either rate (Parasuraman, 1979).
5. The sensitivity decrement is eliminated when observers are highly practiced so that
the task becomes automatic, rather than controlled (Fisk & Schneider, 1981; Fisk &
Scerbo, 1987; see Chapters 6, 7, and 11 for detailed coverage of automaticity).
6. A sensitivity increment (i.e., improvement with time on watch) sometimes oc-
curs in simultaneous paradigms with cognitive (familiar) but not sensory
stimuli (See et al., 1995).
Factors Affecting Response Bias Level and Bias Increment
Changes in bias also occur, and the more salient results are as follows:
1. Target probability affects response bias, with higher probabilities decreasing
beta (more hits and false alarms) and lower probabilities increasing it (more
misses and correct rejections) (Loeb & Binford, 1968; See et al., 1997; Williges,
1971), although sluggish beta is evident (Baddeley & Colquhoun, 1969). Note
that a decrease in target probability can occur if nontarget events are more
densely spaced between targets. Such changes in target probability are some-
times incorrectly referred to as event rate.
2. Payoffs affect response bias as in the signal detection task (e.g., Davenport, 1968;
See et al., 1997), although the effect of payoffs is less consistent and less effec-
tive than manipulating probability (see Davies & Parasuraman, 1982). This
stands in contrast to the relative effects of manipulating probability and pay-
offs in the signal detection task where signals are more common.
3. Increased beta values are evident when signal strength is reduced (Broadbent, 1971).
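The probability and payoff effects in this list can be compared with the optimal criterion that signal detection theory prescribes: a beta equal to the prior odds of noise to signal, weighted by the values and costs of the four outcomes. The sketch below uses that standard prescription; the function name and the payoff values are our own illustrative assumptions.

    # beta_opt = [P(N)/P(S)] * [(V_CR + C_FA) / (V_H + C_M)], the standard
    # signal detection prescription; payoff values below are illustrative.
    def beta_optimal(p_signal, v_cr=1.0, c_fa=1.0, v_hit=1.0, c_miss=1.0):
        prior_odds = (1.0 - p_signal) / p_signal
        payoff_ratio = (v_cr + c_fa) / (v_hit + c_miss)
        return prior_odds * payoff_ratio

    # With neutral payoffs, lowering target probability raises the optimal beta:
    for p in (0.50, 0.10, 0.02):         # 0.02 approximates a vigilance watch
        print(p, beta_optimal(p))        # -> 1.0, 9.0, 49.0

Because of sluggish beta, real observers shift their criterion far less than these optimal values would dictate, particularly at the very low signal probabilities typical of vigilance.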
Theories of Vigilance
We present three theories used to explain vigilance performance and the influence of
factors such as type of display, task type, or environmental stressors, and we will show
how they suggest corrective improvements. The advantage of such theories is that they
provide parsimonious ways to account for vigilance performance and thereby suggest
techniques to improve it. The first theory accounts for sensitivity loss; the others ac-
count for criterion shifts. It may be possible to integrate the theories, as described below.
Sensitivity Loss: Fatigue and Sustained Demand Theory In the earliest laboratory
studies investigating vigilance (N. H. Mackworth, 1948), the subject monitored a clock
hand that ticked at periodic intervals (nontarget events). Occasionally the hand under-
went a "double tick" (target event), moving twice the angle of the nontarget events. In this
paradigm, and many others that have employed visual signals, a sensitivity decrement
usually occurs. To account for the decrement, Broadbent (1971) argued that the sustained
attention necessary to fixate the clock hand or other visual signals continuously extracts
a toll in fatigue. Indeed, sometimes the vigilance task is referred to as a sustained atten-
tion task (Parasuraman, 1979). Because of the resulting fatigue, the subject looks away or
blinks more often as the watch progresses, and therefore signals are missed.
More recently, investigators have concluded that a vigilance task imposing a sus-
tained load on working memory (e.g., having to recall what the target signal looks or
sounds like, as in a successive task) will demand the continuous supply of processing re-
sources (Deaton & Parasuraman, 1988; Parasuraman, 1979). Indeed, ratings of mental
workload (see Chapter 9) show that the workload of vigilance tasks is generally high
(Warm, Dember, & Hancock, 1996). This mental demand may be as fatiguing as the sus-
tained demand to keep one's eyes open and fixated, and here too the eventual toll of fa-
tigue will lead to a loss in sensitivity. A further implication of the resource-demanding
nature of vigilance tasks is their susceptibility to interference from concurrent tasks (to
be discussed in Chapter 11).
One would expect, therefore, that situations demanding greater processing resources
(e.g., when the target is difficult to detect, when there is uncertainty about where or
when the target will occur, when the event rate is fast, when the observer has to re-
member what the target looks or sounds like, when the target is not familiar) should
produce greater fatigue, leading to a lower vigilance level. For example, there is evidence
that successive tasks are more strongly resource-limited than simultaneous tasks
(Matthews, Davies, & Holley, 1993). One would also expect that the sustained demand
of the task over time will be greater in these situations, leading to greater sensitivity
decrements. Therefore, sustained demand theory proposes that sustained demand over
time leads to the sensitivity decrement, and that factors demanding greater mental re-
sources will lower sensitivity levels (Parasuraman, 1979; Matthews, Davies, & Holley,
1993). The finding that the sensitivity level is higher, and the sensitivity decrement elim-
inated, when observers detect the target automatically with little effort is also consistent
with sustained demand theory, since a characteristic of automatic processing is that it
produces little resource demand (Schneider & Shiffrin, 1977).
Criterion Shifts: Expectancy Theory In many vigilance situations, the vigilance
decrement is due not so much to a sensitivity decrement, but rather a bias increment.
The sustained-demand theory described above cannot account for such increases, since
sustained demand is postulated to decrease d’, not increase beta. On the other hand,
the expectancy theory proposed by Baker (1961) attributes the vigilance decrement to
an upward adjustment of the response criterion in response to a reduction in the per-
ceived frequency (and therefore expectancy) of target events. Assume that the subject
sets beta on the basis of a subjective perception of signal frequency, Ps(S). If a signal is
missed for any reason, the subjective probability Ps(S) is reduced because the subject be-
lieves that one less signal occurred. This reduction in turn causes an upward adjust-
ment of beta, which further increases the likelihood of a miss, and so on, in a “vicious
circle” (Broadbent, 1971). Although this behavior could lead to an infinite beta and
a negligible hit rate; in practice, other factors will operate to level off the criterion at a
stable but higher value.
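The vicious circle lends itself to a toy simulation in which the criterion is set from the subjective probability Ps(S), a higher criterion lowers the hit rate, and missed signals in turn lower Ps(S). The link between beta and hit rate below is an arbitrary monotone function chosen only for illustration, not a fitted model, and the floor value stands in for the stabilizing "other factors."

    # Toy simulation of the expectancy "vicious circle" (illustrative only).
    def p_hit_given_beta(beta, slope=0.5):
        # Assumed monotone link: a more conservative beta yields fewer hits.
        return 1.0 / (1.0 + slope * beta)

    p_subj = 0.10                          # initial subjective P(S)
    for period in range(8):                # successive periods of the watch
        beta = (1.0 - p_subj) / p_subj     # criterion set from subjective odds
        p_hit = p_hit_given_beta(beta)
        # Missed signals are not registered, so Ps(S) drifts downward;
        # a small floor stands in for the factors that stabilize the criterion.
        p_subj = max(p_subj * p_hit, 0.001)
        print(period, round(beta, 1), round(p_hit, 3), round(p_subj, 4))
    # Beta climbs rapidly at first and then levels off at a stable, higher value.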
When the signal probability is lowered, it should serve to decrease the expectation of
the signal, and therefore increase beta. Payoffs may have similar effects. Since the vicious
circle depends on signals being missed in the first place, it stands to reason that the kinds
of variables that reduce sensitivity (short, low-intensity signals) should also increase the
expectancy effect in vigilance, as noted above.
An alternative theoretical conception for the bias increment was proposed by
Welford (1968). Arousal theory postulates that in a prolonged low-event environment,
the “evidence variable” X (see Figure 2.3) shrinks while the criterion stays constant. This
change is shown in Figure 2.8. The shrinking results from a decrease in neural activity
(both signal and noise) with decreased arousal. This decreased arousal may be related
to the sustained attentional demands of the vigilance task that affects sensitivity. An ex-
amination of Figure 2.8 reveals that such an effect will reduce both hit and false-alarm
rates (a change in beta) while keeping the separation of the two distributions, as ex-
pressed in standard scores, at a constant level (a constant d').
The arousal view is consistent with some physiological findings (e.g., Dardano,
1962; McGrath, 1963; Milosevic, 1975). However, some recent studies with drugs that
affect arousal have found changes in sensitivity, but not the changes in response bias
predicted by arousal theory. For example, drugs like caffeine that increase arousal do
not produce changes in response bias but increase sensitivity (Fine et al., 1994); drugs
like antihistamine and oxazepam that decrease arousal do not affect response bias
but decrease sensitivity (Fine et al., 1994; van Leeuwen et al., 1994).

Figure 2.8 Welford's arousal theory of the vigilance decrement (the signal and noise distributions along the evidence variable X shrink with the passage of time).

In addition, there
is good evidence that this theory is insufficient to explain all occurrences of criterion
shifts. This evidence is provided by instances in which a manipulated variable that
would be expected to influence the arousal level does not produce the expected ef-
fect on the criterion. For example, as we have seen, increasing the total event rate,
while keeping the absolute target frequency constant (thereby decreasing the tar-
get/nontarget ratio), increases the decrement. Yet arousal theory should predict the
opposite effect, since the more frequent nontarget events should increase arousal,
not decrease it.
Nonetheless, arousal theory is parsimonious in that the same physiological mech-
anism—increased fatigue, decreased arousal—that accounts for the bias increment
can also account for the sensitivity decrement. It is possible that arousal might inter-
act with expectancy, with lowered arousal (greater fatigue) increasing the chance of a
miss and lowering Ps(S). Thus, the best general explanation might be that decreasing
arousal (increasing fatigue) decreases sensitivity, increases beta, and also lowers the
subjective probability of a signal, again increasing beta. Expectancy theory is neces-
sary to properly account for the effects of probabilities and payoffs on the steady-state
vigilance level, however.
Techniques to Combat the Loss of Vigilance
In many vigilance situations, vigilance performance reflects some combination of
shifts in sensitivity and response bias. Very often d’ and beta shifts are observed in a
single vigil. It is also important to reemphasize that the theoretical mechanisms pro-
posed to account for the vigilance decrement also account for differences between
conditions or tasks that are not related to time. Although Teichner (1974) has pointed
out that the vigilance decrement can be small (around 10 percent), corrective tech-
niques that reduce the decrement will also improve the absolute level of performance.
As Parasuraman (1986) noted, performance levels may be consistently lower than
some minimum acceptable level of performance even if there is no decrement, which
may be of concern to management. Like the theories of vigilance, these corrective
techniques may be categorized into those that enhance sensitivity and those that shift
the response criterion.
Increasing Sensitivity We note the following techniques for improving sensitivity in
a vigilance task.
(1) Show target examples (reduce memory load). A logical outgrowth of the sus-
tained demand theory is that any technique aiding or enhancing the subject’s mem-
ory of signal characteristics should reduce sensitivity decrements and preserve a
higher overall level of sensitivity. Hence, the availability of a “standard” representa-
tion of the target should help. For example, Kelly (1955) reported a large increase in
detection performance when quality control operators could look at television pic-
tures of idealized target stimuli. Furthermore, a technique that helps reduce the bias
increment caused by expectancy may also combat a loss in sensitivity. The introduc-
tion of false signals, as described in the next section, could improve sensitivity by re-
freshing memory.
A study by Childs (1976), which found that subjects perform better when moni-
toring for only one target than when monitoring for one of several, is also consistent
with the importance of memory aids in vigilance. Childs also observed an improvement
in performance when subjects were told specifically what the target stimuli were rather
than what they were not. Schoenfeld and Scerbo (1997) found that the sensitivity decre-
ment was less when searching for the presence of a feature in a visual display than when
searching for its absence. The general recommendation is that inspectors should have
access to visual representations of possible defectives rather than simply the representa-
tion of those that are normal.
(2) Increase target salience. Various artificial techniques of signal enhancement are
closely related to the reduction in memory load. Available solutions capitalize on pro-
cedures that will differentially affect signals and nonsignals. For example, Luzzo and
Drury (1980) developed a signal-enhancement technique known as "blinking." When
successive events are similar to one another (e.g., wired circuit boards), detection of mis-
wired boards can be facilitated by rapidly and alternately projecting an image of a sin-
gle location of a known good prototype and the item to be inspected. If the latter is
"normal," the image will be identical, fused, and continuous. If the item contains a mal-
function (e.g., a gap in wiring), the gap location will blink on and off in a highly salient
fashion as the displays are alternated.
A related signal-enhancement technique is to induce coherent motion into targets
but not nontargets, thereby taking advantage of the human's high sensitivity to motion.
For example, an operator scanning a radar display for a target blip among many similar
nontargets encounters a very difficult detection problem. Scanlan (1975) has demon-
strated that a radar target undergoes a coherent but slow motion whereas the noise prop-
erties are random. If successive radar frames are stored, recent frames can be replayed in
fast time, forward and backward. Under these conditions, the target's coherent motion
stands out, improving detection performance.
An alternative approach is to transcribe the events to an alternate sensory modal-
ity. This technique takes advantage of the redundancy gain that occurs when a signal
is presented in two modalities at once. Employing this technique, Colquhoun (1975),
Doll and Hanna (1989), and Lewandowski and Kobus (1989) found that sonar moni-
tors detected targets more accurately when the target was simultaneously displayed vi-
sually and auditorially than when either mode was employed by itself.
(3) Vary event rate. Sustained demand theory suggests that high event rates can pro-
duce larger losses in vigilance performance. As Saito (1972) showed in a study of bottle
inspectors, a reduction of the event rate from 300 to under 200 bottles per minute
markedly improved inspection efficiency. Allowing observers to control event rate is also
effective: Scerbo, Greenwald, and Sawin (1993) showed that giving observers such con-
trol improves sensitivity and lowers the sensitivity decrement.
(4) Train observers. A technique closely related to the enhancement of signals
through display manipulations is one that emphasizes operator training. Fisk and
Schneider (1981) demonstrated that the magnitude of a sensitivity decrement could be
greatly reduced by training subjects to respond consistently and repeatedly to the target
elements. This technique of developing automatic processing of the stimulus (described
further in Chapter 6) tends to make the target stimulus "jump out" of the train of events,
just as one's own name is heard in a series of words. Fisk and Schneider note that the
critical stimuli must consistently appear as a target stimulus, and that the probability of
target must be high during the training session.
Shift in Response Criterion The following methods may be useful in shifting the cri-
terion to an optimal level.
(1) Instructions. An unsatisfactory vigilance or inspection performance can occur
because the operator's perceptions of the probability of signals or the costs of errors do
not agree with reality. For example, in quality control, an inspector may believe that it
is better to detect more defects and not worry about falsely rejecting good parts, although
it would be more cost-effective to maintain a higher criterion because the probability
of a defective part is low. Simple instructions in industrial or company policy may ad-
just beta to an appropriate level. In airline security inspection, increased stress on the
seriousness of misses (failing to detect a weapon smuggled through the inspection line)
could cause a substantial decrease in the number of misses (but a corresponding in-
crease in false alarms).
(2) Knowledge of results. Less direct means can also adjust the response criterion to
a more optimal level. For example, where possible, knowledge of results (KR) should be
provided to allow an accurate estimation of the true P(S) (N. H. Mackworth, 1950). It
appears that KR is most effective in low-noise environments (Becker, Warm, Dember,
& Hancock, 1995).
(3) False signals. Baker (1961) and Wilkinson (1964) have argued that intro-
ducing false signals should keep beta low. False signals will raise the subjective Ps(S)
and might raise the arousal level as well. Furthermore, if the false signals refresh the
operator's memory, the procedure should improve sensitivity and reduce the sensi-
tivity decrement by reducing the sustained demand of the task, as discussed earlier.
For example, as applied to the quality control inspector, a certain number of prede-
fined defectives might be placed on the inspection line. These would be “tagged,” so
that if missed by the inspector, they would still be removed. Their presence in the in-
spection stream should guarantee a higher Ps(S) and therefore a lower beta than
would be otherwise observed. However, this technique should not be used if the ac-
tions that the operator would take after detection have undesirable consequences for
an otherwise stable system. An extreme example would occur if false warnings were
introduced into a chemical process control plant and these led the operator to shut
down the plant unnecessarily.
(4) Confidence levels. Finally, allowing operators to report signal events with differ-
ent confidence levels decreases the bias increment (Broadbent & Gregory, 1965; Rizy,
1972). If rather than classifying each event as target or nontarget, the operator can say
“target,” “uncertain,” or “nontarget” (or a wider range of response options), beta should
not increase as quickly since the observer would say “nontarget” less often, and the sub-
jective perception of signal frequency, Ps(S), should not decrease as quickly. The idea of
a graded confidence in detection has important implications for the design of alarms
(discussed in Chapter 13).
(5) Other techniques. Other techniques to combat the decrement have focused more
directly on arousal and fatigue. Parasuraman (1986) noted that rest periods can have
beneficial effects. Presumably, rest periods serve to increase arousal, which as we have
seen influences both sensitivity and response bias. Welford (1968) has argued persua-
sively that any event (such as a phone call, drugs, or noise) that will sustain or increase
arousal should reduce the decrement or at least maintain beta at a more constant level.
Using biofeedback techniques, Beatty, Greenberg, Deibler, and O'Hanlon (1974) have
shown that operators trained to suppress theta waves (brain waves at 3-7 Hz, indicat-
ing low arousal) will also reduce the decrement.
Conclusions
Despite the plethora of vigilance experiments and the wealth of experimental data, the ap-
plication of research results to real-world vigilance phenomena has not yet been exten-
sive. This is somewhat surprising in light of the clear shortcomings in many inspection
tasks, with miss rates sometimes as high as 30 to 40 percent (Craig, 1984; Parasuraman,
Warm, & Dember, 1987). One reason the results of laboratory studies have not been more
fully applied relates to the discrepancy between the fairly simple stimuli with known lo-
cation and form employed in many laboratory tasks, and the more complex stimuli ex-
isting in the real world. The monitor of the nuclear power plant, for example, does not
know precisely what configuration of warning indicators will signal the onset of an ab-
normal condition, but it is unlikely that it will be the appearance of a single near-thresh-
old light in direct view. Some laboratory investigators have examined the effects of signal
complexity and uncertainty (e.g., Adams, Humes, & Stenson, 1962; Childs, 1976; Howell,
Johnston, & Goldstein, 1966). These studies are consistent in concluding that increased
complexity or signal uncertainty will lower the absolute vigilance level. However, their
conclusions concerning the influence of signal complexity and uncertainty on other vig-
ilance effects (e.g., the size of the vigilance decrement), and therefore the generalizability
of these effects to complex signal environments, have not been consistent.
A second possible reason laboratory results have not been fully exploited relates to the
differences in motivation and signal frequency between laboratory data and real vigilance
phenomena. In the laboratory, signal rates may range from one an hour to as high as three
or four per minute—low enough to show decrements, and lower than fault frequencies
found in many industrial inspection tasks, but far higher than rates observed in the per-
formance of reliable aircraft, chemical plants, or automated systems, in which defects occur
at intervals of weeks or months. This difference in signal frequency may well interact with
differences in motivational factors between the subject in the laboratory, performing a
well-defined task and responsible only for its performance, and the real-time system op-
erator confronted with a number of other competing activities and a level of motivation
potentially influenced by large costs and benefits. This motivation level may be either lower
or far higher than those of the laboratory subjects, but it will probably not be the same.
These differences do not mean that the laboratory data should be discounted. The
basic variables causing vigilance performance to improve or deteriorate that have been
uncovered in the laboratory should still affect detection performance in the real world,
although the effect may be attenuated or enhanced. Data have been collected in real or
highly simulated environments: in process control (Crowe, Beare, Kozinsky, & Hass,
1983; Lees & Sayers, 1976), in maritime ship navigation monitoring (Schmidke, 1976),
and in aviation (Molloy & Parasuraman, 1996; Ruffle-Smith, 1979), and many of the
same vigilance phenomena occurring in the laboratory occur in the real world. For ex-
ample, Pigeau, Angus, O'Neill, and Mack (1995) found a sensitivity decrement with
NORAD operators detecting the presence of aircraft entering Canadian airspace. These
results show that vigilance effects can occur in real-world situations. It should also be
noted that there is increasing implementation of automation in many work environ-
ments, and since perfectly reliable automated systems have not yet been developed, the
vigilance task is becoming more commonplace in many work domains in the form of
monitoring an automated system (Parasuraman & Riley, 1997; see Chapter 13).
INFORMATION THEORY
The Quantification of Information
The discussion of signal detection theory was our first direct encounter with the human
operator as a transmitter of information: An event (signal) occurs in the environment;
the human perceives it and transmits this information to a response. Indeed, a consid-
erable portion of human performance theory revolves around the concept of transmit-
ting information. In any situation in which the human operator either perceives changing
environmental events or responds to events that have been perceived, the operator is
encoding or transmitting information. The van driver in our Chapter 1 example must
process visual signals from the in-vehicle map display, from traffic signs, from other ve-
hicles, as well as process auditory signals (e.g., the truck horn). A fundamental issue in
engineering psychology is how to quantify this flow of information so that different
tasks confronting the human operator can be compared. Using information theory, we
can measure task difficulty by determining the rate at which information is presented.
We can also measure processing efficiency, using the amount of information an opera-
tor processes per unit of time. Information theory, therefore, provides metrics to com-
pare human performance across a wide number of different tasks.
Information is potentially available in a stimulus any time there is some uncertainty
about what the stimulus will be. How much information a stimulus delivers depends in
part on the number of possible events that could occur in that context. If the same stim-
ulus occurs on every trial, its occurrence conveys no information. If two stimuli (events)
are equally likely, the amount of information conveyed by one of them when it occurs,
expressed in bits, is simply the base 2 logarithm of the number of alternatives: for example,
with two events, log2 2 = 1 bit. If there were four alternatives, the information conveyed
by the occurrence of one of them would be log, 4 = 2 bits.
Formally, information is defined as the reduction of uncertainty (Shannon & Weaver,
1949). Before the occurrence of an event, you are less sure of the state of the world (you
possess more uncertainty) than after. When the event occurs, it has conveyed information
to you, unless it is entirely expected. The statement “Mexico declared war on the United
States this morning” conveys quite a bit of information. Your knowledge and understand-
ing of the world are probably quite different after hearing the statement than they were be-
fore. On the other hand, the statement “The sun rose this morning” conveys little
information because you could anticipate the event before it occurred. Information theory
formally quantifies the amount of information conveyed by a statement, stimulus, or event.
This quantification is influenced by three variables:
1. The number of possible events that could occur, N
2. The probabilities of those events
3. The events' sequential constraints, or the context in which they occur
We will now describe how each of these three variables influences the amount of infor-
mation conveyed by an event.
Number of Events Before the occurrence of an event (which conveys information),
a person has a state of knowledge that is characterized by uncertainty about some as-
pect of the world. After the event, that uncertainty is normally less. The amount of un-
certainty reduced by the event is defined to be the average minimum number of
true-false questions that would have to be asked to reduce the uncertainty. For exam-
ple, the information conveyed by the statement "Clinton won" after the 1996 election is
1 bit because the answer to one true-false question—"Did Clinton win?” (True) or “Did
Dole win?” (False)—is sufficient to reduce the previous uncertainty. If, on the other
hand, there were four major candidates, all running for office, two questions would have
to be answered to eliminate uncertainty. In this case, one question might be “Was the
winner from the liberal (or conservative) pair?" After this question was answered, a sec-
ond question would be “Was the winner the more conservative (or liberal) member of
the pair?" Thus, if you were simply told the winner, that statement would formally con-
vey 2 bits of information. This question-asking procedure assumes that all alternatives
are equally likely to occur. Formally, then, when all alternatives are equally likely, the in-
formation conveyed by an event, H_S, in bits, can be expressed by the formula

$H_S = \log_2 N$    (2.5)

where N is the number of equally likely alternatives.
Because information theory is based on the minimum number of questions and there-
fore arrives at a solution in a minimum time, it has a quality of optimal performance. It is this
optimal aspect that makes the theory attractive in its applications to human performance.
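The correspondence between bits and the minimum number of true-false questions can be verified directly; a minimal sketch:

    import math

    # Hs = log2(N) bits for N equally likely alternatives (Equation 2.5):
    for n in (2, 4, 8, 26):
        print(n, "alternatives ->", math.log2(n), "bits")

    # Read as yes/no questions, a binary search over N equally likely
    # alternatives needs ceil(log2(N)) questions:
    print(math.ceil(math.log2(4)))   # 2 questions, as in the four-candidate example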
Probability Real-world events do not always occur with equal frequency or likelihood.
If you lived in the Arizona desert, much more information would be conveyed by the
statement "It is raining" than the statement "It is sunny." Your certainty of the state of the
world is changed very little by knowing that it is sunny, but it is changed quite a bit (uncer-
tainty is reduced) by hearing of the low-probability event of rain. In the example of
the four election candidates, less information would be gained by learning that the fa-
vored candidate won than by learning that the Socialist Worker or Libertarian candidate
won. The probabilistic element of information is quantified by making rare events con-
vey more bits. This in turn is accomplished by revising Equation 2.5 for the information
conveyed by event i to be

$H_i = \log_2\left(\frac{1}{P_i}\right)$    (2.6)

where P_i is the probability of occurrence of event i. This formula increases H for low-
probability events. Note that if N events are equally likely, each event will occur with
probability 1/N. In this case, Equations 2.5 and 2.6 are equivalent.
As noted, information theory is based on a prescription of optimal behavior. This op-
timum can be prescribed in terms of the order in which the true-false questions should
be asked. If some events are more common or expected than others, we should ask the
question about the common event first. In our four-candidate example, we will do the best
(ask the minimum number of questions on the average) by first asking “Is the winner Clin-
ton?” or “Is the winner Dole?” assuming that Clinton and Dole have the highest proba-
bility of winning. If instead the initial question was “Is the winner an independent?” or “Is
the winner from one of the minor parties?” we have clearly “wasted” a question, since the
answer is likely to be no, and our uncertainty would be reduced by only a small amount.
The information conveyed by a single event of known probability is given by Equa-
tion 2.6. However, psychologists are often more interested in measuring the average in-
formation conveyed by a series of events with differing probabilities that occur over
time—for example, a series of warning lights on a panel or a series of communication
commands. In this case the average information conveyed is computed as
$H_{ave} = \sum_i P_i \left[\log_2\left(\frac{1}{P_i}\right)\right]$    (2.7)

In this formula, the quantity within the square brackets is the information per event
as given in Equation 2.6. This value is now weighted by the probability of that event,
and these weighted information values are summed across all events. Accordingly,
frequent low-information events will contribute heavily to this average, whereas rare
high-information events will not. If the events are equally likely, this formula will re-
duce to Equation 2.5.
An important characteristic of Equation 2.7 is that if the events are not equally
likely, H_ave will be less than its value if the same events are equally probable. For exam-
ple, consider four events, A, B, C, and D, with probabilities of 0.5, 0.25, 0.125, and 0.125.
The computation of the average information conveyed by each event in a series of such
events would proceed as follows:
Event           A       B       C       D
P_i             0.5     0.25    0.125   0.125
1/P_i           2       4       8       8
log2(1/P_i)     1       2       3       3

$H_{ave} = \sum_i P_i \left[\log_2\left(\frac{1}{P_i}\right)\right] = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 \text{ bits}$
This value is less than log2 4 = 2 bits, which is the value derived from Equation 2.5 when
the four events are equally likely. In short, low-probability events convey more infor-
mation because they occur infrequently. However, the fact that low-probability events
are infrequent causes their high-information content to contribute less to the average.
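The computation in Equation 2.7 can be checked mechanically; a minimal sketch reproducing the example above:

    import math

    def h_average(probs):
        """Average information, Equation 2.7: sum of P_i * log2(1/P_i)."""
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    print(h_average([0.5, 0.25, 0.125, 0.125]))   # -> 1.75 bits
    print(h_average([0.25] * 4))                  # -> 2.0 bits when equally likely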
Sequential Constraints and Context In the preceding discussion, probability has
been used to reflect the long-term frequencies, or steady-state expectancies, of events
that will occur. However, there is a third contributor to information that reflects the
short-term sequences of events, or their transient expectancies. A particular event may occur
rarely in terms of its absolute frequency. However, given a particular context, it may be
highly expected, and therefore its occurrence conveys very little information in that con-
text. In the example of rainfall in Arizona, we saw that the absolute probability of rain
is low. But if we heard that there was a large front moving eastward from California, our
expectancy of rain, given this information, would be higher. That is, information can be
reduced by the context in which it appears. As another example, the letter u in the al-
phabet is not terribly common and therefore normally conveys quite a bit of informa-
tion when it occurs; however, in the context of a preceding q, it is almost totally
predictable and therefore its information content, given that context, is nearly 0 bits.
Contextual information is frequently provided by sequential constraints on a series
of events. In the series of events ABABABABAB, for example, P(A) = P(B) = 0.5. There-
fore, according to Equation 2.7, each event conveys 1 bit of information. But the next
letter in the sequence is almost certainly an A. Therefore, the sequential constraints re-
duce the information the same way a change in event probabilities reduces information
from the equiprobable case. Formally, the information provided by an event, given a
context, may be computed in the same manner as in Equation 2.6, except that the ab-
solute probability of the event P_i is now replaced by a contingent probability P_i|X (the
probability of event i given context X).
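In code, the contextual computation simply substitutes the contingent probability into Equation 2.6. The letter probabilities below are rough illustrations of ours, not corpus values:

    import math

    def h_event(p):
        """Information of a single event, Equation 2.6: log2(1/p)."""
        return math.log2(1.0 / p)

    print(h_event(0.03))    # the letter u unconditionally: about 5.1 bits
    print(h_event(0.995))   # the letter u given a preceding q: about 0.007 bits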
Redundancy In summary, three variables influence the amount of information that a
series of events can convey. The number of possible events, N, sets an upper bound on
the maximum number of bits if all events are equally likely. Making event probabilities
unequal and increasing sequential constraints both serve to reduce information from
this maximum. The term redundancy formally defines this potential loss in informa-
tion. Thus, for example, the English language is highly redundant because of two fac-
tors: All letters are not equiprobable (e vs. x), and sequential constraints such as those
found in common digraphs like qu, ed, th, or nt reduce uncertainty.
Formally, the percent redundancy of a stimulus set is quantified by the formula

$\%\ \text{redundancy} = \left(1 - \frac{H_{ave}}{H_{max}}\right) \times 100$    (2.8)

where H_ave is the actual average information conveyed, taking into account all three vari-
ables (approximately 1.5 bits per letter for the alphabet), and H_max is the maximum
possible information that would be conveyed by the N alternatives if they were equally
likely (log2 26 = 4.7 bits for the alphabet). Thus, the redundancy of the English lan-
guage is (1 − 1.5/4.7) × 100 = 68 percent. Wh-t th-s sug-est- is th-t ma-y of t-e le-ter-
ar- not ne-ess-ry fo- com-reh-nsi-n. However, to stress a point that will be emphasized in
Chapter 5, this does not negate the value of redundancy in many circumstances. We have
seen already in our discussion of vigilance that redundancy gain can improve perfor-
mance when perceptual judgments are difficult. At the end of this chapter, we will see
its value in absolute judgment tasks.
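Equation 2.8 with the values cited for English can be verified in a few lines; a minimal sketch:

    import math

    def percent_redundancy(h_ave, n_alternatives):
        """Equation 2.8, with H_max taken as log2(N)."""
        h_max = math.log2(n_alternatives)
        return (1.0 - h_ave / h_max) * 100.0

    # About 1.5 bits/letter actual versus log2(26) = 4.7 bits maximum:
    print(percent_redundancy(1.5, 26))   # -> roughly 68 percent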
Information Transmission of Discrete Signals
In much of human performance theory, investigators are concerned not only with how
much information is presented to an operator but also with how much is transmitted
from stimulus to response, the channel capacity, and how rapidly it is transmitted, the
bandwidth. Using these concepts, the human being is sometimes represented as an in-
formation channel, an example of which is shown in Figure 2.9. Consider the typist
typing up some handwritten comments. First, information is present in the stimuli
(the handwritten letters). This value of stimulus information, H_S, can be computed by
the procedures described, taking into account probabilities of different letters and their
sequential constraints. Second, each response on the keyboard is an event, and so we
can also compute response information, H_R, in the same manner. Finally, we ask if each
letter on the page was appropriately typed on the keyboard. That is, was the informa-
tion faithfully transmitted, H_T? If it was not, there are two types of mistakes: First, in-
formation in the stimulus could be lost (H_loss), which would be the case if a certain letter
was not typed. Second, letters may be typed that were not in the original text. This is
referred to as noise. Figure 2.9a illustrates the relationship among these five informa-
tion measures. Notice that it is theoretically possible to have a high value of both H_S
and H_R but to have H_T equal to zero. This result would occur if the typist were totally
ignoring the printed text, creating his or her own message. A schematic example is
shown in Figure 2.9b.
We will now compute H_T in the context of a four-alternative stimulus-response re-
action-time task rather than the more complex typing task. In this task, the subject is
confronted by four possible events, any of which may appear with equal probability, and
must make a corresponding response for each.
For the ideal information transmitter, H_S = H_T = H_R. In optimal performance of the
reaction-time task, for example, each stimulus (conveying 2 bits of information if
equiprobable) should be processed (H_T = 2 bits) and should trigger the appropriate re-
sponse (H_R = 2 bits). As we saw, in information-transmitting systems, this ideal state
is rarely obtained because of the occurrence of equivocation and noise.
The computation of H_T is performed by setting up a stimulus-response matrix, such
as that shown in Figure 2.10, and converting the various numerical entries into three
sets of probabilities: the probabilities of events, shown along the bottom row; the prob-
abilities of responses, shown along the right column; and the probabilities of a given
stimulus-response pairing. These latter values are the probability that an entry will fall
in each filled cell, where a cell is defined jointly by a particular stimulus and a particu-
lar response. In Figure 2.10a, there are four filled cells, with P = 0.25 for each entry. Each
of these sets of probabilities can be independently converted into the information mea-
sures by Equation 2.7.

Figure 2.9 Information transmission and the channel concept: (a) information transmitted through the system; (b) no information transmitted.

Figure 2.10 Two examples of the calculation of information transmission. In (a), each stimulus is paired consistently with one response, so H_SR = 2 bits and H_T = H_S + H_R − H_SR = 2 + 2 − 2 = 2 bits. In (b), there are eight equally likely stimulus-response cells, so H_SR = log2 8 = 3 bits.
Once the quantities H_S, H_R, and H_SR are calculated, the formula

$H_T = H_S + H_R - H_{SR}$    (2.9)
allows us to compute the information transmitted. The rationale for the formula is as follows:
The variable H_S establishes the maximum possible transmission for a given set of events and
so contributes positively to the formula. Likewise, H_R contributes positively. However, to guard
against situations such as that depicted in Figure 2.9b, in which events are not coherently paired
with responses, H_SR, a measure of the dispersion or lack of organization within the matrix, is
subtracted. If each stimulus generates consistently only one response (Figure 2.10a), the entries
in the matrix cells should equal the entries in the rows and columns. In this case, H_S = H_R = H_SR,
which means that by substituting the values in Equation 2.9, H_T = H_S. However, if there
is greater dispersion within the matrix, there are more bits within H_SR. In Figure 2.10b,
this is shown by eight equally probable stimulus-response pairs, or 3 bits of information
in H_SR. Therefore, H_SR > H_R and H_T < H_S. The relation between these quantities is
shown in a Venn diagram in Figure 2.11.
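The computation in Figure 2.10 can be carried out mechanically from any stimulus-response count matrix. The sketch below assumes a matrix with one row per response and one column per stimulus; the second example matrix is our own, constructed to produce the eight equally likely cells described for Figure 2.10b.

    import math

    def h_from_probs(probs):
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    def h_transmitted(matrix):
        """H_T = H_S + H_R - H_SR (Equation 2.9), from a count matrix
        with one row per response and one column per stimulus."""
        total = sum(sum(row) for row in matrix)
        p_cells = [c / total for row in matrix for c in row]
        p_resp = [sum(row) / total for row in matrix]
        n_stim = len(matrix[0])
        p_stim = [sum(row[j] for row in matrix) / total for j in range(n_stim)]
        return h_from_probs(p_stim) + h_from_probs(p_resp) - h_from_probs(p_cells)

    perfect = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 2, 0], [0, 0, 0, 2]]
    confused = [[1, 0, 0, 1], [1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]
    print(h_transmitted(perfect))    # -> 2.0 bits, as in Figure 2.10a
    print(h_transmitted(confused))   # -> 1.0 bit: H_SR = 3, so 2 + 2 - 3 = 1

Dividing H_T by the mean reaction time then yields the bandwidth in bits/second discussed next.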
Often the investigator may be interested in an information transmission rate expressed
in bits/second rather than the quantity H_T, expressed in bits. To find this rate, H_T is com-
puted over a series of stimulus events, along with the average time for each transmission
(i.e., the mean reaction time, RT). Then the ratio H_T/RT is taken to derive a measure of the
bandwidth of the communication system in bits/second. This is a useful metric because it
represents processing efficiency by taking into account both speed and accuracy, and it al-
lows comparison of efficiencies across tasks. For example, measures of processing efficiency
can be obtained for typing or monitoring tasks, and the bandwidths can be compared.

Figure 2.11 Information transmission represented in terms of Venn diagrams.
Conclusion
In conclusion, it should be noted that information theory has its clear benefits—it provides
a single combined measure of speed and accuracy that is generalizable across tasks—but it
also has its limitations (see Wickens, 1984). In particular, H_T measures only whether responses
are consistently associated with events, not whether they are correctly associated, and the
measure does not take into account the magnitude of an error. Sometimes larger errors are
more serious, such as when the stimulus and response scales lie along a continuum (e.g., dri-
ving a car on a windy road, tracking a moving target). Information theory can also be applied
to such continuous tasks, and Wickens (1992) describes methods for doing this. However,
an alternative is to use either a correlation coefficient or some measure of the integrated
error across time, as discussed in Chapter 10. Further discussion of H_T and its relation to
measures of d’ and percentage correct can be found in Wickens (1984).
ABSOLUTE JUDGMENT
The human senses, although not perfect, are still relatively keen when contrasted with
the detection resolution of machines. In this light, it is somewhat surprising that the lim-
its of absolute judgment—in which an observer assigns a stimulus into one of multiple
categories along a sensory dimension—are relatively severe. This is the task, for exam-
ple, that confronts an inspector of wool quality who must categorize a given specimen
into one of several quality levels; or our van driver who must interpret and recognize
the color of a display symbol appearing on his map display. Our discussion of absolute
judgment will first describe performance when stimuli vary on only a single physical di-
mension. We will then consider absolute judgment along two or more physical dimensions
that are perceived simultaneously and discuss the implications of these findings for prin-
ciples of display coding.
Single Dimensions
Experimental Results For a typical absolute judgment experiment, a stimulus con-
tinuum (e.g., tone pitch, light intensity, or texture roughness) and a number of discrete
levels of the continuum (e.g., four tones of different frequencies) are selected. These
stimuli are then presented randomly to the subject one at a time, and the subject is asked
to associate a different response to each one. For example, the four tones might be called
A, B, C, and D. The extent to which each response matched the presented stimulus can
then be assessed. When four discriminable stimuli (2 bits) are presented, transmission
(H_T) is usually perfect—at 2 bits. Then the stimulus set is enlarged, and additional data
are also collected with five, six, seven, and more discrete stimulus levels, and H_T is com-
puted each time by using the procedures described in the preceding section. Typically,
the results indicate that errors begin to be made when about five to six stimuli are used,
and the error rate increases as the number of stimuli increases further. These results in-
dicate that the larger stimulus sets have somehow saturated the subject’s capacity to
transmit information about the magnitude of the stimulus. We say the subject has a
maximum channel capacity.
Graphically, these data can be represented in Figure 2.12, in which the actual in-
formation transmitted (H_T) is plotted as a function of the number of absolute judgment
stimulus alternatives (expressed in informational terms as H_S). The 45-degree slope of
the dashed line indicates perfect information transmission, and the "leveling" of the
function takes place at the region in which errors began to occur (i.e., H_T < H_S). The
level of the flat part or asymptote of the function indicates the channel capacity of the
operator: somewhere between 2 and 3 bits. George Miller (1956), in a classic paper en-
titled "The Magical Number Seven, Plus or Minus Two," noted the similarity of the as-
ymptote level across a number of different absolute judgment functions with different
sensory continua. Miller concluded that the limit of absolute judgment at 7 ± 2 stim-
ulus categories (2-3 bits) is fairly general. This limit does, however, vary somewhat from
one stimulus continuum to another; it is less than 2 bits for saltiness of taste and about
3.4 bits for judgments of position on a line. Nonetheless, there are clear capacity limi-
tations for the absolute judgment of sensory stimuli.

Figure 2.12 Typical human performance in absolute judgment tasks (information transmitted, H_T, plotted against stimulus information, H_S; the dashed line represents perfect performance).
The level of the asymptote does not appear to reflect a basic limit in sensory reso-
lution, for two reasons. First, the senses are extremely keen in their ability to make dis-
criminations between two stimuli (“Are they the same or different?”). For example, the
number of adjacent stimulus pairs that a human can accurately discriminate on the sen-
sory continuum of tone pitch is roughly 1,800 (Mowbray & Gebhard, 1961). Second,
Pollack (1952) has observed that the limits of absolute judgment are little affected by
whether the stimuli are closely spaced on the physical continuum or widely dispersed.
Conversely, sensory discrimination of stimuli is clearly affected by stimulus spacing.
Hence, the limit is not sensory but is in the accuracy of the subject’s memory for the
representation of the four to eight different standards (Siegel & Siegel, 1972).
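The gap between sensory resolution and the absolute judgment limit is easy to express in informational terms; a short sketch using the figures cited above:

    import math

    # Miller's 7 +/- 2 categories, expressed in bits:
    for k in (5, 7, 9):
        print(k, "categories ->", round(math.log2(k), 2), "bits")   # 2.32 to 3.17

    # By contrast, pairwise discrimination of pitch (roughly 1,800 steps)
    # would correspond to about 10.8 bits if it could be used absolutely:
    print(round(math.log2(1800), 1), "bits")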
If, in fact, absolute judgment limitations are related to memory, there should be some
association between this phenomenon and difference in learning or experience, since dif-
ferences in memory are closely related to those of learning. It is noteworthy that sensory
continua for which we demonstrate good absolute judgments are those for which such
judgments in real-world experience occur relatively often. For example, judgments of po-
sition along a line (3.4 bits) are made in measurements on rulers, and judgments of angle
(4.3 bits) are made in telling the time from analog clocks. High performance in absolute
judgment also seems to be correlated with professional experience with a particular sen-
sory continuum in industrial tasks (Welford, 1968) and is demonstrated by the note-
worthy association of absolute pitch with skilled musicians (Carroll, 1975; Klein, Coles,
& Donchin, 1984; Siegel & Siegel, 1972).
Many attempts to model performance in absolute judgment tasks are similar to sig-
nal detection theory (see Luce, 1994; Shiffrin & Nosofsky, 1994), extending it to situa-
tions where there are more than two stimulus possibilities. In these approaches, each
stimulus is assumed to give rise to a distribution of “perceptual effects” along the uni-
dimensional continuum, an approach initially developed by Torgerson (1958). The ob-
server partitions the continuum into response regions using a set of decision criteria,
instead of the one criterion used in the simple signal detection situation. If the variance
of these distributions increased with the number of stimuli, it would be more difficult
to absolutely identify each stimulus. That is, as the number of stimuli increases, sensi-
tivity (our ability to accurately determine which stimulus we are perceiving) decreases.
Such models (e.g., Braida et al., 1984; Luce, Green, & Weber, 1976) can account for
edge effects as well: stimuli located in the middle of the range of presented stimuli are
generally identified with poorer accuracy than those at extremes (Shiffrin & Nosofsky,
1994). The edge effect appears to be due to lowered sensitivity for stimuli in the middle
of the range, and not simply response bias or factors related to fewer response choices
at the extremes (Shiffrin & Nosofsky, 1994).
Applications The conclusions drawn from research in absolute judgment are relevant
to the performance of any task that requires operators to sort stimuli into levels along a
physical continuum, particularly for industrial inspection tasks in which products must
be sorted into various levels for pricing or marketing (e.g., fruit quality) or for different
uses (e.g., steel or glass quality). The data from the absolute judgment paradigm indicate
the kind of performance limits that can be anticipated and suggest the potential role of
training. Edge effects suggest that inspection accuracy should be better for extreme stim-
uli. One potential method for improving performance would be to have different inspec-
tors sort different levels of the dimension in question. This would lead to different extreme
stimulus categories for each inspector, thereby creating more "edges" where absolute judg-
ment performance is superior. The method remains untested, however.
Absolute judgment data are also relevant to coding, where the level of a stimulus di-
mension is assigned a particular meaning, and the operator must judge that meaning.
For example, computer monitors can display a very large range of colors (e.g., 64,000 lev-
els), and software designers are sometimes tempted to use the large available range to code
variables. However, it is clear that people cannot correctly classify colors beyond about
seven levels, so a variable coded with more color levels than this cannot be accurately
processed by the user (see Chapter 3). In general, basic data on the number of categories
that can be employed without error are relevant to the development of display codes.
Moses, Maisano, and Bersh (1979) have cautioned that a conceptual continuum
should not be arbitrarily assigned to a physical dimension. They have argued that some
conceptual continua have a more "natural" association or compatibility with some phys-
ical display dimensions than with others. The designers of codes should be wary of the
potential deficiencies (decreased accuracy, increased latency) imposed by an arbitrary
or incompatible assignment. For example, Moses, Maisano, and Bersh suggest that the
representation of danger and unit size should be coded by the color and size of a dis-
played object, respectively, and not the reverse. (The issue of display compatibility will
receive more discussion in Chapters 3, 4, and 5.)
Multidimensional Judgment
If our limits of absolute judgment are severe and can only be overcome by extensive
training, how is it that we can recognize stimuli in the environment so readily? A major
reason is that most of our recognition is based on the identification of some combina-
tion of two or more stimulus dimensions rather than levels along a single dimension.
When a stimulus can vary on two (or more) dimensions at once, we make an impor-
tant distinction between orthogonal and correlated dimensions. When dimensions of a
stimulus are orthogonal, the level of the stimulus on one dimension can take on any
value, independent of the other—for example, the weight and hair color of an individ-
ual. When dimensions are correlated, the level on one constrains the level on another—
for example, height and weight, since tall people tend to weigh more than short ones.
Orthogonal Dimensions The importance of multidimensional stimuli in increasing
the total amount of information transmitted in absolute judgment has been repeatedly
demonstrated (Garner, 1974). For instance, Egeth and Pachella (1969) demonstrated that
subjects could correctly classify only 10 levels of dot position on a line (3.4 bits of infor-
mation). However, when two lines were combined into a square, so that subjects classi-
fied the spatial position of a dot in the square, subjects could correctly classify 57 levels
(5.8 bits). Note, however, that this improvement does not represent a perfect addition of
channel capacity along the two dimensions. If processing along each dimension were
independent and unaffected by the other, the predicted amount of information trans-
mitted would be 3.4 + 3.4 = 6.8 bits, or around 100 positions (10 x 10) in the square.———
Egeth and Pachella's results suggest that there is some loss of information along each di-
mension resulting from the requirement to transmit information along the other.
Going beyond the two-dimensional case, Pollack and Ficks (1954) combined six di-
mensions of an auditory stimulus (e.g., loudness, pitch) orthogonally. As each succes-
sive dimension was added, subjects showed a continuous gain in total information
transmitted but a loss of information transmitted per dimension. These relations are
shown in Figure 2.13a, with seven bits the maximum capacity. The reason people with
absolute pitch are superior to those without does not lie in greater discrimination along
a single continuum. Rather, those with absolute pitch make their judgments along two
dimensions: the pitch of the octave and the value of a note within the octave. They have
created a multidimensional stimulus from a stimulus that others treat as unidimensional
(Carroll, 1975; Shepard, 1982; Shiffrin & Nosofsky, 1994).
Correlated Dimensions The previous discussion and the data shown in Figure 2.13a
suggest that combining stimulus dimensions orthogonally leads to a loss in informa-
tion transmitted. As noted, however, dimensions can be combined in a correlated or re-
dundant fashion. For example, the position and color of an illuminated traffic light are
redundant dimensions. When the top light is illuminated, it is always red. In this case, H_S,
the information in the stimulus, is no longer the sum of H_S across dimensions since this
sum is reduced by redundancy between levels on the two dimensions. If the correlation is
1.0, as with the traffic light, total H_S is just the H_S on any single dimension (since other di-
mensions are completely redundant). Thus, the maximum possible H_T for all dimensions
in combination is less than its value would be in the orthogonal case. However, Eriksen
and Hake (1955) found that by progressively combining more dimensions redundantly,
the information loss (H_S − H_T) is much less for a given value of H_S than it is when they
are combined orthogonally, and the information transmitted (H_T) is greater than it would
be along any single dimension. As ilustrated in Figure 2.13b, H represents a limit on
@) ()
Perfect performance
Human performance
Tota Hy
6
formation
transmitted, Hy
iisimersion
{2s #5 6 7
Number of combined dimensions Number of combined dimensions,
finoreasing Hs)
Figure 2.13 Human performance in absolute judgment of multidimensional auditory stimuli.
(a) Orthogonal dimensions. As more dimensions are added, more total information is transmitted,
but less information is transmitted per dimension. (b) Correlated dimensions. As more dimensions
are added, the security of the channel improves, but H_S limits the amount of
information that can be transmitted.
As illustrated in Figure 2.13b, H_S represents a limit on information transmitted with
correlated dimensions, and as the number of redundant or correlated dimensions
increases, H_T will approach that limit.
It should be noted that the value of the correlation between redundant dimensions
can range from 0 to 1. Such correlation may result from natural variation in the stimulus
material (e.g., height and weight of a person, or perhaps hue and brightness of a
tomato). Alternatively, it may result from artificially imposed constraints by the designer,
in which case the correlation is usually 1, indicating complete redundancy (e.g., color
and location in traffic lights).
In summary, we can see that orthogonal and correlated dimensions accomplish two
different objectives in absolute judgment of multidimensional stimuli. Orthogonal
dimensions maximize H_T, the efficiency of the channel. Correlated dimensions minimize
the information loss (H_S - H_T); that is, they maximize the security of the channel.
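The difference between the two cases can be checked with a few lines of code. The sketch below is a minimal illustration, assuming four equally likely levels per dimension; it computes H_S for an orthogonal and for a perfectly correlated combination of two dimensions using the standard Shannon entropy.

    import math
    from collections import Counter

    def entropy(outcomes):
        # Shannon entropy H (in bits) of a list of equally weighted outcomes.
        counts = Counter(outcomes)
        n = len(outcomes)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    levels = [0, 1, 2, 3]  # four levels per dimension = 2 bits each

    # Orthogonal: every pairing occurs, so H_S sums across dimensions.
    orthogonal = [(a, b) for a in levels for b in levels]
    print(entropy(orthogonal))  # 4.0 bits (2 + 2)

    # Perfectly correlated: the level on one dimension fixes the other,
    # so H_S collapses to that of a single dimension.
    correlated = [(a, a) for a in levels]
    print(entropy(correlated))  # 2.0 bits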
Dimensional Relations: Integral, Separable, and Configurable Orthogonal or correlated
dimensions refer to properties of the information conveyed by a multidimensional
stimulus, and not to the physical form of the stimulus. However, combining dimensions in
multidimensional absolute judgment tasks has different effects, depending on the nature
of the physical relationship between the two dimensions. In particular, Garner
(1974) made the important distinction between an integral and a separable pair of physical
dimensions. Separable dimensions are defined when the levels along each of the two
dimensions can be specified without requiring the specification of the level along the
other. For example, the lengths of the horizontal and vertical lines radiating from the dot
in Figure 2.14a are two separable dimensions; each can be specified without specifying
the other. For integral dimensions, this independence is impossible. The height and width
of a single rectangle are integral because to display the height of a rectangle, the width
must be specified; otherwise, it would not be a rectangle (Figure 2.14b). Correspondingly,
the color and brightness of an object are integral dimensions.
Figure 2.14 (a) Separable dimensions (height and width of a line segment); (b) integral
dimensions (height and width of a rectangle). Dimension A is height; dimension B is width.
Color cannot be physically represented without some level of brightness. Hence, integral
and separable pairs differ in the degree to which the two dimensions can be independently specified.
To reveal the different implications of integral versus separable dimensional pairs for
human performance, experiments are performed in which subjects categorize different
levels of one dimension for a set of stimuli. In the one-dimensional control condition, subjects
would sort on one varying dimension while the other dimension is held constant. In the
stimulus example in Figure 2.14b, they might sort the rectangles by height while the width
remained constant. In the orthogonal condition, they sort on one varying dimension while
ignoring variation in the other dimension. Thus, as in Figure 2.14b, they might sort
rectangle heights as the rectangle widths vary, even though the width is irrelevant to their
task and hence should be ignored. Finally, in the correlated (redundant) condition, the two
dimensions are perfectly correlated. An example would be sorting rectangles whose
height and width are perfectly correlated. Thus, the rectangles would all be of the same
shape but would vary in size.
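The three sorting conditions differ only in how the stimulus set is constructed. The sketch below builds each set for the rectangle example; the particular height and width values are hypothetical, chosen only for illustration.

    import itertools
    import random

    heights = [1.0, 2.0]  # levels of the relevant dimension (hypothetical units)
    widths = [1.0, 2.0]   # levels of the irrelevant dimension

    # Control: the relevant dimension varies; the irrelevant one is held constant.
    control = [(h, widths[0]) for h in heights]

    # Orthogonal: all combinations occur; irrelevant variation must be ignored.
    orthogonal = list(itertools.product(heights, widths))

    # Correlated (redundant): the levels co-vary perfectly, producing
    # rectangles of the same shape but different sizes.
    correlated = list(zip(heights, widths))

    # A sorting block then presents a shuffled sequence drawn from one set:
    trials = random.choices(orthogonal, k=20)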
An experiment by Garner and Felfoldy (1970) revealed that sorting with integral
dimensions (e.g., rectangles) helped performance in the correlated condition (relative to
the control condition) but hurt performance in the orthogonal condition. In the context
of our discussion in Chapter 3, we refer to the latter as a failure of focused attention.
In contrast, when sorting with separable dimensions (the stimuli in Figure 2.14a),
performance is little helped by redundancy and little hurt by the orthogonal variation of
the irrelevant dimension. These differences between integral and separable dimensions
are observed whether performance is measured by accuracy (i.e., H_T), when
several levels of each dimension are used, or by speed, when only two levels of each
dimension are used and accuracy is nearly perfect. Table 2.2 lists examples of integral and
separable pairs of dimensions, as determined by Garner's classification methodology.
For example, as Table 2.2 indicates, pitch and loudness have been shown to be inte-
gral dimensions (Grau & Kemler-Nelson, 1988; Melara & Marks, 1990). Thus, changes in
the pitch of a warning signal will affect how loud it appears to be, and vice versa. In con-
trast, Dutta and Nairne (1993) found that spatial location and temporal order are sepa-
rable dimensions; that is, when subjects had to remember which of two stimuli (a circle
or square) occurred first, they were not affected by variation in the location (top-
bottom) of the stimuli, and vice versa. This suggests that for dynamic data displays (e.g.,
in a dynamic map display, or on a radar screen), changing the location of a symbol or
display element should not affect judgments concerning whether it appeared before
or after another symbol. A radar monitor remembering the position of two aircraft each
coded by a symbol on a radar screen should not be affected by differences in which one
appeared on the radar screen first. It also suggests that differences in the time when two
symbols appeared on the screen should not affect judgments of relative location of the
two symbols. For example, a radar operator’s judgment of which aircraft appeared first
on the radar screen should not be affected by the relative location of the two aircraft.
Traditionally, only two levels of the irrelevant dimension have been tested in speeded
classification. Using more than two levels allows an understanding of the relationship
between the number of levels of the irrelevant dimension and interference. It also allows
us to examine the effects of the spacing of the levels (are the levels far apart or close
together?) on interference. Melara and Mounts (1994) found that increased spacing on
the irrelevant dimension increased interference in the orthogonal sort. They also found
that increasing the number of levels reduced interference, a counterintuitive result.
TABLE 2.2 Pairs of Integral and Separable Dimensions

Integral Dimensions                          Separable Dimensions
height of rectangle / width of rectangle     height / width
lightness / color saturation                 size / color saturation
hue / color saturation                       size (area) / brightness
pitch / timbre                               shape / color saturation
pitch / loudness                             shape (of letter) / color
                                             duration / location
                                             orientation (angle) / size
                                             spatial location / temporal order
The result might be better understood by realizing that the levels of the unattended
dimension become less distinct when there are more of them.
Problems for the Integral-Separable Distinction Despite the general trends noted
above, there are some problems with the integral-separable distinction. It is unclear exactly
what makes dimensions integral or separable (Carswell & Wickens, 1990; Cheng & Pachella,
1984), and the two symptoms of integrality (redundancy gain and orthogonal cost) do not
always co-occur. It appears that there is a continuum of integrality, with some dimensional
pairs being clearly integral (hue and brightness), others somewhat so (height and width of
a rectangle), others fairly separable (color and shape of a geometric figure), and others
clearly separable (the heights of two parallel bar graphs). There are also other methods for
establishing whether dimensions are integral or separable (e.g., multidimensional scaling),
and the methods do not always produce the same result, because different
methods are biased towards different outcomes (see Kemler-Nelson, 1993, for a discussion).
The fact that the concept of integrality is not absolute, with two categories (integral
and separable), but is rather defined along a continuum does not diminish its
importance. However, it is also important to consider two qualifications to the general
principles of separability and integrality.
The first qualification is that some combinations of dimensions are asymmetric.
That is, variation in Dimension A affects Dimension B, but the reverse does not hold
true. For example, Shechter and Hochstein (1992) found that variation in position or
width of bar stimuli affected judgments of contrast, but judgments of position and
width were not affected by variation in contrast. Hollands and Spence (1997) found
similar relations between the overall size or scaling of a stacked bar graph and the size
of the proportion shown within the graph. Variation in scaling affected perception of
proportion, but the reverse did not hold true. Garner (1974) has noted similar results
with other stimuli (e.g., pitch and phoneme). He noted that stimuli that combine in this
asymmetric manner may have certain hierarchical properties, with one dimension being
more fundamental than the other. For example,a phoneme must have a piteh, but pitch
on without any linguistic properties, Similar claims can be nade
amined by Hollands and Spence, and Shechter and Hochstein
can exist as a dimens
for the dimensions e:58
Chapter 2_ Signal Detection, Information Theory and Absolute Judgement
The second qualification (really an elaboration) concerns correlated dimensions,
such as the rectangles of varying shape shown in Figure 2.14. When dimensions are
separable, like the color and shape of a symbol, it matters little which level of one dimension
is paired with which level of the other (e.g., whether the red symbol is a square and the
blue symbol is a circle, or vice versa). For integral dimensions, however, the pairing often
does make a difference. For example, when the height and width of rectangles are positively
correlated, creating rectangles of constant shape and different size, performance is
not as good as if the dimensions are negatively correlated, such as those shown in Figure
2.14, creating rectangles of different shapes (Lockhead & King, 1977; Weintraub, 1971).
Pairs of dimensions for which the pairing of particular levels makes a difference are
referred to as configurable (Carswell & Wickens, 1990; Pomerantz & Pristach, 1989) or
congruent (Melara & Marks, 1990). Pomerantz (1981) has referred to the emergent properties
that can be produced when configurable dimensions are combined. These emergent
properties, or emergent features, like the shape and size of a rectangle, will have important
implications for object displays (to be discussed in Chapter 3).
A Theoretical Understanding General Recognition Theory (GRT), a model proposed
by Ashby and coworkers (Ashby & Lee, 1991; Ashby & Maddox, 1994; Maddox & Ashby,
1996), has attempted to model the mechanisms that might lead to interference and facilitation
in the speeded classification task. GRT is based on signal detection theory and
generalizes the signal detection theory concepts to situations where stimuli vary on more than
one physical dimension. Imagine that you are a food inspector examining tomatoes on two
criteria: size and color saturation (a light red versus a deep, rich red). According to GRT, a
particular stimulus generates a point in multidimensional space. A small, light-red tomato
would occupy a point in a two-dimensional space, where the two dimensions are size and
color saturation. Like signal detection theory, GRT assumes that repeated presentations of
the same stimulus lead to different amounts of neural activity. Hence, the perceptual effect
of a stimulus can be represented by a multivariate probability distribution, which has
a three-dimensional, bell-like shape. Although this could be represented as a three-
dimensional figure, it is simpler to draw as if viewed from above, as shown by each circle
in Figure 2.15. The diameter of the circle represents the variability of the distribution. The
circles in Figure 2.15 could represent 95 percent of the distribution, for example.
Different stimuli produce distributions in different locations, as shown in Figure
2.15a. For example, small, light-red tomatoes produce a distribution of neural activ-
ity in the bottom left of Figure 2.15a. Conversely, large, deep-red tomatoes produce a
distribution in the upper right of the figure. Note that the vertical positions of the
distributions (representing color saturation) are not affected by the level of size, and
that the horizontal positions (representing size) are not affected by the level of color
saturation. In this case, the dimensions are separable. In contrast, Figure 2.15b shows
the integral dimensions of color saturation and lightness. We would expect, therefore,
that variations in color saturation affect judgments of lightness, and vice versa. In the
GRT model, the vertical position of the distributions (representing color saturation)
is affected by the horizontal position (lightness). Varying one dimension affects the
perceived level of the other.
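This idea can be caricatured in a few lines of code. The sketch below is our own illustration, not the formal GRT model (which distinguishes several senses of dimensional independence): it draws one perceptual sample per stimulus, and with integrality = 0 the mean on each dimension depends only on its own level, as in Figure 2.15a, while a positive value shifts the perceived mean on one dimension with the level of the other, as in Figure 2.15b.

    import random

    def perceptual_effect(level_size, level_sat, integrality=0.0, sd=0.3):
        # One sample of perceptual activity for a tomato whose true levels
        # are level_size and level_sat. A nonzero `integrality` pulls the
        # perceived saturation toward the size level, mimicking integral pairs.
        size = random.gauss(level_size, sd)
        saturation = random.gauss(level_sat + integrality * level_size, sd)
        return size, saturation

    # A small (-1), light-red (-1) tomato:
    print(perceptual_effect(-1, -1))                   # separable: means stay put
    print(perceptual_effect(-1, -1, integrality=0.5))  # integral: saturation mean shifts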
In an orthogonal condition, subjects sort on one varying dimension while ignoring
variation in the other dimension. Hence the inspector sorting fruits in terms of size
and color saturation would maintain a mental representation like that shown in Figure