Abductive Inference Computation Philosophy Technology
Abductive inference
Computation, philosophy, technology
Edited by
JOHN R. JOSEPHSON
The Ohio State University
SUSAN G. JOSEPHSON
The Ohio State University
and
Columbus College of Art & Design
CAMBRIDGE
UNIVERSITY PRESS
Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge CB2 1RP
40 West 20th Street, New York, NY 10011-4211, USA
10 Stamford Road, Oakleigh, Melbourne 3166, Australia
A catalog record for this book is available from the British Library.
Introduction
1 Conceptual analysis of abduction
What is abduction?
Diagnosis and abductive justification
Doubt and certainty
Explanations give causes
Induction
Taxonomy of basic inference types
From wonder to understanding
2 Knowledge-based systems and the science of AI
The science of AI
Knowledge-based systems and knowledge representations
Generic tasks
3 Two RED systems - abduction machines 1 and 2
The red-cell antibody identification task
The common architecture underlying RED-1 and RED-2
The RED-1 Overview mechanism
The RED-2 Overview mechanism
Hypothesis interactions
4 Generalizing the control strategy - machine 3
The PEIRCE tool
Reimplementing RED in PEIRCE
Abduction in SOAR
Generic tasks revisited
5 More kinds of knowledge: Two diagnostic systems
TIPS
PATHEX/LIVER: Structure-function models for causal reasoning
6 Better task analysis, better strategy - machine 4
Abduction machines - summary of progress
Task analysis of explanatory hypothesis formation
Concurrent assembly
Concurrent realization of the essentials-first strategy: Framework
Efficiency of the essentials-first strategy
7 The computational complexity of abduction
Introduction
Background
Notation, definitions, and assumptions
Complexity of finding explanations
Complexity of plausibility
Application to red-cell antibody identification
Discussion
8 Two more diagnostic systems
Distributed abduction in MDX2
QUAWDS: Diagnostic system for gait analysis
An abductive approach to knowledge-base refinement
9 Better task definition, better strategy - machine 5
Tractable abduction
Software: PEIRCE-IGTT
Experiment: Uncertainty and correctness
10 Perception and language understanding
Perception is abduction in layers
Computational model of abduction in layers
Speech understanding as layered abduction
Three pilot speech recognition systems
Multisense perception
Knowledge from perception
Appendix A Truth seekers
Abduction machines
In synthetic worlds
Appendix B Plausibility
Plausibility and probability
The need to go beyond probability
Dimensions of plausibility
Alternatives to probability
Plausibility and intelligence
Extended Bibliography
Acknowledgments
Index
Contributors
Note: LAIR is Laboratory for Artificial Intelligence Research, Department of Computer and
Information Science, The Ohio State University, Columbus, Ohio 43210 USA
make them work. Still, at the time it was written, I wasn't sensitive to the
implications of taking an information-processing view of inference and in-
telligence. Since then Chandra's tutelage in AI and my experiences in de-
signing and building knowledge-based systems have significantly enriched
my view. They have especially sensitized me to the need to provide for fea-
sible computation in the face of bounded computational resources. To pro-
vide for feasible computation, a model of intelligence must provide for the
control of reasoning processes, and for the organization and representation
of knowledge.
When I joined the AI group at Ohio State (which later became the Labora-
tory for Artificial Intelligence Research, or LAIR), it was intensely studying
diagnosis and looking for the generic tasks hypothesized by Chandra to be
the computational building blocks of intelligence. Besides Chandra and me,
the AI group at the time consisted of Jack Smith, Dave Brown, Tom Bylander,
Jon Sticklen, Mike Tanner, and a few others. Working primarily with medi-
cal domains, the group had identified "hierarchical classification" as a cen-
tral task of diagnosis, had distinguished this sort of reasoning from "data
abstraction" and other types of reasoning that enter into diagnostic problem
solving, and was trying to push the limits of this view by attempting new
knowledge domains.
My first major project with the AI group in 1983 was a collaboration with
Jack Smith, MD, and others, on the design and construction of a knowledge-
based system (called RED) for an antibody-identification task performed
repeatedly by humans in hospital blood banks. The task requires the forma-
tion of a composite mini-theory for each particular case that describes the
red-cell antibodies present in a patient's blood. Our goal was to study the
problem-solving activity of an expert and to capture in a computer program
enough of the expert's knowledge and reasoning strategy to achieve good
performance on test cases. Preenumerating all possible antibody combina-
tions would have been possible (barely) but this was forbidden because such
a solution would not scale up.
The reasoning processes that we were trying to capture turned out to in-
clude a form of best-explanation reasoning. Before long it became clear that
classification was not enough to do justice to the problem, that some way of
controlling the formation of multipart hypotheses was needed. This led me
to design what we now call the RED-1 hypothesis-assembly algorithm. We
then built RED-1, a successful working system with a novel architecture for
hypothesis formation and criticism. RED-1's successor, RED-2, was widely
demonstrated and was described in a number of papers. Jack Smith (already
an MD) wrote his doctoral dissertation in computer science on the RED
work.
The RED systems show that abduction can indeed be made precise enough
to be a usable notion and, in fact, precise enough to be programmed. These
systems work well and give objectively good answers, even in complicated
cases where the evidence is ambiguous. They do not manipulate numerical
probabilities, follow deductive inference rules, or generalize from experi-
ence. This strongly reinforces the argument that abduction is a distinct form
of inference, interesting in its own right.
RED-1 was the first, and RED-2 the second, of six generations of abductive-
assembly mechanisms that we designed and with which we experimented.
In the following chapters the evolution of these machines is traced as they
grew in power and sophistication. They were all intended as domain-inde-
pendent abductive problem solvers, embodying inference-control strategies
with some pretensions of generality. One design for parallel hypothesis as-
sembly was never implemented, but each of the other five mechanisms was
implemented, at least partially, and a fair amount of experience was built up
in our lab with abductive problem solving.
In the PEIRCE project (named after Charles Sanders Peirce) we made
generalizations and improvements to RED-2's hypothesis-assembly mechanism.
PEIRCE is a domain-independent software tool for building knowledge-based
systems that form composite explanatory hypotheses as part of
the problem-solving process. PEIRCE has various hypothesis-improvement
tactics built in and allows the knowledge-system builder to specify strate-
gies for mixing these tactics. Members of our group also designed and built
other abductive systems, including MDX2 by Jon Sticklen, TIPS (Task Inte-
grated Problem Solver) by Bill Punch, and QUAWDS (Qualitative Analysis
of Walking Disorders) by Tom Bylander and Mike Weintraub. Other discov-
eries were made in collaboration with Mike Tanner, Dean Allemang, Ashok
Goel, Todd Johnson, Olivier Fischer, Matt DeJongh, Richard Fox, Susan
Korda, and Irene Ku. Most of the abduction work has been for diagnosis in
medical and mechanical domains, but more recently, in collaboration with
several speech scientists and linguists here at Ohio State, we have begun to
work on layered-abduction models of speech recognition and understanding.
Susan Josephson has been my mate and a stimulating intellectual com-
panion throughout my adult life. When the project of editing this book bogged
down in the summer of 1990, Susan agreed to take the lead and set aside for
a time her project of writing a book on the philosophy of AI. The present
book is a result of our collaboration and consists of a deeply edited collec-
tion of LAIR writings on abduction by various authors. I take responsibility
for the major editorial decisions, especially the controversial ones. Susan is
responsible for transforming a scattered set of material into a unified narra-
tive and sustained argument, and she produced the first draft. Many voices
blend in the text that follows, although mine is the most common. Authors
whose work is included here should not be presumed to agree with all con-
clusions.
In chapter 1 we set the stage with a careful discussion of abduction and
some of its relationships with other traditionally recognized forms of infer-
ence. This is followed in chapter 2 by an orientation to our view of AI as a
science and to our approach to building knowledge systems. The remainder
of the book traces the development of six generations of abduction machines
and describes some of the discoveries that we made about the dynamic logic
of abduction.
Conceptual analysis of abduction
What is abduction?
Abduction, or inference to the best explanation, is a form of inference that
goes from data describing something to a hypothesis that best explains or
accounts for the data. Thus abduction is a kind of theory-forming or interpretive
inference. The philosopher and logician Charles Sanders Peirce (1839-1914)
contended that there occurs in science and in everyday life a distinctive
pattern of reasoning wherein explanatory hypotheses are formed and
accepted. He called this kind of reasoning "abduction."
In their popular textbook on artificial intelligence (AI), Charniak and
McDermott (1985) characterize abduction variously as modus ponens turned
backward, inferring the cause of something, generation of explanations for
what we see around us, and inference to the best explanation. They write
that medical diagnosis, story understanding, vision, and understanding natural
language are all abductive processes. Philosophers have written of "infer-
ence to the best explanation" (Harman, 1965) and "the explanatory infer-
ence" (Lycan, 1988). Psychologists have found "explanation-based" evidence
evaluation in the decision-making processes of juries in law courts
(Pennington & Hastie, 1988).
We take abduction to be a distinctive kind of inference that follows this
pattern pretty nearly:1
The core idea is that a body of data provides evidence for a hypothesis that
satisfactorily explains or accounts for that data (or at least it provides evi-
dence if the hypothesis is better than explanatory alternatives).
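The pattern just described can be illustrated with a small sketch. It is not a description of any of the abduction machines discussed later in the book; the class `Hypothesis`, the function `best_explanation`, the numerical plausibility scores, and the `margin` threshold are all assumptions introduced here for illustration. The sketch accepts a hypothesis only if it accounts for all of the data and is clearly better than the rival explanations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    explains: frozenset   # data items this hypothesis would account for, if true
    plausibility: float   # illustrative score in [0, 1]; an assumption, not a probability


def best_explanation(data, hypotheses, margin=0.2):
    """Return the hypothesis that explains all of `data` and beats every
    rival explainer by at least `margin`; return None if no hypothesis
    stands out clearly enough to be accepted."""
    data = frozenset(data)
    # Keep only the hypotheses that would explain every datum.
    explainers = [h for h in hypotheses if data <= h.explains]
    if not explainers:
        return None
    ranked = sorted(explainers, key=lambda h: h.plausibility, reverse=True)
    best = ranked[0]
    # Accept only if the best explainer is decisively better than the runner-up.
    if len(ranked) > 1 and best.plausibility - ranked[1].plausibility < margin:
        return None
    return best


# Hypothetical case data; the scores are invented for the example.
data = {"jaundice"}
hypotheses = [
    Hypothesis("hepatitis", frozenset({"jaundice", "fatigue"}), 0.8),
    Hypothesis("gallstones", frozenset({"jaundice"}), 0.3),
    Hypothesis("influenza", frozenset({"fever", "fatigue"}), 0.6),
]
result = best_explanation(data, hypotheses)
print(result.name if result else "no clear best explanation")
```

Returning `None` when the leading hypothesis is not decisively ahead reflects the parenthetical qualification above: the data support a hypothesis only to the extent that it is better than the explanatory alternatives.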
This chapter was written by John R. Josephson, except the second section on diagnosis,
which was written by Michael C. Tanner and John R. Josephson.
Abductions appear everywhere in the un-self-conscious reasonings, interpretations,
and perceivings of ordinary life and in the more critically self-aware
reasonings upon which scientific theories are based. Sometimes abductions
are deliberate, such as when the physician, or the mechanic, or the
scientist, or the detective forms hypotheses explicitly and evaluates them to
find the best explanation. Sometimes abductions are more perceptual, such
as when we separate foreground from background planes in a scene, thereby
making sense of the disparities between the images formed from the two
eyes, or when we understand the meaning of a sentence and thereby explain
the presence and order of the words.
Abduction in science
Abductions are common in scientific reasoning on large and small scales. 2
The persuasiveness of Newton's theory of gravitation was enhanced by its
ability to explain not only the motion of the planets, but also the occurrence
of the tides. In On the Origin of Species by Means of Natural Selection
Darwin presented what amounts to an extended argument for natural selec-
tion as the best hypothesis for explaining the biological and fossil evidence
at hand. Harman (1965) again: when a scientist infers the existence of atoms
and subatomic particles, she is inferring the truth of an explanation for her
various data. Science News (Peterson, 1990) reported the attempts of as-
tronomers to explain a spectacular burst of X rays from the globular cluster
M15 on the edge of the Milky Way. In this case the inability of the scientists
to come up with a satisfactory explanation cast doubt on how well astrono-
mers understand what happens when a neutron star accretes matter from an
orbiting companion star. Science News (Monastersky, 1990) reported attempts
to explain certain irregular blocks of black rock containing fossilized plant
matter. The best explanation appears to be that they are dinosaur feces.
P or Q or R or S or . . .
But not-Q, not-R, not-S, . . .
Therefore, P.
Ampliative inference
Like inductive generalizations, abductions are ampliative inferences; that is, at
the end of an abductive process, having accepted a best explanation, we may
have more information than we had before. The abduction transcends the infor-
mation of its premises and generates new information that was not previously
encoded there at all. This can be contrasted with deductions, which can be thought
of as extracting, explicitly in their conclusions, information that was already
implicitly contained in the premises. Deductions are truth preserving, whereas
successful abductions may be said to be truth producing.
This ampliative reasoning is sometimes done by introducing new vocabu-
lary in the conclusion. For example, when we abduce that the patient has
hepatitis because hepatitis is the only plausible way to explain the jaundice,
we have introduced into the conclusion a new term, "hepatitis," which is
from the vocabulary of diseases and not part of the vocabulary of symptoms.
By introducing this term, we make conceptual connections with the typical
progress of the disease, and ways to treat it, that were unavailable before.
Whereas valid deductive inferences cannot contain terms in their conclu-
sions that do not occur in their premises, abductions can "interpret" the given
data in a new vocabulary. Abductions can thus make the leap from "observa-
tion language" to "theory language."
D is a collection of data.
H explains D.
No other hypothesis can explain D as well as H does.
Therefore, H is probably true.
Emergent certainty
Abductions often display emergent certainty; that is, the conclusion of an
abduction can have, and be deserving of, more certainty than any of its pre-
mises. This is unlike a deduction, which is no stronger than the weakest of
its links (although separate deductions can converge for parallel support).
For example, I may be more sure of the bear's hostile intent than of any of the
details of its hostile gestures; I may be more sure of the meaning of the sentence
than of my initial identifications of any of the words; I may be more sure of the
overall theory than of the reliability of any single experiment on which it is
based. Patterns emerge from individual points where no single point is essential
to recognizing the pattern. A signal extracted and reconstructed from a noisy
channel may lead to a message, the wording of which, or even more, the intent
of which, is more certain than any of its parts.
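The noisy-channel remark can be made concrete with a small calculation. The sketch below is only an analogy, under assumptions that are not in the text: each reading of a signal is correct with the same probability, errors are independent, and the message is reconstructed by majority vote. The function name `majority_correct` is introduced here for illustration.

```python
from math import comb


def majority_correct(n, p):
    """Probability that a strict majority of n independent readings is
    correct when each reading is correct with probability p (n odd)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))


p_single = 0.8   # assumed reliability of any single reading
for n in (1, 5, 15):
    print(n, round(majority_correct(n, p_single), 4))
```

Under these assumptions, five readings that are each only 80% reliable yield a reconstructed message that is about 94% reliable, so the whole is more certain than any of its parts.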
This can be contrasted with traditional empiricist epistemology, which
does not allow for anything to be more certain than the observations (except
maybe tautologies) since everything is supposedly built up from the obser-
vations by deduction and inductive generalization. But a pure generaliza-
tion is always somewhat risky, and its conclusion is less certain than its
premises. "All goats are smelly" is less certain than any given "This goat is
smelly." With only deductive logic and generalization available, empirical
knowledge appears as a pyramid whose base is particular experiments or
sense perceptions, and where the farther up you go, the more general you
get, and the less certain. Thus, without some form of certainty-increasing
inference, such as abduction, traditional empiricist epistemology is unavoid-
ably committed to a high degree of skepticism about all general theories of
science.
Induction
Peirce's view was that induction, deduction, and abduction are three distinct
types of inference, although as his views developed, the boundaries shifted
somewhat, and he occasionally introduced hybrid forms such as "abductive
induction" (Peirce, 1903). In this section I hope to clear up the confusion
about the relationship of abduction to induction. First I argue that inductive
generalizations can be insightfully analyzed as special cases of abductions.
I also argue that predictions are a distinctive form of inference, that they are
not abductions, and that they are sometimes deductive, but typically not.
The result is a new classification of basic inference types.
Harman (1965) argued that "inference to the best explanation" (i.e., ab-
duction) is the basic form of nondeductive inference, subsuming "enumera-
tive induction" and all other forms of nondeductive inferences as special
cases. Harman argued quite convincingly that abduction subsumes sample-
to-population inferences (i.e., inductive generalizations [this is my way of
putting the matter]). The weakness of his overall argument was that other
forms of nondeductive inference are not seemingly subsumed by abduction,
most notably population-to-sample inferences, a kind of prediction. The main
problem is that the conclusion of a prediction does not explain anything, so
the inference cannot be an inference to a best explanation.
This last point, and others, were taken up by Ennis (1968). In his reply to
Ennis, instead of treating predictions as deductive, or admitting them as a
distinctive form of inference not reducible to abduction, Harman took the
dubious path of trying to absorb predictions, along with a quite reasonable
idea of abductions, into the larger, vaguer, and less reasonable notion of
"maximizing explanatory coherence" (Harman, 1968). In this I think Harman
made a big mistake, and it will be my job to repair and defend Harman's
original arguments, which were basically sound, although they proved some-
what less than he thought.
Inductive generalization
First, I will argue that it is possible to treat every good (i.e., reasonable,
valid) inductive generalization as an instance of abduction. An inductive
generalization is an inference that goes from the characteristics of some
observed sample of individuals to a conclusion about the distribution of those
characteristics in some larger population. As Harman pointed out, it is use-
ful to describe inductive generalizations as abductions because it helps to
make clear when the inferences are warranted. Consider the following inference:
All observed A's are B's.
Therefore, all A's are B's.
This inference is warranted, Harman (1965) writes, ". . . whenever the hypothesis
that all A's are B's is (in the light of all the evidence) a better,
simpler, more plausible (and so forth) hypothesis than is the hypothesis, say,
that someone is biasing the observed sample in order to make us think that
all A's are B's. On the other hand, as soon as the total evidence makes some
other competing hypothesis plausible, one may not infer from the past cor-
relation in the observed sample to a complete correlation in the total popu-
lation."
If this is indeed an abductive inference, then "All A's are B's" should explain
"All observed A's are B's." But, "All A's are B's" does not seem to
explain why "This A is a B" or why A and B are regularly associated (as
pointed out by Ennis, 1968). Furthermore, I suggested earlier that explana-
tions give causes, but it is hard to see how a general fact could explain its
instances, because it does not seem in any way to cause them.
The story becomes much clearer if we distinguish between an event of
observing some fact and the fact observed. What the general statement in
the conclusion explains is the events of observing, not the facts observed.
For example, suppose I choose a ball at random (arbitrarily) from a large hat
containing colored balls. The ball I choose is red. Does the fact that all of
the balls in the hat are red explain why this particular ball is red? No. But it
does explain why, when I chose a ball at random, it turned out to be a red
one (because they all are). "All A's are B's" cannot explain why "This A is a
B" because it does not say anything at all about how its being an A is connected
with its being a B. The information that "they all are" does not tell
me anything about why this one is, except it suggests that if I want to know
why this one is, I would do well to figure out why they all are.
A generalization helps to explain the events of observing its instances,
but it does not explain the instances themselves. That the cloudless, daytime
sky is blue helps explain why, when I look up, I see the sky to be blue (but it
doesn't explain why the sky is blue). The truth of "Theodore reads ethics
books a lot" helps to explain why, so often when I have seen him, he has
been reading an ethics book (but it doesn't explain why he was reading eth-
ics books on those occasions). Seen this way, inductive generalization does
have the form of an inference whose conclusion explains its premises.
Generally, we can say that the frequencies in the larger population, together
with the frequency-relevant characteristics of the method for drawing
a sample, explain the frequencies in the observed sample. In particular,
"A's are mostly B's" together with "This sample of A's was drawn without
regard to whether or not they were B's" explain why the A's that were drawn
were mostly B's.
Why were 61% of the chosen balls yellow?
Because the balls were chosen more or less randomly from a population that was
two thirds yellow (the difference from 2/3 in the sample being due to chance).
Alternative explanation for the same observation:
Because the balls were chosen by a selector with a bias for large balls from a population
that was only one third yellow but where yellow balls tend to be larger than
nonyellow ones.
How do these explain? By giving a causal story.
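A worked calculation shows how the rival stories fare against the same observation. The sample size is not given in the text, so the sketch assumes 61 yellow balls in 100 draws. It also assumes that a biased selector picks balls with probability proportional to their size and that draws are independent. The names `binom_pmf` and `p_yellow_drawn`, and the size ratio of 3, are hypothetical choices made for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def p_yellow_drawn(pop_yellow, size_ratio):
    """Chance that a draw is yellow when the selector picks balls with
    probability proportional to size and yellow balls are `size_ratio`
    times as large as the others (size_ratio=1 means unbiased)."""
    w = pop_yellow * size_ratio
    return w / (w + (1 - pop_yellow))


n, k = 100, 61   # assumed sample: 61 of 100 drawn balls are yellow

stories = {
    "unbiased draw, 2/3 yellow": p_yellow_drawn(2 / 3, 1),
    "unbiased draw, 1/3 yellow": p_yellow_drawn(1 / 3, 1),
    "size-biased draw, 1/3 yellow, yellow 3x larger": p_yellow_drawn(1 / 3, 3),
}
likelihood = {name: binom_pmf(k, n, p) for name, p in stories.items()}
for name, lik in likelihood.items():
    print(f"{name}: P(61 of 100 yellow) = {lik:.3g}")
```

With these assumed numbers, an unbiased draw from a population that is one third yellow makes the observation wildly improbable, whereas the size-biased story accounts for it about as well as the unbiased two-thirds story does. The sample frequency alone cannot decide between those two; independent evidence about the sampling method is what settles the matter.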
What is explained is (always) some aspect of an event/being/state, not a
whole event/being/state itself. In this example just the frequency of charac-
teristics in the sample is explained, not why these particular balls are yellow
or why the experiment was conducted on Tuesday. The explanation explains
why the sample frequency was the way it was, rather than having some mark-
edly different value. In general, if there is a deviation in the sample from
what you would expect, given the population and the sampling method, then
you have to throw some Chance into the explanation (which is more or less
plausible depending on how much chance you have to suppose).12
The objects of explanation - what explanations explain - are facts about
the world (more precisely, always an aspect of a fact, under a description).
Observations are facts; that is, an observation having the characteristics that
it does is a fact. When you explain observed samples, an interesting thing is
to explain the frequencies. A proper explanation will give a causal story of
how the frequencies came to be the way they were and will typically refer
both to the population frequency and the method of drawing the samples.
Unbiased sampling processes tend to produce representative outcomes;
biased sampling processes tend to produce unrepresentative outcomes. This
"tending to produce" is causal and supports explanation and prediction. A
peculiarity is that characterizing a sample as "representative" is character-
izing the effect (sample frequency) by reference to part of its cause (popula-
tion frequency). Straight inductive generalization is equivalent to conclud-
ing that a sample is representative, which is a conclusion about its cause.
This inference depends partly on evidence or presumption that the sampling
process is (close enough to) unbiased. The unbiased sampling process is part
of the explanation of the sample frequency, and any independent evidence
for or against unbiased sampling bears on its plausibility as part of the ex-
planation.
If we do not think of inductive generalization as abduction, we are at a
loss to explain why such an inference is made stronger or more warranted, if
in collecting data we make a systematic search for counter-instances and
cannot find any, than it would be if we just take the observations passively.
Why is the generalization made stronger by making an effort to examine a
wide variety of types of A's? The inference is made stronger because the
failure of the active search for counter-instances tends to rule out various
hypotheses about ways in which the sample might be biased.
In fact the whole notion of a "controlled experiment" is covertly based on
abduction. What is being "controlled for" is always an alternative way of
explaining the outcome. For example, a placebo-controlled test of the efficacy
of a drug is designed to make it possible to rule out purely psychological
explanations for any favorable outcome.
Even the question of sample size for inductive generalization can be seen
clearly from an abductive perspective. Suppose that on each of the only two
occasions when Konrad ate pizza at Mario's Pizza Shop, he had a stomach-
ache the next morning. In general, Konrad has a stomachache occasionally
but not frequently. What may we conclude about the relationship between the
pizza and the stomachache? What may we reasonably predict about the outcome
of Konrad's next visit to Mario's? Nothing. The sample is not large
enough. Now suppose that Konrad continues patronizing Mario's and that after
every one of 79 subsequent trips he has a stomachache within 12 hours. What
may we conclude about the relationship between Mario's pizza and Konrad's
stomachache? That Mario's pizza makes Konrad have stomachaches. We may
predict that Konrad will have a stomachache after his next visit, too.
A good way to understand what is occurring in this example is by way of
abduction. After Konrad's first two visits we could not conclude anything
because we did not have enough evidence to distinguish between the two
competing general hypotheses:
1. The correlation between eating pizza and stomachache was accidental (i.e., merely
coincidental or spurious [say, for example, that on the first visit the stomachache
was caused by a virus contracted elsewhere and that on the second visit it was
caused by an argument with his mother]).
2. There is some connection between eating pizza and the subsequent stomachache
(i.e., there is some causal explanation of why he gets a stomachache
after eating the pizza [e.g., Konrad is allergic to the snake oil in Mario's
Special Sauce]).
By the time we note the outcome of Konrad's 79th visit, we are able to
decide in favor of the second hypothesis. The best explanation of the corre-
lation has become the hypothesis of a causal connection because explaining
the correlation as accidental becomes rapidly less and less plausible the longer
the association continues.
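The rate at which the coincidence hypothesis loses plausibility can be sketched with simple arithmetic. The text says only that Konrad has a stomachache "occasionally but not frequently," so the base rate of 1 day in 10 is an assumption, as are the independence of days and the function name `coincidence_probability`.

```python
def coincidence_probability(base_rate, visits):
    """Chance that a stomachache follows every one of `visits` pizza
    meals purely by accident, assuming independent days."""
    return base_rate ** visits


base_rate = 0.1   # assumed: a stomachache on about 1 day in 10
for visits in (2, 2 + 79):
    print(visits, coincidence_probability(base_rate, visits))
```

On these assumptions an accidental run of two has a chance of about 1 in 100, which is common enough to remain a live rival, while an accidental run spanning all 81 visits has a chance on the order of 10^-81 and can no longer compete with the hypothesis of a causal connection.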
Prediction
Another inference form that has often been called "induction" is given by
the following:
and
Therefore, approximately m/n of the A's in the next sample will be B's.
Therefore, S is a B.
and
[Figure: data space.]
[Figure: Taxonomy of basic inference types. Panels: Peirce's taxonomy; new taxonomy, one dimension (inference: deduction, other); new taxonomy, another dimension (inference: abduction, prediction, mixed; lower level: inductive generalization, statistical syllogism, deductive prediction, inductive projection).]
Notes
1 This formulation is largely due to William Lycan.
2 Thagard (1988) recognizes abduction in his analysis of scientific theory formation.
3 For example, Darden (1991) describes the modularity of genetic theory.
4 The remainder of this chapter is more philosophical. It is not necessary to accept everything in
order to find value in the rest of the book. The next chapter includes an orientation to our ap-
proach to AI and a discussion of representational issues. The main computational treatment of
abduction begins in chapter 3. The philosophy increases again at the end of the book.
5 Thus abductions have a way of turning negative evidence against some hypotheses into positive
evidence for alternative explanations.
6 This condition shows most clearly why the process of discovery is a logical matter, and why logic
cannot simply be confined to matters of justification. The "logic of justification" cannot be neatly
separated from the "logic of discovery" because justification depends, in part, on evaluating the
quality of the discovery process.
7 Suggested by William Lycan.
8 I am ignoring here the so-called "Gettier problem" with these conditions, but see Harman (1965)
for an argument that successful abductions resolve the Gettier problem anyway.
9 For a brief summary of deductive and other models of explanation see Bhaskar (1981), and for a
history of recent philosophical accounts of explanation, see Salmon (1990).
10 For a well-developed historical account of the connections between ideas of causality and expla-
nation see Wallace (1972, 1974). Ideas of causality and explanation have been intimately linked
for a very long time.
11 What the types of causation and causal explanation are remains unsettled, despite Aristotle's best
efforts and those of many other thinkers. The point here is that a narrow view of causation makes
progress harder by obscuring the degree to which all forms of causal thinking are fundamentally
similar.
12 "It is embarrassing to invoke such a wildly unlikely event as a chance encounter between the
entry probe and a rare and geographically confined methane plume, but so far we have elimi-
nated all other plausible explanations" (Planetary scientist Thomas M. Donahue of the Univer-
sity of Michigan on the analysis of chemical data from a Pioneer probe parachuted onto the
planet Venus, reported in Science News for Sept. 12, 1992).
13 Note that this analysis suggests an explanation for why traditional mathematical logic has been
so remarkably unsuccessful in accounting for reasoning outside mathematics and the highly
mathematical sciences. The universal quantifier of logic is not the universal quantifier of ordi-
nary life, or even of ordinary scientific thought.
14 I have not put likelihood qualifiers in the conclusions of any of these forms because doing so would
at best postpone the deductive gap.
Knowledge-based systems and the science of AI
The science of AI
The field of artificial intelligence (AI) seems scattered and disunited with
several competing paradigms. One major controversy is between proponents
of symbolic AI (which represents information as discrete codes) and propo-
nents of connectionism (which represents information as weighted connec-
tions between simple processing units in a network). Even within each of
these approaches there is no clear orthodoxy. Another concern is whether AI
is an engineering discipline or a science. This expresses an uncertainty about
the basic nature of AI as well as an uncertainty about methodology. If AI is
a science like physics, then an AI program is an experiment. As experiments,
perhaps AI programs should be judged by the standards of experiments. They
should be clearly helpful in confirming and falsifying theories, in determin-
ing specific constants, or in uncovering new facts. However, if AI is funda-
mentally engineering, AI programs are artifacts, technologies to be used. In
this case, there is no such reason for programs to have clear confirming or
falsifying relationships to theories. A result in AI would then be something
practical, a technique that could be exported to a real-world domain and
used. Thus, there is confusion about how results in AI should be judged,
what the role of a program is, and what counts as progress in AI.
It has often been said that the plurality of approaches and standards in
AI is the result of the extreme youth of AI as an intellectual discipline. This
theory implies that with time the pluralism of AI will sort itself out into a
normal science under a single paradigm. I suggest, instead, that as it ages
AI will continue to include diverse and opposing theories and methods.
This is because the reason for the pluralism is that programs are at the heart
of AI, and programs can be approached in four fundamentally different
ways: (1) An AI program can be treated as a technology to be used to solve
practical problems (AI as engineering); (2) it can be treated as an experi-
ment or model (AI as traditional science); (3) an AI program can also be
treated as a real intelligence, either imitative of human intelligence (AI as
art) or (4) non-imitative (AI as design science). Because these four ways
to treat an AI program exist, the field of AI is, and will remain, diverse.
The first section of this chapter on the science of AI is by Susan G. Josephson; the
remaining sections are by B. Chandrasekaran.
AI as engineering
When an AI program is treated as a technology to be used to solve practical
problems, this is AI as engineering (of course, some programs are proof-of-
principle prototypes with no immediate practicality, even though ultimate
practicality is the goal). In AI as engineering, technology is used to solve
real-world problems. The difference between AI programs that are only en-
gineering and those that can be thought of as doing traditional science is
whether there are any clearly distinguishable phenomena being investigated.
That is, if AI is like traditional science, we expect to find theories about the
nature of some phenomena being tested and confirmed; whereas, when pro-
grams are treated as engineering, the emphasis is on the accomplishment of
practical tasks.
The engineering approach is important for the discipline as a whole, be-
cause it spreads AI through the culture. It applies AI technology to problems
in medicine, commerce, and industry; as a result, it creates more uses for AI
technology and stimulates the development of AI. When programs are treated
as engineering, the emphasis is on practical technology, but it also creates a
demand that supports other conceptions of AI.
AI as traditional science
When AI programs are thought of as experiments or models, this is AI as
traditional science. In traditional-science AI, the theories being tested are
about the phenomena of natural intelligence. Such a theory is tested by
building it into a program that is run on a computer. If AI is like tradi-
tional science, running such a program is equivalent to performing an ex-
periment and helps prove one theory to be right and inconsistent theories
to be wrong. For example, Newell and Simon (1976) conceive of AI as de-
riving and empirically testing universal laws about intelligence. Each new
machine and each new program, they explain, is an experiment. They state
the physical symbol system hypothesis: "A physical symbol system has the
necessary and sufficient means for general intelligent action" (Newell and
Simon 1976, p. 41). This hypothesis is proven right when programs based
on it work well - that is, perform "intelligently" for a range of cases in a
computationally efficient manner. This hypothesis is proven wrong if gen-
eral intelligent actions can be produced by a mechanism that is not a physi-
cal symbol system.
However, the standards normally used to judge AI programs (such as com-
putational efficiency, requiring knowledge that is actually available, scaling
up, working on a wide range of cases, and so on) judge programs indepen-
dently of whether they provide good theories of natural intelligence. Two
working programs that have radically different underlying theories might
both be judged to have the same degree of adequacy by these standards.
Thus, having a well-designed working program is not sufficient to confirm
its underlying theory as anything but a candidate theory of some cognitive
phenomena, nor is it sufficient to differentiate between competing theories.
To confirm or falsify a theory in AI as a correct theory of intelligence, we
need to judge programs embodying that theory by how well they match up
against the phenomena of intelligence being investigated. Given that our
only sure source of data about intelligence comes from human intelligent
behavior, such a program would be judged by how well it simulates human
behavior. However, it is not sufficient for a program merely to simulate the
input-output behavior of humans; it is quite possible for a program to simu-
late human behavior by using mechanisms that are nothing like those used
by humans (Pylyshyn, 1979). So for a program to be judged a good model of
intelligence, the underlying mechanism must have plausibility as a model of
the causal mechanisms that actually produce the intelligent behavior in hu-
mans. These concerns about how humans do things lead us away from AI
and toward cognitive psychology.
Cognitive psychology directly studies human cognitive behavior. From
the perspective of AI as traditional science, AI and cognitive psychology
study the same thing: human cognitive behavior. However, AI and cognitive
psychology approach the problem of cognition in different ways. Psychol-
ogy supplies the evidence that confirms or falsifies models in AI, and psy-
chology supplies the domain that AI models are about. Cognitive psychol-
ogy seeks the causal mechanisms that give rise to human cognitive behav-
ior. Cognitive psychology can regard AI theories as possible models to be
tested and refined by comparing them with observations of human behavior.
Indeed, AI contributes best to cognitive psychology when a multiplicity of
competing AI theories acts as a pool of hypotheses from which to choose.
AI theorists make hypotheses about the phenomena of intelligence and
then work them out in computational models refined by being run on com-
puters. Within that context, AI theorists can study these computational mod-
els as mathematics or as computer systems. The approach of AI is to under-
stand the artifactual models themselves, whereas the approach of cognitive
psychology is to see them as models of human behavior and to use them to
study human behavior. Criteria such as feasibility of computation still show
one AI program as better than another, but it is evidence about human be-
ings that rules out some models as candidates for psychological theories.
Practicing AI in this way means that AI models are unlike theories in phys-
ics, say, where science is conducted within a single theoretical framework.
Within cognitive psychology there might be this sort of science, but not
within AI. Within traditional-science AI there will continue to be a plurality
of theoretical frameworks.
AI as art
When AI programs are thought of, not as models or simulations but as real
intelligences that actually understand and have other humanlike cognitive
states, then the standard is verisimilitude, and AI becomes art. This AI as art
is often called strong AI (Searle, 1980).
The comparison between AI and art is not arbitrary. Both concern crafting
artifacts designed to have some imitative appearance. Realistic statues, such
as those of Duane Hanson, are imitations of the physical appearance of hu-
mans, whereas an AI program, such as ELIZA or PARRY, is an imitation of
the appearance of human thinking. Art helps us see ourselves. Machines with
humanlike mentalities help us see our mental selves. Further, just as realis-
tic sculptures give the appearance of humanness - not by using the true causal
mechanism, human flesh - the AI-as-art systems give the appearance of human
mentality, not by using the same causal mechanisms that humans use,
but by using whatever gives the appearance of mentality.
The standards for success of AI-as-art systems, such as passing the Turing
test, require that the programs imitate interesting characteristics of humans,
not that they present plausible models of human mental mechanisms. An AI
program passes the Turing test if it deceives someone into believing that it
is a human being, not by solving difficult problems or by being useful tech-
nology. This imitation involves deception and illusion: the machines are
designed to appear to have human mentalities, often even seeming to have
emotions and personal lives. The goal of AI as art is to make machines that
have, or appear to have, a humanlike independent intelligence. This imita-
tive AI looks unsystematic and ad hoc from a scientific perspective.
Some theorists believe that the objective of trying to give machines hu-
manlike intelligence is inherently misguided. They object to the claim that a
machine that imitates perfectly the appearance of a humanlike mentality
can be said to actually have a humanlike mentality. They say that a purely
syntactic machine cannot think in the sense that humans do, cannot be given
the sort of commonsense knowledge that humans have, and cannot learn in
the way that humans do. Consequently, a machine can never really be intel-
ligent in the way that humans are.
The AI that these theorists find so misguided is AI as art. Like other good
art, this sort of AI is a stimulus for discussions of human nature and the
human condition. AI as art serves as a focal point for criticisms of the very
project of making thinking machines. That there are such criticisms shows
the vitality of this sort of AI. However, these criticisms are completely inap-
plicable to AI as science and engineering. Whether machines can have ac-
tual humanlike cognitive states is irrelevant to modeling human intelligence
in cooperation with cognitive psychology, to building information-process-
ing systems, and to using AI technology in business and industry.
AI as design science
When an AI program is viewed as a real intelligence - not as an imitation of
human intelligence but as a cognitive artifact in its own right - this is AI as
design science. The information-processing-task approach of David Marr
and the generic-task approach described later in this chapter view programs
in this way. Design-science AI is similar to engineering in that it makes
working systems that are important for what they can do, but, unlike AI as
engineering, design-science AI is practiced with the intent of discovering
the principles behind cognition. Although humans are our primary examples
of cognitive agents, the principles sought by design-science AI are those
that hold for cognition in the abstract, where humans are just one set of
examples of cognitive agents and silicon-based machines are another.
Design-science AI is a synthesis of AI as art and AI as science. It is simi-
lar to AI as art in that it involves designing programs that have functions
independent of modeling humans. The systems it produces are like strong AI
systems in being cognitive systems in their own right rather than mere mod-
els. However, unlike AI as art, design-science AI is not interested in the
mere appearance of intelligence but in the causal mechanisms that produce
intelligent behavior. Design-science AI is similar to traditional science in its
concern for making systems that show us new things about the phenomena
of intelligence, such as reasoning and problem solving.
Summary
AI as engineering uses discovered techniques for solving real-world prob-
lems. AI as art explores the imitative possibilities of machines and the depths
of the analogy between humans and machines. By doing this, AI as art acts
as a focus for attacks against the very project of making artifactual intelli-
gences. There are two ways to practice AI as science: One can practice tra-
ditional-science AI by making models on a computer of the causal mecha-
nisms that underlie human intelligent behavior, or one can practice design-
science AI by making computational theories and information-processing
systems that can be studied as instances of cognitive agents, thus pursuing
the study of cognition and intelligence in the abstract.
To the questions of whether AI is science or engineering, what the role of
a program is, and what counts as progress in AI, we have answered that there
are four different orientations to AI. AI is engineering, traditional science,
art, and design science. Programs play different roles for these different ori-
entations, and progress is different for each orientation. The goal of AI as
engineering is to develop technology so that we can be served by intelligent
machines in the workplace and in our homes. The goal of AI as traditional
science is to supply accurate models of human cognitive behavior, and thus
to understand the human by way of computer-based simulations. The goal of
AI as art is to make machines that are interesting imitations or caricatures of
human mentality so that we can learn and benefit from encountering them.
The goal of AI as design science is to discover the design principles that
govern the achievement of intellectual functioning, in part by making work-
ing systems that can be studied empirically.
The reason that AI seems scattered and disunited with competing paradigms
is that there are at least these four fundamentally different orientations
to the discipline. Within each of these orientations many legitimate
suborientations and specialties can coexist. As long as workers in AI follow
the standards that are appropriate for each orientation, this plurality is not
something that should be resisted or bemoaned because each approach serves
a clear and important function for the whole.
Connectionism
One of the major disputes in AI at the present time is between advocates of
"connectionism" and advocates of traditional or "symbolic" AI. Given this
controversy, it is worth spending a little time examining the differences be-
tween these two approaches.
Although connectionism as an AI theory comes in many different forms,
most models represent information in the form of weights of connections
between simple processing units in a network, and take information-pro-
cessing activity to consist in the changing of activation levels in the pro-
cessing units as a result of the activation levels of units to which they are
immediately connected. The spreading of activations is modulated by the
weights of connections between units. Connectionist theories especially
emphasize forms of learning that use continuous functions for adjusting the
weights in the network. In some connectionist theories this "pure" form is
mixed with symbol manipulation processes (see Smolensky, 1988). The es-
sential characteristic of a connectionist architecture, in contrast with a "sym-
bolic" architecture, is that the connectionist medium has no internal labels
that are interpreted, and no abstract forms that are instantiated, during pro-
cessing. Furthermore, the totality of weighted connections defines the infor-
mation content, rather than information being represented as a discrete code,
as it is in a "symbolic" information processor.
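The update scheme just described can be made concrete with a minimal sketch. The three-unit network, its weights, and the logistic squashing function below are all assumptions chosen for illustration; connectionist models differ widely in these details. The point is only that each unit's new activation depends on the units directly connected to it, modulated by the connection weights, and that nothing in the medium is a label to be interpreted.

```python
import math

def step(activations, weights):
    """One synchronous update of every unit in the network.

    weights[i][j] is the strength of the connection from unit i to unit j.
    Each unit sums the weighted activations arriving on its incoming
    connections and squashes the sum into the range (0, 1).
    """
    n = len(activations)
    updated = []
    for j in range(n):
        net_input = sum(weights[i][j] * activations[i] for i in range(n))
        updated.append(1.0 / (1.0 + math.exp(-net_input)))  # logistic function (assumed)
    return updated

# Hypothetical network: unit 0 excites unit 1 and inhibits unit 2.
weights = [
    [0.0, 2.0, -1.0],
    [0.0, 0.0, 1.5],
    [0.0, 0.0, 0.0],
]
activations = [1.0, 0.0, 0.0]
after_one = step(activations, weights)
```

Whatever information the network holds is carried by the numbers in `weights` taken together; no single entry stands for anything on its own.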
For some information-processing tasks the symbol and non-symbol ac-
counts differ fundamentally in their representational commitments. Consider
the problem of multiplying two integers. We are all familiar with algorithms to
perform this task. Some of us also still remember how to use the traditional
slide rule to do this multiplication: The multiplicands are represented by their
logarithms on a linear scale on a sliding bar, they are "added" by being aligned
end to end, and the product is obtained by reading the antilogarithm of the sum.
Although both the algorithmic and slide rule solutions are representational, in
no sense can either of them be thought of as an "implementation" of the other.
(Of course the slide rule solution could itself be simulated on a digital com-
puter, but this would not change the fundamental difference in representational
commitment in the two solutions.) Logarithms are represented in one case but
not in the other, so the two solutions make very different commitments about
what is represented. There are also striking differences between the two solu-
tions in practical terms. As the size of the multiplicands increases, the algorith-
mic solution suffers in the amount of time needed to complete the solution,
whereas the slide rule solution suffers in the amount of precision it can deliver.
Given a blackbox multiplier, as two theories of what actually occurs in the box,
the algorithmic solution and the slide rule solution make different ontological
commitments. The algorithmic solution to the multiplication problem is "sym-
bolic," whereas the slide rule solution is similar to connectionist models in be-
ing "analog" computations.
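The contrast between the two multipliers can be sketched in a few lines. Both functions below are illustrative assumptions: the first uses binary shift-and-add as a stand-in for the familiar paper algorithm, and the second imitates a slide rule by adding logarithms and reading the result to a fixed number of significant figures (the `digits` parameter is a hypothetical stand-in for the reading precision of a physical scale). As the text notes, simulating the slide rule on a digital computer does not change its representational commitments.

```python
import math

def multiply_algorithmic(a, b):
    """Shift-and-add on binary digits of non-negative integers.

    The answer is exact, but the number of loop iterations grows with
    the number of digits in b.
    """
    result = 0
    while b > 0:
        if b & 1:          # lowest binary digit of b is 1
            result += a
        a <<= 1            # double a
        b >>= 1            # drop the lowest digit of b
    return result

def multiply_slide_rule(a, b, digits=3):
    """Add the logarithms of two positive numbers, then read the antilogarithm.

    Only `digits` significant figures survive the reading, so precision
    (not time) is what suffers as the multiplicands grow.
    """
    length = math.log10(a) + math.log10(b)   # the two scales laid end to end
    exponent = math.floor(length)
    mantissa = round(10 ** (length - exponent), digits - 1)
    return mantissa * 10 ** exponent

exact = multiply_algorithmic(1234, 5678)
approx = multiply_slide_rule(1234, 5678)
```

Logarithms appear in the second function and nowhere in the first, which is the difference in what is represented.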
There is not enough space here for an adequate discussion of what makes
a symbol in the sense used when we speak of "symbolic" theories (see
Pylyshyn [1984] for a thorough and illuminating discussion of this topic).
However, the following points seem useful. There is a type-token distinc-
tion: Symbols are types about which abstract rules of behavior are known
and to which these rules can be applied. This leads to symbols being tokens
that are "interpreted" during information processing; there are no such in-
terpretations in the process of slide rule multiplication (except for input and
output). The symbol system can thus represent abstract forms, whereas the
slide rule solution performs its addition or multiplication, not by instantiat-
ing an abstract form, but by having, in some sense, all the additions and
multiplications directly in its architecture. As we can see from this, the rep-
resentational commitments of the symbolic and nonsymbolic paradigms are
different. Thus the connectionists are correct in claiming that connectionism
is not simply an implementation of symbolic theories.
However, connectionist and symbolic approaches are actually both real-
izations of a more abstract level of description, namely, the information-
processing level. Marr (1982) originated the method of information-process-
ing analysis as a way of conceptually separating the essential elements of a
theory from its implementational commitments. He proposed that the fol-
lowing methodology be adopted for this purpose. First, identify an informa-
tion-processing task with a clear specification about what kind of informa-
tion is available as input and what kind of information needs to be made
available as output. Then, specify a particular information-processing theory
for achieving this task by specifying what kinds of information need to be
represented at various stages in the processing. Actual algorithms and data
structures can then be proposed to implement the information-processing
theory on some underlying machine. These algorithms and data structures
will make additional representational commitments. For example, Marr speci-
fied that one task of vision is to take, as input, image intensities in a retinal
image, and to produce, as output, a three-dimensional shape description of
the objects in the scene. His theory of how this task is accomplished in the
visual system is that three distinct kinds of information need to be gener-
ated: (1) from the image intensities a "primal sketch" of significant inten-
sity changes - a kind of edge description of the scene - is generated; (2)
then a description of surfaces and their orientation, what he called a "2½-d
sketch," is produced from the primal sketch; and (3) finally a 3-d shape
description is generated.
Information-processing (IP) level abstractions (input, output, and the kinds
of information represented along the way) constitute the top-level content
of much AI theory making. The difference between IP theories in the sym-
bolic and connectionist paradigms is that the representations are treated as
symbols in the former, which permit abstract rules of compositions to be
invoked and instantiated, whereas in the latter, information is represented
more directly and affects the processing without undergoing any symbol
interpretation process. The decisions about which of the information trans-
formations are best done by means of connectionist networks, and which are
best done by using symbolic algorithms, can properly follow after the IP
level specification of the theory has been given. The hardest work of theory
making in AI will always remain at the level of proposing the right IP level
abstractions, because they provide the content of the representations by speci-
fying the kinds of information to be represented.
For complex tasks, the approaches of the symbolic and nonsymbolic para-
digms actually converge in their representational commitments. Simple func-
tions such as multiplication are so close to the architecture of the underlying
machine that we saw differences between the representational commitments
of the algorithmic and the slide rule solutions. In areas such as word recog-
nition the problem is sufficiently removed from the architectural level that
we can see macrosimilarities between symbolic and connectionist solutions;
for example, both solutions may make word recognition dependent on letter
recognition. The final performance will have micro-features that are charac-
teristic of the architecture (such as "smoothness of response" for connectionist
architectures). However, the more complex the task, the more common the
representational issues between connectionism and the symbolic paradigm.
There is no clear line of demarcation between the architecture-independent
theory and the architecture-dependent theory. It is partly an empirical issue
and partly depends on what primitive functions can be computed in a par-
ticular architecture. The farther away a problem is from the architecture's
primitive functions, the more architecture-independent analysis needs to be
done.
Certain kinds of retrieval and matching operations, and parameter learn-
ing by searching in local regions of a search space, are especially appropri-
ate primitive operations for connectionist architectures. (However, although
memory retrieval may have interesting connectionist components, the basic
problem will still remain the principles by which episodes are indexed and
stored.) On the other hand, much high-level thought has symbolic content
(see Pylyshyn [1984] for arguments that make this conclusion inescapable).
So one can imagine a division of responsibility between connectionist and
symbolic architectures along these lines in accounting for the phenomena of
intelligence.
Logic
In many circles some version of logic is thought to be the proper language
for characterizing all computation and, by extension, intelligence. By logic
is meant a variant of first-order predicate calculus, or at least a system where
the notion of truth-based semantics is central and inference making is char-
acterized by truth-preserving transformations. (Within AI, nonmonotonic
logic and default logics relax the truth-preserving requirement. The thrust
of the arguments here nevertheless remains valid.) The architecture of systems
using logical formalism for knowledge representation generally consists
of a knowledge base of logical formulas and an inference mechanism
that uses the formulas to derive conclusions.
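A minimal sketch of this two-part architecture follows. It is a deliberate simplification: the knowledge base holds propositional if-then rules rather than first-order formulas, and the inference mechanism is plain forward chaining by modus ponens. The facts and rules are hypothetical examples, not drawn from any system discussed here.

```python
def forward_chain(facts, rules):
    """Apply modus ponens repeatedly until nothing new can be derived.

    facts: an iterable of propositions taken to be true.
    rules: (premises, conclusion) pairs, read as
           "if every premise is true, the conclusion is true".
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known

# Hypothetical knowledge base.
rules = [
    (("bird",), "has_wings"),
    (("has_wings", "healthy"), "can_fly"),
]
derived = forward_chain({"bird", "healthy"}, rules)
```

The inference mechanism contains no domain knowledge; everything specific to the domain sits in `rules`.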
A number of theoretical advantages have been claimed for logic in AI,
including precision and the existence of a well-defined semantics. In tasks
where the inference chains are relatively shallow - where the discovery of
the solution does not involve search in a large search space - the logic rep-
resentation may be useful. Also, in logic-based systems it is possible to cre-
ate a subtasking structure (when the structure of a task allows it) that can
help to keep the complexity of inference low. In practical terms, however,
the existence of a rigorous semantics is not typically helpful since precision
is rarely the point. If one were to model, say, reasoning in arithmetic, one
could represent domain knowledge in the form of axioms and use a variety
of inference processes to derive new theorems. However, the computational
complexity of such systems would tend to be impractical - even for rela-
tively simple axiomatic systems - without the use of some clever (and ex-
tra-logical) "steerer" to guide the path of inference. Thus, even in domains
where powerful axiom systems exist, the problem of capturing the effective-
ness of human reasoning remains.
Logic as knowledge representation makes a serious commitment to knowl-
edge as propositions and to True/False judgments as the basic use of knowl-
edge. It is also closely connected to the belief that the aim of intelligence is
to draw correct conclusions. In this view, what human beings often do (e.g.,
draw plausible, useful, but strictly speaking logically incorrect conclusions)
is interesting as psychology, but shows humans only as approximations to
the ideal intelligent agent, whose aim is to be correct. Since at least the late
19th century, logical reasoning has been viewed as the real test of thought.
Witness the title of Boole's book, An Investigation of the Laws of Thought,
on Which Are Founded the Mathematical Theories of Logic and Probabilities
(1854). This has led to the almost unconscious equating of thought with
logical thought, and the attempt to seek in logic the language for represent-
ing and constructing the idealized agent. From this perspective the content
of consciousness includes a series of propositions, some of them beliefs. For
a certain kind of theorist, it seems entirely natural to model thought itself as
basically the manipulation of propositions and the generation of new ones.
From this view of thought, stream-of-consciousness imaginings, half-formed
ideas, remindings, vague sensations, ideas suddenly coming to occupy con-
sciousness from the depths of the mind, and so forth do not count as serious
subjects for the study of intelligence as such.
Is "strict truth" the right kind of interpretive framework for knowledge?
Or are notions of functional adequacy (i.e., knowledge that helps to get cer-
tain kinds of tasks done) or the related notions of plausibility, relevance,
likelihood, close-enough, and so forth, more relevant to capturing the way
agents actually use knowledge intelligently? My (Chandra's) 16-month-old
daughter, when shown a pear, said, "Apple!" Surely more than parental pride
makes me attribute a certain measure of intelligence to that remark, which,
when viewed strictly in terms of propositions, was untrue. After all, her
conclusion was more than adequate for the occasion - she could eat the pear
and get nourishment - whereas an equally false proposition, "It's a chair,"
would not give her similar advantages. Thus, viewing propositions in terms
of context-independent truth and falsity does not make all the distinctions
we need, as in this case between a false useful categorization and a false
useless one.
Some in the logic tradition say that before building, say, commonsense
reasoners, we should get all the ontology of commonsense reasoning right.
We should first analyze what words such as "know," "cause," and so forth
really mean before building a system. We should axiomatize commonsense
reality. But a use-independent analysis of knowledge is likely to make dis-
tinctions that are not needed for processing and is likely to miss many that
are. The logic tradition separates knowledge from its functions, and this leads
to missing important aspects of the form and content of the knowledge that
needs to be represented.
Furthermore, laws of justification are not identical to laws of thought,
Boole notwithstanding. Logic gives us laws for justifying the truth of con-
clusions rather than laws for arriving at conclusions. For example, although
we might justify our answer to a multiplication problem by using modus
ponens on Peano's axioms, how many of us actually multiply numbers by
using modus ponens on Peano's axioms, and how efficient would that be?
While it might be useful for an intelligent agent to have the laws of logic,
and to apply them appropriately, those laws alone cannot account for the
power of intelligence as a process.
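The efficiency question can be given a rough illustration. The sketch below is not a literal derivation by modus ponens; it assumes that the Peano-style defining equations for addition and multiplication are applied directly as recursion equations, with Python integers standing in for numerals, and it counts how many equation applications are needed. The `counter` argument is a hypothetical bookkeeping device.

```python
def successor(n):
    return n + 1

def add(m, n, counter):
    """m + 0 = m ;  m + S(n) = S(m + n)"""
    counter[0] += 1
    if n == 0:
        return m
    return successor(add(m, n - 1, counter))

def multiply(m, n, counter):
    """m * 0 = 0 ;  m * S(n) = (m * n) + m"""
    counter[0] += 1
    if n == 0:
        return 0
    return add(multiply(m, n - 1, counter), m, counter)

steps = [0]
product = multiply(12, 11, steps)
```

Even for two-digit multiplicands the count in `steps[0]` runs well past a hundred, whereas the schoolbook procedure needs only a handful of digit operations. The equations justify the answer; they are a poor way of arriving at it.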
This is not to say that logic as a set of ideas about justification is not
important for understanding intelligence. How intelligent agents generate
justifications, how they integrate them with discovery processes for a final
answer that is plausibly correct, and how the total computational process is
controlled in complexity are indeed questions for AI as an information-pro-
cessing theory of intelligence. It seems highly plausible that much of the
power of intelligence arises, not from its ability to lead to correct conclu-
sions, but from its ability to direct explorations, retrieve plausible ideas,
and focus the more computationally expensive justification processes to where
they are actually required.
To see biological intelligence as merely an approximate attempt to achieve
logical correctness is to miss the functioning of intelligence completely. Of
course there are processes that over long periods of time, and collectively
over humankind, help to produce internal representations with increasingly
higher fidelity; but that is just one part of being intelligent, and in any case
such processes are themselves bounded by constraints of computational fea-
sibility. Once highly plausible candidates for hypotheses about the world
are put together by such processes (as abductive processes might be used in
producing an explanation in science), then explicit justification processes
can be used to anchor them in place.
To the extent that one believes that intelligence is at base a computational
process, and to the extent that logic is a preferred language for the semantics
of computer programs, logic will play a role in describing the meaning of
computer programs that embody theories of intelligence. This brings logic
into computational AI in a fairly basic way. Yet it is possible to be misled by
this role for logic. To make an analogy, architectural ideas are realized using
bricks, steel, wood, and other building materials, while classical statics and
mechanics describe the behavior of these materials under forces of various
kinds. Even though statics and mechanics belong to the language of civil
engineering analysis, they are not the language of architecture as an intel-
lectual discipline. Architecture uses a language of habitable spaces.
While resisting the idea of intelligence as a logic machine, I have often
been impressed by the clarity that the use of logic has sometimes brought to
issues in AI. Quite independently of one's theory of mind, one needs a lan-
guage in which to describe the objects of thought, i.e., parts of the world.
Logic has been a pretty strong contender for this job, though not without
controversy. Wittgenstein is said to have changed his views radically about
the use of logic for this purpose. Nevertheless, logic has played this role for
many researchers in AI who are looking for a formalism in which to be
precise about the knowledge of the world being attributed to an agent. For
example Newell (1982) proposes a "knowledge level" where logic is used to
describe what an agent knows, without any implied commitment to the use
of logical representations for internal manipulation by the agent.1
Rule-based systems
The dominant knowledge-representation formalism in the first generation of
expert systems was rules. The rule-based expert-system approach is that of
collecting from the human experts a large number of rules describing activi-
ties in each domain, and constructing a knowledge base that encodes these
rules. It is usually assumed as part of the rule paradigm that the rule-using
(i.e., reasoning) machinery is not the main source of problem-solving power
but that the power comes from the knowledge.
Like the logic-based architectures, the rule-based architectures separate
the knowledge base from the inference mechanisms that use the knowledge.
Furthermore, logic-based and rule-based architectures are unitary architec-
tures; that is, they have only one level. In standard rule-based systems, be-
cause the problem solver (or "inference engine," as it has come to be called)
is free of knowledge, controlling the problem-solving process typically re-
quires placing rules for control purposes in the knowledge base itself.
Such a rule-based architecture can work well when relatively little com-
plex coupling between rules exists in solving problems, or when the rules
can be implicitly separated into groups with relatively little interaction among
rules in different groups. In general, however, when the global reasoning
requirements of the task cannot be conceptualized as a series of local deci-
sions, rule-based systems with this one-level architecture have significant
"focus of attention" problems. That is, because the problem solver does not
have a notion of reasoning goals at different levels of abstraction, maintain-
ing coherent lines of thought is often difficult to achieve. The need for con-
trol knowledge in addition to domain knowledge is a lesson that can be ap-
plied to any unitary architecture.
Another problem that arises in unitary rule-based architectures is that,
when the knowledge base is large, a large number of rules may match in a
given situation, with conflicting actions proposed by each rule. The proces-
sor must choose one rule and pursue the consequences. This choice among
rules is called "conflict resolution," and a family of essentially syntactic
conflict resolution strategies has been proposed to accomplish it, such as
"choose the rule that has more left-hand side terms matching" or ". . . has
more goals on the right-hand side" or ". . . has a higher certainty factor."
Conflicts of this type, I claim, are artifacts of the architecture. If the archi-
tecture were multileveled, higher organizational levels would impose con-
straints on knowledge-base activation so that there would be no need for
syntactic conflict resolution strategies. In general, for a system to be able to
focus, multiple levels of reasoning contexts, goals, and plans must be main-
tained.
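The following is a minimal sketch of one syntactic conflict-resolution strategy of the kind quoted above ("choose the rule that has more left-hand side terms matching"). The `Rule` format, the rule names, and the facts are hypothetical and serve only as illustration; they are not taken from any particular rule-based system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: frozenset  # left-hand-side terms (hypothetical format)
    action: str            # right-hand side


def matching_rules(rules, facts):
    """Return every rule whose conditions are all present in working memory."""
    return [r for r in rules if r.conditions <= facts]


def resolve_by_specificity(candidates):
    """Syntactic conflict resolution: prefer the rule with the most LHS terms."""
    return max(candidates, key=lambda r: len(r.conditions)) if candidates else None


# Illustrative rules and facts (assumptions, not from the source).
rules = [
    Rule("r1", frozenset({"fever"}), "suspect-infection"),
    Rule("r2", frozenset({"fever", "jaundice"}), "suspect-liver-disease"),
    Rule("r3", frozenset({"rash"}), "suspect-allergy"),
]
facts = frozenset({"fever", "jaundice"})

conflict_set = matching_rules(rules, facts)   # r1 and r2 both match
chosen = resolve_by_specificity(conflict_set)  # r2 has more matching terms
```

Note that the choice depends only on the form of the rules, not on any reasoning goal, which is the sense in which such strategies are called syntactic.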
These points apply not just to rule architectures, but to unitary architec-
tures in general. Retrieval theories based on semantic networks (where con-
cepts are represented as nodes and relationships between them as links) tend
to explain memory retrieval by proposing that it is based on distances mea-
sured in the number of links, rather than on the types of links and the knowl-
edge embedded in the nodes and links. These unitary architectures encour-
age theory making at the wrong level of abstraction. Because they are uni-
tary architectures, they necessarily omit important higher level information-
processing distinctions that are needed to give an adequate functional description.
Generic tasks
In our work on knowledge-based reasoning, we have proposed a number of
information-processing strategies, each characterized by knowledge represented
using strategy-specific primitives and organized in a specific manner. Each strat-
egy employs a characteristic inference procedure that is appropriate to the task.
Let us look at the computationally complex problem of diagnosis to see how
this approach contrasts with the other approaches just discussed.
Diagnostic reasoning
Formally, the diagnostic problem might be defined as a mapping from the
set of all subsets of observations of a system to the set of all subsets of
possible malfunctions, such that each malfunction subset is a best explana-
tion for the corresponding observation subset. A mathematician's interest in
the problem might be satisfied when it is shown that under certain assump-
tions this task is computable (i.e., an algorithm - such as the set-covering
algorithm of Reggia [1983] - exists to perform this mapping). A mathemati-
cian might also want to derive the computational complexity of this task for
various assumptions about the domain and the range of this mapping.
A logician in AI would consider the solution epistemically complete if he
or she can provide a formalism to list the relevant domain knowledge and
can formulate the decision problem as one of deducing a correct conclusion.
Some diagnostic formalisms view the diagnostic problem as one more ver-
sion of truth-maintenance activity (Reiter, 1987).
Now, each of these methods is computationally complex, and without ex-
tensive addition of heuristic knowledge, a problem cannot ordinarily be solved
in anything resembling real time. It is clear, however, that the abstract prob-
lem of how to go from observable states of the world to their explanations
faces intelligent agents on a regular basis. From the tribesman on a hunt
who needs to construct an explanation for tracks in the mud to a scientist
constructing theories, this basic problem recurs in many forms. Of course,
not all versions of the problem are solved by humans, but many versions,
such as medical diagnosis, are solved routinely. Chapter 1 discusses how
diagnosis is fundamentally an abductive problem.
Because of our concern with the structure of intelligence, let us ask the
following question: What is an intelligence that it can perform this task?
That is, we are interested in the relationships between mental structures and
the performance of the diagnostic task, not simply in the diagnostic task
itself. The distinction can be clarified by considering multiplication. Multi-
plication viewed as a computational task has been sufficiently studied so
that fast and efficient algorithms are available and are routinely used by
today's computers. On the other hand, if we ask, "How does a person actu-
ally perform multiplication in the head?" the answer is different from the
multiplication algorithms just mentioned. Similarly, the answer to how di-
agnosis is done needs to be given in terms of how the particular problem is
solved by using more generic mental structures. Depending upon one's theory
of what those mental structures are, the answer will differ.
We have already indicated the kinds of answers that unitary architectures
would foster about how diagnosis is done. In rule-based architectures the
problem solver simply needs to have enough rules about malfunctions and
observations; in frame-based architectures diagnostic knowledge is repre-
sented as frames for domain concepts such as malfunctions; connectionist
architectures represent diagnostic knowledge by strengths of connections
that make activation pathways from nodes representing observations to nodes
representing malfunctions. The inference methods that are applicable to each
are fixed at the level of the architecture: some form of forward or backward
chaining for rule systems, some form of inheritance mechanisms and em-
bedded procedures for frame systems, and spreading activations for
connectionist networks. The first generation of knowledge-based systems
languages - those based on rules, frames, or logic - did not distinguish be-
tween different types of knowledge-based reasoning.
Intuitively, one thinks that there are types of knowledge and control that
are common to diagnostic reasoning in different domains, and, similarly,
that there are common structures and strategies for design as a cognitive
activity, but that the structures and control strategies for diagnostic reason-
ing and design problem solving are somehow different. However, when one
looks at the formalisms (or equivalently the languages) that are commonly
used for building expert systems, the knowledge representation and control
strategies usually do not capture these intuitive distinctions. For example,
in diagnostic reasoning one might generically want to speak in terms of
malfunction hierarchies, rule-out strategies, setting up a differential, and so
forth, whereas for design problem solving the generic terms might be de-
vice - component hierarchies, design plans, functional requirements, and the
like.
Ideally one would like to represent knowledge by using the vocabulary
that is appropriate for the task, but the languages in which expert systems
have typically been implemented have sought uniformity across tasks and
thus have lost perspicuity of representation at the task level. For example,
one would expect that the task of designing a car would require significantly
different reasoning strategies than the task of diagnosing a malfunction in a
car. However, rule-based systems apply the same reasoning strategy ("fire"
the rules whose conditions match, etc.) to design and diagnosis, as well as to
any other task. Unitary-architecture methodologies suppress what is distinc-
tive about tasks such as diagnosis in favor of task-independent information-
processing strategies. In addition, the control strategies that come with these
methodologies do not explicitly show the real control structure of the task
that a problem solver is performing. Thus these methodologies, although
useful, are rather low-level with respect to modeling the needed task-level
behavior. In essence, their languages resemble low-level assembly languages
for writing knowledge-based systems. Although they are obviously useful,
approaches that more directly address the higher level issues of knowledge-
based reasoning are clearly needed.
[Figure residue. Recoverable labels: "Knowledge-Directed Data Retrieval" (observations, sensor values, etc.); "Liver disease"; "alcoholic cirrhosis".]
cific modules of a device. (That is, performance of a generic task may re-
quire the solution of some problem of a different type as a subtask.) The
points to note here are that (1) the inference engine and the forms of knowl-
edge are tuned for the classificatory task and (2) the control transfer is not
necessarily centralized.
Hierarchical classification as an information-processing strategy is ubiq-
uitous in human reasoning. How classification hierarchies are created - from
examples, from other types of knowledge structures, and so forth - requires
an additional set of answers. We have discussed elsewhere (Sembugamoorthy
& Chandrasekaran, 1986) how knowledge of the relationships between struc-
ture and the functions of components of a device can be used to derive
malfunction hierarchies.
Summary. Overall problem solving for diagnosis can thus be traced as fol-
lows: The hierarchical classifier's problem-solving activity proceeds top
down, and for each concept that is considered, a hypothesis matcher for that
concept is invoked to determine the degree of fit with the data. The hypoth-
esis matcher turns to the inferencing data base for information about the
data items in which it is interested. The data base completes its reasoning
and passes on the needed information to the hypothesis matcher. After ac-
quiring all the needed information from the data base, the hypothesis matcher
completes its matching activity for the concept and returns the value for the
match to the classifier, along with the data that the concept hypothesis can
explain. The classifier's activity now proceeds according to its control strat-
egy: It either rules out the concept, or establishes it and pursues the more
detailed successors (or suspends judgment, but we will not discuss that here).
This process, whereby each specialist consults other specialists that can pro-
vide the information needed to perform its own task, is repeated until the
classifier concludes with a number of high-plausibility hypotheses, together
with information about what each can explain. At this point the abductive
assembler takes control and constructs a composite explanatory hypothesis
for the problem. As we shall see in following chapters, the present discus-
sion omits many subtleties in the strategies of each of the problem solvers.
Each of these generic processes is a possible information-processing strat-
egy in the service of a number of different tasks. Classification plays a role
whenever one can make use of the computational advantages inherent in
indexing actions by classes of world states. Hypothesis matching is useful
whenever the appropriateness of a concept to a situation needs to be deter-
mined. Abductive assembly is useful whenever an explanation needs to be
constructed and a small number of possibly competing, possibly co-occur-
ring explanatory hypotheses are available. Each strategy can be used only if
knowledge in the appropriate form is available. These elementary generic
tasks can also be used in complex problem-solving situations other than di-
agnosis, such as designing and planning.
Under the proper conditions of knowledge availability, each of these ge-
neric strategies is computationally tractable. In hierarchical classification,
entire subtrees can be pruned if a node is rejected. The mapping from data to
concept can be achieved if hierarchical abstractions make hypothesis match-
ing computationally efficient. Abductive assembly can be computationally
expensive (as described in later chapters), but if another process can ini-
tially prune the candidates and generate only a few hypotheses, then this
pruning can help to keep the computational demands of hypothesis assem-
bly under control. Hierarchical classification accomplishes this pruning in
the design for diagnosis just described.
Notes
1 [Some further reasons for not using classical logical formalisms in AI: (1) much of intelligence is
not inference (e.g., planning) and (2) most intelligent inference is not deductive (e.g., abduction,
prediction). - J. J.]
Two RED systems - abduction machines 1 and 2
tern builds a "memory" of antigens that it encounters, which allows it to
quickly produce antibodies directed specifically against those antigens if it
ever encounters them again. Antibodies directed against red-cell antigens
are red-cell antibodies.
A patient's encounter with foreign red cells can thus produce a transfu-
sion reaction, with possible consequences including fever, anemia, and life-
threatening kidney failure. So tests must be performed to ensure that the
patient does not already possess antibodies to the red cells of a potential
donor, and this gives rise to the red-cell antibody identification task.
The first step in the overall testing process is determining the patient's
A-B-O blood type. Red blood cells can have A or B structures on their sur-
faces, leading to possible blood types of A (the person's blood cells have
A structures but not B structures), B (the other way round), AB (both), or
O (neither). A person's A-B-O blood type is determined by testing for the
presence or absence of A and B antigens on the person's cells, and is cross-
checked by testing for the presence of A and B antibodies in the person's
blood serum. (A person should not have antibodies that would react with
his or her own red cells.) When we say "A antibody" or "anti-A" we mean
an antibody "directed against the A antigen" (an example of final-cause
naming).
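The typing and cross-checking logic just described can be sketched as follows. This is only an illustration under simplifying assumptions: the function names are hypothetical, test results are reduced to booleans, and the cross-check covers only the stated rule that a person should not have antibodies against his or her own red cells.

```python
def abo_type(has_a_antigen: bool, has_b_antigen: bool) -> str:
    """Derive the A-B-O type from the antigens found on the person's red cells."""
    if has_a_antigen and has_b_antigen:
        return "AB"
    if has_a_antigen:
        return "A"
    if has_b_antigen:
        return "B"
    return "O"


def cross_check(blood_type: str, serum_has_anti_a: bool, serum_has_anti_b: bool) -> bool:
    """Return True when the serum holds no antibody against the person's own antigens."""
    own_a = blood_type in ("A", "AB")
    own_b = blood_type in ("B", "AB")
    return not (own_a and serum_has_anti_a) and not (own_b and serum_has_anti_b)


# Hypothetical test results: A antigen present, anti-B found in the serum.
typed = abo_type(has_a_antigen=True, has_b_antigen=False)
consistent = cross_check(typed, serum_has_anti_a=False, serum_has_anti_b=True)
```

A failed cross-check would suggest a testing error rather than a valid result, since the two tests are meant to agree.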
People who do not have the A or B antigens on their red cells almost
always have the corresponding antibody in their serum, and transfusion
reactions for these antigens can be severe, so A-B-O testing is always done
prior to transfusion. Rho (also called "D") is another structure sometimes
present in the red-cell membrane. The formation of the corresponding anti-
body can cause major problems in both transfusions and pregnancies, so
donor and patient Rh types (Rh positive or Rh negative) are routinely deter-
mined, and the same type is always used for transfusion.
In addition to A, B, and D, more than 400 red-cell antigens are known.
Once blood has been tested to determine the patient's A-B-O and Rh blood
types, it is necessary to test for antibodies directed toward other red-cell
antigens. When this test is positive, it is necessary to determine which anti-
bodies are present. The RED systems perform information processing for
this task.
Red-cell antibodies carried by the blood serum are identified by using one
or more panels (see Figure 3.1).2 A panel is a set of tests performed by mix-
ing 10 or so cells (red-cell suspensions, each of which is a sample of red
cells from a single donor) with the patient's serum in each of 5 or so test
conditions. That is, the patient's serum is mixed with red cells from perhaps
10 people, each perhaps 5 times under varying test conditions. Thus ap-
proximately 50 individual tests make up a panel. The test cells have certain
known antigens on their surface and are usually provided to the testing labo-
            623A  479   537A  506A  303A  209A  186A  195   164   Scr2  Scr1
AlbuminIS   0     0     0     0     0     0     0     0     0     0     0
Albumin37   0     0     0     0     0     0     0     0     0     0     0
Coombs      3+    0     3+    0     3+    3+    3+    3+    3+    3+    2+
EnzymeIS    0     0     0     0     0     0     0     0     0     0     0
Enzyme37    0     0     1+    0     0     1+    0     1+    0     1+    0
Figure 3.1. Red-cell test panel. This panel is from case OSU 9, which will be used for
illustration throughout this chapter. The various test conditions, or phases, are listed along
the left side (AlbuminIS, etc.) and identifiers for donors of the red cells are given across
the top (623A, etc.). Entries in the table record reactions graded from 0, for no reaction, to
4+ for the strongest agglutination reaction, or H for hemolysis. Intermediate grades of
agglutination are +/- (a trace of reaction), 1+w (a grade of 1+, but with the modifier
"weak"), 1+, 1+s (the modifier means "strong"), 2+w, 2+, 2+s, 3+w, 3+, 3+s, and 4+w.
Thus, cell 623A has a 3+ agglutination reaction in the Coombs phase.
537A   +  +  0  0  +  0  0  +  +  0  +  0  +  0  +  0  +
479    0  +  0  +  +  0  +  0  +  0  +  0  +  0  +  +  0
623A   0  +  +  +  0  0  0  0  +  0  +  0  +  0  +  +  +
Figure 3.2. Partial antigram showing antigens present on donor cells. The donor cells
are named along the left side (164, etc.) and the antigens along the top (C, D, etc.). In
the body of the table + indicates the presence of an antigen and 0 indicates absence.
For example, cell 209A is positive for C (the antigen is present on the surface of the
cell) but negative for D (the antigen is absent). D is the Rh antigen, so the donor of
this cell is Rh negative.
sions from test results by using principles, hypotheses, knowledge, and strate-
gies similar to those used by human technologists. We recorded and studied
many sessions of human problem solving (protocols) in this domain, and, al-
though we do not pretend to have captured the full flexible richness of human
problem-solving abilities and behavior, nevertheless the RED systems have cap-
tured enough of those abilities to get the job done quite well, as will be seen.
For any of the versions of RED, the input required to solve a case consists
of at least one panel and an antigram describing the donor cells used in the
panel. The output solution consists of: (1) a set of antibodies that together
make up the best explanation for the reactions described in the input panel,
(2) a critical markup of the antibodies in the best explanation marking which
antibodies are deserving of greatest confidence, and (3) an evaluation of the
presence and absence of the most clinically significant antibodies given in
language such as "likely present" and "ruled out." 3
At the completion of the antibody-identification task the patient's status is
known with respect to certain specific antibodies, and the A-B-O and Rh types
are known as well. Blood can be selected for transfusion on this basis. However,
there may still be undetected antigen-antibody incompatibilities, or some mis-
take, so an additional cross-match step is performed before the blood is actually
released for use. The cross-match is done by testing the patient's serum with the
red cells from the actual unit of blood selected for transfusion. This procedure
verifies that the patient has no antibodies to the donor cells.
Hierarchical classification
In some problem situations, abduction can be accomplished by a relatively
simple matching mechanism, with no particular concern taken for focusing
the control of which explanatory concepts to consider, or for controlling the
assembly of composite explanations. For example, if there are only a small
number of potentially applicable explanatory concepts, and if time and other
computational resources are sufficiently abundant, then each hypothesis can
be considered explicitly. Of course it is a pretty limited intelligence that can
afford to try out each of its ideas explicitly on each situation. The classifica-
tion mechanism we describe here meets the need for organizing the prestored
explanatory concepts and for controlling access to them. It provides a good
mechanism for the job whenever knowledge of the right sort is available to
make it go.
For antibody identification the classification task is that of identifying a
patient's serum antibody as plausibly belonging to one or more specific nodes
in a predetermined antibody classification hierarchy (see Figure 3.3).4 CSRL
(Conceptual Structures Representation Language) is a computer language
that we developed at Ohio State for setting up hypothesis-matching special-
ists that reside in a classification hierarchy and use the establish-refine prob-
lem-solving process. CSRL is a software tool for building hierarchical clas-
sifiers. Each node in the RED antibody-classification tree is implemented as
a CSRL specialist, a little software agent that has embedded in it knowledge
and decision logic for evaluating the applicability of its particular class or
category to the case (the hypothesis-matching task).
As described in chapter 2, the specialists are invoked using the establish-
refine control strategy for classification (Gomez & Chandrasekaran, 1981)
so that when a general hypothesis is ruled out, all more specific subtypes of
it are also ruled out. In principle, establish-refine processing can take place
in parallel, whenever sub-concepts can be matched independently. By effi-
ciently pruning the search for plausible hypotheses, establish-refine is a sig-
nificant contributor to taming the complexity of a problem. Establish-refine
makes it efficient and practical to search a very large space of stored con-
cepts for just those that plausibly apply to the case. The classification struc-
ture in RED is used to generate plausible hypotheses to be used as potential
parts of a compound hypothesis to be constructed by a process of abductive
assembly and criticism.
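The pruning behavior of establish-refine can be sketched as follows. This is an illustration only and not CSRL: the `Specialist` class, the fixed match functions, and the scores are hypothetical, and the only detail borrowed from RED is the convention that a plausibility of -3 means "ruled out."

```python
from dataclasses import dataclass, field
from typing import Callable, List

RULED_OUT = -3  # lowest value on a symbolic -3..+3 plausibility scale


@dataclass
class Specialist:
    name: str
    match: Callable[[dict], int]  # returns a plausibility in -3..+3
    children: List["Specialist"] = field(default_factory=list)


def establish_refine(node, case, results=None):
    """Top-down classification: refine a node only if it is not ruled out."""
    if results is None:
        results = {}
    score = node.match(case)
    if score == RULED_OUT:
        return results  # prune: no subtype of a ruled-out hypothesis is visited
    results[node.name] = score
    for child in node.children:
        establish_refine(child, case, results)
    return results


calls = []  # records which specialists were actually consulted


def fixed(name, score):
    """Hypothetical matcher that ignores the case and returns a fixed score."""
    def match(case):
        calls.append(name)
        return score
    return match


# Illustrative hierarchy and scores (assumptions, not the values of case OSU 9).
tree = Specialist("AlloantibodyPresent?", fixed("AlloantibodyPresent?", 2), [
    Specialist("antiK", fixed("antiK", 1), [
        Specialist("antiKIgG", fixed("antiKIgG", -3)),
        Specialist("antiKMixed", fixed("antiKMixed", 0)),
    ]),
    Specialist("antiFya", fixed("antiFya", -3), [
        Specialist("antiFyaIgG", fixed("antiFyaIgG", 3)),
    ]),
])
plausible = establish_refine(tree, case={})
```

Because antiFya is ruled out, its subtype antiFyaIgG is never consulted, whatever score it might have returned.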
In the RED systems a separate module is devoted to the hypothesis-gen-
eration subtask, the classification structure just described. In RED-1 and
RED-2 it runs first, as soon as the case is started, and produces a set of
hypotheses, each the result of matching a prestored concept to the case. The
hypotheses produced are all explicitly relevant for explaining features of
the case, and many potential hypotheses do not appear, having been cat-
egorically ruled out. Each hypothesis arrives with a symbolic likelihood,
the result primarily of case-specific quality of match but also of case-inde-
pendent knowledge about frequencies of occurrence. One way in which the
RED engine is distinguished from INTERNIST (Miller, Pople, & Myers,
1982), probably the best-known abductive system, and the set-covering sys-
tems (Peng & Reggia, 1990; Reggia, 1983) is that RED devotes a separate
[Figure 3.3 diagram: "Red Antibody Specialist Hierarchy." Node labels recoverable from the diagram: AlloantibodyPresent?, antiNIgM, antiNMixed, antiSMixed, antiSIgG, antisIgG, antisMixed, antiFya, antiFyaIgG, antiFyaMixed, antiFyb, antiFybIgG, antiFybMixed, antiJka, antiJkaIgG, antiJkaMixed, antiJkbIgG, antiJkbMixed, antiKIgG, antiKMixed, antik, antikIgG, antikMixed, antiKpaIgG, antiKpaMixed, antiKpbIgG, antiKpbMixed, antiJsaIgG.]
Figure 3.3. Part of the antibody-classification hierarchy used in RED-2. More re-
fined (i.e., more specific) hypotheses are toward the right.
Hypothesis matching
The individual nodes in the classification hierarchy are represented by hy-
pothesis-matching specialists, which are organized into a hierarchy with
general hypotheses near the "top" and more specific hypotheses toward the
"bottom" (in Figure 3.3, hypotheses are more general to the left and become
more specific as they proceed to the right). Each specialist performs the
Node                    Plausibility
(label illegible)       -3
antiCw                  -3
antif                   -3
antiNIgM                -3
antiNMixed               0
antiSMixed               0
antis                   -3
antiFya                 -3
antiFyb                 -3
antiJka                 -3
antiJkb                 -3
AlloantibodyPresent?     2
antiKMixed               0
antik                   -3
antiKpaMixed             0
antiKpb                 -3
antiLuaIgM              -3
antiLuaMixed             0
antiLeaIgM              -3
Figure 3.4. RED-2 classification hierarchy after the case data (OSU 9) have been
considered. Inverted nodes are established (plausibility value at least -2). Plausibility
values are shown with each node. Plausibility values should not be confused with
reaction strengths (1+w, 1+, 1+s, etc.).
Coombs      +/-   0     2+    +/-   +/-   2+
EnzymeIS    0     0     0     0     0     0
Enzyme37    0     0     0     +/-   0     +/-
Figure 3.5. Reaction profile for the anti-N-Mixed antibody hypothesis for case OSU
9. The profile represents what this particular hypothesis is offering to explain for this
particular case. The pattern of reactions is the most that can be explained consistently
by the hypothesis. Note by comparing with figure 3.6 that the hypothesis only offers
to partially explain some of the reactions.
specialist is factored into several knowledge groups, the results of which are
combined by additional knowledge groups to generate the plausibility value
for the hypothesis. In RED this value is an integer from -3 to +3, represent-
ing the plausibility on a symbolic scale from "ruled out" to "highly plau-
sible." The plausibility values can be thought of as prima facie estimates of
likelihood for the hypotheses, in contrast to the "all things considered" like-
lihood estimates that are produced later. The plausibilities can also be thought
of as estimates of pursuit value - that is, the degree to which the hypotheses
are worthy of further consideration. The plausibility value assigned to a hy-
pothesis, given a certain pattern of evidence, is determined by the judgment
of the domain experts.
In addition to hypothesis matching, a specialist also triggers production
of a description of the test results that can be explained by its hypothesis.
This description, called a reaction profile in the RED systems, is produced
only by specialists that have not ruled out their hypotheses, that is, those with a plausibility
value above -3 (see Figures 3.4 and 3.5). Overview then uses the reaction profile in
assembling the best explanation for all the reactions in the panel.
Thus each plausible hypothesis delivered by the hypothesis generator comes
with:
1. a description, particularized to the case, of the findings that it offers to ex-
plain
2. a symbolic plausibility value
Each plausible hypothesis has its own consistent little story to tell and to
contribute to the larger story represented by the abductive conclusion.
By filtering the primitive hypotheses, letting through only those plausible
for the case, we potentially make a great computational contribution to the
problem of finding the best composite. By making only moderate cuts in the
number, say n, of hypothesis parts to consider, we can make deep cuts in the
2^n composites that can potentially be generated from them. For example, if
63 prestored hypothesis patterns are reduced to 8 that are plausible for the
case at hand, we cut the number of composites that can potentially be generated
from 2^63 (the number of grains of rice on the last square of the chessboard
in the classical story) - more than 9 quintillion - to 2^8, or 256.
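The arithmetic behind this example can be checked directly. The sketch below simply counts subsets; the function name is hypothetical, and the count includes the empty composite.

```python
def composite_count(n_hypotheses: int) -> int:
    """Number of subsets (candidate composites) of n hypotheses, empty set included."""
    return 2 ** n_hypotheses


before = composite_count(63)        # 9223372036854775808, more than 9 quintillion
after = composite_count(8)          # 256
reduction_factor = before // after  # 2**55: each hypothesis removed halves the count
```

Cutting the number of parts from 63 to 8 removes 55 hypotheses, so the space of composites shrinks by a factor of 2^55.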
How other systems screen hypotheses. The INTERNIST system for diagno-
sis in internal medicine (Miller, Pople, & Myers, 1982) can be viewed as
doing this sort of hypothesis screening when it considers only the subset of
prestored diseases that are evoked by the present findings. In this way it
screens out those hypotheses that are completely irrelevant for explaining
the findings. It cuts the number even further when it scores the evoked dis-
eases for confidence and continues to consider only those above a certain
threshold. This can be seen as screening out hypotheses for low likelihood
of being correct, where likelihood is measured primarily by quality of match
to the case data.
The DENDRAL system (for elucidating molecular structure from mass
spectrogram and other data) (Buchanan, Sutherland, & Feigenbaum, 1969)
performs such a hypothesis-screening subtask explicitly. During an initial
phase it uses the data provided to generate a BADLIST of molecular sub-
structures that must not appear in the hypothesized composite structures.
This BADLIST is used to constrain the search space during the subsequent
enumeration of all possible molecular structures consistent with the con-
straints. That is (in the present terms) DENDRAL first rules out certain hy-
potheses for matching poorly to the case data, and then generates all pos-
sible composite hypotheses not including those that were ruled out (and re-
specting some other constraints). Note that DENDRAL, like RED, devotes a
separate problem solver with its own knowledge structures to the initial
screening task.
In the Generalized Set Covering model of diagnosis and abduction (Reggia,
1983; Reggia, Perricone, Nau, & Peng, 1985) a disease is associated with a
certain set of findings that it potentially covers (i.e., which it can explain if
they are present). The diagnostic task is then viewed as that of generating all
possible minimum coverings for a given set of findings, by sets associated
with diseases (in order to use this as a basis for further question asking). In
expert systems built using this model, a match score is computed for each
relevant disease each time new findings are entered for consideration, and
match scores are used, when appropriate, as the basis for categorically re-
jecting disease hypotheses from further consideration.
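The idea of a minimum covering can be illustrated with a brute-force sketch. This is not the algorithm of Reggia and colleagues; it is an exhaustive search, exponential in the number of diseases, and the disease and finding names are hypothetical.

```python
from itertools import combinations


def minimum_covers(causes, findings):
    """Return all smallest sets of diseases whose findings jointly cover `findings`.

    `causes` maps each disease name to the set of findings it can explain.
    """
    findings = set(findings)
    names = sorted(causes)
    for size in range(len(names) + 1):  # try smaller sets first
        covers = [
            set(combo)
            for combo in combinations(names, size)
            if findings <= set().union(*(causes[d] for d in combo))
        ]
        if covers:
            return covers
    return []  # some finding cannot be explained by any disease


# Illustrative associations (assumptions, not from the source).
causes = {
    "d1": {"f1", "f2"},
    "d2": {"f3"},
    "d3": {"f2", "f3"},
    "d4": {"f1"},
}
covers = minimum_covers(causes, {"f1", "f2", "f3"})
```

No single disease explains all three findings here, so the minimum coverings are the three two-disease sets that do.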
Abductive assembly
To this point, RED-1 and RED-2 share the same architecture. Both use clas-
sification hierarchies, although the RED-2 classification hierarchy has two
levels and is more elaborate than that of RED-1, which has only one level.
Both use the same hypothesis-matching and knowledge-directed data-retrieval
mechanism. The main difference between the two systems lies in their
abductive-assembly mechanisms.
Synthesizing a best composite can be computationally expensive: If there
are n plausible hypotheses, then there are 2^n composites that can potentially
be made from them. If each needs to be generated separately in order to
determine which is best, the process can become impractical. Clearly, it is
usually preferable to adopt strategies that allow us to avoid generating all
combinations.
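The growth in the number of composites can be made concrete with a short sketch. The function names are hypothetical, and the enumeration is shown only to illustrate why generating every combination quickly becomes impractical.

```python
from itertools import combinations

def all_composites(hypotheses):
    """Yield every subset of the plausible hypotheses, the empty one included."""
    items = list(hypotheses)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def count_composites(n):
    """Number of composites that can be formed from n plausible hypotheses."""
    return 2 ** n
```

With only 20 plausible hypotheses, count_composites(20) is already 1,048,576.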
Sometimes an abduction problem is completely dominated by the diffi-
culty of generating even one good composite explanation. It can be shown
that the problem of generating just one consistent composite hypothesis that
explains everything, under conditions where many of the hypotheses are
incompatible with one another, is NP-complete (see chapter 7) and thus
probably has no efficient general solution.
Each of the RED engines first generates a single, tentative, best compos-
ite and then improves it by criticism followed by suitable adjustment. For
the criticism to be accomplished, certain other composite hypotheses are
generated, but only a relatively small number. Here again, the RED systems
devote a distinct problem solver to a distinct task, in this case that of form-
ing a best composite. The initial composite hypothesis that is formed is (un-
der favorable conditions) one that explains all the data that needs to be ex-
plained; that is maximally plausible, or nearly so; and (in RED-2) is inter-
nally consistent.
Often, more than simply computational feasibility considerations are involved
in a decision to generate the one best composite instead of generating all com-
posites subject to some constraints. For purposes of action, an intelligent agent
will typically need a single best estimate as to the nature of the situation, even if
it is only a guess, and does not need an enumeration of all possible things it
could be. By rapidly coming to a best estimate of the situation, the agent arrives
quickly at a state where it is prepared to take action. If it had to enumerate a
large number of alternatives, it would take longer, not only to generate them,
but also to decide what to do next. It is difficult to figure out what to do if
proposed actions must try to cover for all possibilities.
Alternatively, there may be situations in which careful and intelligent rea-
soning requires the generation of all plausible composites (i.e., those with a
significant chance of being true), so that they can all be examined and com-
pared. This might especially be called for when the cost of making a mis-
take is high, as in medicine, or when there is plenty of time to consider the
alternatives. Of course generating all plausible composites is computationally
infeasible if there are many plausible fragments from which to choose. More-
over, besides being a computationally expensive strategy, generating all the
alternatives will not typically be required because we can compare compos-
ites implicitly by comparing alternative ways of putting them together. For
example, in comparing hypothesis fragments H1 and H2 we implicitly (partially)
compare all composites containing H1 with those containing H2.
How other systems generate composites. The authors of DENDRAL saw its
job as that of generating all possible composites that are allowable given the
previously established constraints on submolecules and the known case-in-
dependent chemical constraints on molecular structure. In contrast INTER-
NIST terminates (after cycles of questioning, which we ignore for these pur-
poses) when it comes up with its single best composite. The set-covering
model generates all possible composites of minimal cardinality but avoids
having to enumerate them explicitly by factoring the combinations into dis-
joint sets of generators.
RED-1 performance
Following is an edited transcript of the output of RED-1 for a case in which
the first panel was not conclusive and further testing was required to settle
the remaining ambiguities of the case.6 This output represents the state at
the end of the first inconclusive panel. The correct answer after further test-
ing turned out to be: ANTI-FY-A, ANTI-K, and ANTI-D (i.e., antibodies to
the antigens FY-A, K, and D). As can be seen, RED anticipated the correct
answer, although it was unable to confirm it.
Discussion
RED-1 treated the abductive process as one of classification and hypothesis
assembly, proceeding through these stages:
(i) forming plausible primitive hypotheses by classification
(ii) assembling a tentative best explanation using a strategy of adding primitive
hypotheses to a growing composite in order of likelihood (based on match score)
until all the findings have been accounted for
(iii) removing explanatorily superfluous parts of the hypothesis
(iv) criticizing the composite hypothesis by testing for the existence of plausible
alternative explanations for what various parts of the composite offer to account for
(v) assigning final confidence values to the primitive hypotheses on the bases of
explanatory importance and match scores
Thus we succeeded in automating a form of best-explanation reasoning
for a real-world task using a strategy that has some degree of generality.
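Stages (ii) and (iii) above can be sketched as follows. This is a minimal illustration, not the RED-1 code: it assumes that each hypothesis either explains a finding or does not, that hypotheses adding no new coverage are skipped, and that superfluous parts are tested for removal least plausible first. The function and parameter names are hypothetical.

```python
def assemble(findings, explains, score):
    """Stage (ii): add hypotheses in descending match score until covered."""
    remaining = set(findings)
    composite = []
    for h in sorted(explains, key=lambda name: score[name], reverse=True):
        if not remaining:
            break
        if explains[h] & remaining:      # assumption: skip parts adding nothing
            composite.append(h)
            remaining -= explains[h]
    return composite, remaining

def parsimonize(composite, findings, explains, score):
    """Stage (iii): drop parts whose removal loses no explanatory coverage."""
    kept = list(composite)
    for h in sorted(composite, key=lambda name: score[name]):
        others = [k for k in kept if k != h]
        covered = set().union(*(explains[k] for k in others))
        if set(findings) <= covered:
            kept = others
    return kept

# Illustrative data only.
example_findings = {"r1", "r2", "r3"}
example_explains = {"A": {"r1"}, "B": {"r1", "r2"}, "C": {"r3"}}
example_score = {"A": 3, "B": 2, "C": 1}
```

On this data the assembler first takes A for its high score, then needs B and C to finish the coverage; the parsimony pass then finds that A has become superfluous.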
Figure 3.6. (a) RED-2 case OSU 9. The data to be explained. The disparity between
this and Figure 3.1 is that RED-2 does not try to explain the reactions on the
screening cells (Scr2 and Scr1).
Figure 3.6 (c) The anti-Lea-Mixed antibody hypothesis is chosen to explain the
reaction and becomes the first part of a growing composite hypothesis.
Figure 3.6 (f) The problem state as it appears on the screen of the Xerox 1109 Lisp
machine in the Loops environment.
Figure 3.6 (g) Anti-K-IgG included in the composite hypothesis (the new two-part
hypothesis is the bottom entry).
[Figure 3.6 panels. Each panel shows a Still To Explain table (rows AlbuminIS,
Albumin37, Coombs, EnzymeIS, Enzyme37; columns 623A, 479, 537A, 506A, 303A, 209A,
186A, 195, 164) and a Growing Hypothesis window. Across the panels the growing
hypothesis develops as follows:
antiLeaMixed
antiLeaMixed antiKIgG
antiLeaMixed antiKIgG antiP1Mixed
antiLeaMixed antiKIgG antiP1Mixed antiSIgG
In the last panel every entry of Still To Explain is 0.]
Figure 3.6 (m) The four-part hypothesis is tentatively accepted and will be subjected
to criticism.
we introduce the danger of an infinite loop, but this is dealt with fairly readily.
We suitably raise the standards for reintroducing a hypothesis in precisely the
same situation in which it was first introduced. The second time around we
require that there be no net loss of explanatory power resulting from reintroduc-
ing the hypothesis and removing its contraries from the assembly. (There are a
variety of acceptable measures of explanatory power that will serve here to
guarantee progress.) The basic idea is that the finding must be explained, even
if doing so forces a major revision of the growing hypothesis.
If the initial assembly process concludes successfully, the result is a com-
posite hypothesis that is consistent, explanatorily complete, and nearly maxi-
mally plausible (at each choice point the most plausible alternative was cho-
sen). Note that the hypothesis assembler does a good job of producing a
composite hypothesis that combines the virtues of consistency, completeness,
and plausibility. The relative priorities of these virtues are implicit in
the algorithm. The least important virtue is plausibility: If RED can make a
consistent, complete explanation only by using implausible hypotheses, it
does so. Completeness is next in importance, with consistency being the
highest priority. The hypothesis assembler will enforce consistency through-
out the process. If it cannot find a consistent explanation for a finding, it
stops and yields the best consistent explanation it has found to that point,
even if the explanation is incomplete.
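The ordering of the three virtues can be illustrated by making it explicit as a sort key. This is only a way of picturing the priorities; in RED-2 they are implicit in the assembly algorithm rather than computed like this. The function name virtue_key, the use of the least plausible member as the plausibility of a composite, and the example data are all assumptions made for the sketch.

```python
from itertools import combinations

def virtue_key(composite, findings, explains, plausibility, incompatible):
    """Sort key: consistency first, then completeness, then plausibility."""
    consistent = not any(
        frozenset(pair) in incompatible for pair in combinations(composite, 2)
    )
    covered = set().union(*(explains[h] for h in composite))
    complete = set(findings) <= covered
    # Assumption: a composite is only as plausible as its weakest part.
    weakest = min((plausibility[h] for h in composite), default=0)
    return (consistent, complete, weakest)

# Illustrative data only: A and B are individually plausible but incompatible.
example_findings = {"r1", "r2"}
example_explains = {"A": {"r1"}, "B": {"r2"}, "C": {"r1", "r2"}}
example_plausibility = {"A": 3, "B": 3, "C": 1}
example_incompatible = {frozenset({"A", "B"})}
example_candidates = [["A", "B"], ["A"], ["C"]]
example_best = max(
    example_candidates,
    key=lambda c: virtue_key(c, example_findings, example_explains,
                             example_plausibility, example_incompatible),
)
```

Here the implausible hypothesis C wins, because it is the only candidate that is both consistent and complete.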
Criticism
The second stage of Overview's processing begins only if the assembly
stage produced a complete, consistent explanation. If so, Overview enters a
Figure 3.9. Anti K is not essential. The reactions can all be explained without this
hypothesis.
something that has no plausible explanation other than by using the
hypothesis part in question (see Figures 3.8 and 3.9). Note the distinction
between hypothesis parts that are nonsuperfluous relative to a particular
composite, that is they cannot be removed without explanatory loss, and
essentials without which no complete explanation can be found in the whole
hypothesis space. An essential hypothesis is very probably correct, espe-
cially if it was rated as highly plausible by its specialist.
Essential hypotheses are collected into an essential kernel representing
the part of the tentative composite that is most certain (Figure 3.10). This
essential kernel is passed as a seed to the assembler and is expanded in the
most plausible way to form a complete, consistent explanation. This second
composite hypothesis is also submitted to the parsimonizer for removal of
any explanatorily superfluous parts. At this stage the best explanation has
been inferred, or at least a best explanation has been inferred, there being no
a priori guarantee that a best explanation is unique (Figure 3.11). Under
some circumstances the reasoning process will have virtually proved the
correctness of its conclusion. If each part of the composite is essential (in
the sense just described), then the system has, in effect, proved that it has
found the only complete and parsimonious explanation available to it. This
must be the correct explanation, assuming that the data is correct and the
system's knowledge is complete and correct. When parts of the conclusion
Parsimonized hypothesis: antiKIgG antiSIgG
The following antibodies have been ruled out: antiD antiC antic antiK antie antiCw antif antiM antis antiFya
antiFyb antiJka antiJkb antik antiKpb antiJsb antiLub antiI
The following antibodies are classified as LIKELY ABSENT: antiN antiLua antiLea antiLeb antiP1
Figure 3.12. Final report giving the status of the antibody hypotheses.
are not essential, the system will have discovered that alternative explana-
tions are possible, so appropriate caution may be taken in using the abductive
conclusion. A final report draws conclusions about what is known about the
presence or absence of important antibodies (Figure 3.12). An explanation
facility is also provided whereby the system may be asked for the reasons
for its answers (Figure 3.13 a-d).
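The test for essential hypotheses described above can be sketched in a simplified form. The sketch assumes that a finding is either explained by a hypothesis or not, ignoring partial or additive explanation of reaction strengths, and it treats a hypothesis as essential when some finding has no other plausible explainer. The function names and the example data are hypothetical.

```python
def essential_hypotheses(findings, explains, plausible):
    """Collect hypotheses that are the only plausible explainer of a finding."""
    essentials = set()
    for finding in findings:
        explainers = [h for h in plausible if finding in explains[h]]
        if len(explainers) == 1:
            essentials.add(explainers[0])
    return essentials

def every_part_essential(composite, essentials):
    """True when no part of the composite has a plausible alternative."""
    return set(composite) <= essentials

# Illustrative data only.
example_findings = ["r1", "r2", "r3"]
example_explains = {"K": {"r1", "r2"}, "S": {"r3"}, "Lea": {"r2"}}
example_kernel = essential_hypotheses(
    example_findings, example_explains, ["K", "S", "Lea"]
)
```

The resulting kernel would then be handed to the assembler as a seed. When every part of the final composite belongs to the kernel, no alternative complete explanation is available within the hypothesis space.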
In general it is not sufficient just to produce a composite best explanation;
it is also important to have more information about how good the explana-
tion is, so that an agent can decide whether to act boldly or to be cautious,
Any Questions?
Type the name of an antibody if you would like an explanation of why it was classified the way it was.
Type A if you would like to see the antibody classification report again.
antiN was classified as LIKELY ABSENT because it is not part of the best explanation, and it was not rated as
highly plausible by the antibody specialists.
antiS was classified as WEAKLY CONFIRMED because it is the only plausible way to explain the indicated
reactions, all of the reactions in the workup can be explained, but the Rule of Three for sufficiency of the
evidence was not passed.
for example deciding to gather more evidence before taking action. Some
critical assessment is necessary because, as we said, the appropriate confi-
dence for an abductive conclusion depends in part on how well it compares
with the competition. This applies to the evaluation of composite hypoth-
eses no less than it applies to simple ones.
RED-2 performance
RED-2 was implemented in InterLisp™ and the LOOPS™ language for ob-
ject-oriented programming (from Xerox Corporation), and it was run on a
Xerox Lisp machine.7 RED-2's performance was initially evaluated by us-
ing 20 selected test cases chosen, not as a representative sample of blood-
bank problems, but rather, to fully exercise the capabilities of the system.
These cases are used in training blood-bank technologists, some from text-
books and some from course materials. Each case has an associated correct
answer that was used as the standard of comparison. Twenty cases were chosen:
6 in which the accepted answer involved a single antibody, 6 with two-antibody
answers, and the remaining 8 requiring three or more antibodies in
the conclusion. This is actually a much more trying distribution than is typi-
cal of the blood bank, where most cases involve only a single antibody. The
answers generated in the blood bank are not "scientifically pure" in that the
goal is to detect medically important antibodies, not simply to discover which
are present; thus answer acceptance has a pragmatic component.
Of the 20 cases chosen, 3 of the complex ones had accepted answers that
were very indefinite. RED-2's answers on these cases were not inconsistent
with the textbook, but we were unable to evaluate its performance in detail.
Remaining were 17 cases in which performance could be more closely evalu-
ated, varying in difficulty from simple to very complex. In 12 of the 17
cases, RED-2's answers were precisely the same as the textbook answer. In
1 other case, RED-2's answer differed from the textbook answer but was
still quite acceptable, according to our expert.
In 3 cases RED-2 overcommitted by including an extra antibody that was
not in the textbook answer. That is, RED-2 decided that the antibodies in the
textbook answer were not sufficient to explain all the reactions and included
another antibody to make up the difference. According to our expert, RED-
2's answers on these 3 cases were acceptable but more cautious than a typi-
cal blood-bank technologist. This discrepancy was probably a result of two
factors.
First our intention in building the system was that RED-2 should simply
interpret panel data and make a statement about which antibodies appear to
be present. This means that certain pragmatic considerations about thera-
peutic significance were left out of RED-2's knowledge. For example, a
technologist may be reluctant to accept an explanation of some reactions in
terms of a common but clinically insignificant antibody, for fear that a too
ready acceptance of such an answer may mask something clinically more
dangerous.
Second, RED-2 is inflexible about how much reaction can be explained
by an antibody. Once it decides how much of the panel data to attribute to a
particular antibody, it never revises its estimate. From our analysis of proto-
cols taken from blood-bank specialists, it seems that humans are willing to
consider stretching the explanatory power of hypotheses. The result is that
where RED-2 might assume an additional antibody to explain a small re-
maining reaction, the human expert might simply assign the small remain-
der to an antibody already assumed to be present.
RED-2's answer was unacceptable (partly) in only one case. In this case
RED-2 built a five-antibody best explanation, whereas the expert answer
was a three-antibody solution. RED-2 detected two of the three "correct"
antibodies but missed the third, actually building a compound three-anti-
body alternative to using the expert's third antibody. Most troublesome was
that RED-2 identified this antibody as LIKELY ABSENT because it was not
part of its best explanation and was not especially plausible (according to its
hypothesis specialist). Each of the three antibodies RED-2 used instead of
the accepted one was rated as more plausible than the accepted one, which
is why they were chosen instead. According to our expert, RED-2's plausi-
bilities were reasonably assigned, but the blood-bank technologist would
prefer the simpler explanation, even though it uses an antibody that was
fairly implausible under the circumstances.
One interpretation of RED-2's behavior on this case is related to the way it
judges the plausibility of compound hypotheses. RED-2 depends heavily on the
individual plausibility of simple hypotheses and knows little about the plausi-
bility of collections (although it does know about logical incompatibilities). As
a result, it can produce an answer containing many individually plausible anti-
bodies that are collectively implausible. This is most likely to arise when the
case is complex. In these cases neither RED-2 nor the human expert can be very
certain of an answer, so more testing is usually required.
Another measure of performance is a simple count of the number of anti-
bodies correctly identified. In the 20 test cases RED-2 identified 31 of 33
antibodies that were asserted to be part of correct answers. In addition, RED-
2 never ruled out an antibody that was part of the correct answer. This is
especially important clinically because ruling out an antibody that is actu-
ally present in the patient can have disastrous consequences. Of lesser im-
portance, but still interesting, is that RED-2 never confirmed an antibody
that had been ruled out in the textbook answer.
In summary, RED-2 produced clinically acceptable answers in 16 of 17
cases and correctly identified 31 of 33 antibodies. It never ruled out an anti-
body that was part of the correct answer or confirmed an antibody that was
not part of the correct answer. Thus the performance of the system was,
overall, quite good, demonstrating that "abductive logic" had been correctly
captured, at least to some degree.
Hypothesis interactions
Hypothesis interactions can be considered to be of two general types, each
with its own kind of significance for the problem solving:
1. explanatory interactions (e.g., overlapping in what the hypotheses can ac-
count for)
2. interactions of mutual support or incompatibility (e.g., resulting from causal
or logical relations)
For example, two disease hypotheses might offer to explain the same find-
ings without being especially compatible or incompatible causally, logically,
or definitionally. On the other hand, hypotheses might be mutually exclu-
sive (e.g., because they represent definitionally distinct subtypes of the same
disease) or mutually supportive (e.g., because they are causally associated).
In general the elements of a diagnostic differential need to be exhaustive of
the possibilities so that at least one must be correct, but they need not be
mutually exclusive. If they are exhaustive, then evidence against one of them
is transformed into evidence in favor of the others.
Following are six specific types of possible hypothesis interactions:
1. A and B are mutually compatible and represent explanatory alternatives where
their explanatory capabilities overlap.
2. Hypothesis A is a subhypothesis of B (i.e., a more detailed refinement).
If hypotheses have type-subtype relationships, which normally occurs if they
are generated by a hierarchical classifier, a hypothesis assembler can preferen-
tially pursue the goal of explanatory completeness and secondarily pursue the
goal of refining the constituent hypotheses down to the level of most detail.
This extension to the RED-2 mechanism was explored later (see chapter 4).
3. A and B are mutually incompatible.
The strategy used in RED-2 is to maintain the consistency of the growing
hypothesis as the assembly proceeds. If a finding is encountered whose only
available maximally plausible explainers are incompatible with something
already present in the growing hypothesis, then one of these newly encoun-
tered hypotheses is included in the growing hypothesis and any parts incon-
sistent with the new one are removed from the composite in the way that we
described earlier. The basic idea is that the finding must be explained, even
if doing so forces a serious revision of the growing hypothesis. This seems
to be a rather weak way of handling incompatibles, however, and better
strategies were devised later, as will be seen.
4. A and B cooperate additively where they overlap in what they can account for.
If such interactions occur, this knowledge needs to be incorporated into
the methods for computing what a composite hypothesis can explain. All of
the RED systems have done this.
5. Using A as part of an explanation suggests also using B.
To handle this type of hypothesis interaction, as for example if there is
available knowledge of a statistical association, we can give extra plausibil-
ity credit to the suggested hypothesis if the hypothesis making the sugges-
tion is already part of the growing composite. This feature was incorporated
into RED-2 as part of the strategy for handling ties in plausibility during
assembly. A path for the hypothesis to grow preferentially along lines of
statistical association provides a rudimentary ability for it to grow along
causal lines as well.
6. A, if it is accepted, raises explanatory questions of its own that can be re-
solved by appeal to B.
An example of this last type occurs when we hypothesize the presence of
a certain pathophysiological state to explain certain symptoms, and then
hypothesize some more remote cause to account for the pathophysiological
state. The stomachache is explained by the presence of the ulcers, and the
ulcers are explained by the anxiety disorder. At the same time that a newly
added hypothesis succeeds in explaining some of the findings, it might in-
troduce a "loose end." To handle this by mildly extending the RED-2 mecha-
nism, the newly added hypothesis can be posted as a kind of higher-level
finding which needs to be explained in its turn by the growing assembly.
This provides a way in which the growing hypothesis can move from hy-
potheses close to the findings of the case, and towards more and more re-
mote causes of those findings. Nothing like this was actually incorporated
into the RED-2 mechanism, but, as we will see, the capability was incorpo-
rated into later "layered abduction" machines.
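The handling of interaction type 5 during ties can be sketched as follows. The text says only that a suggested hypothesis receives extra plausibility credit when its suggester is already in the growing composite; the function name break_tie, the suggests mapping, and the fallback to the first tied candidate are assumptions made for illustration.

```python
def break_tie(tied, composite, suggests):
    """Among equally plausible candidates, prefer one that is suggested by a
    hypothesis already in the growing composite.

    `suggests` maps a hypothesis to the hypotheses it is associated with.
    """
    for candidate in tied:
        if any(candidate in suggests.get(member, ()) for member in composite):
            return candidate
    return tied[0]  # assumption: no association applies, so take the first
```

For example, break_tie(["B", "C"], ["A"], {"A": {"C"}}) prefers C, because A is already part of the composite and suggests C.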
Notes
1 Jack W. Smith, John R. Josephson, and Charles Evans designed RED-1; Jack W. Smith, John R.
Josephson and Michael C. Tanner designed RED-2.
2 Case OSU 9 is used for illustration throughout this chapter. It is documented more fully in (Tan-
ner, Josephson, and Smith, 1991). OSU 9 represents a single workup on one particular patient on
one particular occasion.
3 The information from a panel consists of approximately 50 reactions (counting non-reactions)
with each reaction or nonreaction described by one of 15 symbols encoding type or degree of
reaction. Thus there are approximately 15^50 possible panels and the number of possible inputs
for any version of RED is astronomically large.
4 Charles Evans programmed the RED-1 antibody classification structure and Mike Tanner pro-
grammed the RED-2 classification structure using CSRL. CSRL was developed by Tom Bylander
and designed by Tom Bylander and Sanjay Mittal (Bylander, Mittal, & Chandrasekaran, 1983)
(Bylander & Mittal, 1986).
5 Charles Evans designed and programmed the inferencing data base for RED-1. Mike Tanner
designed and programmed the inferencing data base for RED-2.
6 Charles Evans programmed RED-1, except for the Overview mechanism, which John Josephson
programmed. They were programmed using the ELISP dialect of Rutgers/UCI-LISP and run on
a Decsystem-2060 computer under the TOPS-20 operating system.
7 John Josephson programmed the RED-2 Overview mechanism, and Mike Tanner programmed
the rest, in part by using CSRL.
8 From "A Study in Scarlet" by Sir Arthur Conan Doyle.
9 From "The Boscombe Valley Mystery" by Sir Arthur Conan Doyle.
10 From "The Hound of the Baskervilles" by Sir Arthur Conan Doyle.
Generalizing the control strategy - machine 3
We reimplemented RED using PEIRCE, both to debug the tool and to
enable experiments with other abductive-assembly strategies for red-cell
antibody identification. PEIRCE was also used to build a liver-disease diag-
nosis system called PATHEX/LIVER and to build TIPS (Task Integrated Prob-
lem Solver), another medical diagnosis system. TIPS, especially, exploits
PEIRCE's ability to dynamically integrate different problem-solving meth-
ods. In this chapter we describe PEIRCE, the reimplementation of RED, and
a reimplementation of PEIRCE in SOAR, which is a software system that
realizes an architecture for general intelligence developed by Laird,
Rosenbloom and Newell (Laird, Rosenbloom, & Newell, 1986). This
reimplementation of PEIRCE is called ABD-SOAR. In chapter 5 we describe
TIPS and PATHEX/LIVER.
Sponsors. Each sponsor contains knowledge for judging when its associated
method is appropriate. The sponsor uses this knowledge to yield a symbolic
value that indicates how appropriate the method is under the circumstances.
The information-processing task of a sponsor can be thought of as hypothesis
matching, one of the generic tasks described in chapter 2, although in
PEIRCE a sponsor evaluates the appropriateness of a certain hypothetical
course of action, rather than the plausibility of a classificatory or
explanatory hypothesis as in CSRL.
Figure 4.1. The structure of a sponsor-selector system for making control decisions.
[Figure labels: sponsor 1, sponsor 2, sponsor 3. A second diagram shows the main
abducer (explain all the data), the single-finding abducer (explain one finding),
and the subabducer (explain a meaningful part of the data), together with the
ABDUCER selector, its tactic sponsors, and the tactic methods: Critique, Extend
coverage, Refine hypothesis, Parsimonize, Resolve inconsistency.]
A sponsor's knowledge, put in by a knowledge engineer as part of build-
ing a knowledge-based system, is represented in a decision table similar to
the knowledge groups of CSRL, the tool used to build hierarchical classifi-
cation systems. (Knowledge groups were described in Chapter 3.) A sponsor's
decision table maps problem-solving states to appropriateness values by rep-
resenting patterns of responses to queries about the problem state and by
associating each pattern with an appropriateness value.
Figure 4.4 illustrates a sponsor's decision table. (The column headings
have been put into English for purposes of exposition.) Each column, except
the last, is headed with a query about the system's problem-solving state.
Each row represents a pattern of responses to the queries, followed by an
appropriateness value that applies if the pattern occurs. The appropriateness
value can come from any fixed, discrete, ordered range of symbolic values;
in this example it comes from a scale that ranges from -3 (for highly
inappropriate) to 3 (for highly appropriate).
Figure 4.4 (rows only; the query column headings are not reproduced here):
T F 3
F F 1
? T -3
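The rows shown for the sponsor's decision table can be read as patterns matched against the answers to the two queries. The following sketch assumes that "?" matches any answer, that rows are tried in order with the first match winning, and that a neutral value of 0 is returned when no row matches; these conventions and the function name appropriateness are assumptions, not a description of the PEIRCE implementation.

```python
def appropriateness(table, answers, default=0):
    """Return the value of the first row whose pattern matches the answers."""
    for pattern, value in table:
        if all(p == "?" or p == a for p, a in zip(pattern, answers)):
            return value
    return default  # assumption: neutral value when nothing matches

# The three rows that survive from the figure; the queries themselves
# are not known here.
sponsor_table = [
    (("T", "F"), 3),
    (("F", "F"), 1),
    (("?", "T"), -3),
]
```

With this table, a true answer to the second query yields -3 whatever the answer to the first.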
The process of abductive assembly can reach a choice point we call a hard
decision. Let us suppose that a single finding, f, is the focus of attention and that
the hypotheses available to explain f are A, B, and C. The abducer's job is to
choose one of these hypotheses to explain f, but the only criterion of choice is
the plausibility ratings of the hypotheses. (Often other grounds for choice are
available, but for simplicity we will assume not in the present example.) Sup-
pose that all three hypotheses have the same plausibility value of 3. Since there
is no real basis for choice, the assembler must make a hard decision.
At this point, the RED-2 system would simply make a random choice,
because in context it did not seem to matter what was chosen, and down-
stream criticism with potential repair was scheduled to occur anyway. How-
ever, in the PEIRCE implementation we added another tactic: delaying a
hard decision. As easy decisions are made, we hoped that the hard decisions
would be resolved serendipitously as their associated findings are covered
by more easily decided upon hypotheses. If any hard decisions remain after
all the easy decisions are made, the system attempts to make the choice on
some grounds, perhaps by choosing the better of nearly equal hypotheses. If
no grounds can be found, the system reverts to random choice to make these
decisions. This strategy is somewhat similar to the strategy of "least com-
mitment" found in some earlier systems (e.g., Mittal & Frayman, 1987; Stefik,
1981). It is even more closely related to the idea of "island driving" (Hayes-
Roth and Lesser, 1977) first described in the HEARSAY-II system, in which
islands of high confidence are found and form a basis for exploring the rest
of the hypothesis space.
This tactic can be generalized to include several levels of decision difficulty,
so that an explanation will be accepted only if it meets the prevailing standards
of certainty (e.g., surpasses all rivals by at least a given interval) and these
standards can be relaxed in stages to gain more explanatory coverage.
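The tactic of delaying hard decisions can be sketched in two passes. The sketch assumes that a decision is easy when a finding has a unique top-rated explainer and hard when the top rating is tied, and it replaces the random choice with an alphabetical one so that the result is reproducible. All names are hypothetical, and the single level of difficulty is a simplification of the staged relaxation just described.

```python
def assemble_delaying_hard_decisions(findings, explains, plausibility):
    """Make easy decisions first; return to hard ones only if still needed."""
    composite, covered, deferred = [], set(), []

    def top_explainers(finding):
        candidates = [h for h in explains if finding in explains[h]]
        if not candidates:
            return []
        top = max(plausibility[h] for h in candidates)
        return [h for h in candidates if plausibility[h] == top]

    def accept(hypothesis):
        composite.append(hypothesis)
        covered.update(explains[hypothesis])

    # Pass 1: accept only unique best explainers; postpone ties.
    for finding in findings:
        if finding in covered:
            continue
        best = top_explainers(finding)
        if len(best) == 1:
            accept(best[0])
        else:
            deferred.append(finding)

    # Pass 2: hard decisions that were not resolved along the way.
    for finding in deferred:
        if finding in covered:
            continue
        best = top_explainers(finding)
        if best:
            accept(sorted(best)[0])  # stand-in for the random choice
    return composite

# Illustrative data only: f1 is a hard decision (A and B tie), f2 is easy.
example_explains = {"A": {"f1"}, "B": {"f1", "f2"}, "C": {"f2"}}
example_plausibility = {"A": 3, "B": 3, "C": 1}
example_composite = assemble_delaying_hard_decisions(
    ["f1", "f2"], example_explains, example_plausibility
)
```

In the example the tie over f1 never has to be broken: B is accepted for f2 and happens to cover f1 as well, so the composite contains B alone.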
The PEIRCE implementation of RED (RED-3) was rerun on a number of
cases for which RED-2 was previously evaluated. One problem that we encoun-
tered was providing a reasonable measure of performance to show clearly the
difference between RED-3 and RED-2. Their ability to get the "correct" answer
is not a good measure because both systems either get the same answer or,
where their answers differ, our experts deemed the answers to be close enough.
We needed to measure how the two systems arrived at their answers.
We discovered that the major difference between the two assembly strate-
gies was the contents of their initial hypotheses. This difference can be di-
rectly attributed to RED-3's postponement of some hard decisions. More-
over, RED-3 also discovered the essentialness of some hypotheses in the
process of assembly, which resulted in resolving some of the outstanding
hard decisions. The initial explanations were consequently much closer to
the final explanations that both strategies provided. The delaying-hard-de-
cisions feature of PEIRCE made the search process more efficient.2
Constrictors can be used to focus the diagnostic process in the same fash-
ion as pathognomonic findings. Though not so powerful as to enable focus-
ing on a particular disease, constrictors are more common than
pathognomonic symptoms and thus are more generally useful. An example
of a constrictor is jaundice, which indicates liver disease. Caduceus begins
its diagnostic process by looking for constrictors. The constrictors it finds
help to focus the diagnostic process on a limited number of hypotheses.
The ability to use knowledge of constrictors, or of special meaningful
patterns of findings, can be achieved in PEIRCE by allowing for special
extra methods to be sponsored for the extend-explanatory-coverage goal,
which arises as a result of selecting the extend-explanatory-coverage tactic.
Each such method is designed to try to explain certain findings, or finding
patterns, on the basis of special knowledge regarding how to recognize and
attempt to explain those findings or patterns. The default would be to pro-
ceed using general methods, one finding at a time, but for each domain a
new set of sponsors can be introduced to explain special cases.
Refinement control
In RED-3 we hoped to provide more control over the hypothesis-refinement
process in the hierarchical classifier by guiding it according to explanatory
need. By this we mean that the decision to refine a classification node (i.e.,
examine its subnodes) considers, not only the ability of the node to estab-
lish (i.e., set high plausibility), but also whether the node represents a hy-
pothesis that is needed to explain any of the findings. Thus the hierarchical
classifier and the abductive assembler interleave their operations. The hi-
erarchical classifier provides a list of candidate hypotheses (and their asso-
ciated plausibility ratings) to the abductive assembler, but these hypoth-
eses have only a certain level of specificity (i.e., they were explored down
to a certain level in the hierarchy). The abductive assembler then creates
an explanation of the findings by using this candidate list. The assembler
can then request a refinement of one of the hypotheses used in its com-
pound explanation. The classifier refines this node and returns a list of can-
didate subnodes (more specific hypotheses) to replace the original node.
The assembler removes the original hypothesis and attempts to reexplain
the findings that it explained, using the refined hypothesis. This interac-
tion continues until the explanation is specific enough, or the hierarchy is
examined to its tip nodes.
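The interleaving just described can be illustrated with a minimal sketch. It is not the RED-3 or PEIRCE code: the Node class, the greedy assemble function, and the example hypotheses are assumptions made for illustration, and the sketch simplifies by reassembling the whole explanation after each refinement rather than reexplaining only the findings of the refined node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical classification node: a hypothesis plus its subnodes."""
    name: str
    plausibility: float
    explains: frozenset
    children: list = field(default_factory=list)


def assemble(candidates, findings):
    """Greedy stand-in for the abductive assembler: cover each finding with
    the most plausible candidate that offers to explain it."""
    chosen = []
    for f in sorted(findings):
        if any(f in h.explains for h in chosen):
            continue
        offers = [h for h in candidates if f in h.explains]
        if offers:
            chosen.append(max(offers, key=lambda h: h.plausibility))
    return chosen


def refine_by_need(top_level, findings):
    """Interleave assembly and refinement until every hypothesis used in
    the explanation is a tip node."""
    candidates = list(top_level)
    explanation = assemble(candidates, findings)
    while True:
        refinable = [h for h in explanation if h.children]
        if not refinable:
            return explanation
        node = refinable[0]
        # Classifier step: replace the node with its more specific subnodes.
        candidates.remove(node)
        candidates.extend(node.children)
        # Assembler step: explain the findings again with the refined list.
        explanation = assemble(candidates, findings)
```

Only nodes that appear in the current explanation are ever refined, so a plausible node that explains none of the findings is never expanded. That is the sense in which refinement is guided by explanatory need in this sketch.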
The usefulness of PEIRCE
We want to emphasize the utility of PEIRCE as a tool for experimenting with
the various methods of abductive assembly. It is fairly simple to change the
conditions under which a method is invoked in the overall assembly process by
modifying the knowledge of when and to what degree the method is appropri-
ate. Such changes are local in that they affect only the invocation of one method
at one recurring decision point. Adding a new method is equally simple, be-
cause it needs only to be included in the sponsor-selector tree with some knowl-
edge of when it is appropriate, and what priority it should have.
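The locality of such changes can be seen in a small sketch of one sponsor-selector decision point. This is an illustrative assumption, not PEIRCE's LOOPS implementation: sponsors are modeled as plain functions that rate a method for the current state, and apart from extend-explanatory-coverage the method names and the numeric ratings are hypothetical.

```python
def select_method(sponsors, state):
    """Ask every sponsor to rate its method for the current state and
    return the name of the highest-rated method, or None if all decline."""
    ratings = [(sponsor(state), name) for name, sponsor in sponsors.items()]
    ratings = [(rating, name) for rating, name in ratings if rating > 0]
    return max(ratings)[1] if ratings else None


# Each sponsor holds the knowledge of when, and to what degree, its
# method is appropriate. A rating of 0 means "not applicable now".
sponsors = {
    "extend-explanatory-coverage": lambda s: 3 if s["unexplained"] else 0,
    "remove-superfluous-parts": lambda s: 2 if s["superfluous"] else 0,
    "resolve-inconsistency": lambda s: 5 if s["inconsistent"] else 0,
}
```

In this sketch, adding a method amounts to adding one entry to the dictionary, and changing when a method is invoked amounts to editing one sponsor function; no other part of the decision point is touched.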
Abduction in SOAR
Introduction. Any single algorithm for abduction requires specific kinds of knowl-
edge and ignores other kinds of knowledge. Hence, a knowledge-based system
that uses a single abductive method is restricted to using the knowledge re-
quired by that method. This restriction makes the system brittle because the
single fixed method can respond appropriately only in a limited range of situa-
tions and can make use of only a subset of the potentially relevant knowledge.
To remedy this problem we have endeavored to develop a framework from which
abductive strategies can be opportunistically constructed at run time to reflect
the problem being solved and the knowledge available to solve the problem. In
this section we present this framework and describe ABD-SOAR, an implemen-
tation of the framework. We show how ABD-SOAR can be made to behave like
the abductive strategy used in RED-2, and we describe the differences between
ABD-SOAR and PEIRCE.
This work on ABD-SOAR contributes to our understanding of both knowl-
edge-based systems and abduction. First, it illustrates how to increase the
problem-solving capabilities of knowledge-based systems by using mecha-
nisms that permit the use of all relevant knowledge. ABD-SOAR requires
little domain knowledge to begin solving a problem but can easily make use
of additional knowledge to solve the problem better or faster. Second, the
framework can be used to provide a flexible abductive problem-solving ca-
pability for knowledge-based systems. Third, ABD-SOAR gives SOAR
(Laird, Rosenbloom, & Newell, 1986; Newell, 1990; Rosenbloom, Laird, &
Newell, 1987) an abductive capability so that many systems written in SOAR
can begin to solve abduction problems. Fourth, the framework provides a
simple and general mechanism for abduction that is capable of generating
the behavior of various fixed methods. Fifth, ABD-SOAR can be used to
experiment with different abductive strategies, including variations of ex-
isting strategies and combinations of different kinds of strategies.
SOAR has been proposed as an architecture for general intelligence. In
SOAR, all problem solving is viewed as searching for a goal state in a prob-
lem space. Viewing all problem-solving activity as search in a problem space
is called the problem-space computational model (PSCM) (Newell, Yost,
Laird, Rosenbloom, & Altmann, 1991). A problem space is defined by an
initial state and a set of operators that apply to states to produce new states.
The behavior of a problem-space system depends on three kinds of knowl-
edge: Operator-proposal knowledge indicates the operators that can be ap-
plied to the current state of problem solving. Operator-selection knowledge
selects from the list of applicable operators a single operator to apply to the
current state. Finally, operator-implementation knowledge applies a selected
operator to the current state to produce a new state. If any of this knowledge
is incomplete, SOAR automatically sets up a subgoal to search for addi-
tional knowledge. The subgoal is treated like any other problem in that it
must be achieved by searching through a problem space. The process of
enumerating and selecting operators at each step of problem solving, to-
gether with automatic creation of subgoals to overcome incomplete knowl-
edge, makes SOAR an especially appropriate framework for exploring op-
portunistic problem solving. Chandrasekaran (1988) says this about SOAR:
My view is that from the perspective of modeling cognitive behavior, a GT [ge-
neric-task]-level analysis provides two closely related ideas which give additional
content to phenomena at the SOAR architecture level. On the one hand, the GT
theory provides a vocabulary of goals that a SOAR-like system may have. On the
other hand, this vocabulary of goals also provides a means of indexing and organiz-
ing knowledge in long-term memory such that, when SOAR is pursuing a problem
solving goal, appropriate chunks of knowledge and control behavior are placed in
short term memory for SOAR to behave like a GT problem solver. In this sense a
SOAR-like architecture, based as it is on goal achievement and universal subgoaling,
provides an attractive substratum on which to implement future GT systems. In
turn, the SOAR-level architecture can give graceful behavior under conditions that
do not match the highly compiled nature of GT type problem solving. (pp. 208-209)
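The problem-space computational model described before the quotation can be sketched in a few lines. This is a toy illustration, not SOAR: the function names, the Impasse exception, and the resolve_impasse callback (standing in for search in a subgoal's problem space) are all assumptions made for the example.

```python
class Impasse(Exception):
    """Raised when proposal or selection knowledge is incomplete."""


def pscm_step(state, propose, select, apply_op):
    """One decision cycle: propose operators, select one, apply it."""
    operators = propose(state)           # operator-proposal knowledge
    if not operators:
        raise Impasse("no operator proposed")
    chosen = select(state, operators)    # operator-selection knowledge
    if chosen is None:
        raise Impasse("cannot choose among " + ", ".join(sorted(operators)))
    return apply_op(state, chosen)       # operator-implementation knowledge


def solve(state, is_goal, propose, select, apply_op, resolve_impasse,
          max_steps=100):
    """Search for a goal state, handing each impasse to a subgoal handler."""
    for _ in range(max_steps):
        if is_goal(state):
            return state
        try:
            state = pscm_step(state, propose, select, apply_op)
        except Impasse as impasse:
            # A real architecture would set up a subgoal and search another
            # problem space; here a callback supplies the missing knowledge.
            state = resolve_impasse(state, impasse)
    raise RuntimeError("step limit reached without reaching a goal state")
```

The point of the sketch is that missing knowledge does not halt problem solving. It is detected at a specific decision and turned into a further problem to be solved.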
The framework
The general abductive framework3 can be described by using a single prob-
lem space with seven operators: cover, resolve-redundancy, resolve-incon-
sistency, determine-certainty, determine-accounts-for, mark-redundancies,
and mark-inconsistencies. A problem space is described by specifying its
goal, the knowledge content of its states, the initial state, operators, and
search-control knowledge. First we describe the problem spaces, then we
describe the minimal knowledge required to use the framework.
Initial-state schema. The initial state need contain only the data to be ex-
plained. Additional information can be provided.
Red Cells
1 2 3 4
0 (r3) 1+ 0 0
RED-like abduction. RED-2 uses two heuristics to help guide the search: (1)
The system prefers to cover stronger reactions before weaker reactions. If
reactions are equal, then one is selected at random to cover. (2) Whenever
multiple antibodies can explain the same reaction, the antibodies are or-
dered according to plausibility, and the one with the highest plausibility is
used. If multiple antibodies have the same plausibility, then one is selected
at random (this is a somewhat simplified description). This knowledge can
be added to ABD-SOAR by rating antibodies for plausibility and by adding
three search-control rules:
Comparison to PEIRCE
ABD-SOAR can be thought of as a reimplementation of PEIRCE in SOAR
to see whether such an implementation would give added flexibility to
PEIRCE. Because of the generality of its goal-subgoal hierarchy and its con-
trol mechanism, PEIRCE can be used to encode many different strategies
for abduction. For example, it was used to recode RED-2's strategy and sev-
eral variations of that strategy. However, problems with the control mecha-
nism and the goal-subgoal decomposition ultimately restrict flexibility by
limiting the knowledge that a PEIRCE-based system can use.
First, there is no way to generate additional search-control knowledge in
PEIRCE at run time. There is also no way to add new goals or methods at
run time. This is not a problem in ABD-SOAR because any subgoal or im-
passe can be resolved using the complete processing power of the PSCM.
Thus, search-control knowledge, evaluation knowledge, or operators for a
problem space can be generated just as any other kind of knowledge in ABD-
SOAR can.
Second, the control mechanism in PEIRCE cannot detect problem-solv-
ing impasses that result from a lack of knowledge. In ABD-SOAR, the ar-
chitecture automatically detects and creates a goal for these kinds of im-
passes. Finally, the goal-subgoal structure in ABD-SOAR is much finer-
grained than the one used in PEIRCE. This means that the abductive strat-
egy can be controlled at a finer level of detail in ABD-SOAR.
chapter 2, the idea of generic tasks brings several problems to mind: What
constitutes a generic task? and How do we distinguish between generic tasks
and other sorts of methods? These sorts of issues led us to develop the ap-
proach to generic tasks seen in this chapter and in chapter 5.
Instead of thinking of problem solving, such as diagnosis and design, as
complex tasks that can be decomposed into elementary generic-task special-
ists, in these systems we think of them as complex activities involving a
number of subtasks and a number of alternative methods potentially avail-
able for each subtask.
Methods
A method can be described in terms of the operators that it uses, the objects
upon which it operates, and any additional knowledge about how to orga-
nize operator application to satisfy the goal.4 At the knowledge level, the
method is characterized by the knowledge that the agent needs in order to
set up and apply the method. Different methods for the same task might call
for different types of knowledge. Let us consider a simple example. To mul-
tiply two multidigit numbers, the logarithmic method consists of the follow-
ing series of operations: Extract the logarithm of each input number, add the
two logarithms, and extract the antilogarithm of the sum. Their arguments,
as well as the results, are the objects of this method. Note that one does not
typically include (at this level of description of the logarithmic method)
specifications about how to extract the logarithm or the antilogarithm or
how to do the addition. If the computational model does not provide these
capabilities as primitives, the performance of these operations can be set up
as subtasks of the method. Thus, given a method, the application of any of
the operators can be set up as a subtask.
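The logarithmic method can be written down directly. In the sketch below the three operators are passed in as parameters, which is one way (an assumption of this example) to show that the method itself does not say how they are carried out. The defaults use Python's math library as the available primitives, and the method applies only to positive numbers.

```python
import math


def multiply_by_logarithms(a, b,
                           log=math.log10,
                           antilog=lambda x: 10.0 ** x,
                           add=lambda x, y: x + y):
    """Multiply two positive numbers by the logarithmic method:
    extract both logarithms, add them, extract the antilogarithm."""
    return antilog(add(log(a), log(b)))
```

For example, multiply_by_logarithms(123.0, 456.0) is approximately 56088, up to floating-point error. If one of the operators were not available as a primitive, a caller could supply a table-lookup or series-expansion routine in its place, which corresponds to setting up that operator as a subtask.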
Some of the objects that a method needs can be generic to a class of prob-
lems in a domain. As an example, consider hierarchical classification using
a malfunction hierarchy as a method for diagnosis. Establish-hypothesis and
refine-hypothesis operations are applied to the hypotheses in the hierarchy.
These objects are useful for solving many instances of the diagnostic prob-
lem in the domain. If malfunction hypotheses are not directly available, the
generation of such hypotheses can be set up as subtasks. One method for
generating such objects is compilation from so-called deep, or causal, knowl-
edge. There is no finite set of mutually distinct methods for a task because
there can be numerous variants on a method. Nevertheless, the term method
is a useful shorthand to refer to a set of related ways to organize a computa-
tion.
A task analysis should provide a framework within which various ap-
proaches to the task at hand can be understood. Each method can be treated
in terms of all the features that an information-processing analysis calls for:
the types of knowledge and information needed and the inference processes
that use these forms of knowledge.
Choosing methods
How can methods be chosen for the various tasks? Following are some criteria:
1. Properties of the solution. Some methods produce answers that are numeri-
cally precise; others produce only qualitative answers. Some of them produce
optimal solutions; others produce satisficing ones.5
2. Properties of the solution process. Is the computation pragmatically feasible?
How much time does it take? How much memory does it require?
3. Availability of knowledge required for the method to be applied. A method for
design verification might, for example, require that we have available a de-
scription of the behavior of a device as a system of differential equations; if
this information is not directly available, and if it cannot be generated by
additional problem solving, the method cannot be used.
Each method in a task structure can be evaluated for appropriateness in a
given situation by asking questions that reflect these criteria. A delineation
of the methods and their properties helps us move away from abstract argu-
ments about ideal methods for performing a task. Although some of this
evaluation can take place at problem-solving time, much of it can be done
when the knowledge system is designed; this evaluation can be used to guide
a knowledge-system designer in the choice of methods to implement.
repeating the process. SOAR (Newell, 1990); BB1 (Hayes-Roth, 1985); spon-
sor-selector hierarchies, as in DSPL (Brown & Chandrasekaran, 1989) and
PEIRCE, are good candidates for such an architecture. This approach com-
bines the advantages of task-specific architectures and the flexibility of run-
time choice of methods. The DSPL++ work of Herman (1992) aims to dem-
onstrate precisely this combination of advantages.
Using method-specific knowledge and strategy representations within a
general architecture that helps select methods and set up subgoals is a good
first step in adding flexibility to the advantages of task-specific architec-
tures. It can also have limitations, however. For many real-world problems,
switching among methods can result in control that is too coarse-grained. A
method description might call for a specific sequence of how the operators
are to be applied. Numerous variants of the method, with complex sequences
of the various operators, can be appropriate in different domains. It would
be impractical to support all these variants by method-specific architectures
or shells. It is much better in the long run to let the task-method-subtask
analysis guide us in identifying the needed task-specific knowledge and to
let a flexible general architecture determine the actual sequence of operator
application by using additional domain-specific knowledge. The subtasks
can then be flexibly combined in response to problem-solving needs, achiev-
ing a much finer-grained control behavior. This sort of control is evident in
the ABD-SOAR system just discussed.
Notes
1 PEIRCE was named after Charles Sanders Peirce, the originator of the term "abduction" for a form
of inference that makes use of explanatory relationships. PEIRCE was designed by John R. Jo-
sephson, Michael C. Tanner and William F. Punch III, and implemented by Punch in LOOPS
(from Xerox Corporation), an object-oriented programming system in InterLisp-D (also a Xerox
product).
2 Investigated in experiments conducted by Olivier Fischer (unpublished).
3 For a detailed description of the framework and several examples, see Johnson (1991).
4 The terms "task" and "goal" are used interchangeably here.
5 The term "satisficing" is due to Herbert Simon. See (Simon, 1969).
More kinds of knowledge: Two diagnostic systems
TIPS (Task Integrated Problem Solver) and PATHEX/LIVER were built us-
ing the PEIRCE tool. Both are examples of third-generation abduction ma-
chines. PEIRCE is not specialized for diagnoses and might be used as a shell
for any abductive-assembly system. TIPS and PATHEX/LIVER, however,
are diagnostic systems. They are complicated systems that are similar in
organization and capabilities. Despite their similarities, in the following de-
scriptions we emphasize TIPS's ability to dynamically integrate multiple
problem-solving methods and PATHEX/LIVER's proposed ability to com-
bine structure-function models - for causal reasoning - with compiled diag-
nostic knowledge. First we describe TIPS, and then PATHEX/LIVER.
TIPS
TIPS1 is a preliminary framework that implements the idea (described in chap-
ter 4) of making alternative problem-solving methods available for a task. Method
invocation depends on the problem state and the capabilities of the method, not
on a preset sequence of invocations. TIPS presents a general mechanism for the
dynamic integration of multiple methods in diagnosis.
One can describe diagnosis not only in terms of the overall goal (say,
explaining symptoms in terms of malfunctions), but also in terms of the rich
structure of subgoals that arise as part of diagnostic reasoning and in terms
of the methods used to achieve those goals. We call such a description a
task-structure analysis. A diagnostic system explicitly realized in these terms
has a number of advantages:
a. Such a system has multiple approaches available for solving a problem.
Thus the failure of one method does not mean failure for the whole prob-
lem solver.
b. Such a system can potentially use more kinds of knowledge.
c. Such a system can potentially solve a broader range of diagnostic prob-
lems.
The TIPS approach to creating dynamically integrated problem solvers is
TIPS is by William F. Punch III and B. Chandrasekaran. PATHEX/LIVER: Structure-
Function Models for Causal Reasoning is by Jack W. Smith, B. Chandrasekaran, and Tom
Bylander.
[Figure: sponsor-selector tree for diagnosis. Labels: Diagnosis Selector; Return; Fail; and a sponsor paired with a method for each of Data-Gathering, Redo-Hard-Decisions, Data-Validation, Causal-Reasoning, and Compiled-Knowledge.]
[Figure: architecture diagram. Labels: Overview, Abductive Assembly, Subhypothesis Selection, Inferencing Database.]
ing uses the HYPER language (chapter 2) to represent the knowledge for
mapping patterns of data to confidence values.
The third module is an Overview module based on the PEIRCE language
(see chapter 4) which performs hypothesis assembly and criticism. In the
mechanism used in the Overview module for PATHEX/LIVER-1, abductive
assembly alternates between assembly and criticism.3 Assembly is performed
by a means-ends problem solver that is driven by the goal of explaining all
the significant findings using hypotheses that are maximally specific. At
each step of assembly, the most significant datum yet to be explained is used
to select the most plausible subhypothesis that offers to explain it. After the
initial assembly, the critic removes superfluous parts and determines which
parts are essential (i.e., explain some datum that no other plausible hypothesis
can explain). Assembly is repeated, starting with the essential
subhypotheses. Finally, the critic again removes any superfluous parts. This
is essentially the RED-2 strategy for abductive assembly (see chapter 3),
although it is augmented by PEIRCE's tactic of delaying hard decisions and
PEIRCE's ability to influence hypothesis refinement (see chapter 4).
Abductive assembly depends heavily on knowing which data a composite
hypothesis can explain in the particular case. In RED-2, this was relatively
easy because it was possible to determine what a composite hypothesis ex-
plained by "adding up" what each subhypothesis explained (see chapter 3).
In PATHEX/LIVER-1, as an interim solution to the problem, we directly
attached a description of the data that each subhypothesis offers to explain.
This description is provided by compiled knowledge associated with each
potential subhypothesis. However, for better diagnostic performance in the
liver disease domain more flexibility is needed to determine how disorders
interact causally. We intend to use a type of structure-function model to
provide this ability in the proposed PATHEX/LIVER-2 system.
Typical Function
Function: lower-blood-pressure-of-body
ToMake: decreased-blood-pressure
If: blood-pressure-abnormally-high
By: lower-blood-pressure-process
with a type of qualitative simulation, to predict the states and their values
that would result from particular initial states of the system (see chapter 8).
Sticklen (1987) used the output of this simulation to compile potentially
useful pieces of hypothesis-matching knowledge for disorders that affect
the complement system. Although compiled knowledge generated in this
manner is not equivalent to clinical heuristic knowledge derived from expe-
rience, it can nevertheless be used to supplement such experiential knowl-
edge, to causally explain it, to justify it, or to check its consistency under
appropriate circumstances.
Matt DeJongh explored the use of FR in the RED domain for representing
the causal processes that relate data to potential explanations (DeJongh,
1991). He was able to use FR representations of the processes involved in
antibody-antigen reactions to formulate explanatory hypotheses (at run time)
by deriving them from the causal-process representations.
[Figure 5.4 diagram. Knowledge types shown: Compiled Diagnostic Knowledge, Functional Knowledge, Predictive Knowledge, Behavioral Knowledge, Structural Knowledge. Other labels: Access, Output, Compilation Process.]
Figure 5.4. Some types of knowledge useful for diagnosis and compilation processes
that derive one type from another.
Notes
1 William F. Punch III designed and built TIPS as part of his doctoral dissertation work under B.
Chandrasekaran and Jack Smith (Punch, 1989). John Svirbely, MD, and Jack Smith, MD, pro-
vided domain expertise.
2 [Note that {function | connection-topology | spatial-arrangements} is a domain-independent tri-
chotomy of diagnostically useful types of knowledge for almost any device. It applies in electri-
cal and mechanical as well as biological domains. Malfunction in a power supply may reduce
voltage available to other components and cause secondary loss of function. A break in electrical
connection can occur in any electrical device and has consequences that can be predicted if con-
nectivity is known. Overheating of an electrical component may damage spatially proximate com-
ponents. - J. J.]
3 Jack Smith, William Punch, Todd Johnson and Kathy Johnson designed PATHEX/LIVER-1. It
was implemented by William Punch (abductive assemblers), Kathy Johnson (inferencing database,
using IDABLE software written by Jon Sticklen), and Todd Johnson (CSRL re-implementation
and HYPER tool for hypothesis matching). John Svirbely, MD, Jack Smith, MD, Carl Speicher,
MD, Joel Lucas, MD, and John Fromkes, MD provided domain expertise.
4 For clarity we use the phrase causal-process description to replace what was called a behavior in
the original representations.
Better task analysis, better strategy - machine 4
Summary of Progress and Task Analysis of Abductive Hypothesis Formation are by John
R. Josephson. Concurrent Assembly is by Ashok K. Goel, John R. Josephson, and P.
Sadayappan. Efficiency of the Essentials-first Strategy is by Olivier Fischer, Ashok K.
Goel, John Svirbely, and Jack W. Smith.
Machine 1
First-generation machines, such as RED-1, use the hypothesis-assembly strat-
egy of beginning with the most plausible hypothesis and continuing to con-
join less and less plausible hypotheses until all the data are explained. Then,
to ensure parsimony, the first-generation strategy is to work from least to
most plausible part of the working hypothesis, removing any parts that are
explanatorily superfluous. Then a form of criticism occurs that determines,
for each part in the working hypothesis, whether it is essential - that is,
whether it explains some datum that can be explained in no other way
(known to the system). If h1 is essential, then any composite hypothesis without
h1 will necessarily leave something unexplained. The strategy for determining
essentialness is to construct the largest composite hypothesis, not
including the part in question, and to check for an unexplained remainder.
Essential parts of the final composite hypothesis are marked as especially
confident.
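The first-generation strategy can be summarized in a short sketch. It is a reading of the description above under simplifying assumptions, not the RED-1 code: hypotheses are keys of a dictionary named explains that maps each one to the set of data it offers to explain, plausibility is a dictionary of numeric ratings, and what a composite explains is taken to be the union of what its parts explain.

```python
def covered(hyps, explains):
    """Union of the data explained by each hypothesis in hyps."""
    out = set()
    for h in hyps:
        out |= explains[h]
    return out


def machine1(data, explains, plausibility):
    ranked = sorted(explains, key=plausibility.get, reverse=True)

    # Assembly: conjoin hypotheses from most to least plausible until
    # all the data are explained.
    working = []
    for h in ranked:
        if data <= covered(working, explains):
            break
        working.append(h)

    # Parsimony: from least to most plausible, remove any part whose
    # removal loses no explanatory coverage.
    for h in reversed(list(working)):
        rest = [g for g in working if g != h]
        if (data & covered(working, explains)) <= covered(rest, explains):
            working = rest

    # Criticism: a part is essential if the largest composite that omits
    # it leaves an unexplained remainder.
    explainable = data & covered(ranked, explains)
    essential = [
        h for h in working
        if not (explainable <= covered([g for g in ranked if g != h], explains))
    ]
    return working, essential
```

Nothing in the sketch refers to a particular domain; it uses only hypotheses, plausibilities, and explanatory coverage, which is the sense of domain independence intended here.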
This is the overall strategy of RED-1. It is domain-independent, not in
being applicable in every domain, but in having a specification that makes
no use of domain-specific terms. Instead the specification is confined to the
domain-independent (but abduction-specific) vocabulary of hypotheses, their
plausibilities, what they explain, and so forth. Systems such as RED-1 that
use this strategy can be considered to be instances of the Machine 1 abstract
abduction machine.
Machine 2
Second-generation abduction machines, such as RED-2, use the hypothesis-
assembly strategy of focusing attention on an unexplained datum and con-
joining to the working hypothesis a hypothesis that best explains that datum
(typically the most plausible explainer). Then the unexplained remainder is
computed, and a new focus of attention is chosen. This process continues
until all the data are explained, or until no further progress can be made. In
effect, the assembly strategy is to decompose the larger abduction problem,
that of explaining all the data, into a series of smaller and easier abduction
subproblems of explaining a series of particular data items. The smaller prob-
lems are easier to solve primarily because they usually have smaller differ-
entials than the larger problem does, that is, there are fewer relevant alterna-
tive explainers. Second generation machines also have a way to handle in-
compatible hypotheses: The consistency of the growing hypothesis is main-
tained, even if doing so requires removing parts of the hypothesis to accom-
modate a new one.
Second-generation machines use the same strategy for ensuring parsimony
as first-generation machines do, but they have a more elaborate strategy for
determining essentialness. Some hypotheses are discovered to be essential
during assembly of an initial composite (if it is noticed that a finding has
only one plausible explainer), but for each part of the initial composite whose
essentialness has not been determined, an attempt is made to construct a
complete, consistent alternative hypothesis that does not use the part in ques-
tion. If this cannot be done, the hypothesis is essential. If essential hypoth-
eses are found, the set of essentials is used as a starting point to assemble a
new composite hypothesis. Parsimony is checked again if necessary.
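A minimal sketch of the second-generation strategy follows. It makes several simplifying assumptions that should be kept in mind: the focus of attention is chosen in sorted order rather than by any measure of importance, incompatible hypotheses and the final parsimony check are left out, and the data structures (a dictionary named explains from hypotheses to sets of data, and a plausibility dictionary) are hypothetical. It is not the RED-2 code.

```python
def assemble(data, explains, plausibility, start=(), banned=()):
    """Cover the data one datum at a time, each time conjoining the most
    plausible explainer. Returns (composite, unexplained, noticed), where
    noticed holds hypotheses seen to be the only explainer of some datum."""
    composite, noticed = list(start), set()
    covered = set().union(*(explains[h] for h in composite))
    for d in sorted(data):               # focus of attention
        if d in covered:
            continue
        offers = [h for h in explains if d in explains[h] and h not in banned]
        if not offers:
            continue                     # no progress possible on this datum
        if len(offers) == 1 and not banned:
            noticed.add(offers[0])       # essential, discovered during assembly
        best = max(offers, key=plausibility.get)
        composite.append(best)
        covered |= explains[best]
    return composite, set(data) - covered, noticed


def machine2(data, explains, plausibility):
    composite, _, essential = assemble(data, explains, plausibility)
    explainable = {d for d in data if any(d in e for e in explains.values())}
    for h in composite:
        if h in essential:
            continue
        # Try to construct a complete alternative that does not use h.
        _, unexplained, _ = assemble(data, explains, plausibility, banned=(h,))
        if unexplained & explainable:
            essential.add(h)
    if essential:
        # Reassemble, using the essentials as the starting point.
        kernel = sorted(essential, key=plausibility.get, reverse=True)
        composite, _, _ = assemble(data, explains, plausibility, start=kernel)
    return composite, essential
```

Each call to assemble works on one datum at a time, so the set of rival explainers considered at any step is only the set that offers to explain that datum, which is usually much smaller than the full hypothesis set.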
Overall this is the second-generation strategy. This one too is domain in-
dependent. Machines that use this strategy can be considered to be instances
of abstract Machine 2. Machines of generations one and two were described
in detail in chapter 3 as descriptions of RED-1 and RED-2.
Machine 3
Machine 3 (e.g., PEIRCE) has a more advanced strategy that combines im-
proved overall flexibility and opportunism in problem solving with some
new abilities. The flexibility and opportunism come from choosing at run
time from among several ways of improving the working hypothesis. Most
of these hypothesis-improvement tactics are already parts of the second-
generation strategy, but in Machine 3 they are separated out and available in
mixed order rather than always in the same fixed sequence.
Machine 3 has three main new abilities: control of hypothesis refinement,
special response to special data or patterns of data, and delaying hard
decisions. These three abilities, and the opportunism, are described in chapter
4. Machine 3 also accepts "plug-in modules" that give it additional abilities
to reason with explicit causal, functional, and/or structural knowledge. These
extensions are described in chapter 5.
Control of hypothesis refinement enables a strategy of first assembling a
general hypothesis and then pursuing greater specificity. A diagnostician
might decide that the patient has some form of liver disease and also some
form of anemia, and decide to pursue a more detailed explanation by con-
sidering possible subtypes. Special response to special data or patterns of
data enables a reasoner to take advantage of evoking knowledge that associ-
ates specific findings or finding patterns with hypotheses or families of hy-
potheses especially worth considering. Delaying hard decisions - postpon-
ing choosing an explainer for an ambiguous finding - has advantages for
computational efficiency and for correctness. The efficiency advantages come
from not making decisions during hypothesis assembly that are arbitrary or
risky, and thus avoiding the need for subsequent retraction during the pro-
cess of criticism. The correctness advantage arises because a mistaken ear-
lier decision may adversely affect later decisions during assembly, and criti-
cism may not be able to detect and correct a mistaken choice. (The correct-
ness advantage has not been directly tested empirically.)
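The efficiency argument for delaying hard decisions can be made concrete with a small sketch. The criterion used here for a hard decision (the two best explainers differ in plausibility by less than a fixed margin) is an assumption of the example, as are the function and parameter names; PEIRCE's actual criterion is not specified in this passage.

```python
def assemble_delaying(data, explains, plausibility, margin=0.2):
    """First pass commits only to clear choices; ambiguous findings are
    postponed and revisited after the clear choices have extended coverage.
    Returns (composite, postponed)."""
    composite, covered, postponed = [], set(), []
    for d in sorted(data):
        if d in covered:
            continue
        offers = sorted((h for h in explains if d in explains[h]),
                        key=plausibility.get, reverse=True)
        if not offers:
            continue
        clear = (len(offers) == 1 or
                 plausibility[offers[0]] - plausibility[offers[1]] >= margin)
        if clear:
            composite.append(offers[0])
            covered |= explains[offers[0]]
        else:
            postponed.append(d)          # hard decision: delay it
    # Second pass: only findings that are still unexplained force a choice.
    for d in postponed:
        if d not in covered:
            best = max((h for h in explains if d in explains[h]),
                       key=plausibility.get)
            composite.append(best)
            covered |= explains[best]
    return composite, postponed
```

When a postponed finding turns out to be covered by a hypothesis that was accepted for other reasons, no arbitrary choice is ever made for it, and so there is nothing for criticism to retract.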
Table 6.1. Six generations of abduction machines
Machines 4, 5, and 6
Although some subtasks of forming a composite explanation are inherently
sequential, others are not. Our work on Machine 4 was stimulated by the
idea of trying to design for concurrent processing, and thereby force an analy-
sis of the subtask dependencies. The result, however, was a new task analy-
sis and a new abductive-assembly strategy, both of which make sense inde-
pendently of concurrent processing. In Machine 4, subtasks that do not de-
pend on one another are performed concurrently, and composite explana-
tions are formed by starting with a kernel of essential hypotheses and work-
ing outward to less and less confident ones. Fourth-generation machines are
described in this chapter, whereas fifth-generation machines are described
in chapter 9 and sixth-generation machines are described in chapter 10.
Table 6.1 shows the main innovations associated with each of the six ma-
chines. The capabilities of the machines are summarized in the appendix
entitled Truth Seekers in the section entitled Abduction Machines.
Evocation
One method for evocation is to search systematically for applicable con-
[Figure: task structure with the subtasks Evoke and Instantiate; under Instantiate, Determine explanatory coverage and Determine initial confidence value.]
Instantiation
Once evoked, a concept needs to be instantiated, or attached to the data for
the current case. It is not enough, to use a perceptual example, for the frog
concept to be activated in my brain; it also needs to be attached as a hypoth-
esis to account for the croaking sound that I just heard. Besides determining
what a hypothesis can account for it is also useful to have a prima facie
estimate of its likelihood - an initial plausibility, a confidence estimate that
can be used as a guide in comparing the hypothesis with rivals. Typically we
set up our systems first to try ruling out a hypothesis based on its failure to
meet certain prestored necessary conditions. If rule-out fails, explanatory
coverage is determined; then a confidence estimate is made, sometimes based
in part on whether the explanatory coverage is extensive, or whether the
hypothesis will help explain anything at all. Clearly, the processes of deter-
mining explanatory coverage and of determining initial confidences are, in
general, entangled, and we have not gone far in systematically sorting them
out. Note that in general the instantiation of separate concepts can proceed
in parallel, but that the set-confidence and determine-coverage subtasks of
instantiating a concept are codependent and cannot be pursued separately.
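The typical sequence of rule-out, coverage determination, and confidence setting can be sketched as follows. The dictionary keys (necessary, can_explain, prior) and the way confidence is derived from coverage are illustrative assumptions, not the representation used in our systems.

```python
def instantiate(concept, case_data):
    """Attach an evoked concept to the data of the current case.
    Returns (coverage, confidence), or None if the concept is ruled out."""
    # Rule-out: every prestored necessary condition must hold in this case.
    if not all(cond in case_data for cond in concept["necessary"]):
        return None
    # Explanatory coverage: the case data this hypothesis offers to explain.
    coverage = concept["can_explain"] & case_data
    # Initial confidence depends in part on coverage, so the two subtasks
    # are computed together rather than separately.
    confidence = concept["prior"] if coverage else 0.0
    return coverage, confidence
```

Because each call reads only its own concept and the shared case data, calls for separate concepts are independent of one another and could be run in parallel, whereas the coverage and confidence steps inside a single call cannot be separated.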
Concurrent assembly
This section describes Machine 4.
Architectural implications. There are five interesting aspects to this design for
concurrent hypothesis formation, viewed from the prospect of realizing it on a
distributed-memory, message-passing, parallel computer architecture. First, the
parallelism among the data processes is fine-grained, as is the parallelism among
the hypothesis processes. However, there is no concurrency between the data
processes and the hypothesis processes; they never execute at the same time.
Alternative designs that break down this strict sequentiality can be devised. For
example, as soon as a hypothesis finds out from some finding that it is a Clear-
Best, it could immediately start informing the findings in its explanatory cover-
age that they are explained, without waiting to hear from the remaining find-
ings. Nevertheless, in the main, the nonconcurrency between the data-centered
processes and those that are hypothesis-centered appears to be determined by
dependencies in the hypothesis-formation task rather than by failure to design
cleverly enough. Second, at any given time during the processing, the data and
hypothesis processes are either idle or executing the same instruction on differ-
ent data. Third, the processing is communication intensive. For instance, in the
first cycle during the generation of a composite explanation, each hypothesis
process communicates with every data process. Fourth, for real-world prob-
lems, the number of data and hypothesis processes is likely to be very large.
Even for relatively small abduction systems, the number of data and hypothesis
processes can be in the hundreds. Finally, the fine-grained characteristics of the
concurrent design suggest that neural networks might be efficient for abductive-
assembly problems (see Goel, 1989; Goel, Ramanujan, & Sadayappan, 1988;
Peng & Reggia, 1990; Thagard, 1989).
It appears that among existing computers, the Connection Machine (Hillis,
1986) may be the most suitable for realizing the design that we just pre-
sented. The Connection Machine is a distributed-memory, message-passing,
parallel computer, thus it is a good match to the processing demands of the
design. It enables the same instruction to be executed over different data, which
suits the control of processing called for by the design. Its architecture helps
keep the communication costs within acceptable limits, which is a major con-
cern. It supports massive, fine-grained parallelism among a large number of
small, semi-autonomous processes such as is called for by the design.
Summary of Machine 4
We have described Machine 4 for forming composite explanations. This
machine uses an essentials-first strategy, which sets up three subtasks ac-
cording to the confidence (judged by local abductive criteria) with which
elementary hypotheses are included in the composite explanation. In the
first subtask, the Essential hypotheses are identified. In the second subtask,
the Clear-Best hypotheses are identified, where a hypothesis is Clear-Best
if its prima facie confidence value is high both on some absolute scale and
relative to its explanatory alternatives for some finding. In the third
subtask, Weak-Best hypotheses are identified, where a Weak-Best is sim-
ply the best explanation for some finding. First the Essentials, then the
Clear-Bests, then the Weak-Best hypotheses are included in the growing
composite hypothesis. This way the formation of a composite explanation
grows from islands of relative certainty, the composite becoming overall
less and less certain as its explanatory coverage increases. These three
subtasks represent a coarse discretization of a process that could be more
gradual.
After an initial composite hypothesis is built in this way, it may be tested
for parsimony, and improved accordingly, before the processing is concluded.
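The three passes described above can be sketched in a few lines of Python. This is a minimal sequential illustration, not the concurrent Machine 4 itself. It assumes that a hypothesis counts as Essential when it is the only hypothesis offering to explain some finding; the thresholds `high` and `margin`, the function name, and the toy findings and confidence values are all hypothetical. The final parsimony test is omitted.

```python
def essentials_first(explainers, confidence, high=0.8, margin=0.2):
    """Build a composite in three passes: Essential, Clear-Best, Weak-Best.

    explainers: finding -> set of hypotheses offering to explain it
    confidence: hypothesis -> prima facie confidence in [0, 1]
    high, margin: hypothetical thresholds for the Clear-Best test
    """
    def ranked(finding):
        # Most confident first; names break ties so the result is deterministic.
        return sorted(explainers[finding], key=lambda h: (-confidence[h], h))

    def essential(finding):
        r = ranked(finding)
        return r[0] if len(r) == 1 else None      # sole explainer (assumed reading)

    def clear_best(finding):
        r = ranked(finding)
        if not r or confidence[r[0]] < high:       # must be high in absolute terms
            return None
        if len(r) > 1 and confidence[r[0]] - confidence[r[1]] < margin:
            return None                            # and clearly ahead of its rivals
        return r[0]

    def weak_best(finding):
        r = ranked(finding)
        return r[0] if r else None                 # simply the best on offer

    composite, explained = [], set()
    tiers = (("Essential", essential), ("Clear-Best", clear_best), ("Weak-Best", weak_best))
    for tier, rule in tiers:
        for finding in sorted(explainers):
            if finding in explained:
                continue
            h = rule(finding)
            if h is not None:
                composite.append((h, tier))
                # Everything h offers to explain is now accounted for.
                explained |= {f for f, hs in explainers.items() if h in hs}
    return composite


explainers = {"f1": {"A"}, "f2": {"A", "B"}, "f3": {"B", "C"}, "f4": {"C", "D"}}
confidence = {"A": 0.6, "B": 0.9, "C": 0.5, "D": 0.45}
result = essentials_first(explainers, confidence)
print(result)
```

The order of the returned list records how the composite grew, from the most confident inclusions to the least, which mirrors the "islands of relative certainty" idea.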
Machine 4 uses a concurrent mechanism for the abductive-assembly task.
It associates a computational process with each elementary hypothesis and
with each datum to be explained. The processes communicate with each
other by sending messages. The response of a process to a message depends
on which message it receives. The control of processing alternates between
the hypothesis and the data processes. Process synchronization results from
processes waiting for responses to messages that they send.
Machine 4 offers several advantages over earlier generations of abduction
machines. The main advantage lies in the computational efficiency of the
essentials-first strategy, which should also improve accuracy. Additionally,
concurrent processing provides for faster processing, if processors are plen-
tiful. Machine 4 also offers advantages for management of overall uncer-
tainty, as has been described.
No working concurrent implementation of Machine 4 has been done, al-
though the usefulness of the essentials-first strategy has been experimen-
tally tested. This is described in the next section.
[Figure: hypothesis nodes connected to data nodes; columns labeled Hypotheses and Data.]
Experiments
In this section, we report on two sets of experiments that investigated the
essentials-first strategy for its computational and psychological validity. The
first set of experiments evaluated the strategy on a library of 42 test cases in
the RED domain. The second set of experiments involved the collection and
analysis of verbal protocols of a human expert synthesizing explanations in
the RED domain.
[Figure: hypothesis nodes connected to data nodes; columns labeled Hypotheses and Data.]
cians (Smith, Josephson, Tanner, Svirbely, and Strohm [1986] describe some
of these cases).
Of the 42 cases, 17 included at least one Essential hypothesis. Of these 17
cases, in 9 cases Method 2 resulted in more parsimonious explanations than
Method 1 did. Also, the explanations reached in these nine cases were iden-
tical to the answers provided in the teaching material. The remaining 8 cases
fell into two categories: Three cases had solutions containing only one anti-
body, hence there was no need to synthesize a composite explanation. In
each of the remaining 5 cases the hypothesis selected to explain the most
significant reaction by Method 1 was an Essential hypothesis. Thus, for these
cases Method 1 found the same answers as Method 2 did, with equal effi-
ciency. In summary, about two fifths of all cases contained Essential hy-
potheses, and, in approximately half these cases, using the essentials-first
strategy increased the parsimony of the answer.
Incompatibility handling
We suggest that, besides enabling more efficient synthesis of more parsimoni-
ous explanations, the essentials-first strategy is useful for reducing uncertainty
in explanation synthesis and for handling certain types of interactions, such as
incompatibility interactions, more efficiently and effectively (see chapter 9).
The usefulness of the essentials-first strategy for handling incompatibility
interactions can be illustrated by the following example. Let D contain three
elements, $d_1$, $d_2$, and $d_3$. Let the set of elementary hypotheses contain
three elements, $h_1$, $h_2$, and $h_3$, where $h_1$ can account for $d_1$,
$h_2$ can account for $d_2$ and $d_3$, and $h_3$ can account for $d_3$, as shown
in Figure 6.4. Also, let $h_1$ and $h_3$ be incompatible, which implies that
$h_1$ and $h_3$ cannot both be included in the same composite explanation.
In this example Method 1, which does not use the essentials-first strategy,
may include $h_3$ in the first cycle to account for $d_3$. In the next cycle,
when trying to explain $d_1$, Method 1 may select $h_1$, because it is the only
explanation for $d_1$. However, $h_1$ and $h_3$ are incompatible. Therefore,
Method 1 must backtrack, retract the choice of $h_3$, replace it with $h_2$, and
then add $h_1$, yielding $\{h_1, h_2\}$ as the composite explanation.
In contrast, Method 2, using the essentials-first strategy, first selects $h_1$
because it is an Essential hypothesis. In the next cycle, it selects $h_2$
because $h_1$ and $h_3$ are incompatible. Thus, the essentials-first strategy
can help reduce backtracking by taking advantage of incompatibility interactions.
Notes
1 Since making a copy is such a basic operation on information, we can reasonably conjecture that
it is computationally inexpensive for humans to make copies of fragments of episodic and seman-
tic memory. (Certainly it is inexpensive for a digital computer to make copies of representations.)
If so, an analogy can be formed by a process of copy-and-modify.
2 A classical example is Descartes' plausibility argument for an evil deceiver based on an analogy
with God (Descartes, 1641). We also note that the traditional Argument from Design for the exist-
ence of God is clearly an abduction. Moreover, it relies on an analogy with human designers for
its initial plausibility.
The computational complexity of abduction
Introduction
What kinds of abduction problems can be solved efficiently? To answer this
question, we must formalize the problem and then consider its computa-
tional complexity. However, it is not possible to prescribe a specific com-
plexity threshold for all abduction problems. If the problem is "small," then
exponential time might be fast enough. If the problem is sufficiently large,
then even $O(n^2)$ might be too slow. However, for the purposes of analysis,
the traditional threshold of intractability, NP-hard, provides a rough mea-
sure of what problems are impractical (Garey & Johnson, 1979). Clearly,
NP-hard problems will not scale up to larger, more complex domains.
Our approach is the following. First, we formally characterize abduction
as a problem of finding the most plausible composite hypothesis that ex-
plains all the data. Then we consider several classes of problems of this
type, the classes being differentiated by additional constraints on how
hypotheses interact. We demonstrate that the time complexity of each class is
polynomial (tractable) or NP-hard (intractable), relative to the complexity of
computing the plausibility of hypotheses and the data explained by hypotheses.
This chapter is by Tom Bylander, Dean Allemang, Michael C. Tanner, and John R.
Josephson. Tom Bylander is responsible for most of the mathematical results. The
chapter is substantially the same as a paper by the same name that appeared
originally in Artificial Intelligence vol. 49, 1991, except that here we omit
the proofs. It appears with permission of Elsevier Science Publishers B. V. The
full paper also appears in Knowledge Representation edited by R. J. Brachman,
H. J. Levesque, and R. Reiter published by MIT Press in 1992.
Our results show that this type of abduction faces several obstacles. Choos-
ing between incompatible hypotheses, reasoning about cancellation effects
among hypotheses, and satisfying the maximum plausibility requirement are
major factors making abduction intractable in general.
Some restricted classes of abduction problems are tractable. One kind of
class is when some constraint guarantees a polynomial search space, e.g.,
the single-fault assumption (more generally, a limit on the size of composite
hypotheses), or if all but a small number of hypotheses can be ruled out.1
This kind of class trivializes complexity analysis because exhaustive search
over the possible composite hypotheses becomes a tractable strategy.
However, we have discovered one class of abduction problems in which
hypothesis assembly can find the best explanation without exhaustive search.
Informally, the constraints that define this class are: no incompatibility rela-
tionships, no cancellation interactions, the plausibilities of the individual
hypotheses are all different from each other, and one explanation is qualita-
tively better than any other explanation. Unfortunately, it is intractable to
determine whether the last condition holds. We consider one abduction sys-
tem (RED-2) in which hypothesis assembly was applied, so as to examine
the ramifications of these constraints in a real world situation.
The remainder of this chapter is organized as follows. First, we provide a
brief historical background to abduction. Then, we define our model of ab-
duction problems and show how it applies to other theories of abduction.
Next, we describe our complexity results (proofs of which are given in
Bylander, Allemang, Tanner, and Josephson, 1991). Finally, we consider the
relationship of these results to RED-2.
Background
C. S. Peirce, who first described abductive inference, provided two intuitive
characterizations: given an observation d and the knowledge that h causes d,
it is an abduction to hypothesize that h occurred; and given a proposition q
and the knowledge $p \rightarrow q$, it is an abduction to conclude p (Fann, 1970). In
either case, an abduction is uncertain because something else might be the
actual cause of d, or because the reasoning pattern is the classical fallacy of
"affirming the consequent" and is formally invalid. Additional difficulties
can exist because h might not always cause d, or because p might imply q
only by default. In any case, we shall say that h "explains" d and p "ex-
plains" q, and we shall refer to h and p as "hypotheses" and d and q as
"data."
Pople (1973) pointed out the importance of abduction to AI, and he with
Miller and Myers implemented one of the earliest abduction systems, IN-
TERNIST-I, which performed medical diagnosis in the domain of internal
medicine (Miller, Pople, & Myers, 1982; Pople, 1977). This program con-
tained an explicit list of diseases and symptoms, explicit causal links be-
tween the diseases and the symptoms, and probabilistic information associ-
ated with the links. INTERNIST-I used a form of hill climbing - once a dis-
ease outscored its competitors by a certain threshold, it was permanently
selected as part of the final diagnosis. Hypothesis assembly (e.g., Machine 2
in chapter 3) is a generalization of this technique. Below, we describe a
restricted class of problems for which hypothesis assembly can efficiently
find the best explanation.
Based on similar explicit representations, Pearl (1987) and Peng & Reggia
(1990) find the most probable composite hypothesis that explains all the
data, a task that is known to be intractable in general (Cooper, 1990). Later
we describe additional constraints under which this task remains intractable.
In contrast to maintaining explicit links between hypotheses and data,
Davis & Hamscher's (1988) model-based diagnosis determines at run time
what data need to be explained and what hypotheses can explain the data.
Much of this work, such as de Kleer & Williams (1987) and Reiter (1987),
places an emphasis on generating all "minimal" composite hypotheses that
explain all the data. However, there can be an exponential number of such
hypotheses. Recent research has investigated how to focus the reasoning on
the most relevant composite hypotheses (de Kleer & Williams, 1989; Dvorak
& Kuipers, 1989; Struss & Dressler, 1989). However, we have shown that it
is intractable in general to find a composite hypothesis that explains all the
data, and that even if it is easy to find explanations, generating all the rel-
evant composite hypotheses is still intractable.
Whatever the technique or formulation, certain fundamentals of the ab-
duction task do not change. In particular, our analysis shows how computa-
tional complexity arises from constraints on the explanatory relationship
from hypotheses to data and on plausibility ordering among hypotheses. These
constraints do not depend on the style of the representation or reasoning
method (causal vs. logical, probabilistic vs. default, explicit vs. model-based,
ATMS or not, etc.). In other words, certain kinds of abduction problems are
hard no matter what representation or reasoning method is chosen.
Model of abduction
An abduction problem is a tuple $(D_{all}, H_{all}, e, pl)$, where:
$D_{all}$ is a finite set of all the data to be explained,
$H_{all}$ is a finite set of all the individual hypotheses,
$e$ is a map from subsets of $H_{all}$ to subsets of $D_{all}$
($H$ explains $e(H)$), and
$pl$ is a map from subsets of $H_{all}$ to a partially ordered set
($H$ has plausibility $pl(H)$).
For the purpose of this definition and the results that follow, it does not
matter whether $pl(H)$ is a probability, a measure of belief, a fuzzy value, a
degree of fit, or a symbolic likelihood. The only requirement is that the
range of $pl$ is partially ordered.
H is complete if $e(H) = D_{all}$. That is, H explains all the data.
H is parsimonious if $\nexists H' \subset H\ (e(H) \subseteq e(H'))$. That is,
no proper subset of H explains all the data that H does.
H is an explanation if it is complete and parsimonious. That is, H explains
all the data and has no explanatorily superfluous elements. Note that an
explanation exists if and only if a complete composite hypothesis exists.2
H is a best explanation if it is an explanation, and if there is no explanation
$H'$ such that $pl(H') > pl(H)$. That is, no other explanation is more plausible
than H. It is just "a best" because $pl$ might not impose a total ordering over
composite hypotheses (e.g., because of probability intervals or qualitative
likelihoods). Consequently, several composite hypotheses might satisfy this
definition.
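The definitions of complete, parsimonious, explanation, and best explanation translate directly into predicates. The following is a minimal sketch that takes the coverage map e as an arbitrary Python callable. The toy coverage table, the set-union form of e, and the comparator `more_plausible` (which here simply prefers fewer hypotheses) are hypothetical choices made for illustration; they are not part of the model.

```python
from itertools import combinations


def subsets(items, proper=False):
    """Yield subsets of items as frozensets, optionally excluding the full set."""
    items = sorted(items)
    top = len(items) if proper else len(items) + 1
    for r in range(top):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_complete(H, e, d_all):
    return e(H) == d_all


def is_parsimonious(H, e):
    # No proper subset of H explains everything that H explains.
    explained = e(H)
    return not any(explained <= e(sub) for sub in subsets(H, proper=True))


def is_explanation(H, e, d_all):
    return is_complete(H, e, d_all) and is_parsimonious(H, e)


def best_explanations(h_all, e, d_all, more_plausible):
    """All explanations that no other explanation strictly beats."""
    found = [H for H in subsets(h_all) if is_explanation(H, e, d_all)]
    return [H for H in found if not any(more_plausible(G, H) for G in found)]


# Hypothetical toy problem: coverage of a set is the union of its members' coverage.
covers = {"a": {"x"}, "b": {"x", "y"}, "c": {"y"}}
d_all = {"x", "y"}


def e(H):
    return set().union(*(covers[h] for h in H))


def more_plausible(G, H):
    return len(G) < len(H)      # hypothetical ordering: fewer hypotheses is better


best = best_explanations(covers, e, d_all, more_plausible)
print(best)
```

Because `more_plausible` only needs to be a strict partial order, `best_explanations` can return several composites, which is why the text speaks of "a best" rather than "the best" explanation. The exhaustive enumeration is exponential in the number of hypotheses and is meant only to make the definitions concrete.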
Tractability assumptions
In our complexity analysis, we assume that $e$ and $pl$ are tractable. We also
assume that $e$ and $pl$ can be represented reasonably, in particular, that the
size of their internal representations is polynomial in $|D_{all}| + |H_{all}|$.
Clearly, the tractability of these functions is central to abduction, since it
is difficult to find plausible hypotheses explaining the data if it is difficult
to compute $e$ and $pl$. This should not be taken to imply that the tractability
of these functions can be taken for granted. For example, it can be intractable
to determine explanatory coverage of a composite hypothesis (Reiter,
1987) and to calculate the probability that an individual hypothesis is present,
ignoring other hypotheses (Cooper, 1990). We make these assumptions to
simplify our analysis of abduction problems. To reflect the complexity of
these functions in our tractability results, we denote the time complexity of
$e$ and $pl$ with respect to the size of an abduction problem as $C_e$ and
$C_{pl}$, respectively, e.g., $nC_e$ indicates $n$ calls to $e$.
For convenience, we assume the existence and the tractability of a function
that determines which individual hypotheses can contribute to explaining
a datum. Although it is not a true inverse, we refer to this function as $e^{-1}$,
formally defined as:
Simplifications
We should note that these definitions and assumptions simplify several as-
pects of abduction. For example, we define composite hypotheses as simple
combinations of individual hypotheses. In reality, the relationships among
the parts of an abductive answer and the data being explained can be much
more complex, both logically and causally.
Another simplification is that domains are not defined. One way to do this
would be to specify what data are possible ($D_{poss}$) and general functions
for computing explanatory coverage and plausibilities based on the data
($e_{gen}$ and $pl_{gen}$). Then for a specific abduction problem, the following
constraints would hold: $D_{all} \subseteq D_{poss}$, $e(H) = e_{gen}(H, D_{all})$,
and $pl(H) = pl_{gen}(H, D_{all})$ (cf.
Allemang, Tanner, Bylander, & Josephson, 1987).
The definitions of abduction problems or domains do not mention the data
that do not have to be explained, even though they could be important for
determining $e$ and $pl$. For example, the age of a patient does not have to be
explained, but can influence the plausibility of a disease. We assume here
that $e$ and $pl$ implicitly take into account data that do not have to be
explained, e.g., in the definition of domains just mentioned, these data can be
an additional argument to $e_{gen}$ and $pl_{gen}$.
An example
Let us use the following example to facilitate our discussion:
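$H_{all} = \{h_1, h_2, h_3, h_4, h_5\}$
$D_{all} = \{d_1, d_2, d_3, d_4\}$
$e(h_1) = \{d_1\}$         $pl(h_1)$ = superior
$e(h_2) = \{d_1, d_2\}$    $pl(h_2)$ = excellent
$e(h_3) = \{d_2, d_3\}$    $pl(h_3)$ = good
$e(h_4) = \{d_2, d_4\}$    $pl(h_4)$ = fair
$e(h_5) = \{d_3, d_4\}$    $pl(h_5)$ = poor
[Figure: the five hypotheses linked to their plausibility values superior, excellent, good, fair, and poor.]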
That is, a composite hypothesis explains a datum if and only if one of its
elements explains the datum. This constraint makes explanatory coverage
equivalent to set covering (Reggia, 1983). Assuming independence, the
explanations in our example (refer to Figure 7.1) are: $\{h_1, h_3, h_4\}$,
$\{h_1, h_3, h_5\}$, $\{h_1, h_4, h_5\}$, $\{h_2, h_3, h_4\}$, and
$\{h_2, h_5\}$. One way to find a best explanation
would be to generate all explanations and then sort them by plausibility.
However, it is well known that there can be an exponential number of expla-
nations. It is not surprising then that determining the number of explana-
tions is hard.
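The list of explanations for the running example can be checked by brute force. The sketch below assumes the independence constraint, so the coverage of a composite is the union of its members' coverage. The coverage table is partly illegible in some copies of the example; the values for h3 and h4 used here are an assumption, chosen because they are the only ones consistent with the five explanations listed in the text. The function and constant names are hypothetical.

```python
from itertools import combinations

# Coverage of each individual hypothesis (h3 and h4 are reconstructed values).
E = {
    "h1": {"d1"},
    "h2": {"d1", "d2"},
    "h3": {"d2", "d3"},
    "h4": {"d2", "d4"},
    "h5": {"d3", "d4"},
}
D_ALL = {"d1", "d2", "d3", "d4"}


def coverage(H):
    """Independent coverage: a datum is explained iff some member explains it."""
    explained = set()
    for h in H:
        explained |= E[h]
    return explained


def explanations():
    """Enumerate every complete and parsimonious composite hypothesis."""
    found = []
    names = sorted(E)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            H = set(combo)
            if coverage(H) != D_ALL:
                continue
            # Under independence it suffices to test single removals:
            # if any proper subset were complete, so would some H - {h} be.
            if all(coverage(H - {h}) != D_ALL for h in H):
                found.append(combo)
    return found


all_explanations = explanations()
for combo in all_explanations:
    print(combo)
```

The loop visits all $2^5 - 1$ nonempty composites, which is harmless here but illustrates why generate-and-sort does not scale.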
Algorithm 7.1 performs this task within this order of complexity. A de-
tailed explanation of this algorithm is given in (Bylander, Allemang, Tan-
ner, & Josephson, 1991), but we note several aspects of its operation here.
Return nil
Find an explanation.
$W \leftarrow H_{all}$
For each $h \in H_{all}$
    If $e(W \setminus \{h\}) = D_{all}$ then
        $W \leftarrow W \setminus \{h\}$
Return $W$
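The loop of Algorithm 7.1 is short enough to render directly in Python. This is a sketch rather than the authors' code: only the tail of the algorithm is legible here, so the opening existence test (return nothing when even the full hypothesis set is incomplete) is an assumption based on the surviving "Return nil" line, and the function name and sample data are hypothetical. The coverage function e is passed in as a callable and is assumed to be independent or monotonic.

```python
def find_explanation(h_all, d_all, e):
    """Shrink the full hypothesis set to one explanation, or return None.

    h_all: list of individual hypotheses, visited in the given order
    d_all: set of data to be explained
    e:     callable mapping a set of hypotheses to the set of data it explains
    """
    d_all = set(d_all)
    # Assumed existence test: if all hypotheses together are incomplete,
    # no composite can be complete (monotonicity).
    if e(set(h_all)) != d_all:
        return None
    W = set(h_all)                    # working composite hypothesis
    for h in h_all:
        if e(W - {h}) == d_all:       # h is superfluous given the rest of W
            W = W - {h}
    return W


# Sample data: the chapter's running example with reconstructed coverage values.
COVERS = {
    "h1": {"d1"},
    "h2": {"d1", "d2"},
    "h3": {"d2", "d3"},
    "h4": {"d2", "d4"},
    "h5": {"d3", "d4"},
}


def e(H):
    explained = set()
    for h in H:
        explained |= COVERS[h]
    return explained


found = find_explanation(["h1", "h2", "h3", "h4", "h5"], {"d1", "d2", "d3", "d4"}, e)
print(found)
```

The function makes one call to e for the existence test and one per hypothesis in the loop. Which explanation it returns depends on the visiting order, since a hypothesis dropped early is never reconsidered.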
sumptions for the definition of $e^{-1}$), then the working hypothesis W, instead
of being initialized to $H_{all}$, can be initialized to include only one element
from $e^{-1}(d)$ for each $d \in D_{all}$. This modification has an advantage if
$e^{-1}$ is easy to compute and the working hypothesis remains "small."
That is, a composite hypothesis does not "lose" any data explained by any of
its subsets and might explain additional data. All independent abduction prob-
lems are monotonic, but a monotonic abduction problem is not necessarily
independent. If, in Figure 7.1, $\{h_2, h_3\}$ also explained $d_4$, then $\{h_2, h_3\}$ would
also be an explanation and $\{h_2, h_3, h_4\}$ would not be. Monotonic abduction
problems from the literature include our hypothesis-assembly strategies de-
scribed so far (see Allemang et al., 1987) and Pearl's belief revision theory
if interactions are restricted to noisy-OR and noisy-AND (Pearl, 1987).
Because the class of monotonic abduction problems includes the indepen-
dent class, it is also hard to determine the number of explanations. In addi-
tion, we have shown that it is hard to enumerate a polynomial number of
explanations.
Algorithm 7.1 performs this task within this order of complexity. Because
of the monotonicity constraint, $H_{all}$ must explain as much or more data than
any other composite hypothesis. The loop in Algorithm 7.1 works for the
same reasons as for independent abduction problems. Also, it is possible to
use $e^{-1}$ to initialize W, though one must be careful because more than one
element from $e^{-1}(d)$ might be needed to explain d.
We have proven this result by reduction from 3SAT (Garey & Johnson,
1979), which is satisfiability of Boolean expressions in conjunctive normal
form, with no more than three literals in any conjunct. Informally, the re-
duction works as follows. Each 3SAT literal and its negation corresponds to
an incompatible pair of hypotheses. Each conjunct of the Boolean expres-
sion corresponds to a datum to be explained. Satisfying a conjunct corre-
sponds to a hypothesis explaining a datum. Clearly then, a complete com-
posite hypothesis exists if and only if the Boolean expression is satisfiable.
Furthermore, a complete composite hypothesis exists if and only if an
explanation exists. Our proof shows that only $O(|H_{all}|)$ incompatible pairs are
needed to give rise to intractability.
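The informal description of the reduction can be made concrete on small formulas. The sketch below is an illustration, not the proof: it encodes clauses as tuples of nonzero integers (a positive integer for a variable, a negative one for its negation, a convention assumed here), builds the corresponding incompatibility abduction problem, and checks by brute force that a complete, incompatibility-free composite exists exactly when the formula is satisfiable. All function names are hypothetical.

```python
from itertools import combinations, product


def reduce_3sat(clauses):
    """Map a CNF formula to (hypotheses, incompatible pairs, coverage, data)."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    hyps = [sign * v for v in variables for sign in (1, -1)]   # one per literal
    incompatible = {frozenset({v, -v}) for v in variables}     # literal vs. negation
    # A literal "explains" every clause (datum) in which it occurs.
    explains = {h: {i for i, c in enumerate(clauses) if h in c} for h in hyps}
    return hyps, incompatible, explains, set(range(len(clauses)))


def has_complete_composite(clauses):
    """Brute force: is there a compatible composite explaining every datum?"""
    hyps, incompatible, explains, d_all = reduce_3sat(clauses)
    for r in range(len(hyps) + 1):
        for H in combinations(hyps, r):
            if any(frozenset(pair) in incompatible for pair in combinations(H, 2)):
                continue
            if set().union(*(explains[h] for h in H)) == d_all:
                return True
    return False


def satisfiable(clauses):
    """Brute-force satisfiability check used as the reference answer."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, values))
        if all(any(assign[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return True
    return False


sat_formula = [(1, 2, 3), (-1, 2), (-2, 3)]
unsat_formula = [(1, 2), (1, -2), (-1, 2), (-1, -2)]
for formula in (sat_formula, unsat_formula):
    print(satisfiable(formula), has_complete_composite(formula))
```

The construction uses one incompatible pair per variable, which is linear in the number of hypotheses. Both checks are exponential and serve only to confirm the correspondence on toy inputs.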
The underlying difficulty is that the choice between a pair of incompat-
ible hypotheses cannot be made locally, but is dependent on the choices
from all other incompatible pairs. It is interesting to note that the parsimony
constraint plays no role in this result. Just finding a complete composite
hypothesis is hard in incompatibility abduction problems.
It follows that:
Complexity of plausibility
The restriction to OR interactions means that each effect can be true only
if one or more of its parents are true. This restriction makes it easy to find a
value assignment $w$ such that $P(w \mid v) > 0$. Although this theorem could be
demonstrated by adapting the proof for Theorem 7.12, it is useful to show
that the best-small plausibility criterion has a correlate in probabilistic rea-
soning.
The reduction from independent abduction problems using best-small works
as follows. Each $h \in H_{all}$ is mapped to a "hypothesis" variable. Each $d \in D_{all}$
is mapped to a "data" variable that is true if and only if one or more of the
hypothesis variables corresponding to $e^{-1}(d)$ are true, i.e., an OR interaction.
The a priori probabilities of the hypothesis variables being true must be
between 0 and 0.5, and are ordered according to the plausibilities in the
abduction problem. Initializing all the data variables to true sets up the prob-
lem. The MPE for this belief revision problem corresponds to a best expla-
nation for the best-small problem. Because finding a best explanation is NP-
hard, finding the MPE must be NP-hard even for belief networks that only
contain OR interactions.
$\forall h, h' \in H_{all}\ (h \neq h' \rightarrow (pl(h) < pl(h') \lor pl(h) > pl(h')))$
Again, let $n = |D_{all}| + |H_{all}|$.
Algorithm 7.2 performs this task within this order of complexity. It is the
same as Algorithm 7.1 except that the loop considers the individual hypoth-
eses from least to most plausible. The explanation that Algorithm 7.2 finds
$W \leftarrow W \setminus \{h\}$
Return W
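Since Algorithm 7.2 differs from Algorithm 7.1 only in the order in which hypotheses are considered, a sketch needs just one extra sorting step. The code below is an illustration under stated assumptions: the ranking of the five plausibility labels follows the chapter's running example, the coverage values are reconstructed, the existence test is assumed as in Algorithm 7.1, and the function name is hypothetical. It requires that the individual plausibilities be totally ordered.

```python
# Assumed total order on the plausibility labels, lowest first.
RANK = {"poor": 0, "fair": 1, "good": 2, "excellent": 3, "superior": 4}


def find_ordered_explanation(pl, d_all, e):
    """Drop superfluous hypotheses, trying the least plausible ones first.

    pl:    dict mapping each individual hypothesis to its plausibility label
    d_all: set of data to be explained
    e:     callable mapping a set of hypotheses to the set of data it explains
    """
    d_all = set(d_all)
    if e(set(pl)) != d_all:           # assumed existence test
        return None
    W = set(pl)
    for h in sorted(pl, key=lambda name: RANK[pl[name]]):   # least plausible first
        if e(W - {h}) == d_all:
            W = W - {h}
    return W


# Sample data: the running example (coverage values for h3 and h4 reconstructed).
COVERS = {
    "h1": {"d1"},
    "h2": {"d1", "d2"},
    "h3": {"d2", "d3"},
    "h4": {"d2", "d4"},
    "h5": {"d3", "d4"},
}
PL = {"h1": "superior", "h2": "excellent", "h3": "good", "h4": "fair", "h5": "poor"}


def e(H):
    explained = set()
    for h in H:
        explained |= COVERS[h]
    return explained


chosen = find_ordered_explanation(PL, {"d1", "d2", "d3", "d4"}, e)
print(chosen)
```

On this data the least plausible hypothesis is removed first and the most plausible one survives, so the result differs from what an arbitrary visiting order would give.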
Independent. The additive nature of the reactions means that for separate
reactions and compatible hypotheses, the independence constraint is met.
However, since independent abduction problems do not allow for parts of
data to be explained, they cannot describe additivity of reaction strengths.
Monotonic. If we view a weak result for some reaction as a separate result from
a strong result for the same reaction, then we can say that the phenomenon of
additive reaction strengths falls into the class of monotonic abduction prob-
lems. That is, each of two antibodies alone might explain a weak reaction. To-
gether, they would explain either a weak reaction or a strong reaction.
Discussion
We have discovered one restricted class of abduction problems in which it is
tractable to find the best explanation. In this class, there can be no incom-
patibility relationships or cancellation interactions, the plausibilities of the
individual hypotheses are all different from each other, and there must be
exactly one best explanation according to the best-small plausibility crite-
rion.
Unfortunately, it is intractable to determine whether there is more than
one best explanation in ordered abduction problems. However, it is still trac-
table to find one of the best-small best explanations in ordered monotonic
abduction problems. Incompatibility relationships, cancellation interactions,
and similar plausibilities among individual hypotheses are factors leading
to intractability.
For abduction in general, however, our results are not encouraging. We
believe that few domains satisfy the independent or monotonic property,
i.e., they usually have incompatibility relationships and cancellation inter-
actions. Requiring the most plausible explanation appears to guarantee in-
tractability for abduction. It is important to note that these difficulties result
from the nature of abduction problems, and not the representations or algo-
rithms being used to solve the problem. These problems are hard no matter
what representation or algorithm is used.
Fortunately, there are several mitigating factors that might hold for spe-
cific domains. One factor is that incompatibility relationships and cancella-
tion interactions might be sufficiently sparse so that it is not expensive to
search for explanations. However, only $O(n)$ incompatibilities or cancellations
are sufficient to lead to intractability, and the maximum plausibility
requirement still remains a difficulty.
Another factor, as discussed in the introductory section of this chapter, is
that some constraint might guarantee a polynomial search space, e.g., a limit
on the size of hypotheses or sufficient knowledge to rule out most individual
hypotheses. For example, if rule-out knowledge can reduce the number of
individual hypotheses from h to log h, then the problem is tractable. It is
important to note that such factors do not simply call for "more knowledge,"
but knowledge of the right type, e.g., rule-out knowledge. Additional knowl-
edge per se does not reduce complexity. For example, more knowledge about
incompatibilities or cancellations makes abduction harder.
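A short worked count shows why a reduction from h to log h hypotheses is enough. The calculation assumes base-2 logarithms and the tractability of $e$ and $pl$ assumed earlier; the choice of base is our assumption, since the text does not fix it. With $k = \log_2 h$ hypotheses left after rule-out, the number of candidate composite hypotheses is

$$2^{k} = 2^{\log_2 h} = h,$$

so exhaustive search examines at most $h$ composites, each checked with a polynomial number of calls to $e$ and $pl$. For a logarithm of any other base $b > 1$ the count is $2^{\log_b h} = h^{1/\log_2 b}$, which is still polynomial in $h$.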
The abductive reasoning of the RED-2 system works because of these
factors. The size of the right answer is usually small, and rule-out knowl-
edge is able to eliminate many hypotheses. RED-2 is able to avoid exhaus-
tive search because the non-ruled-out hypotheses are close to an ordered
monotonic abduction problem.
If there are no tractable algorithms for a class of abduction problems, then
there is no choice but to do abduction heuristically (unless one is willing to
wait for a very long time). This poses a challenge to researchers who at-
tempt to deal with abductive inference: Provide a characterization that re-
spects the classic criteria of good explanations (parsimony, coverage, con-
sistency, and plausibility), but avoids the computational pitfalls that beset
solutions attempting to optimize these criteria. We believe that this will lead
to the adoption of a more naturalistic or satisficing conceptualization of
abduction in which the final explanation is not guaranteed to be optimal,
e.g., it might not explain some data. (This idea is developed in chapter 9.)
Perhaps one mark of intelligence is being able to act despite the lack of
optimal solutions.
Our results show that abduction, characterized as finding the most plausible
composite hypothesis that explains all the data, is generally an intractable prob-
lem. Thus, it is futile to hope for a tractable algorithm that produces optimal
answers for all kinds of abduction problems. To be solved efficiently, an abduc-
tion problem must have certain features that make it tractable, and there must
be a reasoning method that takes advantage of those features. Understanding
abduction, as for any portion of intelligence, requires a theory of reasoning that
takes care for the practicality of computations.
Notes
1 The latter constraint is not the same as "eliminating candidates" in de Kleer & Williams (1987) or
"inconsistency" in Reiter (1987). If a hypothesis is insufficient to explain all the observations, the
hypothesis is not ruled out because it can still be in composite hypotheses.
2 Composite hypotheses that do not explain all the data can still be considered explanations, albeit
partial. Nevertheless, because explaining all the data is a goal of the abduction problems that we
are considering, for convenience, this goal is incorporated into the definition of "explanation."
3 The symbol "\" stands for set difference, i.e. A\B consists of the set of all members of A that are
not members of B.
4 There might be more than one maximal subset of observations that satisfies these conditions. If
so, then e(H) selects some preferred subset.
5 For belief networks, we use a boldface italic lower case letter to stand for a (set of) value
assignment(s) to a (set of) variable(s), which is denoted by a BOLDFACE ITALIC uppercase
letter.
6 One difficulty with the more "natural" mapping $pl(H) = P(H = \text{true} \mid v)$ is that even if the MPE is
parsimonious, it might not be the best explanation.
7 Also, it is #P-complete to determine the number of complete composite hypotheses. The defini-
tion of #P-complete comes from Valiant (1979).
8 Incompatible pairs are the most natural case, e.g., one hypothesis of the pair is the negation of the
other, n mutually exclusive hypotheses can be represented as n(n-l)ll incompatible pairs. Incom-
patible triplets (any two of the three, but not all three) and so on are conceivable, but allowing
these possibilities in the formal definition does not affect the complexity results.
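As a small illustration of the count in note 8, a set of n mutually exclusive hypotheses can be expanded into pairwise incompatibilities. This is only a sketch; the function name and the hypothesis labels are hypothetical.

```python
from itertools import combinations

def incompatible_pairs(hypotheses):
    """Expand mutually exclusive hypotheses into a list of incompatible pairs."""
    return list(combinations(hypotheses, 2))

# Hypothetical labels; with n = 4 the expansion has n(n-1)/2 = 6 pairs.
pairs = incompatible_pairs(["h1", "h2", "h3", "h4"])
```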
8 Two more diagnostic systems
actual knowledge-based systems. One reason for this chapter's presence is
to show some variety in abductive-assembly strategies so that the reader
will gain a broader view of the possibilities. Another is to show that interest-
ing diagnostic systems have been built for complex real-world domains. Fi-
nally, we want to show that abduction gives some handle on learning.
Jon Sticklen's MDX2 is a medical diagnosis system in a subdomain of
mixed clinical diseases. The system integrates generic-task problem solvers
(described in chapter 2) for hierarchical classification, hypothesis match-
ing, knowledge-directed data retrieval, and abductive assembly. It uses knowl-
edge of function and structure to help make some decisions, and it can ask
for further medical tests to help resolve difficult cases. It uses "concept clus-
ters" to organize knowledge and to help control the abductive processing.
QUAWDS by Michael Weintraub, Tom Bylander, and Sheldon Simon is a
system for diagnosing human-gait disorders that result from diseases affect-
ing motor control such as cerebral palsy or stroke. The potential for compu-
tational complexity problems is great in this domain due to an especially
high density of interacting hypotheses. In the third section of this chapter,
an abductive approach to learning (as knowledge-base refinement) is de-
scribed. Mistakes made by a system can be explained by assigning blame to
incorrect knowledge in the knowledge base. This was implemented for the
QUAWDS system by Michael Weintraub.
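[Figure 8.1. MDX2 problem-solving units and the requests they make of one another. Request labels in the figure: "Which hypotheses are plausible? What can they account for?"; "How many S & S can you account for?"; "Am I a plausible hypothesis for the current case?"; "Is ... high/low?"; "Has ... been given?"; result returned: "The diagnosis, with accounting scheme."]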
Figure 8.1 shows the major types of problem-solving units in MDX2 and
the major information-processing requests that the various units can make
of one another. Conceptually, MDX2 consists of a compiled-level, abduc-
tion-based problem solver (the topic of this section), an intelligent database
(Mittal, Chandrasekaran, & Sticklen, 1984), and a deep-level problem solver
(described in Sticklen & Chandrasekaran, 1989).
The fundamental abductive-control cycle in MDX2 is as follows:
Step 1. The top-level abductive assembler calls one of the disease-cluster prob-
lem solvers. The cluster selected is chosen by counting the number of pa-
tient signs and symptoms for which the cluster can account.1 Note that the
entire disease cluster is thus treated by the abductive assembler as an
epistemically important object. In particular, the information required by
the abductive assembler (a potential importance measure of each classifier
for the current case) can be provided by the cluster based on a compiled
listing of the patient observations that the categories of the cluster might
explain. The cluster compares this compiled list with the observations of
the current patient, then reports the number of observations that categories
of the cluster may explain. The "may" is because, for any particular case,
not all the categories within one cluster will be plausible.
Step 2. The selected cluster - which in MDX2 is a type of classifier - autonomously
accounts for as much as possible given the constraint that
it can use only the plausible diagnostic categories that it contains (i.e.,
the classifier produces a local abductive answer). To carry out its clas-
sification problem solving, the classifier uses other problem solvers of
the system to determine how plausible a given diagnostic classification
category is. When a particular classifier completes its task, it reports to
the abductive assembler a listing of plausible diseases (as a simple clas-
sifier would) and an accounting of what each plausible disease can ac-
count for in the current case. The local nature of the construction of
this partial accounting scheme also allows a disciplined examination
of interacting diseases. Discussion of how disease interaction is handled
is beyond the scope of this section but is described in Sticklen,
Chandrasekaran, & Josephson (1985).
Step 3. The abductive assembler checks whether all patient signs and symp-
toms are accounted for. If they are, the abductive assembler first con-
structs a nonredundant accounting scheme for patient manifestations,
returns the accounting scheme to the user, then halts. If not all patient
manifestations are accounted for, the abductive assembler repeats the
procedure. The final accounting algorithm that is used is similar to a
portion of the RED algorithm (the portion that removes superfluous
hypotheses) and is not computationally complex.
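The three steps can be sketched as a loop. This is a minimal illustration under stated assumptions, not MDX2's implementation: each cluster is represented as a pair (compiled set of observations it may explain, a hypothetical `classify` function returning plausible diseases with what each accounts for). Skipping clusters that were already invoked, and returning a partial accounting when the clusters run out, are assumptions. The superfluous-hypothesis pass is a simple stand-in for the RED-like final accounting.

```python
def remove_superfluous(accounting, observations):
    """Drop any disease whose removal still leaves every observation covered."""
    kept = dict(accounting)
    for disease in list(kept):
        others = set().union(*(obs for d, obs in kept.items() if d != disease))
        if observations <= others:
            del kept[disease]
    return kept

def mdx2_cycle(observations, clusters):
    """clusters: name -> (may_explain, classify); classify(obs) -> {disease: explained}."""
    accounting = {}
    pending = dict(clusters)  # assumption: each cluster is invoked at most once
    while pending:
        # Step 1: pick the cluster that may account for the most observations.
        name = max(pending, key=lambda c: len(pending[c][0] & observations))
        _, classify = pending.pop(name)
        # Step 2: the cluster reports plausible diseases and what each accounts for.
        accounting.update(classify(observations))
        # Step 3: bookkeeping; halt once every observation is accounted for.
        covered = set().union(*accounting.values())
        if observations <= covered:
            return remove_superfluous(accounting, observations)
    return accounting  # assumption: report the partial accounting

# Hypothetical clusters, diseases (d1, d2), and findings (s1..s3).
clusters = {
    "cluster_a": ({"s1", "s2"}, lambda obs: {"d1": {"s1", "s2"} & obs}),
    "cluster_b": ({"s3"}, lambda obs: {"d2": {"s3"} & obs}),
}
result = mdx2_cycle({"s1", "s2", "s3"}, clusters)
```

Here `cluster_a` is polled first because it may account for two findings, and the loop halts after `cluster_b` completes the coverage.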
The prevalent abductive models partition the abduction problem into two
subtasks, either explicitly or implicitly: Generate candidate hypotheses and
construct an abductive answer from the generated hypotheses. In MDX2,
the abductive problem is partitioned into three tasks: (1) Focus on appropri-
ate clusters of hypotheses (step 1, carried out by the abductive assembler),
(2) determine locally (within the individual hypothesis cluster) a partial ac-
counting scheme of the plausible hypotheses and the observations for which
they account (step 2, carried out by the hypothesis area focused on in step
1), and (3) carry out bookkeeping to determine whether all observations are
accounted for and, if so, remove any superfluous hypotheses (step 3, carried
out by the top-level abductive assembler).
The basic mechanism of the focusing step is for the assembler to poll
the hypothesis clusters, asking for a measure of relevance for the current
case. In MDX2, this measure of importance is computed in a rudimentary
way: Each hypothesis area compares the current patient observations with
a compiled list of the observations that its hypotheses may account for. The
computed importance of a particular cluster is assigned according to the
number of observations for which a cluster may account. Clearly, more so-
phisticated methods of determining hypothesis-cluster importance are de-
sirable.
Discussion
An abductive answer produced by MDX2 is not ensured to be an optimal
solution. MDX2 halts once all observations are accounted for. This can happen
after only a few of the classification units of the system are invoked.
Complete coverage of observations is a characteristic of MDX2 abduction;
minimal cardinality of the final composite answer is not.
Thus, MDX2 can make mistakes. For example, suppose that the indexing
to the classification units leads to invocation of classifiers in order C1, C2,
C3. Further, suppose that three diseases are found likely by these three classifiers;
these diseases taken together account for the known patient observations.
As soon as MDX2 encounters this situation, it halts and yields an
answer that includes the three diseases along with the associated accounting
scheme that covers all the current observations. Finally, suppose that there
is a classifier Cn such that it contains a single disease that accounts for all
patient observations by itself. MDX2 will miss this lower cardinality solu-
tion.
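The scenario can be reproduced in a few lines. This sketch assumes each classifier reports one plausible disease with the observations it accounts for; all names (`o1`, `d1`, and so on) are hypothetical, and the halting rule is the only part taken from the text.

```python
observations = {"o1", "o2", "o3"}

# Each classifier reports one plausible disease and what it accounts for.
classifiers = [
    ("C1", {"d1": {"o1"}}),
    ("C2", {"d2": {"o2"}}),
    ("C3", {"d3": {"o3"}}),
    ("Cn", {"dn": {"o1", "o2", "o3"}}),
]

def assemble_in_order(order):
    """Invoke classifiers in the given order; halt as soon as coverage is complete."""
    answer, covered = {}, set()
    for _name, findings in order:
        answer.update(findings)
        covered |= set().union(*findings.values())
        if observations <= covered:
            break
    return answer

as_indexed = assemble_in_order(classifiers)                         # C1, C2, C3 first
cn_first = assemble_in_order([classifiers[3]] + classifiers[:3])    # Cn first
```

With the first ordering the answer contains three diseases; with `Cn` examined first it contains one. The difference comes entirely from the indexing, which is the point made in the discussion that follows.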
There are two responses to the awareness of this difficulty in MDX2. We
could suggest that MDX2 procedures are faulty, that, in fact, all disease
areas should be examined to make sure that the best abductive answer is
found. However, this strategy would be computationally expensive. Alter-
natively, we could suggest that the indexing to the disease areas was not
adequate, that the index should have led us to examine Cn first. The index-
ing in the MDX2 system is extremely primitive. This indexing technique
has the advantage of being simple but the disadvantage of almost surely
being inadequate. The point illustrated by MDX2 is not to propose a defini-
tive indexing technique but to shift the emphasis from a search for a clever
algorithm for solving a demonstrably intractable problem, to a search for
the principles of memory organization and indexing that will allow us to
effectively avoid the intractable problem.
Introduction
In this section we describe QUAWDS (Qualitative Analysis of Walking
Disorders), a system for interpreting human gait. The QUAWDS system is
presently restricted to pathologies resulting from diseases that affect motor
control, such as cerebral palsy (CP) or stroke. CP affects the brain and mani-
fests itself by interfering with the coordination of muscle activity. These
effects - muscle tightness, spasticity, and weakness - rather than CP are the
focus in pathologic gait analysis (that the patient has CP is known before the
gait analysis is performed). We begin by describing the domain of human
gait analysis. We then discuss the advantages and limitations of some tradi-
tional diagnostic models and describe how they have been applied to gait
analysis. Next, we present a diagnostic architecture that combines "associational"
and "qualitative causal" knowledge, and that takes advantage of the
subtasks that each kind of knowledge can help accomplish efficiently and
effectively. An abductive hypothesis assembler is used to coordinate the dif-
ferent modules. It produces a diagnostic solution that is locally best (i.e., no
single change to the answer will produce a better solution). Finally, we pro-
vide an example of gait analysis.
In a normal person, the neurological system controls the muscles through coor-
dinated commands to rotate limbs at several joints, providing body propulsion
and stability for walking. A gait cycle consists of the time between a heel strike
and the next heel strike of the same foot. The most significant events of the gait
cycle are right heel strike (RHS), left toe-off (LTO), left heel strike (LHS), and
right toe-off (RTO). These events delimit the four major phases of gait: weight
acceptance (WA), single-limb stance (SLS), weight release (WR), and swing.
For example, right WA is from RHS to LTO; right SLS is from LTO to LHS;
right WR is from LHS to RTO; and right swing is from RTO to the next RHS.
These events are illustrated in Figure 8.2.
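The relation between gait events and right-side phases can be written down as a small lookup. This is only an illustrative encoding of the paragraph above; the constant and function names are hypothetical, and only the right-side phases stated in the text are covered.

```python
# Gait events in cyclic order, starting from right heel strike.
EVENTS = ["RHS", "LTO", "LHS", "RTO"]
# Right-side phases in the order they occur after RHS.
RIGHT_PHASES = ["WA", "SLS", "WR", "swing"]

def right_phase_bounds():
    """Map each right-side phase to its (start event, end event) pair."""
    return {
        phase: (EVENTS[i], EVENTS[(i + 1) % len(EVENTS)])
        for i, phase in enumerate(RIGHT_PHASES)
    }

bounds = right_phase_bounds()
```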
The goal of diagnosis in this domain is to identify the improper muscle
activity and the joint limitations that cause the deviations from normal that
are observed in a patient's gait. The input is the information gathered at the
Gait Analysis Laboratory at The Ohio State University and includes three
types of data: clinical, historical, and motion. Clinical data come from the
physical examination of the patient, and measure both the range of motion
of the different joints and the qualitative strength of different muscle groups.
Historical data include information about any past medical procedures or
diagnoses. Motion data specify the time/distance parameters of walking (ve-
locity, stride length, stance and swing times, etc.) and the angular position
of the patient's leg joints (hips, knees, and ankles) in all three planes during
a gait cycle. Motion data also include electromyograph (EMG) data of se-
lected muscle groups; EMG data indicate when nervous stimulation of a
muscle occurs during the gait cycle.
Gait analysis is very difficult. Many gait parameters cannot be measured
directly using current technology. For example, EMG data are at best a rela-
tive measure of muscle forces (Simon, 1982). Multiple faults often occur
and interact with one another. In our experience it is not unusual for a pa-
tient to have 10 or more faults. Moreover, human gait involves a number of
highly interacting components and processes including mechanisms that at-
tempt to compensate for a fault. As a result, an apparently abnormal behav-
ior might actually serve to improve overall functioning.
[Figure 8.2. The gait cycle.]
[Figure 8.3. Components of the QUAWDS architecture: Observation Identifier, Abductive Assembler, Fault Hypothesis Rater, Qualitative Knowledge, Explanatory Coverage Determiner.]
from human experts (or other sources) are still needed to help guide the
search through the hypothesis space. The diagnostic architecture of QUAWDS
takes advantage of the information available from both types of models to
perform diagnosis efficiently, while avoiding the potential combinatorics.
Figure 8.3 shows the high-level functions and components of the architec-
ture. The associational component identifies which observations need to be
explained and determines the relative plausibilities of potential fault hy-
potheses; the qualitative component generates the set of fault hypotheses
offering to account for an observation and determines the explanatory cov-
erage of single- or multiple-fault hypotheses, taking hypothesis interactions
into account. A third component, a hypothesis assembler, coordinates these
subtasks and constructs a composite diagnosis.
[Figure: hypothesis matcher for Tight Hamstrings. Summary KG: if (MotionEvidence = +2) and (NonmotionEvidence = +1) then +3, else ...]
A gait analysis includes data on the patient's range of motions - both those
measured during the physical examination and those measured dynamically
- and EMG data. The clinical examination reveals a limitation in the range
of motion at the hip on both sides and a limitation in the range of ankle
dorsiflexion (pointing the foot up). To keep the example simple we show
only the sagittal plane motions (Figure 8.6). Dynamic range-of-motion mea-
surements indicate a joint's observed range of motion during a gait cycle;
this measurement can be different from the physical exam data because the
forces exerted on a joint during gait are greater than those applied in a physical
exam. EMG data (Figure 8.7) identify muscle activity by phase over the
tested muscle groups. The symbols in the table represent whether the muscle
group was on, off, or unknown (DK for "Don't Know") for each muscle
group and phase. From these data, disphasic (out of phase) activity is determined.
In this example both knees' range of motion is significantly decreased.
The physical exam also indicates some tightness of the hamstrings.
Given the set of findings, the assembler begins by asking the finding iden-
tifier to select a finding to be explained. To select a finding, the finding
identifier considers the patient's medical history, the amount of deviation
from normal, and the duration of the deviation. In this example the decreased
motion of the right hip during nearly the entire gait cycle is selected. This
choice is made because the hamstrings (which were surgically lengthened)
directly affect hip motions, because of the large amount of deviation from
normal motion, and because of the long duration of the deviation.
The assembler now needs to know the set of fault hypotheses that can
cause this finding. Several types of faults are possible in this domain: Muscles
right rectus femoris and underactive right rectus femoris are possible. Al-
though the equation used by the model includes the motions of other joints
as forces affecting hip motion (e.g., knee motions), QUAWDS does not gen-
erate these indirect hypotheses because they will be generated when abnor-
mal motions of that joint, if any, are considered.
For the abnormal hip motion and the case data given in this example, any of
the four faults generated can explain the abnormal hip motion. Also, if the knee
is less flexed during swing (or an abnormal acceleration during swing leads to
less knee flexion during swing), and if some fault in the hypothesis explains the
decreased knee flexion, then this fault would, in the right circumstances, also
explain the abnormal hip extension. For example, if overactive quadriceps ex-
plains a lack of knee flexion during swing, it would also explain the hip exten-
sion, provided that the "amount" of hip extension does not exceed the "amount"
of decreased knee flexion during swing.
The next step in the assembly method is to rate the plausibility of fault
hypotheses by using a classification hierarchy and hypothesis matchers imple-
mented in CSRL (Bylander & Mittal, 1986). In our example the hypothesis
overactive right hamstrings is considered the most plausible because of: in-
creased right hip extension during most of the gait cycle, some increased
flexion of the right knee (the hamstrings also flex the knee), EMG data that
indicated disphasic activity, and physical exam data that showed hip-exten-
sor tightness. The hamstring lengthening tends to weaken the muscle, but
this is contraindicated by the hip tightness indicated in the physical exam.
The other hypotheses are not considered highly plausible because there is
not much supporting data.
The assembler now needs to determine the explanatory coverage of the
hypothesis overactive right hamstrings. This hypothesis is given to the quali-
tative model to determine what motions this hypothesis explains. The model
uses a set of qualitative differential equations (de Kleer & Brown, 1984)
that describe the main torque-producing forces in gait. An equation is speci-
fied for each rotational motion of interest, for each joint, during each seg-
ment of the gait cycle. Each equation specifies the muscles and indirect
forces that produce torques affecting the rotation (Hirsch et al., 1989). For
example, the sagittal plane motion of the hip during swing is described by:
Our domain expert, Sheldon Simon, judges that two additional faults should
be added to the diagnosis: overactive right quadriceps, to explain the de-
creased knee flexion in the first half of swing, and weak left gastroc/soleus,
to explain the ankle dorsiflexion during late SLS and WR. In both cases,
QUAWDS concluded that other hypotheses indirectly caused these findings.
QUAWDS determined that an overactive gluteus maximus caused decreased
hip flexion, which in turn caused decreased knee flexion, and that overac-
tive left hamstrings caused increased knee flexion, which caused increased
ankle dorsiflexion.2 We are studying possible changes to QUAWDS to re-
solve this problem.
CREAM
CREAM is a system that uses an abductive method to assign credit in a
knowledge base. It begins by reviewing a trace of the KBS's execution of
the case to propose a set of error candidates (hypotheses). (The following
subsection describes how a set of error candidates are defined for a KBS and
how they are generated for a case.) CREAM uses a system called ICE (Iden-
tify Candidate Errors) to identify the error candidates. Next CREAM selects
the best set of error candidates for explaining the mistake by generating and
critiquing domain explanations. Not only is domain knowledge necessary
for this step, but knowledge as to what constitutes an adequate explanation
and criteria for selecting among the explanations must also be defined. A
system called CONE (CONstrained Explainer) is used to generate appropri-
ately detailed domain explanations of both normal and abnormal behaviors
using a model of the domain represented in FR (described in chapter 5)
augmented by qualitative differential equations (described for QUAWDS in
the previous section). CREAM then compares these explanations using several
criteria until one explanation is rated better than the others or until it is
determined that one best explanation cannot be identified. The architecture
of CREAM is shown in Figure 8.8. The details of the system are described in
Weintraub (1991).
[Figure 8.8. The architecture of CREAM. Labels: knowledge-base system trace, system's answer, expert's answer; ICE; CONE; apply explanation evaluation criteria; a set of plausible error candidates.]
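The comparison step can be sketched as successive filtering. This is an assumption-laden illustration, not CREAM's procedure: the text says only that several criteria are applied until one explanation is rated best or no single best can be identified. Applying the criteria in a fixed order and keeping only the top scorers at each step is one way to realize that; the two criteria in the example (amount explained, then fewer error candidates) are hypothetical.

```python
def select_best(explanations, criteria):
    """Apply criteria in order; each keeps only the top-scoring explanations.

    Returns the single surviving explanation, or None when the criteria
    run out without identifying one best explanation.
    """
    survivors = list(explanations)
    if not survivors:
        return None
    for score in criteria:
        best = max(score(e) for e in survivors)
        survivors = [e for e in survivors if score(e) == best]
        if len(survivors) == 1:
            return survivors[0]
    return None

# Hypothetical explanations and criteria, for illustration only.
explanations = [
    {"candidates": {"e1"}, "explained": 2},
    {"candidates": {"e2", "e3"}, "explained": 2},
    {"candidates": {"e4"}, "explained": 1},
]
criteria = [lambda e: e["explained"], lambda e: -len(e["candidates"])]
best = select_best(explanations, criteria)
```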
ha(O) = H
Notes
1 An initial data gathering step is carried out prior to initial classifier selection. The disease area is
selected based on the observations made available during that step.
2 In Weintraub and Bylander (1989), QUAWDS did include weak left gastrocnemius/soleus in the
final answer of the diagnosis for this case, but for the wrong reasons. At the time, QUAWDS did
not infer that overactive left hamstrings explains knee flexion throughout SLS, which then is
considered to explain the ankle dorsiflexion. Our domain expert agrees that overactive left ham-
strings is a factor affecting the abnormal ankle dorsiflexion but judges that this fault does not
cause all of the abnormality. Similarly, overactive right gluteus maximus is a factor that decreases
knee flexion, but our domain expert judges that the fault does not explain all of the decrease.
Better task definition, better strategy - machine 5
Tractable abduction
Abduction can be described as "inference to the best explanation," which
includes the generation, criticism, and possible acceptance of explanatory
hypotheses. What makes one explanatory hypothesis better than another are
such considerations as explanatory power, plausibility, parsimony, and in-
ternal consistency. In general a hypothesis should be accepted only if it sur-
passes other explanations for the same data by a distinct margin and only if
a thorough search was conducted for other plausible explanations.
Abduction seems to be an especially appropriate and insightful way to
describe the evidence-combining characteristics of a variety of cognitive
and perceptual processes, such as diagnosis, scientific theory formation,
comprehension of written and spoken language, visual object recognition,
and inferring intentions from behavior. Thus abductive inference appears
to be ubiquitous in cognition. Moreover, humans can often interpret im-
ages, understand sentences, form causal theories of everyday events, and
so on, apparently making complex abductive inferences in fractions of a
second.
Yet the abstract task of inferring the best explanation for a given set of
data, as the task was characterized in chapter 7, has been proved to be
computationally intractable under ordinary circumstances. Clearly there is
a basic tension among the intractability of the abduction task, the ubiquity
of abductive processes, and the rapidity with which humans seem to make
abductive inferences. An adequate model of abduction must explain how
cognitive agents can make complex abductive inferences routinely and rapidly.
Tractable Abduction is by John R. Josephson and Ashok K. Goel. Software: PEIRCE-IGTT
is by Richard Fox and John R. Josephson. Experiment: Uncertainty and Correctness
is by Michael C. Tanner and John R. Josephson.
In this section we describe two related ideas for understanding how abduction
can be done efficiently: (1) a better characterization of the information-processing
task of abductive assembly and (2) a better way to handle
incompatibility relationships between plausible elementary hypotheses. The
new characterization of the abductive-assembly task is "explaining as much
as possible," or, somewhat more precisely, "maximizing explanatory coverage
consistent with maintaining a high standard of confidence." This task is
computationally tractable, in contrast to "finding the best explanation for
all of the data," which is generally computationally intractable (at least as it
was characterized in chapter 7). The tractability of the task under the new
description is demonstrated by giving an efficient strategy for accomplish-
ing it. Using this strategy a confident explanation is synthesized by starting
from islands of relative certainty and then expanding the explanation oppor-
tunistically. This strategy does well at controlling the computational costs
of accommodating interactions among explanatory hypotheses, especially
incompatibility interactions. Until now incompatibility relationships have
seemed to be a main source of intractability by potentially forcing a combi-
natorially explosive search to find a complete and consistent explanation. In
contrast, the new strategy demonstrates how incompatibility relationships
can be used to distinct advantage in expanding partial explanations to more
complete ones.
[Figure 9.1. Certainty plotted against explanatory coverage, with a practical cutoff.]
[Figure 9.2. Explanatory coverage plotted against time, with a practical cutoff.]
we are trying to explain is why the nuclear reactor is rapidly getting hotter
and hotter). Thus we typically want to maximize explanatory coverage while
minimizing specific kinds of error costs.
Along with maximizing explanatory coverage and minimizing the cost of
error, we often want to minimize the amount of time spent trying to explain.
In real life the practical need to act often imposes cutoffs on our processing
time. Thus we can pose the problem of maximizing explanatory coverage in
a given amount of processing time. Evolution's job is to design abductive
mechanisms that not only arrive at time-critical conclusions quickly, but
also make good use of more processing time if it is available. Better yet are
mechanisms that combine abilities to rapidly come to time-critical conclu-
sions, to fruitfully spend more available time, and to learn from experience
to perform better. Spending processing time to gain explanatory coverage is
illustrated in Figure 9.2.
Finally, we would like to design or discover the optimal computational
strategy for doing abductive processing. We do not claim to have done this.
[Figure with panels labeled "Old Definition" and "New Definition."]
Processing strategy
Suppose that we are given a set of data to be explained and a set of elemen-
tary hypotheses able to explain various portions of the data. Suppose that
associated with each elementary hypothesis is a description of its explana-
tory coverage and an initial confidence score. This confidence score might
arise as a result of matching the stored hypotheses with pre-established pat-
terns of evidence features, or in some other way. Suppose, too, that we are
given information about the interactions of elementary hypotheses; for this
discussion we limit the interaction types to hypothesis-pair incompatibili-
ties and "independent" explanatory interactions (in the sense of Chapter 7),
although several other types of interactions can be handled by relatively
simple extensions to the strategy described here.
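The inputs just listed can be captured in a small data structure. This is a sketch under assumptions: the class and field names are hypothetical, confidence is taken to be a number, and "independent" explanatory interaction is modeled as the coverage of a composite being the union of its parts' coverage.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hypothesis:
    name: str
    coverage: frozenset   # data items this hypothesis offers to explain
    confidence: float     # initial confidence score (numeric scale assumed)

@dataclass
class AbductionProblem:
    data: set
    hypotheses: list
    incompatible: set     # set of frozenset({name_a, name_b}) pairs

    def explainers(self, datum):
        """Elementary hypotheses offering to explain one datum."""
        return [h for h in self.hypotheses if datum in h.coverage]

    def are_incompatible(self, a, b):
        return frozenset({a.name, b.name}) in self.incompatible

    def composite_coverage(self, composite):
        """Independent interaction: a composite explains the union of its parts."""
        return frozenset().union(*(h.coverage for h in composite))

# Hypothetical two-hypothesis problem.
h1 = Hypothesis("h1", frozenset({"d1"}), 0.8)
h2 = Hypothesis("h2", frozenset({"d1", "d2"}), 0.5)
problem = AbductionProblem({"d1", "d2"}, [h1, h2], {frozenset({"h1", "h2"})})
```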
We describe an abductive process that climbs a hill of increasing explanatory
power while preserving confidence, and that quits as soon as "significant
difficulty" is encountered. We explicitly treat the kind of difficulty where
explaining any more would require decreasing confidence below a given
threshold, but the basic strategy we describe can be readily extended to other
kinds of difficulty, such as exhaustion of computational resources. Similar
to Machine 4's essentials-first strategy, described in chapter 6, we describe
a strategy that pursues the abduction task as a sequence of subtasks, as fol-
lows:
[Figure: status labels for hypotheses and data. Labels: believed, guessed, plausible hypotheses, potential explainers, explanatorily useless, disbelieved; explained, puzzling data, tentatively explained, unexplained.]
Discussion of the strategy. As we have just described the process, the syn-
thesis of the explanation starts from islands of relative certainty, then grows
opportunistically. It uses Machine 4's essentials-first strategy, but now ex-
tends the strategy to make good use of incompatibility interactions. This
new strategy for handling incompatibility interactions is Machine 5's con-
tribution. We sometimes call this the essentials-first leveraging-incompat-
ibilities strategy.
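The first stage of this strategy can be sketched as follows. This is an illustrative reading of the discussion above, not a transcription of the step list. It assumes an Essential is the only remaining plausible explainer of some datum, and that accepting a hypothesis rules out every hypothesis incompatible with it, which may in turn expose new Essentials. The later stages (Clear-Bests and guessing) are omitted, and all names are hypothetical.

```python
def essentials_first(data, coverage, incompatible):
    """coverage: hypothesis -> set of data; incompatible: set of frozenset pairs."""
    plausible = set(coverage)
    believed = set()
    changed = True
    while changed:
        changed = False
        for datum in sorted(data):
            offers = [h for h in plausible if datum in coverage[h]]
            if len(offers) == 1 and offers[0] not in believed:
                essential = offers[0]
                believed.add(essential)
                # Leverage incompatibilities: rivals of a believed hypothesis
                # are ruled out, which can leave other data with one explainer.
                plausible -= {h for h in plausible
                              if frozenset({h, essential}) in incompatible}
                changed = True
    explained = set().union(*(coverage[h] for h in believed)) & set(data)
    return believed, explained

# Hypothetical case: A is Essential for d1; A rules out B; C then becomes
# the only remaining explainer of d2 and is accepted as well.
coverage = {"A": {"d1"}, "B": {"d2"}, "C": {"d2", "d3"}, "D": {"d3"}}
believed, explained = essentials_first(
    {"d1", "d2", "d3"}, coverage, {frozenset({"A", "B"})}
)
```

In this example the incompatibility between A and B is what makes C acceptable with confidence; without it, d2 would remain ambiguous between B and C.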
In depending on distinguishing Clear-Best hypotheses, Machine 5 is quite
different from the hypothesis-assembly strategy of QUAWDS, described in
chapter 8, even though, like QUAWDS, it is willing to settle for partial ex-
planations. To clarify the difference let us consider a case with one finding
to be explained, and two hypotheses offering to explain it. If these hypoth-
eses are close in confidence values, then QUAWDS would choose the one
with the larger confidence value, whereas Machine 5 would decline to choose,
since a confident choice cannot be made. Machine 5 would make the choice
if guessing were enabled, but then it would be marked as a low-confidence
conclusion.
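The contrast can be made concrete with one finding and two rival hypotheses. This sketch assumes numeric confidence values and a numeric margin for "clear" superiority; the function names, the labels, and the margin value are hypothetical, and neither function is taken from QUAWDS or from the Machine 5 implementation.

```python
def larger_confidence_choice(rivals):
    """QUAWDS-like behavior as described above: take the larger confidence."""
    return max(rivals, key=rivals.get)

def clear_best_choice(rivals, margin, guessing=False):
    """Machine 5-like behavior: choose only when one rival clearly stands out.

    Assumes at least two rivals. Returns (hypothesis, label), or (None, None)
    when no confident choice can be made and guessing is disabled.
    """
    ranked = sorted(rivals, key=rivals.get, reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if rivals[best] - rivals[runner_up] >= margin:
        return best, "clear-best"
    if guessing:
        return best, "low-confidence guess"
    return None, None

rivals = {"h1": 0.62, "h2": 0.60}   # close in confidence
picked = larger_confidence_choice(rivals)
declined = clear_best_choice(rivals, margin=0.2)
guessed = clear_best_choice(rivals, margin=0.2, guessing=True)
```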
In many domains, such as medicine, Essential hypotheses are probably
rare. The basic Machine 5 strategy is to find the highest mountains of confidence
that can be found, and to leverage the hypothesis formation from that
point, until there is trouble (e.g., no further progress can be made at that
level of confidence). The highest mountains are the Essentials. If there are
no Essentials, then the strategy calls for moving to the next highest moun-
tains, the Clear-Best hypotheses. A hypothesis is really only "Essential" rela-
tive to a limit of implausibility for the best alternative explanation. If such
hypotheses as Unknown-Disease, Data-is-Noise, Complete-Deception, or In-
tervention-by-Space-Aliens are considered to be plausible, then no hypoth-
esis will be Essential, although there may still be Clear-Bests. The introduc-
tion of low-likelihood alternative hypotheses turns Essentials into high-con-
fidence Clear-Bests without fundamentally altering anything.
The strategy as we have described it represents a discretization of a more
basic strategy into the three distinct stages of accepting Essentials (and working
out the consequences), Clear-Bests (and working out the consequences), and
Weak-Best hypotheses. The stages can be separated more finely by provid-
ing several levels of thresholds by which one elementary hypothesis can
surpass rivals for explaining some finding. An Essential becomes one that
far surpasses any rival, a Very Clear-Best might surpass rivals by a large
margin (but less than an Essential), a Clear-Best hypothesis might surpass
rivals by a somewhat lesser margin, and so on. Thus a smoother descent to
lower and lower confidence levels occurs as the growing composite hypoth-
esis gains more and more explanatory coverage. (See Figure 9.1 again.) A
virtue of this approach is that it shows the way to a descent from what might
be called "strongly justified explanatory conclusions," through less strongly
justified conclusions, through "intelligent speculation," all the way to "pure
guessing," and it provides meanings for all these terms.
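The finer separation of stages can be sketched as a table of margin thresholds. The threshold values, the numeric confidence scale, and the treatment of a sole explainer as surpassing every rival are assumptions made for illustration; only the ordering of the levels comes from the text.

```python
import math

# Hypothetical thresholds on the margin by which the best explainer of a
# finding surpasses its best rival, from most to least demanding.
LEVELS = [
    ("Essential", 0.9),
    ("Very Clear-Best", 0.5),
    ("Clear-Best", 0.3),
    ("Weak-Best", 0.0),
]

def grade(confidences, levels=LEVELS):
    """Grade the best explainer of one finding by its margin over the best rival.

    confidences: hypothesis -> confidence, for the hypotheses offering to
    explain the finding. Returns (hypothesis, label), or (None, None) on a tie.
    """
    ranked = sorted(confidences.items(), key=lambda kv: kv[1], reverse=True)
    best, best_conf = ranked[0]
    # Assumption: with no rival at all, the margin is treated as unbounded.
    margin = best_conf - ranked[1][1] if len(ranked) > 1 else math.inf
    if margin <= 0:
        return None, None
    for label, threshold in levels:
        if margin >= threshold:
            return best, label
    return None, None
```

Accepting hypotheses level by level, from the top of `LEVELS` downward, gives the gradual descent in confidence described above as explanatory coverage grows.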
Performing tests for data gathering is a natural extension of this strategy.
Tests are performed to resolve ambiguities (a datum is ambiguous if mul-
tiple hypotheses offer to explain it but none stands out as Essential or Clear-
Best). Such a test should be designed to discriminate among the best poten-
tial explainers.
In the idealized strategy we describe it is assumed that all plausible el-
ementary hypotheses are generated before assembly processing. More real-
istically, we might suppose a hypothesis generator with a dependable ten-
dency to supply hypotheses in rough order of plausibility, higher plausibil-
ity hypotheses first. Under these circumstances a good strategy for the hy-
pothesis assembler, especially if time is critical, is to assemble on the basis
of an initially generated set of hypotheses, provoking the generator for more
hypotheses if some deficiency in the initial set is encountered (e.g., there is
an unexplainable finding or the best explanation for something appears to
be poor for some reason). If elementary hypotheses are generated in ap-
proximate order of plausibility, then hypotheses accepted on the basis of
essentialness or clear-best-ness are probably correct and can tentatively be
trusted. If the cost of error is high, and more processing time is available,
then further hypotheses can be generated.
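Assembly from an initially generated set, with the generator provoked for more hypotheses only when a deficiency appears, can be sketched with a lazy generator. This is a minimal illustration: the only deficiency handled is an unexplainable finding, the batch size is arbitrary, and all names and scores are hypothetical.

```python
from itertools import islice

def hypothesis_stream(scored):
    """Yield (name, coverage) pairs in descending order of plausibility."""
    for name, _plausibility, coverage in sorted(scored, key=lambda s: -s[1]):
        yield name, coverage

def assemble_incrementally(data, stream, batch=2):
    """Request hypotheses in batches until no finding is unexplainable."""
    pool = {}
    while True:
        new = dict(islice(stream, batch))   # provoke the generator for more
        pool.update(new)
        unexplainable = data - set().union(*pool.values())
        if not unexplainable or not new:
            return pool, unexplainable

# Hypothetical scored hypotheses: (name, plausibility, coverage).
scored = [
    ("h1", 0.9, {"d1"}),
    ("h2", 0.7, {"d2"}),
    ("h3", 0.4, {"d3"}),
    ("h4", 0.1, {"d1"}),
    ("h5", 0.05, {"d2"}),
]
pool, unexplainable = assemble_incrementally(
    {"d1", "d2", "d3"}, hypothesis_stream(scored)
)
```

The first batch leaves d3 unexplainable, so a second batch is requested; the least plausible hypothesis, h5, is never generated.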
The strategy can also be easily and naturally extended to accommodate
interactions of additive explanatory coverage, statistical or causal associa-
tion, and logical implication (see the discussion in chapter 6 about exten-
sions to the essentials-first strategy). It can also be extended to a layered-
interpretation model of abductive processing where Believed hypotheses at
one level of interpretation become data to be explained at the next level.
(This model is described in chapter 10.) As described in chapter 6, the strat-
egy can make use of concurrent processing, and this concurrency can be
realized on a neural architecture (Goel, 1989; Goel, Ramanujan, &
Sadayappan, 1988).
Software: PEIRCE-IGTT
In this section, we describe a shell for building abductive problem-solving
agents that use the Machine-5 strategy. This shell is called PEIRCE-IGTT to
distinguish it from a similar tool named PEIRCE constructed by Bill Punch
(chapter 4).3 The "IGTT" extension marks that this PEIRCE is a piece of the
Integrated Generic-Task Toolset.4 The IGT Toolset recognizes all the ge-
neric goals: "to recognize" (hypothesis matching), to classify, to explain,
and so on (see chapter 2). Each generic goal is realized as a type of special-
ist with a built-in ability to pursue a goal of that type. The toolset provides
for the construction of classification specialists, recognition specialists, and
so on. The toolset also includes a full-featured CSRL, a hierarchical classi-
fication tool. Classification specialists built using CSRL-IGTT can call upon
hypothesis matchers (in this system called "recognition agents") to set con-
fidence values. The toolset also includes a rudimentary database tool to sup-
port the knowledge-directed information-retrieval task. If a system needs
some case-specific information, then it accesses that information by way of
a database specialist. For synthesizing explanatory hypotheses, PEIRCE-
IGTT provides a kind of specialist, called an "abducer," which is an agent
into which a version of the Machine-5 strategy is embedded by default.
PEIRCE-IGTT can link these abducers into communities of abducers that
pass off explanatory subproblems to subabducers, similar to the PEIRCE
tool described in chapter 4.
One or more of the classical generic-task (GT) methods are associated
with each generic goal, and these are provided as default methods. However,
if supplied methods are somehow inappropriate because of unavailable knowl-
edge, or because of particular unusual demands of the domain, the toolset
user can access Common Lisp or the Common Lisp Object System (CLOS)
to program an alternative method. Thus the IGTT tools are designed to al-
low easy escape to Lisp so that the built-in strategies can be used, but their
use is not mandatory. The toolset was designed to be programmer-friendly,
more so than the original GT tools and the original PEIRCE.
Unlike the GT theory described in chapter 4, where the development was
towards increased flexibility at run time, PEIRCE-IGTT is not optimized
for flexibility. Yet, unlike the classical GTs described in chapter 2, the new
toolset supports the idea that there is more than one way of achieving a
generic goal, though there is often a best method, which is supplied as the
default. So the new PEIRCE tool is positioned as conceptually intermediate
between the opportunistic run-time control of the original PEIRCE and the
fixed strategies of the classical GTs. It recognizes the need, in principle, for
flexibility, but it takes the stance that, for each generic goal, there is a ca-
nonical method which is preferable, if it applies, because it is generally the
most efficient.
How it works
The PEIRCE-IGTT default algorithm presupposes that there is a means of
generating or obtaining the findings for the case and a means of generating
hypotheses to explain the findings. The generated hypotheses must have ini-
tial confidence values, and they must have associations with the findings
that each can explain.
The steps of the algorithm are:
1. Generate or obtain findings to be explained and generate hypotheses (with
their confidence values and coverages).
2. Initialize the composite with any hypotheses predetermined to be in the com-
posite (set up by the tool user who has decided to always include certain
hypotheses or by the system user interactively while he or she explores alter-
native hypotheses).
3. When this algorithm is used in a layered-abduction machine, expand expecta-
tions from higher levels (if a higher level abductive conclusion has implica-
tions5 either positively or negatively for hypotheses at the current level). The
expectations will cause the confidence values of the hypotheses in question to
be adjusted. (Layered abduction machines are described in chapter 10.)
4. Propagate the effects of hypotheses initially accepted into the composite. This
may rule out other hypotheses that are incompatible with those in the com-
posite, or it may alter confidence values of other hypotheses that are implied
by hypotheses in the composite.
5. Loop on the following, either until all findings are accounted for or until no
more progress is made in extending the explanatory coverage.
a. Find all Confirmed hypotheses and include them in the composite. A Con-
firmed hypothesis is (here) one that receives the highest possible confi-
dence score (this is an optional feature that can be turned off by the system
builder if top-scoring hypotheses should not be automatically included in
the composite).
If Confirmed hypotheses are found, then propagate the effects of the lat-
est inclusions and go back to the loop beginning, else continue.
b. Find all Essential hypotheses and include them in the composite.
If Essential hypotheses are found, then propagate the effects of including
them in the composite, and go back to the loop beginning, else continue.
c. Find all Clear-Best hypotheses. To be a Clear-Best, a hypothesis must have
a score higher than a given threshold and must surpass all other explana-
tions for some finding by another given threshold. (Thresholds are estab-
lished by the tool user at the time the system is built; they can be easily
modified during or between cases; the tool provides defaults if no thresh-
olds are specified.)
If Clear-Best hypotheses are found, then propagate the effects of includ-
ing them in the composite, and go back to the loop beginning, else con-
tinue.
d. Find and include all the Weak-Bests. Here we may relax the criteria set for
the Clear-Bests. This step is optional.
If Weak-Best hypotheses are included in the composite, then propagate
the effects and go back to the loop beginning, else continue.
End loop.
6. (Optional extended-guessing step) If there are still some unaccounted find-
ings, attempt to guess among the remaining hypotheses that have not been
ruled out. Guessing is accomplished by letting each unexplained finding vote
for the highest rated hypotheses offering to explain it. This voting allows
hypotheses to stand out from alternatives according to their power to help
explain the unexplained remainder, if in no other way.
If any guessed hypotheses are included, then propagate the effects and go
back to the loop beginning, else end.
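
The loop of step 5 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the PEIRCE-IGTT implementation (which was written in Common Lisp and CLOS): the names `Hyp` and `assemble`, the value of `TOP_SCORE`, and the default thresholds are all hypothetical. Propagation is reduced to removing incompatible hypotheses, steps 2 through 4 and step 6 are omitted, and the special handling of mutually incompatible Essentials or Clear-Bests is not modeled.

```python
from dataclasses import dataclass, field

TOP_SCORE = 3  # assumed highest possible confidence score


@dataclass
class Hyp:
    name: str
    score: int
    explains: set                                   # findings it offers to explain
    incompatible: set = field(default_factory=set)  # names of incompatible hypotheses


def assemble(findings, hyps, clear_min=0, clear_margin=2, weak_min=0, weak_margin=1):
    available = {h.name: h for h in hyps}
    unexplained = set(findings)
    composite = []                                  # (name, status) in order of acceptance

    def accept(h, status):
        composite.append((h.name, status))
        unexplained.difference_update(h.explains)
        del available[h.name]
        for other in h.incompatible:                # propagate: drop incompatible hypotheses
            available.pop(other, None)

    def explainers(finding):
        return [h for h in available.values() if finding in h.explains]

    def confirmed():
        return [h for h in available.values() if h.score == TOP_SCORE]

    def essentials():                               # sole remaining explainer of some finding
        found = []
        for f in sorted(unexplained):
            offers = explainers(f)
            if len(offers) == 1 and offers[0] not in found:
                found.append(offers[0])
        return found

    def bests(min_score, margin):                   # above threshold and ahead of all rivals
        found = []
        for f in sorted(unexplained):
            ranked = sorted(explainers(f), key=lambda h: h.score, reverse=True)
            if not ranked or ranked[0].score <= min_score:
                continue
            lead = ranked[0].score - ranked[1].score if len(ranked) > 1 else margin
            if lead >= margin and ranked[0] not in found:
                found.append(ranked[0])
        return found

    stages = [("Confirmed", confirmed),
              ("Essential", essentials),
              ("Clear-Best", lambda: bests(clear_min, clear_margin)),
              ("Weak-Best", lambda: bests(weak_min, weak_margin))]

    progress = True
    while unexplained and progress:
        progress = False
        for status, find in stages:
            for h in find():
                if h.name in available:             # may have been dropped by an earlier acceptance
                    accept(h, status)
                    progress = True
            if progress:
                break                               # go back to the loop beginning
    return composite, unexplained


composite, left = assemble(
    {"f1", "f2", "f3"},
    [Hyp("A", 3, {"f1"}, {"B"}), Hyp("B", 2, {"f1", "f2"}, {"A"}),
     Hyp("C", 1, {"f2"}), Hyp("D", 2, {"f3"}), Hyp("E", 0, {"f3"})],
)
```

In this small run, accepting the Confirmed hypothesis `A` removes its incompatible rival `B`, which in turn leaves `C` as the only explainer of `f2`; `C` is therefore Essential only relative to the earlier acceptance, and `D` then enters as a Clear-Best.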
At this point, either all findings have been accounted for, or there are no
more hypotheses available to explain findings, or the only remaining hy-
potheses are too close in plausibility and explanatory power to decide be-
tween them.
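
The voting used in the optional guessing step (step 6) might look like the following sketch. The function name `guess_by_voting` and the shape of `candidates` are assumptions made for illustration; in particular, letting tied top-rated hypotheses each receive a finding's vote, and declining to guess when the vote totals tie, are design choices of this sketch rather than documented behavior.

```python
from collections import Counter


def guess_by_voting(unexplained, candidates):
    """candidates maps name -> (score, set of findings it offers to explain)."""
    votes = Counter()
    for finding in sorted(unexplained):
        offers = {name: score
                  for name, (score, explains) in candidates.items()
                  if finding in explains}
        if not offers:
            continue
        top = max(offers.values())
        for name, score in offers.items():
            if score == top:
                votes[name] += 1          # each top-rated explainer gets this finding's vote
    ranked = votes.most_common(2)
    if not ranked:
        return None                       # nothing offers to explain the remainder
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None                       # no hypothesis stands out
    return ranked[0][0]


candidates = {
    "X": (1, {"f1", "f2", "f3"}),
    "Y": (1, {"f3"}),
    "Z": (1, {"f1"}),
}
guess = guess_by_voting({"f1", "f2", "f3"}, candidates)
```

All three candidates have the same score, yet `X` stands out because it helps explain more of the unexplained remainder.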
The default algorithm given to PEIRCE-IGTT abducers propagates the
effects of including a hypothesis into the working composite, in part, by
removing hypotheses that are incompatible with it from the set of available
hypotheses. This is a variant of the strategy (described in the section of this
chapter entitled Tractable Abduction) that adjusted the confidence scores of
incompatible hypotheses depending on the confidence status of an included
hypothesis, and that completely eliminated the incompatible hypotheses only
if the included hypothesis was an Essential. This difference has implications
for the proper treatment of hypotheses included at each stage. In PEIRCE-IGTT,
each pass through the loop leads to conclusions whose proper confidence
is relative to the previous passes. A hypothesis judged to be Essential
because a competing hypothesis was ruled out for being incompatible
with a Clear-Best is an Essential hypothesis only relative to Clear-Bests.
An Essential from the first pass through the loop is held more confidently
than one that is Essential only relative to Clear-Bests. Similarly,
any newly included hypothesis that is relative to guessing (that is, a hypoth-
esis is included as a result of the effects of the inclusion of a guessed hy-
pothesis) must be regarded as less confident than any hypotheses included
before guessing began. Thus, each pass through the loop portion of the algo-
rithm further limits the confidence in any hypothesis newly included into
the composite. Hypotheses may be Confirmed, Essential, Clear-Best, Weak-
Best, Guessed, Disbelieved (because of incompatibility), or Ruled-Out (be-
cause of a low confidence rating), and these judgments may be relative to
Confirmeds, Essentials, Clear-Best, Weak-Best, or Guessed hypotheses.
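
One way to record this relativity is sketched below. It assumes, as a simplification, that each pass is qualified by the weakest kind of acceptance made in any earlier pass; the names `STAGES` and `tag_relative` are hypothetical.

```python
# Acceptance kinds, ordered from most to least confident.
STAGES = ["Confirmed", "Essential", "Clear-Best", "Weak-Best", "Guessed"]


def tag_relative(statuses_by_pass):
    """Pair the status used on each loop pass with the weakest earlier status."""
    tagged, weakest = [], None
    for status in statuses_by_pass:
        tagged.append((status, weakest))           # None means unqualified
        if weakest is None or STAGES.index(status) > STAGES.index(weakest):
            weakest = status
    return tagged


history = tag_relative(["Essential", "Clear-Best", "Essential", "Guessed", "Essential"])
```

The second Essential in this run is tagged as relative to Clear-Bests, and the last one as relative to Guessed hypotheses, so neither is mistaken for a first-pass Essential.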
PEIRCE-IGTT can handle hypothesis-coverage interactions of the sort
where hypotheses contribute to the answer either independently (where each
hypothesis explains some of the findings but does not interact with how
other hypotheses explain other findings, except in being alternatives) or in
an additive fashion (where hypotheses may combine their explanatory power
towards individual findings), but it cannot handle cancellation interactions
(where hypotheses may counteract what other hypotheses can explain).
PEIRCE-IGTT's default method for handling incompatibility interactions is
to eliminate from further consideration hypotheses incompatible with those
included in the composite. It also handles interactions whereby hypotheses
have varying degrees of "sympathy" or "antipathy" for each other, either
symmetrically or asymmetrically. This is done while propagating the effects
of including a hypothesis in the composite, by appropriately readjusting the
confidence values of related hypotheses.
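
The propagation step just described can be illustrated with a short sketch. The integer scale from -3 to +3, the function name `propagate_inclusion`, and the representation of sympathy and antipathy as signed adjustments keyed by ordered pairs are assumptions of this example, not details taken from the tool.

```python
LOW, HIGH = -3, 3   # assumed confidence scale


def propagate_inclusion(included, confidences, incompatible, affinity):
    """Return the confidences of the remaining hypotheses after `included` is accepted.

    incompatible maps a hypothesis to the set of hypotheses it excludes.
    affinity maps (source, target) to a signed adjustment; because the key is an
    ordered pair, sympathy or antipathy may be asymmetric.
    """
    updated = {}
    for name, value in confidences.items():
        if name == included:
            continue                                   # now part of the composite
        if name in incompatible.get(included, set()):
            continue                                   # eliminated from further consideration
        delta = affinity.get((included, name), 0)
        updated[name] = max(LOW, min(HIGH, value + delta))
    return updated


confidences = {"A": 2, "B": 1, "C": 0, "D": -1}
incompatible = {"A": {"B"}}
affinity = {("A", "C"): +1, ("A", "D"): -1, ("C", "A"): -2}
after = propagate_inclusion("A", confidences, incompatible, affinity)
```

Including `A` eliminates `B`, raises `C`, and lowers `D`; the entry for `("C", "A")` plays no part because it applies only when `C` is the hypothesis being included.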
A series of small and medium-sized knowledge-based systems have been
constructed using PEIRCE-IGTT in domains including acoustic speech rec-
ognition, speech recognition from articulation, legal reasoning, and scien-
tific theory evaluation. The diversity of domains demonstrates the domain
independence of the abductive-assembly strategy.
Experimental design
The experiment was based on the RED-2 system described in chapter 3.
RED-2 has two main components: a hierarchical classifier and an abductive
assembler. The classifier rates the plausibility of each antibody hypothesis.
In the modified versions of the program used for the experiment, the hy-
potheses are not hierarchically organized, and the classifier simply matches
the prototypical pattern of data for each hypothesis to the data given in a
case in order to determine a confidence score. This component will be called
the "Matcher." The explanatory hypotheses will be called "antibodies" be-
cause each hypothesizes the presence of a particular antibody in the patient's
serum, and each offers to explain certain test reactions (sometimes only par-
tially explaining a reaction). The role of the abductive assembler, or "As-
sembler," is to form a composite hypothesis that is consistent, explains as
much of the data as possible, and meets some other conditions (such as par-
simony and high plausibility).
We ran 42 antibody-identification problems, here called "cases," using
four distinct machines, and measured their performance. The four machines
are: (1) a matcher-only machine, (2) a matcher machine with extra process-
Table 9.1. Certainty vocabulary

C    (Confirmed)
VP   (Very Plausible)
P    (Plausible)
U    (Unknown)
I    (Implausible)
VI   (Very Implausible)
RO   (Ruled Out)
Matcher
The Matcher produced a confidence rating for each antibody on an ascending
integer scale from -3 to +3. For each antibody, the Matcher did the
following:
1. It checked for critical data that should be observed if the antibody is present.
If it failed to find any of this data, it ruled out the antibody (assigned it -3).
2. If it failed to rule out the antibody, the Matcher then checked the patient's
history. Once a person develops an antibody, they always have it. So, if the
patient's history contained a record of the antibody's previous presence, the
Matcher returned a definitive positive rating for the antibody (+3). Otherwise,
if the corresponding antigen was on the patient's own red blood cells, the
Matcher ruled out the antibody (-3) because people do not normally develop
antibodies to their own antigens. (Sometimes they do, but rarely; RED-2 was
designed only to handle cases in which these autoantibodies were already
ruled out.)
3. If no tests were conducted in which the antibody's presence would appear
(i.e., the antibody was untested), the Matcher rated it at a middle value, 0 on
the Matcher's scale. This prevented the Matcher from making a commitment
when it had no grounds.
4. If steps 1 through 3 failed to produce a value, the Matcher rated the antibody
according to how closely the observed data matched the pattern of data ex-
pected for when the antibody is present. This produced values from -2 to +2.
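
The four steps above can be sketched as a single function. The parameter names are hypothetical, and `pattern_score` stands in for the pattern-matching computation of step 4, whose details are not given here; it is assumed to arrive as an integer and is clamped to the range -2 to +2. The `CERTAINTY` mapping follows the certainty vocabulary of Table 9.1.

```python
CERTAINTY = {3: "C", 2: "VP", 1: "P", 0: "U", -1: "I", -2: "VI", -3: "RO"}


def matcher_rating(*, critical_data_missing, in_history,
                   antigen_on_own_cells, tested, pattern_score):
    """Rate one antibody hypothesis on the -3..+3 scale, following steps 1 to 4."""
    if critical_data_missing:          # step 1: expected critical data not observed
        return -3
    if in_history:                     # step 2: antibody previously recorded for this patient
        return 3
    if antigen_on_own_cells:           # step 2: autoantibodies assumed already ruled out
        return -3
    if not tested:                     # step 3: no grounds for a commitment
        return 0
    return max(-2, min(2, pattern_score))   # step 4: closeness of pattern match


rating = matcher_rating(critical_data_missing=False, in_history=False,
                        antigen_on_own_cells=False, tested=True, pattern_score=2)
label = CERTAINTY[rating]
```

Because the checks are ordered, a missing critical datum rules an antibody out even when the patient's history records it, which mirrors the order of the steps as listed.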
Matcher    Certainty
  +3       C
  +2       VP
  +1       P
   0       U
  -1       I
  -2       VI
  -3       RO

Assembler        Certainty
Confirmed        C
Clear Best       VP
Best             P
Undetected       U
Unresolved       U
Likely Absent    I
Ruled Out -2     VI
Ruled Out -3     RO

tial antibodies were called "Confirmed." "Undetected" means that the anti-
body was not tested. An antibody was considered "Unresolved" if it was in
the best explanation but was rated low by the Matcher, or not in the best
explanation but was rated high by the Matcher, reflecting evidence pulling
in opposite directions. Antibodies classified "Likely Absent" were not in the
best explanation, not ruled out, and not highly rated by the Matcher. "Ruled
Out -2" and "Ruled Out -3" mean that the antibody was rated -2 or -3 by
the Matcher, and was not in the best explanation.
In cases where the Assembler did not produce a complete explanation, the
final rating method was modified somewhat to reflect a lowered confidence
in the result. The best explanation might fail to be a complete explanation
for one of two reasons:
1. There was a finding that no antibody could explain.
2. For each unexplained finding there was at least a near tie between possible
explainers; that is, there were insufficient grounds for discriminating between
them with a significant degree of confidence.
In Case 1, something was obviously wrong: a significant anomaly occurred,
and the system's knowledge must be incomplete or incorrect. Additional
hypotheses, or more generous explaining by existing ones, might have
changed everything. So all non-ruled-out hypotheses should be reclassified
as Unresolved. The ruled-out hypotheses could be left alone because the
additional information that might make a complete explanation possible
would be unlikely to affect them. (This is a domain-specific piece of strat-
egy, reflecting high confidence in the rule-out knowledge in this domain, a
confidence which was later empirically justified [see Table 9.7]. In domains
where rule-out knowledge is not so dependable, an unexplained finding could
be the result of a hypothesis improperly ruled out, so it would be more cor-
rect to change the classification of ruled-out hypotheses to Unresolved if
they could have helped to explain the finding.)
In Case 2, a complete explanation may have existed but it would have
been hard to find or hard to determine which complete explanation was best.
Under these circumstances there was no reason to change the rating of hy-
potheses already included in the working hypothesis, or of hypotheses clas-
sified as Ruled Out (-2 or -3). Of the rest, any that would have been clas-
sified as Likely Absent (not in the best explanation, not ruled out, not highly
rated by the matcher) and offering to help explain an unexplained finding,
were instead considered Unresolved. Other ratings were left unchanged.
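
The two reclassification rules can be sketched as follows. The function name `adjust_for_incomplete`, the `reason` strings, and the argument `offers_for_unexplained` (the set of antibodies that offer to help explain some unexplained finding) are hypothetical names chosen for this illustration; the rating labels are those of the Assembler's vocabulary.

```python
RULED_OUT = {"Ruled Out -2", "Ruled Out -3"}


def adjust_for_incomplete(ratings, reason, offers_for_unexplained=frozenset()):
    """Lower confidence when the best explanation is incomplete.

    reason is "no_explainer" (Case 1) or "near_tie" (Case 2).
    """
    if reason not in ("no_explainer", "near_tie"):
        raise ValueError(f"unknown reason: {reason}")
    adjusted = dict(ratings)
    for name, rating in ratings.items():
        if rating in RULED_OUT:
            continue                               # rule-outs are trusted in this domain
        if reason == "no_explainer":
            adjusted[name] = "Unresolved"          # Case 1: everything else is in doubt
        elif rating == "Likely Absent" and name in offers_for_unexplained:
            adjusted[name] = "Unresolved"          # Case 2: only potential explainers
    return adjusted


ratings = {"ab1": "Clear Best", "ab2": "Likely Absent",
           "ab3": "Likely Absent", "ab4": "Ruled Out -3"}
case1 = adjust_for_incomplete(ratings, "no_explainer")
case2 = adjust_for_incomplete(ratings, "near_tie", offers_for_unexplained={"ab2"})
```

In a domain where rule-out knowledge is less dependable, the `RULED_OUT` check in Case 1 would need to be relaxed for hypotheses that could have helped explain the unexplained finding.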
Results
The library of cases included correct answers for each case based on expert
judgment about which antibodies were present (we will refer to these as
"in") and which were not tested for (called "untested"). The other antibodies
(i.e., the common ones that were tested for but were not in) were considered
not present ("out"). We should note that we do not know the real truth
about any case. Many of these cases actually occurred in a hospital blood
bank. The "correct answers" were the human experts' best judgments about
which antibodies were present, but they are not guaranteed to be correct.
However, the experts' judgments are the best measure we have of the programs'
performance.

Table 9.4. Number of antibodies classified in each category

        Matcher   Matcher with    Assembler   Assembler with
                  Incompatibles               Incompatibles
C           0           0             23            23
VP         70          70             16            16
P         188         149             53            52
I          85          92            375           355
VI        250         250            250           242
RO        282         282            282           290

Uncertainty. For each case, each machine produced a certainty value (C, VP,
etc.) for each of the 27 common antibodies. Because identifying which anti-
bodies were untested is a trivial task, and all machines were easily able to
do it, these antibodies are ignored in our analysis. The 42 cases represent
1,012 tested antibodies, of which 83 were present and 929 were absent. Tables
9.4 through 9.10 show the results of running the four machines on the 42
cases.
Table 9.4 shows the number of antibodies that each machine classified in
each category (C, VP, etc.). The Unknown (U) category is omitted from this
table, which shows only where the machines were able to make some com-
mitment (i.e., judge an antibody to be present or absent with some degree of
confidence). Thus, of the 1,012 antibodies that were tested in the 42 cases,
the Matcher committed itself on 875; it considered none of them to be Con-
firmed, 70 to be Very Plausible, and so forth. Similarly for the other ma-
chines. Some interesting points to note from this table are:
1. The Matcher never confirmed anything. This was not very surprising. This is
at least partly a feature of the particular domain, in which pathognomonic
(directly confirming) evidence is unusual (see point 2).
2. The Assembler rated far fewer antibodies positively (Confirmed, Very Plau-
sible, or Plausible) than the Matcher did. This may simply be a feature of the
domain, which has easily available rule-out information, but supporting in-
formation is more difficult to obtain. So the Matcher can confidently rule out,
but in the absence of evidence against an antibody, the Matcher may err by
rating it positively. Because missing an antibody that is actually present can
have more severe consequences for the patient than asserting an antibody that
is actually absent, pragmatic concerns encourage error in the direction of
positive rating. The Assembler, on the other hand, by taking into account
explanatory relations between antibodies and data, is better able to sort
things out, and moves many of the antibodies rated positively by the Matcher
into the Unknown or negative categories.
3. Adding incompatibility knowledge made both the Matcher and the Assembler
more cautious in general. That is, the machines tended to commit on fewer
antibodies. Incompatibility knowledge tended to increase uncertainty. This is
even more obvious in Table 9.5 where the number of antibodies classified as
Unknown is summarized. This increase in uncertainty might have been ex-
pected since most of the changes to the machines for handling incompatible
antibodies tend to move antibodies toward uncertainty. In the Matcher the
main effect was to move hypotheses from Plausible to Unknown. In the As-
sembler, the main effect was to move from Implausible to Unknown. Both of
these turned out to be correct changes (see the discussion of Table 9.7).
4. Adding incompatibility knowledge allowed the Assembler to rule out 8 more
antibodies (for being incompatible with essentials). This amounted to a de-
crease in uncertainty for these hypotheses, contrary to the overall tendency of
incompatibility knowledge to increase uncertainty.
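
The commitment totals discussed in these points can be checked with a few lines of Python. The counts are the ones reported for Table 9.4; the Unknown counts computed here are derived by subtraction from the 1,012 tested antibodies, on the assumption that every tested antibody received exactly one certainty value, and are not copied from Table 9.5.

```python
TESTED = 1012   # tested antibodies across the 42 cases

CLASSIFIED = {   # counts per certainty category, Unknown omitted
    "Matcher":                      {"C": 0,  "VP": 70, "P": 188, "I": 85,  "VI": 250, "RO": 282},
    "Matcher with Incompatibles":   {"C": 0,  "VP": 70, "P": 149, "I": 92,  "VI": 250, "RO": 282},
    "Assembler":                    {"C": 23, "VP": 16, "P": 53,  "I": 375, "VI": 250, "RO": 282},
    "Assembler with Incompatibles": {"C": 23, "VP": 16, "P": 52,  "I": 355, "VI": 242, "RO": 290},
}

committed = {machine: sum(counts.values()) for machine, counts in CLASSIFIED.items()}
unknown = {machine: TESTED - total for machine, total in committed.items()}
extra_rule_outs = (CLASSIFIED["Assembler with Incompatibles"]["RO"]
                   - CLASSIFIED["Assembler"]["RO"])
```

The totals reproduce the Matcher's 875 commitments and the 8 additional rule-outs noted in point 4, and the derived Unknown counts rise for both machines when incompatibility knowledge is added.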
Correctness. Table 9.6 shows the number of antibodies that each machine
classified correctly in each category. An antibody was correctly classified if
(1) it was considered "in" by the experts and the machine classified it as
Confirmed, Very Plausible, or Plausible or (2) it was considered "out" by the
experts and the machine classified it as Implausible, Very Implausible, or
Ruled Out. Antibodies classified as Unknown were considered to be neither
correct nor incorrect.
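
This scoring rule is simple enough to state as code. The sketch below uses hypothetical function names (`is_correct`, `percent_correct`); the worked figure uses the reported counts for the Matcher's Very Plausible ratings, 24 correct out of 70.

```python
POSITIVE = {"C", "VP", "P"}
NEGATIVE = {"I", "VI", "RO"}


def is_correct(expert, certainty):
    """Return True or False for a committed rating, None for Unknown."""
    if certainty == "U":
        return None                    # neither correct nor incorrect
    if expert == "in":
        return certainty in POSITIVE
    if expert == "out":
        return certainty in NEGATIVE
    raise ValueError(f"unexpected expert judgment: {expert}")


def percent_correct(judgments):
    """Percentage of committed ratings that agree with the experts."""
    scored = [is_correct(expert, certainty) for expert, certainty in judgments]
    scored = [s for s in scored if s is not None]
    return round(100 * sum(scored) / len(scored), 1) if scored else None


very_plausible = [("in", "VP")] * 24 + [("out", "VP")] * 46
vp_rate = percent_correct(very_plausible)
```

With these counts the rate comes to 34.3 percent, which matches the percentage reported for the Matcher's Very Plausible category.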
Table 9.6. Number of antibodies classified correctly in each category

        Matcher   Matcher with    Assembler   Assembler with
                  Incompatibles               Incompatibles
C           0           0             23            23
VP         24          24             11            11
P          40          35             20            20
I          84          90            354           338
VI        248         248            248           240
RO        282         282            282           290

Percentage of classifications that were correct in each category

        Matcher   Matcher with    Assembler   Assembler with
                  Incompatibles               Incompatibles
C                                  100.0         100.0
VP       34.3        34.3           68.8          68.8
P        21.3        23.5           37.7          38.5
I        98.8        97.8           94.4          95.2
VI       99.2        99.2           99.2          99.2
RO      100.0       100.0          100.0         100.0

where the effect can readily be seen by reading down the columns. Thus, in
concert with a like effect for reducing uncertainty, knowledge of explana-
tory relationships had a dramatic effect for increasing correctness.
Incompatibility knowledge contributed surprisingly little to correctness, as
can be seen by reading across the rows in Tables 9.8 and 9.9. In retrospect it
seems likely that this was a domain-specific effect, and perhaps should not have
been so surprising. We have seen that the introduction of incompatibility knowl-
edge can give rise to complications resulting in an overall increase in uncer-
tainty (see Table 9.5). In addition Bylander has shown that finding a complete
explanation in the presence of incompatible hypotheses is an NP-Complete prob-
lem (see chapter 7). One way to make it tractable is to rule out enough hypoth-
eses so that there are no incompatible pairs among the non-ruled-out hypotheses
(i.e., reduce it to the problem of finding a complete explanation when there are
no incompatible hypotheses). It appears that the antibody-identification knowl-
edge has been optimized to do this; that is, tests have been designed and knowl-
edge has been compiled to discriminate between incompatible hypotheses, thus
reducing the processing load for test interpretation and increasing the chances
of arriving at a complete explanation. Since the Matchers are good at ruling out
(better than 99% accuracy), and since subsequent processing only uses the hy-
potheses not ruled out by the Matchers, there should be few incompatible pairs
among these hypotheses most of the time, so incompatibility knowledge should
usually be of little further use.
Table 9.10. Percentage correctly classified by difficulty
(Columns: No. of antibodies present; Matcher; Matcher with Incompatibles;
Assembler; Assembler with Incompatibles.)

Problem difficulty. Let us see how the various machines fared on difficult
versus easy problems. There are several possible measures of
difficulty: number of reactions to explain, commonness of the antibodies
present or absent, commonness of the reactions, and so forth. But the simplest
measure available is the number of antibodies present. Normally, the
more antibodies present, the more difficult the problem, because the reac-
tion patterns overlap and tend to hide one another, because more interac-
tions must be considered, and because more steps might be needed to solve
the problem. This is not universally true since it might be harder to become
convinced that a single rare antibody is present than that two or three com-
mon ones are. Although it is not a perfect measure, it is a good crude mea-
sure of difficulty. Our library of test cases contained 14 one-antibody prob-
lems (i.e., cases in which exactly one antibody was present), 15 two-anti-
body problems, and 13 three-antibody problems.
Table 9.10 summarizes the results according to the number of antibod-
ies - each row shows how each machine performed on cases of a certain
difficulty (one-, two-, or three-antibody cases). Following are some points
to note about the results:
1. The Assemblers were more accurate than the Matchers were at all levels of
difficulty. Thus the increased correctness attributable to explanatory knowl-
edge was a persistent and broadly occurring phenomenon.
2. As expected, all the machines were at their worst on the most difficult cases.
3. The effect of incompatibility knowledge for the Matcher was also persistent,
serving to improve its performance at all difficulty levels, but primarily on
the easiest and hardest cases. This effect probably occurred because the Matcher
rated far too many antibodies positively, and incompatibility knowledge pushed
antibodies from low positive ratings to Unknown. For the Assembler, incom-
patibility knowledge made only a slight difference in accuracy, and that just
on the hardest cases. The way the Assembler's algorithm works, incompat-
ibility knowledge is likely to have more of an effect when the answer has
more parts.
4. Surprisingly, all the machines performed better on the two-antibody cases
than on the one-antibody cases (i.e., they did not perform as well on the easi-
est cases as they did on somewhat more difficult cases). Each case tests for
approximately the same number of antibodies, so a random guess on a two-
antibody problem is twice as likely to produce a correct answer as a random
guess on a one-antibody problem is. So machines such as the Matchers, which
assign positive ratings when there is no reason not to, will be correct more
often when more antibodies are present. Indeed, the phenomenon was largest
with the Matchers. However, the phenomenon also occurred with the Assem-
blers, although it was smaller. Furthermore, the Matchers' performance de-
creased on the three-antibody cases, so the random-hit hypothesis cannot ex-
plain the phenomenon completely. Perhaps it is a result of some bias in the
test cases.
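
The random-hit argument in point 4 can be made concrete with a little arithmetic. The sketch assumes that every case tests the average number of antibodies (1,012 tested antibodies over 42 cases) and that a random guess picks one tested antibody uniformly; both are simplifications, and the names are hypothetical.

```python
from fractions import Fraction

CASES, TESTED = 42, 1012
AVG_TESTED_PER_CASE = Fraction(TESTED, CASES)   # about 24 antibodies per case


def random_hit_chance(antibodies_present):
    """Chance that one randomly chosen tested antibody is actually present."""
    return Fraction(antibodies_present) / AVG_TESTED_PER_CASE


ratio = random_hit_chance(2) / random_hit_chance(1)
```

The ratio is exactly 2 under these assumptions, whatever the average number of tested antibodies is, which is why a machine that rates positively by default benefits from cases with more antibodies present.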
Because the number of antibodies in the correct answer is only a crude
measure of case difficulty, and because the proportion of antibodies present
to antibodies tested is higher in the multiple-antibody cases, we cannot rely
strongly on the numbers in Table 9.10 to reveal the true change in accuracy
of each machine with increasing case difficulty. Nevertheless, the compara-
tive performance of the four machines clearly holds for each level of diffi-
culty.
Implications for RED. Some facts stand out from the data presented here,
and suggest possible improvements to future versions of RED.
1. The Matcher is well suited to ruling out hypotheses. In this experiment it was
correct in its negative ratings more than 99% of the time. Therefore, RED
should be cautious about assigning an antibody a positive rating after the
Matcher assigns it a negative rating. In particular, the Assembler can ignore
antibodies rated Implausible or lower by the Matcher (currently it ignores
only those that are ruled out) and use them only if it cannot find a complete
explanation otherwise.
2. When an antibody is not useful in forming a complete explanation, this is not
good evidence against it. So it should not be lowered in confidence more than
about one step.
Further study is needed to determine the effects of these proposed changes.
Conclusions
We conducted an experiment to measure the degree to which explicitly rep-
resented knowledge of explanatory relationships can contribute to reducing
uncertainty and increasing correctness for abductive reasoning. Explanatory
relationships were found to make a significant contribution to reducing un-
certainty and to increasing correctness. In both cases the effect was dra-
matic.
Explicitly represented knowledge of hypothesis incompatibility was also
tested. The apparent effect of using this knowledge was increased overall
uncertainty, but a slightly increased rate of correct judgments. The increase
was so slight, however, that it should not be considered significant.
Confidence values correlated well with correctness. All four machines
were more often correct when they were more certain, and they were never
wrong at the extremes of confidence. This fact provides a form of validation
for the qualitative confidence vocabulary used in the experiment, and for
the confidence-setting rules used in the machines. In particular, the strategy
for weighing explanatory evidence should be considered to be validated in
the large, although not in every detail. The confidence-setting behavior of
the machines was not only reasonable (according to internal considerations),
but also realistic (based on correspondence with the facts).
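
The claim that confidence tracked correctness can be checked mechanically against the percentage-correct figures reported for the four machines. In this sketch the dictionary layout and the function name `confidence_tracks_correctness` are assumptions; the Matchers have no Confirmed entry because they never confirmed anything.

```python
PERCENT_CORRECT = {
    "Matcher":                      {"VP": 34.3, "P": 21.3, "I": 98.8, "VI": 99.2, "RO": 100.0},
    "Matcher with Incompatibles":   {"VP": 34.3, "P": 23.5, "I": 97.8, "VI": 99.2, "RO": 100.0},
    "Assembler":                    {"C": 100.0, "VP": 68.8, "P": 37.7,
                                     "I": 94.4, "VI": 99.2, "RO": 100.0},
    "Assembler with Incompatibles": {"C": 100.0, "VP": 68.8, "P": 38.5,
                                     "I": 95.2, "VI": 99.2, "RO": 100.0},
}


def confidence_tracks_correctness(row):
    """True if accuracy never falls as certainty grows, on both sides of Unknown."""
    positive = [row[c] for c in ("P", "VP", "C") if c in row]   # weakest to strongest
    negative = [row[c] for c in ("I", "VI", "RO") if c in row]  # weakest to strongest
    return positive == sorted(positive) and negative == sorted(negative)


checks = {machine: confidence_tracks_correctness(row)
          for machine, row in PERCENT_CORRECT.items()}
extremes = {value for row in PERCENT_CORRECT.values()
            for category, value in row.items() if category in ("C", "RO")}
```

Every machine passes the monotonicity check, and the only value found at the extremes (Confirmed and Ruled Out) is 100.0.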
This experiment is in the spirit of others done on knowledge-based sys-
tems, though testing the contribution of a particular type of knowledge may
be unique. It is a reasonable sort of validation to ask for any advisory system
that makes judgments qualified by confidence estimates, that its confidence
should correlate with its correctness, though to our knowledge this is rarely
done.
Notes
1 If two Essential hypotheses are incompatible, then a serious anomaly has occurred, and normal
processing is suspended. It is possible that the conflict can be resolved by generating more hy-
potheses or by performing tests to reconfirm the existing data, but both of these solutions are
outside the scope of the present discussion.
2 Again, if two Clear-Best hypotheses are incompatible, then special handling takes over.
3 PEIRCE-IGTT was designed by John R. Josephson, Hari Narayanan, and Diana Smetters, and
implemented in Common Lisp and CLOS (Common Lisp Object System) by Diana Smetters and
Richard Fox.
4 The Integrated Generic-Task Toolset (IGTT) used in many of our systems benefited greatly from
the sponsorship of the Defense Advanced Research Projects Agency under the
Strategic Computing Program. Other significant support for its development came at various times from Xerox, IBM,
DEC, and Texas Instruments corporations.
5 Implications can be hard or soft. A soft implication has a degree of strength.
6 Thagard gave a somewhat counterintuitive analysis of the explanatory relationships here and in a
few other places. We first encoded the case using his analysis, ran it, and then reencoded it in a
form that seemed more natural. The changes were few, and the performance was not greatly al-
tered. This alteration is described later in this section.
7 Compare this with the popular TV detective Columbo, who often solves a difficult case by dog-
gedly trying to find a satisfactory explanation for some seemingly insignificant fact.
10 Perception and language understanding
Also, computational models of information processing for both vision and
spoken language understanding have commonly supposed an orderly pro-
gression of layers, beginning near the retina or auditory periphery, where
hypotheses are formed about "low-level" features, e.g., edges (in vision) or
bursts (in speech perception), and proceeding by stages to higher level hy-
potheses. Models intended to be comprehensive often suppose three or more
major layers, often with sublayers, and sometimes with parallel channels
that separate and combine to support higher level hypotheses. For example,
shading discontinuities and color contrasts may separately support hypoth-
eses about object boundary (Marr, 1982). Recent work on primate vision
appears to show the existence of separate channels for information about
shading, texture, and color, not all supplying information to the same layers
of interpretation (Livingstone & Hubel, 1988). Audition, phonetics, gram-
mar, and semantics have sometimes been proposed as distinct layers of in-
terpretation for speech understanding.
In both vision and speech understanding most of the processing of infor-
mation is presumably bottom up, from information produced by the sensory
organ, through intermediate representations, to the abstract cognitive cat-
egories that are used for reasoning. Yet top-down processing is presumably
also significant, as higher level information imposes biases and helps with
identification and disambiguation. Both vision and speech understanding
can thus be thought of as layered-interpretation tasks wherein the output
from one layer becomes data to be interpreted at the next. Layered-interpre-
tation models for nonperceptual interpretive processes make sense too. For
example, medical diagnosis can be thought of as an inference that typically
proceeds from symptoms to pathological states to diseases to etiologies. Simi-
lar to perception, medical diagnosis is presumably mostly bottom up, but
with a significant amount of top-down processing serving similar functions.
It is reasonable to expect that perceptual processes have been optimized
over evolutionary time (become efficient, not necessarily optimal) and that
the specific layers and their hypotheses, especially at lower levels, have
been compiled into special-purpose mechanisms. Within the life span of a
single organism, language learning and perceptual learning provide addi-
tional opportunities for compilation and optimization. Nevertheless, it seems
that at each layer of interpretation the abstract information-processing task
is the same: that of forming a coherent, composite best explanation of the
data from the previous layer or layers. That is, the task is abduction, and, in
particular, abduction requiring the formation of composite hypotheses.
If the information processing that occurs in the various layers and senses
is functionally similar, then perhaps their mechanisms are similar too at a
certain level of description. Thus we are led to hypothesize that the informa-
tion-processing mechanisms that occur in vision, in hearing, in understand-
ing spoken language, and in interpreting information from other senses (natu-
ral and robotic) are all variations, incomplete realizations, or compilations
(domain-specific optimizations) of one basic computational mechanism. Thus
we propose what we may call the layered-abduction model of perception.
What is new in this model is the specific hypothesis that perception uses
abductive inferences, occurring in layers, together with a specific computa-
tional model of abductive processing.
1. Actually the islands are strong. They are never based only on a hypothesis
having high initial confidence; a hypothesis must also be a best explanation
for some datum.
2. Inconsistencies lead to detected anomalies, which lead to special strategies
that weigh alternative courses of action. Originally accepted hypotheses can
collide with others and subsequently be called into question.
3. Inconsistency collisions can occur laterally or from above (violation of ex-
pectation) and they can come in degrees of strength. In effect there is broad
cross-checking of accepted hypotheses.
4. An inexplicable datum can be doubted and called into question. If after re-
evaluation the datum remains strong despite the doubt, then the system can
detect that it has encountered the limits of its knowledge, and it is positioned
to learn a new hypothesis category.
5. Sometimes two parts of a compound hypothesis are inconsistent in context,
that is, a consistent hypothesis cannot be formed at the next highest level. (It
seems that this can account for unstable perceptual objects such as the Necker
cube.) Under these circumstances special handling takes over, similar to that
for other kinds of detected inconsistencies.
Notice the level of description that we gave for Machine 6. It might be
implemented as an algorithmic computer - an instruction follower - or as a
connectionist computer, whose primitive processing elements work by propa-
gating activations. We described the functional and semantic significance of
various actions of the machine, and the flow of control, but we did not describe
precisely how these actions are implemented on an underlying machine.
[Figure 10.1: diagram of the six strata, from top to bottom. Pragmatic
stratum: discourse structure, field, tenor, mode. Grammatical stratum:
sentence, clause, group/phrase, word, morpheme (constituent hierarchy), with
a classification hierarchy that includes non-finite, imperative, declarative,
and interrogative clauses. Phonological-prosodic stratum: intonation phrases,
intermediate phrases, prosodic words, stress feet, syllables, phonemes
(constituent hierarchy), with syllables classified as accented or unaccented,
stressed or unstressed, full or reduced. Phonetic stratum: tone, lip
aperture, tongue tip constriction, tongue body constriction, velic aperture,
glottal aperture. Auditory stratum: voice fundamental frequency channels,
formant frequencies, burst events. Acoustic stratum: spectral profile.]
the units are organized into a classification hierarchy. For example, as shown
in Figure 10.1, the grammatical stratum has these ranks: sentence, clause,
phrase, word, and morpheme; clauses can be classified as finite or nonfinite
and even further subclassified. Composite hypotheses are formed at each
rank and channel by processes that are abductive inferences, both logically
and procedurally (and thus go beyond simple recognition and classification).
The direction of interpretive inferences is from bottom towards top,
whereas the direction of inference based on expectation is from the top down.
Let us describe each of the six strata in more detail.
CV and 3-Level
CV was a first step in using abduction for speech recognition from acoustic
signals.7 The system had many interesting features, including abductive
formant trackers that worked by assembling consistent hypotheses to ex-
plain spectral peaks at each time slice.8 CV demonstrated that a multicriterial,
feature-based recognizer can produce reasonable accuracy where the acous-
tic correlates of phonetic features are fairly well understood. Yet it appears
to be difficult to determine place of articulation in any direct way from
acoustic evidence, at least not with any reliability. (Place of articulation is
where the vocal tract is constricted to produce a consonant, e.g., it is the
basis for the difference between /b/, /d/, and /g/.) However, the system used
a classification hierarchy for its hypotheses. So, for example, even when it
was unable to distinguish between /b/ and /d/, it could nevertheless
confidently identify the sound as a voiced stop consonant (which might be good
enough in some contexts). Although CV used separate abducers for formant
tracking and for arriving at its final output, it was not really a layered-ab-
duction machine in the sense of the Machine 6 model.
The 3-Level system was a first attempt to use abduction to infer words
from articulatory input, one of the interlayer inferences called for by the
model of speech processing described earlier in this chapter.9 It was also a
first attempt to stack abducers into a multilevel machine and to try top-
down processing. The system inferred phonemes from articulation and words
from phonemes, thus using two levels of abductive interpretation to bridge
three levels of representation.10 The input was a (hand-coded) gestural score
(Browman & Goldstein, 1989) of articulatory events laid out over time, in-
cluding such events as lip closure, tongue lowering, and onset of voicing.
Taking the articulatory events as data to be explained, the system evoked
phoneme hypotheses and then rated these hypotheses according to prestored
knowledge of associations (positive and negative) of phonemes with
articulatory events. It then assembled a best explanation (using the PEIRCE-
IGTT version of the abductive-assembly strategy described in chapter 9).
(The entire system was built using the IGT toolset.) The best explanation
accepted at the phoneme agora became data to be explained at the word
agora, where a similar process of hypothesis evocation, scoring, and compo-
sition occurred. The output was a word or set of words that explained the
input gestures.11
With accurate inputs, the system had no difficulty finding the correct pho-
nemes and then the correct words. Layered abduction worked. With "dirty"
inputs (perturbations introduced by hand), performance degraded as expected.
When knowledge of incompatibles and implications was added, performance
on dirty inputs improved significantly. Thus the mechanisms for using these
forms of knowledge also functioned as expected.
Finally, a form of top-down discrimination was encoded. Sometimes more
than one equally rated word hypothesis offered to account for approximately
the same stretch of phonemes. In one example, "beet," "boot," and "bat"
were the only alternatives, and they were equally rated. This sort of
situation triggered a reevaluation of the unknown phoneme by which they
differed, in this case the middle vowel. In this example the matter had been
left unresolved at the phoneme agora because the two rivals /I/ and /er/ could
not be distinguished (the tongue was high, but how far back was indefinite).
"Beet" wanted the vowel /I/, "boot" wanted /uw/, and "bat" wanted /ae/.
Intersecting the possibilities from above and below, /I/ was the only
remaining possibility, so /I/ was chosen at the phoneme agora, causing the word
"beet" to win. (The processing strategy was actually encoded as top-down
expectation-based encouragement for the remaining possibilities from above,
which made /I/ the winner.) Thus a definite improvement was demonstrated
for this form of top-down processing.
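
The intersection step in the "beet" example can be illustrated with a small
sketch. It is not the 3-Level implementation, which encoded the strategy as
expectation-based encouragement rather than as a literal set intersection;
the function name and data layout here are hypothetical.

```python
def resolve_by_expectation(candidates_below, word_expectations):
    """Intersect bottom-up phoneme rivals with top-down word expectations.

    candidates_below: phonemes left unresolved at the lower level.
    word_expectations: maps each rival word to the phoneme it expects.
    Returns (phoneme, words) when exactly one phoneme survives, else None.
    """
    expected = set(word_expectations.values())
    surviving = set(candidates_below) & expected
    if len(surviving) != 1:
        return None  # still ambiguous, or nothing fits: leave unresolved
    phoneme = next(iter(surviving))
    winners = [w for w, p in word_expectations.items() if p == phoneme]
    return phoneme, winners


# The example from the text: two rivals from below, three words from above.
rivals = {"/I/", "/er/"}
wanted = {"beet": "/I/", "boot": "/uw/", "bat": "/ae/"}
result = resolve_by_expectation(rivals, wanted)
```

Here `result` names /I/ as the surviving phoneme and "beet" as the word it
supports. When more than one phoneme survives the intersection, the sketch
returns nothing, which corresponds to leaving the matter unresolved.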
3-Level was able to find word boundaries by trading off alternative word-
level explanations for the occurrence of phonemes. The choice between word-
boundary locations is a byproduct of the choice between alternative com-
posite word-level hypotheses. This is especially interesting because, with
current technology, computer-based interpretation of continuous speech lags
far behind computer-based recognition of isolated words precisely because
word boundaries are not well marked acoustically (in English at least). Our
abductive machinery even (properly) allows a phoneme to belong to two
different words at the same time; so, for example, the middle /s/ in "six
stones" can be explained as belonging to both words (just as in RED a reac-
tion may be explained by the presence of two different antibodies).
How ArtRec works. The task of ArtRec is to infer the words, and to deter-
mine which word, if any, is being emphasized. A three-layered abduction
system is used, consisting of three layers of representation with two
abductive transitions between them. Each abductive transition uses the ba-
sic strategy of evocation, instantiation, and composition of hypotheses.
The first findings to be explained are the motions of the pellets, repre-
sented by their extremes, and evident as peaks and valleys in the horizontal
and vertical components of the pellet motions (viewed from the side of the
head). The available explanatory hypotheses are qualitative vocal tract ges-
tures, including apical (alveolar) closure, labial closure, labiodental closure,
dorsal closure, tongue-forward, tongue retroflection, tongue lowering, tongue
blade retraction, tongue high and front, mandible low, tongue dorsal for-
ward, and palatal-glide tongue motion. Motion extremes that surpass preset
thresholds cue gestures that might explain them. Gestures are then scored by
comparison with ideal forms of the gestures and based on the presence and
absence of other relevant pellet-motion features. An eventual attempt is made
to explain all peaks and valleys no matter how extreme (whether they pass a
threshold or not), but certain hypotheses are plausible only if the events of
interest have surpassed a threshold. (Apical closure, for example, requires
that the tongue tip is raised far enough to reach the alveolar ridge.) Usually,
many more gesture hypotheses are cued than are needed to explain a par-
ticular peak or valley.
The gesture hypotheses receive initial plausibility scores based on pellet-
motion features chosen by human examination of the pellet data. We identi-
fied necessary, impossible, and other positively and negatively associated
features. A gesture receives a high confidence score if all the necessary fea-
tures and none of the impossible features are present; it receives a lower
score if necessary features are missing or if impossible features occur. Then,
other associated features are checked, and small additional alterations to the
confidence score may be made.
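
A minimal sketch of this scoring scheme follows. The numeric scale, the
penalty sizes, and the feature names are all assumptions made for
illustration; the text does not give the values ArtRec used.

```python
def score_gesture(observed, necessary, impossible, positive=(), negative=(),
                  base=1.0, penalty=0.4, nudge=0.05):
    """Score one gesture hypothesis against observed pellet-motion features.

    Starts from a high score, subtracts a large penalty for every missing
    necessary feature and every impossible feature that is present, then
    applies small adjustments for other associated features.
    All numeric values are hypothetical.
    """
    observed = set(observed)
    missing = [f for f in necessary if f not in observed]
    violations = [f for f in impossible if f in observed]
    score = base - penalty * (len(missing) + len(violations))
    score += nudge * sum(1 for f in positive if f in observed)
    score -= nudge * sum(1 for f in negative if f in observed)
    return max(0.0, min(1.0, score))  # clamp to the assumed 0..1 scale


# Hypothetical feature sets for an apical-closure hypothesis.
apical = dict(necessary=["tongue_tip_raised"], impossible=["tongue_tip_low"])
well_supported = score_gesture({"tongue_tip_raised"}, **apical)
unsupported = score_gesture({"tongue_tip_low"}, **apical)
```

In this sketch `well_supported` receives the top score, while `unsupported`
is penalized twice, once for the missing necessary feature and once for the
impossible one.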
After initial plausibility scoring, the system determines what each gesture
hypothesis can explain. The cueing of a gesture hypothesis is based on the
pellet motions necessarily associated with the gesture, but typically other
motions can be accounted for as well. For example, apical closure requires
that the tongue tip be raised, but apical closure can also explain other mo-
tions such as the tongue blade being raised.
Besides articulatory-gesture hypotheses, noise hypotheses are generated
and can be used to explain portions of the data. The articulatory data are
inherently noisy because of the need to minimize X-ray exposures of the
subjects. In addition, some motions of the articulators are unintentional or
in other ways are not a result of linguistic control. These nonlinguistic phe-
nomena are treated simply as noise (i.e., apparent pellet motions not best
explained as gestures). Noise hypotheses are treated as explanatory hypoth-
eses, much like any others. A noise hypothesis has a modest initial plausibil-
ity (which depends on various factors), and will be accepted if, in context, it
is the best explanation for some datum.
After evocation and scoring of noise and gesture hypotheses, the first
abducer is run. This abducer generates a composite hypothesis consisting of
gesture and noise hypotheses and constituting a best explanation for the pel-
let motions. Typically 75 to 100 gesture hypotheses and 25 to 50 noise hy-
potheses are generated. The abducer prunes this down to only 25 to 50, which
together constitute the best gesture-level explanation for the pellet data in
the 5-second utterance. This hypothesis then becomes data to be explained
at the word level.
A word is hypothesized to be centered at each point where the mouth is
maximally open. This is detected as a point where the mandible (jaw) reaches
a local low point. This simple method works largely because all the words in
our initial lexicon are monosyllables, and it cannot be expected to work for
polysyllabic words (though it may be a good way to cue consonant-vowel-
consonant syllables in a future system).
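
The cueing of word locations can be sketched as a search for local minima in
the vertical position of the mandible pellet. The sampled-trace
representation and the function name are assumptions; the actual system may
have smoothed the signal or handled flat regions differently.

```python
def word_centers(jaw_height):
    """Return the sample indices where the jaw reaches a local low point.

    jaw_height: vertical mandible position sampled over time (lower value
    means a more open mouth). Only strict local minima are reported, so a
    flat-bottomed trough is not detected by this simple version.
    """
    centers = []
    for i in range(1, len(jaw_height) - 1):
        lower_than_prev = jaw_height[i] < jaw_height[i - 1]
        lower_than_next = jaw_height[i] < jaw_height[i + 1]
        if lower_than_prev and lower_than_next:
            centers.append(i)
    return centers


# A toy trace with two jaw openings, hence two hypothesized word centers.
centers = word_centers([5, 3, 4, 6, 2, 7])
```

Each index in `centers` would then serve as the point around which nearby
gestures are clustered.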
For each hypothesized word location, all the gestures nearby are clustered
and are used to cue word hypotheses. The word hypotheses are then scored
based on the presence, absence, and temporal ordering of gestures of inter-
est in that cluster. For example, the word "nine" expects two alveolar clo-
sures surrounding a tongue-blade lowering gesture. If these are found, the
"nine" hypothesis is scored as highly plausible, unless unexpected or impos-
sible gestures are found nearby. The word "nine" would not expect labial
closure; in fact it is impossible to pronounce the word "nine" with one's lips
closed. The presence of unexpected gestures lowers the plausibility score to
one degree or another.
The next step is to determine the gestures that each word hypothesis can
explain. As usual, word hypotheses can account for more gestures than just
those that are used to score the hypotheses.
Word hypotheses that overlap too much in time are set to be incompatible
because two words cannot be uttered simultaneously. Incompatibility rela-
tionships are subsequently used to reduce ambiguity, according to the
abductive-assembly strategy that was described in chapter 9.
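
One way to derive these incompatibility relationships is sketched below. The
text does not say how much overlap counts as "too much," so the overlap
measure and the 0.5 threshold are assumptions, as are the hypothesis labels.

```python
from itertools import combinations


def overlap_fraction(a, b):
    """Overlap of two (start, end) spans as a fraction of the shorter span."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    shorter = min(a[1] - a[0], b[1] - b[0])
    return overlap / shorter if shorter > 0 else 0.0


def incompatible_pairs(hypotheses, max_overlap=0.5):
    """Mark word hypotheses that overlap too much in time as incompatible.

    hypotheses: maps a hypothesis label to its (start, end) time span.
    max_overlap: hypothetical threshold for "too much" overlap.
    """
    pairs = set()
    for (name1, span1), (name2, span2) in combinations(hypotheses.items(), 2):
        if overlap_fraction(span1, span2) > max_overlap:
            pairs.add(frozenset((name1, name2)))
    return pairs


spans = {"five@1": (0.0, 0.4), "nine@1": (0.1, 0.5), "pine@2": (0.45, 0.9)}
pairs = incompatible_pairs(spans)
```

In this toy case only "five@1" and "nine@1" are set to be incompatible; the
slight overlap between "nine@1" and "pine@2" is tolerated.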
Besides word hypotheses, this level, too, has noise hypotheses, which
may be used to explain gestures not explained by any word. Even if a ges-
ture is correctly accepted at the lower level as part of the best explanation,
there is no guarantee that the gesture is actually part of a word. Such ges-
tures may be well formed but unintentional, or they may be byproducts of
motions associated with preceding or succeeding gestures, or they may be
manifestations of a speaker who is nervous, who is tired, who is excited,
or who speaks idiosyncratically. ArtRec covers for all these possibilities by
judiciously using explicit noise hypotheses to account for data items that
have, in context, no better alternative explanations. Noise hypotheses are
introduced automatically as possible explanations for gestures that were not
given high plausibility scores. They are also introduced if all the word hy-
potheses score poorly. Noise hypotheses are set to be incompatible with
word hypotheses that are attempting to explain the same gestures.
Using the word and noise hypotheses, the second abducer forms a best
explanation of the gestures accepted by the previous abducer. The abductive-
assembly strategy at both levels is the strategy described in chapter 9. In
particular, the abducers accept Confirmed, Essential, Clear-Best, and Weak-
Best hypotheses, but they do not go any further in attempting to resolve ties
between equally plausible hypotheses (e.g., by voting).
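
The graded acceptance policy can be illustrated for a single datum. This
sketch rests on simplifying assumptions: an Essential hypothesis is taken to
be the only available explainer, Clear-Best and Weak-Best are distinguished
only by how far the best score exceeds the runner-up, the margins are
hypothetical, and the Confirmed category is omitted. The strategy itself is
the one described in chapter 9.

```python
def classify_best(explainers, clear_margin=0.3, weak_margin=0.1):
    """Classify the best explainer of one datum, or return None.

    explainers: maps each hypothesis that can explain the datum to its
    plausibility score. The margins are hypothetical thresholds.
    """
    if not explainers:
        return None
    ranked = sorted(explainers.items(), key=lambda kv: kv[1], reverse=True)
    best, best_score = ranked[0]
    if len(ranked) == 1:
        return best, "Essential"  # nothing else can explain this datum
    gap = best_score - ranked[1][1]
    if gap >= clear_margin:
        return best, "Clear-Best"
    if gap >= weak_margin:
        return best, "Weak-Best"
    return None  # tie or near tie: left unresolved, no voting


only_one = classify_best({"nine": 0.9})
clear = classify_best({"nine": 0.9, "five": 0.5})
weak = classify_best({"nine": 0.9, "five": 0.7})
tied = classify_best({"nine": 0.8, "five": 0.8})
```

The last case returns nothing, reflecting the decision not to break ties
between equally plausible hypotheses.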
After forming a composite word-level hypothesis, mandible motion is used to
determine which word (if any) was emphasized in the utterance. The empha-
sized word is one in which the mandible movement is exaggerated (i.e., signifi-
cantly larger than the mandible movement of other words surrounding it).
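
A sketch of this emphasis test follows. It assumes that each accepted word
carries a single number for the size of its mandible movement, that
"surrounding" means the immediately adjacent words, and that "significantly
larger" means exceeding each neighbor by a fixed ratio. None of these details
are given in the text.

```python
def emphasized_word(words, ratio=1.5):
    """Return the index of the emphasized word, or None if there is none.

    words: list of (word, jaw_excursion) pairs in temporal order.
    ratio: hypothetical factor by which a word's jaw excursion must exceed
    that of each adjacent word to count as exaggerated.
    """
    best = None
    for i, (_, excursion) in enumerate(words):
        neighbors = [words[j][1] for j in (i - 1, i + 1) if 0 <= j < len(words)]
        if not neighbors:
            continue  # a lone word has nothing to be compared with
        if all(excursion >= ratio * n for n in neighbors):
            if best is None or excursion > words[best][1]:
                best = i
    return best


utterance = [("nine", 9.0), ("five", 4.0), ("nine", 4.5), ("pine", 4.2)]
emphasized = emphasized_word(utterance)
```

For this toy utterance the first word is selected; an utterance with uniform
jaw excursions yields no emphasized word.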
Experimental results. Before describing the results we want to point out that
the lexicon is very small, but tricky, because there are many close articulatory
similarities among the words. For example, "five," "nine," and "pine" are
very difficult to distinguish because there are no acoustic inputs to indicate
voicing, nasality, or the acoustic bursts indicative of stop consonants. This
lexicon was chosen because we wanted to explore the phenomenon of cor-
rective emphasis.
Five sets of speaker data were initially acquired from the Wisconsin Mi-
crobeam facility. Each of the five English speakers uttered 80 different ques-
tion-answer pairs, using all the combinations of "five" and "nine," and us-
ing affirmative and negative assertion patterns, as described earlier in the
subsection entitled The Initial Setup.
After creating the initial ArtRec system using the integrated generic task
toolset and PEIRCE-IGTT (described in chapter 9), we tested it on 20 utter-
ances from a single speaker. This testing required examining the pellet data
and tuning the recognition (hypothesis-scoring) knowledge to the specifics
of the individual speaker. Speakers differ not only in the dimensions of their
articulatory apparatuses, but also in the number and exact placements of the
gold pellets. After the system was run on the 20 utterances, the mistakes
were analyzed, and a process of retuning began. After some time the system
achieved approximately 90% accuracy on the 20 utterances. The next step
was to run the system on all the utterances of that speaker (60 more). This
required further tuning and testing, but eventually all 80 cases were run.
Results were again in the 90% range.
The next step was to use the system on the data from a different speaker.
Frustratingly, this required making further modifications to the system. As a
consequence, the system was redesigned so that almost all speaker differ-
ences could be handled by setting a small number of parameters, primarily
threshold values for recognizing gestures from pellet positions. Tuning these
parameters to each new speaker is an iterative process of examining the new
pellet data, setting parameters, testing, and adjusting again until satisfac-
tory results are obtained. This was done for each new speaker. Finally, the
system had been tuned for all five speaker sets, and all the data were run
through the system. Results were encouraging although not entirely satisfy-
ing. Consequently, the system was redesigned to work as described earlier
in the subsection entitled How ArtRec Works. Let us briefly examine the
differences between the initial system (the one just described tuned for all
five speaker sets) and the redesigned system.
First, the initial system had a single abduction layer. The same three levels of
representations were used - pellet motions, articulatory gestures, and words -
but the transition from motions to gestures was simply recognition-based and
did not use abductive assembly. That is, gesture hypotheses were generated and
scored, but all hypotheses that were not ruled out (lowest score) were passed
upward to be explained at the word agora by an explicitly abductive process.
The result was that many more gesture hypotheses were generated than were
needed to explain the data, and many more than were actually true. The single
abducer then attempted to explain all these gestures with word hypotheses. Be-
cause of the lack of explicit explanatory relationships at the motions-to-ges-
tures transition, and because there was no attempt to explain the motions, no
noise hypotheses were generated. The redesigned system added a lower level
abducer for explaining motion events using articulatory-gesture hypotheses, and
noise hypotheses were made available.
Second, the initial system used mandible closings to locate word bound-
aries. This overly constrained where a word hypothesis might find its asso-
ciated gestures, because gestures outside the word boundaries were not con-
sidered part of the word. Because the mandible closings were not closely
synchronized with the beginnings and endings of words, this was a source of
many errors. The redesigned system instead used mandible openings to lo-
cate word centers.
After the system was redesigned, the data for the five speakers were run
again. The accuracy noticeably improved. These newer results are presented
in Table 10.1.
Table 10.1 lists the five speakers. For each speaker it shows the total num-
ber of utterances, the total number of words that ArtRec found in all of the
speaker's utterances, the number of words for which the system committed
to a word identity, the number of these words that were correctly identified,
the percentage of the words located that were correctly identified, and the
percentage of the committed-to words that were correctly identified. An ut-
terance consisted of the entire question-answer pair (e.g., "Is it five five
nine Pine Street? No, it's NINE five nine Pine Street.") and lasted approxi-
mately 5 seconds. We had 80 utterances for each speaker, but some of these
were not usable because they contained pellet-tracking errors or other forms
of significantly bad data. Even for usable utterances, often words at the end
were not captured in the pellet data. Consequently, the average number of
words per usable utterance was approximately 11 of the 14.
Table 10.1. ArtRec performance
Speaker   Utterances   Words located   Words decided   Words correct   Located % correct   Decided % correct
Next steps
Speech recognition from articulation is more to us than simply a real-world
domain to use for experimentation with layered abduction. We intend to use
it as a step towards computer-based understanding of naturally spoken lan-
guage. Articulatory effects, and prosodic phenomena such as emphasis, ap-
pear to be at the core of the technical impediments to computer-based rec-
ognition of natural speech.
One of the deep, main disadvantages of the currently popular statistical
approaches to speech recognition is that a prototypical linguistic unit is ex-
aggerated, whereas a typical unit is reduced. Systems trained on the typical
will tend to fail on the prototypical, so statistical systems tend to fail if
something is fully pronounced. Humans, in contrast, seem to learn on the
basis of prototypes and then recognize using (we believe) best-explanation
inference based on redundant information present in scattered evidence. Thus,
Table 10.2. ArtRec emphasis determination
5p       54    39    72
6pp      44    29    66
7p       31    31   100
11pp     38    35    92
12ppp    35    32    91
Total   202   166    82.2
abductive argument, and that this justification is such that the conclusion, if
true, is "knowledge" in the philosophical sense. Yet abductions are fallible, so
abductive conclusions are always somewhat uncertain. Thus abductions
produce knowledge without absolute certainty. We pointed out that abductions
are ampliative inferences, that is, that abductions can transcend the information given in
their premises. We described this by saying that successful abductions are "truth
producing." Beyond these abstract points of logical and conceptual analysis, we
have shown by many experiments and working systems that abductive processes
actually function well to yield true conclusions. Abductive processes are good
at "truth finding." Conceptual analysis also reveals that abductions can produce
conclusions that are more certain than any of their premises. We called this
phenomenon "emergent certainty." Experiments reported in chapter 9 show that
abductions are uncertainty reducing.
If perceptions are abductions, then perceptions come with implicit abductive
justifications. Veridical perceptions are knowledge. The "conclusion" of a per-
ception is a best explanation and is logically justified, though it might be false
anyway. Perceptions have just the same sorts of vulnerabilities as scientific theo-
ries: there may be a better explanation that was not considered, or one that was
mistakenly ruled out, and so forth (see the section of chapter 1 on abductive
justification). Thus, perceptual "theory formation" is logically the same as sci-
entific theory formation, and also the theory formation of diagnosticians, crash
investigators, journalists, and juries. Perceptions go beyond the given informa-
tion in a way similar to scientific theories; they change the vocabulary of de-
scription and make the leap from data language to theory language. Perceptual
processes are explanation-seeking and explanation-accepting processes with the
same characteristics as scientific reasoning of being knowledge producing, truth
producing, fallible, truth finding, and uncertainty reducing. Scientific knowl-
edge is paradigmatic of knowledge; scientific knowledge is real knowledge if
anything is. Since perceptions have the same logic, perceptions are knowledge
too.
Knowledge comes from experience by abduction - in science, ordinary
life, and perception.
Notes
1 That is, there is an isomorphism after factoring out the subtask processing described in the text.
2 Under certain circumstances it is good enough just to score on the basis of voting by the stimu-
lating data from below, and then no top-down processing need occur, at least for scoring. Al-
though this strategy is less than logically ideal, it is computationally less expensive, at least in
the short run. This represents a type of compilation in which accuracy is traded for quickness.
3 An idea from William Whewell, British philosopher and historian of science, 1794-1866.
4 Ideas about the lower strata have benefited greatly from discussions with Mary Beckman, Rob-
ert Fox, Lawrence Feth, and Ashok Krishnamurthy.
5 [The best performing current technology for computer speech recognition, based on Hidden
Markov Models, is covertly abductive. I suggest that it can be improved by selecting hypotheses
preferentially according to the difference among the scores of the best and its alternatives, rather
than by simply selecting the highest scoring hypothesis. - J. J.]
6 For many reasons we doubt that traditionally construed phonemes are an accurate way to de-
scribe speech. They serve as a placeholder here, but we expect this to be revised.
7 CV was designed by Richard Fox, Mary Beckman, Robert Fox, and John Josephson with addi-
tional contributions of linguistic, perceptual, and signal-processing expertise provided by Ashok
Krishnamurthy, Jane Rauschenberg, and Benjamin Ao. The system was implemented by Rich-
ard Fox using the IGT toolset described in chapter 9. This work benefited from support from the
National Science Foundation and the Defense Advanced Project Agency under grant CBT-
8703745.
8 Formants are distinctive spectral concentrations characteristic of human speech sounds.
9 3-Level was designed by Richard Fox, John Josephson, and Sunil Thadani, and implemented by
Richard Fox and Sunil Thadani. Linguistic expertise was provided by Mary Beckman, Robert
Fox, and Benjamin Ao.
10 As we said before, we doubt that traditionally construed phonemes are an accurate way to de-
scribe human speech. However, they are good enough for these initial experiments.
11 The outputs should be considered to be prosodic rather than lexical words; that is, they are dis-
tinctive, word-sized speaking events without grammatical associations.
12 ArtRec was designed and built by Richard Fox, with some guidance from John Josephson. Kevin
Lenzo assisted with the experiments and the write-up. Osamu Fujimura and Donna Erickson
provided linguistic expertise and overall encouragement and support. ArtRec was implemented
in Common Lisp and the IGT toolset described in Chapter 9.
Appendix A Truth seekers
Abduction machines
In this book, we described six generations of abduction machines. Each
generation's story was told by describing an abstract machine and experi-
ments with realizations of the machine as actual computer programs. Each
realization was approximate, partial, something less than a full realization
of the abstract machine. Each realization was also more than the abstract
machine: an actual chunk of software, a knowledge-based expert system
constructed to do a job, with an abundance of insights, domain-specific so-
lutions, and engineering shortcuts to get the job done. The abstract machines
are simplified idealizations of actual software.
An abstract abduction machine is a design for a programming language for
building knowledge systems. It is also a design for a tool for constructing these
systems (a partial design, since a tool also has a programming environment).
Each of the six machines has a strategy for finding and accepting best
explanations. Machine 6 inherits all the abilities of the earlier machines.
Suppose that we connect it to abstract machines for the subtasks of hypoth-
esis matching, hierarchical classification, and knowledge-directed data re-
trieval (see chapter 2). Then we conjoin abstract machines able to derive
knowledge for various subtasks from certain forms of causal and structural
knowledge (see chapters 5 and 8). Then we implement the whole abductive
device as a program written for a problem-solving architecture, which is an
abstract device of a different sort that provides control for generalized, flex-
ible, goal-pursuing behavior (see chapter 4). The result is an abduction ma-
chine capable of the following:
• forming hypotheses by instantiation and by composition (Machine 1)
• forming maximally plausible composite hypotheses (Machine 1)
• making composite hypotheses parsimonious by removing explanatorily
superfluous parts (Machine 1)
• critically evaluating hypotheses, determining their most confident parts
(Machine 1)
This appendix was written by John R. Josephson
• making use of elementary hypotheses with varying levels of specificity
organized into a taxonomic hierarchy (Machine 1 with CSRL)
• justifying its conclusions (Machine 1)
• decomposing larger explanation problems into smaller ones (Machine 2)
• handling incompatible hypotheses to ensure a consistent composite ex-
planation (Machine 2)
• flexible, opportunistic control (Machine 3)
• assembling composite hypotheses from parts that are themselves com-
posite hypotheses (Machine 3)
• assembling low-specificity hypotheses, then controlling their refinement
(Machine 3)
• using explicitly represented causal and structural knowledge to derive
hypothesis-matching knowledge, to derive explanatory relationships, and
to assemble causally integrated composite hypotheses (TIPS, PATHEX/
LIVER, MDX2, QUAWDS)
• processing concurrently (Machine 4)
• minimizing backtracking by forming the most confident partial conclu-
sions first (Machine 4)
• descending gradually from higher to lower confidence partial conclu-
sions, all the way to intelligent guessing or even arbitrary choice if nec-
essary (Machine 4)
• forming confident partial explanations with ambiguous data left unex-
plained (Machine 4)
• using incompatibility relationships to reduce ambiguity and to extend
partial explanations to more complete ones (Machine 5)
• handling hard and soft incompatibility and implication relationships
(Machine 5)
• coordinating multiple sites of hypothesis formation while integrating
bottom-up and top-down flow of information (Machine 6)
• handling noise and novelty (described with Machine 6)
• forming a composite hypothesis that follows a causal chain from effects
to causes (Machines 2 and 6)
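
One of the abilities listed above, making a composite hypothesis parsimonious by removing explanatorily superfluous parts, can be illustrated with a minimal sketch. This is not the RED or PEIRCE implementation: the function name `make_parsimonious`, the `explains` and `plausibility` dictionaries, and the choice to try removing the least plausible parts first are all assumptions made for illustration.

```python
def make_parsimonious(composite, explains, findings, plausibility):
    """Drop parts of a composite hypothesis that are explanatorily superfluous.

    composite    -- list of hypothesis names (hypothetical data layout)
    explains     -- dict: hypothesis name -> set of findings it accounts for
    findings     -- set of findings the composite must keep explaining
    plausibility -- dict: hypothesis name -> coarse integer confidence
    """
    kept = list(composite)
    # Assumption: consider the least plausible parts for removal first.
    for h in sorted(composite, key=lambda name: plausibility[name]):
        others = [k for k in kept if k != h]
        covered = set().union(*(explains[k] for k in others))
        if findings <= covered:
            kept = others  # h explains nothing the remaining parts do not
    return kept
```

A part is removed only when the remaining parts still account for every finding, so explanatory coverage is never reduced.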
Machine 3 shows how abduction machines can have parts that are them-
selves abduction machines. Machine 6 shows how abduction machines can
be composed of layers of communicating abduction machines and how these
layered machines can be used for perception.
Machines 1 through 3 are goal-seeking machines with the primary goal
of forming a complete explanation, while maintaining consistency, maxi-
mizing plausibility, and so forth. Machine 4 is transitional, and for Machines
5 and 6 the primary goal is that of explaining as much as possible while
maintaining standards of confidence. All the machines are explanation seek-
ers.
In synthetic worlds
Suppose that an explanation-seeking abduction machine is placed in a syn-
thetic world with certain predefined characteristics. Suppose the machine is
equipped with sensors that detect certain elements of the world and deliver
sensory events as findings to be explained. The abduction machine is a syn-
thetic agent in a synthetic world. Under what conditions can it infer the
characteristics of its world?
We can try to answer this question by designing and building systems,
proving theorems, and performing experiments, while controlling the char-
acteristics of the worlds. By this process we turn philosophical questions
about the limits of knowledge into technical questions about the behavior of
information-processing systems.
Suppose that the synthetic world is changing. Instead of "snapshot" ab-
duction we need "moving picture" abduction and a machine capable of track-
ing the changing conditions and of dynamically maintaining a changing rep-
resentation of its world. One way to do this is with a snapshot machine able
to take pictures fast enough to make a movie. A better way is with a machine
that uses its best estimate of the state of the world to generate predictions,
and then detects deviations from these predictions and tries to explain them.
Explanations for the deviations result in revising the estimate of the world
state and appropriately revising predictions.1
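
The predict, detect, and explain cycle just described can be sketched as a small loop. This is a hypothetical illustration under stated assumptions, not a description of any published system: the function `track` and the three callables `predict`, `explain`, and `revise` are placeholder names, and sensor readings are assumed to arrive as dictionaries.

```python
def track(estimate, observations, predict, explain, revise):
    """Maintain a changing world estimate by explaining prediction failures.

    predict(estimate)             -> dict of expected sensor readings
    explain(estimate, deviations) -> hypothesis accounting for the deviations
    revise(estimate, hypothesis)  -> updated estimate
    All three callables are placeholders supplied by the caller.
    """
    history = []
    for observed in observations:
        expected = predict(estimate)
        # Only readings that differ from the prediction need explaining.
        deviations = {k: v for k, v in observed.items() if expected.get(k) != v}
        if deviations:
            hypothesis = explain(estimate, deviations)
            estimate = revise(estimate, hypothesis)
        history.append(estimate)
    return history
```

When the predictions hold, no abductive work is done at all; explanation is triggered only by surprise, which is what distinguishes this design from taking snapshots at a fast rate.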
Suppose that the synthetic world has general characteristics unknown to
the abductive agent. Suppose that there are unknown causal processes, hid-
den mechanisms, and unknown laws. Is it more difficult to design abduction
machines capable of forming general theories than to design ones that form
minitheories for explaining particular cases? It should be possible to design
abduction machines that hypothesize empirical generalizations to explain
observed regularities (see chapter 1 on how inductive generalization is ab-
duction) and that explain empirical generalizations by hypothesizing hidden
causality.
In any reasonably complex world a seeker must prioritize the search for
explanations; there is simply too much to be known. Abductive agents need
pragmatic bias. They need goals.
Suppose that the synthetic world is populated with other agents capable of
abduction and prediction and of having goals and of planning to achieve
them. Suppose that an agent has the goal of understanding other agents, of
inferring their intentions from overt behavior (which may include linguistic
behavior). Can we design abductive agents for this world? One reasonable
way to form hypotheses about other agents is by analogy to oneself.
(Hypothesization from analogy is described in chapter 6 and in Appendix
B.) So it seems that empathy may be an important computational strategy.
We should be able to investigate empathy formally and empirically by de-
signing empathetic agents and by experimenting with software realizations
of them.
Notes
1 [Harry Pople's EAGOL system does this for power-plant diagnosis and monitoring; my informa-
tion comes from spoken communication; as far as I know a description has not been published. -
J. J.] Also see Leake (1992). Prediction as a separate form of inference is discussed in Chapter 1.
Appendix B Plausibility
sufficient to represent physician reasoning during diagnosis. In most of our
later systems, including the RED systems, we continued to use coarse confi-
dence scales. We also continued to set initial confidence values and updat-
ing methods to reflect our own, and domain experts', estimates of what
seemed reasonable, usually keeping in mind some English language equiva-
lents such as "very implausible," "neutral confidence," and "highly plau-
sible." This worked very well overall (see, for example, the experiments
reported in chapter 9). But how can we understand these confidence values?
Are they likelihoods? Probabilities? We discuss these alternatives next.
sideration for a cycle of confidence updating after initial confidence is set. This
strategy takes advantage of the prospects for eliminating low-likelihood elemen-
tary hypotheses immediately, thus simplifying the hypothesis space, and it makes
it possible to use initial confidence scores as a guide to steering through the
complexity of interactions and combinations.
After an initial confidence value is set for a hypothesis, and if that initial
confidence is not so low that the hypothesis is immediately rejected, then even-
tually the abduction machine updates the confidence to an "all-things-consid-
ered" confidence value. This can be viewed as an estimate of posterior prob-
ability conditioned by all the data for the case. In our most mature abstract
abduction machine, Machine 6 of chapter 10, many steps of updating may occur
before a final value is reached. These steps would reflect the impacts of hypoth-
esis interactions of various sorts, including the impacts of expectations from
hypotheses at other agoras and the impacts of incompatible hypotheses at the
same agora. Especially, the final confidence value takes account of explanatory
interactions that arise for the case, such as that a hypothesis is the most plau-
sible (highest confidence at the time) explanation for some datum.
Thus the confidence values used in our abduction machines can be con-
sidered to be probabilities, with an abduction machine providing an elabo-
rate strategy for arriving at all-things-considered posterior probabilities.
Yet it is not clear that much, if anything, is gained by interpreting the
confidences as probabilities. Coarse-scale confidence values seem to be good
enough for purposes of reasoning. Coarse-scale confidence values seem to
be all we can usually get from experience, considering the reference-class
problem (see chapter 1). Coarse-scale confidence values are almost always
sufficient to decide action. Moreover, as we argued in chapter 1, if fine-
scale confidence values are needed to solve a problem, probably the prob-
lem cannot be confidently solved. Therefore, the property of mathematical
probabilities that they come with a continuum of values from 0 to 1 is of no
use. Worse, if we use high-precision numbers to represent plausibilities in
our systems, we may mislead system builders by encouraging them to ex-
pend unnecessary resources in setting values precisely and may mislead us-
ers into overestimating the precision of confidence estimates. If one deci-
mal digit of precision already exaggerates what is needed, the use of float-
ing-point numbers to represent confidences in computers encourages abuse
(poor software design).
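
As a minimal sketch of the alternative to floating-point confidences, a coarse ordinal scale can be represented directly. The seven steps and the labels below are assumptions for illustration; only "very implausible," "neutral confidence," and "highly plausible" echo the English-language equivalents mentioned in this appendix, and the names `Confidence` and `adjust` are hypothetical.

```python
from enum import IntEnum

class Confidence(IntEnum):
    """Seven-step ordinal confidence scale (intermediate labels are assumed)."""
    RULED_OUT = -3
    VERY_IMPLAUSIBLE = -2
    IMPLAUSIBLE = -1
    NEUTRAL = 0
    PLAUSIBLE = 1
    HIGHLY_PLAUSIBLE = 2
    CONFIRMED = 3

def adjust(value, steps):
    """Move a confidence up or down by whole steps, clamped to the scale."""
    lo, hi = min(Confidence), max(Confidence)
    return Confidence(max(lo, min(hi, value + steps)))
```

Because the type admits only seven values, neither a system builder nor a user can be tempted to read more precision into a confidence than the scale supports.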
Of course people can easily be trained not to take precise numbers too
seriously, or the excess precision can be hidden behind a user interface, so
misinterpretation and abuse are not important objections. The important
question is whether the plausibilities should be considered estimates of prob-
abilities, however imprecise the estimates. Perhaps probability theory pro-
vides the proper theory of rationality for confidences; perhaps human think-
ing is only truly intelligent to the degree to which it approximates the ideal.
Actually, three alternatives present themselves: Plausibilities should be con-
sidered to be probabilities, may consistently be interpretable as probabili-
ties, or cannot reasonably be considered to be probabilities. Just now it seems
most likely that plausibilities may consistently be interpreted as mathemati-
cal probabilities, but that there is no significant computational payoff from
making this interpretation, and it appears to oversimplify a multi-dimen-
sional phenomenon into a single dimension.
ner. Ignoring antibody reactions in which there was a tie for best, Ku found
395 reactions for which there was a unique best explanation (unique highest
scoring antibody hypothesis). Whether the best explanation was correct, or
not, was decided by consulting the record of expert judgment for each case.
Ku investigated how the correctness of best explanations correlated with
various conditions.
As expected, correctness correlated highly with the difference in scores
between the best and second best hypotheses; the larger the difference, the
greater the likelihood that the best explanation was correct. Also, for anti-
body reactions for which the best and second best were one or two steps of
confidence apart (on the seven-step scale), correctness correlated negatively
with the number of hypotheses tied for second best: The greater the number
of ties for second best, the less likely that the best explanation was correct.
Correctness was also sensitive to the distance between the scores of the best
and third best hypotheses when the distance between best and second best
was held constant. Thus correctness was sensitive to more than just the score
of the best explanation and the difference between the scores of best and
second best. Overall the behavior was consistent with what would be ex-
pected if the plausibility scores were acting like measures of probability.
Ku found several other interesting correlations. Viewed from a certain per-
spective, her findings seem to suggest that the correct abductive-confidence
function ignores the score of the highest scoring hypothesis (other than noting
that it is the highest) and sets the final score for this hypothesis as a function of
the entire mass of alternatives; the more alternatives and the higher their scores,
the less confidence there should be that the best explanation is correct. Overall
it appears that the score of the best explanation is insignificant as an indicator of
its likely correctness (apart from making it the best explanation), but rather
correctness depends primarily on the scores of the explanatory alternatives. It is
as if the best explanation must fight for acceptance, while the alternatives com-
bine forces in an attempt to pull it down.
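
The quantities these findings point to can be computed from a set of scores in a few lines. This sketch only extracts the indicators discussed above (a unique best, its margin over the second best, and the number of hypotheses tied for second best); it does not claim to be the confidence function that Ku's data suggest. The function name `acceptance_evidence` and the dictionary layout are assumptions.

```python
def acceptance_evidence(scores):
    """Summarize how strongly the alternatives oppose the top-scoring hypothesis.

    scores -- dict: hypothesis name -> integer score on a coarse scale
    Returns (best, margin, rivals), or None when there is a tie for best.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best, top = ranked[0]
    if len(ranked) == 1:
        return best, None, 0  # no alternatives at all
    second = ranked[1][1]
    if second == top:
        return None  # no unique best explanation
    margin = top - second
    # Count how many alternatives share the second-best score.
    rivals = sum(1 for _, s in ranked[1:] if s == second)
    return best, margin, rivals
```

Note that the absolute score of the best hypothesis does not appear in the result; only its position relative to the alternatives does, in keeping with the observation that the alternatives carry most of the information.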
Dimensions of plausibility
Throughout the research described in this book we avoided making commit-
ments to the manner in which the confidence in a composite hypothesis may
depend on the confidence values of the constituents.1 Other things being
equal, of course, more confident constituents should yield a more confident
composite hypothesis. But typically other things are not equal, and compos-
ite hypotheses differ in explanatory power, parsimony, internal coherence,
or other characteristics, and these characteristics have no obvious common
measure. It is not at all obvious that plausibility is a one-dimensional mag-
nitude, a simple intensity.
Alternatives to probability
What are the alternatives to considering plausibilities to be probabilities?
Probabilities presumably measure likelihoods, ranging on a scale (in fre-
quency terms) from never, through increasing degrees of sometimes, to al-
ways. Plausibilities (for initial confidence estimates) may be case-specific
estimates of possibility ranging from impossible, up through decreasing degrees
of doubt, all the way to definitely-could-be.
In chapter 6 we pointed to the phenomenon of hypothesization by anal-
ogy, in which the present situation reminds an agent of a previous episode
and causal knowledge is then brought over and adapted.2 The present situa-
tion and the previous episode are members of the same class, which may be
a new category set up by the similarity relationship. The occurrence of the
previous episode is then an existence proof that things of that kind actually
happen, so they may be happening again in the present situation. The new
hypothesis adapted from the previous episode is then prima facie plausible
because it is already known that such things happen.
Whether a hypothesis is generated from an analogy or arises in some other
way, an analogy can be used to argue for plausibility. (What happened once
may be happening again.) Plausibility arguments based on analogy are ex-
tremely common. (This is obvious after we know to look for them. Some
examples are given in chapter 6.) Note that a plausibility argument based on
Notes
1 The best-small plausibility criterion described in chapter 7 represents a minimal constraint on the
plausibility of composites.
2 These ideas about plausibility from analogy have benefited greatly from discussions with Lindley
Darden.
Extended Bibliography
Achinstein, P. (1971). Law and Explanation. New York: Oxford University Press.
Ajjanagadde, V. (1991). Abductive Reasoning in Connectionist Networks. Technical Report,
Wilhelm Schickard Institute, Tubingen, Germany.
Ajjanagadde, V. (1991). Incorporating Background Knowledge and Structured Explananda in
Abductive Reasoning. Technical Report, Wilhelm Schickard Institute, Tubingen, Germany.
Allemang, D., Tanner, M. C., Bylander, T., & Josephson, J. R. (1987). On the Computational
Complexity of Hypothesis Assembly. In Proceedings of the Tenth International Joint Conference
on Artificial Intelligence, Milan, Italy: IJCAI.
Allen, J. F. (1984). Towards a General Theory of Time and Action. Artificial Intelligence, 23(2),
123-154.
American Association of Blood Banks. (1977). Technical Manual. Washington, DC: American
Association of Blood Banks.
Aristotle. (1941). Posterior Analytics. In R. McKeon (Ed.), The Basic Works of Aristotle (G. R.
G. Mure, Trans.), (pp. 110-186). New York: Random House.
Ayeb, B. E., Marquis, P., & Rusinowitch, M. (1990). Deductive/Abductive Diagnosis: The
DA-Principles. In Proceedings of the Ninth European Conference on Artificial Intelligence
(ECAI), Pitman.
Ayeb, B. E., Marquis, P., & Rusinowitch, M. (in press). Preferring Diagnoses by Abduction.
IEEE Transactions on Systems, Man and Cybernetics.
Bareiss, R. (1989). Exemplar-Based Knowledge Acquisition: A Unified Approach to Concept
Representation, Classification, and Learning. Boston: Academic Press.
Bhaskar, R. (1981). Explanation. In W. F. Bynum, E. J. Browne, & R. Porter (Eds.), Dictionary
of the History of Science, (pp. 140-142). Princeton University Press.
Bhatnagar, R., & Kanal, L. K. (1991). Abductive Reasoning with Causal Influences of Interac-
tions. In Notes from the AAAI Workshop on Domain-Independent Strategies for Abduction,
(pp. 1-8). Anaheim, California. (Josephson & Dasigi, 1991).
Bhatnagar, R., & Kanal, L. K. (1991). Hypothesizing Causal Models for Reasoning. Technical
Report, Computer Science Department, University of Cincinnati, Cincinnati.
Black, M. (1967). Induction. In P. Edwards (Ed.), The Encyclopedia of Philosophy. New York:
Macmillan and The Free Press.
Blois, R. (1989). Information and Medicine: The Nature of Medical Description. Berkeley:
University of California Press.
Bonneau, A., Charpillet, F., Coste, S., Haton, J. P., Laprie, Y., & Marquis, P. (1992). A Model for
Hypothetical Reasoning applied to Speech Recognition. In Proceedings of the Tenth European
Conference on Artificial Intelligence (ECAI), New York: Wiley.
Boole, G. (1854). An Investigation of the Laws of Thought, on Which are Founded the Math-
ematical Theory of Logic and Probabilities. N.p.
Boose, J., & Bradshaw, J. (1988). Expertise Transfer and Complex Problems: Using AQUINAS
as a Knowledge-Acquisition Workbench for Knowledge-Based Systems. In J. H. Boose &
B. R. Gaines (Eds.), Knowledge Acquisition Tools for Expert Systems, Volume 2. (pp. 39-64).
New York: Academic Press.
Bromberger, S. (1966). Why-Questions. In Mind and Cosmos: Essays in Contemporary Science
and Philosophy. Pittsburgh, PA: University of Pittsburgh Press. Reprinted in Readings in
the Philosophy of Science, edited by Baruch A. Brody, Prentice Hall, 1970.
Browman, C. P., & Goldstein, L. (1989). Tiers in Articulatory Phonology with some Implications for
Casual Speech. In J. Kingston & M. E. Beckman (Eds.), Laboratory Phonology I: Between the
Grammar and the Physics of Speech, (pp. 341-376). Cambridge University Press.
Brown, D. C. (1984). Expert Systems for Design Problem-Solving Using Design Refinement
with Plan Selection and Redesign. Ph.D. diss., Department of Computer and Information
Science, The Ohio State University, Columbus.
Brown, D. C., & Chandrasekaran, B. (1985). Plan Selection in Design Problem-Solving. In Proceedings
of the AISB 85 Conference, Warwick, UK: The Society for Artificial Intelligence
and the Simulation of Behavior.
Brown, D. C., & Chandrasekaran, B. (1986). Knowledge and Control for a Mechanical Design
Expert System. IEEE Computer, 19, 92-101.
Brown, D. C., & Chandrasekaran, B. (1989). Design Problem Solving: Knowledge Structures
and Control Strategies. London & San Mateo: Pitman & Morgan Kaufmann.
Bruce, D. J. (1958). The effect of listeners' anticipations on the intelligibility of heard speech.
Language and Speech, 1(2).
Bruner, J. S. (1957). On Perceptual Readiness. Psychological Review, 64(2), 123-152.
Bruner, J. S. (1973). Beyond the Information Given. New York: W. W. Norton.
Buchanan, B. G., & Feigenbaum, E. A. (1981). Dendral and Meta-Dendral: Their Applications
Dimension. In B. L. Webber & N. J. Nilsson (Eds.), Readings in Artificial Intelligence. (pp.
313-322). Palo Alto: Tioga.
Buchanan, B. G., & Shortliffe, E. H. (Eds.). (1984). Rule-Based Expert Systems: The MYCIN
Experiments of the Stanford Heuristic Programming Project. Reading, MA: Addison-Wesley.
Buchanan, B. G., Sutherland, G., & Feigenbaum, E. A. (1969). Heuristic Dendral: A program
for generating explanatory hypotheses in organic chemistry. In B. Meltzer & D. Michie
(Eds.), Machine Intelligence, (pp. 209-254). Edinburgh: Edinburgh University Press.
Bylander, T. (1985). A Critique of Qualitative Simulation from a Consolidation Viewpoint. In
Proceedings of the International Conference on Cybernetics and Society, (pp. 589-596).
IEEE Computer Society Press.
Bylander, T. (1986). Consolidation: A Method for Reasoning about the Behavior of Devices.
Ph.D. diss., Department of Computer and Information Science, The Ohio State University,
Columbus.
Bylander, T. (1987). Using Consolidation for Reasoning About Devices. Technical Report, The
Ohio State University, Laboratory for Artificial Intelligence Research, Columbus.
Bylander, T. (1991). The Monotonic Abduction Problem: A Functional Characterization on the
Edge of Tractability. In J. Allen, R. Fikes, & E. Sandewall (Eds.), Proceedings of the Sec-
ond International Conference on Principles of Knowledge Representation and Reasoning,
(pp. 70-77). Morgan Kaufmann.
Bylander, T. (1991). A Tractable Partial Solution to the Monotonic Abduction Problem. In Notes
from the AAAI Workshop on Domain-Independent Strategies for Abduction, (pp. 9-13).
Anaheim. (Josephson & Dasigi, 1991).
Bylander, T., Allemang, D., Tanner, M. C., & Josephson, J. R. (1989a). Some Results Concerning
the Computational Complexity of Abduction. In R. J. Brachman, H. J. Levesque, & R.
Reiter (Eds.), Proceedings of the First International Conference on Principles of Knowl-
edge Representation and Reasoning, (pp. 44-54). Toronto.
Bylander, T., Allemang, D., Tanner, M. C., & Josephson, J. R. (1989). When Efficient Assembly Performs
Correct Abduction and Why Abduction is Otherwise Trivial or Intractable. Technical
Report, The Ohio State University, Laboratory for Artificial Intelligence Research, Columbus.
Bylander, T., Allemang, D., Tanner, M. C., & Josephson, J. R. (1991). The Computational Complexity
of Abduction. Artificial Intelligence, 49, 25-60.
Bylander, T., Allemang, D., Tanner, M. C., & Josephson, J. R. (1992). The Computational Complexity
of Abduction. In R. J. Brachman, H. J. Levesque, & R. Reiter (Eds.), Knowledge
Representation, (pp. 25-60). Cambridge, MA: MIT Press. Also appears in volume 49 of
Artificial Intelligence.
Bylander, T., & Chandrasekaran, B. (1985). Understanding Behavior Using Consolidation. In
Proceedings of the Ninth International Joint Conference on Artificial Intelligence, (pp.
450-454). Los Angeles.
Bylander, T., & Chandrasekaran, B. (1989). Understanding Behavior Using Consolidation. In
Proceedings of the Ninth International Joint Conference on Artificial Intelligence, 1 (pp.
300-306). San Mateo: Morgan Kaufmann.
Bylander, T., Chandrasekaran, B., & Josephson, J. R. (1987). The Generic Task Toolset. In G.
Salvendy (Ed.), Proceedings of the Second International Conference on Human-Computer
Interaction and Expert Systems, (pp. 507-514). Amsterdam: Elsevier.
Bylander, T., Johnson, T. R., & Goel, A. K. (1991). Structured matching: A task-specific tech-
nique for making decisions. Knowledge Acquisition, 3(1), 1-20.
Bylander, T., & Mittal, S. (1986). CSRL: A Language for Classificatory Problem Solving and
Uncertainty Handling. AI Magazine, 7(3), 66-77.
Bylander, T., Mittal, S., & Chandrasekaran, B. (1983). CSRL: A Language for Expert Systems
for Diagnosis. In Proceedings of the Eighth International Joint Conference on Artificial
Intelligence, (pp. 218-221). IJCAI.
Bylander, T., & Smith, J. W. (1985). Mapping Medical Knowledge into Conceptual Structures.
In K. N. Karna (Ed.), Proceedings of the Expert Systems in Government Symposium, (pp.
503-511). IEEE Computer Society Press.
Bylander, T., Smith, J. W., & Svirbely, J. (1986). Qualitative Representation of Behavior in the
Medical Domain. In R. Salamon, B. Blum, & M. Jorgenson (Eds.), Proceedings of the Fifth
Conference on Medical Informatics, (pp. 7-11). New York: North Holland.
Bylander, T., & Weintraub, M. A. (1988). A Corrective Learning Procedure Using Different
Explanatory Types. In 1988 AAAI Spring Symposium Series on Explanation-Based Learning,
Stanford University.
Cain, T. (1991). Constraining Abduction in Integrated Learning. In Notes from the AAAI Work-
shop on Domain-Independent Strategies for Abduction, (pp. 14-17). Anaheim. (Josephson
& Dasigi, 1991).
Calistri, R. J. (1989). Classifying and Detecting Plan-Based Misconceptions. Technical Report,
Brown University Department of Computer Science.
Carver, N., & Lesser, V. (1991). Blackboard-based Sensor Interpretation Using a Symbolic Model
of the Sources of Uncertainty in Abductive Inference. In Notes from the AAAI Workshop on
Domain-Independent Strategies for Abduction, (pp. 18-24). Anaheim. (Josephson & Dasigi,
1991).
Chandrasekaran, B. (1983). Towards a Taxonomy of Problem-Solving Types. AI Magazine,
Winter/Spring, 9-17.
Chandrasekaran, B. (1986). Generic Tasks in Knowledge-Based Reasoning: High-Level Build-
ing Blocks for Expert System Design. IEEE Expert, 1(3), 23-30.
Chandrasekaran, B. (1987). Towards a Functional Architecture for Intelligence Based on Ge-
neric Information Processing Tasks. In Proceedings of the Tenth International Joint Con-
ference on Artificial Intelligence, Morgan Kaufmann.
Chandrasekaran, B. (1988). Generic Tasks as Building Blocks for Knowledge-Based Systems: The
Diagnosis and Routine Design Examples. Knowledge Engineering Review, 3(3), 183-210.
Chandrasekaran, B. (1989). Task Structures, Knowledge Acquisition, and Learning. Machine
Learning, 4, 339-345.
Chandrasekaran, B. (1990). Design Problem Solving: A Task Analysis. Artificial Intelligence
Magazine, 11(4), 59-71.
Chandrasekaran, B. (1991). Models versus Rules, Deep versus Compiled, Content versus Form:
Some distinctions in knowledge systems research. IEEE Expert, 6(2), 75-79.
Chandrasekaran, B., Bylander, T., & Sembugamoorthy, V. (1985). Functional Representations
and Behavior Composition by Consolidation: Two Aspects of Reasoning about Devices.
SIGART, 21-24.
Chandrasekaran, B., & Goel, A. (1988). From Numbers to Symbols to Knowledge Structures:
Artificial Intelligence Perspectives on the Classification Task. IEEE Transactions on Sys-
tems, Man, and Cybernetics, 18(3), 415-424.
Chandrasekaran, B., Gomez, F., Mittal, S., & Smith, J. W. (1979). An Approach to Medical Diag-
nosis Based on Conceptual Structures. In Proceedings of the Sixth International Joint Con-
ference on Artificial Intelligence, (pp. 134-142). IJCAI.
Chandrasekaran, B., & Johnson, T. R. (to appear). Generic Tasks and Task Structures: History,
Critique and New Direction. In J. M. David, J. P. Krivine, & R. Simmons (Eds.), Second
Generation Expert Systems. New York: Springer-Verlag.
Chandrasekaran, B., & Josephson, J. R. (1986). Explanation, Problem Solving, and New Gen-
eration Tools: A Progress Report. In Proceedings of the Expert Systems Workshop, (pp.
122-126). Pacific Grove, CA: Science Applications International Corporation.
Chandrasekaran, B., Josephson, J. R., & Herman, D. (1987). The Generic Task Toolset: High
Level Languages for the Construction of Planning and Problem Solving Systems. In J.
Rodriguez (Ed.), Proceedings of the Workshop on Space Telerobotics, III (pp. 59-65). Pasa-
dena: Jet Propulsion Laboratories and the California Institute of Technology.
Chandrasekaran, B., & Mittal, S. (1983). Conceptual Representation of Medical Knowledge for
Diagnosis by Computer: MDX and Related Systems. In M. Yovits (Ed.), Advances in Com-
puters, (pp. 217-293). New York: Academic Press.
Chandrasekaran, B., & Mittal, S. (1983). On Deep Versus Compiled Approaches to Diagnostic
Problem-Solving. International Journal of Man-Machine Studies, 19(5), 425-436.
Chandrasekaran, B., & Punch, W. F., III (1987). Data Validation During Diagnosis: A Step Beyond
Traditional Sensor Validation. In Proceedings of AAAI-87 Sixth National Conference
on Artificial Intelligence, (pp. 778-782). Los Altos: Morgan Kaufmann.
Chandrasekaran, B., Smith, J. W., & Sticklen, J. (1989). 'Deep' Models and Their Relation to
Diagnosis. Artificial Intelligence in Medicine, 1(1), 29-40.
Chandrasekaran, B., & Tanner, M. (1986). Uncertainty Handling in Expert Systems: Uniform
vs. Task-Specific Formalisms. In L. N. Kanal & J. Lemmer (Eds.), Uncertainty in Artificial
Intelligence. New York: North Holland.
Charniak, E. (1983). The Bayesian Basis of Common Sense Medical Diagnosis. In Proceedings of
the National Conference on Artificial Intelligence, (pp. 70-73). Los Altos: Morgan Kaufmann.
Charniak, E. (1986). A Neat Theory of Marker Passing. In Proceedings of AAAI-86, 1 (pp. 584-588).
Los Altos: Morgan Kaufmann.
Charniak, E. (1988). Motivation Analysis, Abductive Unification, and Nonmonotonic Equality.
Artificial Intelligence, 275-295.
Charniak, E., & Goldman, R. P. (1988). A logic for semantic interpretation. In Proceedings of
the 26th Annual Meeting of the Association for Computational Linguistics, (pp. 87-94).
Charniak, E., & Goldman, R. P. (1989). Plan recognition in stories and in life. In The Fifth
Conference on Uncertainty in Artificial Intelligence, (pp. 54-60).
Charniak, E., & Goldman, R. P. (1989). A semantics for probabilistic quantifier-free first-order
languages, with particular application to story understanding. In Proceedings of the Inter-
national Joint Conference on Artificial Intelligence, (pp. 1074-1079).
Charniak, E., & Goldman, R. P. (1991). A Probabilistic Model of Plan Recognition. In Proceed-
ings of the 1991 Conference of the American Association for Artificial Intelligence, (pp.
160-165).
Charniak, E., & Goldman, R. P. (forthcoming). A Bayesian Model of Plan Recognition. Artifi-
cial Intelligence.
Charniak, E., & McDermott, D. (1985). Introduction to Artificial Intelligence. Reading, MA:
Addison-Wesley.
Charniak, E., & Santos, E. (1992). Dynamic MAP calculations for abduction. In Proceedings of
the Tenth National Conference on Artificial Intelligence, (pp. 552-557). Menlo Park: AAAI
Press/MIT Press.
Charniak, E., & Shimony, S. E. (1990). A new algorithm for finding MAP assignments to belief
networks. In Proceedings of the Conference on Uncertainty in Artificial Intelligence.
Charniak, E., & Shimony, S. E. (1990). Probabilistic Semantics for Cost Based Abduction. In
Proceedings of the National Conference on Artificial Intelligence, (pp. 106-111). Boston.
Clancey, W. J. (1985). Heuristic Classification. Artificial Intelligence, 27(3), 289-350.
Clancey, W. J. (1992). Model Construction Operators. Artificial Intelligence, 53, 1-115.
Clancey, W. J., & Letsinger, R. (1981). NEOMYCIN: Reconfiguring a Rule-Based Expert Sys-
tem for Application to Teaching. In Proceedings of the International Joint Conference on
Artificial Intelligence, (pp. 829-836). Vancouver, British Columbia.
Clancey, W. J., & Shortliffe, E. H. (Eds.). (1984). Readings in Medical Artificial Intelligence.
Reading, MA: Addison-Wesley.
Console, L., Dupré, D. T., & Torasso, P. (1991). On the Relationship between Abduction and
Deduction. Journal of Logic and Computation, 1(5), 661-690.
Console, L., & Torasso, P. (1991). On the co-operation between abductive and temporal reason-
ing in medical diagnosis. Artificial Intelligence in Medicine, 3, 291-311.
Coombs, M. J., & Hartley, R. T. (1987). The MGR algorithm and its application to the genera-
tion of explanations for novel events. International Journal of Man-Machine Studies, 679-
708.
Coombs, M. J., & Hartley, R. T. (1988). Explaining novel events in process control through
model generative reasoning. International Journal of Expert Systems, 89-109.
Cooper, G. (1990). The Computational Complexity of Probabilistic Inference Using Bayesian
Belief Networks. Artificial Intelligence, 42, 393-405.
Craik, K. (1967). The Nature of Explanation. London: Cambridge University Press.
Darden, L. (1990). Diagnosing and Fixing Faults in Theories. In J. Schrager & P. Langley (Eds.),
Computational Models of Scientific Discovery and Theory Formation, (pp. 319-346). San
Mateo: Morgan-Kaufmann.
Darden, L. (1991). Theory Change in Science: Strategies from Mendelian Genetics. New York:
Oxford University Press.
Darden, L., Moberg, D., Thadani, S., & Josephson, J. (1992). A Computational Approach to
Scientific Theory Revision: The TRANSGENE Experiments. Technical Report, The Ohio
State University, Laboratory for Artificial Intelligence Research.
Darden, L., & Rada, R. (1988). Hypothesis Formation Using Part-Whole Interrelations. In D.
Helman (Ed.), Analogical Reasoning, (pp. 341-375). Dordrecht, The Netherlands: Kluwer
Academic Publishers.
Dasigi, V. (1991). Abductive Modeling in Sub-Domains of Language Processing. In Notes from
the AAAI Workshop on Domain-Independent Strategies for Abduction, (pp. 25-31). Ana-
heim. (Josephson & Dasigi, 1991).
Dasigi, V. R. (1988). Word Sense Disambiguation in Descriptive Text Interpretation: A Dual-
Route Parsimonious Covering Model. Ph.D. diss., University of Maryland, College Park.
Dasigi, V. R., & Reggia, J. A. (1988). Parsimonious Covering as a Method for Natural Language
Interfaces to Expert Systems. Technical Report, University of Maryland, College Park.
Davis, R. (1979). Interactive Transfer of Expertise: Acquisition of New Inference Rules. Artifi-
cial Intelligence, 12, 121-157.
Davis, R. (1982). Applications of Meta-Level Knowledge to the Construction, Maintenance,
and Use of Large Knowledge Bases. In R. Davis & D. Lenat (Eds.), Knowledge-Based
Systems in Artificial Intelligence, (pp. 229-490). New York: McGraw-Hill.
Davis, R. (1984). Diagnostic Reasoning Based on Structure and Function. Artificial Intelligence,
24, 347-410.
Davis, R. (1989). Form and Content in Model-Based Reasoning. In 1989 AAAI Workshop on
Model-Based Reasoning, Detroit.
Davis, R., & Hamscher, W. (1988). Model-Based Reasoning: Troubleshooting. In H. E. Shrobe
(Ed.), Exploring Artificial Intelligence, (pp. 297-346). San Mateo: Morgan Kaufmann.
de Kleer, J. (1986). Reasoning About Multiple Faults. AI Magazine, 7(3), 132-139.
de Kleer, J., & Brown, J. S. (1983). Assumptions and Ambiguities in Mechanistic Mental Mod-
els. In D. Gentner & A. Stevens (Eds.), Mental Models, (pp. 155-190). Hillsdale, NJ: Law-
rence Erlbaum.
de Kleer, J., & Brown, J. S. (1984). A Qualitative Physics Based on Confluences. Artificial
Intelligence, 24, 7-83.
de Kleer, J., Mackworth, A. K., & Reiter, R. (1992). Characterizing Diagnoses and Systems.
Artificial Intelligence, 56(2-3), 197-222.
de Kleer, J., & Williams, B. C. (1987). Diagnosing Multiple Faults. Artificial Intelligence, 32(1),
97-130.
de Kleer, J., & Williams, B. C. (1989). Diagnosis With Behavioral Modes. In Proceedings of the
Eleventh International Joint Conference on Artificial Intelligence, (pp. 1324-1330). Detroit.
DeJong, G., & Mooney, R. (1986). Explanation-Based Learning: An Alternative View. Machine
Learning, 1, 145-176.
DeJongh, M. (1991). Causal Processes in the Problem Space Computational Model: Integrat-
ing Multiple Representations of Causal Processes in Abductive Problem Solving. Ph.D. diss.,
Department of Computer and Information Science, The Ohio State University, Columbus.
Descartes, R. (1641). Meditations on First Philosophy. (Lawrence Lafleur, Trans.). (Second re-
vised ed.). Bobbs-Merrill, (1960).
Donaldson, M. L. (1986). Children's Explanations: A Psycholinguistic Study. Cambridge: Cam-
bridge University Press.
Doyle, A. C. (1967). The Annotated Sherlock Holmes. Edited by W. S. Baring-Gould. New York:
Clarkson N. Potter.
Doyle, A. C. (1967). The Sign of the Four. In W. S. Baring-Gould (Ed.), The Annotated Sherlock
Holmes, (pp. 610-688). New York: Clarkson N. Potter. Originally published in 1890.
Dumouchel, P., & Lennig, M. (1987). Using Stress Information in Large Vocabulary Speech
Recognition. In P. Mermelstein (Ed.), Proceedings of the Montreal Symposium on Speech
Recognition, (pp. 73-74). Montreal.
Dvorak, D., & Kuipers, B. (1989). Model-Based Monitoring of Dynamic Systems. In Proceed-
ings of the Eleventh International Joint Conference on Artificial Intelligence, (pp. 1238-
1243). Detroit.
Dzierzanowski, J. (1984). Artificial Intelligence Methods in Human Locomotor Electromyogra-
phy. Ph.D. diss., Vanderbilt University, Nashville.
Eco, U., & Sebeok, T. A. (Eds.). (1983). The Sign of Three: Dupin, Holmes, Peirce. Bloomington:
Indiana University Press.
Eiter, T., & Gottlob, G. (1992). The Complexity of Logic-Based Abduction. Technical Report,
Christian Doppler Lab. for Expert Systems, Vienna Univ. of Technology, Vienna.
Ennis, R. (1968). Enumerative Induction and Best Explanation. The Journal of Philosophy,
LXV(18), 523-529.
Ericsson, K., & Simon, H. (1984). Protocol Analysis: Verbal Reports as Data. Cambridge, MA:
MIT Press.
Erman, L. D., & Lesser, V. R. (1980). The Hearsay-II Speech Understanding System: A Tutorial.
In W. Lea (Ed.), Trends in Speech Recognition. Englewood Cliffs, NJ: Prentice Hall. Re-
printed in Readings in Speech Recognition, Ed. Alex Waibel and Kai-Fu Lee, Morgan
Kaufmann, 1990.
Eshelman, L. (1988). MOLE: A Knowledge-Acquisition Tool for Cover-and-Differentiate Sys-
tems. In S. Marcus (Ed.), Automating Knowledge Acquisition for Expert Systems, (pp. 37-
80). Boston: Kluwer Academic Publishers.
Fann, K. T. (1970). Peirce's Theory of Abduction. The Hague: Martinus Nijhoff.
Fattah, Y. E., & O'Rorke, P. (1991). Learning Multiple Fault Diagnosis. In Proceedings of the
Seventh IEEE Conference on Artificial Intelligence Applications, (pp. 235-239). IEEE
Computer Society Press.
Fattah, Y. E., & O'Rorke, P. (1992). Explanation-Based Learning for Diagnosis. Technical Re-
port, Department of Information and Computer Science, University of California, Irvine.
(Report No. 92-21).
Fattah, Y. E., & O'Rorke, P. (1992). Learning Approximate Diagnosis. In Proceedings of the
Eighth IEEE Conference on Artificial Intelligence Applications, (pp. 150-156). Monterey:
IEEE Computer Society Press.
Fine, A. (1984). The Natural Ontological Attitude. In J. Leplin (Ed.), Scientific Realism, (pp.
83-107). Berkeley and Los Angeles: University of California Press.
Fink, P. K., & Lusth, J. C. (1987). Expert Systems and Diagnostic Expertise in the Mechanical and
Electrical Domains. IEEE Transactions on Systems, Man and Cybernetics, 17, 340-349.
Fischer, O. (1991). Cognitively Plausible Heuristics to Tackle the Computational Complexity of
Abductive Reasoning. Ph.D. diss., Department of Computer and Information Science, The
Ohio State University, Columbus.
Fischer, O., & Goel, A. (1990). Abductive Explanation: On Why the Essentials are Essential. In
Ras, Zemenkova, & Emrich (Eds.), Intelligent Systems Methodologies - V. (pp. 354-361).
Amsterdam: North Holland.
Fischer, O., & Goel, A. (1990). Explanation: The Essentials versus the non-Essentials. In J.
Moore & M. Wick (Eds.), Proceedings of the AAAI-90 Workshop on Explanation, (pp. 142-
151). Boston.
Fischer, O., Goel, A., Svirbely, J., & Smith, J. (1991). The Role of Essential Explanation in
Abduction. Artificial Intelligence in Medicine, 3(4), 181-191.
Fischer, O., Smith, J. W., & Smith, P. (1991). Meta Abduction: A Cognitively Plausible Domain-
Independent Computational Model of Abductive Reasoning. In Notes from the AAAI Work-
shop on Domain-Independent Strategies for Abduction, (pp. 32-36). Anaheim. (Josephson
& Dasigi, 1991).
Fodor, J. (1983). The Modularity of Mind. Cambridge, MA: MIT Press.
Forbus, K. D. (1984). Qualitative Process Theory. Artificial Intelligence, 24, 85-168.
Forgy, C., & McDermott, J. (1977). OPS: A Domain-Independent Production System. In Pro-
ceedings of the Fifth International Joint Conference on Artificial Intelligence, (pp. 933-
939). Pittsburgh.
Fox, R. K. (1992). Layered Abduction for Speech Recognition from Articulation. Ph.D. diss.,
Department of Computer and Information Science, The Ohio State University, Columbus.
Friedrich, G., Gottlob, G., & Nejdl, W. (1990). Hypothesis Classification, Abductive Diagnosis
and Therapy. In G. Gottlob & W. Nejdl (Eds.), Expert Systems in Engineering, (pp. 69-78).
Berlin: Springer-Verlag.
Fujimura, O. (1992). Phonology and phonetics - A syllable-based model of articulatory organi-
zation. Journal of the Acoustical Society of Japan, E(13), 39-48.
Fujimura, O., Ishida, Y., & Kiritani, S. (1973). Computer-Controlled Radiography for Observation
of Movements of Articulatory and other Human Organs. Comp. Biology and Med., 3, 371-384.
Fumerton, A. (1980). Induction and Reasoning to the Best Explanation. Philosophy of Science,
47, 589-600.
Garey, M., & Johnson, D. (1979). Computers and Intractability. New York: W. H. Freeman.
Geffner, H., & Pearl, J. (1987). An Improved Constraint-Propagation Algorithm for Diagnosis.
In Proceedings of the International Joint Conference on Artificial Intelligence, (pp. 1105-
1111). Milan.
Genesereth, M. R. (1984). The use of design descriptions in automated diagnosis. Artificial
Intelligence, 24, 411-436.
Ginsberg, A. (1988). Theory Revision via Prior Operationalization. In Proceedings of AAAI-88:
The Seventh National Conference on Artificial Intelligence, (pp. 590-595). San Mateo:
Morgan Kaufmann.
Ginsberg, A., Weiss, S., & Politakis, P. (1985). SEEK2: A Generalized Approach to Automatic
Knowledge Base Refinement. In Ninth International Joint Conference on Artificial Intelli-
gence, (pp. 367-374). Los Altos: Morgan Kaufmann.
Goel, A. (1989). What is Abductive Reasoning? Neural Network Review, 3(4), 181-187.
Goel, A., Josephson, J. R., & Sadayappan, P. (1987). Concurrency in Abductive Reasoning. In
Proceedings of the Knowledge-Based Systems Workshop, (pp. 86-92). St. Louis: Science
Applications International Corporation.
Goel, A., Ramanujan, J., & Sadayappan, P. (1988). Towards a Neural Architecture for Abductive
Reasoning. In Proceedings of the Second IEEE International Conference on Neural Net-
works, 1 (pp. 681-688). San Diego: IEEE Computer Society Press.
Goel, A., Sadayappan, P., & Josephson, J. R. (1988). Concurrent Synthesis of Composite Ex-
planatory Hypotheses. In Proceedings of the Seventeenth International Conference on Par-
allel Processing, (pp. 156-160). The Pennsylvania State University Press.
Goldman, R. P. (1990). A Probabilistic Approach to Language Understanding. Ph.D. diss., Brown
University.
Goldman, R. P., & Charniak, E. (1991). Dynamic construction of belief networks. In P. P.
Bonissone, M. Henrion, L. N. Kanal, & J. F. Lemmer (Eds.), Uncertainty in Artificial Intel-
ligence, Volume 6. (pp. 171-184). Elsevier.
Goldman, R. P., & Charniak, E. (1992). Probabilistic text understanding. Statistics and Comput-
ing, 2, 105-114.
Goldman, R. P., & Charniak, E. (forthcoming). A language for construction of belief networks.
IEEE Transactions on Pattern Analysis and Machine Intelligence.
Gomez, F., & Chandrasekaran, B. (1984). Knowledge Organization and Distribution for Medi-
cal Diagnosis. In W. J. Clancey & E. H. Shortliffe (Eds.), Readings in Medical Artificial
Intelligence, (pp. 320-338). Reading, MA: Addison-Wesley. Also appears in IEEE Trans-
actions on Systems, Man and Cybernetics, SMC-11(1):34-42, January, 1981.
Goodman, N. D. (1987). Intensions, Church's thesis, and the formalization of mathematics. The
Notre Dame Journal of Formal Logic, 28, 473-489.
Gregory, R. L. (1987). Perception as Hypotheses. In R. L. Gregory (Ed.), The Oxford Compan-
ion to the Mind. (pp. 608-611). New York: Oxford University Press.
Gregory, R. L. (1990). Eye and Brain: The Psychology of Seeing. Princeton, NJ: Princeton Uni-
versity Press.
Halliday, M. (1978). Language as Social Semiotic. London: Edward Arnold.
Halliday, M. (1985). An Introduction to Functional Grammar. London: Edward Arnold.
Hamscher, W., Console, L., & de Kleer, J. (Eds.). (1992). Readings in Model-Based Diagnosis.
San Mateo: Morgan Kaufmann.
Hanson, N. R. (1958). Patterns of Discovery. London: Cambridge University Press.
Harman, G. (1965). The Inference to the Best Explanation. Philosophical Review, 74, 88-95.
Harman, G. (1968). Enumerative Induction as Inference to the Best Explanation. The Journal of
Philosophy, 65(18), 529-533.
Harman, G. (1986). Change in View. Cambridge, MA: MIT Press.
Harvey, A. M., & Bordley, J., III (1972). Differential Diagnosis, the Interpretation of Clinical
Evidence. Philadelphia: W. B. Saunders.
Hayes-Roth, F. (1985). A Blackboard Architecture for Control. Artificial Intelligence, 26, 251-321.
Hayes-Roth, F., & Lesser, V. (1977). Focus of Attention in the HEARSAY-II System. In Pro-
ceedings of the Fifth International Joint Conference on Artificial Intelligence, (pp. 27-35).
Helft, N., & Konolige, K. (1991). Integrating Evidential Information with Abduction. In Notes
from the AAAI Workshop on Domain-Independent Strategies for Abduction, (pp. 37-43).
Anaheim. (Josephson & Dasigi, 1991).
Hemami, H. (1985). Modeling, Control, and Simulation of Human Movement. Critical Reviews
in Biomedical Engineering, 13(1), 1-34.
Hempel, C. G. (1965). Aspects of Scientific Explanation. New York: Free Press.
Herman, D. J. (1992). An Extensible, Task-Specific Shell for Routine Design Problem Solving.
Ph.D. diss., Department of Computer and Information Science, The Ohio State University,
Columbus.
Heustis, D. W., Bove, J. R., & Busch, S. (1976). Practical Blood Transfusion. Little, Brown.
Hillis, W. D. (1986). The Connection Machine. Cambridge, MA: MIT Press.
Hirsch, D. (1987). An Expert System for Diagnosing Gait for Cerebral Palsy Patients. Technical
Report, Laboratory for Computer Science, MIT, Cambridge, MA. (Report No. MIT/LCS/
TR-388).
Hirsch, D., Simon, S. R., Bylander, T., Weintraub, M. A., & Szolovits, P. (1989). Using Causal
Reasoning in Gait Analysis. Applied Artificial Intelligence, 3(2-3), 253-272.
Hoare, C. A. R. (1978). Communicating Sequential Processes. Communications of the ACM,
21(8), 666-677.
Hobbs, J. R. (1987). Implicature and Definite Reference. Technical Report, Center for the Study
of Language and Information, Stanford University, Stanford, California.
Hobbs, J. R. (1992). Metaphor and Abduction. In A. Ortony, J. Slack, & O. Stock (Eds.), Com-
munication from an Artificial Intelligence Perspective: Theoretical and Applied Issues, (pp.
35-58). Springer.
Hobbs, J. R., Appelt, D. E., Bear, J., Tyson, M., & Magerman, D. (1992). Robust Processing of
Real-World Natural-Language Texts. In P. Jacobs (Ed.), Text-Based Intelligent Systems:
Current Research and Practice in Information Extraction and Retrieval, (pp. 13-33).
Hillsdale, NJ: Lawrence Erlbaum.
Hobbs, J. R., Appelt, D. E., Bear, J. S., Tyson, M., & Magerman, D. (1991). The TACITUS Sys-
tem: The MUC-3 Experience. Technical Report, SRI International.
Hobbs, J. R., & Kameyama, M. (1990). Translation by Abduction. In H. Karlgren (Ed.), Pro-
ceedings of the Thirteenth International Conference on Computational Linguistics, (pp.
155-161). Helsinki.
Hobbs, J. R., Stickel, M., Appelt, D., & Martin, P. (1993). Interpretation as Abduction. Artificial
Intelligence Journal, 63(1-2), 69-142.
Holland, J. H., Holyoak, K. J., Nisbett, R. E., & Thagard, P. R. (1987). Induction. Cambridge,
MA: MIT Press.
Hunt, J. (1989). Towards a Generic, Qualitative-Based, Diagnostic Architecture. Technical Re-
port, Department of Computer Science, University College of Wales, Aberystwyth, Dyfed,
U.K. (Report No. RRG-TR-145-89).
Inman, V. T., Ralston, H. J., & Todd, F. (1981). Human Walking. Baltimore: Williams and Wilkins.
Jackson, P. (1991). Possibilistic Prime Implicates and their Use in Abduction. In Notes from the
AAAI Workshop on Domain-Independent Strategies for Abduction, (pp. 44-50). Anaheim.
(Josephson & Dasigi, 1991).
Johnson, K., Sticklen, J., & Smith, J. W. (1988). IDABLE - Application of an Intelligent Data
Base to Medical Systems. In Proceedings of the AAAI Spring Artificial Intelligence in Medi-
cine Symposium, (pp. 43-44). Stanford University: AAAI.
Johnson, T., Chandrasekaran, B., & Smith, J. W. (1989). Generic Tasks and Soar. In Working
notes of the AAAI Spring Symposium on Knowledge System Development Tools and Lan-
guages, Menlo Park: AAAI.
Johnson, T. R. (1986). HYPER: The Hypothesis Matcher Tool. In Proceedings of the Expert Systems
Workshop, (pp. 122-126). Pacific Grove, CA: Science Applications International Corporation.
Johnson, T. R. (1991). Generic Tasks in the Problem-Space Paradigm: Building Flexible Knowl-
edge Systems While Using Task-Level Constraints. Ph.D. diss., Department of Computer
and Information Science, The Ohio State University, Columbus.
Johnson, T. R., & Smith, J. W. (1991). A Framework for Opportunistic Abductive Strategies. In
Proceedings of the Thirteenth Annual Conference of the Cognitive Science Society, (pp.
760-764). Hillsdale, NJ: Lawrence Erlbaum.
Johnson, T. R., Smith, J. W., & Bylander, T. (1988). HYPER - Hypothesis Matching Using Com-
piled Knowledge. In Proceedings of the Spring Symposium Series: Artificial Intelligence
in Medicine, (pp. 45-46). Stanford University.
Josephson, J. R. (1982). Explanation and Induction. Ph.D. diss., Department of Philosophy, The
Ohio State University, Columbus.
Josephson, J. R. (1988). Reducing Uncertainty by Using Explanatory Relationships. In Pro-
ceedings of the Space Operations Automation and Robotics 1988 Workshop, (pp. 149-151).
Dayton: NASA, The United States Air Force, and Wright State University.
Josephson, J. R. (1989). Inference to the Best Explanation is Basic. Behavioral and Brain Sci-
ences, 12(3). Appears as a commentary on "Explanatory Coherence" by Paul Thagard.
Josephson, J. R. (1989). A Layered Abduction Model of Perception: Integrating Bottom-up and
Top-down Processing in a Multi-Sense Agent. In Proceedings of the NASA Conference on
Space Telerobotics, (pp. 197-206). Pasadena: JPL Publication 89-7.
Josephson, J. R. (1989). Speech Understanding Based on Layered Abduction: Working Notes
for the Symposium on Spoken Language Systems. In Proceedings of the AAAI-89 Spring
Symposium Series, Stanford University: AAAI.
Josephson, J. R. (1990). Spoken Language Understanding as Layered Abductive Inference. Tech-
nical Report, The Ohio State University, Laboratory for Artificial Intelligence Research,
Columbus.
Josephson, J. R., Chandrasekaran, B., & Smith, J. (1984). Assembling the Best Explanation. In
Proceedings of the IEEE Workshop on Principles of Knowledge-Based Systems, (pp. 185-
190). Denver, CO: IEEE Computer Society.
Josephson, J. R., Chandrasekaran, B., Smith, J., & Tanner, M. C. (1986). Abduction by Classifi-
cation and Assembly. In A. Fine & P. Machamer (Eds.), PSA 1986, 1 (pp. 458-470). East
Lansing: Philosophy of Science Association.
Josephson, J. R., Chandrasekaran, B., Smith, J., & Tanner, M. C. (1987). A Mechanism for Forming
Composite Explanatory Hypotheses. IEEE Transactions on Systems, Man and Cybernetics,
Special Issue on Causal and Strategic Aspects of Diagnostic Reasoning, 17(3), 445-54.
Josephson, J. R., & Dasigi, V. (1991). Notes from the AAAI Workshop on Domain-Independent
Strategies for Abduction. Available as a technical report from the Laboratory for Artificial
Intelligence Research, Department of Computer and Information Science, The Ohio State
University, Columbus, Ohio, 43210. Internet: [email protected]
Josephson, J. R., Smetters, D., Fox, R., Oblinger, D., Welch, A., & Northrup, G. (1989). The
Integrated Generic Task Toolset, Fafner Release 1.0. Technical Report, The Ohio State
University, Laboratory for Artificial Intelligence Research.
Josephson, J. R., Tanner, M. C., Smith, J., Svirbely, J., & Strohm, P. (1985). Red: Integrating
Generic Tasks to Identify Red-Cell Antibodies. In K. N. Karna (Ed.), Proceedings of The
Expert Systems in Government Symposium, (pp. 524-531). IEEE Computer Society Press.
Kant, I. (1787). Critique of Pure Reason. (Norman Kemp Smith, Trans.). New York: St. Martin's
Press, (1968).
Kautz, H. A. (1987). A Formal Theory of Plan Recognition. Ph.D. diss., University of Rochester.
Kautz, H. A., & Allen, J. F. (1986). Generalized Plan Recognition. In Proceedings of AAAI-86, 1
(pp. 32-35). Los Altos: Morgan Kaufmann.
Keller, R. (1990). In Defense of Compilation. In Second AAAI Workshop on Model-Based Rea-
soning, (pp. 22-31). Boston.
Keuneke, A. M. (1989). Understanding Devices: Causal Explanation of Diagnostic Conclu-
sions. Ph.D. diss., Department of Computer and Information Science, The Ohio State Uni-
versity, Columbus.
Keuneke, A. M. (1991). Device representation: the significance of functional knowledge. IEEE
Expert, 6(2), 22-25.
Kitcher, P., & Salmon, W. C. (Eds.). (1989). Scientific Explanation. Minneapolis: University of
Minnesota Press.
Konolige, K. (1992). Abduction versus Closure in Causal Theories. Artificial Intelligence, 53(2-
3), 255-272.
Körner, S. (Ed.). (1975). Explanation. New Haven: Yale University Press.
Kosaka, M., & Wakita, H. (1987). Syllable Structure of English Words: Implications for Lexical
Access. In P. Mermelstein (Ed.), Proceedings of the Montreal Symposium on Speech Rec-
ognition, (pp. 59-60). Montreal.
Koton, P. (1988). Reasoning About Evidence in Causal Explanations. In Proceedings of AAAI-
88: The Seventh National Conference on Artificial Intelligence, (pp. 256-261). San Mateo:
Morgan Kaufmann.
Ku, I. O.-C. (1991). Theoretical and Empirical Perspectives on the Abductive Confidence Func-
tion. Master's thesis, Department of Computer and Information Science, The Ohio State
University, Columbus.
Kuhn, T. (1970). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
Kuipers, B. J. (1986). Qualitative Simulation. Artificial Intelligence, 29(3), 289-338.
Laird, J., Rosenbloom, P., & Newell, A. (1986). Universal Subgoaling and Chunking. Boston:
Kluwer Academic Publishers.
Laudan, L. L. (1971). William Whewell on the Consilience of Inductions. The Monist, 55.
Leake, D. B. (1992). Evaluating Explanations: A Content Theory. Hillsdale, NJ: Lawrence
Erlbaum.
Lesser, V. R., Fennell, L. D., Erman, L. D., & Reddy, R. D. (1975). The Hearsay-II Speech Un-
derstanding System. IEEE Transactions on Acoustics, Speech and Signal Processing, 23,
11-24.
Levesque, H. J. (1989). A Knowledge-Level Account of Abduction. In Proceedings of the Elev-
enth International Joint Conference on Artificial Intelligence, (pp. 1061-1067). Detroit.
Lin, D. (1991). Obvious Abduction. In Notes from the AAAI Workshop on Domain-Independent
Strategies for Abduction, (pp. 51-58). Anaheim. (Josephson & Dasigi, 1991).
Lin, D. (1992). Obvious Abduction. Ph.D. diss., Department of Computing Science, University
of Alberta.
Lin, D. (1993). Principle-based Parsing without Overgeneration. In Proceedings of ACL-93,
(pp. 112-120). Columbus.
Lin, D., & Goebel, R. (1991). Integrating probabilistic, taxonomic and causal knowledge in
abductive diagnosis. In Uncertainty in Artificial Intelligence: Volume VI, (pp. 77-87).
Elsevier Science Publishers.
Lin, D., & Goebel, R. (1991). A Message Passing Algorithm for Plan Recognition. In Proceed-
ings of the International Joint Conference on Artificial Intelligence, (pp. 280-285).
Lipton, P. (1991). Inference to the Best Explanation. London: Routledge.
Livingstone, M., & Hubel, D. (1988). Segregation of Form, Color, Movement, and Depth:
Anatomy, Physiology, and Perception. Science, 240, 740-749.
Lycan, W. G. (1985). Epistemic Value. Synthese, 64, 137-164.
Lycan, W. G. (1988). Judgement and Justification. Cambridge: Cambridge University Press.
Mackie, J. L. (1974). The Cement of the Universe: A Study of Causation. Oxford, UK: Clarendon
Press.
Maier, J. (Ed.). (1988). AAAI-88 Workshop on Plan Recognition. St. Paul: AAAI.
Marquis, P. (1991). Extending Abduction from Propositional to First-Order Logic. In Proceed-
ings of the First International Workshop on Fundamentals of Artificial Intelligence Re-
search (FAIR), New York: Springer-Verlag.
Marquis, P. (1991). Mechanizing Skeptical Abduction and its Applications to Artificial Intelli-
gence. In Proceedings of the Third IEEE International Conference on Tools for Artificial
Intelligence, San Jose: IEEE Computer Society Press.
Marquis, P. (1991). Novelty Revisited. In Proceedings of the Sixth International Symposium on
Methodologies for Intelligent Systems (ISMIS), New York: Springer-Verlag.
Marquis, P. (1991). Towards Data Interpretation by Deduction and Abduction. In Notes from the
AAAI Workshop on Domain-Independent Strategies for Abduction, (pp. 59-63). Anaheim.
(Josephson & Dasigi, 1991).
Marquis, P. (1992). Hypothetico-Deductive Diagnoses. In Proceedings of the International Sym-
posium of the Society for Optical Engineering (SPIE), Applications of Artificial Intelli-
gence: Knowledge-Based Systems. SPIE Society Press.
Marr, D. (1979). Representing and Computing Visual Information. In P. H. Winston & R. H.
Brown (Eds.), Artificial Intelligence: An MIT Perspective, (pp. 17-82). Cambridge, MA:
MIT Press.
Marr, D. (1981). Artificial Intelligence: A Personal View. In J. Haugeland (Ed.), Mind Design.
(pp. 129-142). Cambridge, MA: MIT Press. Also appears in Artificial Intelligence 9(1):47-
48, 1977.
Marr, D. (1982). Vision. San Francisco: W. H. Freeman.
Massaro, D. W. (1987). Speech Perception by Ear and Eye: A Paradigm for Psychological In-
quiry. Hillsdale, NJ: Lawrence Erlbaum.
McCarthy, J. (1980). Circumscription - A Form of Non-Monotonic Reasoning. Artificial Intel-
ligence, 13, 27-39.
McDermott, D. (1990). A Critique of Pure Reason. In M. A. Boden (Ed.), The Philosophy of
Artificial Intelligence, (pp. 206-230). New York: Oxford University Press.
McDermott, J. (1988). Preliminary Steps Toward a Taxonomy of Problem Solving Types. In S.
Marcus (Ed.), Automating Knowledge Acquisition for Expert Systems, (pp. 225-256). Bos-
ton: Kluwer Academic Publishers.
Miller, R. A., Pople, H. E., Jr., & Myers, J. D. (1982). INTERNIST-I, An Experimental Com-
puter-Based Diagnostic Consultant for General Internal Medicine. New England Journal
of Medicine, 307(8), 468-476.
Milne, A. A. (1926). Winnie-the-Pooh. E. P. Dutton and Co.
Minsky, M. (1963). Steps Toward Artificial Intelligence. In E. A. Feigenbaum & J. Feldman
(Eds.), Computers and Thought. New York: McGraw-Hill.
Minsky, M. (1975). A Framework for Representing Knowledge. In P. H. Winston (Ed.), The
Psychology of Computer Vision. New York: McGraw-Hill.
Mitchell, T., Keller, R., & Kedar-Cabelli, S. (1986). Explanation-Based Generalization: A uni-
fying view. Machine Learning, 1, 47-80.
Mittal, S. (1980). Design of a Distributed Medical Diagnosis and Database System. Ph.D. diss.,
Department of Computer and Information Science, The Ohio State University, Columbus.
Mittal, S., & Chandrasekaran, B. (1980). Conceptual Representation of Patient Data Bases. Jour-
nal of Medical Systems, 4, 169-185.
Mittal, S., Chandrasekaran, B., & Sticklen, J. (1984). PATREC: A Knowledge-Directed Data
Base for a Diagnostic Expert System. IEEE Computer Special Issue, 17, 51-58.
Mittal, S., & Frayman, F. (1987). Making Partial Choices in Constraint Reasoning Problems. In
Sixth National Conference on Artificial Intelligence, (pp. 631-636). Los Altos: Morgan
Kaufmann.
Moberg, D., Darden, L., & Josephson, J. (1992). Representing and Reasoning About a Scientific
Theory. In Proceedings of the AAAI Workshop on Communicating Scientific and Technical
Knowledge, (pp. 58-64). Stanford University.
Moberg, D., & Josephson, J. R. (1990). Implementation Note on Diagnosing and Fixing Faults
in Theories. In J. Shrager & P. Langley (Eds.), Computational Models of Scientific Discov-
ery and Theory Formation, (pp. 347-353). San Mateo: Morgan Kaufmann.
Monastersky, R. (1990). The scoop on dino droppings. Science News, Vol. 138, No. 17, p. 270.
Newell, A. (1982). The Knowledge Level. Artificial Intelligence, 18, 82-106.
Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
Newell, A., & Simon, H. A. (1963). GPS, A Program That Simulates Human Thought. In
Feigenbaum & Feldman (Eds.), Computers and Thought, (pp. 279-293). New York:
McGraw-Hill.
Newell, A., & Simon, H. A. (1976). Computer Science as Empirical Inquiry: Symbols and Search,
The 1976 ACM Turing Lecture. Communications of the ACM, 19(3), 113-126. Reprinted in
Mind Design, J. Haugeland (Ed.), 1981, MIT Press.
Newell, A., Yost, G., Laird, J. E., Rosenbloom, P. S., & Altmann, E. (1991). Formulating the
Problem Space Computational Model. In R. F. Rashid (Ed.), Carnegie-Mellon Computer
Science: A 25-Year Commemorative Reading, Boston: ACM-Press: Addison-Wesley.
Ng, H. T. (1992). A General Abductive System with Application to Plan Recognition and Diag-
nosis. Ph.D. diss., University of Texas.
Norvig, P. (1987). A Unified Theory of Inference for Text Understanding. Ph.D. diss., University
of California, Berkeley.
Norvig, P., & Wilensky, R. (1990). A Critical Evaluation of Commensurable Abduction Models
for Semantic Interpretation. In Proceedings of the Thirteenth International Conference on
Computational Linguistics, Helsinki, Finland.
Nusbaum, H., & Pisoni, D. (1987). The Role of Structural Constraints in Auditory Word Recog-
nition. In P. Mermelstein (Ed.), Proceedings of the Montreal Symposium on Speech Recog-
nition, (pp. 57-58). Montreal.
O'Rorke, P. (1988). Automated Abduction and Machine Learning. In G. DeJong (Ed.), Working
Notes of the AAAI 1988 Spring Symposium on Explanation-Based Learning, (pp. 170-174).
Stanford University: AAAI.
O'Rorke, P. (1989). LT Revisited: Explanation-Based Learning and the Logic of Principia
Mathematica. Machine Learning, 4(2), 117-159.
O'Rorke, P. (1990). Integrating Abduction and Learning. In Working Notes of the AAAI 1990
Spring Symposium on Automated Abduction, (pp. 30-32). Available as Technical Report
90-32, University of California, Irvine, Department of Information and Computer Science.
O'Rorke, P. (1990). Review of the AAAI-1990 Spring Symposium on Automated Abduction.
Sigart Bulletin, 1(3), 12-13.
O'Rorke, P., & Josephson, J. (Eds.), (planned). Computational Models for Abduction.
O'Rorke, P., & Morris, S. (1992). Abductive Signal Interpretation for Nondestructive Evalua-
tion. In Applications of Artificial Intelligence X: Knowledge-Based Systems, 1707 (pp. 68-
75). SPIE - The International Society for Optical Engineering.
O'Rorke, P., Morris, S., & Schulenburg, D. (1989). Abduction and World Model Revision. In
Proceedings of the Eleventh Annual Conference of the Cognitive Science Society, (pp. 789-
796). Hillsdale, NJ: Lawrence Erlbaum.
O'Rorke, P., Morris, S., & Schulenburg, D. (1989). Theory Formation by Abduction: Initial Re-
sults of a Case Study Based on the Chemical Revolution. In Proceedings of the Sixth Inter-
national Workshop on Machine Learning, San Mateo: Morgan Kaufmann.
O'Rorke, P., Morris, S., & Schulenburg, D. (1990). Theory Formation by Abduction: A Case
Study Based on the Chemical Revolution. In J. Shrager & P. Langley (Eds.), Computational
Models of Scientific Discovery and Theory Formation, (pp. 197-224). San Mateo: Morgan
Kaufmann.
O'Rorke, P., & Ortony, A. (1992). Abductive Explanation of Emotions. In Proceedings of the
Fourteenth Annual Conference of the Cognitive Science Society, Hillsdale, NJ: Lawrence
Erlbaum.
O'Rorke, P., & Ortony, A. (1992). Explaining Emotions. Technical Report, University of Cali-
fornia, Irvine, Department of Information and Computer Science. (Report No. 92-22).
Patil, R. S. (1981). Causal Representation of Patient Illness for Electrolyte and Acid-Base Diag-
nosis. Ph.D. diss., Laboratory for Computer Science, Massachusetts Institute of Technol-
ogy.
Patil, R. S., Szolovits, P., & Schwartz, W. B. (1981). Causal Understanding of Patient Illness in
Medical Diagnosis. In Proceedings of the Seventh International Joint Conference on Arti-
ficial Intelligence, (pp. 893-899). Vancouver, B.C. IJCAI.
Patil, R. S., Szolovits, P., & Schwartz, W. B. (1982). Modeling Knowledge of the Patient in Acid-
Base and Electrolyte Disorders. In P. Szolovits (Ed.), Artificial Intelligence in Medicine.
(pp. 191-226). Boulder: Westview Press.
Patten, T. (1988). Systemic Text Generation as Problem Solving. New York: Cambridge Univer-
sity Press.
Patten, T., & Ritchie, G. (1987). A Formal Model of Systemic Grammar. In G. Kempen (Ed.),
Natural Language Generation, (pp. 279-299). The Hague: Martinus Nijhoff.
Pazzani, M. (1988). Selecting the Best Explanation for Explanation-Based Learning. In Pro-
ceedings of the AAAI Symposium on Explanation-Based Learning, Stanford University.
Pazzani, M. J. (1990). Creating a Memory of Causal Relationships. Hillsdale, NJ: Lawrence
Erlbaum.
Pearl, J. (1986). Fusion, Propagation, and Structuring in Belief Networks. Artificial Intelligence,
29(3), 241-288.
Pearl, J. (1987). Distributed Revision of Composite Beliefs. Artificial Intelligence, 33(2), 173-215.
Pearl, J. (1988). Evidential Reasoning Under Uncertainty. In H. Shrobe (Ed.), Exploring Artifi-
cial Intelligence. San Mateo: Morgan Kaufmann.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems. San Mateo: Morgan Kaufmann.
Pearl, J. (1990). Probabilistic and Qualitative Abduction. In Proceedings of the AAAI Spring
Symposium on Abduction, (pp. 155-158). Stanford: AAAI.
Peirce, C. S. (1839-1914). Collected Papers of Charles Sanders Peirce. Edited by C. Hartshorne,
P. Weiss, & A. Burks. Cambridge, MA: Harvard University Press, (1931-1958).
Peirce, C. S. (1902). Perceptual Judgments. In J. Buchler (Ed.), Philosophical Writings of Peirce.
(pp. 302-305). New York: Dover (1955).
Peirce, C. S. (1903). Abduction and Induction. In J. Buchler (Ed.), Philosophical Writings of
Peirce. (pp. 150-156). New York: Dover (1955).
Peirce, C. S. (1955). Philosophical Writings of Peirce. Edited by J. Buchler. New York: Dover
Publications Inc.
Peng, Y. (1986). A Formalization of Parsimonious Covering and Probabilistic Reasoning in
Abductive Diagnostic Inference. Ph.D. diss., University of Maryland, College Park.
Peng, Y., & Reggia, J. A. (1986). Plausibility of Diagnostic Hypotheses: The Nature of Simplic-
ity. In Proceedings of the Fifth Annual National Conference on Artificial Intelligence, (pp.
140-145). Philadelphia: Morgan Kaufmann.
Peng, Y., & Reggia, J. A. (1987). A Probabilistic Causal Model for Diagnostic Problem
Solving - Part 1. IEEE Transactions on Systems, Man, and Cybernetics, 17(2), 147-162.
Peng, Y., & Reggia, J. A. (1987). A Probabilistic Causal Model for Diagnostic Problem
Solving - Part 2. IEEE Transactions on Systems, Man, and Cybernetics, 17(3), 395-406.
Peng, Y., & Reggia, J. A. (1990). Abductive Inference Models for Diagnostic Problem Solving.
New York: Springer-Verlag.
Pennington, N., & Hastie, R. (1988). Explanation-based decision making: Effects of Memory
Structure on Judgment. Journal of Experimental Psychology: Learning, Memory, and Cog-
nition, 14(3), 521-533.
Peterson, I. (1990). Stellar X-ray burst brings theory shock. Science News, Vol. 138, No. 16, p.
246.
Piaget, J. (1969). The Child's Conception of Physical Causality. Totowa, NJ: Littlefield, Adams.
Piaget, J. (1974). Understanding Causality. New York: W. W. Norton.
Politakis, P., & Weiss, S. (1984). Using Empirical Analysis to Refine Expert System Knowledge
Bases. Artificial Intelligence, 22(1), 23-84.
Poole, D. (1988). A Logical Framework for Default Reasoning. Artificial Intelligence, 36(1),
27-47.
Poole, D. (1989). Explanation and Prediction: An Architecture for Default and Abductive Rea-
soning. Computational Intelligence, 5(2), 97-110.
Pople, H. (1973). On the Mechanization of Abductive Logic. In Proceedings of the Third Inter-
national Joint Conference on Artificial Intelligence, Stanford, CA.
Pople, H. (1977). The Formation of Composite Hypotheses in Diagnostic Problem Solving: An
Exercise in Synthetic Reasoning. In Proceedings of the Fifth International Joint Confer-
ence on Artificial Intelligence, (pp. 1030-1037). Pittsburgh.
Pople, H. (1982). Heuristic Methods for Imposing Structure on Ill-Structured Problems: The
Structure of Medical Diagnosis. In P. Szolovits (Ed.), Artificial Intelligence in Medicine.
(pp. 119-190). Boulder: Westview Press.
Pople, H. E. (1985). Evolution of an Expert System: From INTERNIST to Caduceus. Artificial
Intelligence In Medicine, 179-208.
Punch, W. F., III (1989). A Diagnosis System Using a Task Integrated Problem Solver
Architecture (TIPS), Including Causal Reasoning. Ph.D. diss., Department of Computer and
Information Science, The Ohio State University, Columbus.
Punch, W. F., III, Tanner, M. C., & Josephson, J. R. (1986). Design Considerations for PEIRCE,
a High-Level Language for Hypothesis Assembly. In K. N. Karna, K. Parsaye, & B. G.
Silverman (Eds.), Proceedings of the Expert Systems in Government Symposium,
(pp. 279-281). IEEE Computer Society Press.
Punch, W. F., III, Tanner, M. C., Josephson, J. R., & Smith, J. W. (1990). PEIRCE: A Tool for
Experimenting with Abduction. IEEE Expert, 5(5), 34-44.
Pylyshyn, Z. (1979). Complexity and the Study of Artificial and Human Intelligence. In J.
Haugeland (Ed.), Mind Design, (pp. 67-94). Cambridge, MA: MIT Press.
Pylyshyn, Z. (1984). Computation and Cognition: Towards a Foundation for Cognitive Sci-
ence. Cambridge, MA: MIT Press.
Ramesh, T. S. (1989). A Knowledge-Based Framework for Process and Malfunction Diagnosis
in Chemical Plants. Ph.D. diss., Department of Chemical Engineering, The Ohio State Uni-
versity, Columbus.
Reggia, J. A. (1985). Abductive Inference. In K. N. Karna (Ed.), Proceedings of The Expert
Systems in Government Symposium, (pp. 484-489). IEEE Computer Society Press.
Reggia, J. A., & Nau, D. S. (1985). A Formal Model of Diagnostic Inference. Information Sci-
ences, 37, 227-285.
Reggia, J. A., Nau, D. S., & Wang, P. (1983). Diagnostic Expert Systems Based on a Set Cover-
ing Model. International Journal of Man-Machine Studies, 19(5), 437-460.
Reggia, J. A., Perricone, B. T., Nau, D. S., & Peng, Y. (1985). Answer Justification in Diagnostic
Expert Systems. IEEE Transactions on Biomedical Engineering, 32(4), 263-272.
Reiter, R. (1987). A Theory of Diagnosis from First Principles. Artificial Intelligence, 32(1),
57-95.
Rock, I. (1983). The Logic of Perception. Cambridge, MA: MIT Press.
Rosenbloom, P. S., Laird, J. E., & Newell, A. (1987). SOAR: An Architecture for General Intel-
ligence. Artificial Intelligence, 33, 1-64.
Roth, E. M., Woods, D. D., & Pople, H. E. (1992). Cognitive simulation as a tool for cognitive
task analysis. Ergonomics, 35(10), 1163-1198.
Rubin, A. (1975). The Role of Hypotheses in Medical Diagnosis. In Proceedings of the Interna-
tional Joint Conference on Artificial Intelligence, (pp. 856-862).
Salmon, W. C. (1967). The Foundations of Scientific Inference. Pittsburgh, PA: University of
Pittsburgh Press.
Salmon, W. C. (1975). Theoretical Explanations. In S. Koerner (Ed.), Explanation, (pp. 118-
184). New Haven: Yale University Press.
Salmon, W. C. (1990). Four Decades of Scientific Explanation. Minneapolis: University of Min-
nesota Press.
Santos, E., Jr. (1991). Cost-Based Abduction, Constraint Systems and Alternative Generation.
In Notes from the AAAI Workshop on Domain-Independent Strategies for Abduction, (pp.
64-71). Anaheim. (Josephson & Dasigi, 1991).
Santos, E. (1991). On the generation of alternative explanations with implications for belief revi-
sion. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, (pp. 339-348).
Santos, E. (Forthcoming). A linear constraint satisfaction approach to cost-based abduction.
Artificial Intelligence.
Schank, R., & Abelson, R. (1977). Scripts, Plans, Goals, and Understanding. Hillsdale, NJ:
Lawrence Erlbaum.
Schank, R. C. (1986). Explanation Patterns: Understanding Mechanically and Creatively.
Hillsdale, NJ: Lawrence Erlbaum.
Schrager, J., & Langley, P. (Eds.). (1990). Computational Models of Scientific Discovery and
Theory Formation. San Mateo: Morgan Kaufmann.
Schum, D. A. (1987). Evidence and Inference for the Intelligence Analyst. Lanham, MD: Uni-
versity Press of America.
Schwartz, L. M., & Miles, W. (1977). Blood Bank Technology. Baltimore: Williams and Wilkins.
Searle, J. (1980). Minds, Brains and Programs. Behavioral and Brain Sciences, 3, 417-457.
Sebeok, T. A., & Umiker-Sebeok, J. (1983). "You Know My Method": A Juxtaposition of Charles
S. Peirce and Sherlock Holmes. In U. Eco & T. A. Sebeok (Eds.), The Sign of Three: Dupin,
Holmes, Peirce. (pp. 11-54). Bloomington: Indiana University Press.
Selman, B., & Levesque, H. J. (1990). Abductive and Default Reasoning: A Computational Core.
In Proceedings of the Eighth National Conference on Artificial Intelligence, (pp. 343-348).
Boston: AAAI Press, Menlo Park, California.
Sembugamoorthy, V., & Chandrasekaran, B. (1986). Functional Representation of Devices and
Compilation of Diagnostic Problem Solving Systems. In J. L. Kolodner & C. K. Riesbeck
(Eds.), Experience, Memory and Reasoning, (pp. 47-73). Hillsdale, NJ: Lawrence Erlbaum.
Shimony, S. (1991). Algorithms for Irrelevance-Based Partial MAPs. In Proceedings of the Con-
ference on Uncertainty in Artificial Intelligence, (pp. 370-378).
Shimony, S. (1991). Explanation, Irrelevance and Statistical Independence. In Proceedings of
the Ninth National Conference on Artificial Intelligence, (pp. 482-487). Cambridge, MA:
MIT Press.
Shimony, S., & Charniak, E. (forthcoming). Cost-based abduction and MAP explanation. Artifi-
cial Intelligence.
Shortliffe, E. H. (1976). Computer-Based Medical Consultations: MYCIN. Elsevier.
Shrager, J., & Langley, P. (Eds.). (1990). Computational Models of Discovery and Theory For-
mation. San Mateo: Morgan Kaufmann.
Simmons, R., & Davis, R. (1987). Generate, Test and Debug: Combining Associational Rules
and Causal Models. In Proceedings of the Tenth International Joint Conference on Artifi-
cial Intelligence, (pp. 1071-78). Milan.
Simon, H. A. (1981). The Sciences of the Artificial (Second Edition). Cambridge, MA: MIT
Press.
Simon, S. R. (1982). Kinesiology—Its Measurement and Importance to Rehabilitation. In V. L.
Nickel (Ed.), Orthopedic Rehabilitation. New York: Churchill Livingstone.
Smith, J., Svirbely, J. R., Evans, C. A., Strohm, P., Josephson, J. R., & Tanner, M. C. (1985).
RED: A Red-Cell Antibody Identification Expert Module. Journal of Medical Systems, 9(3),
121-138.
Smith, J. W. (1985). RED: A Classificatory and Abductive Expert System. Ph.D. diss., Depart-
ment of Computer and Information Science, The Ohio State University, Columbus.
Smith, J. W., Josephson, J. R., Evans, C., Strohm, P., & Noga, J. (1983). Design For a Red-Cell
Antibody Identification Expert. In Proceedings of the Second International Conference on
Medical Computer Science and Computational Medicine. IEEE Society Press.
Smith, J. W., Josephson, J. R., Tanner, M. C., Svirbely, J., & Strohm, P. (1986). Problem Solving
in Red Cell Antibody Identification: Red's Performance on 20 Cases. Technical Report,
The Ohio State University, Laboratory for Artificial Intelligence Research.
Smolensky, P. (1988). On the Proper Treatment of Connectionism. Behavioral and Brain Sci-
ences, 11(1), 1-23.
Sosa, E. (Ed.). (1975). Causation and Conditionals. New York: Oxford University Press.
Spiegelhalter, D. J., & Lauritzen, S. L. (1989). Sequential Updating of Conditional Probabilities
on Directed Graphical Structures. Technical Report, Institut for Elektroniske Systemer,
Aalborg University, Aalborg, Denmark.
Steels, L. (1990). Components of Expertise. AI Magazine, 11(2), 28-49.
Stefik, M. (1981). Planning with Constraints (MOLGEN: Part 1). Artificial Intelligence, 16,
111-140.
Stickel, M. (1989). Rationale and Methods for Abductive Reasoning in Natural-Language Inter-
pretation. In R. Studer (Ed.), Proceedings, Natural Language and Logic, International Sci-
entific Symposium, Hamburg, Germany, May 1989, (pp. 233-252). Berlin: Springer-Verlag.
Stickel, M. E. (1988). A Prolog-like Inference System for Computing Minimum-Cost Abductive
Explanations in Natural-Language Interpretation. In Proceedings of the International
Computer Science Conference-88, (pp. 343-350). Hong Kong.
Sticklen, J. (1987). MDX2: An Integrated Medical Diagnostic System. Ph.D. diss., Department
of Computer and Information Science, The Ohio State University, Columbus.
Sticklen, J. (1989). Distributed Abduction in MDX2. In C. Kulikowski, J. M. Krivine, & J. M.
David (Eds.), Artificial Intelligence in Scientific Computation: Towards Second Genera-
tion Systems. Basel: J. C. Baltzer.
Sticklen, J., Chandrasekaran, B., & Bond, W. E. (1989). Distributed Causal Reasoning. Knowl-
edge Acquisition, 1(2), 139-162.
Sticklen, J., Chandrasekaran, B., & Josephson, J. (1985). Control Issues in Classificatory Diag-
nosis. In Proceedings of the Ninth International Joint Conference on Artificial Intelligence,
Los Altos: Morgan Kaufmann.
Sticklen, J., Kamel, A., & Bond, W. E. (1991). Integrating Quantitative and Qualitative Compu-
tations in a Functional Framework. Engineering Applications of Artificial Intelligence, 4(1),
1-10.
Sticklen, J., Kamel, A., & Bond, W. E. (1991). A Model-Based Approach for Organizing Quan-
titative Computations. In Proceedings of the Second Annual Conference on AI, Simulation
and Planning in High Autonomy Systems, Orlando.
Struss, P. (1987). Problems of Interval-Based Qualitative Reasoning. In Fruechtennicht & others
(Eds.), Wissensrepraesentation und Schlussfolgerungsverfahren fuer Technische
Expertensysteme. Munich: INF 2 ARM-1-87, Siemens.
Struss, P., & Dressler, O. (1989). 'Physical Negation' - Integrating Fault Models into the General
Diagnostic Engine. In Proceedings of the Eleventh International Joint Conference on
Artificial Intelligence, (pp. 1318-1323). Detroit.
Suwa, M., & Motoda, H. (1991). Learning Abductive Strategies from an Example. In Notes from
the AAAI Workshop on Domain-Independent Strategies for Abduction, (pp. 72-79). Ana-
heim. (Josephson & Dasigi, 1991).
Swartout, W. R. (1983). XPLAIN: A System for Creating and Explaining Expert Consulting
Programs. Artificial Intelligence, 21(3), 285-325.
Szolovits, P., & Pauker, S. G. (1984). Categorical and Probabilistic Reasoning in Medical Diag-
nosis. Artificial Intelligence, 11, 115-144.
Tanner, M. C. (1989). Explaining Knowledge Systems: Justifying Diagnostic Conclusions. Ph.D.
diss., Department of Computer and Information Science, The Ohio State University, Columbus.
Tanner, M. C., Fox, R., Josephson, J. R., & Goel, A. K. (1991). On a Strategy for Abductive
Assembly. In Notes from the AAAI Workshop on Domain-Independent Strategies for Ab-
duction, (pp. 80-83). Anaheim. (Josephson & Dasigi, 1991).
Tanner, M. C., Josephson, J., & Smith, J. W. (1991). RED-2 Demonstration Case OSU-9.
Technical Report, The Ohio State University, Laboratory for Artificial Intelligence Research.
Tanner, M. C., & Josephson, J. R. (1988). Abductive Justification. Technical Report, The Ohio
State University, Laboratory for Artificial Intelligence Research, Columbus.
Thagard, P. (1987). The best explanation: Criteria for theory choice. Journal of Philosophy, 75,
76-92.
Thagard, P. (1988). Computational Philosophy of Science. Cambridge, MA: MIT Press.
Thagard, P. (1989). Explanatory Coherence. Behavioral and Brain Sciences, 12(3).
Thagard, P. (1992). Conceptual Revolutions. Princeton: Princeton University Press.
Thagard, P. (unpublished). Probabilistic Networks and Explanatory Coherence. To appear in a
volume edited by P. O'Rorke and J. Josephson.
Thirunarayan, K., & Dasigi, V. (1991). On the Relationship Between Abductive Reasoning and
Boolean Minimization. In Notes from the AAAI Workshop on Domain-Independent Strate-
gies for Abduction, (pp. 84-88). Anaheim. (Josephson & Dasigi, 1991).
Torasso, P., & Console, L. (1989). Diagnostic Problem Solving. New York: Van Nostrand
Reinhold.
Tracy, K., Montague, E., Gabriel, R., & Kent, B. (1979). Computer-Assisted Diagnosis of Or-
thopedic Gait Disorders. Physical Therapy, 59(3), 268-277.
Truzzi, M. (1983). Sherlock Holmes: Applied Social Psychologist. In U. Eco & T. A. Sebeok
(Eds.), The Sign of Three: Dupin, Holmes, Peirce. (pp. 55-80). Bloomington: Indiana Uni-
versity Press.
Tuhrim, S., Reggia, J. A., & Goodall, S. (1991). An experimental study of criteria for hypothesis
plausibility. Journal of Experimental and Theoretical Artificial Intelligence, 3, 129-144.
Valiant, L. G. (1979). The Complexity of Enumeration and Reliability Problems. SIAM Journal
on Computing, 8(3), 410-421.
Waibel, A. (1989). Suprasegmentals in Very Large Vocabulary Word Recognition. In E. C. Schwab
& H. C. Nusbaum (Eds.), Pattern Recognition by Humans and Machines, (pp. 159-186).
Boston: Academic Press.
Waibel, A., & Lee, K.-F. (Eds.). (1990). Readings in Speech Recognition. San Mateo: Morgan
Kaufmann.
Wallace, W. A. (1972). Causality and Scientific Explanation, Volume 1, Medieval and Early
Classical Science. Ann Arbor: University of Michigan Press.
Wallace, W. A. (1974). Causality and Scientific Explanation, Volume 2, Classical and Contem-
porary Science. Ann Arbor: University of Michigan Press.
Weintraub, M. A. (1991). An Explanation-Based Approach for Assigning Credit. Ph.D. diss.,
Department of Computer and Information Science, The Ohio State University, Columbus.
Weintraub, M. A. (1991). Selecting an Appropriate Explanation. In Notes from the AAAI Work-
shop on Domain-Independent Strategies for Abduction, (pp. 89-91). Anaheim. (Josephson
& Dasigi, 1991).
Weintraub, M. A., & Bylander, T. (1989). QUAWDS: A Composite Diagnostic System for Gait
Analysis. In L. C. Kingsland (Ed.), Proceedings of the Thirteenth Annual Symposium on
Computer Applications in Medical Care, (pp. 145-151). Washington, D.C. IEEE Computer
Society Press.
Weintraub, M. A., & Bylander, T. (1991). Generating Error Candidates for Assigning Blame in a
Knowledge Base. In Proceedings of the Eighth International Workshop on Machine Learn-
ing, (pp. 33-37). San Mateo: Morgan Kaufmann.
Weintraub, M. A., Bylander, T., & Simon, S. R. (1990). QUAWDS: A Composite Diagnostic
System for Gait Analysis. Computer Methods and Programs in Biomedicine, 32(1),
91-106.
Winograd, T. (1983). Language as a Cognitive Process. Reading, MA: Addison-Wesley.
Zadrozny, W. (1991). Perception as Abduction: Some Parallels with NLU. In Notes from the
AAAI Workshop on Domain-Independent Strategies for Abduction, (pp. 92-97). Anaheim.
(Josephson & Dasigi, 1991).
Acknowledgments
Department of Computer and Information Science. We are grateful for sub-
stantial support from the Defense Advanced Research Projects Agency and
the US Air Force by way of research contracts F30602-85-C-0010 and
F49620-89-C-0110. Research on abduction in the RED domain has been sup-
ported by the National Heart, Lung and Blood Institute, NIH Grant 1 R01
HL 38776-01. At a critical time we were encouraged to continue working
in the speech domain by grant IRI-8902142 from DARPA and the National
Science Foundation. Support has come indirectly from the NEC Corpora-
tion by way of a gift to Osamu Fujimura in support of research on speech
science. Some work has been supported by the National Library of Medi-
cine under grant LM-04298. The Integrated Generic Task Toolset used in
several of our systems benefited greatly from the sponsorship of DARPA
under the Strategic Computing Program. Other significant support for toolset
development came at various times from Xerox Corporation, IBM, Digital
Equipment Corporation, and Texas Instruments. Development of CREAM
described in chapter 8 was supported by National Institute on Disability
and Rehabilitation Research grants H133E80017 and H133C90090, and par-
tially supported by McDonnell-Douglas research contract Z 81 225 on func-
tional reasoning.
Special software credits: Microsoft Word, DoItAll!, LaTeX, BibTeX,
Interlisp D, Loops, GNU Emacs, TOPS-20, Medley, MM, Apple Macintosh
System 7, and CLOS.
Some parts of Chapter 1 are derived from a technical report distributed
as "On the Logical Form of Abduction," some from John Josephson's Ph.D.
dissertation (Josephson, 1982), some from "A Mechanism for Forming Com-
posite Explanatory Hypotheses" (Josephson et al., 1987), some from "In-
ference to the Best Explanation is Basic" (Josephson, 1989), and from a
technical report distributed as "Abductive Justification." The first section
of chapter 2 is derived from a paper presented by Susan Josephson in June
1990 at the workshop entitled "Artificial Intelligence: Emerging Science
or Dying Art Form" held at the State University of New York at Binghamton,
sponsored by the Research Foundation of the State University of New York
and by the American Association for Artificial Intelligence. Other parts of
chapter 2 are adapted from: "Expert Systems: Matching Techniques to
Tasks," which appears in Artificial Intelligence Applications for Business,
edited by W. Reitman, Ablex Corp., publishers; "Generic Tasks in Knowl-
edge-Based Reasoning: High-Level Building Blocks for Expert System De-
sign" (Chandrasekaran, 1986); "Generic Tasks as Building Blocks for
Knowledge-Based Systems: The Diagnosis and Routine Design Examples"
(Chandrasekaran, 1988); "What Kind of Information Processing is Intelli-
gence? A Perspective on AI Paradigms and a Proposal," in The Founda-
tions of Artificial Intelligence, A Sourcebook, edited by Derek Partridge and
Yorick Wilks, Cambridge University Press, 1990; and "Towards a Functional
Architecture for Intelligence Based on Generic Information Processing
Tasks" (Chandrasekaran, 1987). Parts of chapter 3 are adapted from: Michael
Tanner's Ph.D. dissertation (Tanner, 1989); "Red: Integrating Generic Tasks
to Identify Red-Cell Antibodies" (Josephson et al., 1985); "Design for a Red-
Cell Antibody Identification Expert" (Smith et al., 1983); "A Mechanism
for Forming Composite Explanatory Hypotheses" (Josephson et al., 1987);
"Problem Solving in Red Cell Antibody Identification: RED's Performance
on 20 Cases," (Smith et al., 1986); and from a technical report distributed
as "Abduction Experiment, 'Sherlock Holmes' Strategy" by Susan T. Korda,
Michael Tanner, and John Josephson. Parts of chapter 4 are drawn from
"PEIRCE: A Tool for Experimenting with Abduction" (Punch et al., 1990);
parts from "A Framework for Opportunistic Abductive Strategies" (Johnson
& Smith, 1991), copyright Cognitive Science Society Incorporated, used
with permission. One paragraph of chapter 4 is from "Generic Tasks as
Building Blocks for Knowledge-Based Systems: The Diagnosis and Rou-
tine Design Examples" (Chandrasekaran, 1988). The concluding section of
chapter 4 draws from "Design Problem Solving: A Task Analysis"
(Chandrasekaran, 1990). Chapter 5 draws from: "An Investigation of the
Roles of Problem-Solving Methods in Diagnosis" by W.F. Punch III and B.
Chandrasekaran, which appears in the Proceedings of the Tenth International
Workshop on Expert Systems and their Applications: Second Generation
Expert Systems, May 1990. Chapter 6 draws from: "Concurrent Synthesis
of Composite Explanatory Hypotheses" (Goel et al., 1988); and
"Concurrency in Abductive Reasoning" by Ashok Goel, John Josephson, and
P. Sadayappan, which appears in the Proceedings of the Knowledge-Based
Systems Workshop, St. Louis, published by Science Applications Interna-
tional Corporation. Chapter 7 is from "The Computational Complexity of
Abduction," (Bylander et al., 1991) and is reprinted with permission from
Elsevier Science Publishers B. V. Chapter 8 draws from "Distributed Ab-
duction in MDX2," (Sticklen, 1989), used with permission of J. C. Baltzer
AG Science Publishers; "QUAWDS: A Composite Diagnostic System for
Gait Analysis" by Weintraub and Bylander that appeared in the Proceed-
ings of the Thirteenth Annual Symposium on Computer Applications in Medi-
cal Care, 1989, used here with permission of SCAMC, Inc. A similar pa-
per by Weintraub, Bylander, and Simon appeared in Computer Methods and
Programs in Biomedicine, vol. 32, no. 1, 1990; material is used here with
permission of Elsevier Science Publishers B.V. Chapter 9 draws from tech-
nical reports distributed as: "Practical Abduction," by John Josephson and
Ashok K. Goel; PEIRCE-IGTT by Richard Fox and John Josephson; and
"Experimental Comparison of Strategies for Abduction: Contribution of
Explicit Explanatory Relationships" by Michael C. Tanner and John Joseph-
son. Chapter 10 draws from "A Layered Abduction Model of Perception:
Integrating Bottom-up and Top-down Processing in a Multi-Sense Agent"
(Josephson, 1989); from a technical report circulated as "Spoken Language
Understanding as Layered Abductive Inference" by John Josephson; and
from "Speech Understanding Based on Layered Abduction" (Josephson,
1989).
Index
acceptance, threshold for 145 Argument from Design note 6.2
accounts-for Aristotle 16, 17, note 1.11
adjustment during assembly 134 articulatory gestures 254, 258
determining 133, 190, 199 articulatory invariants 247
knowledge 96, 111 ArtRec 250, 252-258
acoustic invariants 247 attention, focus of 49
acoustic stratum 248 auditory stratum 249
active vision 244
ad hoc category 141 backtracking minimization 263
additive reaction strengths 176 Bayes's Theorem 26
affirming the consequent 158 Bayesian belief network 161
agglutination 65 BB1 116
agora 242 Beckman, M. note 10.4, note 10.7, note 10.9
AI as art 34-35 behavior note 5.4
AI as design science 35-37 belief 12
AI as engineering 32 belief revision, Pearl's theory of 161-162,
AI as the science of intelligence 52 164, 169, 173
AI as traditional science 32-34 Believed hypotheses 209, see Machine 5,
AI disunity 31 Machine 6
AI programs, evaluation of 32 best explanation
Allemang, D. 3, 157, 158, 163, 165, 166, algorithm for finding 173
167, 171, 291 compared to what? 15
Altmann, E. 106 confidence in 269
ambiguity 143, 149, 212, 214, 223, 244, 263 likelihood of being correct 232, 270
ampliative inference 13, 260 unique 270
analogy 141, 265, 271 best-small plausibility criterion 171, 173,
closeness of 272 Appendix B note
annotated state transition 128 Bhaskar, R. note 1.9
antecedent conditions 17 biased sampling processes 21
anti-A 64 blackboard systems 246
antibody 63 blood typing 63
antibody identification 67 blood-bankers' vocabulary 65
antigens 63 Bogenschutz, S. 291
antigram 65 boldness 86
Ao, B. note 10.7, note 10.9 Bonaparte, N. 8
Appelt, D. 238 Bond, W. E. 129
architecture 39, 42 Boole, G. 44, 45
for abduction 124, 131, 150, 241 Bordley, J. 10, 11, 121
analog and symbolic contrasted 40 bottom-up processing 238, 239, 242, 245
artifacts of 49 Bromberger, S. 29
connectionist 40 Browman, C. P. 249, 251
connectionist, for abduction 213 Brown, D. C. 2, 54, 60, 97, 116, 118
consequences of assumptions about 145 Brown, J. S. 187, 195
for diagnosis 124 Bruce, D. J. 250
empirical investigation of 61 Bruner, J. 238
flexible 115 Buchanan, B. 72
for general intelligence 105 Bylander, T. 2, 3, 60, 61, note 3.4, 117, 130,
general problem solving 116, 262 132, 157, 158, 163, 165, 171, 180, 181,
independence/dependence 42 186, 191, 194, note 8.2, 234, 293
mental 59
method-specific 116 C/D model 258
multileveled 49 Caduceus 103
of QUAWDS 187
of RED-1 and RED-2 66-75 causal responsibility 17, 18, 29
SOAR-like 106 causal sufficiency 17
symbolic 40 causal understanding 29
task-level 60 causality
task-specific 115 axiomatization of 271
TIPS 118 causation 16-18
unitary 49, 51, 52, 59 cause 17
universal 39, 50 direct and indirect 189
efficient 17, 35 confidence
and explanation note 1.10 adjustment 120, 144, 146, 210, 216, 217,
final 17, 35
formal 17 all things considered 268
hidden 264 coarse scale 26, 267, 268
and implication 17 of composite hypothesis 171-173, 271
material 17 of conclusions 212
mechanical 17 correlated with correctness 232
remote 92 of data items 120
caution 86, 231 equal 223
cells 64 grades 144
certainty 1, 16, 266, see also doubt, uncer- graduated descent 263
tainty initial 142, 199, 243, 267
emergent 15-16, see also emergent numerical 26
certainty precision 144
increased by hedging 23 prima facie 71, 141, 142, 143, 146, 243,
islands of 142, 144, 151, 203, 211, 223, 267, 271
224, 245, 246 quality of the estimate of 206
values 226, 227 set by expert judgment 71
vocabulary 225 set by voting note 10.2
certainty-increasing inference 16, see also symbolic 68
uncertainty reduction thresholds, relative and absolute 143
chance, as explanatory 21 vocabularies for expressing 219, 267
Chandra, see Chandrasekaran, B. confidence / coverage tradeoff 144
Chandrasekaran, B. 1, 2, 31, 50, 54, 56, 60, confidence-setting behavior 237
61, 63, 68, 73, 94, 106, 115, 116, 117, confidence-setting rules 232, 267
note 5.1, 118, 120, 126, 129, 130, 132, validation of 236
134, 160, 182, 183, 291, 292, 293 confident portion of the best explanation
channels 239, 259 143, 210
Charniak, E. 5, 25, 238 Connection Machine 150
classification 52, 59, see also hierarchical connectionism 40-42
classification consilience of inductions 245
specialist 54, 215 consistency 83, 169, 244, 263
clause 248 consolidation 130-132
coarticulation 258 constrictor symptoms 104
cognitive dissonance 146 content theory 47
cognitive psychology 33 contrasting possibility 29
coherence 19, 222, 271 control
coincidence 22 constructs 39
Columbo note 9.7 fine 116
communication-intensive processing 150 fixed 96, 216
compiled deliberation 238, 240 in generic tasks 53
complete explanation 143, see also in hierarchical classification 54
explanation, complete opportunistic 96, 216, 263
complexity 44, 51, 54, 124, 181, 187, 202, programmable 97
203, 204, 214-215, 268, chapter 7 of refinement 100, 102, 104, 146, 263
complexity reduction 68, 71, 178, 214 in rule systems 49
component behavior 131 specification of 60
composite hypothesis 84, see also hypoth- strategies 52
esis, composite control knowledge 49, 97, 107, 109, 111,
generating all plausible 75 112
locally best 191 controlled experiment 21
subtasks of forming 139 converter/distributor model of phonetic
composition of hypotheses 242, 243-244 implementation see C/D model
computational architecture see architecture Cooke, W. 291
computational complexity of abduction see Cooper, G. 159, 162, 169
complexity corrective learning task 196
concept correctness 44, 45, 47, 77, 85, 88, 89, 138,
instantiation 142 223, 233, 234, 235, 236, 270
new 241 correlation 19, 22, 146, 270, 271
concurrent processing 139, 263 correspondence with the facts 237, 264
cost of error 206 in various architectures 51
cover-and-differentiate method 164 diagnostic dialog 11
coverage / confidence tradeoff 144 diagnostic differential 10, 91, 146, 244
covering-law model of explanation see differential see diagnostic differential
explanation, covering-law model of disbelief 210
CREAM 196, 197-201 disjunctive syllogism 12
credit assignment 196-197, 200 disphasic muscle activity 192
criticism 29, 66, 74, 78, 83-88, 95, 102, domain independence 94, 137, 138, 218
134-135, 146, 214, 262 Donahue, T. note 1.12
cross-match 66 doubt 13, 244, see also uncertainty
crucial experiment 7, 244 cannot be eliminated 14
CSRL 60, note 3.4, 99, 124, 263 persistence of 16
CSRL-IGTT 215 Doyle, A. C. 7, 90
cueing 138, 141, 242 Dr. Gait-1 186
CV system 250 Dr. Gait-2 186
Dressler, O. 159
Darden, L. note 1.3, 129, note B.2, 291 DSPL60, 97, 118
Darwin, C. 8 DSPL++ 116
Dasigi, V. 238 Dunlop, B. 291
data Dvorak, D. 159
relativity of 242 Dzierzanowski, J. 186
roles of in abduction 243
to be explained 199, 210 EAGOL note A.1
data gathering 14, 21, 143, 212, 242, 244 ECHO 218, 222
data language 260 emergent certainty 15, 260
data perspective 147 empathy 265
David, J. M. 180 emphasis 255, 257
Davis, R. 159, 187, 196 empirical generalization 19, 264
de Kleer, J. 143, note 7.1, 159, 162, 164, empirical knowledge 16
187, 195 empiricism 16
deception 13, 212 Ennis, R. 19,20,291
decision point 83, 97, 102 enumerative induction 19
reoccurring 100, 118 Erickson, D. 238, note 10.12
decision table 99 Erman, L. 246
deduction 15 error candidate generation & selection 197
and abduction 12, 13 error detection 120, 131, 146, 228
and explanation 16 error minimization 206
and prediction 23 Eshelman, L. 164
deductive-nomological model of explanation essential hypotheses 76, 84, 85, 143, 148,
see explanation, deductive-nomological 209, 227, see also hypothesis, Essential
model of actually occur 154
default method 216 are not essential 211
DeJong, G. 197 relative to rivals 144, 212
DeJongh, M. 3, 130, 141 uncovered 223
delaying hard decisions 102, 103, 138, 144 essential kernel 85
deliberation-perception isomorphism 241 essentials-first leveraging-incompatibilities
DENDRAL 72, 75, 87 strategy 211
Descartes, R. 13, note 6.2 essentials-first strategy 142-155
design science 35-37 establish-refine control strategy 55, 68
design task 52, 53, 54, 60, 115 establish-reject knowledge 54
detective 7, note 9.7 etiologies 239
determinism 17 Evans, C. 63, note 3.1, note 3.4, note 3.5,
device 126, 129, 130, 132, 187 note 3.6
diagnosis 9-12, 50, 58, 115 evidence
errors in 10-11, 200 breadth of 76
generic task analysis 52-59 domain-independent theory of 269
perception-like 241 negative to positive, note 1.5, 91
use of functional knowledge in 126-130, patterns of 71
131-134 evidence abstractions 56
use of structural knowledge in 130-134 evidential links 240
evidential relationships, science of 12 explanatory interactions 169
evidential support 12 explanatory power 271
evocation of hypotheses 139-141, 242 monotonically increasing 83
expectation-based disambiguation 252 explanatory relationships 223
expectations 131, 244, 246 explanatory scope 243
failure of 21, 264 explanatory usefulness 76
experiment 6, 7, 16, 21, 31
expert judgment 71, 91, 230 fair questions 11
explaining as much as possible 207, 264 false but useful propositions 45
explanandum 20 Fann, K. T. 8, 158
explanation 16, note 1.9, 18, 29, 160 Fannin, E. 291
best 15, 160, 205 Feigenbaum, E. 38, 72
cognitive aspects of 18 Feigenbaum, E. 38, 72
complete 83, 85, 160, 166, 169, 203, 204, Feth, L. note 10.4
214, 224, 228, see also complete filing system of the mind 48
explanation final-cause naming 64
confidence in 269, see also confidence finding the best explanation 204
confident, partial 263 findings 103
correct 85 ambiguous 143
of correlation 22 order of consideration 90-91
covering-law model of 20 unexplainable 143
deductive-nomological model of 17, 20 unusual 90
and deductive proof 16 Fischer, O. 3, 94, note 4.2, 136
definition of 204 focus of attention 49, 50, 67, 79, 90, 96, 103,
expansion 224 104, 159, 183, 206, 249
by general statements 20 focus of puzzlement 206
ideally complete 25 Fodor, J. 238
incomplete 17, 214, 228 formant note 10.8
models of 17 four-machine experiment 224
of multiple findings 103 Fox, Richard 3, 202, note 9.3, 238, note
object of 20, 21, 29 10.7, note 10.9, note 10.12,293
partial note 7.2, 204, 263 10.7, note 10.9, note 10.12, 293
plausible 15 FR see Functional Representation Language
and pragmatics 18 frames 46-47, 73
and prediction 24-25 Frayman, F. 102
of prediction failure 264 frequency semantics for probabilities 267,
psychological 17 269
of sample frequencies 19-22 Fromkes, J. note 5.3
of special findings 103 Fujimura, O. 238, note 10.12, 252, 258, 292
of state transitions 128 functional knowledge 126, 132
statistical 17 functional modularity 60
true 10, 15 Functional Representation Language 126
uncertainty of 214 functions
and understanding 28, 206 of a device 126
unique best 158 explained by causal processes 128
what needs to be accounted for 189
explanation function 160 Gabriel, R. 186
explanation seekers 264 gait 184
explanation-accepting processes 260 Gardocki, R. 291
explanation-based learning 197 Garey, M. 157, 168, 172
explanation-seeking processes 180, 260 generalization, hedged 23
explanations Generalized Set Covering model 72
all possible consistent 143 generating elementary hypotheses 199
generating all possible 136 generic goals 216
George-did-it type 24 generic tasks 35, 60, 106, 115, 200, 216
number of 165 alternative methods for 114
explanatory coverage 142, 143, 162, 199 canonical methods for 216
determining 142 characterization 53
maximizing 204, 205, 206 classical view 62
explanatory inference 5 elementary 53, 60, 61
generic tasks (cont.) deep knowledge as a source of 132
in antibody identification 67 generating elementary 139
generic-task approach 59-62 partitioning sets of 210
gestural score 249 hypothesis
gesture hypotheses 253 acceptance 8
Gettier problem note 1.8 Believed 209, see also Machine 5, Machine
givens 29 6
gluteus maximus 193 Best 227
goal, problem-solving 96 Clear-Best 143, 144, 146, 148, 151, 209,
goal-seeking machines 263 210, 211, 217, 227
God note 6.2 comparison 136
Goel, A. 3, 136, 147, 148, 150, 174, 202, composite 9, see also composite hypothesis
213, 293 Confirmed 216
Goldstein, L. 249, 251 criticism 8
Gomez, F. 50, 68 Disbelieved 210
Gregory, R. L. 238 Essential 143, 210, 212, 217, see also
guess 210, 212, 217, 232, 263 essential hypotheses
informed 143, 149 evaluation 243, 262
guilt 219 evocation see evocation of hypotheses
formation 242, 262, 272
Halliday, M. 249, 250 fragment 79
Hamscher, W. 159, 187 generation 8, 67, 68, 199, 212
hard decisions 144 Guessed 210
delaying 102-103, 144 maximally plausible 262
hard implication note 9.5 most plausible first 136
Harman, G. 1, 5, 6, 7, 8, note 1.8, 19, 25 OTHER 269
Harvey, A. 10, 121 parts of 84
Hastie, R. 5 reconsideration 244
Hayes-Roth, B. 116 scoring 131, 213, 243, 245
Hayes-Roth, F. 102, 246 testing 272
hearing 239 Weak-Best 143, 149, 209, 210, 217, 227
HEARSAY-II 102, 246 hypothesis assembly 60, 67
hedged generalizations 23 anomaly during 228
hedging, to strengthen inference 23 as composition of mathematical functions
Helmholtz, H. 238 199
Hemami, H. 187 cross checking during 146
hemolysis 65 opportunistic 211
Hempel, C. 17 in QUAWDS 188-191
Herman, D. 116 recursive 263
Herring, M. 291 hypothesis comparison, implicit 75
hesitation 13 hypothesis interactions 91-93, 132-134,
heterogeneous problem solvers 59 145-146, 163, 167, 218, 243, 267, 268
Hidden Markov Models note 10.5 additive overlap 134, 145
hierarchical classification 54-56, 60, 141 cancellation 158, 169-171, 176, 177, 178
in RED 67-69 causal 146, 263
is ubiquitous 56 compensatory behavior 185
hierarchy of constituents 247 dense 181
Hillis, D. 150 explanatory alternatives 91
Hinkelman, E. 291 explanatory need 146
Hirsch, D. 186, 195 implication 216, 251, 263
Hoare, C. 147 incompatibility 79, 91, 146, 155-156, 158,
Hobbs, J. 25, 238 note 7.8, 176, 177, 178, 202, 203, 208-
Holmes, S. 7, 90 210, 219, 223, 224, 226, 244, 251,
hot map 259 263
Hubel, D. 239 probabilistic 146
Huber, R. 180 soft implication 223
Hume, D. 270 subtype 146
HYPER 60, 125 suggestiveness 146
hypotheses sympathy and antipathy 91, 218
causally integrated composites 263 type-subtype 91, 243
unknown 269 intelligence of the hearer 258
hypothesis matching 56-57, 59, 60, 199 intelligent database 72, 73, 182
in RED 69-72 InterLisp-D note 4.1
hypothesis perspective 147 INTERNIST 68, 72, 75, 87, 159, 164
hypothesis selection, arbitrary 138, 143, 144, intractability, factors leading to 157, 177
149, 152 invocable method 97
hypothesis-improving operations 97 Ishida, Y. 252
hypothesis-space simplification 268 island driving strategy 102, 246
hypothesization by analogy 141, 271 islands of certainty see certainty, islands of
hypothetico-deductive model 25
Johnson, D. 157, 168, 172
IDABLE 60, 124 Johnson, K. note 5.3
IDB see intelligent database Johnson, T. 3, 94, note 4.3, note 5.3, 293
IGT Toolset see Integrated Generic Task Josephson, J. 1, 5, 60, 63, note 3.1, note 3.2,
Toolset note 3.7, 94, note 4.1, 129, 134, 136,
implication, causality and 17 147, 154, 157, 158, 160, 163, 165, 171,
implication, soft 222 174, 180, 183, 202, note 9.3, 238, note
incompatibility abduction problems 167-169 10.7, note 10.9, note 10.12, 262, 266,
incompatibility knowledge 224, 231, 232, 270, 291, 292, 293, 294
233, 234, 235 Josephson, S. G. 3, 31, 291, 292
incompatible hypotheses, discriminating Josephson, S. J. 291
between 234 jury 5, 7, 218
independent abduction problems 164-166, justified true belief 16
167
independent incompatibility abduction Kamel, A. 129
problem 168 Kant, I. 238
induction 18, 22-27 Kedar-Cabelli, S. 197
inductive generalization 16, 19, 22 Keller, R. 197
inductive projection 22, 24 Kent, B. 186
inference 26 Keuneke, A. 61, 126
ampliative 13, 260 Kiritani, S. 252
converging lines of 245 knowing that we know 16
defined 12 knowledge
fallible 12, 260 accounts-for 96
function of 12 causal 114, 126, 263
knowledge producing 259 compilation 132, 234, 239, note 10.2
in perception 238, 240 compiled 123, 124-126, 129, 130
strengthened by hedging 23 control 49
taxonomy of types 27 deep 114, 132
truth preserving 13, 43 diagnostic note 5.2
truth producing 13, 260 evoking 138
inference mechanism 49 and explanatory hypotheses 29
inference to the best explanation, see of explanatory relationships 236, 263
abduction finding-importance 96
infinite loop in hypothesis assembly 83 of functions 126
information-processing strategies 38, 50, 52, general 141
53, 54, 59 of hypothesis incompatibility 79, 236
information-processing task 35, 37, 40, 41, hypothesis interaction 96
42, 61 of hypothesis interactions 243
information-seeking processes 14 hypothesis matching 263
innocence 219 inseparable from use 45
instantiation of hypotheses 139, 142, 242, justified true belief 16
243, 244, 262 limits of 1, 264
Integrated Generic Task Toolset 215, 251 method-specific 116
intelligence 14, 51, 53, 268, 272 operator-implementation 106
and action 178 operator-proposal 106
biological 45 operator-selection 106
as a collection of strategies 53 from perception 259
information-processing theory of 45 in the philosophical sense 16, 260
where does the power come from? 45, 48 plausibility assignment 96
knowledge (cont.) limitations of traditional note 1.13, 141
of possibilities 141 for program semantics 46
possibility of 16, 136, 180, 272 for representation 43-46
scientific 260 as the science of evidential relationships
search-control 107 12
strategic 96 use in AI note 2.1
structural 263 logic of discovery note 1.6
symbolic logic for representing 44 logic of justification note 1.6
veridical perceptions 260 logic of science 1
without absolute certainty 260 logical implication, confused with causation 17
knowledge base 43, 49 logical possibility 272
knowledge engineer 60 logically justified 259
knowledge groups 70, 189 lookahead 110, 111, 112
knowledge level 46, 61 LOOPS note 4.1
knowledge organization 47 low-likelihood alternative explanations 14,
knowledge-producing inferences 16, 259 212, see also explanation
knowledge reformulation 196 Lucas, J. note 5.3
knowledge representation 60, 159 Lycan, W. 5, note 1.1, note 1.7, 291
depends on use 50, 60
knowledge-base refinement problem 196 MacDonald, J. 291
knowledge-directed data retrieval 57-58, 60 Machine 1 137, 142, 262
in RED 72-74 essential hypotheses 144
knowledge-use vocabulary 61 Machine 2 137-138, 263
Korda, S. 3, 63, 293 essential hypotheses 144
Krishnamurthy, A. note 10.4, note 10.7 Machine 3 94, 138, 263
Krivine, J. M. 180 Machine 4 139, 142, 225, 263
Ku, I. 3, 269, 270 essential hypotheses 143
Kuhn, T. 35 and Machine 5 146
Kuipers, B. 159, 187 strategy 143
Kulikowski, C. 180 summary 150
Machine 5 146, 208-213, 225, 229, 242,
LAIR 2 243, 263
Laird, J. 95, 105, 106 assembly strategy 211
language of thought 60 Machine 6 242, 262, 263
language understanding 238 control strategy 246
law 7, 218 malfunction hierarchy 114
layer-layer harmony 244 malfunction model 126
layered-abduction model of perception 238, Marr, D. 35, 41, 239
240 Martin, P. 238
Leake, D. note A.1 Matcher 224, 225
learn time 241 Matcher with assembler and incompatibility
learning 1, 28, 40, 48, 181, 196, 197, 240, handling 229
241, 257 Matcher with incompatible hypotheses
from mistakes 196 handling 226
opportunities for 245 Matcher-with-Assembler 226
least commitment strategy 102 maximizing explanatory coherence 19
Lenzo, K. 238, note 10.12 maximizing explanatory coverage 202, 206
Lesser, V. 102, 246 maximum plausibility requirement 158, 178
Levesque, H. 143, 162 McDermott, D. 5, 238
likelihood 14, 84, 142, 266, 267, 271, 272 MDX 181, 266
qualitative 160 MDX2 3, 129, 181, 184, 263
likely absent 76 medical diagnosis 5, 95, 239, chapter 5,
likely present 76 chapter 8
Livingstone, M. 239 memory organization 47, 184
local abductive criteria 142 mental architectures 59
local abductive reasoning 145 mental structures 51
logic mentalese 60
for attributing knowledge to an agent 46 message passing, for abductive assembly
for describing the objects of thought 46 147
for justification 45 Massaro, D. 259
method choice 101, 115 parsimonious covering theory 164
run time 115 parsimony 84, 85, 111, 145, 160, 171, 211,
Miller, R. A. 68, 72, 159, 164 213-214, 227, 262, 271, see also
Milne, A. A. 7 simplicity
mind, fundamental operations of 25 essentials-first strategy and 152, 154
minimal cardinality 111 testing for 148
minimum cover 172 Partridge, D. 292
Minsky, M. 46, 196 PATHEX/LIVER 95, 117, 263
mistakes, recovering from 244-246 pathognomonic 103, 230
Mitchell, T. 197 pathological states 239
Mittal, S. 58, 60, note 3.4, 73, 102, 181, 182, Patil, R. 170
194 Patten, T. 238
Moberg, D. 129 Pearl, J. 159, 160, 161, 162, 164, 167, 169,
model-based diagnosis 159 271, 291
modus ponens 5 PEIRCE 3, 60, 94, 125
monotonic abduction problems 166-167 compared to ABD-SOAR 113
Montague, E. 186 control mechanism 97-101
Mooney, R. 197 single-finding abducers 145
morpheme 248 Peirce, C. S. 3, 5, 8, 18, note 4.1, 158, 238
most probable explanation 161 PEIRCE-IGTT 215, 216, 222, 251, 255
motor theory of speech perception 249 Peng, Y. 9, 68, 72, 150, 159, 162, 164, 271
MPE see most probable explanation Pennington, N. 5
multisense integration 259 perception 238
Murrer, S. 291 abductively justified 260
Myers, J. A. 72 as compiled deliberation 240
Myers, J. D. 68, 159 inference in 238, 240
multisense 258, 259
Narayanan, H. note 9.3 perception-deliberation isomorphism 241
Nau, D. 72, 164 Perricone, B. 72
Necker cube 245 phoneme note 10.6, 249
neural-net abduction 150 phonetic stratum 249
Newell, A. 32, 46, 95, 105, 106, 107, 116 phonological-prosodic stratum 249
Newton, I. 7 phrase 248
nine five NINE Pine Street 252 physical symbol system hypothesis 32
noise hypothesis 8, 212, 253, 254 place of articulation 250
noisy channel 16 placebo 21
non-superfluousness, kinds of 85 plan selection and refinement 60
normality 9, 24, 126, 161, 169, 185 planning 26
novelty 263 plans 47-48
NP-hard abduction 157 Plato 16
numerical vs. qualitative representations plausibility 12, 70, 71, 75, 83, 171, 266-272,
42-43 appendix B, see also confidence
arguments 271
object identity 259 of composite hypotheses 89, 171, 174, see
object-oriented programming 47 also best-small criterion
objectivity 1 from analogy 141, 271
observation 16, 19, 20, 21 function 160
observation language 13 grades 144
Occam's razor 58, 84, see also simplicity, initial 142
parsimony linearly ordered values 158
optimal solutions 178 multidimensional 269
ordered abduction problems 173 as a number 268
other minds 6, 265 ordering 136
outre 90 prima facie 141
Overview 75, 78, 125 probabilistic interpretation of 267-270
tractability of determining 162
panel, test 64 Pooh, W. T. 7
parallel channels 247 Poole, D. 162
parallel processing 144, 150 Pople, H. 68, 72, 103, 158, note A.1
in abduction 147 population-to-sample inference 19
possibility 271, 272 domain knowledge 67
knowledge of 141 future versions 236
merely logical 141 in PEIRCE 101
plausibility as 271 project goals 65
real 141, 272 RED-1 2, 3, 63, 75, 141
practicality of computations 178 performance 76-78
pragmatic bias 264 strategy 78
pragmatic stratum 250 RED-2 2, 3, 63, 78, 83, 133, 134, 141, 174
pragmatics 134, 149, 231 performance 88-90
prediction 19, 22-28 RED-3 101
based on trust 25 red-cell antibody identification 63
and explanation 24-25 reference-class problem 27, 268, 269
failure of 25, 264 Reggia, J. 9, 50, 68, 72, 150, 159, 162, 164,
is not deduction 23 271, 291
revision of 264 register 250
from structural descriptions 131 reidentification 259
as a subtask of abduction 26 Reiter's theory of diagnosis 160-161, 169
priming 141, 242 Reiter, R. 51, 143, note 7.1, 159, 160, 162,
principle of plausible repetition 141, 271 164
probability 24, 173, 266, 267, 272 Reitman, W. 292
and abduction 26-27 representation, of causal processes 127-129
posterior 267 representational commitments 40, 41
prima facie 267 revising the data 8
prior 27 Rh types 64
problem decomposition 263, see also risk of error 144
generic tasks robot scientist 1, 259, 264
problem difficulty 234 Rock, I. 238
problem of induction 1, 22-24 role 36, 128
problem space 106 Rosenbloom, P. 95, 105, 106
problem-solving architecture see architec- rule-based systems 48-50
ture rule-out knowledge 178, 228
problem-solving strategy 67, 144
problem-space computational model 106 Sadayappan, P. 136, 147, 150, 174, 213, 293
process abstraction 129 Salmon, W. note 1.9, 27
propensity 267 sample size 21
prosodic control 258 sample, biased 19, 20, 21
prototype vs. typical 257 sample-to-population inference 19
PSCM see problem-space computational sampling processes 21
model Schank, R. 47, 141
Punch, W. 3, 60, 94, note 4.1, 117, note 5.1, Schwartz, W. 170
119, 120, note 5.3, 215, 293 scientific reasoning 260, 270
Putnam, C. 291 scripts 47-48
Pyati, R. 291 Searle, J. 34
Pylyshyn, Z. W. 42 Sebeok, T. 7
second-best explanation 270
qualitative simulation 132, 187 segment, phonetic 258
quantifier of ordinary life 23 selectors 97-101
QUAWDS 3, 181, 184, 187, 263 semantic networks 49
Sembugamoorthy, V. 56, 61, 126, 129
Ramanujan, J. 150, 213 sensory events 264
rationality see also intelligence sentence 248
Bayesian 268 serum 64
emergent 48 set-covering model 68, 72, 75, 87, 164
and hedging 23 Sherlock Holmes strategy 90
Rauschenberg, J. note 10.7 SIMD parallelism 150
real possibility 272 similarity, degree of 141, 272
Realism 1, 237, 264 Simon, H. 32, 35, note 4.5, 181
Reason 1 Simon, S. 180, 181, 185, 186, 196, 293
recognition 215, 247, 258 simplicity 84, see also parsimony, Occam's
RED 2, chapter 3, 63, 175, 267, 269
as an epistemic virtue 84 task-structure analysis 114, 117
single-fault assumption 158 taxonomic hierarchy 263
skepticism 16, 180, 206 teleology 18, 35
Smetters, D. note 9.3 term introduction 13
Smith, J. W. 2, 63, note 3.1, note 3.2, 94, 117, test design 234
note 5.1, note 5.3, 136, 154, 160, 293 testimony 7
Smolensky, P. 40 Thadani, S. 129, note 10.9
SOAR 95, 105, 106, 116 Thagard, P. note 1.2, 150, 218, 219, note 7.6,
soft constraint 246 222, 291
soft implication note 7.5, 222 theory formation 260, 264
software modularity 101 theory language 13, 260
specialists 54, 57, 58, 59, 62, 68, 69, 118, Thomas, S. 291
215 thresholds of acceptance 143, 145, 159, 205,
specificity 263 209, 211, 217
speculation 210, 212 tie breaking 149
speech recognition 238, 246, 247, see also TIPS 3, 95, 117, 118, 263
spoken language understanding ToMake 127
obstacles to automating 247, 252, 257 top-down processing 238, 239, 242, 244,
statistical approaches 257 251, 252, 263
Speicher, C. note 5.3 Tracy, K. 186
spoken language understanding 238, 239, Transgene project 129
246 true cause 180
strata 247-250 truth 1, 44, 45
sponsor-selector mechanism 97, 118 truth finding 260
sponsors 97 truth preservation 13, 43
spreading disambiguation 223, 243 truth production 13, 260
spurious 22 Truzzi, M. 7
statistical association 146 Turing Machines 39
statistical syllogism 23 Turing test 34
Stefik, M. 102 Turing universal 50
stereotypes 46
Stickel, M. 238 Umiker-Sebeok, J. 7
Sticklen, J. 2, 3, 61, 129, 130, 134, 180, 182, uncertainty 223, 230, see also certainty
183, 293 uncertainty control 144, 206
strategy-specific primitives 50 uncertainty reduction 231, 236, 243
Strohm, P. 63, 154 uncovered essentials 223
strong AI 34 understanding 29
structure/function models causal processes 129
advantages for diagnosis 123 complex devices 126
shortcomings 123 and explanation 206
Struss, P. 159, 187 how something works 126
substances 130 unexplainable findings 143, 149
subtask dependencies 139 unexplained remainder 79, 143, 210, 263
Sutherland, G. 72 universal quantifier 23
Svirbely, J. 63, note 5.1, note 5.3, 136, 154 Unknown-Disease 212
syllable 249, 258
symbol 41 Valiant, L. G. note 7.7
synthetic agents 264 Very Clear-Best hypothesis 212
synthetic worlds 264 vision 239
systemic grammar 249 visual metaphors 241
Szolovits, P. 170, 186
Wallace, W. note 1.10
Tanner, M. 2, 3, 5, 60, 63, note 3.1, note 3.2, Wang, P. 164
note 3.4, note 3.5, note 3.7, 94, note 4.1, Weak-Best hypotheses see hypothesis, Weak-
154, 157, 158, 160, 163, 165, 171, 197, Best
202, 269, 293 Weintraub, M. 3, 180, 181, 186, note 8.2,
task specificity 94 198, 293
task-method-subtask analysis 116 Welch, A. 291
task-specific problem solver 69 what can be known 1
task-specific vocabulary 52 Whewell, W. note 10.3
why questions 29 word 248
Wilks, Y. 292 word boundaries 252
Williams, B. 143, note 7.1, 159, 162, 164 word hypothesis 254
Winograd, T. 249
witness 7 X-ray Microbeam 252
Wittgenstein, L. 46
wonder 28 Yost, G. 106