The first successful commercial expert system, R1, began operation at the Digital Equipment
Corporation (McDermott, 1982). The program helped configure orders for new computer
systems; by 1986, it was saving the company an estimated $40 million a year. By 1988,
DEC’s AI group had 40 expert systems deployed, with more on the way. DuPont had 100 in
use and 500 in development, saving an estimated $10 million a year. Nearly every major U.S.
corporation had its own AI group and was either using or investigating expert systems.
In 1981, the Japanese announced the “Fifth Generation” project, a 10-year plan to build
intelligent computers running Prolog. In response, the United States formed the Microelectronics
and Computer Technology Corporation (MCC) as a research consortium designed to
assure national competitiveness. In both cases, AI was part of a broad effort, including chip
design and human-interface research. In Britain, the Alvey report reinstated the funding that
was cut by the Lighthill report. In all three countries, however, the projects never met their
ambitious goals.
Overall, the AI industry boomed from a few million dollars in 1980 to billions of dollars
in 1988, including hundreds of companies building expert systems, vision systems, robots,
and software and hardware specialized for these purposes. Soon after that came a period
called the “AI Winter,” in which many companies fell by the wayside as they failed to deliver
on extravagant promises.
1.3.7 The return of neural networks (1986–present)
In the mid-1980s at least four different groups reinvented the back-propagation learning
algorithm first found in 1969 by Bryson and Ho. The algorithm was applied to many learning
problems in computer science and psychology, and the widespread dissemination of the
results in the collection Parallel Distributed Processing (Rumelhart and McClelland, 1986)
caused great excitement.
These so-called connectionist models of intelligent systems were seen by some as direct
competitors both to the symbolic models promoted by Newell and Simon and to the
logicist approach of McCarthy and others (Smolensky, 1988). It might seem obvious that
at some level humans manipulate symbols, and Terrence Deacon’s book The Symbolic
Species (1997) even suggests that this is the defining characteristic of humans, but most
connectionists questioned whether symbol manipulation had any real explanatory role in
detailed models of cognition. This question remains unanswered, but the current view is that
connectionist and symbolic approaches are complementary, not competing. As occurred with
the separation of AI and cognitive science, modern neural network research has bifurcated
into two fields, one concerned with creating effective network architectures and algorithms
and understanding their mathematical properties, the other concerned with careful modeling
of the empirical properties of actual neurons and ensembles of neurons.
1.3.8 AI adopts the scientific method (1987–present)
Recent years have seen a revolution in both the content and the methodology of work in
artificial intelligence. It is now more common to build on existing theories than to propose
brand-new ones, to base claims on rigorous theorems or hard experimental evidence rather
than on intuition, and to show relevance to real-world applications rather than toy examples.
AI was founded in part as a rebellion against the limitations of existing fields like control
theory and statistics, but now it is embracing those fields. As David McAllester (1998) put it:
In the early period of AI it seemed plausible that new forms of symbolic computation,
e.g., frames and semantic networks, made much of classical theory obsolete. This led to
a form of isolationism in which AI became largely separated from the rest of computer
science. This isolationism is currently being abandoned. There is a recognition that
machine learning should not be isolated from information theory, that uncertain reasoning
should not be isolated from stochastic modeling, that search should not be isolated from
classical optimization and control, and that automated reasoning should not be isolated
from formal methods and static analysis.
In terms of methodology, AI has finally come firmly under the scientific method. To be accepted,
hypotheses must be subjected to rigorous empirical experiments, and the results must
be analyzed statistically for their importance (Cohen, 1995). It is now possible to replicate
experiments by using shared repositories of test data and code.
The field of speech recognition illustrates the pattern. In the 1970s, a wide variety of
different architectures and approaches were tried. Many of these were rather ad hoc and
fragile, and were demonstrated on only a few specially selected examples. In recent years,
approaches based on hidden Markov models (HMMs) have come to dominate the area. Two
aspects of HMMs are relevant. First, they are based on a rigorous mathematical theory. This
has allowed speech researchers to build on several decades of mathematical results developed
in other fields. Second, they are generated by a process of training on a large corpus of
real speech data. This ensures that the performance is robust, and in rigorous blind tests the
HMMs have been improving their scores steadily. Speech technology and the related field of
handwritten character recognition are already making the transition to widespread industrial
and consumer applications. Note that there is no scientific claim that humans use HMMs to
recognize speech; rather, HMMs provide a mathematical framework for understanding the
problem and support the engineering claim that they work well in practice.
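As a rough illustration of the mathematical framework an HMM provides, the sketch below computes the probability of an observation sequence with the forward algorithm. It is a toy under stated assumptions, not a speech recognizer: the two hidden states, the two observation symbols, and every probability are invented for the example, and in practice the tables would be estimated by training on a large corpus of real speech data.

```python
def forward_likelihood(observations, states, start_p, trans_p, emit_p):
    """Return P(observations) under a discrete HMM, using the forward algorithm."""
    if not observations:
        return 1.0  # the empty sequence has probability 1
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        # Sum over every possible previous state, then account for the new observation.
        alpha = {
            s: emit_p[s][obs] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical toy model: all names and numbers below are assumptions for illustration.
STATES = ("voiced", "silence")
START = {"voiced": 0.4, "silence": 0.6}
TRANS = {
    "voiced": {"voiced": 0.3, "silence": 0.7},
    "silence": {"voiced": 0.6, "silence": 0.4},
}
EMIT = {
    "voiced": {"loud": 0.8, "quiet": 0.2},
    "silence": {"loud": 0.3, "quiet": 0.7},
}

if __name__ == "__main__":
    print(forward_likelihood(["loud", "quiet", "loud"], STATES, START, TRANS, EMIT))
```

The cost grows linearly with the length of the sequence, because each step reuses the probabilities computed for the previous step instead of enumerating every possible path through the hidden states.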
Machine translation follows the same course as speech recognition. In the 1950s there
was initial enthusiasm for an approach based on sequences of words, with models learned
according to the principles of information theory. That approach fell out of favor in the
1960s, but returned in the late 1990s and now dominates the field.
Neural networks also fit this trend. Much of the work on neural nets in the 1980s was
done in an attempt to scope out what could be done and to learn how neural nets differ from
“traditional” techniques. Using improved methodology and theoretical frameworks, the field
arrived at an understanding in which neural nets can now be compared with corresponding
techniques from statistics, pattern recognition, and machine learning, and the most promising
technique can be applied to each application. As a result of these developments, so-called
data mining technology has spawned a vigorous new industry.
Judea Pearl’s (1988) Probabilistic Reasoning in Intelligent Systems led to a new acceptance
of probability and decision theory in AI, following a resurgence of interest epitomized
by Peter Cheeseman’s (1985) article “In Defense of Probability.” The Bayesian network
formalism was invented to allow efficient representation of, and rigorous reasoning with,
uncertain knowledge. This approach largely overcomes many problems of the probabilistic
reasoning systems of the 1960s and 1970s; it now dominates AI research on uncertain reasoning
and expert systems. The approach allows for learning from experience, and it combines
the best of classical AI and neural nets. Work by Judea Pearl (1982a) and by Eric Horvitz and
David Heckerman (Horvitz and Heckerman, 1986; Horvitz et al., 1986) promoted the idea of
normative expert systems: ones that act rationally according to the laws of decision theory
and do not try to imitate the thought steps of human experts. The Windows™ operating system
includes several normative diagnostic expert systems for correcting problems. Chapters
13 to 16 cover this area.
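To make "efficient representation of, and rigorous reasoning with, uncertain knowledge" a little more concrete, here is a minimal sketch of a three-node diagnostic network: one hidden fault with two symptoms that are conditionally independent given the fault. The variable names and all probabilities are hypothetical, chosen only for illustration, and the inference shown is plain enumeration rather than any particular published algorithm.

```python
# Hypothetical network: fault -> error_light, fault -> no_output.
# Every number here is an assumption made up for the example.
PRIOR_FAULT = 0.01
# P(symptom is present | fault status)
P_SYMPTOM_GIVEN = {
    "error_light": {True: 0.9, False: 0.05},
    "no_output": {True: 0.7, False: 0.1},
}


def joint(fault, evidence):
    """P(fault status, evidence), as a product of the network's local tables."""
    p = PRIOR_FAULT if fault else 1.0 - PRIOR_FAULT
    for symptom, observed in evidence.items():
        p_present = P_SYMPTOM_GIVEN[symptom][fault]
        p *= p_present if observed else 1.0 - p_present
    return p


def posterior_fault(evidence):
    """P(fault | evidence), by enumerating both values of the fault variable."""
    p_fault = joint(True, evidence)
    p_ok = joint(False, evidence)
    return p_fault / (p_fault + p_ok)


if __name__ == "__main__":
    print(posterior_fault({"error_light": True}))
    print(posterior_fault({"error_light": True, "no_output": True}))
```

The factored form needs five numbers (one prior and two per symptom), whereas a full joint table over three Boolean variables needs seven, and the gap widens quickly as more symptoms are added.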
Similar gentle revolutions have occurred in robotics, computer vision, and knowledge
representation. A better understanding of the problems and their complexity properties,
with increased mathematical sophistication, has led to workable research agendas and
robust methods. Although increased formalization and specialization led fields such as vision
and robotics to become somewhat isolated from “mainstream” AI in the 1990s, this trend has
reversed in recent years as tools from machine learning in particular have proved effective for
many problems. The process of reintegration is already yielding significant benefits.