Chapter 9: The Expert System Boom
The winter had barely ended when a new spring arrived—or so it seemed.
By the early 1980s, artificial intelligence had found a new form: the expert system. Instead of pursuing general intelligence, that distant goal that had embarrassed so many predictions, researchers focused on narrow domains. Medicine. Chemical analysis. Computer configuration. Tax preparation. The systems encoded the knowledge of human experts in rules, then applied those rules to specific problems.
This was intelligence that corporations could buy.
In 1980, Digital Equipment Corporation began deploying XCON, an expert system developed at Carnegie Mellon to configure VAX computer orders. The system examined customer requirements and generated valid configurations of the components needed to fill them. It was estimated to save the company $40 million over six years. Within a few years, corporations around the world were investing in AI: over a billion dollars annually by 1985, mostly funding in-house expert system departments.
An industry materialized. Teknowledge and Intellicorp sold software. Symbolics and LISP Machines Inc. built specialized hardware, computers optimized to run LISP, the programming language that dominated American AI research. For a brief, heady moment, it seemed that artificial intelligence had finally become practical.
MYCIN was the paradigm case, the system that showed what expert systems could do.
Developed at Stanford between 1972 and 1976, MYCIN diagnosed bacterial infections and recommended antibiotics. It worked through backward chaining: starting from the goal of identifying the organism responsible, it reasoned backward through roughly 600 if-then rules, requesting patient data only when a rule required it, then suggested treatments adjusted for body weight and drug interactions.
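To make that control flow concrete, here is a minimal backward-chaining sketch in Python. The rule and fact names are invented for illustration, not drawn from MYCIN, which was written in Lisp and attached certainty factors to its conclusions rather than treating them as simply true or false.

    # Toy backward chainer: rules are plain data, the engine is generic.
    # (conclusion, [premises]) pairs -- hypothetical, illustrative rules only.
    RULES = [
        ("organism_is_e_coli", ["gram_negative", "rod_shaped", "grows_aerobically"]),
        ("gram_negative", ["stain_result_negative"]),
    ]

    KNOWN_FACTS = {"stain_result_negative", "rod_shaped", "grows_aerobically"}

    def backward_chain(goal, rules, facts):
        """Try to prove `goal`: check the known facts first, then any rule
        whose conclusion matches, recursing on each of its premises."""
        if goal in facts:
            return True
        for conclusion, premises in rules:
            if conclusion == goal and all(
                backward_chain(p, rules, facts) for p in premises
            ):
                facts.add(goal)  # cache the derived fact
                return True
        return False

    if __name__ == "__main__":
        print(backward_chain("organism_is_e_coli", RULES, set(KNOWN_FACTS)))  # True

Because the rules are just data and the chaining function knows nothing about medicine, swapping in a different rule set yields a different "expert." That separation of generic engine from domain knowledge is essentially what the expert system shells described below packaged and sold.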
In controlled evaluations, MYCIN performed remarkably well. Researchers presented the system with 10 cases of bacteremia or meningitis. Eight infectious disease specialists evaluated MYCIN's recommendations alongside those of the treating physicians and other experts. MYCIN achieved an acceptability rating of 65%—comparable to the 42% to 62% ratings of faculty members, and better than most general practitioners.
Here was a machine that could match human experts in a narrow medical domain. It couldn't examine patients or read their faces. It knew nothing of the world outside blood infections. But within its domain, it was genuinely competent.
And yet MYCIN was never used in practice.
The reasons were instructive. Legal and ethical questions arose: if MYCIN recommended the wrong antibiotic, who bore responsibility? The physician who followed the advice? The hospital that deployed the system? Stanford? The questions had no clear answers, and hospitals were unwilling to create the precedents. Beyond liability, there was professional resistance. Doctors did not want a machine second-guessing their diagnoses. MYCIN outperformed many physicians, but physicians still controlled whether MYCIN would be used.
MYCIN's true legacy was methodological. Its rule-based architecture became a template. "Expert system shells"—including E-MYCIN, a generalized version of its reasoning engine—enabled rapid development of systems in other domains. The approach spread across industries: manufacturing, finance, customer service, military applications.
The knowledge was finally being extracted and encoded. Or so it appeared.
In 1982, Japan announced a 10-year initiative to build the "fifth generation" of computers, with a budget of approximately ¥50-54 billion (roughly $400-500 million at the time).
The previous four generations had moved from vacuum tubes to transistors to integrated circuits to microprocessors. The fifth, according to Japan's Ministry of International Trade and Industry, would be computers capable of knowledge processing, natural language understanding, and artificial intelligence. Japan would build the future of computing, and the future would be intelligent.
The announcement sent tremors through Western governments. In the Cold War framework that still dominated strategic thinking, Japan's declaration read as a challenge—an AI "space race" with national prestige at stake. The United States responded with the Strategic Computing Initiative, a DARPA program that committed over a billion dollars to develop autonomous vehicles, pilot assistance systems, and battle management AI. Europe launched its own ESPRIT program. Suddenly, artificial intelligence was not merely an academic pursuit but a matter of geopolitical competition.
The Fifth Generation project established the Institute for New Generation Computer Technology (ICOT) and set to work on parallel inference machines running Prolog. The goal was "epoch-making computers" that would combine supercomputer-like performance with genuine knowledge processing.
The results were mixed. ICOT produced working parallel hardware: machines with hundreds of processors capable of logical inference. The technical achievements were real. But the commercial and AI ambitions went largely unfulfilled. Prolog, popular in Europe, never caught on in markets dominated by LISP and mainstream languages. The specialized hardware was eventually surpassed by cheaper, more versatile machines, as general-purpose workstations from Sun and commodity Intel x86 computers kept getting faster without the overhead of a specialized architecture.
By 1992, when the project concluded, the assessment was sobering. "ICOT has done little to advance the state of knowledge based systems, or Artificial Intelligence per se," one evaluation noted. Natural language goals had been dropped or spun off. Very large knowledge bases remained elusive. The project had created "a positive aura for AI" and trained a generation of researchers, but the fifth generation of intelligent computers had not arrived.
The panic it caused, however, had lasting effects. The Strategic Computing Initiative, the competitive funding, the sense that AI mattered strategically—all of this sustained interest and investment during years when the technology remained far from its goals.
Through both the boom and its aftermath, the critics continued their work.
Hubert Dreyfus, whose "Alchemy and AI" had scandalized the field in 1965, returned in 1972 with What Computers Can't Do—an expanded critique arguing that human intelligence depended fundamentally on embodiment. We know how to use tools not because we possess rules for tool use but because we have bodies that grip and lift and feel resistance. We understand language not through grammatical parsing but through immersion in contexts we inhabit physically and socially.
Dreyfus drew on phenomenologists the AI community had never read: Merleau-Ponty on the body, Heidegger on being-in-the-world. The AI researchers spoke a language of rules and symbols; Dreyfus spoke of tacit knowledge and lived experience. They talked past each other for years.
Marvin Minsky dismissed him: "They misunderstand, and should be ignored." Edward Feigenbaum called phenomenology "cotton candy." But Dreyfus was identifying something real. Expert systems encoded explicit rules, yet human experts often couldn't articulate the knowledge that guided their judgments. The "knowledge acquisition bottleneck," the difficulty of extracting expertise into rule form, became one of the defining problems of the era. The experts knew more than they could say, and what they couldn't say couldn't be programmed.
Joseph Weizenbaum offered a different critique. In Computer Power and Human Reason (1976), the ELIZA creator distinguished between calculation and judgment. Computers excelled at the former—applying rules, processing data, following procedures. But judgment involved intuition, creativity, moral reasoning, wisdom. These were not computable in any meaningful sense, Weizenbaum argued, and tasks requiring them should not be delegated to machines.
Weizenbaum worried about more than technical limitations. He saw a culture of "technological determinism," the belief that because something could be built, it should be built. He saw AI researchers pursuing artificial intelligence without considering the ethical and social consequences. He accused the field of arrogance and irresponsibility.
Both critics were largely ignored during the boom years. But they were sharpening questions the field eventually had to answer.
In 1984, at the annual meeting of the American Association for Artificial Intelligence, Marvin Minsky and Roger Schank, two pioneers who had lived through the first winter, issued a warning. Enthusiasm for AI had "spiraled out of control." Disappointment was certain to follow.
They described a cascade: pessimism in the research community would spread to the press, which would spread to funders, which would kill serious research. They called it an "AI winter," by analogy to nuclear winter. The chain reaction would leave the field in darkness.
Three years later, they were proven right. But the warning itself revealed what the boom had exposed. MYCIN demonstrated that narrow AI could match human performance in circumscribed domains. XCON showed that expert systems could deliver real commercial value—$40 million saved was real money. Knowledge engineering, however limited, developed techniques for extracting expertise that informed later work.
The critics had revealed something too. Dreyfus's phenomenology pointed toward why expert systems proved brittle: human expertise wasn't just rules waiting to be extracted. Weizenbaum's ethics pointed toward questions the field had not yet learned to ask: just because something could be automated, should it be?
The Fifth Generation, for all its unrealized ambitions, had trained a generation of researchers in parallel computing. The panic it caused had kept AI on the strategic agenda. Even failure was teaching lessons.
And at the edges of the field, seeds were germinating that had nothing to do with expert systems. Geoffrey Hinton was working on neural networks. The Internet was beginning to scale. Specialized graphics chips were being designed for video games. None of this seemed relevant to the expert system builders. All of it mattered.
The boom revealed both the promise and the limits of narrow AI. It proved that practical applications were possible. It proved that over-promising was dangerous. And it set the stage for what would come next: a second winter, colder than the first, and then the long wait for spring.