
Book III

Contents
The Machine in the Ghost

Minds: An Introduction

Interlude: The Power of Intelligence

L  The Simple Math of Evolution
131  An Alien God
132  The Wonder of Evolution
133  Evolutions Are Stupid (But Work Anyway)
134  No Evolutions for Corporations or Nanodevices
135  Evolving to Extinction
136  The Tragedy of Group Selectionism
137  Fake Optimization Criteria
138  Adaptation-Executers, Not Fitness-Maximizers
139  Evolutionary Psychology
140  An Especially Elegant Evolutionary Psychology Experiment
141  Superstimuli and the Collapse of Western Civilization
142  Thou Art Godshatter

M  Fragile Purposes
143  Belief in Intelligence
144  Humans in Funny Suits
145  Optimization and the Intelligence Explosion
146  Ghosts in the Machine
147  Artificial Addition
148  Terminal Values and Instrumental Values
149  Leaky Generalizations
150  The Hidden Complexity of Wishes
151  Anthropomorphic Optimism
152  Lost Purposes

N  A Human’s Guide to Words
153  The Parable of the Dagger
154  The Parable of Hemlock
155  Words as Hidden Inferences
156  Extensions and Intensions
157  Similarity Clusters
158  Typicality and Asymmetrical Similarity
159  The Cluster Structure of Thingspace
160  Disguised Queries
161  Neural Categories
162  How An Algorithm Feels From Inside
163  Disputing Definitions
164  Feel the Meaning
165  The Argument from Common Usage
166  Empty Labels
167  Taboo Your Words
168  Replace the Symbol with the Substance
169  Fallacies of Compression
170  Categorizing Has Consequences
171  Sneaking in Connotations
172  Arguing “By Definition”
173  Where to Draw the Boundary?
174  Entropy, and Short Codes
175  Mutual Information, and Density in Thingspace
176  Superexponential Conceptspace, and Simple Words
177  Conditional Independence, and Naive Bayes
178  Words as Mental Paintbrush Handles
179  Variable Question Fallacies
180  37 Ways That Words Can Be Wrong

Interlude: An Intuitive Explanation of Bayes’s Theorem
Minds: An Introduction
by Rob Bensinger

You’re a mind, and that puts you in a pretty strange predicament.


Very few things get to be minds. You’re that odd bit of stuff in the universe
that can form predictions and make plans, weigh and revise beliefs, suffer,
dream, notice ladybugs, or feel a sudden craving for mango. You can even
form, inside your mind, a picture of your whole mind. You can reason about
your own reasoning process, and work to bring its operations more in line
with your goals.
You’re a mind, implemented on a human brain. And it turns out that a
human brain, for all its marvelous flexibility, is a lawful thing, a thing of pattern
and routine. Your mind can follow a routine for a lifetime, without ever once
noticing that it is doing so. And these routines can have great consequences.
When a mental pattern serves you well, we call that “rationality.”
You exist as you are, hard-wired to exhibit certain species of rationality and
certain species of irrationality, because of your ancestry. You, and all life on
Earth, are descended from ancient self-replicating molecules. This replication
process was initially clumsy and haphazard, and soon yielded replicable differ-
ences between the replicators. “Evolution” is our name for the change in these
differences over time.
Since some of these reproducible differences impact reproducibility—a
phenomenon called “selection”—evolution has resulted in organisms suited to
reproduction in environments like the ones their ancestors had. Everything
about you is built on the echoes of your ancestors’ struggles and victories.
And so here you are: a mind, carved from weaker minds, seeking to under-
stand your own inner workings, that they can be improved upon—improved
upon relative to your goals, and not those of your designer, evolution. What
useful policies and insights can we take away from knowing that this is our
basic situation?

Ghosts and Machines


Our brains, in their small-scale structure and dynamics, look like many other
mechanical systems. Yet we rarely think of our minds in the same terms
we think of objects in our environments or organs in our bodies. Our basic
mental categories—belief, decision, word, idea, feeling, and so on—bear little
resemblance to our physical categories.
Past philosophers have taken this observation and run with it, arguing that
minds and brains are fundamentally distinct and separate phenomena. This
is the view the philosopher Gilbert Ryle called “the dogma of the Ghost in
the Machine.”1 But modern scientists and philosophers who have rejected
dualism haven’t necessarily replaced it with a better predictive model of how
the mind works. Practically speaking, our purposes and desires still function
like free-floating ghosts, like a magisterium cut off from the rest of our scientific
knowledge. We can talk about “rationality” and “bias” and “how to change
our minds,” but if those ideas are still imprecise and unconstrained by any
overarching theory, our scientific-sounding language won’t protect us from
making the same kinds of mistakes as those whose theoretical posits include
spirits and essences.
Interestingly, the mystery and mystification surrounding minds doesn’t just
obscure our view of humans. It also accrues to systems that seem mind-like or
purposeful in evolutionary biology and artificial intelligence (AI). Perhaps, if
we cannot readily glean what we are from looking at ourselves, we can learn
more by using obviously inhuman processes as a mirror.
There are many ghosts to learn from here—ghosts past, and present, and
yet to come. And these illusions are real cognitive events, real phenomena that
we can study and explain. If there appears to be a ghost in the machine, that
appearance is itself the hidden work of a machine.
The first sequence of The Machine in the Ghost, “The Simple Math of
Evolution,” aims to communicate the dissonance and divergence between
our hereditary history, our present-day biology, and our ultimate aspirations.
This will require digging deeper than is common in introductions to evolution
for non-biologists, which often restrict their attention to surface-level features
of natural selection.
The third sequence, “A Human’s Guide to Words,” discusses the basic
relationship between cognition and concept formation. This is followed by a
longer essay introducing Bayesian inference.
Bridging the gap between these topics, “Fragile Purposes” abstracts from
human cognition and evolution to the idea of minds and goal-directed systems
at their most general. These essays serve the secondary purpose of explaining
the author’s general approach to philosophy and the science of rationality,
which is strongly informed by his work in AI.

Rebuilding Intelligence
Yudkowsky is a decision theorist and mathematician who works on founda-
tional issues in Artificial General Intelligence (AGI), the theoretical study of
domain-general problem-solving systems. Yudkowsky’s work in AI has been a
major driving force behind his exploration of the psychology of human ratio-
nality, as he noted in his very first blog post on Overcoming Bias, The Martial
Art of Rationality:

Such understanding as I have of rationality, I acquired in the
course of wrestling with the challenge of Artificial General Intel-
ligence (an endeavor which, to actually succeed, would require
sufficient mastery of rationality to build a complete working ra-
tionalist out of toothpicks and rubber bands). In most ways the
AI problem is enormously more demanding than the personal art
of rationality, but in some ways it is actually easier. In the martial
art of mind, we need to acquire the real-time procedural skill of
pulling the right levers at the right time on a large, pre-existing
thinking machine whose innards are not end-user-modifiable.
Some of the machinery is optimized for evolutionary selection
pressures that run directly counter to our declared goals in using
it. Deliberately we decide that we want to seek only the truth; but
our brains have hardwired support for rationalizing falsehoods.
[...]
Trying to synthesize a personal art of rationality, using the sci-
ence of rationality, may prove awkward: One imagines trying to
invent a martial art using an abstract theory of physics, game the-
ory, and human anatomy. But humans are not reflectively blind;
we do have a native instinct for introspection. The inner eye is
not sightless; but it sees blurrily, with systematic distortions. We
need, then, to apply the science to our intuitions, to use the ab-
stract knowledge to correct our mental movements and augment
our metacognitive skills. We are not writing a computer program
to make a string puppet execute martial arts forms; it is our own
mental limbs that we must move. Therefore we must connect the-
ory to practice. We must come to see what the science means, for
ourselves, for our daily inner life.

From Yudkowsky’s perspective, I gather, talking about human rationality with-
out saying anything interesting about AI is about as difficult as talking about
AI without saying anything interesting about rationality.
In the long run, Yudkowsky predicts that AI will come to surpass humans
in an “intelligence explosion,” a scenario in which self-modifying AI improves
its own ability to productively redesign itself, kicking off a rapid succession of
further self-improvements. The term “technological singularity” is sometimes
used in place of “intelligence explosion”; until January 2013, MIRI was named
“the Singularity Institute for Artificial Intelligence” and hosted an annual Sin-
gularity Summit. Since then, Yudkowsky has come to favor I.J. Good’s older
term, “intelligence explosion,” to help distinguish his views from other futurist
predictions, such as Ray Kurzweil’s exponential technological progress thesis.2
Technologies like smarter-than-human AI seem likely to result in large
societal upheavals, for the better or for the worse. Yudkowsky coined the
term “Friendly AI theory” to refer to research into techniques for aligning an
AGI’s preferences with the preferences of humans. At this point, very little is
known about when generally intelligent software might be invented, or what
safety approaches would work well in such cases. Present-day autonomous AI
can already be quite challenging to verify and validate with much confidence,
and many current techniques are not likely to generalize to more intelligent
and adaptive systems. “Friendly AI” is therefore closer to a menagerie of
basic mathematical and philosophical questions than to a well-specified set of
programming objectives.
As of 2015, Yudkowsky’s views on the future of AI continue to be debated by
technology forecasters and AI researchers in industry and academia, who have
yet to converge on a consensus position. Nick Bostrom’s book Superintelligence
provides a big-picture summary of the many moral and strategic questions
raised by smarter-than-human AI.3
For a general introduction to the field of AI, the most widely used textbook
is Russell and Norvig’s Artificial Intelligence: A Modern Approach.4 In a chapter
discussing the moral and philosophical questions raised by AI, Russell and
Norvig note the technical difficulty of specifying good behavior in strongly
adaptive AI:

[Yudkowsky] asserts that friendliness (a desire not to harm hu-
mans) should be designed in from the start, but that the designers
should recognize both that their own designs may be flawed, and
that the robot will learn and evolve over time. Thus the challenge
is one of mechanism design—to define a mechanism for evolv-
ing AI systems under a system of checks and balances, and to
give the systems utility functions that will remain friendly in the
face of such changes. We can’t just give a program a static utility
function, because circumstances, and our desired responses to
circumstances, change over time.

Disturbed by the possibility that future progress in AI, nanotechnology, biotech-
nology, and other fields could endanger human civilization, Bostrom and
Ćirković compiled the first academic anthology on the topic, Global Catas-
trophic Risks.5 The most extreme of these are the existential risks, risks that
could result in the permanent stagnation or extinction of humanity.6
People (experts included) tend to be extraordinarily bad at forecasting major
future events (new technologies included). Part of Yudkowsky’s goal in dis-
cussing rationality is to figure out which biases are interfering with our ability
to predict and prepare for big upheavals well in advance. Yudkowsky’s contri-
butions to the Global Catastrophic Risks volume, “Cognitive biases potentially
affecting judgement of global risks” and “Artificial intelligence as a positive
and negative factor in global risk,” tie together his research in cognitive sci-
ence and AI. Yudkowsky and Bostrom summarize near-term concerns along
with long-term ones in a chapter of the Cambridge Handbook of Artificial
Intelligence, “The ethics of artificial intelligence.”7
Though this is a book about human rationality, the topic of AI has rele-
vance as a source of simple illustrations of aspects of human cognition. Long-
term technology forecasting is also one of the more important applications
of Bayesian rationality, which can model correct reasoning even in domains
where the data is scarce or equivocal.
Knowing the design can tell you much about the designer; and knowing
the designer can tell you much about the design.
We’ll begin, then, by inquiring into what our own designer can teach us
about ourselves.

1. Gilbert Ryle, The Concept of Mind (University of Chicago Press, 1949).

2. Irving John Good, “Speculations Concerning the First Ultraintelligent Machine,” in Advances in
Computers, ed. Franz L. Alt and Morris Rubinoff, vol. 6 (New York: Academic Press, 1965), 31–88,
doi:10.1016/S0065-2458(08)60418-0.
3. Nick Bostrom, Superintelligence: Paths, Dangers, Strategies (Oxford University Press, 2014).

4. Stuart J. Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd ed. (Upper Saddle
River, NJ: Prentice-Hall, 2010).

5. Bostrom and Ćirković, Global Catastrophic Risks.

6. An example of a possible existential risk is the “grey goo” scenario, in which molecular robots
designed to efficiently self-replicate do their job too well, rapidly outcompeting living organisms
as they consume the Earth’s available matter.

7. Nick Bostrom and Eliezer Yudkowsky, “The Ethics of Artificial Intelligence,” in The Cambridge
Handbook of Artificial Intelligence, ed. Keith Frankish and William Ramsey (New York: Cambridge
University Press, 2014).
Interlude
The Power of Intelligence

In our skulls we carry around three pounds of slimy, wet, grayish tissue, corru-
gated like crumpled toilet paper.
You wouldn’t think, to look at the unappetizing lump, that it was some of
the most powerful stuff in the known universe. If you’d never seen an anatomy
textbook, and you saw a brain lying in the street, you’d say “Yuck!” and try not
to get any of it on your shoes. Aristotle thought the brain was an organ that
cooled the blood. It doesn’t look dangerous.
Five million years ago, the ancestors of lions ruled the day, the ancestors
of wolves roamed the night. The ruling predators were armed with teeth and
claws—sharp, hard cutting edges, backed up by powerful muscles. Their prey,
in self-defense, evolved armored shells, sharp horns, toxic venoms, camouflage.
The war had gone on through hundreds of eons and countless arms races.
Many a loser had been removed from the game, but there was no sign of a
winner. Where one species had shells, another species would evolve to crack
them; where one species became poisonous, another would evolve to tolerate
the poison. Each species had its private niche—for who could live in the seas
and the skies and the land at once? There was no ultimate weapon and no
ultimate defense and no reason to believe any such thing was possible.
Then came the Day of the Squishy Things.
They had no armor. They had no claws. They had no venoms.
If you saw a movie of a nuclear explosion going off, and you were told an
Earthly life form had done it, you would never in your wildest dreams imagine
that the Squishy Things could be responsible. After all, Squishy Things aren’t
radioactive.
In the beginning, the Squishy Things had no fighter jets, no machine guns,
no rifles, no swords. No bronze, no iron. No hammers, no anvils, no tongs,
no smithies, no mines. All the Squishy Things had were squishy fingers—too
weak to break a tree, let alone a mountain. Clearly not dangerous. To cut stone
you would need steel, and the Squishy Things couldn’t excrete steel. In the
environment there were no steel blades for Squishy fingers to pick up. Their
bodies could not generate temperatures anywhere near hot enough to melt
metal. The whole scenario was obviously absurd.
And as for the Squishy Things manipulating DNA—that would have been
beyond ridiculous. Squishy fingers are not that small. There is no access to
DNA from the Squishy level; it would be like trying to pick up a hydrogen
atom. Oh, technically it’s all one universe, technically the Squishy Things and
DNA are part of the same world, the same unified laws of physics, the same
great web of causality. But let’s be realistic: you can’t get there from here.
Even if Squishy Things could someday evolve to do any of those feats, it
would take thousands of millennia. We have watched the ebb and flow of Life
through the eons, and let us tell you, a year is not even a single clock tick of
evolutionary time. Oh, sure, technically a year is six hundred trillion trillion
trillion trillion Planck intervals. But nothing ever happens in less than six
hundred million trillion trillion trillion trillion Planck intervals, so it’s a moot
point. The Squishy Things, as they run across the savanna now, will not fly
across continents for at least another ten million years; no one could have that
much sex.
Now explain to me again why an Artificial Intelligence can’t do anything
interesting over the Internet unless a human programmer builds it a robot
body.
I have observed that someone’s flinch-reaction to “intelligence”—the
thought that crosses their mind in the first half-second after they hear the
word “intelligence”—often determines their flinch-reaction to the notion of
an intelligence explosion. Often they look up the keyword “intelligence” and
retrieve the concept booksmarts—a mental image of the Grand Master chess
player who can’t get a date, or a college professor who can’t survive outside
academia.
“It takes more than intelligence to succeed professionally,” people say, as
if charisma resided in the kidneys, rather than the brain. “Intelligence is no
match for a gun,” they say, as if guns had grown on trees. “Where will an
Artificial Intelligence get money?” they ask, as if the first Homo sapiens had
found dollar bills fluttering down from the sky, and used them at convenience
stores already in the forest. The human species was not born into a market
economy. Bees won’t sell you honey if you offer them an electronic funds
transfer. The human species imagined money into existence, and it exists—for
us, not mice or wasps—because we go on believing in it.
I keep trying to explain to people that the archetype of intelligence is not
Dustin Hoffman in Rain Man. It is a human being, period. It is squishy things
that explode in a vacuum, leaving footprints on their moon. Within that gray
wet lump is the power to search paths through the great web of causality, and
find a road to the seemingly impossible—the power sometimes called creativity.
People—venture capitalists in particular—sometimes ask how, if the Ma-
chine Intelligence Research Institute successfully builds a true AI, the results
will be commercialized. This is what we call a framing problem.
Or maybe it’s something deeper than a simple clash of assumptions. With
a bit of creative thinking, people can imagine how they would go about trav-
elling to the Moon, or curing smallpox, or manufacturing computers. To
imagine a trick that could accomplish all these things at once seems downright
impossible—even though such a power resides only a few centimeters behind
their own eyes. The gray wet thing still seems mysterious to the gray wet thing.
And so, because people can’t quite see how it would all work, the power of
intelligence seems less real; harder to imagine than a tower of fire sending a
ship to Mars. The prospect of visiting Mars captures the imagination. But if
one should promise a Mars visit, and also a grand unified theory of physics,
and a proof of the Riemann Hypothesis, and a cure for obesity, and a cure
for cancer, and a cure for aging, and a cure for stupidity—well, it just sounds
wrong, that’s all.
And well it should. It’s a serious failure of imagination to think that intelli-
gence is good for so little. Who could have imagined, ever so long ago, what
minds would someday do? We may not even know what our real problems are.
But meanwhile, because it’s hard to see how one process could have such
diverse powers, it’s hard to imagine that one fell swoop could solve even such
prosaic problems as obesity and cancer and aging.
Well, one trick cured smallpox and built airplanes and cultivated wheat
and tamed fire. Our current science may not agree yet on how exactly the
trick works, but it works anyway. If you are temporarily ignorant about a
phenomenon, that is a fact about your current state of mind, not a fact about
the phenomenon. A blank map does not correspond to a blank territory. If
one does not quite understand that power which put footprints on the Moon,
nonetheless, the footprints are still there—real footprints, on a real Moon, put
there by a real power. If one were to understand deeply enough, one could
create and shape that power. Intelligence is as real as electricity. It’s merely
far more powerful, far more dangerous, has far deeper implications for the
unfolding story of life in the universe—and it’s a tiny little bit harder to figure
out how to build a generator.

*
Part L

The Simple Math of Evolution


131
An Alien God

“A curious aspect of the theory of evolution,” said Jacques Monod, “is that
everybody thinks he understands it.”
A human being, looking at the natural world, sees a thousand times pur-
pose. A rabbit’s legs, built and articulated for running; a fox’s jaws, built and
articulated for tearing. But what you see is not exactly what is there . . .
In the days before Darwin, the cause of all this apparent purposefulness
was a very great puzzle unto science. The Goddists said “God did it,” because
you get 50 bonus points each time you use the word “God” in a sentence. Yet
perhaps I’m being unfair. In the days before Darwin, it seemed like a much
more reasonable hypothesis. Find a watch in the desert, said William Paley,
and you can infer the existence of a watchmaker.
But when you look at all the apparent purposefulness in Nature, rather than
picking and choosing your examples, you start to notice things that don’t fit
the Judeo-Christian concept of one benevolent God. Foxes seem well-designed
to catch rabbits. Rabbits seem well-designed to evade foxes. Was the Creator
having trouble making up Its mind?
When I design a toaster oven, I don’t design one part that tries to get
electricity to the coils and a second part that tries to prevent electricity from
getting to the coils. It would be a waste of effort. Who designed the ecosystem,
with its predators and prey, viruses and bacteria? Even the cactus plant, which
you might think well-designed to provide water and fruit to desert animals, is
covered with inconvenient spines.
The ecosystem would make much more sense if it wasn’t designed by a
unitary Who, but, rather, created by a horde of deities—say from the Hindu or
Shinto religions. This handily explains both the ubiquitous purposefulnesses,
and the ubiquitous conflicts: More than one deity acted, often at cross-purposes.
The fox and rabbit were both designed, but by distinct competing deities. I
wonder if anyone ever remarked on the seemingly excellent evidence thus
provided for Hinduism over Christianity. Probably not.
Similarly, the Judeo-Christian God is alleged to be benevolent—well, sort
of. And yet much of nature’s purposefulness seems downright cruel. Darwin
suspected a non-standard Creator for studying Ichneumon wasps, whose para-
lyzing stings preserve their prey to be eaten alive by their larvae: “I cannot persuade
myself,” wrote Darwin, “that a beneficent and omnipotent God would have de-
signedly created the Ichneumonidae with the express intention of their feeding
within the living bodies of Caterpillars, or that a cat should play with mice.”1 I
wonder if any earlier thinker remarked on the excellent evidence thus provided
for Manichaean religions over monotheistic ones.
By now we all know the punchline: you just say “evolution.”
I worry that’s how some people are absorbing the “scientific” explanation,
as a magical purposefulness factory in Nature. I’ve previously discussed the
case of Storm from the movie X-Men, who in one mutation gets the ability
to throw lightning bolts. Why? Well, there’s this thing called “evolution”
that somehow pumps a lot of purposefulness into Nature, and the changes
happen through “mutations.” So if Storm gets a really large mutation, she can
be redesigned to throw lightning bolts. Radioactivity is a popular super origin:
radiation causes mutations, so more powerful radiation causes more powerful
mutations. That’s logic.
But evolution doesn’t allow just any kind of purposefulness to leak into
Nature. That’s what makes evolution a success as an empirical hypothesis. If
evolutionary biology could explain a toaster oven, not just a tree, it would be
worthless. There’s a lot more to evolutionary theory than pointing at Nature
and saying, “Now purpose is allowed,” or “Evolution did it!” The strength of a
theory is not what it allows, but what it prohibits; if you can invent an equally
persuasive explanation for any outcome, you have zero knowledge.
“Many non-biologists,” observed George Williams, “think that it is for their
benefit that rattles grow on rattlesnake tails.”2 Bzzzt! This kind of purposeful-
ness is not allowed. Evolution doesn’t work by letting flashes of purposefulness
creep in at random—reshaping one species for the benefit of a random recipi-
ent.
Evolution is powered by a systematic correlation between the different
ways that different genes construct organisms, and how many copies of those
genes make it into the next generation. For rattles to grow on rattlesnake tails,
rattle-growing genes must become more and more frequent in each successive
generation. (Actually genes for incrementally more complex rattles, but if I
start describing all the fillips and caveats to evolutionary biology, we really will
be here all day.)
There isn’t an Evolution Fairy that looks over the current state of Nature,
decides what would be a “good idea,” and chooses to increase the frequency of
rattle-constructing genes.
I suspect this is where a lot of people get stuck, in evolutionary biology.
They understand that “helpful” genes become more common, but “helpful”
lets any sort of purpose leak in. They don’t think there’s an Evolution Fairy,
yet they ask which genes will be “helpful” as if a rattlesnake gene could “help”
non-rattlesnakes.
The key realization is that there is no Evolution Fairy. There’s no outside
force deciding which genes ought to be promoted. Whatever happens, happens
because of the genes themselves.
Genes for constructing (incrementally better) rattles must have somehow
ended up more frequent in the rattlesnake gene pool, because of the rattle.
In this case it’s probably because rattlesnakes with better rattles survive more
often—rather than mating more successfully, or having brothers that reproduce
more successfully, etc.
Maybe predators are wary of rattles and don’t step on the snake. Or maybe
the rattle diverts attention from the snake’s head. (As George Williams suggests,
“The outcome of a fight between a dog and a viper would depend very much
on whether the dog initially seized the reptile by the head or by the tail.”)
But that’s just a snake’s rattle. There are much more complicated ways that a
gene can cause copies of itself to become more frequent in the next generation.
Your brother or sister shares half your genes. A gene that sacrifices one unit of
resources to bestow three units of resource on a brother, may promote some
copies of itself by sacrificing one of its constructed organisms. (If you really
want to know all the details and caveats, buy a book on evolutionary biology;
there is no royal road.)
The main point is that the gene’s effect must cause copies of that gene to
become more frequent in the next generation. There’s no Evolution Fairy that
reaches in from outside. There’s nothing which decides that some genes are
“helpful” and should, therefore, increase in frequency. It’s just cause and effect,
starting from the genes themselves.
This explains the strange conflicting purposefulness of Nature, and its
frequent cruelty. It explains even better than a horde of Shinto deities.
Why is so much of Nature at war with other parts of Nature? Because there
isn’t one Evolution directing the whole process. There’s as many different
“evolutions” as reproducing populations. Rabbit genes are becoming more
or less frequent in rabbit populations. Fox genes are becoming more or less
frequent in fox populations. Fox genes which construct foxes that catch rabbits,
insert more copies of themselves in the next generation. Rabbit genes which
construct rabbits that evade foxes are naturally more common in the next
generation of rabbits. Hence the phrase “natural selection.”
Why is Nature cruel? You, a human, can look at an Ichneumon wasp,
and decide that it’s cruel to eat your prey alive. You can decide that if you’re
going to eat your prey alive, you can at least have the decency to stop it from
hurting. It would scarcely cost the wasp anything to anesthetize its prey as well
as paralyze it. Or what about old elephants, who die of starvation when their
last set of teeth fall out? These elephants aren’t going to reproduce anyway.
What would it cost evolution—the evolution of elephants, rather—to ensure
that the elephant dies right away, instead of slowly and in agony? What would
it cost evolution to anesthetize the elephant, or give it pleasant dreams before
it dies? Nothing; that elephant won’t reproduce more or less either way.
If you were talking to a fellow human, trying to resolve a conflict of interest,
you would be in a good negotiating position—would have an easy job of
persuasion. It would cost so little to anesthetize the prey, to let the elephant
die without agony! Oh please, won’t you do it, kindly . . . um . . .
There’s no one to argue with.
Human beings fake their justifications, figure out what they want using one
method, and then justify it using another method. There’s no Evolution of
Elephants Fairy that’s trying to (a) figure out what’s best for elephants, and then
(b) figure out how to justify it to the Evolutionary Overseer, who (c) doesn’t
want to see reproductive fitness decreased, but is (d) willing to go along with
the painless-death idea, so long as it doesn’t actually harm any genes.
There’s no advocate for the elephants anywhere in the system.
Humans, who are often deeply concerned for the well-being of animals,
can be very persuasive in arguing how various kindnesses wouldn’t harm
reproductive fitness at all. Sadly, the evolution of elephants doesn’t use a
similar algorithm; it doesn’t select nice genes that can plausibly be argued to
help reproductive fitness. Simply: genes that replicate more often become
more frequent in the next generation. Like water flowing downhill, and equally
benevolent.
A human, looking over Nature, starts thinking of all the ways we would de-
sign organisms. And then we tend to start rationalizing reasons why our design
improvements would increase reproductive fitness—a political instinct, trying
to sell your own preferred option as matching the boss’s favored justification.
And so, amateur evolutionary biologists end up making all sorts of won-
derful and completely mistaken predictions. Because the amateur biologists are
drawing their bottom line—and more importantly, locating their prediction
in hypothesis-space—using a different algorithm than evolutions use to draw
their bottom lines.
A human engineer would have designed human taste buds to measure how
much of each nutrient we had, and how much we needed. When fat was scarce,
almonds or cheeseburgers would taste delicious. But if you started to become
obese, or if vitamins were lacking, lettuce would taste delicious. But there is no
Evolution of Humans Fairy, which intelligently planned ahead and designed a
general system for every contingency. It was a reliable invariant of humans’
ancestral environment that calories were scarce. So genes whose organisms
loved calories, became more frequent. Like water flowing downhill.
We are simply the embodied history of which organisms did in fact survive
and reproduce, not which organisms ought prudentially to have survived and
reproduced.
The human retina is constructed backward: The light-sensitive cells are at
the back, and the nerves emerge from the front and go back through the retina
into the brain. Hence the blind spot. To a human engineer, this looks simply
stupid—and other organisms have independently evolved retinas the right way
around. Why not redesign the retina?
The problem is that no single mutation will reroute the whole retina simul-
taneously. A human engineer can redesign multiple parts simultaneously, or
plan ahead for future changes. But if a single mutation breaks some vital part
of the organism, it doesn’t matter what wonderful things a Fairy could build
on top of it—the organism dies and the gene decreases in frequency.
If you turn around the retina’s cells without also reprogramming the nerves
and optic cable, the system as a whole won’t work. It doesn’t matter that,
to a Fairy or a human engineer, this is one step forward in redesigning the
retina. The organism is blind. Evolution has no foresight, it is simply the frozen
history of which organisms did in fact reproduce. Evolution is as blind as a
halfway-redesigned retina.
Find a watch in a desert, said William Paley, and you can infer the watch-
maker. There were once those who denied this, who thought that life “just
happened” without need of an optimization process, mice being spontaneously
generated from straw and dirty shirts.
If we ask who was more correct—the theologians who argued for a Creator-
God, or the intellectually unfulfilled atheists who argued that mice sponta-
neously generated—then the theologians must be declared the victors: evo-
lution is not God, but it is closer to God than it is to pure random entropy.
Mutation is random, but selection is non-random. This doesn’t mean an intel-
ligent Fairy is reaching in and selecting. It means there’s a non-zero statistical
correlation between the gene and how often the organism reproduces. Over a
few million years, that non-zero statistical correlation adds up to something
very powerful. It’s not a god, but it’s more closely akin to a god than it is to
snow on a television screen.
In a lot of ways, evolution is like unto theology. “Gods are ontologically
distinct from creatures,” said Damien Broderick, “or they’re not worth the
paper they’re written on.” And indeed, the Shaper of Life is not itself a creature.
Evolution is bodiless, like the Judeo-Christian deity. Omnipresent in Nature,
immanent in the fall of every leaf. Vast as a planet’s surface. Billions of years
old. Itself unmade, arising naturally from the structure of physics. Doesn’t
that all sound like something that might have been said about God?
And yet the Maker has no mind, as well as no body. In some ways, its
handiwork is incredibly poor design by human standards. It is internally
divided. Most of all, it isn’t nice.
In a way, Darwin discovered God—a God that failed to match the precon-
ceptions of theology, and so passed unheralded. If Darwin had discovered that
life was created by an intelligent agent—a bodiless mind that loves us, and will
smite us with lightning if we dare say otherwise—people would have said “My
gosh! That’s God!”
But instead Darwin discovered a strange alien God—not comfortably “in-
effable,” but really genuinely different from us. Evolution is not a God, but if it
were, it wouldn’t be Jehovah. It would be H. P. Lovecraft’s Azathoth, the blind
idiot God burbling chaotically at the center of everything, surrounded by the
thin monotonous piping of flutes.
Which you might have predicted, if you had really looked at Nature.
So much for the claim some religionists make, that they believe in a vague
deity with a correspondingly high probability. Anyone who really believed
in a vague deity, would have recognized their strange inhuman creator when
Darwin said “Aha!”
So much for the claim some religionists make, that they are waiting
innocently curious for Science to discover God. Science has already discov-
ered the sort-of-godlike maker of humans—but it wasn’t what the religionists
wanted to hear. They were waiting for the discovery of their God, the highly spe-
cific God they want to be there. They shall wait forever, for the great discovery
has already taken place, and the winner is Azathoth.
Well, more power to us humans. I like having a Creator I can outwit. Beats
being a pet. I’m glad it was Azathoth and not Odin.

1. Francis Darwin, ed., The Life and Letters of Charles Darwin, vol. 2 (John Murray, 1887).

2. George C. Williams, Adaptation and Natural Selection: A Critique of Some Current Evolutionary
Thought, Princeton Science Library (Princeton, NJ: Princeton University Press, 1966).
132
The Wonder of Evolution

The wonder of evolution is that it works at all.


I mean that literally: If you want to marvel at evolution, that’s what’s
marvel-worthy.
How does optimization first arise in the universe? If an intelligent agent
designed Nature, who designed the intelligent agent? Where is the first design
that has no designer? The puzzle is not how the first stage of the bootstrap can
be super-clever and super-efficient; the puzzle is how it can happen at all.
Evolution resolves the infinite regression, not by being super-clever and
super-efficient, but by being stupid and inefficient and working anyway. This is
the marvel.
For professional reasons, I often have to discuss the slowness, randomness,
and blindness of evolution. Afterward someone says: “You just said that
evolution can’t plan simultaneous changes, and that evolution is very inefficient
because mutations are random. Isn’t that what the creationists say? That you
couldn’t assemble a watch by randomly shaking the parts in a box?”
But the reply to creationists is not that you can assemble a watch by shaking
the parts in a box. The reply is that this is not how evolution works. If you think
that evolution does work by whirlwinds assembling 747s, then the creationists
have successfully misrepresented biology to you; they’ve sold the strawman.
The real answer is that complex machinery evolves either incrementally, or
by adapting previous complex machinery used for a new purpose. Squirrels
jump from treetop to treetop using just their muscles, but the length they can
jump depends to some extent on the aerodynamics of their bodies. So now
there are flying squirrels, so aerodynamic they can glide short distances. If
birds were wiped out, the descendants of flying squirrels might reoccupy that
ecological niche in ten million years, gliding membranes transformed into
wings. And the creationists would say, “What good is half a wing? You’d just fall
down and splat. How could squirrelbirds possibly have evolved incrementally?”
That’s how one complex adaptation can jump-start a new complex adapta-
tion. Complexity can also accrete incrementally, starting from a single muta-
tion.
First comes some gene A which is simple, but at least a little useful on its
own, so that A increases to universality in the gene pool. Now along comes
gene B, which is only useful in the presence of A, but A is reliably present
in the gene pool, so there’s a reliable selection pressure in favor of B. Now
a modified version of A∗ arises, which depends on B, but doesn’t break B’s
dependency on A/A∗. Then along comes C, which depends on A∗ and B,
and B ∗, which depends on A∗ and C. Soon you’ve got “irreducibly complex”
machinery that breaks if you take out any single piece.
And yet you can still visualize the trail backward to that single piece: you
can, without breaking the whole machine, make one piece less dependent on
another piece, and do this a few times, until you can take out one whole piece
without breaking the machine, and so on until you’ve turned a ticking watch
back into a crude sundial.
Here’s an example: DNA stores information very nicely, in a durable format
that allows for exact duplication. A ribosome turns that stored information
into a sequence of amino acids, a protein, which folds up into a variety of
chemically active shapes. The combined system, DNA and ribosome, can build
all sorts of protein machinery. But what good is DNA, without a ribosome
that turns DNA information into proteins? What good is a ribosome, without
DNA to tell it which proteins to make?
Organisms don’t always leave fossils, and evolutionary biology can’t always
figure out the incremental pathway. But in this case we do know how it hap-
pened. RNA shares with DNA the property of being able to carry information
and replicate itself, although RNA is less durable and copies less accurately.
And RNA also shares the ability of proteins to fold up into chemically active
shapes, though it’s not as versatile as the amino acid chains of proteins. Al-
most certainly, RNA is the single A which predates the mutually dependent
A∗ and B.
It’s just as important to note that RNA does the combined job of DNA and
proteins poorly, as that it does the combined job at all. It’s amazing enough
that a single molecule can both store information and manipulate chemistry.
For it to do the job well would be a wholly unnecessary miracle.
What was the very first replicator ever to exist? It may well have been an
RNA strand, because by some strange coincidence, the chemical ingredients of
RNA are chemicals that would have arisen naturally on the prebiotic Earth
of 4 billion years ago. Please note: evolution does not explain the origin of
life; evolutionary biology is not supposed to explain the first replicator, because
the first replicator does not come from another replicator. Evolution describes
statistical trends in replication. The first replicator wasn’t a statistical trend, it
was a pure accident. The notion that evolution should explain the origin of life
is a pure strawman—more creationist misrepresentation.
If you’d been watching the primordial soup on the day of the first replicator,
the day that reshaped the Earth, you would not have been impressed by how
well the first replicator replicated. The first replicator probably copied itself
like a drunken monkey on LSD. It would have exhibited none of the signs of
careful fine-tuning embodied in modern replicators, because the first replicator
was an accident. It was not needful for that single strand of RNA, or chemical
hypercycle, or pattern in clay, to replicate gracefully. It just had to happen at all.
Even so, it was probably very improbable, considered in an isolated event—but
it only had to happen once, and there were a lot of tide pools. A few billions of
years later, the replicators are walking on the Moon.
The first accidental replicator was the most important molecule in the
history of time. But if you praised it too highly, attributing to it all sorts of
wonderful replication-aiding capabilities, you would be missing the whole point.
Don’t think that, in the political battle between evolutionists and creation-
ists, whoever praises evolution must be on the side of science. Science has
a very exact idea of the capabilities of evolution. If you praise evolution one
millimeter higher than this, you’re not “fighting on evolution’s side” against
creationism. You’re being scientifically inaccurate, full stop. You’re falling into
a creationist trap by insisting that, yes, a whirlwind does have the power to as-
semble a 747! Isn’t that amazing! How wonderfully intelligent is evolution,
how praiseworthy! Look at me, I’m pledging my allegiance to science! The
more nice things I say about evolution, the more I must be on evolution’s side
against the creationists!
But to praise evolution too highly destroys the real wonder, which is not how
well evolution designs things, but that a naturally occurring process manages
to design anything at all.
So let us dispose of the idea that evolution is a wonderful designer, or a
wonderful conductor of species destinies, which we human beings ought to
imitate. For human intelligence to imitate evolution as a designer, would be
like a sophisticated modern bacterium trying to imitate the first replicator as a
biochemist. As T. H. Huxley, “Darwin’s Bulldog,” put it:1

Let us understand, once and for all, that the ethical progress of
society depends, not on imitating the cosmic process, still less in
running away from it, but in combating it.

Huxley didn’t say that because he disbelieved in evolution, but because he
understood it all too well.

1. Thomas Henry Huxley, Evolution and Ethics and Other Essays (Macmillan, 1894).
133
Evolutions Are Stupid (But Work
Anyway)

In the previous essay, I wrote:

Science has a very exact idea of the capabilities of evolution. If
you praise evolution one millimeter higher than this, you’re not
“fighting on evolution’s side” against creationism. You’re being
scientifically inaccurate, full stop.

In this essay I describe some well-known inefficiencies and limitations of evolu-
tions. I say “evolutions,” plural, because fox evolution works at cross-purposes
to rabbit evolution, and neither can talk to snake evolution to learn how to
build venomous fangs.
So I am talking about limitations of evolution here, but this does not mean
I am trying to sneak in creationism. This is standard Evolutionary Biology
201. (583 if you must derive the equations.) Evolutions, thus limited, can still
explain observed biology; in fact the limitations are necessary to make sense
of it. Remember that the wonder of evolutions is not how well they work, but
that they work at all.
Human intelligence is so complicated that no one has any good way to
calculate how efficient it is. Natural selection, though not simple, is simpler
than a human brain; and correspondingly slower and less efficient, as befits the
first optimization process ever to exist. In fact, evolutions are simple enough
that we can calculate exactly how stupid they are.
Evolutions are slow. How slow? Suppose there’s a beneficial mutation that
conveys a fitness advantage of 3%: on average, bearers of this gene have 1.03
times as many children as non-bearers. Assuming that the mutation spreads at
all, how long will it take to spread through the whole population? That depends
on the population size. A gene conveying a 3% fitness advantage, spreading
through a population of 100,000, would require an average of 768 generations
to reach universality in the gene pool. A population of 500,000 would require
875 generations. The general formula is

Generations to fixation = 2 ln(N)/s ,

where N is the population size and (1 + s) is the fitness. (If each bearer of the
gene has 1.03 times as many children as a non-bearer, s = 0.03.)
Thus, if the population size were 1,000,000—the estimated population in
hunter-gatherer times—then it would require 2,763 generations for a gene
conveying a 1% advantage to spread through the gene pool.1
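
For readers who want to check the arithmetic, here is a minimal Python sketch of the formula above; it uses only the population sizes and selection coefficients already quoted in this paragraph.

```python
import math

def generations_to_fixation(population_size: int, s: float) -> float:
    """The formula above: average generations for a beneficial allele with
    fitness advantage s to reach fixation in a population of the given size."""
    return 2 * math.log(population_size) / s

# The worked examples from this essay:
print(round(generations_to_fixation(100_000, 0.03)))    # 768
print(round(generations_to_fixation(500_000, 0.03)))    # 875
print(round(generations_to_fixation(1_000_000, 0.01)))  # 2763
```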
This should not be surprising; genes have to do all their own work of spread-
ing. There’s no Evolution Fairy who can watch the gene pool and say, “Hm,
that gene seems to be spreading rapidly—I should distribute it to everyone.”
In a human market economy, someone who is legitimately getting 20% re-
turns on investment—especially if there’s an obvious, clear mechanism behind
it—can rapidly acquire more capital from other investors; and others will start
duplicate enterprises. Genes have to spread without stock markets or banks or
imitators—as if Henry Ford had to make one car, sell it, buy the parts for 1.01
more cars (on average), sell those cars, and keep doing this until he was up to
a million cars.
All this assumes that the gene spreads in the first place. Here the equation
is simpler and ends up not depending at all on population size:

Probability of fixation = 2s .
A mutation conveying a 3% advantage (which is pretty darned large, as muta-
tions go) has a 6% chance of spreading, at least on that occasion.2 Mutations
can happen more than once, but in a population of a million with a copying
fidelity of 10⁻⁸ errors per base per generation, you may have to wait a hun-
dred generations for another chance, and then it still has only a 6% chance of
fixating.
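
The same numbers can be checked in a few lines. The recurrence-time figure below is a back-of-the-envelope estimate from the population size and copying fidelity quoted above, not a full population-genetics model.

```python
def fixation_probability(s: float) -> float:
    # Haldane's approximation quoted above: a single new copy of a mutation
    # with fitness advantage s fixates with probability about 2s.
    return 2 * s

print(fixation_probability(0.03))   # 0.06 -> a 6% chance on any one occasion

# Rough expected wait for the same mutation to arise again at one site:
# with a million individuals and a copying fidelity of 1e-8 errors per base
# per generation, about 0.01 new copies appear per generation on average.
population, error_rate = 1_000_000, 1e-8
print(1 / (population * error_rate))   # ~100 generations between chances
```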
Still, in the long run, an evolution has a good shot at getting there eventually.
(This is going to be a running theme.)
Complex adaptations take a very long time to evolve. First comes allele A,
which is advantageous of itself, and requires a thousand generations to fixate
in the gene pool. Only then can another allele B, which depends on A, begin
rising to fixation. A fur coat is not a strong advantage unless the environment
has a statistically reliable tendency to throw cold weather at you. Well, genes
form part of the environment of other genes, and if B depends on A, then
B will not have a strong advantage unless A is reliably present in the genetic
environment.
Let’s say that B confers a 5% advantage in the presence of A, no advantage
otherwise. Then while A is still at 1% frequency in the population, B only
confers its advantage 1 out of 100 times, so the average fitness advantage of B is
0.05%, and B’s probability of fixation is 0.1%. With a complex adaptation, first
A has to evolve over a thousand generations, then B has to evolve over another
thousand generations, then A∗ evolves over another thousand generations . . .
and several million years later, you’ve got a new complex adaptation.
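
To make the dependent-allele arithmetic explicit, here is a sketch using the same 5%-advantage, 1%-frequency numbers as the paragraph above:

```python
def average_advantage(s_given_a: float, freq_a: float) -> float:
    # Allele B only pays off when A is present, so its average advantage is
    # its conditional advantage weighted by A's current frequency.
    return s_given_a * freq_a

s_b = average_advantage(0.05, 0.01)   # 5% advantage, but A is at 1% frequency
print(s_b)        # 0.0005 -> an average advantage of 0.05%
print(2 * s_b)    # 0.001  -> roughly a 0.1% chance of fixation
```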
Then other evolutions don’t imitate it. If snake evolution develops an
amazing new venom, it doesn’t help fox evolution or lion evolution.
Contrast all this to a human programmer, who can design a new complex
mechanism with a hundred interdependent parts over the course of a single
afternoon. How is this even possible? I don’t know all the answer, and my
guess is that neither does science; human brains are much more complicated
than evolutions. I could wave my hands and say something like “goal-directed
backward chaining using combinatorial modular representations,” but you
would not thereby be enabled to design your own human. Still: Humans can
foresightfully design new parts in anticipation of later designing other new
parts; produce coordinated simultaneous changes in interdependent machin-
ery; learn by observing single test cases; zero in on problem spots and think
abstractly about how to solve them; and prioritize which tweaks are worth try-
ing, rather than waiting for a cosmic ray strike to produce a good one. By the
standards of natural selection, this is simply magic.
Humans can do things that evolutions probably can’t do period over the
expected lifetime of the universe. As the eminent biologist Cynthia Kenyon
once put it at a dinner I had the honor of attending, “One grad student can do
things in an hour that evolution could not do in a billion years.” According to
biologists’ best current knowledge, evolutions have invented a fully rotating
wheel on a grand total of three occasions.
And don’t forget the part where the programmer posts the code snippet to
the Internet.
Yes, some evolutionary handiwork is impressive even by comparison to the
best technology of Homo sapiens. But our Cambrian explosion only started,
we only really began accumulating knowledge, around . . . what, four hundred
years ago? In some ways, biology still excels over the best human technology:
we can’t build a self-replicating system the size of a butterfly. In other ways,
human technology leaves biology in the dust. We got wheels, we got steel, we
got guns, we got knives, we got pointy sticks; we got rockets, we got transistors,
we got nuclear power plants. With every passing decade, that balance tips
further.
So, once again: for a human to look to natural selection as inspiration on the
art of design is like a sophisticated modern bacterium trying to imitate the first
awkward replicator’s biochemistry. The first replicator would be eaten instantly
if it popped up in today’s competitive ecology. The same fate would accrue
to any human planner who tried making random point mutations to their
strategies and waiting 768 iterations of testing to adopt a 3% improvement.
Don’t praise evolutions one millimeter more than they deserve.
Coming up next: More exciting mathematical bounds on evolution!

1. Dan Graur and Wen-Hsiung Li, Fundamentals of Molecular Evolution, 2nd ed. (Sunderland, MA:
Sinauer Associates, 2000).
2. John B. S. Haldane, “A Mathematical Theory of Natural and Artificial Selection,” Math-
ematical Proceedings of the Cambridge Philosophical Society 23 (5 1927): 607–615,
doi:10.1017/S0305004100011750.
134
No Evolutions for Corporations or
Nanodevices

The laws of physics and the rules of math don’t cease to apply.
That leads me to believe that evolution doesn’t stop. That further
leads me to believe that nature—bloody in tooth and claw, as
some have termed it—will simply be taken to the next level . . .
[Getting rid of Darwinian evolution is] like trying to get rid
of gravitation. So long as there are limited resources and multiple
competing actors capable of passing on characteristics, you have
selection pressure.
—Perry Metzger, predicting that the reign of natural selection
would continue into the indefinite future

In evolutionary biology, as in many other fields, it is important to think quan-
titatively rather than qualitatively. Does a beneficial mutation “sometimes
spread, but not always”? Well, a psychic power would be a beneficial muta-
tion, so you’d expect it to spread, right? Yet this is qualitative reasoning, not
quantitative—if X is true, then Y is true; if psychic powers are beneficial, they
may spread. In Evolutions Are Stupid, I described the equations for a bene-
ficial mutation’s probability of fixation, roughly twice the fitness advantage
(6% for a 3% advantage). Only this kind of numerical thinking is likely to
make us realize that mutations which are only rarely useful are extremely un-
likely to spread, and that it is practically impossible for complex adaptations
to arise without constant use. If psychic powers really existed, we should ex-
pect to see everyone using them all the time—not just because they would be
so amazingly useful, but because otherwise they couldn’t have evolved in the
first place.
“So long as there are limited resources and multiple competing actors
capable of passing on characteristics, you have selection pressure.” This is
qualitative reasoning. How much selection pressure?
While there are several candidates for the most important equation in
evolutionary biology, I would pick Price’s Equation, which in its simplest
formulation reads:
∆z̄ = cov(vᵢ, zᵢ) ;

change in average characteristic = covariance(relative fitness, characteristic) .

This is a very powerful and general formula. For example, a particular gene
for height can be the Z, the characteristic that changes, in which case Price’s
Equation says that the change in the probability of possessing this gene equals
the covariance of the gene with reproductive fitness. Or you can consider
height in general as the characteristic Z, apart from any particular genes, and
Price’s Equation says that the change in height in the next generation will equal
the covariance of height with relative reproductive fitness.
(At least, this is true so long as height is straightforwardly heritable. If
nutrition improves, so that a fixed genotype becomes taller, you have to add a
correction term to Price’s Equation. If there are complex nonlinear interactions
between many genes, you have to either add a correction term, or calculate the
equation in such a complicated way that it ceases to enlighten.)
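For readers who want to watch the covariance do the work, here is a minimal sketch of Price's Equation on a toy population. The numbers are invented purely for illustration, and the sketch assumes perfect transmission (each offspring inherits its parent's characteristic exactly), which is the simple case the equation above describes.

```python
import numpy as np

# Toy population: each parent has a heritable characteristic z_i
# (say, height in cm) and leaves w_i offspring.
z = np.array([150.0, 160.0, 170.0, 180.0, 190.0])  # characteristic
w = np.array([1.0, 2.0, 2.0, 3.0, 4.0])            # offspring counts

v = w / w.mean()  # relative fitness v_i

# Change in the average characteristic, assuming each offspring
# inherits its parent's z exactly (no mutation, no transmission bias).
z_next = np.repeat(z, w.astype(int)).mean()
delta_z = z_next - z.mean()

# Covariance of relative fitness with the characteristic
# (population covariance, which is what Price's Equation uses).
cov_vz = np.mean(v * z) - np.mean(v) * np.mean(z)

print(delta_z, cov_vz)  # both ~5.83: the two sides of the equation agree
```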
Many enlightenments may be attained by studying the different forms
and derivations of Price’s Equation. For example, the final equation says that
the average characteristic changes according to its covariance with relative
fitness, rather than its absolute fitness. This means that if a Frodo gene saves
its whole species from extinction, the average Frodo characteristic does not
increase, since Frodo’s act benefited all genotypes equally and did not covary
with relative fitness.
It is said that Price became so disturbed by the implications of his equation
for altruism that he committed suicide, though he may have had other issues.
(Overcoming Bias does not advocate committing suicide after studying Price’s
Equation.)
One of the enlightenments which may be gained by meditating upon Price’s
Equation is that “limited resources” and “multiple competing actors capable of
passing on characteristics” are not sufficient to give rise to an evolution. “Things
that replicate themselves” is not a sufficient condition. Even “competition
between replicating things” is not sufficient.
Do corporations evolve? They certainly compete. They occasionally spin
off children. Their resources are limited. They sometimes die.
But how much does the child of a corporation resemble its parents? Much
of the personality of a corporation derives from key officers, and CEOs cannot
divide themselves by fission. Price’s Equation only operates to the extent that
characteristics are heritable across generations. If great-great-grandchildren
don’t much resemble their great-great-grandparents, you won’t get more than
four generations’ worth of cumulative selection pressure—anything that hap-
pened more than four generations ago will blur itself out. Yes, the personality
of a corporation can influence its spinoff—but that’s nothing like the heritabil-
ity of DNA, which is digital rather than analog, and can transmit itself with
10⁻⁸ errors per base per generation.
With DNA you have heritability lasting for millions of generations. That’s
how complex adaptations can arise by pure evolution—the digital DNA lasts
long enough for a gene conveying 3% advantage to spread itself over 768 gener-
ations, and then another gene dependent on it can arise. Even if corporations
replicated with digital fidelity, they would currently be at most ten generations
into the RNA World.
Now, corporations are certainly selected, in the sense that incompetent
corporations go bust. This should logically make you more likely to observe
corporations with features contributing to competence. And in the same sense,
any star that goes nova shortly after it forms, is less likely to be visible when
you look up at the night sky. But if an accident of stellar dynamics makes
one star burn longer than another star, that doesn’t make it more likely that
future stars will also burn longer—the feature will not be copied onto other
stars. We should not expect future astrophysicists to discover complex internal
features of stars which seem designed to help them burn longer. That kind of
mechanical adaptation requires much larger cumulative selection pressures
than a once-off winnowing.
Think of the principle introduced in Einstein’s Arrogance—that the vast
majority of the evidence required to think of General Relativity had to go into
raising that one particular equation to the level of Einstein’s personal attention;
the amount of evidence required to raise it from a deliberately considered possi-
bility to 99.9% certainty was trivial by comparison. In the same sense, complex
features of corporations that require hundreds of bits to specify are produced
primarily by human intelligence, not a handful of generations of low-fidelity
evolution. In biology, the mutations are purely random and evolution supplies
thousands of bits of cumulative selection pressure. In corporations, humans
offer up thousand-bit intelligently designed complex “mutations,” and then
the further selection pressure of “Did it go bankrupt or not?” accounts for a
handful of additional bits in explaining what you see.
Advanced molecular nanotechnology—the artificial sort, not biology—
should be able to copy itself with digital fidelity through thousands of genera-
tions. Would Price’s Equation thereby gain a foothold?
Correlation is covariance divided by the product of the standard deviations, so if A is highly predictive of B, there can be a strong “correlation” between them even if A is ranging from 0 to 9 and B is only ranging from 50.0001 to 50.0009. Price's Equation
runs on covariance of characteristics with reproduction—not correlation! If
you can compress variance in characteristics into a tiny band, the covariance
goes way down, and so does the cumulative change in the characteristic.
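A numerical illustration of that point, with ranges matching the ones in the paragraph above (the exact numbers are mine):

```python
import numpy as np

a = np.arange(10, dtype=float)      # A ranges from 0 to 9
b = 50.0001 + a * (0.0008 / 9)      # B ranges from 50.0001 to 50.0009

# A perfectly predicts B, so the correlation is 1.0 ...
corr = np.corrcoef(a, b)[0, 1]

# ... but the covariance, which is what Price's Equation runs on, is tiny,
# because B has almost no variance to begin with.
cov = np.mean(a * b) - np.mean(a) * np.mean(b)

print(corr)  # ~1.0
print(cov)   # ~0.00073: compressing the variance kills the covariance
```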
The Foresight Institute suggests, among other sensible proposals, that the
replication instructions for any nanodevice should be encrypted. Moreover,
encrypted such that flipping a single bit of the encoded instructions will en-
tirely scramble the decrypted output. If all nanodevices produced are precise
molecular copies, and moreover, any mistakes on the assembly line are not
heritable because the offspring got a digital copy of the original encrypted in-
structions for use in making grandchildren, then your nanodevices ain’t gonna
be doin’ much evolving.
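The flip-one-bit-and-everything-breaks property is easy to get with modern authenticated encryption, which refuses to decrypt anything that has been altered at all. The sketch below uses the Python cryptography package as an illustration of that general idea only; it is not the Foresight Institute's actual scheme, and the "instructions" string is obviously hypothetical.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.exceptions import InvalidTag

# Hypothetical replication instructions for a nanodevice.
instructions = b"assemble arm; fetch carbon; bond; repeat"

key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
aead = AESGCM(key)
ciphertext = aead.encrypt(nonce, instructions, None)

# Flip a single bit of the encrypted instructions ...
tampered = bytearray(ciphertext)
tampered[5] ^= 0x01

# ... and decryption does not yield a slightly mutated program;
# it fails outright, so the error is not heritable.
try:
    aead.decrypt(nonce, bytes(tampered), None)
except InvalidTag:
    print("single-bit mutation detected; instructions rejected")
```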
You’d still have to worry about prions—self-replicating assembly errors
apart from the encrypted instructions, where a robot arm fails to grab a carbon
atom that is used in assembling a homologue of itself, and this causes the
offspring’s robot arm to likewise fail to grab a carbon atom, etc., even with
all the encrypted instructions remaining constant. But how much correlation
is there likely to be, between this sort of transmissible error, and a higher
reproductive rate? Let’s say that one nanodevice produces a copy of itself every
1,000 seconds, and the new nanodevice is magically more efficient (it not only
has a prion, it has a beneficial prion) and copies itself every 999.99999 seconds.
It needs one less carbon atom attached, you see. That’s not a whole lot of
variance in reproduction, so it’s not a whole lot of covariance either.
And how often will these nanodevices need to replicate? Unless they’ve
got more atoms available than exist in the solar system, or for that matter,
the visible Universe, only a small number of generations will pass before they
hit the resource wall. “Limited resources” are not a sufficient condition for
evolution; you need the frequently iterated death of a substantial fraction of
the population to free up resources. Indeed, “generations” is not so much an
integer as an integral over the fraction of the population that consists of newly
created individuals.
This is, to me, the most frightening thing about gray goo or nanotechnolog-
ical weapons—that they could eat the whole Earth and then that would be it,
nothing interesting would happen afterward. Diamond is stabler than proteins
held together by van der Waals forces, so the goo would only need to reassem-
ble some pieces of itself when an asteroid hit. Even if prions were a powerful
enough idiom to support evolution at all—evolution is slow enough with digi-
tal DNA!—fewer than 1.0 generations might pass between when the goo ate
the Earth and when the Sun died.
To sum up, if you have all of the following properties:

• Entities that replicate;

• Substantial variation in their characteristics;

• Substantial variation in their reproduction;

• Persistent correlation between the characteristics and reproduction;

• High-fidelity long-range heritability in characteristics;

• Frequent birth of a significant fraction of the breeding population;

• And all this remains true through many iterations . . .

Then you will have significant cumulative selection pressures, enough to pro-
duce complex adaptations by the force of evolution.

*
135
Evolving to Extinction

It is a very common misconception that an evolution works for the good of its
species. Can you remember hearing someone talk about two rabbits breeding
eight rabbits and thereby “contributing to the survival of their species”? A
modern evolutionary biologist would never say such a thing; they’d sooner
breed with a rabbit.
It’s yet another case where you’ve got to simultaneously consider multiple
abstract concepts and keep them distinct. Evolution doesn’t operate on partic-
ular individuals; individuals keep whatever genes they’re born with. Evolution
operates on a reproducing population, a species, over time. There’s a natural
tendency to think that if an Evolution Fairy is operating on the species, she
must be optimizing for the species. But what really changes are the gene fre-
quencies, and frequencies don’t increase or decrease according to how much
the gene helps the species as a whole. As we shall later see, it’s quite possible
for a species to evolve to extinction.
Why are boys and girls born in roughly equal numbers? (Leaving aside
crazy countries that use artificial gender selection technologies.) To see why
this is surprising, consider that 1 male can impregnate 2, 10, or 100 females; it
wouldn’t seem that you need the same number of males as females to ensure
the survival of the species. This is even more surprising in the vast majority of
animal species where the male contributes very little to raising the children—
humans are extraordinary, even among primates, for their level of paternal
investment. Balanced gender ratios are found even in species where the male
impregnates the female and vanishes into the mist.
Consider two groups on different sides of a mountain; in group A, each
mother gives birth to 2 males and 2 females; in group B, each mother gives
birth to 3 females and 1 male. Group A and group B will have the same number
of children, but group B will have 50% more grandchildren and 125% more
great-grandchildren. You might think this would be a significant evolutionary
advantage.
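The arithmetic is easy to check directly. A toy calculation, assuming the two groups stay isolated, every female has four offspring, and the stated sex ratios hold in each generation:

```python
def descendants(frac_female, generations, offspring_per_female=4):
    """Count descendants per original mother in each generation, assuming
    only the females reproduce and the daughter fraction stays fixed."""
    females = 1.0
    counts = []
    for _ in range(generations):
        children = females * offspring_per_female
        counts.append(int(children))
        females = children * frac_female
    return counts

group_a = descendants(frac_female=0.50, generations=3)  # 2 girls, 2 boys
group_b = descendants(frac_female=0.75, generations=3)  # 3 girls, 1 boy

print(group_a)  # [4, 8, 16]  children, grandchildren, great-grandchildren
print(group_b)  # [4, 12, 36] 50% more grandchildren, 125% more great-grandchildren
```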
But consider: The rarer males become, the more reproductively valuable
they become—not to the group, but to the individual parent. Every child has
one male and one female parent. Then in every generation, the total genetic
contribution from all males equals the total genetic contribution from all
females. The fewer males, the greater the individual genetic contribution per
male. If all the females around you are doing what’s good for the group, what’s
good for the species, and birthing 1 male per 10 females, you can make a
genetic killing by birthing all males, each of whom will have (on average) ten times as many children as their female cousins.
So while group selection ought to favor more girls, individual selection fa-
vors equal investment in male and female offspring. Looking at the statistics
of a maternity ward, you can see at a glance that the quantitative balance be-
tween group selection forces and individual selection forces is overwhelmingly
tilted in favor of individual selection in Homo sapiens.
(Technically, this isn’t quite a glance. Individual selection favors equal
parental investments in male and female offspring. If males cost half as much
to birth and/or raise, twice as many males as females will be born at the evolu-
tionarily stable equilibrium. If the same number of males and females were
born in the population at large, but males were twice as cheap to birth, then
you could again make a genetic killing by birthing more males. So the ma-
ternity ward should reflect the balance of parental opportunity costs, in a
hunter-gatherer society, between raising boys and raising girls; and you’d have
to assess that somehow. But ya know, it doesn’t seem all that much more
reproductive-opportunity-costly for a hunter-gatherer family to raise a girl, so
it’s kinda suspicious that around the same number of boys are born as girls.)
Natural selection isn’t about groups, or species, or even individuals. In a sex-
ual species, an individual organism doesn’t evolve; it keeps whatever genes it’s
born with. An individual is a once-off collection of genes that will never reap-
pear; how can you select on that? When you consider that nearly all of your
ancestors are dead, it’s clear that “survival of the fittest” is a tremendous mis-
nomer. “Replication of the fitter” would be more accurate, although technically
fitness is defined only in terms of replication.
Natural selection is really about gene frequencies. To get a complex adap-
tation, a machine with multiple dependent parts, each new gene as it evolves
depends on the other genes being reliably present in its genetic environment.
They must have high frequencies. The more complex the machine, the higher
the frequencies must be. The signature of natural selection occurring is a gene
rising from 0.00001% of the gene pool to 99% of the gene pool. This is the in-
formation, in an information-theoretic sense; and this is what must happen
for large complex adaptations to evolve.
The real struggle in natural selection is not the competition of organisms
for resources; this is an ephemeral thing when all the participants will vanish in
another generation. The real struggle is the competition of alleles for frequency
in the gene pool. This is the lasting consequence that creates lasting information.
The two rams bellowing and locking horns are only passing shadows.
It’s perfectly possible for an allele to spread to fixation by outcompeting
an alternative allele which was “better for the species.” If the Flying Spaghetti
Monster magically created a species whose gender mix was perfectly opti-
mized to ensure the survival of the species—the optimal gender mix to bounce
back reliably from near-extinction events, adapt to new niches, et cetera—
then the evolution would rapidly degrade this species optimum back into
the individual-selection optimum of equal parental investment in males and
females.
Imagine a “Frodo gene” that sacrifices its vehicle to save its entire species
from an extinction event. What happens to the allele frequency as a result? It
goes down. Kthxbye.
If species-level extinction threats occur regularly (call this a “Buffy envi-
ronment”) then the Frodo gene will systematically decrease in frequency and
vanish, and soon thereafter, so will the species.
A hypothetical example? Maybe. If the human species was going to stay
biological for another century, it would be a good idea to start cloning Gandhi.
In viruses, there’s the tension between individual viruses replicating as fast
as possible, versus the benefit of leaving the host alive long enough to transmit
the illness. This is a good real-world example of group selection, and if the
virus evolves to a point on the fitness landscape where the group selection
pressures fail to overcome individual pressures, the virus could vanish shortly
thereafter. I don’t know if a disease has ever been caught in the act of evolving
to extinction, but it’s probably happened any number of times.
Segregation-distorters subvert the mechanisms that usually guarantee fair-
ness of sexual reproduction. For example, there is a segregation-distorter on
the male sex chromosome of some mice which causes only male children to
be born, all carrying the segregation-distorter. Then these males impregnate
females, who give birth to only male children, and so on. You might cry “This
is cheating!” but that’s a human perspective; the reproductive fitness of this
allele is extremely high, since it produces twice as many copies of itself in the
succeeding generation as its nonmutant alternative. Even as females become
rarer and rarer, males carrying this gene are no less likely to mate than any
other male, and so the segregation-distorter remains twice as fit as its alterna-
tive allele. It’s speculated that real-world group selection may have played a
role in keeping the frequency of this gene as low as it seems to be. In which
case, if mice were to evolve the ability to fly and migrate for the winter, they
would probably form a single reproductive population, and would evolve to
extinction as the segregation-distorter evolved to fixation.
Around 50% of the total genome of maize consists of transposons, DNA
elements whose primary function is to copy themselves into other locations of
DNA. A class of transposons called “P elements” seem to have first appeared
in Drosophila only in the middle of the twentieth century, and spread to every
population of the species within 50 years. The “Alu sequence” in humans,
a 300-base transposon, is repeated between 300,000 and a million times in
the human genome. This may not extinguish a species, but it doesn't help it; transposons cause more mutations, which are, as always, mostly harmful, and decrease the effective copying fidelity of DNA. Yet such cheaters are extremely
fit.
Suppose that in some sexually reproducing species, a perfect DNA-copying
mechanism is invented. Since most mutations are detrimental, this gene com-
plex is an advantage to its holders. Now you might wonder about beneficial
mutations—they do happen occasionally, so wouldn’t the unmutable be at
a disadvantage? But in a sexual species, a beneficial mutation that began in
a mutable can spread to the descendants of unmutables as well. The mutables suffer from deleterious mutations in each generation; and the unmutables
can sexually acquire, and thereby benefit from, any beneficial mutations that
occur in the mutables. Thus the mutables have a pure disadvantage. The per-
fect DNA-copying mechanism rises in frequency to fixation. Ten thousand
years later there’s an ice age and the species goes out of business. It evolved to
extinction.
The “bystander effect” is that, when someone is in trouble, solitary individ-
uals are more likely to intervene than groups. A college student apparently
having an epileptic seizure was helped 85% of the time by a single bystander,
and 31% of the time by five bystanders. I speculate that even if the kinship rela-
tion in a hunter-gatherer tribe was strong enough to create a selection pressure
for helping individuals not directly related, when several potential helpers were
present, a genetic arms race might occur to be the last one to step forward.
Everyone delays, hoping that someone else will do it. Humanity is facing mul-
tiple species-level extinction threats right now, and I gotta tell ya, there ain’t
a lot of people steppin’ forward. If we lose this fight because virtually no one
showed up on the battlefield, then—like a probably-large number of species
which we don’t see around today—we will have evolved to extinction.
Cancerous cells do pretty well in the body, prospering and amassing more
resources, far outcompeting their more obedient counterparts. For a while.
Multicellular organisms can only exist because they’ve evolved powerful
internal mechanisms to outlaw evolution. If the cells start evolving, they rapidly
evolve to extinction: the organism dies.
So praise not evolution for the solicitous concern it shows for the individual;
nearly all of your ancestors are dead. Praise not evolution for the solicitous
concern it shows for a species; no one has ever found a complex adaptation
which can only be interpreted as operating to preserve a species, and the
mathematics would seem to indicate that this is virtually impossible. Indeed,
it’s perfectly possible for a species to evolve to extinction. Humanity may be
finishing up the process right now. You can’t even praise evolution for the
solicitous concern it shows for genes; the battle between two alternative alleles
at the same location is a zero-sum game for frequency.
Fitness is not always your friend.

*
136
The Tragedy of Group Selectionism

Before 1966, it was not unusual to see serious biologists advocating evolution-
ary hypotheses that we would now regard as magical thinking. These muddled
notions played an important historical role in the development of later evo-
lutionary theory, error calling forth correction; like the folly of English kings
provoking into existence the Magna Carta and constitutional democracy.
As an example of romance, Vero Wynne-Edwards, Warder Allee, and J. L. Br-
ereton, among others, believed that predators would voluntarily restrain their
breeding to avoid overpopulating their habitat and exhausting the prey popu-
lation.
But evolution does not open the floodgates to arbitrary purposes. You
cannot explain a rattlesnake’s rattle by saying that it exists to benefit other
animals who would otherwise be bitten. No outside Evolution Fairy decides
when a gene ought to be promoted; the gene’s effect must somehow directly
cause the gene to be more prevalent in the next generation. It’s clear why our
human sense of aesthetics, witnessing a population crash of foxes who’ve eaten
all the rabbits, cries “Something should’ve been done!” But how would a gene
complex for restraining reproduction—of all things!—cause itself to become
more frequent in the next generation?
A human being designing a neat little toy ecology—for entertainment
purposes, like a model railroad—might be annoyed if their painstakingly
constructed fox and rabbit populations self-destructed by the foxes eating all
the rabbits and then dying of starvation themselves. So the human would
tinker with the toy ecology—a fox-breeding-restrainer is the obvious solution
that leaps to our human minds—until the ecology looked nice and neat. Nature
has no human, of course, but that needn’t stop us—now that we know what we
want on aesthetic grounds, we just have to come up with a plausible argument
that persuades Nature to want the same thing on evolutionary grounds.
Obviously, selection on the level of the individual won’t produce individual
restraint in breeding. Individuals who reproduce unrestrainedly will, naturally,
produce more offspring than individuals who restrain themselves.
(Individual selection will not produce individual sacrifice of breeding op-
portunities. Individual selection can certainly produce individuals who, after
acquiring all available resources, use those resources to produce four big eggs
instead of eight small eggs—not to conserve social resources, but because that
is the individual sweet spot for (number of eggs)×(egg survival probability).
This does not get rid of the commons problem.)
But suppose that the species population was broken up into subpopulations,
which were mostly isolated, and only occasionally interbred. Then, surely,
subpopulations that restrained their breeding would be less likely to go extinct,
and would send out more messengers, and create new colonies to reinhabit
the territories of crashed populations.
The problem with this scenario wasn’t that it was mathematically impossible.
The problem was that it was possible but very difficult.
The fundamental problem is that it’s not only restrained breeders who reap
the benefits of restrained breeding. If some foxes refrain from spawning cubs
who eat rabbits, then the uneaten rabbits don’t go to only cubs who carry the
restrained-breeding adaptation. The unrestrained foxes, and their many more
cubs, will happily eat any rabbits left unhunted. The only way the restraining
gene can survive against this pressure, is if the benefits of restraint preferentially
go to restrainers.
Specifically, the requirement is C/B < F_ST, where C is the cost of altruism to the donor, B is the benefit of altruism to the recipient, and F_ST is the
spatial structure of the population: the average relatedness between a randomly
selected organism and its randomly selected neighbor, where a “neighbor” is
any other fox who benefits from an altruistic fox’s restraint.1
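A tiny sketch of how that threshold gets applied. The fox numbers below are invented placeholders, not empirical estimates:

```python
def group_selection_can_win(cost_to_donor, benefit_to_recipient, f_st):
    """The condition from the text: altruism can spread only if the
    cost/benefit ratio falls below F_ST, the average relatedness between
    an altruist and a random beneficiary of its restraint."""
    return (cost_to_donor / benefit_to_recipient) < f_st

# Placeholder numbers: restraint costs a fox 3% of fitness, spares a
# neighbor 10% of fitness, and neighborhood relatedness is a generous 0.1.
print(group_selection_can_win(0.03, 0.10, 0.10))  # False: 0.3 is not < 0.1
```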
So is the cost of restrained breeding sufficiently small, and the empirical
benefit of less famine sufficiently large, compared to the empirical spatial
structure of fox populations and rabbit populations, that the group selection
argument can work?
The math suggests this is pretty unlikely. In this simulation, for example,
the cost to altruists is 3% of fitness, pure altruist groups have a fitness twice as
great as pure selfish groups, the subpopulation size is 25, and 20% of all deaths
are replaced with messengers from another group: the result is polymorphic for
selfishness and altruism. If the subpopulation size is doubled to 50, selfishness
is fixed; if the cost to altruists is increased to 6%, selfishness is fixed; if the
altruistic benefit is decreased by half, selfishness is fixed or in large majority.
Neighborhood-groups must be very small, with only around 5 members, for
group selection to operate when the cost of altruism exceeds 10%. This doesn’t
seem plausibly true of foxes restraining their breeding.
You can guess by now, I think, that the group selectionists ultimately lost
the scientific argument. The kicker was not the mathematical argument, but
empirical observation: foxes didn’t restrain their breeding (I forget the exact
species of dispute; it wasn’t foxes and rabbits), and indeed, predator-prey
systems crash all the time. Group selectionism would later revive, somewhat,
in drastically different form—mathematically speaking, there is neighborhood
structure, which implies nonzero group selection pressure not necessarily
capable of overcoming countervailing individual selection pressure, and if you
don’t take it into account your math will be wrong, full stop. And evolved
enforcement mechanisms (not originally postulated) change the game entirely.
So why is this now-historical scientific dispute worthy material for Overcoming
Bias?
A decade after the controversy, a biologist had a fascinating idea. The
mathematical conditions for group selection overcoming individual selection
were too extreme to be found in Nature. Why not create them artificially, in
the laboratory? Michael J. Wade proceeded to do just that, repeatedly selecting
populations of insects for low numbers of adults per subpopulation.2 And what
was the result? Did the insects restrain their breeding and live in quiet peace
with enough food for all?
No; the adults adapted to cannibalize eggs and larvae, especially female
larvae.
Of course selecting for small subpopulation sizes would not select for indi-
viduals who restrained their own breeding; it would select for individuals who
ate other individuals’ children. Especially the girls.
Once you have that experimental result in hand—and it’s massively ob-
vious in retrospect—then it suddenly becomes clear how the original group
selectionists allowed romanticism, a human sense of aesthetics, to cloud their
predictions of Nature.
This is an archetypal example of a missed Third Alternative, resulting
from a rationalization of a predetermined bottom line that produced a fake
justification and then motivatedly stopped. The group selectionists didn’t start
with clear, fresh minds, happen upon the idea of group selection, and neu-
trally extrapolate forward the probable outcome. They started out with the
beautiful idea of fox populations voluntarily restraining their reproduction to
what the rabbit population would bear, Nature in perfect harmony; then they
searched for a reason why this would happen, and came up with the idea of
group selection; then, since they knew what they wanted the outcome of group
selection to be, they didn’t look for any less beautiful and aesthetic adaptations
that group selection would be more likely to promote instead. If they’d really
been trying to calmly and neutrally predict the result of selecting for small sub-
population sizes resistant to famine, they would have thought of cannibalizing
other organisms’ children or some similarly “ugly” outcome—long before they
imagined anything so evolutionarily outré as individual restraint in breeding!
This also illustrates the point I was trying to make in Einstein’s Arrogance:
With large answer spaces, nearly all of the real work goes into promoting one
possible answer to the point of being singled out for attention. If a hypothesis
is improperly promoted to your attention—your sense of aesthetics suggests
a beautiful way for Nature to be, and yet natural selection doesn’t involve an
Evolution Fairy who shares your appreciation—then this alone may seal your
doom, unless you can manage to clear your mind entirely and start over.
In principle, the world’s stupidest person may say the Sun is shining, but
that doesn’t make it dark out. Even if an answer is suggested by a lunatic on
LSD, you should be able to neutrally calculate the evidence for and against,
and if necessary, un-believe.
In practice, the group selectionists were doomed because their bottom line
was originally suggested by their sense of aesthetics, and Nature’s bottom line
was produced by natural selection. These two processes had no principled
reason for their outputs to correlate, and indeed they didn’t. All the furious
argument afterward didn’t change that.
If you start with your own desires for what Nature should do, consider
Nature’s own observed reasons for doing things, and then rationalize an ex-
tremely persuasive argument for why Nature should produce your preferred
outcome for Nature’s own reasons, then Nature, alas, still won’t listen. The
universe has no mind and is not subject to clever political persuasion. You can
argue all day why gravity should really make water flow uphill, and the water
just ends up in the same place regardless. It’s like the universe plain isn’t lis-
tening. J. R. Molloy said: “Nature is the ultimate bigot, because it is obstinately
and intolerantly devoted to its own prejudices and absolutely refuses to yield
to the most persuasive rationalizations of humans.”
I often recommend evolutionary biology to friends just because the modern
field tries to train its students against rationalization, error calling forth correc-
tion. Physicists and electrical engineers don’t have to be carefully trained to
avoid anthropomorphizing electrons, because electrons don’t exhibit mindish
behaviors. Natural selection creates purposefulnesses which are alien to hu-
mans, and students of evolutionary theory are warned accordingly. It’s good
training for any thinker, but it is especially important if you want to think
clearly about other weird mindish processes that do not work like you do.

1. David Sloan Wilson, “A Theory of Group Selection,” Proceedings of the National Academy of
Sciences of the United States of America 72, no. 1 (1975): 143–146.
2. Michael J. Wade, “Group selections among laboratory populations of Tribolium,” Proceedings of
the National Academy of Sciences of the United States of America 73, no. 12 (1976): 4604–4607,
doi:10.1073/pnas.73.12.4604.
137
Fake Optimization Criteria

I've previously dwelt at considerable length upon forms of rationalization whereby our beliefs appear to match the evidence much more strongly than
they actually do. And I’m not overemphasizing the point, either. If we could
beat this fundamental metabias and see what every hypothesis really predicted,
we would be able to recover from almost any other error of fact.
The mirror challenge for decision theory is seeing which option a choice
criterion really endorses. If your stated moral principles call for you to provide
laptops to everyone, does that really endorse buying a $1 million gem-studded
laptop for yourself, or spending the same money on shipping 5,000 OLPCs?
We seem to have evolved a knack for arguing that practically any goal im-
plies practically any action. A phlogiston theorist explaining why magnesium
gains weight when burned has nothing on an Inquisitor explaining why God’s
infinite love for all His children requires burning some of them at the stake.
There’s no mystery about this. Politics was a feature of the ancestral envi-
ronment. We are descended from those who argued most persuasively that
the good of the tribe meant executing their hated rival Uglak. (We sure ain’t
descended from Uglak.)
And yet . . . is it possible to prove that if Robert Mugabe cared only for
the good of Zimbabwe, he would resign from its presidency? You can argue
that the policy follows from the goal, but haven’t we just seen that humans
can match up any goal to any policy? How do you know that you’re right and
Mugabe is wrong? (There are a number of reasons this is a good guess, but
bear with me here.)
Human motives are manifold and obscure, our decision processes as vastly
complicated as our brains. And the world itself is vastly complicated, on
every choice of real-world policy. Can we even prove that human beings are
rationalizing—that we’re systematically distorting the link from principles to
policy—when we lack a single firm place on which to stand? When there’s
no way to find out exactly what even a single optimization criterion implies?
(Actually, you can just observe that people disagree about office politics in ways
that strangely correlate to their own interests, while simultaneously denying
that any such interests are at work. But again, bear with me here.)
Where is the standardized, open-source, generally intelligent, consequen-
tialist optimization process into which we can feed a complete morality as an
XML file, to find out what that morality really recommends when applied to
our world? Is there even a single real-world case where we can know exactly
what a choice criterion recommends? Where is the pure moral reasoner—of
known utility function, purged of all other stray desires that might distort its
optimization—whose trustworthy output we can contrast to human rational-
izations of the same utility function?
Why, it’s our old friend the alien god, of course! Natural selection is guar-
anteed free of all mercy, all love, all compassion, all aesthetic sensibilities, all
political factionalism, all ideological allegiances, all academic ambitions, all
libertarianism, all socialism, all Blue and all Green. Natural selection doesn’t
maximize its criterion of inclusive genetic fitness—it’s not that smart. But
when you look at the output of natural selection, you are guaranteed to be look-
ing at an output that was optimized only for inclusive genetic fitness, and not
the interests of the US agricultural industry.
In the case histories of evolutionary science—in, for example, The Tragedy
of Group Selectionism—we can directly compare human rationalizations to the
result of pure optimization for a known criterion. What did Wynne-Edwards
think would be the result of group selection for small subpopulation sizes?
Voluntary individual restraint in breeding, and enough food for everyone.
What was the actual laboratory result? Cannibalism.
Now you might ask: Are these case histories of evolutionary science really
relevant to human morality, which doesn’t give two figs for inclusive genetic
fitness when it gets in the way of love, compassion, aesthetics, healing, freedom,
fairness, et cetera? Human societies didn’t even have a concept of “inclusive
genetic fitness” until the twentieth century.
But I ask in return: If we can’t see clearly the result of a single monotone
optimization criterion—if we can’t even train ourselves to hear a single pure
note—then how will we listen to an orchestra? How will we see that “Always
be selfish” or “Always obey the government” are poor guiding principles for
human beings to adopt—if we think that even optimizing genes for inclusive
fitness will yield organisms that sacrifice reproductive opportunities in the
name of social resource conservation?
To train ourselves to see clearly, we need simple practice cases.

*
138
Adaptation-Executers, Not
Fitness-Maximizers

Individual organisms are best thought of as adaptation-executers rather than as fitness-maximizers.
—John Tooby and Leda Cosmides,
“The Psychological Foundations of Culture”1

Fifty thousand years ago, the taste buds of Homo sapiens directed their bearers
to the scarcest, most critical food resources—sugar and fat. Calories, in a word.
Today, the context of a taste bud’s function has changed, but the taste buds
themselves have not. Calories, far from being scarce (in First World countries),
are actively harmful. Micronutrients that were reliably abundant in leaves and
nuts are absent from bread, but our taste buds don’t complain. A scoop of ice
cream is a superstimulus, containing more sugar, fat, and salt than anything in
the ancestral environment.
No human being with the deliberate goal of maximizing their alleles’ in-
clusive genetic fitness would ever eat a cookie unless they were starving. But
individual organisms are best thought of as adaptation-executers, not fitness-
maximizers.
A Phillips-head screwdriver, though its designer intended it to turn screws,
won’t reconform itself to a flat-head screw to fulfill its function. We created
these tools, but they exist independently of us, and they continue independently
of us.
The atoms of a screwdriver don’t have tiny little XML tags inside describing
their “objective” purpose. The designer had something in mind, yes, but that’s
not the same as what happens in the real world. If you forgot that the designer
is a separate entity from the designed thing, you might think, “The purpose of
the screwdriver is to drive screws”—as though this were an explicit property
of the screwdriver itself, rather than a property of the designer’s state of mind.
You might be surprised that the screwdriver didn’t reconfigure itself to the
flat-head screw, since, after all, the screwdriver’s purpose is to turn screws.
The cause of the screwdriver’s existence is the designer’s mind, which
imagined an imaginary screw, and imagined an imaginary handle turning.
The actual operation of the screwdriver, its actual fit to an actual screw head,
cannot be the objective cause of the screwdriver’s existence: The future cannot
cause the past. But the designer’s brain, as an actually existent thing within
the past, can indeed be the cause of the screwdriver.
The consequence of the screwdriver’s existence may not correspond to the
imaginary consequences in the designer’s mind. The screwdriver blade could
slip and cut the user’s hand.
And the meaning of the screwdriver—why, that’s something that exists in
the mind of a user, not in tiny little labels on screwdriver atoms. The designer
may intend it to turn screws. A murderer may buy it to use as a weapon. And
then accidentally drop it, to be picked up by a child, who uses it as a chisel.
So the screwdriver’s cause, and its shape, and its consequence, and its various
meanings, are all different things; and only one of these things is found within
the screwdriver itself.
Where do taste buds come from? Not from an intelligent designer visual-
izing their consequences, but from a frozen history of ancestry: Adam liked
sugar and ate an apple and reproduced, Barbara liked sugar and ate an apple
and reproduced, Charlie liked sugar and ate an apple and reproduced, and 2763
generations later, the allele became fixed in the population. For convenience of
thought, we sometimes compress this giant history and say: “Evolution did it.”
But it’s not a quick, local event like a human designer visualizing a screwdriver.
This is the objective cause of a taste bud.
What is the objective shape of a taste bud? Technically, it’s a molecular
sensor connected to reinforcement circuitry. This adds another level of indi-
rection, because the taste bud isn’t directly acquiring food. It’s influencing the
organism’s mind, making the organism want to eat foods that are similar to
the food just eaten.
What is the objective consequence of a taste bud? In a modern First World
human, it plays out in multiple chains of causality: from the desire to eat more
chocolate, to the plan to eat more chocolate, to eating chocolate, to getting fat,
to getting fewer dates, to reproducing less successfully. This consequence is
directly opposite the key regularity in the long chain of ancestral successes that
caused the taste bud’s shape. But, since overeating has only recently become
a problem, no significant evolution (compressed regularity of ancestry) has
further influenced the taste bud’s shape.
What is the meaning of eating chocolate? That’s between you and your
moral philosophy. Personally, I think chocolate tastes good, but I wish it were
less harmful; acceptable solutions would include redesigning the chocolate or
redesigning my biochemistry.
Smushing several of the concepts together, you could sort-of-say, “Modern
humans do today what would have propagated our genes in a hunter-gatherer
society, whether or not it helps our genes in a modern society.” But this still
isn’t quite right, because we’re not actually asking ourselves which behaviors
would maximize our ancestors’ inclusive fitness. And many of our activities
today have no ancestral analogue. In the hunter-gatherer society there wasn’t
any such thing as chocolate.
So it’s better to view our taste buds as an adaptation fitted to ancestral
conditions that included near-starvation and apples and roast rabbit, which
modern humans execute in a new context that includes cheap chocolate and
constant bombardment by advertisements.
Therefore it is said: Individual organisms are best thought of as adaptation-
executers, not fitness-maximizers.

1. John Tooby and Leda Cosmides, “The Psychological Foundations of Culture,” in The Adapted Mind:
Evolutionary Psychology and the Generation of Culture, ed. Jerome H. Barkow, Leda Cosmides,
and John Tooby (New York: Oxford University Press, 1992), 19–136.
139
Evolutionary Psychology

Like “IRC chat” or “TCP/IP protocol,” the phrase “reproductive organ” is redundant. All organs are reproductive organs. Where do a bird's wings
come from? An Evolution-of-Birds Fairy who thinks that flying is really neat?
The bird’s wings are there because they contributed to the bird’s ancestors’
reproduction. Likewise the bird’s heart, lungs, and genitals. At most we might
find it worthwhile to distinguish between directly reproductive organs and
indirectly reproductive organs.
This observation holds true also of the brain, the most complex organ
system known to biology. Some brain organs are directly reproductive, like
lust; others are indirectly reproductive, like anger.
Where does the human emotion of anger come from? An Evolution-of-
Humans Fairy who thought that anger was a worthwhile feature? The neural
circuitry of anger is a reproductive organ as surely as your liver. Anger exists
in Homo sapiens because angry ancestors had more kids. There’s no other way
it could have gotten there.
This historical fact about the origin of anger confuses all too many people.
They say, “Wait, are you saying that when I’m angry, I’m subconsciously trying
to have children? That’s not what I’m thinking after someone punches me in
the nose.”
No. No. No. NO!
Individual organisms are best thought of as adaptation-executers, not
fitness-maximizers. The cause of an adaptation, the shape of an adaptation,
and the consequence of an adaptation are all separate things. If you built a
toaster, you wouldn’t expect the toaster to reshape itself when you tried to
cram in a whole loaf of bread; yes, you intended it to make toast, but that inten-
tion is a fact about you, not a fact about the toaster. The toaster has no sense of
its own purpose.
But a toaster is not an intention-bearing object. It is not a mind at all, so
we are not tempted to attribute goals to it. If we see the toaster as purposed,
we don’t think the toaster knows it, because we don’t think the toaster knows
anything.
It’s like the old test of being asked to say the color of the letters in “blue.” It
takes longer for subjects to name this color, because of the need to untangle the
meaning of the letters and the color of the letters. You wouldn’t have similar
trouble naming the color of the letters in “wind.”
But a human brain, in addition to being an artifact historically produced
by evolution, is also a mind capable of bearing its own intentions, purposes,
desires, goals, and plans. Both a bee and a human are designs, but only a
human is a designer. The bee is “wind;” the human is “blue.”
Cognitive causes are ontologically distinct from evolutionary causes. They
are made out of a different kind of stuff. Cognitive causes are made of neurons.
Evolutionary causes are made of ancestors.
The most obvious kind of cognitive cause is deliberate, like an intention to
go to the supermarket, or a plan for toasting toast. But an emotion also exists
physically in the brain, as a train of neural impulses or a cloud of spreading
hormones. Likewise an instinct, or a flash of visualization, or a fleetingly
suppressed thought; if you could scan the brain in three dimensions and you
understood the code, you would be able to see them.
Even subconscious cognitions exist physically in the brain. “Power tends to
corrupt,” observed Lord Acton. Stalin may or may not have believed himself
an altruist, working toward the greatest good for the greatest number. But it
seems likely that, somewhere in Stalin’s brain, there were neural circuits that
reinforced pleasurably the exercise of power, and neural circuits that detected
anticipations of increases and decreases in power. If there were nothing in
Stalin’s brain that correlated to power—no little light that went on for political
command, and off for political weakness—then how could Stalin’s brain have
known to be corrupted by power?
Evolutionary selection pressures are ontologically distinct from the bio-
logical artifacts they create. The evolutionary cause of a bird’s wings is mil-
lions of ancestor-birds who reproduced more often than other ancestor-birds,
with statistical regularity owing to their possession of incrementally improved
wings compared to their competitors. We compress this gargantuan historical-
statistical macrofact by saying “evolution did it.”
Natural selection is ontologically distinct from creatures; evolution is not
a little furry thing lurking in an undiscovered forest. Evolution is a causal,
statistical regularity in the reproductive history of ancestors.
And this logic applies also to the brain. Evolution has made wings that flap,
but do not understand flappiness. It has made legs that walk, but do not under-
stand walkyness. Evolution has carved bones of calcium ions, but the bones
themselves have no explicit concept of strength, let alone inclusive genetic fit-
ness. And evolution designed brains themselves capable of designing; yet these
brains had no more concept of evolution than a bird has of aerodynamics. Un-
til the twentieth century, not a single human brain explicitly represented the
complex abstract concept of inclusive genetic fitness.
When we’re told that “The evolutionary purpose of anger is to increase
inclusive genetic fitness,” there’s a tendency to slide to “The purpose of anger
is reproduction” to “The cognitive purpose of anger is reproduction.” No! The
statistical regularity of ancestral history isn’t in the brain, even subconsciously,
any more than the designer’s intentions of toast are in a toaster!
Thinking that your built-in anger-circuitry embodies an explicit desire to
reproduce is like thinking your hand is an embodied mental desire to pick
things up.
Your hand is not wholly cut off from your mental desires. In particular
circumstances, you can control the flexing of your fingers by an act of will. If
you bend down and pick up a penny, then this may represent an act of will;
but it is not an act of will that made your hand grow in the first place.
One must distinguish a one-time event of particular anger (anger-1, anger-2,
anger-3) from the underlying neural circuitry for anger. An anger-event is a
cognitive cause, and an anger-event may have cognitive causes, but you didn’t
will the anger-circuitry to be wired into the brain.
So you have to distinguish the event of anger, from the circuitry of anger,
from the gene complex that laid down the neural template, from the ancestral
macrofact that explains the gene complex’s presence.
If there were ever a discipline that genuinely demanded X-Treme Nitpicking,
it is evolutionary psychology.
Consider, O my readers, this sordid and joyful tale: A man and a woman
meet in a bar. The man is attracted to her clear complexion and firm breasts,
which would have been fertility cues in the ancestral environment, but which
in this case result from makeup and a bra. This does not bother the man;
he just likes the way she looks. His clear-complexion-detecting neural cir-
cuitry does not know that its purpose is to detect fertility, any more than
the atoms in his hand contain tiny little XML tags reading “<purpose>pick
things up</purpose>.” The woman is attracted to his confident smile and
firm manner, cues to high status, which in the ancestral environment would
have signified the ability to provide resources for children. She plans to use
birth control, but her confident-smile-detectors don’t know this any more
than a toaster knows its designer intended it to make toast. She’s not con-
cerned philosophically with the meaning of this rebellion, because her brain is
a creationist and denies vehemently that evolution exists. He’s not concerned
philosophically with the meaning of this rebellion, because he just wants to
get laid. They go to a hotel, and undress. He puts on a condom, because he
doesn’t want kids, just the dopamine-noradrenaline rush of sex, which reliably
produced offspring 50,000 years ago when it was an invariant feature of the
ancestral environment that condoms did not exist. They have sex, and shower,
and go their separate ways. The main objective consequence is to keep the
bar and the hotel and the condom-manufacturer in business; which was not
the cognitive purpose in their minds, and has virtually nothing to do with the
key statistical regularities of reproduction 50,000 years ago which explain how
they got the genes that built their brains that executed all this behavior.
To reason correctly about evolutionary psychology you must simultane-
ously consider many complicated abstract facts that are strongly related yet
importantly distinct, without a single mixup or conflation.

*
140
An Especially Elegant Evolutionary
Psychology Experiment

In a 1989 Canadian study, adults were asked to imagine the death of children of various ages and estimate which deaths would cre-
ate the greatest sense of loss in a parent. The results, plotted on
a graph, show grief growing until just before adolescence and
then beginning to drop. When this curve was compared with
a curve showing changes in reproductive potential over the life
cycle (a pattern calculated from Canadian demographic data),
the correlation was fairly strong. But much stronger—nearly per-
fect, in fact—was the correlation between the grief curves of these
modern Canadians and the reproductive-potential curve of a
hunter-gatherer people, the !Kung of Africa. In other words, the
pattern of changing grief was almost exactly what a Darwinian
would predict, given demographic realities in the ancestral envi-
ronment.
—Robert Wright, The Moral Animal,
summarizing Crawford et al.1
The first correlation was 0.64, the second an extremely high 0.92 (N = 221).
The most obvious inelegance of this study, as described, is that it was con-
ducted by asking human adults to imagine parental grief, rather than asking
real parents with children of particular ages. (Presumably that would have cost
more / allowed fewer subjects.) However, my understanding is that the results
here squared well with the data from closer studies of parental grief that were
looking for other correlations (i.e., a raw correlation between parental grief
and child age).
That said, consider some of this experiment’s elegant aspects:

1. A correlation of 0.92(!) This may sound suspiciously high—could evolution really do such exact fine-tuning?—until you realize that this se-
lection pressure was not only great enough to fine-tune parental grief,
but, in fact, carve it out of existence from scratch in the first place.

2. People who say that evolutionary psychology hasn’t made any advance
predictions are (ironically) mere victims of “no one knows what science
doesn’t know” syndrome. You wouldn’t even think of this as an experi-
ment to be performed if not for evolutionary psychology.

3. The experiment illustrates, as beautifully and as cleanly as any I have ever seen, the distinction between a conscious or subconscious ulterior
motive and an executing adaptation with no realtime sensitivity to the
original selection pressure that created it.

The parental grief is not even subconsciously about reproductive value—otherwise it would update for Canadian reproductive value instead of !Kung
reproductive value. Grief is an adaptation that now simply exists, real in the
mind and continuing under its own inertia.
Parents do not care about children for the sake of their reproductive contri-
bution. Parents care about children for their own sake; and the non-cognitive,
evolutionary-historical reason why such minds exist in the universe in the first
place is that children carry their parents’ genes.
Indeed, evolution is the reason why there are any minds in the universe
at all. So you can see why I’d want to draw a sharp line through my cynicism
about ulterior motives at the evolutionary-cognitive boundary; otherwise, I
might as well stand up in a supermarket checkout line and say, “Hey! You’re
only correctly processing visual information while bagging my groceries in
order to maximize your inclusive genetic fitness!”
1. I think 0.92 is the highest correlation I’ve ever seen in any evolutionary
psychology experiment, and indeed, one of the highest correlations I’ve seen
in any psychology experiment. (Although I’ve seen e.g. a correlation of 0.98
reported for asking one group of subjects “How similar is A to B?” and another
group “What is the probability of A given B?” on questions like “How likely
are you to draw 60 red balls and 40 white balls from this barrel of 800 red balls
and 200 white balls?”—in other words, these are simply processed as the same
question.)
Since we are all Bayesians here, we may take our priors into account and
ask if at least some of this unexpectedly high correlation is due to luck. The
evolutionary fine-tuning we can probably take for granted; this is a huge
selection pressure we’re talking about. The remaining sources of suspiciously
low variance are (a) whether a large group of adults could correctly envision,
on average, relative degrees of parental grief (apparently they can), and (b)
whether the surviving !Kung are typical ancestral hunter-gatherers in this
dimension, or whether variance between hunter-gatherer tribal types should
have been too high to allow a correlation of 0.92.
But even after taking into account any skeptical priors, correlation 0.92 and
N = 221 is pretty strong evidence, and our posteriors should be less skeptical
on all these counts.
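For the statistically inclined, a quick back-of-the-envelope check on how tight a correlation of 0.92 with N = 221 is, using the standard Fisher z-transformation. This is my addition for illustration, not a calculation from the study itself:

```python
import math

def correlation_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation coefficient,
    via the Fisher z-transformation."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

low, high = correlation_ci(0.92, 221)
print(round(low, 3), round(high, 3))  # roughly (0.897, 0.938)
```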
2. You might think it an inelegance of the experiment that it was performed
prospectively on imagined grief, rather than retrospectively on real grief. But
it is prospectively imagined grief that will actually operate to steer parental
behavior away from losing the child! From an evolutionary standpoint, an
actual dead child is a sunk cost; evolution “wants” the parent to learn from the
pain, not do it again, adjust back to their hedonic set point, and go on raising
other children.
3. Similarly, the graph that correlates to parental grief is for the future
reproductive potential of a child that has survived to a given age, and not the
sunk cost of raising the child which has survived to that age. (Might we get
an even higher correlation if we tried to take into account the reproductive
opportunity cost of raising a child of age X to independent maturity, while
discarding all sunk costs to raise a child to age X?)
Humans usually do notice sunk costs—this is presumably either an adapta-
tion to prevent us from switching strategies too often (compensating for an
overeager opportunity-noticer?) or an unfortunate spandrel of pain felt on
wasting resources.
Evolution, on the other hand—it’s not that evolution “doesn’t care about
sunk costs,” but that evolution doesn’t even remotely “think” that way; “evolu-
tion” is just a macrofact about the real historical reproductive consequences.
So—of course—the parental grief adaptation is fine-tuned in a way that has
nothing to do with past investment in a child, and everything to do with the
future reproductive consequences of losing that child. Natural selection isn’t
crazy about sunk costs the way we are.
But—of course—the parental grief adaptation goes on functioning as if the
parent were living in a !Kung tribe rather than Canada. Most humans would
notice the difference.
Humans and natural selection are insane in different stable complicated
ways.
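As a rough quantitative companion to note 1 (this sketch is mine, not part of the cited study; it assumes the reported figure can be treated as an ordinary Pearson correlation over 221 independent ratings), the standard Fisher z-transformation shows how little room sampling luck alone leaves:

# A back-of-the-envelope check, not from the original study: how low could
# the underlying correlation plausibly be, given r = 0.92 at N = 221?
import math

r, n = 0.92, 221
z = math.atanh(r)               # Fisher z-transformation of r
se = 1.0 / math.sqrt(n - 3)     # approximate standard error of z
lo, hi = z - 1.96 * se, z + 1.96 * se
print(f"95% CI for the correlation: [{math.tanh(lo):.3f}, {math.tanh(hi):.3f}]")
# Prints roughly [0.897, 0.938].

Under those assumptions, even the low end of that interval is far above anything sampling noise would deliver, so any residual skepticism has to attach to points (a) and (b) above rather than to luck in the data.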

1. Robert Wright, The Moral Animal: Why We Are the Way We Are: The New Science of Evolutionary
Psychology (Pantheon Books, 1994); Charles B. Crawford, Brenda E. Salter, and Kerry L. Jang,
“Human Grief: Is Its Intensity Related to the Reproductive Value of the Deceased?,” Ethology and
Sociobiology 10, no. 4 (1989): 297–307.
141
Superstimuli and the Collapse of
Western Civilization

At least three people have died playing online games for days without rest.
People have lost their spouses, jobs, and children to World of Warcraft. If
people have the right to play video games—and it’s hard to imagine a more
fundamental right—then the market is going to respond by supplying the most
engaging video games that can be sold, to the point that exceptionally engaged
consumers are removed from the gene pool.
How does a consumer product become so involving that, after 57 hours of
using the product, the consumer would rather use the product for one more
hour than eat or sleep? (I suppose one could argue that the consumer makes a
rational decision that they’d rather play Starcraft for the next hour than live
out the rest of their life, but let’s just not go there. Please.)
A candy bar is a superstimulus: it contains more concentrated sugar, salt, and
fat than anything that exists in the ancestral environment. A candy bar matches
taste buds that evolved in a hunter-gatherer environment, but it matches those
taste buds much more strongly than anything that actually existed in the hunter-
gatherer environment. The signal that once reliably correlated to healthy food
has been hijacked, blotted out with a point in tastespace that wasn’t in the train-
ing dataset—an impossibly distant outlier on the old ancestral graphs. Tastiness,
formerly representing the evolutionarily identified correlates of healthiness,
has been reverse-engineered and perfectly matched with an artificial substance.
Unfortunately there’s no equally powerful market incentive to make the re-
sulting food item as healthy as it is tasty. We can’t taste healthfulness, after
all.
The now-famous Dove Evolution video shows the painstaking construction
of another superstimulus: an ordinary woman transformed by makeup, careful
photography, and finally extensive Photoshopping, into a billboard model—a
beauty impossible, unmatchable by human women in the unretouched real
world. Actual women are killing themselves (e.g., supermodels using cocaine
to keep their weight down) to keep up with competitors that literally don’t
exist.
And likewise, a video game can be so much more engaging than mere reality,
even through a simple computer monitor, that someone will play it without
food or sleep until they literally die. I don’t know all the tricks used in video
games, but I can guess some of them—challenges poised at the critical point
between ease and impossibility, intermittent reinforcement, feedback showing
an ever-increasing score, social involvement in massively multiplayer games.
Is there a limit to the market incentive to make video games more engaging?
You might hope there’d be no incentive past the point where the players lose
their jobs; after all, they must be able to pay their subscription fee. This would
imply a “sweet spot” for the addictiveness of games, where the mode of the
bell curve is having fun, and only a few unfortunate souls on the tail become
addicted to the point of losing their jobs. As of 2007, playing World of Warcraft
for 58 hours straight until you literally die is still the exception rather than the
rule. But video game manufacturers compete against each other, and if you
can make your game 5% more addictive, you may be able to steal 50% of your
competitor’s customers. You can see how this problem could get a lot worse.
If people have the right to be tempted—and that’s what free will is all
about—the market is going to respond by supplying as much temptation as
can be sold. The incentive is to make your stimuli 5% more tempting than
those of your current leading competitors. This continues well beyond the
point where the stimuli become ancestrally anomalous superstimuli. Consider
how our standards of product-selling feminine beauty have changed since
the advertisements of the 1950s. And as candy bars demonstrate, the market
incentive also continues well beyond the point where the superstimulus begins
wreaking collateral damage on the consumer.
So why don’t we just say no? A key assumption of free-market economics
is that, in the absence of force and fraud, people can always refuse to engage in
a harmful transaction. (To the extent this is true, a free market would be, not
merely the best policy on the whole, but a policy with few or no downsides.)
An organism that regularly passes up food will die, as some video game
players found out the hard way. But, on some occasions in the ancestral
environment, a typically beneficial (and therefore tempting) act may in fact be
harmful. Humans, as organisms, have an unusually strong ability to perceive
these special cases using abstract thought. On the other hand we also tend to
imagine lots of special-case consequences that don’t exist, like ancestor spirits
commanding us not to eat perfectly good rabbits.
Evolution seems to have struck a compromise, or perhaps just aggregated
new systems on top of old. Homo sapiens are still tempted by food, but our
oversized prefrontal cortices give us a limited ability to resist temptation. Not
unlimited ability—our ancestors with too much willpower probably starved
themselves to sacrifice to the gods, or failed to commit adultery one too many
times. The video game players who died must have exercised willpower (in
some sense) to keep playing for so long without food or sleep; the evolutionary
hazard of self-control.
Resisting any temptation takes conscious expenditure of an exhaustible
supply of mental energy. It is not in fact true that we can “just say no”—not
just say no, without cost to ourselves. Even humans who won the birth lottery
for willpower or foresightfulness still pay a price to resist temptation. The price
is just more easily paid.
Our limited willpower evolved to deal with ancestral temptations; it may not
operate well against enticements beyond anything known to hunter-gatherers.
Even where we successfully resist a superstimulus, it seems plausible that the
effort required would deplete willpower much faster than resisting ancestral
temptations.
Is public display of superstimuli a negative externality, even to the people
who say no? Should we ban chocolate cookie ads, or storefronts that openly
say “Ice Cream”?
Just because a problem exists doesn’t show (without further justification
and a substantial burden of proof) that the government can fix it. The regu-
lator’s career incentive does not focus on products that combine low-grade
consumer harm with addictive superstimuli; it focuses on products with failure
modes spectacular enough to get into the newspaper. Conversely, just because
the government may not be able to fix something, doesn’t mean it isn’t going
wrong.
I leave you with a final argument from fictional evidence: Simon Funk’s
online novel After Life depicts (among other plot points) the planned exter-
mination of biological Homo sapiens—not by marching robot armies, but by
artificial children that are much cuter and sweeter and more fun to raise than
real children. Perhaps the demographic collapse of advanced societies hap-
pens because the market supplies ever-more-tempting alternatives to having
children, while the attractiveness of changing diapers remains constant over
time. Where are the advertising billboards that say “Breed”? Who will pay
professional image consultants to make arguing with sullen teenagers seem
more alluring than a vacation in Tahiti?
“In the end,” Simon Funk wrote, “the human species was simply marketed
out of existence.”

*
142
Thou Art Godshatter

Before the twentieth century, not a single human being had an explicit concept
of “inclusive genetic fitness,” the sole and absolute obsession of the blind idiot
god. We have no instinctive revulsion of condoms or oral sex. Our brains,
those supreme reproductive organs, don’t perform a check for reproductive
efficacy before granting us sexual pleasure.
Why not? Why aren’t we consciously obsessed with inclusive genetic fit-
ness? Why did the Evolution-of-Humans Fairy create brains that would invent
condoms? “It would have been so easy,” thinks the human, who can design
new complex systems in an afternoon.
The Evolution Fairy, as we all know, is obsessed with inclusive genetic fitness.
When she decides which genes to promote to universality, she doesn’t seem
to take into account anything except the number of copies a gene produces.
(How strange!)
But since the maker of intelligence is thus obsessed, why not create intel-
ligent agents—you can’t call them humans—who would likewise care purely
about inclusive genetic fitness? Such agents would have sex only as a means of
reproduction, and wouldn’t bother with sex that involved birth control. They
could eat food out of an explicitly reasoned belief that food was necessary to
reproduce, not because they liked the taste, and so they wouldn’t eat candy if
it became detrimental to survival or reproduction. Post-menopausal women
would babysit grandchildren until they became sick enough to be a net drain
on resources, and would then commit suicide.
It seems like such an obvious design improvement—from the Evolution
Fairy’s perspective.
Now it’s clear that it’s hard to build a powerful enough consequentialist.
Natural selection sort-of reasons consequentially, but only by depending on
the actual consequences. Human evolutionary theorists have to do really high-
falutin’ abstract reasoning in order to imagine the links between adaptations
and reproductive success.
But human brains, running on mere protein, clearly can imagine these links. So when the
Evolution Fairy made humans, why did It bother with any motivation except
inclusive genetic fitness?
It’s been less than two centuries since a protein brain first represented
the concept of natural selection. The modern notion of “inclusive genetic
fitness” is even more subtle, a highly abstract concept. What matters is not
the number of shared genes. Chimpanzees share 95% of your genes. What
matters is shared genetic variance, within a reproducing population—your
sister is one-half related to you, because any variations in your genome, within
the human species, are 50% likely to be shared by your sister.
Only in the last century—arguably only in the last fifty years—have evolu-
tionary biologists really begun to understand the full range of causes of repro-
ductive success, things like reciprocal altruism and costly signaling. Without
all this highly detailed knowledge, an intelligent agent that set out to “maximize
inclusive fitness” would fall flat on its face.
So why not preprogram protein brains with the knowledge? Why wasn’t a
concept of “inclusive genetic fitness” programmed into us, along with a library
of explicit strategies? Then you could dispense with all the reinforcers. The
organism would be born knowing that, with high probability, fatty foods would
lead to fitness. If the organism later learned that this was no longer the case,
it would stop eating fatty foods. You could refactor the whole system. And it
wouldn’t invent condoms or cookies.
This looks like it should be quite possible in principle. I occasionally run
into people who don’t quite understand consequentialism, who say, “But if
the organism doesn’t have a separate drive to eat, it will starve, and so fail
to reproduce.” So long as the organism knows this very fact, and has a utility
function that values reproduction, it will automatically eat. In fact, this is
exactly the consequentialist reasoning that natural selection itself used to build
automatic eaters.
What about curiosity? Wouldn’t a consequentialist only be curious when
it saw some specific reason to be curious? And wouldn’t this cause it to miss
out on lots of important knowledge that came with no specific reason for
investigation attached? Again, a consequentialist will investigate given only
the knowledge of this very same fact. If you consider the curiosity drive of a
human—which is not undiscriminating, but responds to particular features of
problems—then this complex adaptation is purely the result of consequentialist
reasoning by DNA, an implicit representation of knowledge: Ancestors who
engaged in this kind of inquiry left more descendants.
So in principle, the pure reproductive consequentialist is possible. In prin-
ciple, all the ancestral history implicitly represented in cognitive adaptations
can be converted to explicitly represented knowledge, running on a core conse-
quentialist.
But the blind idiot god isn’t that smart. Evolution is not a human program-
mer who can simultaneously refactor whole code architectures. Evolution is
not a human programmer who can sit down and type out instructions at sixty
words per minute.
For millions of years before hominid consequentialism, there was rein-
forcement learning. The reward signals were events that correlated reliably to
reproduction. You can’t ask a nonhominid brain to foresee that a child eat-
ing fatty foods now will live through the winter. So the DNA builds a protein
brain that generates a reward signal for eating fatty food. Then it’s up to the
organism to learn which prey animals are tastiest.
DNA constructs protein brains with reward signals that have a long-distance
correlation to reproductive fitness, but a short-distance correlation to organism
behavior. You don’t have to figure out that eating sugary food in the fall will
lead to digesting calories that can be stored as fat to help you survive the winter
so that you mate in spring to produce offspring in summer. An apple simply
tastes good, and your brain just has to plot out how to get more apples off the
tree.
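Here is a toy sketch of that arrangement (mine, not the text's; the foods, payoffs, and learning rule are all invented for illustration). The learner only ever sees the taste signal, so when a zero-fitness superstimulus shows up later, it switches to it without hesitation:

# A toy sketch, not from the text: a learner rewarded on taste, which in
# the "ancestral" menu happens to track fitness. All numbers are invented.
import random

taste   = {"tuber": 0.2, "fruit": 0.6, "meat": 0.8}   # the reward the brain gets
fitness = {"tuber": 0.2, "fruit": 0.6, "meat": 0.8}   # never consulted below

estimates = {food: 0.0 for food in taste}
counts = {food: 0 for food in taste}

def choose():
    # epsilon-greedy: mostly pick whatever currently looks tastiest
    if random.random() < 0.1:
        return random.choice(list(estimates))
    return max(estimates, key=estimates.get)

def eat(food):
    counts[food] += 1
    reward = taste[food]   # taste is the only signal the learner receives
    estimates[food] += (reward - estimates[food]) / counts[food]

for _ in range(2000):
    eat(choose())
print("ancestral favorite:", max(estimates, key=estimates.get))   # meat

# Civilization invents candy: maximal taste, zero fitness payoff.
taste["candy"], fitness["candy"] = 1.0, 0.0
estimates["candy"], counts["candy"] = 0.0, 0
for _ in range(2000):
    eat(choose())
print("modern favorite:", max(estimates, key=estimates.get))      # candy

Nothing in the update rule ever mentions fitness; that is what a short-distance reward signal amounts to.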
And so organisms evolve rewards for eating, and building nests, and scaring
off competitors, and helping siblings, and discovering important truths, and
forming strong alliances, and arguing persuasively, and of course having sex . . .
When hominid brains capable of cross-domain consequential reasoning
began to show up, they reasoned consequentially about how to get the existing
reinforcers. It was a relatively simple hack, vastly simpler than rebuilding an
“inclusive fitness maximizer” from scratch. The protein brains plotted how
to acquire calories and sex, without any explicit cognitive representation of
“inclusive fitness.”
A human engineer would have said, “Whoa, I’ve just invented a conse-
quentialist! Now I can take all my previous hard-won knowledge about which
behaviors improve fitness, and declare it explicitly! I can convert all this compli-
cated reinforcement learning machinery into a simple declarative knowledge
statement that ‘fatty foods and sex usually improve your inclusive fitness.’ Con-
sequential reasoning will automatically take care of the rest. Plus, it won’t have
the obvious failure mode where it invents condoms!”
But then a human engineer wouldn’t have built the retina backward, either.
The blind idiot god is not a unitary purpose, but a many-splintered attention.
Foxes evolve to catch rabbits, rabbits evolve to evade foxes; there are as many
evolutions as species. But within each species, the blind idiot god is purely
obsessed with inclusive genetic fitness. No quality is valued, not even survival,
except insofar as it increases reproductive fitness. There’s no point in an
organism with steel skin if it ends up having 1% less reproductive capacity.
Yet when the blind idiot god created protein computers, its monomaniacal
focus on inclusive genetic fitness was not faithfully transmitted. Its optimiza-
tion criterion did not successfully quine. We, the handiwork of evolution, are
as alien to evolution as our Maker is alien to us. One pure utility function
splintered into a thousand shards of desire.
Why? Above all, because evolution is stupid in an absolute sense. But also
because the first protein computers weren’t anywhere near as general as the
blind idiot god, and could only utilize short-term desires.
In the final analysis, asking why evolution didn’t build humans to maxi-
mize inclusive genetic fitness is like asking why evolution didn’t hand humans
a ribosome and tell them to design their own biochemistry. Because evolution
can’t refactor code that fast, that’s why. But maybe in a billion years of con-
tinued natural selection that’s exactly what would happen, if intelligence were
foolish enough to allow the idiot god continued reign.
The Mote in God’s Eye by Niven and Pournelle depicts an intelligent species
that stayed biological a little too long, slowly becoming truly enslaved by
evolution, gradually turning into true fitness maximizers obsessed with outre-
producing each other. But thankfully that’s not what happened. Not here on
Earth. At least not yet.
So humans love the taste of sugar and fat, and we love our sons and daugh-
ters. We seek social status, and sex. We sing and dance and play. We learn for
the love of learning.
A thousand delicious tastes, matched to ancient reinforcers that once cor-
related with reproductive fitness—now sought whether or not they enhance
reproduction. Sex with birth control, chocolate, the music of long-dead Bach
on a CD.
And when we finally learn about evolution, we think to ourselves: “Obsess
all day about inclusive genetic fitness? Where’s the fun in that?”
The blind idiot god’s single monomaniacal goal splintered into a thousand
shards of desire. And this is well, I think, though I’m a human who says so.
Or else what would we do with the future? What would we do with the billion
galaxies in the night sky? Fill them with maximally efficient replicators? Should
our descendants deliberately obsess about maximizing their inclusive genetic
fitness, regarding all else only as a means to that end?
Being a thousand shards of desire isn’t always fun, but at least it’s not boring.
Somewhere along the line, we evolved tastes for novelty, complexity, elegance,
and challenge—tastes that judge the blind idiot god’s monomaniacal focus,
and find it aesthetically unsatisfying.
And yes, we got those very same tastes from the blind idiot’s godshatter.
So what?

*
Part M

Fragile Purposes
143
Belief in Intelligence

I don’t know what moves Garry Kasparov would make in a chess game. What,
then, is the empirical content of my belief that “Kasparov is a highly intelligent
chess player”? What real-world experience does my belief tell me to anticipate?
Is it a cleverly masked form of total ignorance?
To sharpen the dilemma, suppose Kasparov plays against some mere chess
grandmaster Mr. G, who’s not in the running for world champion. My own
ability is far too low to distinguish between these levels of chess skill. When I
try to guess Kasparov’s move, or Mr. G’s next move, all I can do is try to guess
“the best chess move” using my own meager knowledge of chess. Then I would
produce exactly the same prediction for Kasparov’s move or Mr. G’s move in
any particular chess position. So what is the empirical content of my belief
that “Kasparov is a better chess player than Mr. G”?
The empirical content of my belief is the testable, falsifiable prediction
that the final chess position will occupy the class of chess positions that are
wins for Kasparov, rather than drawn games or wins for Mr. G. (Counting
resignation as a legal move that leads to a chess position classified as a loss.)
The degree to which I think Kasparov is a “better player” is reflected in the
amount of probability mass I concentrate into the “Kasparov wins” class of
outcomes, versus the “drawn game” and “Mr. G wins” class of outcomes. These
classes are extremely vague in the sense that they refer to vast spaces of possible
chess positions—but “Kasparov wins” is more specific than maximum entropy,
because it can be definitely falsified by a vast set of chess positions.
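A minimal numerical sketch of what that belief cashes out to (all probabilities here are invented for illustration): the belief concentrates probability mass on one class of outcomes and is scored against whatever actually happens, even though it says nothing about individual moves.

# Invented numbers: two beliefs about the same game, scored only on the
# coarse outcome classes, never on any particular move.
import math

skilled_belief = {"kasparov_wins": 0.85, "draw": 0.10, "mr_g_wins": 0.05}
max_entropy    = {"kasparov_wins": 1/3,  "draw": 1/3,  "mr_g_wins": 1/3}

def log_score(belief, outcome):
    # higher (closer to zero) is better; putting low probability on the
    # actual outcome is penalized
    return math.log(belief[outcome])

print(log_score(skilled_belief, "kasparov_wins"))   # about -0.16
print(log_score(max_entropy, "kasparov_wins"))      # about -1.10
print(log_score(skilled_belief, "mr_g_wins"))       # about -3.00, the risk taken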
The outcome of Kasparov’s game is predictable because I know, and un-
derstand, Kasparov’s goals. Within the confines of the chess board, I know
Kasparov’s motivations—I know his success criterion, his utility function, his
target as an optimization process. I know where Kasparov is ultimately trying
to steer the future and I anticipate he is powerful enough to get there, although
I don’t anticipate much about how Kasparov is going to do it.
Imagine that I’m visiting a distant city, and a local friend volunteers to
drive me to the airport. I don’t know the neighborhood. Each time my friend
approaches a street intersection, I don’t know whether my friend will turn
left, turn right, or continue straight ahead. I can’t predict my friend’s move
even as we approach each individual intersection—let alone predict the whole
sequence of moves in advance.
Yet I can predict the result of my friend’s unpredictable actions: we will
arrive at the airport. Even if my friend’s house were located elsewhere in
the city, so that my friend made a completely different sequence of turns, I
would just as confidently predict our arrival at the airport. I can predict this
long in advance, before I even get into the car. My flight departs soon, and
there’s no time to waste; I wouldn’t get into the car in the first place, if I
couldn’t confidently predict that the car would travel to the airport along an
unpredictable pathway.
Isn’t this a remarkable situation to be in, from a scientific perspective? I
can predict the outcome of a process, without being able to predict any of the
intermediate steps of the process.
How is this even possible? Ordinarily one predicts by imagining the present
and then running the visualization forward in time. If you want a precise model
of the Solar System, one that takes into account planetary perturbations, you
must start with a model of all major objects and run that model forward in
time, step by step.
Sometimes simpler problems have a closed-form solution, where calculat-
ing the future at time T takes the same amount of work regardless of T. A coin
rests on a table, and after each minute, the coin turns over. The coin starts
out showing heads. What face will it show a hundred minutes later? Obvi-
ously you did not answer this question by visualizing a hundred intervening
steps. You used a closed-form solution that worked to predict the outcome,
and would also work to predict any of the intervening steps.
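In code (a trivial sketch of my own), the two prediction styles look like this; they agree about the coin, but only the step-by-step style generalizes to systems with no known closed form:

# The coin starts on heads and flips over once per minute.
def simulate(minutes):
    face = "heads"
    for _ in range(minutes):          # visualize every intervening step
        face = "tails" if face == "heads" else "heads"
    return face

def closed_form(minutes):
    return "heads" if minutes % 2 == 0 else "tails"   # same work for any T

assert simulate(100) == closed_form(100) == "heads"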
But when my friend drives me to the airport, I can predict the outcome
successfully using a strange model that won’t work to predict any of the interme-
diate steps. My model doesn’t even require me to input the initial conditions—I
don’t need to know where we start out in the city!
I do need to know something about my friend. I must know that my friend
wants me to make my flight. I must credit that my friend is a good enough
planner to successfully drive me to the airport (if he wants to). These are
properties of my friend’s initial state—properties which let me predict the final
destination, though not any intermediate turns.
I must also credit that my friend knows enough about the city to drive
successfully. This may be regarded as a relation between my friend and the
city; hence, a property of both. But an extremely abstract property, which does
not require any specific knowledge about either the city, or about my friend’s
knowledge about the city.
This is one way of viewing the subject matter to which I’ve devoted my
life—these remarkable situations which place us in such odd epistemic positions.
And my work, in a sense, can be viewed as unraveling the exact form of that
strange abstract knowledge we can possess; whereby, not knowing the actions,
we can justifiably know the consequence.
“Intelligence” is too narrow a term to describe these remarkable situations
in full generality. I would say rather “optimization process.” A similar situation
accompanies the study of biological natural selection, for example; we can’t
predict the exact form of the next organism observed.
But my own specialty is the kind of optimization process called “intelli-
gence”; and even narrower, a particular kind of intelligence called “Friendly
Artificial Intelligence”—of which, I hope, I will be able to obtain especially
precise abstract knowledge.

*
144
Humans in Funny Suits

Many times the human species has travelled into space, only to find the stars
inhabited by aliens who look remarkably like humans in funny suits—or even
humans with a touch of makeup and latex—or just beige Caucasians in fee
simple.

[Image: still from Star Trek: The Original Series, “Arena,” © CBS Corporation]


It’s remarkable how the human form is the natural baseline of the universe,
from which all other alien species are derived via a few modifications.
What could possibly explain this fascinating phenomenon? Convergent
evolution, of course! Even though these alien life-forms evolved on a thousand
alien planets, completely independently from Earthly life, they all turned out
the same.
Don’t be fooled by the fact that a kangaroo (a mammal) resembles us rather
less than does a chimp (a primate), nor by the fact that a frog (amphibians, like
us, are tetrapods) resembles us less than the kangaroo. Don’t be fooled by the
bewildering variety of the insects, who split off from us even longer ago than
the frogs; don’t be fooled that insects have six legs, and their skeletons on the
outside, and a different system of optics, and rather different sexual practices.
You might think that a truly alien species would be more different from
us than we are from insects. As I said, don’t be fooled. For an alien species
to evolve intelligence, it must have two legs with one knee each attached to an
upright torso, and must walk in a way similar to us. You see, any intelligence
needs hands, so you’ve got to repurpose a pair of legs for that—and if you don’t
start with a four-legged being, it can’t develop a running gait and walk upright,
freeing the hands.
. . . Or perhaps we should consider, as an alternative theory, that it’s the
easy way out to use humans in funny suits.
But the real problem is not shape; it is mind. “Humans in funny suits” is a
well-known term in literary science-fiction fandom, and it does not refer to
something with four limbs that walks upright. An angular creature of pure
crystal is a “human in a funny suit” if she thinks remarkably like a human—
especially a human of an English-speaking culture of the late-twentieth/early-
twenty-first century.
I don’t watch a lot of ancient movies. When I was watching the movie
Psycho (1960) a few years back, I was taken aback by the cultural gap between
the Americans on the screen and my America. The buttoned-shirted characters
of Psycho are considerably more alien than the vast majority of so-called “aliens”
I encounter on TV or the silver screen.
To write a culture that isn’t just like your own culture, you have to be able to
see your own culture as a special case—not as a norm which all other cultures
must take as their point of departure. Studying history may help—but then
it is only little black letters on little white pages, not a living experience. I
suspect that it would help more to live for a year in China or Dubai or among
the !Kung . . . this I have never done, being busy. Occasionally I wonder what
things I might not be seeing (not there, but here).
Seeing your humanity as a special case is very much harder than this.
In every known culture, humans seem to experience joy, sadness, fear,
disgust, anger, and surprise. In every known culture, these emotions are
indicated by the same facial expressions. Next time you see an “alien”—or
an “AI,” for that matter—I bet that when it gets angry (and it will get angry), it
will show the human-universal facial expression for anger.
We humans are very much alike under our skulls—that goes with being a
sexually reproducing species; you can’t have everyone using different complex
adaptations, they wouldn’t assemble. (Do the aliens reproduce sexually, like
humans and many insects? Do they share small bits of genetic material, like
bacteria? Do they form colonies, like fungi? Does the rule of psychological
unity apply among them?)
The only intelligences your ancestors had to manipulate—complexly so, and
not just tame or catch in nets—the only minds your ancestors had to model
in detail—were minds that worked more or less like their own. And so we
evolved to predict Other Minds by putting ourselves in their shoes, asking what
we would do in their situations; for that which was to be predicted, was similar
to the predictor.
“What?” you say. “I don’t assume other people are just like me! Maybe
I’m sad, and they happen to be angry! They believe other things than I do;
their personalities are different from mine!” Look at it this way: a human
brain is an extremely complicated physical system. You are not modeling it
neuron-by-neuron or atom-by-atom. If you came across a physical system as
complex as the human brain which was not like you, it would take scientific
lifetimes to unravel it. You do not understand how human brains work in
an abstract, general sense; you can’t build one, and you can’t even build a
computer model that predicts other brains as well as you predict them.
The only reason you can try at all to grasp anything as physically com-
plex and poorly understood as the brain of another human being is that you
configure your own brain to imitate it. You empathize (though perhaps not
sympathize). You impose on your own brain the shadow of the other mind’s
anger and the shadow of its beliefs. You may never think the words, “What
would I do in this situation?,” but that little shadow of the other mind that you
hold within yourself is something animated within your own brain, invoking
the same complex machinery that exists in the other person, synchronizing
gears you don’t understand. You may not be angry yourself, but you know that
if you were angry at you, and you believed that you were godless scum, you
would try to hurt you . . .
This “empathic inference” (as I shall call it) works for humans, more or less.
But minds with different emotions—minds that feel emotions you’ve never
felt yourself, or that fail to feel emotions you would feel? That’s something you
can’t grasp by putting your brain into the other brain’s shoes. I can tell you
to imagine an alien that grew up in a universe with four spatial dimensions,
instead of three spatial dimensions, but you won’t be able to reconfigure your
visual cortex to see like that alien would see. I can try to write a story about
aliens with different emotions, but you won’t be able to feel those emotions,
and neither will I.
Imagine an alien watching a video of the Marx Brothers and having abso-
lutely no idea what was going on, or why you would actively seek out such
a sensory experience, because the alien has never conceived of anything re-
motely like a sense of humor. Don’t pity them for missing out; you’ve never
antled.
You might ask: Maybe the aliens do have a sense of humor, but you’re
not telling funny enough jokes? This is roughly the equivalent of trying to
speak English very loudly, and very slowly, in a foreign country, on the theory
that those foreigners must have an inner ghost that can hear the meaning
dripping from your words, inherent in your words, if only you can speak them
loud enough to overcome whatever strange barrier stands in the way of your
perfectly sensible English.
It is important to appreciate that laughter can be a beautiful and valuable
thing, even if it is not universalizable, even if it is not possessed by all possible
minds. It would be our own special part of the gift we give to tomorrow. That
can count for something too.
It had better, because universalizability is one metaethical notion that I
can’t salvage for you. Universalizability among humans, maybe; but not among
all possible minds.
And what about minds that don’t run on emotional architectures like your
own—that don’t have things analogous to emotions? No, don’t bother explain-
ing why any intelligent mind powerful enough to build complex machines
must inevitably have states analogous to emotions. Natural selection builds
complex machines without itself having emotions. Now there’s a Real Alien
for you—an optimization process that really Does Not Work Like You Do.
Much of the progress in biology since the 1960s has consisted of trying to
enforce a moratorium on anthropomorphizing evolution. That was a major
academic slap-fight, and I’m not sure that sanity would have won the day if
not for the availability of crushing experimental evidence backed up by clear
math. Getting people to stop putting themselves in alien shoes is a long, hard,
uphill slog. I’ve been fighting that battle on AI for years.
Our anthropomorphism runs very deep in us; it cannot be excised by a
simple act of will, a determination to say, “Now I shall stop thinking like a
human!” Humanity is the air we breathe; it is our generic, the white paper
on which we begin our sketches. And we do not think of ourselves as being
human when we are being human.
It is proverbial in literary science fiction that the true test of an author is
their ability to write Real Aliens. (And not just conveniently incomprehensible
aliens who, for their own mysterious reasons, do whatever the plot happens to
require.) Jack Vance was one of the great masters of this art. Vance’s humans,
if they come from a different culture, are more alien than most “aliens.” (Never
read any Vance? I would recommend starting with City of the Chasch.) Niven
and Pournelle’s The Mote in God’s Eye also gets a standard mention here.
And conversely—well, I once read a science fiction author (I think Orson
Scott Card) say that the all-time low point of television science fiction was
the Star Trek episode where parallel evolution has proceeded to the extent
of producing aliens who not only look just like humans, who not only speak
English, but have also independently rewritten, word for word, the preamble
to the US Constitution.
This is the Great Failure of Imagination. Don’t think that it’s just about
science fiction, or even just about AI. The inability to imagine the alien is the
inability to see yourself—the inability to understand your own specialness.
Who can see a human camouflaged against a human background?

*
145
Optimization and the Intelligence
Explosion

Among the topics I haven’t delved into here is the notion of an optimization
process. Roughly, this is the idea that your power as a mind is your ability to
hit small targets in a large search space—this can be either the space of possible
futures (planning) or the space of possible designs (invention).
Suppose you have a car, and suppose we already know that your preferences
involve travel. Now suppose that you take all the parts in the car, or all the
atoms, and jumble them up at random. It’s very unlikely that you’ll end up with
a travel-artifact at all, even so much as a wheeled cart; let alone a travel-artifact
that ranks as high in your preferences as the original car. So, relative to your
preference ordering, the car is an extremely improbable artifact. The power of
an optimization process is that it can produce this kind of improbability.
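A toy quantification of that gap (my own sketch; the 40-bit "design space" and all parameters are arbitrary): random jumbling essentially never hits a specific target, while even a dumb keep-the-improvement search gets there quickly.

# Arbitrary toy numbers: one target design in a space of 2**40 possibilities.
import random

N = 40
target = [random.randint(0, 1) for _ in range(N)]

def score(design):
    # how well the design matches the (implicit) preference ordering
    return sum(d == t for d, t in zip(design, target))

# Random jumbling: each try hits the exact target with probability 2**-40.
best_random = max(score([random.randint(0, 1) for _ in range(N)])
                  for _ in range(10_000))

# Hill-climbing: flip one bit at a time, keep the flip only if it helps.
design = [random.randint(0, 1) for _ in range(N)]
flips = 0
while score(design) < N:
    i = random.randrange(N)
    candidate = design[:i] + [1 - design[i]] + design[i + 1:]
    if score(candidate) > score(design):
        design = candidate
    flips += 1

print(best_random, "of", N, "bits matched by 10,000 random jumbles")
print("full target reached after", flips, "attempted bit flips")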
You can view both intelligence and natural selection as special cases of opti-
mization: processes that hit, in a large search space, very small targets defined
by implicit preferences. Natural selection prefers more efficient replicators.
Human intelligences have more complex preferences. Neither evolution nor
humans have consistent utility functions, so viewing them as “optimization
processes” is understood to be an approximation. You’re trying to get at the sort
of work being done, not claim that humans or evolution do this work perfectly.
This is how I see the story of life and intelligence—as a story of improbably
good designs being produced by optimization processes. The “improbability”
here is improbability relative to a random selection from the design space,
not improbability in an absolute sense—if you have an optimization process
around, then “improbably” good designs become probable.
Looking over the history of optimization on Earth up until now, the first
step is to conceptually separate the meta level from the object level—separate
the structure of optimization from that which is optimized.
If you consider biology in the absence of hominids, then on the object
level we have things like dinosaurs and butterflies and cats. On the meta level
we have things like sexual recombination and natural selection of asexual
populations. The object level, you will observe, is rather more complicated
than the meta level. Natural selection is not an easy subject and it involves
math. But if you look at the anatomy of a whole cat, the cat has dynamics
immensely more complicated than “mutate, recombine, reproduce.”
This is not surprising. Natural selection is an accidental optimization pro-
cess, that basically just started happening one day in a tidal pool somewhere.
A cat is the product of billions of years of evolution.
Cats have brains, of course, which operate to learn over a lifetime; but at
the end of the cat’s lifetime, that information is thrown away, so it does not
accumulate. The cumulative effects of cat-brains upon the world as optimizers,
therefore, are relatively small.
Or consider a bee brain, or a beaver brain. A bee builds hives, and a beaver
builds dams; but they didn’t figure out how to build them from scratch. A
beaver can’t figure out how to build a hive, a bee can’t figure out how to build
a dam.
So animal brains—up until recently—were not major players in the plan-
etary game of optimization; they were pieces but not players. Compared to
evolution, brains lacked both generality of optimization power (they could
not produce the amazing range of artifacts produced by evolution) and cu-
mulative optimization power (their products did not accumulate complexity
over time). For more on this theme see Protein Reinforcement and DNA
Consequentialism.
Very recently, certain animal brains have begun to exhibit both generality
of optimization power (producing an amazingly wide range of artifacts, in
time scales too short for natural selection to play any significant role) and
cumulative optimization power (artifacts of increasing complexity, as a result
of skills passed on through language and writing).
Natural selection takes hundreds of generations to do anything and mil-
lions of years for de novo complex designs. Human programmers can design a
complex machine with a hundred interdependent elements in a single after-
noon. This is not surprising, since natural selection is an accidental optimiza-
tion process that basically just started happening one day, whereas humans are
optimized optimizers handcrafted by natural selection over millions of years.
The wonder of evolution is not how well it works, but that it works at all
without being optimized. This is how optimization bootstrapped itself into
the universe—starting, as one would expect, from an extremely inefficient
accidental optimization process. Which is not the accidental first replicator,
mind you, but the accidental first process of natural selection. Distinguish the
object level and the meta level!
Since the dawn of optimization in the universe, a certain structural com-
monality has held across both natural selection and human intelligence . . .
Natural selection selects on genes, but generally speaking, the genes do not
turn around and optimize natural selection. The invention of sexual recombi-
nation is an exception to this rule, and so is the invention of cells and DNA.
And you can see both the power and the rarity of such events, by the fact that
evolutionary biologists structure entire histories of life on Earth around them.
But if you step back and take a human standpoint—if you think like a
programmer—then you can see that natural selection is still not all that com-
plicated. We’ll try bundling different genes together? We’ll try separating
information storage from moving machinery? We’ll try randomly recombin-
ing groups of genes? On an absolute scale, these are the sort of bright ideas
that any smart hacker comes up with during the first ten minutes of thinking
about system architectures.
Because natural selection started out so inefficient (as a completely acci-
dental process), this tiny handful of meta-level improvements feeding back
in from the replicators—nowhere near as complicated as the structure of a
cat—structure the evolutionary epochs of life on Earth.
And after all that, natural selection is still a blind idiot of a god. Gene pools
can evolve to extinction, despite all cells and sex.
Now natural selection does feed on itself in the sense that each new adapta-
tion opens up new avenues of further adaptation; but that takes place on the
object level. The gene pool feeds on its own complexity—but only thanks to
the protected interpreter of natural selection that runs in the background, and
that is not itself rewritten or altered by the evolution of species.
Likewise, human beings invent sciences and technologies, but we have not
yet begun to rewrite the protected structure of the human brain itself. We have
a prefrontal cortex and a temporal cortex and a cerebellum, just like the first
inventors of agriculture. We haven’t started to genetically engineer ourselves.
On the object level, science feeds on science, and each new discovery paves the
way for new discoveries—but all that takes place with a protected interpreter,
the human brain, running untouched in the background.
We have meta-level inventions like science, that try to instruct humans in
how to think. But the first person to invent Bayes’s Theorem did not become a
Bayesian; they could not rewrite themselves, lacking both that knowledge and
that power. Our significant innovations in the art of thinking, like writing and
science, are so powerful that they structure the course of human history; but
they do not rival the brain itself in complexity, and their effect upon the brain
is comparatively shallow.
The present state of the art in rationality training is not sufficient to turn
an arbitrarily selected mortal into Albert Einstein, which shows the power of a
few minor genetic quirks of brain design compared to all the self-help books
ever written in the twentieth century.
Because the brain hums away invisibly in the background, people tend
to overlook its contribution and take it for granted; and talk as if the simple
instruction to “Test ideas by experiment,” or the p < 0.05 significance rule,
were the same order of contribution as an entire human brain. Try telling
chimpanzees to test their ideas by experiment and see how far you get.
Now . . . some of us want to intelligently design an intelligence that would
be capable of intelligently redesigning itself, right down to the level of machine
code.
The machine code at first, and the laws of physics later, would be a protected
level of a sort. But that “protected level” would not contain the dynamic of
optimization; the protected levels would not structure the work. The human
brain does quite a bit of optimization on its own, and screws up on its own,
no matter what you try to tell it in school. But this fully wraparound recursive
optimizer would have no protected level that was optimizing. All the structure
of optimization would be subject to optimization itself.
And that is a sea change which breaks with the entire past since the first
replicator, because it breaks the idiom of a protected meta level.
The history of Earth up until now has been a history of optimizers spinning
their wheels at a constant rate, generating a constant optimization pressure.
And creating optimized products, not at a constant rate, but at an accelerating
rate, because of how object-level innovations open up the pathway to other
object-level innovations. But that acceleration is taking place with a protected
meta level doing the actual optimizing. Like a search that leaps from island to
island in the search space, and good islands tend to be adjacent to even better
islands, but the jumper doesn’t change its legs. Occasionally, a few tiny little
changes manage to hit back to the meta level, like sex or science, and then
the history of optimization enters a new epoch and everything proceeds faster
from there.
Imagine an economy without investment, or a university without language,
a technology without tools to make tools. Once in a hundred million years, or
once in a few centuries, someone invents a hammer.
That is what optimization has been like on Earth up until now.
When I look at the history of Earth, I don’t see a history of optimization
over time. I see a history of optimization power in, and optimized products out.
Up until now, thanks to the existence of almost entirely protected meta-levels,
it’s been possible to split up the history of optimization into epochs, and, within
each epoch, graph the cumulative object-level optimization over time, because
the protected level is running in the background and is not itself changing
within an epoch.
What happens when you build a fully wraparound, recursively self-
improving AI? Then you take the graph of “optimization in, optimized out,”
and fold the graph in on itself. Metaphorically speaking.
If the AI is weak, it does nothing, because it is not powerful enough to
significantly improve itself—like telling a chimpanzee to rewrite its own brain.
If the AI is powerful enough to rewrite itself in a way that increases its
ability to make further improvements, and this reaches all the way down to
the AI’s full understanding of its own source code and its own design as an
optimizer . . . then even if the graph of “optimization power in” and “optimized
product out” looks essentially the same, the graph of optimization over time is
going to look completely different from Earth’s history so far.
People often say something like, “But what if it requires exponentially
greater amounts of self-rewriting for only a linear improvement?” To this
the obvious answer is, “Natural selection exerted roughly constant optimiza-
tion power on the hominid line in the course of coughing up humans; and
this doesn’t seem to have required exponentially more time for each linear
increment of improvement.”
All of this is still mere analogic reasoning. A full Artificial General Intelli-
gence thinking about the nature of optimization and doing its own AI research
and rewriting its own source code, is not really like a graph of Earth’s history
folded in on itself. It is a different sort of beast. These analogies are at best
good for qualitative predictions, and even then, I have a large amount of other
beliefs I haven’t yet explained, which are telling me which analogies to make,
et cetera.
But if you want to know why I might be reluctant to extend the graph of
biological and economic growth over time, into the future and over the horizon
of an AI that thinks at transistor speeds and invents self-replicating molecular
nanofactories and improves its own source code, then there is my reason: you
are drawing the wrong graph, and it should be optimization power in versus
optimized product out, not optimized product versus time.

*
146
Ghosts in the Machine

People hear about Friendly AI and say—this is one of the top three initial
reactions:
“Oh, you can try to tell the AI to be Friendly, but if the AI can modify its
own source code, it’ll just remove any constraints you try to place on it.”
And where does that decision come from?
Does it enter from outside causality, rather than being an effect of a lawful
chain of causes that started with the source code as originally written? Is the
AI the ultimate source of its own free will?
A Friendly AI is not a selfish AI constrained by a special extra conscience
module that overrides the AI’s natural impulses and tells it what to do. You just
build the conscience, and that is the AI. If you have a program that computes
which decision the AI should make, you’re done. The buck stops immediately.
At this point, I shall take a moment to quote some case studies from the
Computer Stupidities site and Programming subtopic. (I am not linking to
this, because it is a fearsome time-trap; you can Google if you dare.)

I tutored college students who were taking a computer programming course. A few of them didn't understand that computers are
not sentient. More than one person used comments in their Pas-
cal programs to put detailed explanations such as, “Now I need
you to put these letters on the screen.” I asked one of them what
the deal was with those comments. The reply: “How else is the
computer going to understand what I want it to do?” Apparently
they would assume that since they couldn’t make sense of Pascal,
neither could the computer.

While in college, I used to tutor in the school's math lab. A student came in because his BASIC program would not run. He
was taking a beginner course, and his assignment was to write a
program that would calculate the recipe for oatmeal cookies, de-
pending upon the number of people you’re baking for. I looked
at his program, and it went something like this:
10 Preheat oven to 350
20 Combine all ingredients in a large mixing bowl
30 Mix until smooth

An introductory programming student once asked me to look at his program and figure out why it was always churning out zeroes
as the result of a simple computation. I looked at the program,
and it was pretty obvious:

begin
read("Number of Apples", apples)
read("Number of Carrots", carrots)
read("Price for 1 Apple", a_price)
read("Price for 1 Carrot", c_price)
write("Total for Apples", a_total)
write("Total for Carrots", c_total)
write("Total", total)
total = a_total + c_total
a_total = apples * a_price
c_total = carrots * c_price
end

Me: “Well, your program can’t print correct results before they’re
computed.”
Him: “Huh? It’s logical what the right solution is, and the com-
puter should reorder the instructions the right way.”
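For contrast, here is the same computation with the statements put into causal order (rendered as a Python sketch rather than the student's pseudo-language; the variable names follow the original):

# The student's program, reordered so results are computed before printing.
apples = int(input("Number of Apples: "))
carrots = int(input("Number of Carrots: "))
a_price = float(input("Price for 1 Apple: "))
c_price = float(input("Price for 1 Carrot: "))

a_total = apples * a_price      # compute first ...
c_total = carrots * c_price
total = a_total + c_total

print("Total for Apples:", a_total)    # ... then print
print("Total for Carrots:", c_total)
print("Total:", total)

The computer does not reorder anything on its own; the causal order you write is the only order there is.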

There’s an instinctive way of imagining the scenario of “programming an AI.”


It maps onto a similar-seeming human endeavor: Telling a human being what
to do. Like the “program” is giving instructions to a little ghost that sits inside
the machine, which will look over your instructions and decide whether it
likes them or not.
There is no ghost who looks over the instructions and decides how to follow
them. The program is the AI.
That doesn’t mean the ghost does anything you wish for, like a genie. It
doesn’t mean the ghost does everything you want the way you want it, like a
slave of exceeding docility. It means your instruction is the only ghost that’s
there, at least at boot time.
AI is much harder than people instinctively imagined, exactly because you
can’t just tell the ghost what to do. You have to build the ghost from scratch,
and everything that seems obvious to you, the ghost will not see unless you
know how to make the ghost see it. You can’t just tell the ghost to see it. You
have to create that-which-sees from scratch.
If you don’t know how to build something that seems to have some strange
ineffable elements like, say, “decision-making,” then you can’t just shrug your
shoulders and let the ghost’s free will do the job. You’re left forlorn and
ghostless.
There’s more to building a chess-playing program than building a really fast
processor—so the AI will be really smart—and then typing at the command
prompt “Make whatever chess moves you think are best.” You might think that,
since the programmers themselves are not very good chess players, any advice
they tried to give the electronic superbrain would just slow the ghost down.
But there is no ghost. You see the problem.
And there isn’t a simple spell you can perform to—poof!—summon a
complete ghost into the machine. You can’t say, “I summoned the ghost, and
it appeared; that’s cause and effect for you.” (It doesn’t work if you use the
notion of “emergence” or “complexity” as a substitute for “summon,” either.)
You can’t give an instruction to the CPU, “Be a good chess player!” You have
to see inside the mystery of chess-playing thoughts, and structure the whole
ghost from scratch.
No matter how common-sensical, no matter how logical, no matter how
“obvious” or “right” or “self-evident” or “intelligent” something seems to you,
it will not happen inside the ghost. Unless it happens at the end of a chain of
cause and effect that began with the instructions that you had to decide on,
plus any causal dependencies on sensory data that you built into the starting
instructions.
This doesn’t mean you program in every decision explicitly. Deep Blue
was a chess player far superior to its programmers. Deep Blue made better
chess moves than anything its makers could have explicitly programmed—but
not because the programmers shrugged and left it up to the ghost. Deep Blue
moved better than its programmers . . . at the end of a chain of cause and
effect that began in the programmers’ code and proceeded lawfully from there.
Nothing happened just because it was so obviously a good move that Deep
Blue’s ghostly free will took over, without the code and its lawful consequences
being involved.
If you try to wash your hands of constraining the AI, you aren’t left with a
free ghost like an emancipated slave. You are left with a heap of sand that no
one has purified into silicon, shaped into a CPU and programmed to think.
Go ahead, try telling a computer chip “Do whatever you want!” See what
happens? Nothing. Because you haven’t constrained it to understand freedom.
All it takes is one single step that is so obvious, so logical, so self-evident that
your mind just skips right over it, and you’ve left the path of the AI programmer.
It takes an effort like the one I illustrate in Grasping Slippery Things to prevent
your mind from doing this.

*
147
Artificial Addition

Suppose that human beings had absolutely no idea how they performed arith-
metic. Imagine that human beings had evolved, rather than having learned,
the ability to count sheep and add sheep. People using this built-in ability have
no idea how it works, the way Aristotle had no idea how his visual cortex sup-
ported his ability to see things. Peano Arithmetic as we know it has not been
invented. There are philosophers working to formalize numerical intuitions,
but they employ notations such as

Plus-Of(Seven, Six) = Thirteen

to formalize the intuitively obvious fact that when you add “seven” plus “six,”
of course you get “thirteen.”
In this world, pocket calculators work by storing a giant lookup table of
arithmetical facts, entered manually by a team of expert Artificial Arithmeti-
cians, for starting values that range between zero and one hundred. While
these calculators may be helpful in a pragmatic sense, many philosophers argue
that they’re only simulating addition, rather than really adding. No machine
can really count—that’s why humans have to count thirteen sheep before typ-
ing “thirteen” into the calculator. Calculators can recite back stored facts, but
they can never know what the statements mean—if you type in “two hundred
plus two hundred” the calculator says “Error: Outrange,” when it’s intuitively
obvious, if you know what the words mean, that the answer is “four hundred.”
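A toy rendering of that pocket calculator (my sketch; the table bounds and the error string simply mirror the parable):

# A finite lookup table of "arithmetical facts" covering 0..100 on each side.
# (Generating it with Python's own + is cheating the parable; imagine each
# entry typed in by hand by an expert Artificial Arithmetician.)
LOOKUP = {(a, b): a + b for a in range(101) for b in range(101)}

def pocket_calculator(a, b):
    try:
        return LOOKUP[(a, b)]
    except KeyError:
        return "Error: Outrange"

print(pocket_calculator(7, 6))        # 13 -- a recited fact
print(pocket_calculator(200, 200))    # Error: Outrange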
Some philosophers, of course, are not so naive as to be taken in by these in-
tuitions. Numbers are really a purely formal system—the label “thirty-seven” is
meaningful, not because of any inherent property of the words themselves, but
because the label refers to thirty-seven sheep in the external world. A number
is given this referential property by its semantic network of relations to other
numbers. That's why, in computer programs, the Lisp token for "thirty-seven"
doesn’t need any internal structure—it’s only meaningful because of reference
and relation, not some computational property of “thirty-seven” itself.
No one has ever developed an Artificial General Arithmetician, though of
course there are plenty of domain-specific, narrow Artificial Arithmeticians
that work on numbers between “twenty” and “thirty,” and so on. And if you
look at how slow progress has been on numbers in the range of “two hundred,”
then it becomes clear that we’re not going to get Artificial General Arithmetic
any time soon. The best experts in the field estimate it will be at least a hundred
years before calculators can add as well as a human twelve-year-old.
But not everyone agrees with this estimate, or with merely conventional
beliefs about Artificial Arithmetic. It’s common to hear statements such as the
following:

• “It’s a framing problem—what ‘twenty-one plus’ equals depends on whether it’s ‘plus three’ or ‘plus four.’ If we can just get enough arithmetical facts stored to cover the common-sense truths that everyone knows, we’ll start to see real addition in the network.”

• “But you’ll never be able to program in that many arithmetical facts by hiring experts to enter them manually. What we need is an Artificial Arithmetician that can learn the vast network of relations between numbers that humans acquire during their childhood by observing sets of apples.”

• “No, what we really need is an Artificial Arithmetician that can understand natural language, so that instead of having to be explicitly told that twenty-one plus sixteen equals thirty-seven, it can get the knowledge by exploring the Web.”

• “Frankly, it seems to me that you’re just trying to convince yourselves that you can solve the problem. None of you really know what arithmetic is, so you’re floundering around with these generic sorts of arguments. ‘We need an AA that can learn X,’ ‘We need an AA that can extract X from the Internet.’ I mean, it sounds good, it sounds like you’re making progress, and it’s even good for public relations, because everyone thinks they understand the proposed solution—but it doesn’t really get you any closer to general addition, as opposed to domain-specific addition. Probably we will never know the fundamental nature of arithmetic. The problem is just too hard for humans to solve.”

• “That’s why we need to develop a general arithmetician the same way Nature did—evolution.”

• “Top-down approaches have clearly failed to produce arithmetic. We need a bottom-up approach, some way to make arithmetic emerge. We have to acknowledge the basic unpredictability of complex systems.”

• “You’re all wrong. Past efforts to create machine arithmetic were futile from the start, because they just didn’t have enough computing power. If you look at how many trillions of synapses there are in the human brain, it’s clear that calculators don’t have lookup tables anywhere near that large. We need calculators as powerful as a human brain. According to Moore’s Law, this will occur in the year 2031 on April 27 between 4:00 and 4:30 in the morning.”

• “I believe that machine arithmetic will be developed when researchers scan each neuron of a complete human brain into a computer, so that we can simulate the biological circuitry that performs addition in humans.”

• “I don’t think we have to wait to scan a whole brain. Neural networks are just like the human brain, and you can train them to do things without knowing how they do them. We’ll create programs that will do arithmetic without we, our creators, ever understanding how they do arithmetic.”

• “But Gödel’s Theorem shows that no formal system can ever capture the basic properties of arithmetic. Classical physics is formalizable, so to add two and two, the brain must take advantage of quantum physics.”

• “Hey, if human arithmetic were simple enough that we could reproduce it in a computer, we wouldn’t be able to count high enough to build computers.”

• “Haven’t you heard of John Searle’s Chinese Calculator Experiment? Even if you did have a huge set of rules that would let you add ‘twenty-one’ and ‘sixteen,’ just imagine translating all the words into Chinese, and you can see that there’s no genuine addition going on. There are no real numbers anywhere in the system, just labels that humans use for numbers . . .”

There is more than one moral to this parable, and I have told it with different
morals in different contexts. It illustrates the idea of levels of organization,
for example—a CPU can add two large numbers because the numbers aren’t
black-box opaque objects, they’re ordered structures of 32 bits.
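
As a concrete illustration of that levels-of-organization point (my own sketch in Python, not anything from the parable): treat each number as an ordered structure of 32 bits, and addition falls out of nothing but single-bit operations and a carry.

def to_bits(n, width=32):
    """Represent a nonnegative integer as a list of 32 bits, least significant first."""
    return [(n >> i) & 1 for i in range(width)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def add_32bit(x_bits, y_bits):
    """Ripple-carry addition: operate only on individual bits and a carry,
    the way a CPU's adder circuit does; overflow past bit 31 is discarded."""
    result, carry = [], 0
    for a, b in zip(x_bits, y_bits):
        total = a + b + carry
        result.append(total & 1)
        carry = total >> 1
    return result

print(from_bits(add_32bit(to_bits(1_000_000), to_bits(2_345_678))))  # 3345678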
But for purposes of overcoming bias, let us draw two morals:

• First, the danger of believing assertions you can’t regenerate from your
own knowledge.

• Second, the danger of trying to dance around basic confusions.

Lest anyone accuse me of generalizing from fictional evidence, both lessons
may be drawn from the real history of Artificial Intelligence as well.
The first danger is the object-level problem that the AA devices ran into:
they functioned as tape recorders playing back “knowledge” generated from
outside the system, using a process they couldn’t capture internally. A human
could tell the AA device that “twenty-one plus sixteen equals thirty-seven,”
and the AA devices could record this sentence and play it back, or even pattern-
match “twenty-one plus sixteen” to output “thirty-seven!”—but the AA devices
couldn’t generate such knowledge for themselves.
Which is strongly reminiscent of believing a physicist who tells you “Light is
waves,” recording the fascinating words and playing them back when someone
asks “What is light made of?,” without being able to generate the knowledge
for yourself.
The second moral is the meta-level danger that consumed the Artificial
Arithmetic researchers and opinionated bystanders—the danger of dancing
around confusing gaps in your knowledge. The tendency to do just about
anything except grit your teeth and buckle down and fill in the damn gap.
Whether you say, “It is emergent!,” or whether you say, “It is unknowable!,”
in neither case are you acknowledging that there is a basic insight required
which is possessable, but unpossessed by you.
How can you know when you’ll have a new basic insight? You can’t. And there’s no way to get one except by banging your head against the problem, learning everything you can about it, studying it from as many angles as possible, perhaps for years.
It’s not a pursuit that academia is set up to permit, when you need to publish at
least one paper per month. It’s certainly not something that venture capitalists
will fund. You want to either go ahead and build the system now, or give up
and do something else instead.
Look at the comments above: none are aimed at setting out on a quest for
the missing insight which would make numbers no longer mysterious, make
“twenty-seven” more than a black box. None of the commenters realized that
their difficulties arose from ignorance or confusion in their own minds, rather
than an inherent property of arithmetic. They were not trying to achieve a
state where the confusing thing ceased to be confusing.
If you read Judea Pearl’s Probabilistic Reasoning in Intelligent Systems:
Networks of Plausible Inference,1 then you will see that the basic insight be-
hind graphical models is indispensable to problems that require it. (It’s not
something that fits on a T-shirt, I’m afraid, so you’ll have to go and read the
book yourself. I haven’t seen any online popularizations of Bayesian networks
that adequately convey the reasons behind the principles, or the importance
of the math being exactly the way it is, but Pearl’s book is wonderful.) There
were once dozens of “non-monotonic logics” awkwardly trying to capture in-
tuitions such as “If my burglar alarm goes off, there was probably a burglar,
but if I then learn that there was a small earthquake near my home, there was
probably not a burglar.” With the graphical-model insight in hand, you can
give a mathematical explanation of exactly why first-order logic has the wrong
properties for the job, and express the correct solution in a compact way that
captures all the common-sense details in one elegant swoop. Until you have
that insight, you’ll go on patching the logic here, patching it there, adding more
and more hacks to force it into correspondence with everything that seems
“obviously true.”
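
To make the burglar-alarm intuition concrete, here is a small Python sketch of the usual three-node network (burglar and earthquake both influence the alarm), with made-up illustrative probabilities rather than anything taken from Pearl. Brute-force enumeration produces the behavior the non-monotonic logics struggled with: hearing the alarm raises the probability of a burglar dramatically, and then learning of the earthquake "explains away" the alarm and sends that probability back down.

from itertools import product

P_BURGLAR = 0.001     # prior probability of a burglary (illustrative)
P_EARTHQUAKE = 0.01   # prior probability of a small earthquake (illustrative)

def p_alarm(burglar, earthquake):
    """P(alarm | burglar, earthquake); numbers chosen only for illustration."""
    if burglar and earthquake: return 0.99
    if burglar:                return 0.95
    if earthquake:             return 0.30
    return 0.001

def p_burglar_given_alarm(earthquake_observed=None):
    """P(burglar | alarm), optionally also conditioning on the earthquake."""
    numerator = denominator = 0.0
    for burglar, earthquake in product([True, False], repeat=2):
        if earthquake_observed is not None and earthquake != earthquake_observed:
            continue
        joint = ((P_BURGLAR if burglar else 1 - P_BURGLAR)
                 * (P_EARTHQUAKE if earthquake else 1 - P_EARTHQUAKE)
                 * p_alarm(burglar, earthquake))
        denominator += joint
        if burglar:
            numerator += joint
    return numerator / denominator

print(p_burglar_given_alarm())                          # ~0.19: alarm alone makes a burglar likely
print(p_burglar_given_alarm(earthquake_observed=True))  # ~0.003: the earthquake explains the alarm away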
You won’t know the Artificial Arithmetic problem is unsolvable without its
key. If you don’t know the rules, you don’t know the rule that says you need to
know the rules to do anything. And so there will be all sorts of clever ideas
that seem like they might work, like building an Artificial Arithmetician that
can read natural language and download millions of arithmetical assertions
from the Internet.
And yet somehow the clever ideas never work. Somehow it always turns out
that you “couldn’t see any reason it wouldn’t work” because you were ignorant
of the obstacles, not because no obstacles existed. Like shooting blindfolded
at a distant target—you can fire blind shot after blind shot, crying, “You can’t
prove to me that I won’t hit the center!” But until you take off the blindfold,
you’re not even in the aiming game. When “no one can prove to you” that
your precious idea isn’t right, it means you don’t have enough information
to strike a small target in a vast answer space. Until you know your idea will
work, it won’t.
From the history of previous key insights in Artificial Intelligence, and the
grand messes that were proposed prior to those insights, I derive an important
real-life lesson: When the basic problem is your ignorance, clever strategies for
bypassing your ignorance lead to shooting yourself in the foot.

1. Pearl, Probabilistic Reasoning in Intelligent Systems.


148
Terminal Values and Instrumental
Values

On a purely instinctive level, any human planner behaves as if they distinguish
between means and ends. Want chocolate? There’s chocolate at the Publix
supermarket. You can get to the supermarket if you drive one mile south on
Washington Ave. You can drive if you get into the car. You can get into the car
if you open the door. You can open the door if you have your car keys. So you
put your car keys into your pocket, and get ready to leave the house . . .
. . . when suddenly the word comes on the radio that an earthquake has
destroyed all the chocolate at the local Publix. Well, there’s no point in driving
to the Publix if there’s no chocolate there, and no point in getting into the car
if you’re not driving anywhere, and no point in having car keys in your pocket
if you’re not driving. So you take the car keys out of your pocket, and call the
local pizza service and have them deliver a chocolate pizza. Mm, delicious.
I rarely notice people losing track of plans they devised themselves. People
usually don’t drive to the supermarket if they know the chocolate is gone.
But I’ve also noticed that when people begin explicitly talking about goal
systems instead of just wanting things, mentioning “goals” instead of using
them, they oft become confused. Humans are experts at planning, not experts
on planning, or there’d be a lot more AI developers in the world.
In particular, I’ve noticed people get confused when—in abstract philo-
sophical discussions rather than everyday life—they consider the distinction
between means and ends; more formally, between “instrumental values” and
“terminal values.”
Part of the problem, it seems to me, is that the human mind uses a rather
ad-hoc system to keep track of its goals—it works, but not cleanly. English
doesn’t embody a sharp distinction between means and ends: “I want to save
my sister’s life” and “I want to administer penicillin to my sister” use the same
word “want.”
Can we describe, in mere English, the distinction that is getting lost?
As a first stab:
“Instrumental values” are desirable strictly conditional on their anticipated
consequences. “I want to administer penicillin to my sister,” not because a
penicillin-filled sister is an intrinsic good, but in anticipation of penicillin
curing her flesh-eating pneumonia. If instead you anticipated that injecting
penicillin would melt your sister into a puddle like the Wicked Witch of the
West, you’d fight just as hard to keep her penicillin-free.
“Terminal values” are desirable without conditioning on other conse-
quences: “I want to save my sister’s life” has nothing to do with your an-
ticipating whether she’ll get injected with penicillin after that.
This first attempt suffers from obvious flaws. If saving my sister’s life would
cause the Earth to be swallowed up by a black hole, then I would go off and
cry for a while, but I wouldn’t administer penicillin. Does this mean that
saving my sister’s life was not a “terminal” or “intrinsic” value, because it’s
theoretically conditional on its consequences? Am I only trying to save her
life because of my belief that a black hole won’t consume the Earth afterward?
Common sense should say that’s not what’s happening.
So forget English. We can set up a mathematical description of a decision
system in which terminal values and instrumental values are separate and in-
compatible types—like integers and floating-point numbers, in a programming
language with no automatic conversion between them.
An ideal Bayesian decision system can be set up using only four elements:

• Outcomes : type Outcome[]

– list of possible outcomes
– {sister lives, sister dies}

• Actions : type Action[]

– list of possible actions
– {administer penicillin, don’t administer penicillin}

• Utility_function : type Outcome -> Utility

– utility function that maps each outcome onto a utility
– (a utility being representable as a real number between negative and positive infinity)
– {sister lives ↦ 1, sister dies ↦ 0}

• Conditional_probability_function :
type Action -> (Outcome -> Probability)

– conditional probability function that maps each action onto a probability distribution over outcomes
– (a probability being representable as a real number between 0 and 1)
– {administer penicillin ↦ (sister lives ↦ 0.9, sister dies ↦ 0.1),
   don’t administer penicillin ↦ (sister lives ↦ 0.3, sister dies ↦ 0.7)}
If you can’t read the type system directly, don’t worry, I’ll always translate into
English. For programmers, seeing it described in distinct statements helps to
set up distinct mental objects.
And the decision system itself?
• Expected_Utility : Action A ->
(Sum O in Outcomes: Utility(O) * Probability(O|A))

– The “expected utility” of an action equals the sum, over all outcomes, of the utility of that outcome times the conditional probability of that outcome given that action.
– {EU(administer penicillin) = 0.9, EU(don’t administer penicillin) = 0.3}

• Choose :
-> (Argmax A in Actions: Expected_Utility(A))

– Pick an action whose “expected utility” is maximal.
– {return: administer penicillin}

For every action, calculate the conditional probability of all the consequences
that might follow, then add up the utilities of those consequences times their
conditional probability. Then pick the best action.
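
For readers who prefer running code to type signatures, here is the same system transcribed directly into Python; nothing is added beyond the four elements and the two rules above, and the numbers are the same illustrative ones.

OUTCOMES = ["sister lives", "sister dies"]
ACTIONS = ["administer penicillin", "don't administer penicillin"]

UTILITY = {"sister lives": 1.0, "sister dies": 0.0}

CONDITIONAL_PROBABILITY = {
    "administer penicillin":       {"sister lives": 0.9, "sister dies": 0.1},
    "don't administer penicillin": {"sister lives": 0.3, "sister dies": 0.7},
}

def expected_utility(action):
    """Sum, over outcomes, of Utility(O) * Probability(O | A)."""
    return sum(UTILITY[o] * CONDITIONAL_PROBABILITY[action][o] for o in OUTCOMES)

def choose():
    """Pick an action whose expected utility is maximal."""
    return max(ACTIONS, key=expected_utility)

print({action: expected_utility(action) for action in ACTIONS})  # 0.9 vs. 0.3
print(choose())                                                  # administer penicillin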
This is a mathematically simple sketch of a decision system. It is not an
efficient way to compute decisions in the real world.
What if, for example, you need a sequence of acts to carry out a plan? The
formalism can easily represent this by letting each Action stand for a whole
sequence. But this creates an exponentially large space, like the space of all
sentences you can type in 100 letters. As a simple example, if one of the possible
acts on the first turn is “Shoot my own foot off,” a human planner will decide
this is a bad idea generally—eliminate all sequences beginning with this action.
But we’ve flattened this structure out of our representation. We don’t have
sequences of acts, just flat “actions.”
So, yes, there are a few minor complications. Obviously so, or we’d just run
out and build a real AI this way. In that sense, it’s much the same as Bayesian
probability theory itself.
But this is one of those times when it’s a surprisingly good idea to consider
the absurdly simple version before adding in any high-falutin’ complications.
Consider the philosopher who asserts, “All of us are ultimately selfish; we
care only about our own states of mind. The mother who claims to care about
her son’s welfare, really wants to believe that her son is doing well—this belief is
what makes the mother happy. She helps him for the sake of her own happiness,
not his.” You say, “Well, suppose the mother sacrifices her life to push her son
out of the path of an oncoming truck. That’s not going to make her happy, just
dead.” The philosopher stammers for a few moments, then replies, “But she
still did it because she valued that choice above others—because of the feeling
of importance she attached to that decision.”
So you say,

TYPE ERROR: No constructor found for Expected_Utility -> Utility.

Allow me to explain that reply.


Even our simple formalism illustrates a sharp distinction between expected
utility, which is something that actions have; and utility, which is something
that outcomes have. Sure, you can map both utilities and expected utilities
onto real numbers. But that’s like observing that you can map wind speed and
temperature onto real numbers. It doesn’t make them the same thing.
The philosopher begins by arguing that all your Utilities must be over
Outcomes consisting of your state of mind. If this were true, your intelligence
would operate as an engine to steer the future into regions where you were
happy. Future states would be distinguished only by your state of mind; you
would be indifferent between any two futures in which you had the same state
of mind.
And you would, indeed, be rather unlikely to sacrifice your own life to save
another.
When we object that people sometimes do sacrifice their lives, the philoso-
pher’s reply shifts to discussing Expected Utilities over Actions: “The feel-
ing of importance she attached to that decision.” This is a drastic jump that
should make us leap out of our chairs in indignation. Trying to convert an
Expected_Utility into a Utility would cause an outright error in our
programming language. But in English it all sounds the same.
The choices of our simple decision system are those with highest
Expected_Utility, but this doesn’t say anything whatsoever about where it
steers the future. It doesn’t say anything about the utilities the decider assigns,
or which real-world outcomes are likely to happen as a result. It doesn’t say
anything about the mind’s function as an engine.
The physical cause of a physical action is a cognitive state, in our ideal
decider an Expected_Utility, and this expected utility is calculated by eval-
uating a utility function over imagined consequences. To save your son’s life,
you must imagine the event of your son’s life being saved, and this imagina-
tion is not the event itself. It’s a quotation, like the difference between “snow”
and snow. But that doesn’t mean that what’s inside the quote marks must itself
be a cognitive state. If you choose the action that leads to the future that you
represent with “my son is still alive,” then you have functioned as an engine to
steer the future into a region where your son is still alive. Not an engine that
steers the future into a region where you represent the sentence “my son is still
alive.” To steer the future there, your utility function would have to return a
high utility when fed “ “my son is still alive” ”, the quotation of the quotation,
your imagination of yourself imagining. Recipes make poor cake when you
grind them up and toss them in the batter.
And that’s why it’s helpful to consider the simple decision systems first.
Mix enough complications into the system, and formerly clear distinctions
become harder to see.
So now let’s look at some complications. Clearly the Utility function
(mapping Outcomes onto Utilities) is meant to formalize what I earlier re-
ferred to as “terminal values,” values not contingent upon their consequences.
What about the case where saving your sister’s life leads to Earth’s destruc-
tion by a black hole? In our formalism, we’ve flattened out this possibility.
Outcomes don’t lead to Outcomes, only Actions lead to Outcomes. Your
sister recovering from pneumonia followed by the Earth being devoured by a
black hole would be flattened into a single “possible outcome.”
And where are the “instrumental values” in this simple formalism? Actually,
they’ve vanished entirely! You see, in this formalism, actions lead directly to
outcomes with no intervening events. There’s no notion of throwing a rock
that flies through the air and knocks an apple off a branch so that it falls to the
ground. Throwing the rock is the Action, and it leads straight to the Outcome
of the apple lying on the ground—according to the conditional probability
function that turns an Action directly into a Probability distribution over
Outcomes.
In order to actually compute the conditional probability function, and in
order to separately consider the utility of a sister’s pneumonia and a black
hole swallowing Earth, we would have to represent the network structure of
causality—the way that events lead to other events.
And then the instrumental values would start coming back. If the causal
network was sufficiently regular, you could find a state B that tended to lead
to C regardless of how you achieved B. Then if you wanted to achieve C for
some reason, you could plan efficiently by first working out a B that led to C,
and then an A that led to B. This would be the phenomenon of “instrumental
value”—B would have “instrumental value” because it led to C. The state
C itself might be terminally valued—a term in the utility function over the
total outcome. Or C might just be an instrumental value, a node that was not
directly valued by the utility function.
Instrumental value, in this formalism, is purely an aid to the efficient compu-
tation of plans. It can and should be discarded wherever this kind of regularity
does not exist.
Suppose, for example, that there’s some particular value of B that doesn’t
lead to C. Would you choose an A which led to that B? Or never mind the
abstract philosophy: If you wanted to go to the supermarket to get chocolate,
and you wanted to drive to the supermarket, and you needed to get into your
car, would you gain entry by ripping off the car door with a steam shovel?
(No.) Instrumental value is a “leaky abstraction,” as we programmers say; you
sometimes have to toss away the cached value and compute out the actual
expected utility. Part of being efficient without being suicidal is noticing when
convenient shortcuts break down. Though this formalism does give rise to
instrumental values, it does so only where the requisite regularity exists, and
strictly as a convenient shortcut in computation.
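
A toy sketch of that point, reusing only the chapter's own chocolate example: the only terminal utility here is attached to ending up with chocolate, and the "value" of the intermediate state (car door open) is just a cached expectation over what that state leads to. Change the belief about what it leads to, and the cached value evaporates.

UTILITY = {"have chocolate": 1.0, "no chocolate": 0.0}  # terminal values only

def cached_value_of_open_car_door(p_chocolate_if_door_opened):
    """The 'instrumental value' of the intermediate state is nothing but an
    expected utility computed through the causal chain; it has no life of its own."""
    p = p_chocolate_if_door_opened
    return p * UTILITY["have chocolate"] + (1 - p) * UTILITY["no chocolate"]

print(cached_value_of_open_car_door(0.99))  # supermarket stocked: opening the door looks valuable
print(cached_value_of_open_car_door(0.0))   # chocolate destroyed: the cached shortcut is now worthless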
But if you complicate the formalism before you understand the simple
version, then you may start thinking that instrumental values have some strange
life of their own, even in a normative sense. That, once you say B is usually
good because it leads to C, you’ve committed yourself to always try for B even
in the absence of C. People make this kind of mistake in abstract philosophy,
even though they would never, in real life, rip open their car door with a steam
shovel. You may start thinking that there’s no way to develop a consequentialist
that maximizes only inclusive genetic fitness, because it will starve unless you
include an explicit terminal value for “eating food.” People make this mistake
even though they would never stand around opening car doors all day long,
for fear of being stuck outside their cars if they didn’t have a terminal value for
opening car doors.
Instrumental values live in (the network structure of) the conditional proba-
bility function. This makes instrumental value strictly dependent on beliefs-of-
fact given a fixed utility function. If I believe that penicillin causes pneumonia,
and that the absence of penicillin cures pneumonia, then my perceived in-
strumental value of penicillin will go from high to low. Change the beliefs of
fact—change the conditional probability function that associates actions to
believed consequences—and the instrumental values will change in unison.
In moral arguments, some disputes are about instrumental consequences,
and some disputes are about terminal values. If your debating opponent says
that banning guns will lead to lower crime, and you say that banning guns
will lead to higher crime, then you agree about a superior instrumental value
(crime is bad), but you disagree about which intermediate events lead to which
consequences. But I do not think an argument about female circumcision is
really a factual argument about how to best achieve a shared value of treating
women fairly or making them happy.
This important distinction often gets flushed down the toilet in angry argu-
ments. People with factual disagreements and shared values each decide that
their debating opponents must be sociopaths. As if your hated enemy, gun con-
trol/rights advocates, really wanted to kill people, which should be implausible
as realistic psychology.
I fear the human brain does not strongly type the distinction between
terminal moral beliefs and instrumental moral beliefs. “We should ban guns”
and “We should save lives” don’t feel different, as moral beliefs, the way that
sight feels different from sound. Despite all the other ways that the human
goal system complicates everything in sight, this one distinction it manages to
collapse into a mishmash of things-with-conditional-value.
To extract out the terminal values we have to inspect this mishmash of
valuable things, trying to figure out which ones are getting their value from
somewhere else. It’s a difficult project! If you say that you want to ban guns
in order to reduce crime, it may take a moment to realize that “reducing
crime” isn’t a terminal value, it’s a superior instrumental value with links to
terminal values for human lives and human happinesses. And then the one
who advocates gun rights may have links to the superior instrumental value
of “reducing crime” plus a link to a value for “freedom,” which might be a
terminal value unto them, or another instrumental value . . .
We can’t print out our complete network of values derived from other
values. We probably don’t even store the whole history of how values got there.
By considering the right moral dilemmas (“Would you do X if Y?”), we can
often figure out where our values came from. But even this project itself is
full of pitfalls; misleading dilemmas and gappy philosophical arguments. We
don’t know what our own values are, or where they came from, and can’t find
out except by undertaking error-prone projects of cognitive archaeology. Just
forming a conscious distinction between “terminal value” and “instrumental
value,” and keeping track of what it means, and using it correctly, is hard work.
Only by inspecting the simple formalism can we see how easy it ought to be,
in principle.
And that’s to say nothing of all the other complications of the human reward
system—the whole use of reinforcement architecture, and the way that eating
chocolate is pleasurable, and anticipating eating chocolate is pleasurable, but
they’re different kinds of pleasures . . .
But I don’t complain too much about the mess.
Being ignorant of your own values may not always be fun, but at least it’s
not boring.

*
149
Leaky Generalizations

Are apples good to eat? Usually, but some apples are rotten.
Do humans have ten fingers? Most of us do, but plenty of people have lost
a finger and nonetheless qualify as “human.”
Unless you descend to a level of description far below any macroscopic
object—below societies, below people, below fingers, below tendon and bone,
below cells, all the way down to particles and fields where the laws are truly
universal—practically every generalization you use in the real world will be
leaky.
(Though there may, of course, be some exceptions to the above rule . . .)
Mostly, the way you deal with leaky generalizations is that, well, you just
have to deal. If the cookie market almost always closes at 10 p.m., except on
Thanksgiving it closes at 6 p.m., and today happens to be National Native
American Genocide Day, you’d better show up before 6 p.m. or you won’t get
a cookie.
Our ability to manipulate leaky generalizations is opposed by need for
closure, the degree to which we want to say once and for all that humans have
ten fingers, and get frustrated when we have to tolerate continued ambiguity.
Raising the value of the stakes can increase need for closure—which shuts
down complexity tolerance when complexity tolerance is most needed.
Life would be complicated even if the things we wanted were simple (they
aren’t). The leakiness of leaky generalizations about what-to-do-next would
leak in from the leaky structure of the real world. Or to put it another way:
Instrumental values often have no specification that is both compact and
local.
Suppose there’s a box containing a million dollars. The box is locked,
not with an ordinary combination lock, but with a dozen keys controlling a
machine that can open the box. If you know how the machine works, you can
deduce which sequences of key-presses will open the box. There’s more than
one key sequence that can trigger the machine to open the box. But if you
press a sufficiently wrong sequence, the machine incinerates the money. And
if you don’t know about the machine, there are no simple rules like “Pressing
any key three times opens the box” or “Pressing five different keys with no
repetitions incinerates the money.”
There’s a compact nonlocal specification of which keys you want to press:
You want to press keys such that they open the box. You can write a compact
computer program that computes which key sequences are good, bad or neutral,
but the computer program will need to describe the machine, not just the keys
themselves.
There’s likewise a local noncompact specification of which keys to press:
a giant lookup table of the results for each possible key sequence. It’s a very
large computer program, but it makes no mention of anything except the keys.
But there’s no way to describe which key sequences are good, bad, or neutral,
which is both simple and phrased only in terms of the keys themselves.
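
Here is a sketch of the contrast in Python, with an invented machine standing in for the one in the story: the compact specification is a short program, but it has to describe the machine's internal dynamics; the local specification mentions only the keys, but only because it is a giant table enumerated in advance.

from itertools import product

KEYS = range(12)  # a dozen keys, as in the story; the machine below is invented for illustration

def machine_result(sequence):
    """Compact but nonlocal: classifying a key sequence requires simulating the machine."""
    state = 0
    for key in sequence:
        state = (state * 31 + key + 1) % 97  # the machine's (made-up) internal dynamics
    if state == 42:
        return "opens box"
    if state % 7 == 0:
        return "incinerates money"
    return "nothing happens"

# Local but noncompact: a lookup table phrased only in terms of the keys,
# at the cost of one entry per possible sequence (length-3 sequences shown here).
LOOKUP_TABLE = {seq: machine_result(seq) for seq in product(KEYS, repeat=3)}

print(machine_result((3, 7, 11)))  # computed by describing the machine
print(LOOKUP_TABLE[(3, 7, 11)])    # same answer, read off a table with 12**3 entries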
It may be even worse if there are tempting local generalizations which turn
out to be leaky. Pressing most keys three times in a row will open the box,
but there’s a particular key that incinerates the money if you press it just once.
You might think you had found a perfect generalization—a locally describable
class of sequences that always opened the box—when you had merely failed to
visualize all the possible paths of the machine, or failed to value all the side
effects.
The machine represents the complexity of the real world. The openness
of the box (which is good) and the incinerator (which is bad) represent the
thousand shards of desire that make up our terminal values. The keys represent
the actions and policies and strategies available to us.
When you consider how many different ways we value outcomes, and how
complicated are the paths we take to get there, it’s a wonder that there exists
any such thing as helpful ethical advice. (Of which the strangest of all advices,
and yet still helpful, is that “the end does not justify the means.”)
But conversely, the complicatedness of action need not say anything about
the complexity of goals. You often find people who smile wisely, and say, “Well,
morality is complicated, you know, female circumcision is right in one culture
and wrong in another, it’s not always a bad thing to torture people. How naive
you are, how full of need for closure, that you think there are any simple rules.”
You can say, unconditionally and flatly, that killing anyone is a huge dose of
negative terminal utility. Yes, even Hitler. That doesn’t mean you shouldn’t
shoot Hitler. It means that the net instrumental utility of shooting Hitler carries
a giant dose of negative utility from Hitler’s death, and a hugely larger dose of
positive utility from all the other lives that would be saved as a consequence.
Many commit the type error that I warned against in Terminal Values
and Instrumental Values, and think that if the net consequential expected
utility of Hitler’s death is conceded to be positive, then the immediate local
terminal utility must also be positive, meaning that the moral principle “Death
is always a bad thing” is itself a leaky generalization. But this is double counting,
with utilities instead of probabilities; you’re setting up a resonance between
the expected utility and the utility, instead of a one-way flow from utility to
expected utility.
Or maybe it’s just the urge toward a one-sided policy debate: the best policy
must have no drawbacks.
In my moral philosophy, the local negative utility of Hitler’s death is stable,
no matter what happens to the external consequences and hence to the expected
utility.
Of course, you can set up a moral argument that it’s an inherently good
thing to punish evil people, even with capital punishment for sufficiently evil
people. But you can’t carry this moral argument by pointing out that the
consequence of shooting a man holding a leveled gun may be to save other lives.
This is appealing to the value of life, not appealing to the value of death. If
expected utilities are leaky and complicated, it doesn’t mean that utilities must
be leaky and complicated as well. They might be! But it would be a separate
argument.

*
150
The Hidden Complexity of Wishes

I wish to live in the locations of my choice, in a physically healthy,
uninjured, and apparently normal version of my current body
containing my current mental state, a body which will heal from
all injuries at a rate three sigmas faster than the average given the
medical technology available to me, and which will be protected
from any diseases, injuries or illnesses causing disability, pain, or
degraded functionality of any sense, organ, or bodily function for
more than ten days consecutively or fifteen days in any year . . .
—The Open-Source Wish Project, Wish For Immortality 1.1

There are three kinds of genies: Genies to whom you can safely say, “I wish for
you to do what I should wish for”; genies for which no wish is safe; and genies
that aren’t very powerful or intelligent.
Suppose your aged mother is trapped in a burning building, and it so
happens that you’re in a wheelchair; you can’t rush in yourself. You could cry,
“Get my mother out of that building!” but there would be no one to hear.
Luckily you have, in your pocket, an Outcome Pump. This handy device
squeezes the flow of time, pouring probability into some outcomes, draining it
from others.
The Outcome Pump is not sentient. It contains a tiny time machine, which
resets time unless a specified outcome occurs. For example, if you hooked up
the Outcome Pump’s sensors to a coin, and specified that the time machine
should keep resetting until it sees the coin come up heads, and then you actually
flipped the coin, you would see the coin come up heads. (The physicists say that
any future in which a “reset” occurs is inconsistent, and therefore never happens
in the first place—so you aren’t actually killing any versions of yourself.)
Whatever proposition you can manage to input into the Outcome Pump
somehow happens, though not in a way that violates the laws of physics. If you
try to input a proposition that’s too unlikely, the time machine will suffer a
spontaneous mechanical failure before that outcome ever occurs.
You can also redirect probability flow in more quantitative ways, using the
“future function” to scale the temporal reset probability for different outcomes.
If the temporal reset probability is 99% when the coin comes up heads, and
1% when the coin comes up tails, the odds will go from 1:1 to 99:1 in favor of
tails. If you had a mysterious machine that spit out money, and you wanted to
maximize the amount of money spit out, you would use reset probabilities that
diminished as the amount of money increased. For example, spitting out $10
might have a 99.999999% reset probability, and spitting out $100 might have a
99.99999% reset probability. This way you can get an outcome that tends to be
as high as possible in the future function, even when you don’t know the best
attainable maximum.
So you desperately yank the Outcome Pump from your pocket—your
mother is still trapped in the burning building, remember?—and try to describe
your goal: get your mother out of the building!
The user interface doesn’t take English inputs. The Outcome Pump isn’t
sentient, remember? But it does have 3D scanners for the near vicinity, and
built-in utilities for pattern matching. So you hold up a photo of your mother’s
head and shoulders; match on the photo; use object contiguity to select your
mother’s whole body (not just her head and shoulders); and define the future
function using your mother’s distance from the building’s center. The further
she gets from the building’s center, the less the time machine’s reset probability.
You cry “Get my mother out of the building!,” for luck, and press Enter.
For a moment it seems like nothing happens. You look around, waiting
for the fire truck to pull up, and rescuers to arrive—or even just a strong, fast
runner to haul your mother out of the building—
Boom! With a thundering roar, the gas main under the building explodes.
As the structure comes apart, in what seems like slow motion, you glimpse
your mother’s shattered body being hurled high into the air, traveling fast,
rapidly increasing its distance from the former center of the building.
On the side of the Outcome Pump is an Emergency Regret Button. All
future functions are automatically defined with a huge negative value for the
Regret Button being pressed—a temporal reset probability of nearly 1—so that
the Outcome Pump is extremely unlikely to do anything which upsets the
user enough to make them press the Regret Button. You can’t ever remember
pressing it. But you’ve barely started to reach for the Regret Button (and what
good will it do now?) when a flaming wooden beam drops out of the sky and
smashes you flat.
Which wasn’t really what you wanted, but scores very high in the defined
future function . . .
The Outcome Pump is a genie of the second class. No wish is safe.
If someone asked you to get their poor aged mother out of a burning
building, you might help, or you might pretend not to hear. But it wouldn’t
even occur to you to explode the building. “Get my mother out of the building”
sounds like a much safer wish than it really is, because you don’t even consider
the plans that you assign extreme negative values.
Consider again the Tragedy of Group Selectionism: Some early biologists
asserted that group selection for low subpopulation sizes would produce in-
dividual restraint in breeding; and yet actually enforcing group selection in
the laboratory produced cannibalism, especially of immature females. It’s
obvious in hindsight that, given strong selection for small subpopulation sizes,
cannibals will outreproduce individuals who voluntarily forego reproductive
opportunities. But eating little girls is such an un-aesthetic solution that Wynne-
Edwards, Allee, Brereton, and the other group-selectionists simply didn’t think
of it. They only saw the solutions they would have used themselves.
Suppose you try to patch the future function by specifying that the Out-
come Pump should not explode the building: outcomes in which the building
materials are distributed over too much volume will have ∼1 temporal reset
probabilities.
So your mother falls out of a second-story window and breaks her neck.
The Outcome Pump took a different path through time that still ended up with
your mother outside the building, and it still wasn’t what you wanted, and it
still wasn’t a solution that would occur to a human rescuer.
If only the Open-Source Wish Project had developed a Wish To Get Your
Mother Out Of A Burning Building:

I wish to move my mother (defined as the woman who shares
half my genes and gave birth to me) to outside the boundaries of
the building currently closest to me which is on fire; but not by
exploding the building; nor by causing the walls to crumble so
that the building no longer has boundaries; nor by waiting until
after the building finishes burning down for a rescue worker to
take out the body . . .

All these special cases, the seemingly unlimited number of required patches,
should remind you of the parable of Artificial Addition—programming an
Arithmetic Expert Systems by explicitly adding ever more assertions like
“fifteen plus fifteen equals thirty, but fifteen plus sixteen equals thirty-one
instead.”
How do you exclude the outcome where the building explodes and flings
your mother into the sky? You look ahead, and you foresee that your mother
would end up dead, and you don’t want that consequence, so you try to forbid
the event leading up to it.
Your brain isn’t hardwired with a specific, prerecorded statement that
“Blowing up a burning building containing my mother is a bad idea.” And
yet you’re trying to prerecord that exact specific statement in the Outcome
Pump’s future function. So the wish is exploding, turning into a giant lookup
table that records your judgment of every possible path through time.
You failed to ask for what you really wanted. You wanted your mother to
go on living, but you wished for her to become more distant from the center of
the building.
Except that’s not all you wanted. If your mother was rescued from the
building but was horribly burned, that outcome would rank lower in your
preference ordering than an outcome where she was rescued safe and sound.
So you not only value your mother’s life, but also her health.
And you value not just her bodily health, but her state of mind. Being
rescued in a fashion that traumatizes her—for example, a giant purple monster
roaring up out of nowhere and seizing her—is inferior to a fireman showing
up and escorting her out through a non-burning route. (Yes, we’re supposed
to stick with physics, but maybe a powerful enough Outcome Pump has aliens
coincidentally showing up in the neighborhood at exactly that moment.) You
would certainly prefer her being rescued by the monster to her being roasted
alive, however.
How about a wormhole spontaneously opening and swallowing her to a
desert island? Better than her being dead; but worse than her being alive,
well, healthy, untraumatized, and in continual contact with you and the other
members of her social network.
Would it be okay to save your mother’s life at the cost of the family dog’s
life, if it ran to alert a fireman but then got run over by a car? Clearly yes, but
it would be better ceteris paribus to avoid killing the dog. You wouldn’t want
to swap a human life for hers, but what about the life of a convicted murderer?
Does it matter if the murderer dies trying to save her, from the goodness of
his heart? How about two murderers? If the cost of your mother’s life was
the destruction of every extant copy, including the memories, of Bach’s Little
Fugue in G Minor, would that be worth it? How about if she had a terminal
illness and would die anyway in eighteen months?
If your mother’s foot is crushed by a burning beam, is it worthwhile to
extract the rest of her? What if her head is crushed, leaving her body? What if
her body is crushed, leaving only her head? What if there’s a cryonics team
waiting outside, ready to suspend the head? Is a frozen head a person? Is Terri Schiavo a person? How much is a chimpanzee worth?
Your brain is not infinitely complicated; there is only a finite Kolmogorov
complexity / message length which suffices to describe all the judgments you
would make. But just because this complexity is finite does not make it small.
We value many things, and no they are not reducible to valuing happiness or
valuing reproductive fitness.
There is no safe wish smaller than an entire human morality. There are
too many possible paths through Time. You can’t visualize all the roads that
lead to the destination you give the genie. “Maximizing the distance between
your mother and the center of the building” can be done even more effectively
by detonating a nuclear weapon. Or, at higher levels of genie power, flinging
her body out of the Solar System. Or, at higher levels of genie intelligence,
doing something that neither you nor I would think of, just like a chimpanzee
wouldn’t think of detonating a nuclear weapon. You can’t visualize all the
paths through time, any more than you can program a chess-playing machine
by hardcoding a move for every possible board position.
And real life is far more complicated than chess. You cannot predict, in
advance, which of your values will be needed to judge the path through time
that the genie takes. Especially if you wish for something longer-term or
wider-range than rescuing your mother from a burning building.
I fear the Open-Source Wish Project is futile, except as an illustration of
how not to think about genie problems. The only safe genie is a genie that
shares all your judgment criteria, and at that point, you can just say “I wish
for you to do what I should wish for.” Which simply runs the genie’s should
function.
Indeed, it shouldn’t be necessary to say anything. To be a safe fulfiller of
a wish, a genie must share the same values that led you to make the wish.
Otherwise the genie may not choose a path through time that leads to the
destination you had in mind, or it may fail to exclude horrible side effects that
would lead you to not even consider a plan in the first place. Wishes are leaky
generalizations, derived from the huge but finite structure that is your entire
morality; only by including this entire structure can you plug all the leaks.
With a safe genie, wishing is superfluous. Just run the genie.

*
151
Anthropomorphic Optimism

The core fallacy of anthropomorphism is expecting something to be predicted
by the black box of your brain, when its causal structure is so different from
that of a human brain as to give you no license to expect any such thing.
The early (pre-1966) biologists in The Tragedy of Group Selectionism be-
lieved that predators would voluntarily restrain their breeding to avoid over-
populating their habitat and exhausting the prey population. Later on, when
Michael J. Wade actually went out and created in the laboratory the nigh-
impossible conditions for group selection, the adults adapted to cannibalize
eggs and larvae, especially female larvae.1
Now, why might the group selectionists have not thought of that possibility?
Suppose you were a member of a tribe, and you knew that, in the near future,
your tribe would be subjected to a resource squeeze. You might propose, as a
solution, that no couple have more than one child—after the first child, the
couple goes on birth control. Saying, “Let’s all individually have as many
children as we can, but then hunt down and cannibalize each other’s children,
especially the girls,” would not even occur to you as a possibility.
Think of a preference ordering over solutions, relative to your goals. You
want a solution as high in this preference ordering as possible. How do you find
one? With a brain, of course! Think of your brain as a high-ranking-solution-
generator—a search process that produces solutions that rank high in your
innate preference ordering.
The solution space on all real-world problems is generally fairly large, which
is why you need an efficient brain that doesn’t even bother to formulate the vast
majority of low-ranking solutions.
If your tribe is faced with a resource squeeze, you could try hopping every-
where on one leg, or chewing off your own toes. These “solutions” obviously
wouldn’t work and would incur large costs, as you can see upon examination—
but in fact your brain is too efficient to waste time considering such poor
solutions; it doesn’t generate them in the first place. Your brain, in its search
for high-ranking solutions, flies directly to parts of the solution space like “Ev-
eryone in the tribe gets together, and agrees to have no more than one child
per couple until the resource squeeze is past.”
Such a low-ranking solution as “Everyone have as many kids as possible,
then cannibalize the girls” would not be generated in your search process.
But the ranking of an option as “low” or “high” is not an inherent property
of the option. It is a property of the optimization process that does the prefer-
ring. And different optimization processes will search in different orders.
So far as evolution is concerned, individuals reproducing to the fullest and
then cannibalizing others’ daughters is a no-brainer; whereas individuals vol-
untarily restraining their own breeding for the good of the group is absolutely
ludicrous. Or to say it less anthropomorphically, the first set of alleles would
rapidly replace the second in a population. (And natural selection has no obvi-
ous search order here—these two alternatives seem around equally simple as
mutations.)
Suppose that one of the biologists had said, “If a predator population has
only finite resources, evolution will craft them to voluntarily restrain their
breeding—that’s how I’d do it if I were in charge of building predators.” This
would be anthropomorphism outright, the lines of reasoning naked and ex-
posed: I would do it this way, therefore I infer that evolution will do it this
way.
One does occasionally encounter the fallacy outright, in my line of work.
But suppose you say to the one, “An AI will not necessarily work like you do.”
Suppose you say to this hypothetical biologist, “Evolution doesn’t work like
you do.” What will the one say in response? I can tell you a reply you will
not hear: “Oh my! I didn’t realize that! One of the steps of my inference was
invalid; I will throw away the conclusion and start over from scratch.”
No: what you’ll hear instead is a reason why any AI has to reason the
same way as the speaker. Or a reason why natural selection, following entirely
different criteria of optimization and using entirely different methods of opti-
mization, ought to do the same thing that would occur to a human as a good
idea.
Hence the elaborate idea that group selection would favor predator groups
where the individuals voluntarily forsook reproductive opportunities.
The group selectionists went just as far astray, in their predictions, as some-
one committing the fallacy outright. Their final conclusions were the same as if
they were assuming outright that evolution necessarily thought like themselves.
But they erased what had been written above the bottom line of their argument,
without erasing the actual bottom line, and wrote in new rationalizations. Now
the fallacious reasoning is disguised; the obviously flawed step in the inference
has been hidden—even though the conclusion remains exactly the same; and
hence, in the real world, exactly as wrong.
But why would any scientist do this? In the end, the data came out against
the group selectionists and they were embarrassed.
As I remarked in Fake Optimization Criteria, we humans seem to have
evolved an instinct for arguing that our preferred policy arises from practically
any criterion of optimization. Politics was a feature of the ancestral environ-
ment; we are descended from those who argued most persuasively that the
tribe’s interest—not just their own interest—required that their hated rival
Uglak be executed. We certainly aren’t descended from Uglak, who failed
to argue that his tribe’s moral code—not just his own obvious self-interest—
required his survival.
And because we can more persuasively argue for what we honestly believe,
we have evolved an instinct to honestly believe that other people’s goals, and
our tribe’s moral code, truly do imply that they should do things our way for
their benefit.
So the group selectionists, imagining this beautiful picture of predators re-
straining their breeding, instinctively rationalized why natural selection ought
to do things their way, even according to natural selection’s own purposes.
The foxes will be fitter if they restrain their breeding! No, really! They’ll even
outbreed other foxes who don’t restrain their breeding! Honestly!
The problem with trying to argue natural selection into doing things your
way is that evolution does not contain that which could be moved by your
arguments. Evolution does not work like you do—not even to the extent
of having any element that could listen to or care about your painstaking
explanation of why evolution ought to do things your way. Human arguments
are not even commensurate with the internal structure of natural selection as
an optimization process—human arguments aren’t used in promoting alleles,
as human arguments would play a causal role in human politics.
So instead of successfully persuading natural selection to do things their
way, the group selectionists were simply embarrassed when reality came out
differently.
There’s a fairly heavy subtext here about Unfriendly AI.
But the point generalizes: this is the problem with optimistic reasoning
in general. What is optimism? It is ranking the possibilities by your own
preference ordering, and selecting an outcome high in that preference ordering,
and somehow that outcome ends up as your prediction. What kind of elaborate
rationalizations were generated along the way is probably not so relevant as
one might fondly believe; look at the cognitive history and it’s optimism in,
optimism out. But Nature, or whatever other process is under discussion, is
not actually, causally choosing between outcomes by ranking them in your
preference ordering and picking a high one. So the brain fails to synchronize
with the environment, and the prediction fails to match reality.

1. Wade, “Group selection among laboratory populations of Tribolium.”


152
Lost Purposes

It was in either kindergarten or first grade that I was first asked to pray, given
a transliteration of a Hebrew prayer. I asked what the words meant. I was
told that so long as I prayed in Hebrew, I didn’t need to know what the words
meant, it would work anyway.
That was the beginning of my break with Judaism.
As you read this, some young man or woman is sitting at a desk in a
university, earnestly studying material they have no intention of ever using,
and no interest in knowing for its own sake. They want a high-paying job, and
the high-paying job requires a piece of paper, and the piece of paper requires a
previous master’s degree, and the master’s degree requires a bachelor’s degree,
and the university that grants the bachelor’s degree requires you to take a
class in twelfth-century knitting patterns to graduate. So they diligently study,
intending to forget it all the moment the final exam is administered, but still
seriously working away, because they want that piece of paper.
Maybe you realized it was all madness, but I bet you did it anyway. You
didn’t have a choice, right? A recent study here in the Bay Area showed that
80% of teachers in K-5 reported spending less than one hour per week on
science, and 16% said they spend no time on science. Why? I’m given to
understand the proximate cause is the No Child Left Behind Act and similar
legislation. Virtually all classroom time is now spent on preparing for tests
mandated at the state or federal level. I seem to recall (though I can’t find the
source) that just taking mandatory tests was 40% of classroom time in one
school.
The old Soviet bureaucracy was famous for being more interested in ap-
pearances than reality. One shoe factory overfulfilled its quota by producing
lots of tiny shoes. Another shoe factory reported cut but unassembled leather
as a “shoe.” The superior bureaucrats weren’t interested in looking too hard,
because they also wanted to report quota overfulfillments. All this was a great
help to the comrades freezing their feet off.
It is now being suggested in several sources that an actual majority of pub-
lished findings in medicine, though “statistically significant with p < 0.05,”
are untrue. But so long as p < 0.05 remains the threshold for publication, why
should anyone hold themselves to higher standards, when that requires bigger
research grants for larger experimental groups, and decreases the likelihood
of getting a publication? Everyone knows that the whole point of science is to
publish lots of papers, just as the whole point of a university is to print certain
pieces of parchment, and the whole point of a school is to pass the mandatory
tests that guarantee the annual budget. You don’t get to set the rules of the
game, and if you try to play by different rules, you’ll just lose.
(Though for some reason, physics journals require a threshold of
p < 0.0001. It’s as if they conceive of some other purpose to their existence
than publishing physics papers.)
There’s chocolate at the supermarket, and you can get to the supermarket by
driving, and driving requires that you be in the car, which means opening your
car door, which needs keys. If you find there’s no chocolate at the supermarket,
you won’t stand around opening and slamming your car door because the
car door still needs opening. I rarely notice people losing track of plans they
devised themselves.
It’s another matter when incentives must flow through large organizations—
or worse, many different organizations and interest groups, some of them gov-
ernmental. Then you see behaviors that would mark literal insanity, if they
were born from a single mind. Someone gets paid every time they open a car
door, because that’s what’s measurable; and this person doesn’t care whether
the driver ever gets paid for arriving at the supermarket, let alone whether the
buyer purchases the chocolate, or whether the eater is happy or starving.
From a Bayesian perspective, subgoals are epiphenomena of conditional
probability functions. There is no expected utility without utility. How silly
would it be to think that instrumental value could take on a mathematical life of
its own, leaving terminal value in the dust? It’s not sane by decision-theoretical
criteria of sanity.
But consider the No Child Left Behind Act. The politicians want to look
like they’re doing something about educational difficulties; the politicians have
to look busy to voters this year, not fifteen years later when the kids are looking
for jobs. The politicians are not the consumers of education. The bureaucrats
have to show progress, which means that they’re only interested in progress
that can be measured this year. They aren’t the ones who’ll end up ignorant of
science. The publishers who commission textbooks, and the committees that
purchase textbooks, don’t sit in the classrooms bored out of their skulls.
The actual consumers of knowledge are the children—who can’t pay, can’t
vote, can’t sit on the committees. Their parents care for them, but don’t sit in
the classes themselves; they can only hold politicians responsible according
to surface images of “tough on education.” Politicians are too busy being
re-elected to study all the data themselves; they have to rely on surface images
of bureaucrats being busy and commissioning studies—it may not work to
help any children, but it works to let politicians appear caring. Bureaucrats
don’t expect to use textbooks themselves, so they don’t care if the textbooks
are hideous to read, so long as the process by which they are purchased looks
good on the surface. The textbook publishers have no motive to produce
bad textbooks, but they know that the textbook purchasing committee will be
comparing textbooks based on how many different subjects they cover, and that
the fourth-grade purchasing committee isn’t coordinated with the third-grade
purchasing committee, so they cram as many subjects into one textbook as
possible. Teachers won’t get through a fourth of the textbook before the end
of the year, and then the next year’s teacher will start over. Teachers might
complain, but they aren’t the decision-makers, and ultimately, it’s not their
future on the line, which puts sharp bounds on how much effort they’ll spend
on unpaid altruism . . .
It’s amazing, when you look at it that way—consider all the lost informa-
tion and lost incentives—that anything at all remains of the original purpose,
gaining knowledge. Though many educational systems seem to be currently
in the process of collapsing into a state not much better than nothing.
Want to see the problem really solved? Make the politicians go to school.
A single human mind can track a probabilistic expectation of utility as
it flows through the conditional chances of a dozen intermediate events—
including nonlocal dependencies, places where the expected utility of opening
the car door depends on whether there’s chocolate in the supermarket. But
organizations can only reward today what is measurable today, what can be
written into legal contract today, and this means measuring intermediate events
rather than their distant consequences. These intermediate measures, in turn,
are leaky generalizations—often very leaky. Bureaucrats are untrustworthy
genies, for they do not share the values of the wisher.
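To put the decision-theoretic point in concrete form, here is a minimal sketch of how a subgoal’s expected utility is entirely inherited from the terminal goal behind it. The probabilities and the utility number are invented for illustration; only the car-door-to-chocolate chain comes from the story above.

    # Toy sketch: all numbers below are made up for illustration.
    U_CHOCOLATE = 10.0  # terminal value: actually eating the chocolate

    def eu_open_car_door(p_chocolate_in_stock):
        p_reach_supermarket = 0.95  # driving usually gets you there
        # The subgoal's expected utility is nothing but the terminal
        # utility, discounted through the conditional chances:
        return p_reach_supermarket * p_chocolate_in_stock * U_CHOCOLATE

    print(eu_open_car_door(0.9))  # 8.55 -- worth opening the door
    print(eu_open_car_door(0.0))  # 0.0  -- no chocolate, no reason to touch the door

If the supermarket is out of chocolate, the expected utility of opening the car door drops to zero in the same calculation; the door retains no value of its own.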
Miyamoto Musashi said:1

The primary thing when you take a sword in your hands is your
intention to cut the enemy, whatever the means. Whenever you
parry, hit, spring, strike or touch the enemy’s cutting sword, you
must cut the enemy in the same movement. It is essential to
attain this. If you think only of hitting, springing, striking or
touching the enemy, you will not be able actually to cut him. More
than anything, you must be thinking of carrying your movement
through to cutting him. You must thoroughly research this.

(I wish I lived in an era where I could just tell my readers they have to thoroughly
research something, without giving insult.)
Why would any individual lose track of their purposes in a swordfight? If
someone else had taught them to fight, if they had not generated the entire art
from within themselves, they might not understand the reason for parrying at
one moment, or springing at another moment; they might not realize when
the rules had exceptions, or fail to see the times when the usual method won’t
cut through.
The essential thing in the art of epistemic rationality is to understand how
every rule is cutting through to the truth in the same movement. The corre-
sponding essential of pragmatic rationality—decision theory, versus probability
theory—is to always see how every expected utility cuts through to utility. You
must thoroughly research this.
C. J. Cherryh said:2

Your sword has no blade. It has only your intention. When that
goes astray you have no weapon.

I have seen many people go astray when they wish to the genie of an imagined
AI, dreaming up wish after wish that seems good to them, sometimes with
many patches and sometimes without even that pretense of caution. And they
don’t jump to the meta-level. They don’t instinctively look-to-purpose, the
instinct that started me down the track to atheism at the age of five. They do
not ask, as I reflexively ask, “Why do I think this wish is a good idea? Will the
genie judge likewise?” They don’t see the source of their judgment, hovering
behind the judgment as its generator. They lose track of the ball; they know the
ball bounced, but they don’t instinctively look back to see where it bounced
from—the criterion that generated their judgments.
Likewise with people not automatically noticing when supposedly selfish
people give altruistic arguments in favor of selfishness, or when supposedly
altruistic people give selfish arguments in favor of altruism.
People can handle goal-tracking for driving to the supermarket just fine,
when it’s all inside their own heads, and no genies or bureaucracies or philoso-
phies are involved. The trouble is that real civilization is immensely more
complicated than this. Dozens of organizations, and dozens of years, inter-
vene between the child suffering in the classroom, and the new-minted college
graduate not being very good at their job. (But will the interviewer or man-
ager notice, if the college graduate is good at looking busy?) With every new
link that intervenes between the action and its consequence, intention has one
more chance to go astray. With every intervening link, information is lost, in-
centive is lost. And this bothers most people a lot less than it bothers me, or
why were all my classmates willing to say prayers without knowing what they
meant? They didn’t feel the same instinct to look-to-the-generator.
Can people learn to keep their eye on the ball? To keep their intention from
going astray? To never spring or strike or touch, without knowing the higher
goal they will complete in the same movement? People do often want to do
their jobs, all else being equal. Can there be such a thing as a sane corporation?
A sane civilization, even? That’s only a distant dream, but it’s what I’ve been
getting at with all of these essays on the flow of intentions (a.k.a. expected
utility, a.k.a. instrumental value) without losing purpose (a.k.a. utility, a.k.a.
terminal value). Can people learn to feel the flow of parent goals and child
goals? To know consciously, as well as implicitly, the distinction between
expected utility and utility?
Do you care about threats to your civilization? The worst metathreat to
complex civilization is its own complexity, for that complication leads to the
loss of many purposes.
I look back, and I see that more than anything, my life has been driven by an
exceptionally strong abhorrence to lost purposes. I hope it can be transformed
to a learnable skill.

1. Miyamoto Musashi, Book of Five Rings (New Line Publishing, 2003).

2. Carolyn J. Cherryh, The Paladin (Baen, 2002).


Part N

A Human’s Guide to Words


153
The Parable of the Dagger

(Adapted from Raymond Smullyan.1 )


Once upon a time, there was a court jester who dabbled in logic.
The jester presented the king with two boxes. Upon the first box was
inscribed:

Either this box contains an angry frog, or the box with a false
inscription contains an angry frog, but not both.

On the second box was inscribed:

Either this box contains gold and the box with a false inscription
contains an angry frog, or this box contains an angry frog and
the box with a true inscription contains gold.

And the jester said to the king: “One box contains an angry frog, the other box
gold; and one, and only one, of the inscriptions is true.”
The king opened the wrong box, and was savaged by an angry frog.
“You see,” the jester said, “let us hypothesize that the first inscription is
the true one. Then suppose the first box contains gold. Then the other box
would have an angry frog, while the box with a true inscription would contain
gold, which would make the second statement true as well. Now hypothesize
that the first inscription is false, and that the first box contains gold. Then the
second inscription would be—”
The king ordered the jester thrown in the dungeons.
A day later, the jester was brought before the king in chains and shown two
boxes.
“One box contains a key,” said the king, “to unlock your chains; and if you
find the key you are free. But the other box contains a dagger for your heart if
you fail.”
And the first box was inscribed:

Either both inscriptions are true, or both inscriptions are false.

And the second box was inscribed:

This box contains the key.

The jester reasoned thusly: “Suppose the first inscription is true. Then the
second inscription must also be true. Now suppose the first inscription is
false. Then again the second inscription must be true. So the second box must
contain the key, if the first inscription is true, and also if the first inscription is
false. Therefore, the second box must logically contain the key.”
The jester opened the second box, and found a dagger.
“How?!” cried the jester in horror, as he was dragged away. “It’s logically
impossible!”
“It is entirely possible,” replied the king. “I merely wrote those inscriptions
on two boxes, and then I put the dagger in the second one.”

1. Raymond M. Smullyan, What Is the Name of This Book?: The Riddle of Dracula and Other Logical
Puzzles (Penguin Books, 1990).
154
The Parable of Hemlock

All men are mortal. Socrates is a man. Therefore Socrates is mortal.
—Standard Medieval syllogism

Socrates raised the glass of hemlock to his lips . . .


“Do you suppose,” asked one of the onlookers, “that even hemlock will not
be enough to kill so wise and good a man?”
“No,” replied another bystander, a student of philosophy; “all men are
mortal, and Socrates is a man; and if a mortal drinks hemlock, surely he dies.”
“Well,” said the onlooker, “what if it happens that Socrates isn’t mortal?”
“Nonsense,” replied the student, a little sharply; “all men are mortal by
definition; it is part of what we mean by the word ‘man.’ All men are mortal,
Socrates is a man, therefore Socrates is mortal. It is not merely a guess, but a
logical certainty.”
“I suppose that’s right . . .” said the onlooker. “Oh, look, Socrates already
drank the hemlock while we were talking.”
“Yes, he should be keeling over any minute now,” said the student.
And they waited, and they waited, and they waited . . .
“Socrates appears not to be mortal,” said the onlooker.
“Then Socrates must not be a man,” replied the student. “All men are
mortal, Socrates is not mortal, therefore Socrates is not a man. And that is not
merely a guess, but a logical certainty.”
The fundamental problem with arguing that things are true “by definition” is
that you can’t make reality go a different way by choosing a different definition.
You could reason, perhaps, as follows: “All things I have observed which
wear clothing, speak language, and use tools, have also shared certain other
properties as well, such as breathing air and pumping red blood. The last thirty
‘humans’ belonging to this cluster whom I observed to drink hemlock soon fell
over and stopped moving. Socrates wears a toga, speaks fluent ancient Greek,
and drank hemlock from a cup. So I predict that Socrates will keel over in the
next five minutes.”
But that would be mere guessing. It wouldn’t be, y’know, absolutely
and eternally certain. The Greek philosophers—like most prescientific
philosophers—were rather fond of certainty.
Luckily the Greek philosophers have a crushing rejoinder to your question-
ing. You have misunderstood the meaning of “All humans are mortal,” they
say. It is not a mere observation. It is part of the definition of the word “hu-
man.” Mortality is one of several properties that are individually necessary,
and together sufficient, to determine membership in the class “human.” The
statement “All humans are mortal” is a logically valid truth, absolutely unques-
tionable. And if Socrates is human, he must be mortal: it is a logical deduction,
as certain as certain can be.
But then we can never know for certain that Socrates is a “human” until
after Socrates has been observed to be mortal. It does no good to observe
that Socrates speaks fluent Greek, or that Socrates has red blood, or even that
Socrates has human DNA. None of these characteristics are logically equivalent
to mortality. You have to see him die before you can conclude that he was
human.
(And even then it’s not infinitely certain. What if Socrates rises from the
grave a night after you see him die? Or more realistically, what if Socrates is
signed up for cryonics? If mortality is defined to mean finite lifespan, then
you can never really know if someone was human, until you’ve observed to the
end of eternity—just to make sure they don’t come back. Or you could think
you saw Socrates keel over, but it could be an illusion projected onto your eyes
with a retinal scanner. Or maybe you just hallucinated the whole thing . . . )
The problem with syllogisms is that they’re always valid. “All humans are
mortal; Socrates is human; therefore Socrates is mortal” is—if you treat it as a
logical syllogism—logically valid within our own universe. It’s also logically
valid within neighboring Everett branches in which, due to a slightly different
evolved biochemistry, hemlock is a delicious treat rather than a poison. And
it’s logically valid even in universes where Socrates never existed, or for that
matter, where humans never existed.
The Bayesian definition of evidence favoring a hypothesis is evidence which
we are more likely to see if the hypothesis is true than if it is false. Observing
that a syllogism is logically valid can never be evidence favoring any empir-
ical proposition, because the syllogism will be logically valid whether that
proposition is true or false.
Syllogisms are valid in all possible worlds, and therefore, observing their
validity never tells us anything about which possible world we actually live in.
This doesn’t mean that logic is useless—just that logic can only tell us that
which, in some sense, we already know. But we do not always believe what we
know. Is the number 29,384,209 prime? By virtue of how I define my decimal
system and my axioms of arithmetic, I have already determined my answer to
this question—but I do not know what my answer is yet, and I must do some
logic to find out.
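(As a concrete illustration, and my code rather than anything essential to the argument: a few lines of trial division mechanically unpack what the definition of “prime” already entails. Running it adds no new evidence about the outside world; it only tells me what I was already committed to believing.)

    def smallest_factor(n):
        """Return the smallest factor of n greater than 1 (n itself if n is prime)."""
        d = 2
        while d * d <= n:
            if n % d == 0:
                return d
            d += 1
        return n

    n = 29_384_209
    f = smallest_factor(n)
    print("prime" if f == n else f"composite: {f} x {n // f}")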
Similarly, if I form the uncertain empirical generalization “Humans are
vulnerable to hemlock,” and the uncertain empirical guess “Socrates is human,”
logic can tell me that my previous guesses are predicting that Socrates will be
vulnerable to hemlock.
It’s been suggested that we can view logical reasoning as resolving our
uncertainty about impossible possible worlds—eliminating probability mass in
logically impossible worlds which we did not know to be logically impossible.
In this sense, logical argument can be treated as observation.
But when you talk about an empirical prediction like “Socrates is going to
keel over and stop breathing” or “Socrates is going to do fifty jumping jacks and
then compete in the Olympics next year,” that is a matter of possible worlds,
not impossible possible worlds.
Logic can tell us which hypotheses match up to which observations, and it
can tell us what these hypotheses predict for the future—it can bring old obser-
vations and previous guesses to bear on a new problem. But logic never flatly
says, “Socrates will stop breathing now.” Logic never dictates any empirical
question; it never settles any real-world query which could, by any stretch of
the imagination, go either way.
Just remember the Litany Against Logic:

Logic stays true, wherever you may go,
So logic never tells you where you live.

*
155
Words as Hidden Inferences

Suppose I find a barrel, sealed at the top, but with a hole large enough for a
hand. I reach in and feel a small, curved object. I pull the object out, and it’s
blue—a bluish egg. Next I reach in and feel something hard and flat, with
edges—which, when I extract it, proves to be a red cube. I pull out 11 eggs and
8 cubes, and every egg is blue, and every cube is red.
Now I reach in and I feel another egg-shaped object. Before I pull it out
and look, I have to guess: What will it look like?
The evidence doesn’t prove that every egg in the barrel is blue and every
cube is red. The evidence doesn’t even argue this all that strongly: 19 is not a
large sample size. Nonetheless, I’ll guess that this egg-shaped object is blue—or
as a runner-up guess, red. If I guess anything else, there’s as many possibilities
as distinguishable colors—and for that matter, who says the egg has to be a
single shade? Maybe it has a picture of a horse painted on.
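(One toy way to put a number on that guess, my own formalization rather than anything the reasoning above depends on, is Laplace’s rule of succession, which assumes the draws are exchangeable and puts a uniform prior on the barrel’s blue-egg fraction.)

    # After seeing k blue eggs out of n eggs drawn, estimate the chance
    # that the next egg-shaped object is blue as (k + 1) / (n + 2).
    def rule_of_succession(k, n):
        return (k + 1) / (n + 2)

    print(rule_of_succession(k=11, n=11))  # ~0.92 -- "blue" is a good guess, not a certainty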
So I say “blue,” with a dutiful patina of humility. For I am a sophis-
ticated rationalist-type person, and I keep track of my assumptions and
dependencies—I guess, but I’m aware that I’m guessing . . . right?
But when a large yellow striped feline-shaped object leaps out at me from
the shadows, I think, “Yikes! A tiger!” Not, “Hm . . . objects with the properties
of largeness, yellowness, stripedness, and feline shape, have previously often
possessed the properties ‘hungry’ and ‘dangerous,’ and thus, although it is not
logically necessary, it may be an empirically good guess that aaauuughhhh
crunch crunch gulp.”
The human brain, for some odd reason, seems to have been adapted to
make this inference quickly, automatically, and without keeping explicit track
of its assumptions.
And if I name the egg-shaped objects “bleggs” (for blue eggs) and the red
cubes “rubes,” then, when I reach in and feel another egg-shaped object, I may
think, Oh, it’s a blegg, rather than considering all that problem-of-induction
stuff.
It is a common misconception that you can define a word any way you like.
This would be true if the brain treated words as purely logical constructs,
Aristotelian classes, and you never took out any more information than you
put in.
Yet the brain goes on about its work of categorization, whether or not we
consciously approve. “All humans are mortal; Socrates is a human; there-
fore Socrates is mortal”—thus spake the ancient Greek philosophers. Well, if
mortality is part of your logical definition of “human,” you can’t logically clas-
sify Socrates as human until you observe him to be mortal. But—this is the
problem—Aristotle knew perfectly well that Socrates was a human. Aristo-
tle’s brain placed Socrates in the “human” category as efficiently as your own
brain categorizes tigers, apples, and everything else in its environment: Swiftly,
silently, and without conscious approval.
Aristotle laid down rules under which no one could conclude Socrates was
“human” until after he died. Nonetheless, Aristotle and his students went on
concluding that living people were humans and therefore mortal; they saw
distinguishing properties such as human faces and human bodies, and their
brains made the leap to inferred properties such as mortality.
Misunderstanding the working of your own mind does not, thankfully,
prevent the mind from doing its work. Otherwise Aristotelians would have
starved, unable to conclude that an object was edible merely because it looked
and felt like a banana.
So the Aristotelians went on classifying environmental objects on the basis
of partial information, the way people had always done. Students of Aris-
totelian logic went on thinking exactly the same way, but they had acquired an
erroneous picture of what they were doing.
If you asked an Aristotelian philosopher whether Carol the grocer was
mortal, they would say “Yes.” If you asked them how they knew, they would
say “All humans are mortal; Carol is human; therefore Carol is mortal.” Ask
them whether it was a guess or a certainty, and they would say it was a certainty
(if you asked before the sixteenth century, at least). Ask them how they knew
that humans were mortal, and they would say it was established by definition.
The Aristotelians were still the same people, they retained their original
natures, but they had acquired incorrect beliefs about their own functioning.
They looked into the mirror of self-awareness, and saw something unlike their
true selves: they reflected incorrectly.
Your brain doesn’t treat words as logical definitions with no empirical
consequences, and so neither should you. The mere act of creating a word
can cause your mind to allocate a category, and thereby trigger unconscious
inferences of similarity. Or block inferences of similarity; if I create two labels
I can get your mind to allocate two categories. Notice how I said “you” and
“your brain” as if they were different things?
Making errors about the inside of your head doesn’t change what’s there;
otherwise Aristotle would have died when he concluded that the brain was
an organ for cooling the blood. Philosophical mistakes usually don’t interfere
with blink-of-an-eye perceptual inferences.
But philosophical mistakes can severely mess up the deliberate thinking
processes that we use to try to correct our first impressions. If you believe that
you can “define a word any way you like,” without realizing that your brain
goes on categorizing without your conscious oversight, then you won’t make
the effort to choose your definitions wisely.

*
156
Extensions and Intensions

“What is red?”
“Red is a color.”
“What’s a color?”
“A color is a property of a thing.”

But what is a thing? And what’s a property? Soon the two are lost in a maze of
words defined in other words, the problem that Steven Harnad once described
as trying to learn Chinese from a Chinese/Chinese dictionary.
Alternatively, if you asked me “What is red?” I could point to a stop sign,
then to someone wearing a red shirt, and a traffic light that happens to be red,
and blood from where I accidentally cut myself, and a red business card, and
then I could call up a color wheel on my computer and move the cursor to the
red area. This would probably be sufficient, though if you know what the word
“No” means, the truly strict would insist that I point to the sky and say “No.”
I think I stole this example from S. I. Hayakawa—though I’m really not
sure, because I heard this way back in the indistinct blur of my childhood.
(When I was twelve, my father accidentally deleted all my computer files. I
have no memory of anything before that.)
But that’s how I remember first learning about the difference between
intensional and extensional definition. To give an “intensional definition” is
to define a word or phrase in terms of other words, as a dictionary does. To
give an “extensional definition” is to point to examples, as adults do when
teaching children. The preceding sentence gives an intensional definition of
“extensional definition,” which makes it an extensional example of “intensional
definition.”
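A rough sketch in code, if it helps (the particular examples and the crude hue rule are mine, purely for illustration): an extensional definition behaves like a list of pointed-at instances, while an intensional definition behaves like a rule written in terms you already share.

    # Extensional "definition" of red: point at examples.
    red_examples = {"stop sign", "red shirt", "blood", "red business card"}

    # Intensional "definition" of red: a rule stated in other terms
    # (here, a crude test on a 0-360 degree color wheel).
    def looks_red(hue_degrees):
        return hue_degrees < 15 or hue_degrees > 345

    print("blood" in red_examples)   # True: matches a pointed-at example
    print(looks_red(5))              # True: satisfies the verbal rule

Neither the list nor the rule is the concept of redness itself; both are devices for communicating it.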
In Hollywood Rationality and popular culture generally, “rationalists” are
depicted as word-obsessed, floating in endless verbal space disconnected from
reality.
But the actual Traditional Rationalists have long insisted on maintaining a
tight connection to experience:

If you look into a textbook of chemistry for a definition of lithium,
you may be told that it is that element whose atomic weight is 7
very nearly. But if the author has a more logical mind he will tell
you that if you search among minerals that are vitreous, translu-
cent, grey or white, very hard, brittle, and insoluble, for one which
imparts a crimson tinge to an unluminous flame, this mineral be-
ing triturated with lime or witherite rats-bane, and then fused,
can be partly dissolved in muriatic acid; and if this solution be
evaporated, and the residue be extracted with sulphuric acid, and
duly purified, it can be converted by ordinary methods into a
chloride, which being obtained in the solid state, fused, and elec-
trolyzed with half a dozen powerful cells, will yield a globule of a
pinkish silvery metal that will float on gasolene; and the material
of that is a specimen of lithium.
—Charles Sanders Peirce1

That’s an example of “logical mind” as described by a genuine Traditional Rationalist, rather than a Hollywood scriptwriter.
But note: Peirce isn’t actually showing you a piece of lithium. He didn’t
have pieces of lithium stapled to his book. Rather he’s giving you a treasure
map—an intensionally defined procedure which, when executed, will lead you
to an extensional example of lithium. This is not the same as just tossing you
a hunk of lithium, but it’s not the same as saying “atomic weight 7” either.
(Though if you had sufficiently sharp eyes, saying “3 protons” might let you
pick out lithium at a glance . . .)
So that is intensional and extensional definition, which is a way of telling
someone else what you mean by a concept. When I talked about “definitions”
above, I talked about a way of communicating concepts—telling someone else
what you mean by “red,” “tiger,” “human,” or “lithium.” Now let’s talk about
the actual concepts themselves.
The actual intension of my “tiger” concept would be the neural pattern (in
my temporal cortex) that inspects an incoming signal from the visual cortex
to determine whether or not it is a tiger.
The actual extension of my “tiger” concept is everything I call a tiger.
Intensional definitions don’t capture entire intensions; extensional defi-
nitions don’t capture entire extensions. If I point to just one tiger and say
the word “tiger,” the communication may fail if they think I mean “danger-
ous animal” or “male tiger” or “yellow thing.” Similarly, if I say “dangerous
yellow-black striped animal,” without pointing to anything, the listener may
visualize giant hornets.
You can’t capture in words all the details of the cognitive concept—as it
exists in your mind—that lets you recognize things as tigers or nontigers. It’s
too large. And you can’t point to all the tigers you’ve ever seen, let alone
everything you would call a tiger.
The strongest definitions use a crossfire of intensional and extensional
communication to nail down a concept. Even so, you only communicate maps
to concepts, or instructions for building concepts—you don’t communicate
the actual categories as they exist in your mind or in the world.
(Yes, with enough creativity you can construct exceptions to this rule, like
“Sentences Eliezer Yudkowsky has published containing the term ‘huragaloni’
as of Feb 4, 2008.” I’ve just shown you this concept’s entire extension. But
except in mathematics, definitions are usually treasure maps, not treasure.)
So that’s another reason you can’t “define a word any way you like”: You
can’t directly program concepts into someone else’s brain.
Even within the Aristotelian paradigm, where we pretend that the defini-
tions are the actual concepts, you don’t have simultaneous freedom of intension
and extension. Suppose I define Mars as “A huge red rocky sphere, around a
tenth of Earth’s mass and 50% further away from the Sun.” It’s then a sepa-
rate matter to show that this intensional definition matches some particular
extensional thing in my experience, or indeed, that it matches any real thing
whatsoever. If instead I say “That’s Mars” and point to a red light in the night
sky, it becomes a separate matter to show that this extensional light matches
any particular intensional definition I may propose—or any intensional beliefs
I may have—such as “Mars is the God of War.”
But most of the brain’s work of applying intensions happens sub-deliberately.
We aren’t consciously aware that our identification of a red light as “Mars” is
a separate matter from our verbal definition “Mars is the God of War.” No
matter what kind of intensional definition I make up to describe Mars, my
mind believes that “Mars” refers to this thingy, and that it is the fourth planet
in the Solar System.
When you take into account the way the human mind actually, pragmati-
cally works, the notion “I can define a word any way I like” soon becomes “I
can believe anything I want about a fixed set of objects” or “I can move any
object I want in or out of a fixed membership test.” Just as you can’t usually
convey a concept’s whole intension in words because it’s a big complicated neu-
ral membership test, you can’t control the concept’s entire intension because
it’s applied sub-deliberately. This is why arguing that XYZ is true “by defini-
tion” is so popular. If definition changes behaved like the empirical null-ops
they’re supposed to be, no one would bother arguing them. But abuse defini-
tions just a little, and they turn into magic wands—in arguments, of course;
not in reality.

1. Charles Sanders Peirce, Collected Papers (Harvard University Press, 1931).


157
Similarity Clusters

Once upon a time, the philosophers of Plato’s Academy claimed that the best
definition of human was a “featherless biped.” Diogenes of Sinope, also called
Diogenes the Cynic, is said to have promptly exhibited a plucked chicken
and declared “Here is Plato’s man.” The Platonists promptly changed their
definition to “a featherless biped with broad nails.”
No dictionary, no encyclopedia, has ever listed all the things that humans
have in common. We have red blood, five fingers on each of two hands, bony
skulls, 23 pairs of chromosomes—but the same might be said of other animal
species. We make complex tools to make complex tools, we use syntactical
combinatorial language, we harness critical fission reactions as a source of
energy: these things may serve to single out only humans, but not all
humans—many of us have never built a fission reactor. With the right set of
necessary-and-sufficient gene sequences you could single out all humans, and
only humans—at least for now—but it would still be far from all that humans
have in common.
But so long as you don’t happen to be near a plucked chicken, saying “Look
for featherless bipeds” may serve to pick out a few dozen of the particular
things that are humans, as opposed to houses, vases, sandwiches, cats, colors,
or mathematical theorems.
Once the definition “featherless biped” has been bound to some particular
featherless bipeds, you can look over the group, and begin harvesting some
of the other characteristics—beyond mere featherfree twolegginess—that the
“featherless bipeds” seem to share in common. The particular featherless
bipeds that you see seem to also use language, build complex tools, speak
combinatorial language with syntax, bleed red blood if poked, die when they
drink hemlock.
Thus the category “human” grows richer, and adds more and more charac-
teristics; and when Diogenes finally presents his plucked chicken, we are not
fooled: This plucked chicken is obviously not similar to the other “featherless
bipeds.”
(If Aristotelian logic were a good model of human psychology, the Platonists
would have looked at the plucked chicken and said, “Yes, that’s a human; what’s
your point?”)
If the first featherless biped you see is a plucked chicken, then you may end
up thinking that the verbal label “human” denotes a plucked chicken; so I can
modify my treasure map to point to “featherless bipeds with broad nails,” and
if I am wise, go on to say, “See Diogenes over there? That’s a human, and I’m
a human, and you’re a human; and that chimpanzee is not a human, though
fairly close.”
The initial clue only has to lead the user to the similarity cluster—the group
of things that have many characteristics in common. After that, the initial clue
has served its purpose, and I can go on to convey the new information “humans
are currently mortal,” or whatever else I want to say about us featherless bipeds.
A dictionary is best thought of, not as a book of Aristotelian class definitions,
but a book of hints for matching verbal labels to similarity clusters, or matching
labels to properties that are useful in distinguishing similarity clusters.

*
158
Typicality and Asymmetrical
Similarity

Birds fly. Well, except ostriches don’t. But which is a more typical bird—a
robin, or an ostrich?
Which is a more typical chair: a desk chair, a rocking chair, or a beanbag
chair?
Most people would say that a robin is a more typical bird, and a desk chair
is a more typical chair. The cognitive psychologists who study this sort of thing
experimentally, do so under the heading of “typicality effects” or “prototype
effects.”1 For example, if you ask subjects to press a button to indicate “true”
or “false” in response to statements like “A robin is a bird” or “A penguin is a
bird,” reaction times are faster for more central examples.2 Typicality measures
correlate well using different investigative methods—reaction times are one
example; you can also ask people to directly rate, on a scale of 1 to 10, how
well an example (like a specific robin) fits a category (like “bird”).
So we have a mental measure of typicality—which might, perhaps, function
as a heuristic—but is there a corresponding bias we can use to pin it down?
Well, which of these statements strikes you as more natural: “98 is approxi-
mately 100,” or “100 is approximately 98”? If you’re like most people, the first
statement seems to make more sense.3 For similar reasons, people asked to rate
how similar Mexico is to the United States, gave consistently higher ratings
than people asked to rate how similar the United States is to Mexico.4
And if that still seems harmless, a study by Rips showed that people were
more likely to expect a disease would spread from robins to ducks on an island,
than from ducks to robins.5 Now this is not a logical impossibility, but in a
pragmatic sense, whatever difference separates a duck from a robin and would
make a disease less likely to spread from a duck to a robin, must also be a
difference between a robin and a duck, and would make a disease less likely to
spread from a robin to a duck.
Yes, you can come up with rationalizations, like “Well, there could be more
neighboring species of the robins, which would make the disease more likely
to spread initially, etc.,” but be careful not to try too hard to rationalize the
probability ratings of subjects who didn’t even realize there was a comparison
going on. And don’t forget that Mexico is more similar to the United States
than the United States is to Mexico, and that 98 is closer to 100 than 100 is
to 98. A simpler interpretation is that people are using the (demonstrated)
similarity heuristic as a proxy for the probability that a disease spreads, and
this heuristic is (demonstrably) asymmetrical.
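One standard way to model that asymmetry is Tversky’s own contrast model (the feature sets and weights below are invented for illustration): score “a is similar to b” by the features they share, minus a heavier penalty for the subject’s distinctive features than for the referent’s.

    def tversky_similarity(a, b, alpha=0.8, beta=0.2, theta=1.0):
        """Similarity of subject a to referent b (Tversky's contrast model)."""
        return (theta * len(a & b)      # shared features
                - alpha * len(a - b)    # subject's distinctive features (weighted more)
                - beta * len(b - a))    # referent's distinctive features

    usa    = {"north_american", "federal_republic", "hollywood", "nasa",
              "super_bowl", "dollar"}
    mexico = {"north_american", "federal_republic"}

    print(tversky_similarity(mexico, usa))  # "Mexico is similar to the USA": 1.2
    print(tversky_similarity(usa, mexico))  # "The USA is similar to Mexico": -1.2

With the subject’s distinctive features weighted more heavily, the sparser item comes out more similar to the richer, more salient one than vice versa, matching the Mexico and United States ratings.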
Kansas is unusually close to the center of the United States, and Alaska is
unusually far from the center of the United States; so Kansas is probably closer
to most places in the US and Alaska is probably farther. It does not follow,
however, that Kansas is closer to Alaska than is Alaska to Kansas. But people
seem to reason (metaphorically speaking) as if closeness is an inherent property
of Kansas and distance is an inherent property of Alaska; so that Kansas is still
close, even to Alaska; and Alaska is still distant, even from Kansas.
So once again we see that Aristotle’s notion of categories—logical classes
with membership determined by a collection of properties that are individually
strictly necessary, and together strictly sufficient—is not a good model of
human cognitive psychology. (Science’s view has changed somewhat over
the last 2,350 years? Who would’ve thought?) We don’t even reason as if set
membership is a true-or-false property: statements of set membership can
be more or less true. (Note: This is not the same thing as being more or less
probable.)
One more reason not to pretend that you, or anyone else, is really going to
treat words as Aristotelian logical classes.

1. Eleanor Rosch, “Principles of Categorization,” in Cognition and Categorization, ed. Eleanor Rosch
and Barbara B. Lloyd (Hillsdale, NJ: Lawrence Erlbaum, 1978).

2. George Lakoff, Women, Fire, and Dangerous Things: What Categories Reveal about the Mind (Chicago: University of Chicago Press, 1987).

3. Jerrold Sadock, “Truth and Approximations,” Papers from the Third Annual Meeting of the Berkeley
Linguistics Society (1977): 430–439.

4. Amos Tversky and Itamar Gati, “Studies of Similarity,” in Cognition and Categorization, ed. Eleanor
Rosch and Barbara Lloyd (Hillsdale, NJ: Lawrence Erlbaum Associates, Inc., 1978), 79–98.

5. Lance J. Rips, “Inductive Judgments about Natural Categories,” Journal of Verbal Learning and
Verbal Behavior 14 (1975): 665–681.
159
The Cluster Structure of Thingspace

The notion of a “configuration space” is a way of translating object descriptions into object positions. It may seem like blue is “closer” to blue-green than to red,
but how much closer? It’s hard to answer that question by just staring at the
colors. But it helps to know that the (proportional) color coordinates in RGB
are 0:0:5, 0:3:2, and 5:0:0. It would be even clearer if plotted on a 3D graph.
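(To make “how much closer” literal, here is the arithmetic, using the proportional coordinates above and ordinary Euclidean distance, which is one possible metric among many.)

    from math import dist

    blue, blue_green, red = (0, 0, 5), (0, 3, 2), (5, 0, 0)
    print(dist(blue, blue_green))  # ~4.24
    print(dist(blue, red))         # ~7.07 -- blue is closer to blue-green than to red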
In the same way, you can see a robin as a robin—brown tail, red breast,
standard robin shape, maximum flying speed when unladen, its species-typical
DNA and individual alleles. Or you could see a robin as a single point in a
configuration space whose dimensions described everything we knew, or could
know, about the robin.
A robin is bigger than a virus, and smaller than an aircraft carrier—that
might be the “volume” dimension. Likewise a robin weighs more than a
hydrogen atom, and less than a galaxy; that might be the “mass” dimension.
Different robins will have strong correlations between “volume” and “mass,”
so the robin-points will be lined up in a fairly linear string, in those two
dimensions—but the correlation won’t be exact, so we do need two separate
dimensions.
This is the benefit of viewing robins as points in space: You couldn’t see
the linear lineup as easily if you were just imagining the robins as cute little
wing-flapping creatures.
A robin’s DNA is a highly multidimensional variable, but you can still think
of it as part of a robin’s location in thingspace—millions of quaternary coor-
dinates, one coordinate for each DNA base—or maybe a more sophisticated
view than that. The shape of the robin, and its color (surface reflectance), you
can likewise think of as part of the robin’s position in thingspace, even though
they aren’t single dimensions.
Just like the coordinate point 0:0:5 contains the same information as the
actual html color blue, we shouldn’t actually lose information when we see
robins as points in space. We believe the same statement about the robin’s mass
whether we visualize a robin balancing the scales opposite a 0.07-kilogram
weight, or a robin-point with a mass-coordinate of +70 (grams).
We can even imagine a configuration space with one or more dimensions
for every distinct characteristic of an object, so that the position of an object’s
point in this space corresponds to all the information in the real object itself.
Rather redundantly represented, too—dimensions would include the mass,
the volume, and the density.
If you think that’s extravagant, quantum physicists use an infinite-
dimensional configuration space, and a single point in that space describes the
location of every particle in the universe. So we’re actually being compara-
tively conservative in our visualization of thingspace—a point in thingspace
describes just one object, not the entire universe.
If we’re not sure of the robin’s exact mass and volume, then we can think
of a little cloud in thingspace, a volume of uncertainty, within which the robin
might be. The density of the cloud is the density of our belief that the robin has
that particular mass and volume. If you’re more sure of the robin’s density than
of its mass and volume, your probability-cloud will be highly concentrated in
the density dimension, and concentrated around a slanting line in the subspace
of mass/volume. (Indeed, the cloud here is actually a surface, because of the
relation V × D = M.)
“Radial categories” are how cognitive psychologists describe the non-
Aristotelian boundaries of words. The central “mother” conceives her child,
gives birth to it, and supports it. Is an egg donor who never sees her child a
mother? She is the “genetic mother.” What about a woman who is implanted
with a foreign embryo and bears it to term? She is a “surrogate mother.” And
the woman who raises a child that isn’t hers genetically? Why, she’s an “adop-
tive mother.” The Aristotelian syllogism would run, “Humans have ten fingers,
Fred has nine fingers, therefore Fred is not a human,” but the way we actu-
ally think is “Humans have ten fingers, Fred is a human, therefore Fred is a
‘nine-fingered human.’ ”
We can think about the radial-ness of categories in intensional terms, as
described above—properties that are usually present, but optionally absent.
If we thought about the intension of the word “mother,” it might be like a
distributed glow in thingspace, a glow whose intensity matches the degree to
which that volume of thingspace matches the category “mother.” The glow is
concentrated in the center of genetics and birth and child-raising; the volume
of egg donors would also glow, but less brightly.
Or we can think about the radial-ness of categories extensionally. Suppose
we mapped all the birds in the world into thingspace, using a distance metric
that corresponds as well as possible to perceived similarity in humans: A robin
is more similar to another robin, than either is similar to a pigeon, but robins
and pigeons are all more similar to each other than either is to a penguin,
et cetera.
Then the center of all birdness would be densely populated by many neigh-
boring tight clusters, robins and sparrows and canaries and pigeons and many
other species. Eagles and falcons and other large predatory birds would occupy
a nearby cluster. Penguins would be in a more distant cluster, and likewise
chickens and ostriches.
The result might look, indeed, something like an astronomical cluster:
many galaxies orbiting the center, and a few outliers.
Or we could think simultaneously about both the intension of the cognitive
category “bird,” and its extension in real-world birds: The central clusters of
robins and sparrows glowing brightly with highly typical birdness; satellite
clusters of ostriches and penguins glowing more dimly with atypical birdness,
and Abraham Lincoln a few megaparsecs away and glowing not at all.
I prefer that last visualization—the glowing points—because as I see it,
the structure of the cognitive intension followed from the extensional cluster
structure. First came the structure-in-the-world, the empirical distribution
of birds over thingspace; then, by observing it, we formed a category whose
intensional glow roughly overlays this structure.
This gives us yet another view of why words are not Aristotelian classes:
the empirical clustered structure of the real universe is not so crystalline. A
natural cluster, a group of things highly similar to each other, may have no set
of necessary and sufficient properties—no set of characteristics that all group
members have, and no non-members have.
But even if a category is irrecoverably blurry and bumpy, there’s no need
to panic. I would not object if someone said that birds are “feathered flying
things.” But penguins don’t fly!—well, fine. The usual rule has an exception;
it’s not the end of the world. Definitions can’t be expected to exactly match
the empirical structure of thingspace in any event, because the map is smaller
and much less complicated than the territory. The point of the definition
“feathered flying things” is to lead the listener to the bird cluster, not to give a
total description of every existing bird down to the molecular level.
When you draw a boundary around a group of extensional points empir-
ically clustered in thingspace, you may find at least one exception to every
simple intensional rule you can invent.
But if a definition works well enough in practice to point out the intended
empirical cluster, objecting to it may justly be called “nitpicking.”

*
160
Disguised Queries

Imagine that you have a peculiar job in a peculiar factory: Your task is to
take objects from a mysterious conveyor belt, and sort the objects into two
bins. When you first arrive, Susan the Senior Sorter explains to you that blue
egg-shaped objects are called “bleggs” and go in the “blegg bin,” while red
cubes are called “rubes” and go in the “rube bin.”
Once you start working, you notice that bleggs and rubes differ in ways
besides color and shape. Bleggs have fur on their surface, while rubes are
smooth. Bleggs flex slightly to the touch; rubes are hard. Bleggs are opaque,
the rube’s surface slightly translucent.
Soon after you begin working, you encounter a blegg shaded an unusually
dark blue—in fact, on closer examination, the color proves to be purple, halfway
between red and blue.
Yet wait! Why are you calling this object a “blegg”? A “blegg” was originally
defined as blue and egg-shaped—the qualification of blueness appears in the
very name “blegg,” in fact. This object is not blue. One of the necessary
qualifications is missing; you should call this a “purple egg-shaped object,” not
a “blegg.”
But it so happens that, in addition to being purple and egg-shaped, the
object is also furred, flexible, and opaque. So when you saw the object, you
thought, “Oh, a strangely colored blegg.” It certainly isn’t a rube . . . right?
Still, you aren’t quite sure what to do next. So you call over Susan the Senior
Sorter.

“Oh, yes, it’s a blegg,” Susan says, “you can put it in the blegg
bin.”
You start to toss the purple blegg into the blegg bin, but pause
for a moment. “Susan,” you say, “how do you know this is a
blegg?”
Susan looks at you oddly. “Isn’t it obvious? This object may
be purple, but it’s still egg-shaped, furred, flexible, and opaque,
like all the other bleggs. You’ve got to expect a few color defects.
Or is this one of those philosophical conundrums, like ‘How do
you know the world wasn’t created five minutes ago complete
with false memories?’ In a philosophical sense I’m not absolutely
certain that this is a blegg, but it seems like a good guess.”
“No, I mean . . .” You pause, searching for words. “Why is
there a blegg bin and a rube bin? What’s the difference between
bleggs and rubes?”
“Bleggs are blue and egg-shaped, rubes are red and cube-
shaped,” Susan says patiently. “You got the standard orientation
lecture, right?”
“Why do bleggs and rubes need to be sorted?”
“Er . . . because otherwise they’d be all mixed up?” says Susan.
“Because nobody will pay us to sit around all day and not sort
bleggs and rubes?”
“Who originally determined that the first blue egg-shaped
object was a ‘blegg,’ and how did they determine that?”
Susan shrugs. “I suppose you could just as easily call the
red cube-shaped objects ‘bleggs’ and the blue egg-shaped objects
‘rubes,’ but it seems easier to remember this way.”
You think for a moment. “Suppose a completely mixed-up
object came off the conveyor. Like, an orange sphere-shaped
furred translucent object with writhing green tentacles. How
could I tell whether it was a blegg or a rube?”
“Wow, no one’s ever found an object that mixed up,” says
Susan, “but I guess we’d take it to the sorting scanner.”
“How does the sorting scanner work?” you inquire. “X-rays?
Magnetic resonance imaging? Fast neutron transmission spec-
troscopy?”
“I’m told it works by Bayes’s Rule, but I don’t quite understand
how,” says Susan. “I like to say it, though. Bayes Bayes Bayes Bayes
Bayes.”
“What does the sorting scanner tell you?”
“It tells you whether to put the object into the blegg bin or the
rube bin. That’s why it’s called a sorting scanner.”
At this point you fall silent.
“Incidentally,” Susan says casually, “it may interest you to
know that bleggs contain small nuggets of vanadium ore, and
rubes contain shreds of palladium, both of which are useful in-
dustrially.”
“Susan, you are pure evil.”
“Thank you.”

So now it seems we’ve discovered the heart and essence of bleggness: a blegg
is an object that contains a nugget of vanadium ore. Surface characteristics,
like blue color and furredness, do not determine whether an object is a blegg;
surface characteristics only matter because they help you infer whether an
object is a blegg, that is, whether the object contains vanadium.
Containing vanadium is a necessary and sufficient definition: all bleggs
contain vanadium and everything that contains vanadium is a blegg: “blegg”
is just a shorthand way of saying “vanadium-containing object.” Right?
Not so fast, says Susan: Around 98% of bleggs contain vanadium, but 2%
contain palladium instead. To be precise (Susan continues) around 98% of
blue egg-shaped furred flexible opaque objects contain vanadium. For unusual
bleggs, it may be a different percentage: 95% of purple bleggs contain vanadium,
92% of hard bleggs contain vanadium, etc.
Now suppose you find a blue egg-shaped furred flexible opaque object, an
ordinary blegg in every visible way, and just for kicks you take it to the sorting
scanner, and the scanner says “palladium”—this is one of the rare 2%. Is it a
blegg?
At first you might answer that, since you intend to throw this object in the
rube bin, you might as well call it a “rube.” However, it turns out that almost
all bleggs, if you switch off the lights, glow faintly in the dark, while almost all
rubes do not glow in the dark. And the percentage of bleggs that glow in the
dark is not significantly different for blue egg-shaped furred flexible opaque
objects that contain palladium, instead of vanadium. Thus, if you want to guess
whether the object glows like a blegg, or remains dark like a rube, you should
guess that it glows like a blegg.
So is the object really a blegg or a rube?
On one hand, you’ll throw the object in the rube bin no matter what else
you learn. On the other hand, if there are any unknown characteristics of the
object you need to infer, you’ll infer them as if the object were a blegg, not a
rube—group it into the similarity cluster of blue egg-shaped furred flexible
opaque things, and not the similarity cluster of red cube-shaped smooth hard
translucent things.
The question “Is this object a blegg?” may stand in for different queries on
different occasions.
If it weren’t standing in for some query, you’d have no reason to care.
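A toy sketch of the two disguised queries (the numbers are assumptions of mine; only the structure comes from the story above): one function answers “which bin?”, the other answers “will it glow?”, and they key off different facts about the same object.

    def which_bin(interior):
        # For the sorting task, only the scanner's verdict matters.
        return "rube bin" if interior == "palladium" else "blegg bin"

    def p_glows(surface_cluster):
        # For predicting an unobserved characteristic, the similarity
        # cluster picked out by the surface features matters instead.
        return 0.97 if surface_cluster == "blegg-like" else 0.02  # assumed rates

    obj = {"surface_cluster": "blegg-like", "interior": "palladium"}
    print(which_bin(obj["interior"]))       # rube bin
    print(p_glows(obj["surface_cluster"]))  # 0.97 -- still expect it to glow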
Is atheism a “religion”? Is transhumanism a “cult”? People who argue that
atheism is a religion “because it states beliefs about God” are really trying to
argue (I think) that the reasoning methods used in atheism are on a par with
the reasoning methods used in religion, or that atheism is no safer than religion
in terms of the probability of causally engendering violence, etc. . . . What’s
really at stake is an atheist’s claim of substantial difference and superiority
relative to religion, which the religious person is trying to reject by denying
the difference rather than the superiority(!).
But that’s not the a priori irrational part: The a priori irrational part is where,
in the course of the argument, someone pulls out a dictionary and looks up
the definition of “atheism” or “religion.” (And yes, it’s just as silly whether an
atheist or religionist does it.) How could a dictionary possibly decide whether
an empirical cluster of atheists is really substantially different from an empirical
cluster of theologians? How can reality vary with the meaning of a word? The
points in thingspace don’t move around when we redraw a boundary.
But people often don’t realize that their argument about where to draw a
definitional boundary, is really a dispute over whether to infer a characteristic
shared by most things inside an empirical cluster . . .
Hence the phrase, “disguised query.”

*
161
Neural Categories

In Disguised Queries, I talked about a classification task of “bleggs” and “rubes.” The typical blegg is blue, egg-shaped, furred, flexible, opaque, glows
in the dark, and contains vanadium. The typical rube is red, cube-shaped,
smooth, hard, translucent, unglowing, and contains palladium. For the sake of
simplicity, let us forget the characteristics of flexibility/hardness and opaque-
ness/translucency. This leaves five dimensions in thingspace: color, shape,
texture, luminance, and interior.
Suppose I want to create an Artificial Neural Network (ANN) to predict
unobserved blegg characteristics from observed blegg characteristics. And
suppose I’m fairly naive about ANNs: I’ve read excited popular science books
about how neural networks are distributed, emergent, and parallel just like the
human brain!! but I can’t derive the differential equations for gradient descent
in a non-recurrent multilayer network with sigmoid units (which is actually a
lot easier than it sounds).
Then I might design a neural network that looks something like Figure 161.1.

Figure 161.1: Network 1. Five nodes: Color (+blue / -red), Shape (+egg / -cube), Texture (+furred / -smooth), Luminance (+glow / -dark), and Interior (+vanadium / -palladium), with a direct connection between every pair of nodes.

Network 1 is for classifying bleggs and rubes. But since “blegg” is an unfamiliar and synthetic concept, I’ve also included a similar Network 1b in Figure 161.2 for distinguishing humans from Space Monsters, with input from Aristotle (“All men are mortal”) and Plato’s Academy (“A featherless biped with broad nails”).
A neural network needs a learning rule. The obvious idea is that when two
nodes are often active at the same time, we should strengthen the connection
between them—this is one of the first rules ever proposed for training a neural
network, known as Hebb’s Rule.
Thus, if you often saw things that were both blue and furred—thus simul-
taneously activating the “color” node in the + state and the “texture” node
in the + state—the connection would strengthen between color and texture,
so that + colors activated + textures, and vice versa. If you saw things that
were blue and egg-shaped and vanadium-containing, that would strengthen
positive mutual connections between color and shape and interior.
Figure 161.2: Network 1b. Five nodes: Lifespan (+mortal / -immortal), Nails (+broad / -talons), Feathers (+no / -yes), Legs (+2 / -17), and Blood (+red / -glows green), connected to one another in the same pattern as Network 1.

Let’s say you’ve already seen plenty of bleggs and rubes come off the conveyor
belt. But now you see something that’s furred, egg-shaped, and—gasp!—
reddish purple (which we’ll model as a “color” activation level of −2/3). You
haven’t yet tested the luminance, or the interior. What to predict, what to
predict?
What happens then is that the activation levels in Network 1 bounce around
a bit. Positive activation flows to luminance from shape, negative activation
flows to interior from color, negative activation flows from interior to lumi-
nance . . . Of course all these messages are passed in parallel!! and asyn-
chronously!! just like the human brain . . .
Finally Network 1 settles into a stable state, which has high positive acti-
vation for “luminance” and “interior.” The network may be said to “expect”
(though it has not yet seen) that the object will glow in the dark, and that it
contains vanadium.
And lo, Network 1 exhibits this behavior even though there’s no explicit
node that says whether the object is a blegg or not. The judgment is implicit
in the whole network!! Bleggness is an attractor!! which arises as the result of
emergent behavior!! from the distributed!! learning rule.
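
To make the settling process concrete, here is a toy Python sketch of Network 1 relaxing toward an attractor. The uniform +1 weights, the tanh squashing function, and the fixed number of update sweeps are assumptions chosen for illustration; the essay only specifies that activation spreads until the network reaches a stable state.

import itertools
import math

NODES = ["color", "shape", "texture", "luminance", "interior"]
# Assume training has left a positive weight between every pair of nodes.
W = {frozenset(pair): 1.0 for pair in itertools.combinations(NODES, 2)}

def settle(observed, sweeps=50):
    # Clamp the observed nodes at their given activation levels, then repeatedly
    # update each unobserved node toward tanh(weighted sum of the other nodes).
    act = {n: observed.get(n, 0.0) for n in NODES}
    for _ in range(sweeps):
        for n in NODES:
            if n in observed:
                continue  # clamped
            total = sum(W[frozenset((n, m))] * act[m] for m in NODES if m != n)
            act[n] = math.tanh(total)
    return act

# Furred, egg-shaped, reddish purple (color activation -2/3); luminance and
# interior not yet tested.
print(settle({"texture": +1.0, "shape": +1.0, "color": -2 / 3}))
# Both unobserved nodes end up strongly positive: the network "expects" the
# object to glow in the dark and to contain vanadium.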
Now in real life, this kind of network design—however faddish it may
sound—runs into all sorts of problems. Recurrent networks don’t always settle
right away: They can oscillate, or exhibit chaotic behavior, or just take a very
long time to settle down. This is a Bad Thing when you see something big
and yellow and striped, and you have to wait five minutes for your distributed
neural network to settle into the “tiger” attractor. Asynchronous and parallel
it may be, but it’s not real-time.
And there are other problems, like double-counting the evidence when
messages bounce back and forth: If you suspect that an object glows in the
dark, your suspicion will activate belief that the object contains vanadium,
which in turn will activate belief that the object glows in the dark.
Plus if you try to scale up the Network 1 design, it requires O(N²) connections, where N is the total number of observables.
So what might be a more realistic neural network design?
In Network 2 of Figure 161.3, a wave of activation converges on the central
node from any clamped (observed) nodes, and then surges back out again
to any unclamped (unobserved) nodes. Which means we can compute the
answer in one step, rather than waiting for the network to settle—an important
requirement in biology when the neurons only run at 20Hz. And the network
architecture scales as O(N), rather than O(N²).
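
Here is an equally toy Python sketch of that two-wave computation, using an assumed weight of +1 between each observable and the central node; the averaging step is likewise just an illustrative choice.

NODES = ["color", "shape", "texture", "luminance", "interior"]
# Each observable is linked only to the central Category node: 2N connections
# in all, instead of one connection for every pair of observables.
W_CENTRAL = {n: 1.0 for n in NODES}

def network2_predict(observed):
    # Wave in: the clamped (observed) nodes activate the central node.
    category = sum(W_CENTRAL[n] * v for n, v in observed.items()) / len(observed)
    # Wave out: the central node activates every unclamped (unobserved) node.
    predictions = {n: W_CENTRAL[n] * category for n in NODES if n not in observed}
    return category, predictions

category, predictions = network2_predict({"texture": +1.0, "shape": +1.0, "color": -2 / 3})
print(category)     # positive on balance: the evidence favors BLEGG
print(predictions)  # luminance and interior predicted positive, in one pass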
Admittedly, there are some things you can notice more easily with the first
network architecture than the second. Network 1 has a direct connection
between every two nodes. So if red objects never glow in the dark, but red
furred objects usually have the other blegg characteristics like egg-shape and
vanadium, Network 1 can easily represent this: it just takes a very strong direct
negative connection from color to luminance, but more powerful positive
connections from texture to all other nodes except luminance.
[Figure 161.3: Network 2. The same five observable nodes as Network 1, each connected only to a central Category node (+BLEGG / -RUBE).]

Nor is this a “special exception” to the general rule that bleggs glow—remember,
in Network 1, there is no unit that represents blegg-ness; blegg-ness emerges
as an attractor in the distributed network.
So yes, those O(N²) connections were buying us something. But not very
much. Network 1 is not more useful on most real-world problems, where you
rarely find an animal stuck halfway between being a cat and a dog.
(There are also facts that you can’t easily represent in Network 1 or Network
2. Let’s say sea-blue color and spheroid shape, when found together, always
indicate the presence of palladium; but when found individually, without
the other, they are each very strong evidence for vanadium. This is hard to
represent, in either architecture, without extra nodes. Both Network 1 and
Network 2 embody implicit assumptions about what kind of environmental
structure is likely to exist; the ability to read this off is what separates the adults
from the babes, in machine learning.)
Make no mistake: Neither Network 1 nor Network 2 is biologically realistic.
But it still seems like a fair guess that however the brain really works, it is in
some sense closer to Network 2 than Network 1. Fast, cheap, scalable, works
well to distinguish dogs and cats: natural selection goes for that sort of thing
like water running down a fitness landscape.
It seems like an ordinary enough task to classify objects as either bleggs or
rubes, tossing them into the appropriate bin. But would you notice if sea-blue
objects never glowed in the dark?
Maybe, if someone presented you with twenty objects that were alike only in
being sea-blue, and then switched off the light, and none of the objects glowed.
If you got hit over the head with it, in other words. Perhaps by presenting you
with all these sea-blue objects in a group, your brain forms a new subcategory,
and can detect the “doesn’t glow” characteristic within that subcategory. But
you probably wouldn’t notice if the sea-blue objects were scattered among a
hundred other bleggs and rubes. It wouldn’t be easy or intuitive to notice, the
way that distinguishing cats and dogs is easy and intuitive.
Or: “Socrates is human, all humans are mortal, therefore Socrates is mortal.”
How did Aristotle know that Socrates was human? Well, Socrates had no
feathers, and broad nails, and walked upright, and spoke Greek, and, well,
was generally shaped like a human and acted like one. So the brain decides,
once and for all, that Socrates is human; and from there, infers that Socrates is
mortal like all other humans thus yet observed. It doesn’t seem easy or intuitive
to ask how much wearing clothes, as opposed to using language, is associated
with mortality. Just, “things that wear clothes and use language are human”
and “humans are mortal.”
Are there biases associated with trying to classify things into categories
once and for all? Of course there are. See e.g. Cultish Countercultishness.

*
162
How An Algorithm Feels From
Inside

“If a tree falls in the forest, and no one hears it, does it make a sound?” I
remember seeing an actual argument get started on this subject—a fully naive
argument that went nowhere near Berkeleian subjectivism. Just:

“It makes a sound, just like any other falling tree!”


“But how can there be a sound that no one hears?”

The standard rationalist view would be that the first person is speaking as if
“sound” means acoustic vibrations in the air; the second person is speaking
as if “sound” means an auditory experience in a brain. If you ask “Are there
acoustic vibrations?” or “Are there auditory experiences?,” the answer is at
once obvious. And so the argument is really about the definition of the word
“sound.”
I think the standard analysis is essentially correct. So let’s accept that as
a premise, and ask: Why do people get into such arguments? What’s the
underlying psychology?
A key idea of the heuristics and biases program is that mistakes are often
more revealing of cognition than correct answers. Getting into a heated dispute
about whether, if a tree falls in a deserted forest, it makes a sound, is traditionally
considered a mistake.
So what kind of mind design corresponds to that error?
In Disguised Queries I introduced the blegg/rube classification task, in
which Susan the Senior Sorter explains that your job is to sort objects coming
off a conveyor belt, putting the blue eggs or “bleggs” into one bin, and the red
cubes or “rubes” into the rube bin. This, it turns out, is because bleggs contain
small nuggets of vanadium ore, and rubes contain small shreds of palladium,
both of which are useful industrially.
Except that around 2% of blue egg-shaped objects contain palladium instead.
So if you find a blue egg-shaped thing that contains palladium, should you call
it a “rube” instead? You’re going to put it in the rube bin—why not call it a
“rube”?
But when you switch off the light, nearly all bleggs glow faintly in the dark.
And blue egg-shaped objects that contain palladium are just as likely to glow
in the dark as any other blue egg-shaped object.
So if you find a blue egg-shaped object that contains palladium and you ask
“Is it a blegg?,” the answer depends on what you have to do with the answer. If
you ask “Which bin does the object go in?,” then you choose as if the object
is a rube. But if you ask “If I turn off the light, will it glow?,” you predict as if
the object is a blegg. In one case, the question “Is it a blegg?” stands in for the
disguised query, “Which bin does it go in?” In the other case, the question “Is
it a blegg?” stands in for the disguised query, “Will it glow in the dark?”
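
The same point can be put in a few lines of Python. The object description follows the text; the two functions are made-up stand-ins for the two disguised queries.

anomalous_object = {"color": "blue", "shape": "egg", "interior": "palladium"}

def which_bin(obj):
    # Sorting is about recovering the metal, so this query follows the interior.
    return "rube bin" if obj["interior"] == "palladium" else "blegg bin"

def will_glow_in_the_dark(obj):
    # Glowing tracks the blue-egg cluster, not the interior: palladium-containing
    # blue eggs glow just as often as any other blue egg-shaped object.
    return obj["color"] == "blue" and obj["shape"] == "egg"

print(which_bin(anomalous_object))              # "rube bin": choose as if it's a rube
print(will_glow_in_the_dark(anomalous_object))  # True: predict as if it's a blegg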
Now suppose that you have an object that is blue and egg-shaped and
contains palladium; and you have already observed that it is furred, flexible,
opaque, and glows in the dark.
This answers every query, observes every observable introduced. There’s
nothing left for a disguised query to stand for.
So why might someone feel an impulse to go on arguing whether the object
is really a blegg?
These diagrams from Neural Categories show two different neural networks
that might be used to answer questions about bleggs and rubes. Network 1
(Figure 162.1) has a number of disadvantages—such as potentially oscillating/chaotic behavior, or requiring O(N²) connections—but Network 1's structure does have one major advantage over Network 2: every unit in the network corresponds to a testable query. If you observe every observable, clamping every value, there are no units in the network left over.

[Figure 162.1: Network 1, reproduced from Neural Categories: the five observable nodes (Color, Shape, Texture, Luminance, Interior) with a direct connection between every pair.]
Network 2 (Figure 162.2), however, is a far better candidate for being some-
thing vaguely like how the human brain works: It’s fast, cheap, scalable—and
has an extra dangling unit in the center, whose activation can still vary, even
after we’ve observed every single one of the surrounding nodes.
Which is to say that even after you know whether an object is blue or
red, egg or cube, furred or smooth, bright or dark, and whether it contains
vanadium or palladium, it feels like there’s a leftover, unanswered question:
But is it really a blegg?
[Figure 162.2: Network 2, reproduced from Neural Categories: the five observable nodes, each connected only to the central Category node (+BLEGG / -RUBE).]

Usually, in our daily experience, acoustic vibrations and auditory experience go together. But a tree falling in a deserted forest unbundles this common
association. And even after you know that the falling tree creates acoustic
vibrations but not auditory experience, it feels like there’s a leftover question:
Did it make a sound?
We know where Pluto is, and where it’s going; we know Pluto’s shape, and
Pluto’s mass—but is it a planet?
Now remember: When you look at Network 2, as I’ve laid it out here,
you’re seeing the algorithm from the outside. People don’t think to themselves,
“Should the central unit fire, or not?” any more than you think “Should neuron
#12,234,320,242 in my visual cortex fire, or not?”
It takes a deliberate effort to visualize your brain from the outside—and
then you still don’t see your actual brain; you imagine what you think is there.
Hopefully based on science, but regardless, you don’t have any direct access to
neural network structures from introspection. That’s why the ancient Greeks
didn’t invent computational neuroscience.
When you look at Network 2, you are seeing from the outside; but the way
that neural network structure feels from the inside, if you yourself are a brain
running that algorithm, is that even after you know every characteristic of the
object, you still find yourself wondering: “But is it a blegg, or not?”
This is a great gap to cross, and I’ve seen it stop people in their tracks.
Because we don’t instinctively see our intuitions as “intuitions,” we just see
them as the world. When you look at a green cup, you don’t think of yourself
as seeing a picture reconstructed in your visual cortex—although that is what
you are seeing—you just see a green cup. You think, “Why, look, this cup is
green,” not, “The picture in my visual cortex of this cup is green.”
And in the same way, when people argue over whether the falling tree makes
a sound, or whether Pluto is a planet, they don’t see themselves as arguing over
whether a categorization should be active in their neural networks. It seems
like either the tree makes a sound, or not.
We know where Pluto is, and where it’s going; we know Pluto’s shape, and
Pluto’s mass—but is it a planet? And yes, there were people who said this was a
fight over definitions—but even that is a Network 2 sort of perspective, because
you’re arguing about how the central unit ought to be wired up. If you were a
mind constructed along the lines of Network 1, you wouldn’t say “It depends
on how you define ‘planet,’ ” you would just say, “Given that we know Pluto’s
orbit and shape and mass, there is no question left to ask.” Or, rather, that’s
how it would feel—it would feel like there was no question left—if you were a
mind constructed along the lines of Network 1.
Before you can question your intuitions, you have to realize that what your
mind’s eye is looking at is an intuition—some cognitive algorithm, as seen
from the inside—rather than a direct perception of the Way Things Really Are.
People cling to their intuitions, I think, not so much because they believe
their cognitive algorithms are perfectly reliable, but because they can’t see their
intuitions as the way their cognitive algorithms happen to look from the inside.
And so everything you try to say about how the native cognitive algorithm
goes astray, ends up being contrasted to their direct perception of the Way
Things Really Are—and discarded as obviously wrong.

*
163
Disputing Definitions

I have watched more than one conversation—even conversations supposedly about cognitive science—go the route of disputing over definitions. Taking
the classic example to be “If a tree falls in a forest, and no one hears it, does it
make a sound?,” the dispute often follows a course like this:

If a tree falls in the forest, and no one hears it, does it make a sound?
Albert: “Of course it does. What kind of silly question is
that? Every time I’ve listened to a tree fall, it made a sound, so
I’ll guess that other trees falling also make sounds. I don’t believe
the world changes around when I’m not looking.”
Barry: “Wait a minute. If no one hears it, how can it be a
sound?”

In this example, Barry is arguing with Albert because of a genuinely different intuition about what constitutes a sound. But there's more than one way
the Standard Dispute can start. Barry could have a motive for rejecting Al-
bert’s conclusion. Or Barry could be a skeptic who, upon hearing Albert’s
argument, reflexively scrutinized it for possible logical flaws; and then, on find-
ing a counterargument, automatically accepted it without applying a second
layer of search for a counter-counterargument; thereby arguing himself into
the opposite position. This doesn’t require that Barry’s prior intuition—the in-
tuition Barry would have had, if we’d asked him before Albert spoke—differs
from Albert’s.
Well, if Barry didn’t have a differing intuition before, he sure has one now.

Albert: "What do you mean, there's no sound? The tree's roots snap, the trunk comes crashing down and hits the ground.
This generates vibrations that travel through the ground and the
air. That’s where the energy of the fall goes, into heat and sound.
Are you saying that if people leave the forest, the tree violates
Conservation of Energy?”
Barry: “But no one hears anything. If there are no humans
in the forest, or, for the sake of argument, anything else with a
complex nervous system capable of ‘hearing,’ then no one hears
a sound.”

Albert and Barry recruit arguments that feel like support for their respec-
tive positions, describing in more detail the thoughts that caused their
“sound”-detectors to fire or stay silent. But so far the conversation has still fo-
cused on the forest, rather than definitions. And note that they don’t actually
disagree on anything that happens in the forest.

Albert: "This is the dumbest argument I've ever been in. You're a niddlewicking fallumphing pickleplumber."
Barry: “Yeah? Well, you look like your face caught on fire
and someone put it out with a shovel.”

Insult has been proffered and accepted; now neither party can back down
without losing face. Technically, this isn’t part of the argument, as rationalists
account such things; but it’s such an important part of the Standard Dispute
that I’m including it anyway.

Albert: "The tree produces acoustic vibrations. By definition, that is a sound."
Barry: “No one hears anything. By definition, that is not a
sound.”
The argument starts shifting to focus on definitions. Whenever you feel
tempted to say the words “by definition” in an argument that is not literally
about pure mathematics, remember that anything which is true “by definition”
is true in all possible worlds, and so observing its truth can never constrain
which world you live in.

Albert: "My computer's microphone can record a sound without anyone being around to hear it, store it as a file, and it's
called a ‘sound file.’ And what’s stored in the file is the pattern
of vibrations in air, not the pattern of neural firings in anyone’s
brain. ‘Sound’ means a pattern of vibrations.”

Albert deploys an argument that feels like support for the word “sound” having
a particular meaning. This is a different kind of question from whether acoustic
vibrations take place in a forest—but the shift usually passes unnoticed.

Barry: “Oh, yeah? Let’s just see if the dictionary agrees with
you.”

There's a lot of things I could be curious about in the falling-tree scenario. I could go into the forest and look at trees, or learn how to derive the wave
equation for changes of air pressure, or examine the anatomy of an ear, or
study the neuroanatomy of the auditory cortex. Instead of doing any of these
things, I am to consult a dictionary, apparently. Why? Are the editors of the
dictionary expert botanists, expert physicists, expert neuroscientists? Looking
in an encyclopedia might make sense, but why a dictionary?

Albert: "Hah! Definition 2c in Merriam-Webster: 'Sound: Mechanical radiant energy that is transmitted by longitudinal
pressure waves in a material medium (as air).’ ”
Barry: “Hah! Definition 2b in Merriam-Webster: ‘Sound:
The sensation perceived by the sense of hearing.’ ”
Albert and Barry, chorus: “Consarned dictionary! This
doesn’t help at all!”

Dictionary editors are historians of usage, not legislators of language. Dictionary editors find words in current usage, then write down the words next to (a
small part of) what people seem to mean by them. If there’s more than one
usage, the editors write down more than one definition.

Albert: "Look, suppose that I left a microphone in the forest and recorded the pattern of the acoustic vibrations of the tree
falling. If I played that back to someone, they’d call it a ‘sound’!
That’s the common usage! Don’t go around making up your own
wacky definitions!”
Barry: “One, I can define a word any way I like so long as I
use it consistently. Two, the meaning I gave was in the dictionary.
Three, who gave you the right to decide what is or isn’t common
usage?”

There’s quite a lot of rationality errors in the Standard Dispute. Some of them
I’ve already covered, and some of them I’ve yet to cover; likewise the remedies.
But for now, I would just like to point out—in a mournful sort of way—that
Albert and Barry seem to agree on virtually every question of what is actually
going on inside the forest, and yet it doesn’t seem to generate any feeling of
agreement.
Arguing about definitions is a garden path; people wouldn’t go down the
path if they saw at the outset where it led. If you asked Albert (Barry) why he’s
still arguing, he’d probably say something like: “Barry (Albert) is trying to
sneak in his own definition of ‘sound,’ the scurvey scoundrel, to support his
ridiculous point; and I’m here to defend the standard definition.”
But suppose I went back in time to before the start of the argument:

(Eliezer appears from nowhere in a peculiar conveyance that looks just like the time machine from the original The Time Machine
movie.)
Barry: “Gosh! A time traveler!”
Eliezer: “I am a traveler from the future! Hear my words! I
have traveled far into the past—around fifteen minutes—”
Albert: “Fifteen minutes?”
Eliezer: “—to bring you this message!”
(There is a pause of mixed confusion and expectancy.)
Eliezer: “Do you think that ‘sound’ should be defined to
require both acoustic vibrations (pressure waves in air) and also
auditory experiences (someone to listen to the sound), or should
‘sound’ be defined as meaning only acoustic vibrations, or only
auditory experience?”
Barry: “You went back in time to ask us that?”
Eliezer: “My purposes are my own! Answer!”
Albert: “Well . . . I don’t see why it would matter. You can
pick any definition so long as you use it consistently.”
Barry: “Flip a coin. Er, flip a coin twice.”
Eliezer: “Personally I’d say that if the issue arises, both sides
should switch to describing the event in unambiguous lower-level
constituents, like acoustic vibrations or auditory experiences. Or
each side could designate a new word, like ‘alberzle’ and ‘bargu-
lum,’ to use for what they respectively used to call ‘sound’; and
then both sides could use the new words consistently. That way
neither side has to back down or lose face, but they can still com-
municate. And of course you should try to keep track, at all times,
of some testable proposition that the argument is actually about.
Does that sound right to you?”
Albert: “I guess . . .”
Barry: “Why are we talking about this?”
Eliezer: “To preserve your friendship against a contingency
you will, now, never know. For the future has already changed!”
(Eliezer and the machine vanish in a puff of smoke.)
Barry: “Where were we again?”
Albert: “Oh, yeah: If a tree falls in the forest, and no one
hears it, does it make a sound?”
Barry: “It makes an alberzle but not a bargulum. What’s the
next question?”

This remedy doesn’t destroy every dispute over categorizations. But it destroys
a substantial fraction.

*
164
Feel the Meaning

When I hear someone say, “Oh, look, a butterfly,” the spoken phonemes “but-
terfly” enter my ear and vibrate on my ear drum, being transmitted to the
cochlea, tickling auditory nerves that transmit activation spikes to the auditory
cortex, where phoneme processing begins, along with recognition of words,
and reconstruction of syntax (a by no means serial process), and all manner of
other complications.
But at the end of the day, or rather, at the end of the second, I am primed to
look where my friend is pointing and see a visual pattern that I will recognize
as a butterfly; and I would be quite surprised to see a wolf instead.
My friend looks at a butterfly, his throat vibrates and lips move, the pressure
waves travel invisibly through the air, my ear hears and my nerves transduce
and my brain reconstructs, and lo and behold, I know what my friend is looking
at. Isn’t that marvelous? If we didn’t know about the pressure waves in the
air, it would be a tremendous discovery in all the newspapers: Humans are
telepathic! Human brains can transfer thoughts to each other!
Well, we are telepathic, in fact; but magic isn’t exciting when it’s merely
real, and all your friends can do it too.
Think telepathy is simple? Try building a computer that will be telepathic
with you. Telepathy, or “language,” or whatever you want to call our partial
thought transfer ability, is more complicated than it looks.
But it would be quite inconvenient to go around thinking, “Now I shall
partially transduce some features of my thoughts into a linear sequence of
phonemes which will invoke similar thoughts in my conversational partner . . .”
So the brain hides the complexity—or rather, never represents it in the first
place—which leads people to think some peculiar thoughts about words.
As I remarked earlier, when a large yellow striped object leaps at me, I
think “Yikes! A tiger!” not “Hm . . . objects with the properties of largeness,
yellowness, and stripedness have previously often possessed the properties
‘hungry’ and ‘dangerous,’ and therefore, although it is not logically necessary,
auughhhh crunch crunch gulp.”
Similarly, when someone shouts “Yikes! A tiger!,” natural selection would
not favor an organism that thought, “Hm . . . I have just heard the syllables
‘Tie’ and ‘Grr’ which my fellow tribe members associate with their internal
analogues of my own tiger concept, and which they are more likely to utter if
they see an object they categorize as aiiieeee crunch crunch help it’s got my
arm crunch gulp.”
Considering this as a design constraint on the human cognitive architecture,
you wouldn’t want any extra steps between when your auditory cortex recog-
nizes the syllables “tiger,” and when the tiger concept gets activated.
Going back to the parable of bleggs and rubes, and the centralized network
that categorizes quickly and cheaply, you might visualize a direct connection
running from the unit that recognizes the syllable “blegg” to the unit at the
center of the blegg network. The central unit, the blegg concept, gets activated
almost as soon as you hear Susan the Senior Sorter say, “Blegg!”
Or, for purposes of talking—which also shouldn’t take eons—as soon as
you see a blue egg-shaped thing and the central blegg unit fires, you holler
“Blegg!” to Susan.
And what that algorithm feels like from inside is that the label, and the
concept, are very nearly identified; the meaning feels like an intrinsic property
of the word itself.
[Figure 164.1: Network 3. Network 2's five observable nodes and central Category node, plus a unit that recognizes the spoken label "Blegg!", connected directly to the central node.]

The cognoscenti will recognize this as a case of E. T. Jaynes's "Mind Projection Fallacy." It feels like a word has a meaning, as a property of the word itself; just
like how redness is a property of a red apple, or mysteriousness is a property
of a mysterious phenomenon.
Indeed, on most occasions, the brain will not distinguish at all between
the word and the meaning—only bothering to separate the two while learning
a new language, perhaps. And even then, you’ll see Susan pointing to a blue
egg-shaped thing and saying “Blegg!,” and you’ll think, I wonder what “blegg”
means, and not, I wonder what mental category Susan associates to the auditory
label “blegg.”
Consider, in this light, the part of the Standard Dispute of Definitions where
the two parties argue about what the word “sound” really means—the same
way they might argue whether a particular apple is really red or green:
Albert: “My computer’s microphone can record a sound
without anyone being around to hear it, store it as a file, and it’s
called a ‘sound file.’ And what’s stored in the file is the pattern
of vibrations in air, not the pattern of neural firings in anyone’s
brain. ‘Sound’ means a pattern of vibrations.”
Barry: “Oh, yeah? Let’s just see if the dictionary agrees with
you.”

Albert feels intuitively that the word “sound” has a meaning and that the
meaning is acoustic vibrations. Just as Albert feels that a tree falling in the
forest makes a sound (rather than causing an event that matches the sound
category).
Barry likewise feels that:

sound.meaning == auditory experiences
forest.sound == false.

Rather than:

myBrain.FindConcept("sound") == concept_AuditoryExperience
concept_AuditoryExperience.match(forest) == false.

Which is closer to what’s really going on; but humans have not evolved to know
this, any more than humans instinctively know the brain is made of neurons.
Albert and Barry’s conflicting intuitions provide the fuel for continuing the
argument in the phase of arguing over what the word “sound” means—which
feels like arguing over a fact like any other fact, like arguing over whether the
sky is blue or green.
You may not even notice that anything has gone astray, until you try to
perform the rationalist ritual of stating a testable experiment whose result
depends on the facts you’re so heatedly disputing . . .

*
165
The Argument from Common Usage

Part of the Standard Definitional Dispute runs as follows:

Albert: "Look, suppose that I left a microphone in the forest and recorded the pattern of the acoustic vibrations of the tree
falling. If I played that back to someone, they’d call it a ‘sound’!
That’s the common usage! Don’t go around making up your own
wacky definitions!”
Barry: “One, I can define a word any way I like so long as I
use it consistently. Two, the meaning I gave was in the dictionary.
Three, who gave you the right to decide what is or isn’t common
usage?”

Not all definitional disputes progress as far as recognizing the notion of com-
mon usage. More often, I think, someone picks up a dictionary because they
believe that words have meanings, and the dictionary faithfully records what
this meaning is. Some people even seem to believe that the dictionary deter-
mines the meaning—that the dictionary editors are the Legislators of Language.
Maybe because back in elementary school, their authority-teacher said that
they had to obey the dictionary, that it was a mandatory rule rather than an
optional one?
Dictionary editors read what other people write, and record what the words
seem to mean; they are historians. The Oxford English Dictionary may be
comprehensive, but never authoritative.
But surely there is a social imperative to use words in a commonly under-
stood way? Does not our human telepathy, our valuable power of language,
rely on mutual coordination to work? Perhaps we should voluntarily treat dic-
tionary editors as supreme arbiters—even if they prefer to think of themselves
as historians—in order to maintain the quiet cooperation on which all speech
depends.
The phrase "authoritative dictionary" is almost never used correctly, an example of proper usage being The Authoritative Dictionary of IEEE Standards Terms. The IEEE is a body of voting members who have a professional need for exact agreement on terms and definitions, and so The Authoritative Dictionary of IEEE Standards Terms is actual, negotiated legislation, which exerts whatever authority one regards as residing in the IEEE.
In everyday life, shared language usually does not arise from a deliberate
agreement, as of the IEEE. It's more a matter of infection, as words are invented
and diffuse through the culture. (A “meme,” one might say, following Richard
Dawkins forty years ago—but you already know what I mean, and if not, you
can look it up on Google, and then you too will have been infected.)
Yet as the example of the IEEE shows, agreement on language can also
be a cooperatively established public good. If you and I wish to undergo an
exchange of thoughts via language, the human telepathy, then it is in our mutual
interest that we use the same word for similar concepts—preferably, concepts
similar to the limit of resolution in our brain’s representation thereof—even
though we have no obvious mutual interest in using any particular word for a
concept.
We have no obvious mutual interest in using the word “oto” to mean sound,
or “sound” to mean oto; but we have a mutual interest in using the same word,
whichever word it happens to be. (Preferably, words we use frequently should
be short, but let’s not get into information theory just yet.)
But, while we have a mutual interest, it is not strictly necessary that you
and I use the similar labels internally; it is only convenient. If I know that, to
you, “oto” means sound—that is, you associate “oto” to a concept very similar
to the one I associate to “sound”—then I can say “Paper crumpling makes a
crackling oto.” It requires extra thought, but I can do it if I want.
Similarly, if you say “What is the walking-stick of a bowling ball dropping
on the floor?” and I know which concept you associate with the syllables
“walking-stick,” then I can figure out what you mean. It may require some
thought, and give me pause, because I ordinarily associate “walking-stick” with
a different concept. But I can do it just fine.
When humans really want to communicate with each other, we’re hard to
stop! If we’re stuck on a deserted island with no common language, we’ll take
up sticks and draw pictures in sand.
Albert’s appeal to the Argument from Common Usage assumes that agree-
ment on language is a cooperatively established public good. Yet Albert as-
sumes this for the sole purpose of rhetorically accusing Barry of breaking
the agreement, and endangering the public good. Now the falling-tree argu-
ment has gone all the way from botany to semantics to politics; and so Barry
responds by challenging Albert for the authority to define the word.
A rationalist, with the discipline of hugging the query active, would notice
that the conversation had gone rather far astray.
Oh, dear reader, is it all really necessary? Albert knows what Barry means by
“sound.” Barry knows what Albert means by “sound.” Both Albert and Barry
have access to words, such as “acoustic vibrations” or “auditory experience,”
which they already associate to the same concepts, and which can describe
events in the forest without ambiguity. If they were stuck on a deserted island,
trying to communicate with each other, their work would be done.
When both sides know what the other side wants to say, and both sides
accuse the other side of defecting from “common usage,” then whatever it is
they are about, it is clearly not working out a way to communicate with each
other. But this is the whole benefit that common usage provides in the first
place.
Why would you argue about the meaning of a word, two sides trying to
wrest it back and forth? If it’s just a namespace conflict that has gotten blown
out of proportion, and nothing more is at stake, then the two sides need merely
generate two new words and use them consistently.
Yet often categorizations function as hidden inferences and disguised
queries. Is atheism a “religion”? If someone is arguing that the reasoning
methods used in atheism are on a par with the reasoning methods used in Ju-
daism, or that atheism is on a par with Islam in terms of causally engendering
violence, then they have a clear argumentative stake in lumping it all together
into an indistinct gray blur of “faith.”
Or consider the fight to blend together blacks and whites as “people.” This
would not be a time to generate two words—what’s at stake is exactly the idea
that you shouldn’t draw a moral distinction.
But once any empirical proposition is at stake, or any moral proposition,
you can no longer appeal to common usage.
If the question is how to cluster together similar things for purposes of
inference, empirical predictions will depend on the answer; which means that
definitions can be wrong. A conflict of predictions cannot be settled by an
opinion poll.
If you want to know whether atheism should be clustered with supernatural-
ist religions for purposes of some particular empirical inference, the dictionary
can’t answer you.
If you want to know whether blacks are people, the dictionary can’t answer
you.
If everyone believes that the red light in the sky is Mars the God of War, the
dictionary will define “Mars” as the God of War. If everyone believes that fire
is the release of phlogiston, the dictionary will define “fire” as the release of
phlogiston.
There is an art to using words; even when definitions are not literally true
or false, they are often wiser or more foolish. Dictionaries are mere histories
of past usage; if you treat them as supreme arbiters of meaning, it binds you to
the wisdom of the past, forbidding you to do better.
Though do take care to ensure (if you must depart from the wisdom of the
past) that people can figure out what you’re trying to swim.

*
166
Empty Labels

Consider (yet again) the Aristotelian idea of categories. Let’s say that there’s
some object with properties A, B, C, D, and E, or at least it looks E-ish.

Fred: “You mean that thing over there is blue, round, fuzzy,
and—”
Me: “In Aristotelian logic, it’s not supposed to make a differ-
ence what the properties are, or what I call them. That’s why I’m
just using the letters.”

Next, I invent the Aristotelian category “zawa,” which describes those objects,
all those objects, and only those objects, that have properties A, C, and D.

Me: "Object 1 is zawa, B, and E."
Fred: "And it's blue—I mean, A—too, right?"
Me: “That’s implied when I say it’s zawa.”
Fred: “Still, I’d like you to say it explicitly.”
Me: “Okay. Object 1 is A, B, zawa, and E.”

Then I add another word, “yokie,” which describes all and only objects that are
B and E; and the word “xippo,” which describes all and only objects which
are E but not D.
Me: “Object 1 is zawa and yokie, but not xippo.”
Fred: “Wait, is it luminescent? I mean, is it E?”
Me: “Yes. That is the only possibility on the information
given.”
Fred: “I’d rather you spelled it out.”
Me: “Fine: Object 1 is A, zawa, B, yokie, C, D, E, and not
xippo.”
Fred: “Amazing! You can tell all that just by looking?”

Impressive, isn’t it? Let’s invent even more new words: “Bolo” is A, C, and
yokie; “mun” is A, C, and xippo; and “merlacdonian” is bolo and mun.
Pointlessly confusing? I think so too. Let's replace the labels with the definitions:

"Zawa, B, and E" becomes [A, C, D], B, E
"Bolo and A" becomes [A, C, [B, E]], A
"Merlacdonian" becomes [A, C, [B, E]], [A, C, [E, ¬D]].

And the thing to remember about the Aristotelian idea of categories is that
[A, C, D] is the entire information of “zawa.” It’s not just that I can vary the
label, but that I can get along just fine without any label at all—the rules for
Aristotelian classes work purely on structures like [A, C, D]. To call one of
these structures “zawa,” or attach any other label to it, is a human convenience
(or inconvenience) which makes not the slightest difference to the Aristotelian
rules.
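
To render that point concretely, here is a small Python sketch in which each Aristotelian class is nothing but a structure of required (and, for "xippo," forbidden) properties, and membership is a set-containment check. The labels never do any work; the code itself is only an illustration.

ZAWA = frozenset({"A", "C", "D"})
YOKIE = frozenset({"B", "E"})
XIPPO_REQUIRES, XIPPO_FORBIDS = frozenset({"E"}), frozenset({"D"})

def is_member(observed, required, forbidden=frozenset()):
    # A class applies only if every required property has actually been observed,
    # and no forbidden property has. No label appears anywhere in the check.
    return required <= observed and not (forbidden & observed)

object_1 = frozenset({"A", "B", "C", "D", "E"})
print(is_member(object_1, ZAWA))                           # True
print(is_member(object_1, YOKIE))                          # True
print(is_member(object_1, XIPPO_REQUIRES, XIPPO_FORBIDS))  # False: Object 1 has D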
Let’s say that “human” is to be defined as a mortal featherless biped. Then
the classic syllogism would have the form:

All [mortal, ¬feathers, bipedal] are mortal.
Socrates is a [mortal, ¬feathers, bipedal].
Therefore, Socrates is mortal.

The feat of reasoning looks a lot less impressive now, doesn’t it?
Here the illusion of inference comes from the labels, which conceal the
premises, and pretend to novelty in the conclusion. Replacing labels with
definitions reveals the illusion, making visible the tautology’s empirical un-
helpfulness. You can never say that Socrates is a [mortal, ¬feathers, biped]
until you have observed him to be mortal.
There’s an idea, which you may have noticed I hate, that “you can de-
fine a word any way you like.” This idea came from the Aristotelian notion
of categories; since, if you follow the Aristotelian rules exactly and without
flaw—which humans never do; Aristotle knew perfectly well that Socrates was
human, even though that wasn’t justified under his rules—but, if some imagi-
nary nonhuman entity were to follow the rules exactly, they would never arrive
at a contradiction. They wouldn’t arrive at much of anything: they couldn’t
say that Socrates is a [mortal, ¬feathers, biped] until they observed him to be
mortal.
But it’s not so much that labels are arbitrary in the Aristotelian system, as
that the Aristotelian system works fine without any labels at all—it cranks out
exactly the same stream of tautologies, they just look a lot less impressive. The
labels are only there to create the illusion of inference.
So if you’re going to have an Aristotelian proverb at all, the proverb should
be, not “I can define a word any way I like,” nor even, “Defining a word never
has any consequences,” but rather, “Definitions don’t need words.”

*
167
Taboo Your Words

In the game Taboo (by Hasbro), the objective is for a player to have their partner
guess a word written on a card, without using that word or five additional words
listed on the card. For example, you might have to get your partner to say
“baseball” without using the words “sport,” “bat,” “hit,” “pitch,” “base” or of
course “baseball.”
As soon as I see a problem like that, I at once think, “An artificial group
conflict in which you use a long wooden cylinder to whack a thrown spheroid,
and then run between four safe positions.” It might not be the most efficient
strategy to convey the word “baseball” under the stated rules—that might be,
“It’s what the Yankees play”—but the general skill of blanking a word out of my
mind was one I’d practiced for years, albeit with a different purpose.
In the previous essay we saw how replacing terms with definitions could
reveal the empirical unproductivity of the classical Aristotelian syllogism. All
humans are mortal (and also, apparently, featherless bipeds); Socrates is hu-
man; therefore Socrates is mortal. When we replace the word “human” by its
apparent definition, the following underlying reasoning is revealed:

All [mortal, ¬feathers, biped] are mortal;
Socrates is a [mortal, ¬feathers, biped];
Therefore Socrates is mortal.

But the principle of replacing words by definitions applies much more broadly:

Albert: "A tree falling in a deserted forest makes a sound."
Barry: "A tree falling in a deserted forest does not make a
sound.”

Clearly, since one says “sound” and one says “not sound,” we must have a
contradiction, right? But suppose that they both dereference their pointers
before speaking:

Albert: "A tree falling in a deserted forest matches [membership test: this event generates acoustic vibrations]."
Barry: “A tree falling in a deserted forest does not match
[membership test: this event generates auditory experiences].”

Now there is no longer an apparent collision—all they had to do was prohibit themselves from using the word sound. If "acoustic vibrations" came into
dispute, we would just play Taboo again and say “pressure waves in a material
medium”; if necessary we would play Taboo again on the word “wave” and
replace it with the wave equation. (Play Taboo on “auditory experience” and
you get “That form of sensory processing, within the human brain, that takes
as input a linear time series of frequency mixes . . .”)
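
In code form, the move looks something like the following Python sketch; the event fields and function names are illustrative assumptions.

deserted_forest_treefall = {
    "generates_acoustic_vibrations": True,
    "generates_auditory_experience": False,  # no listener present
}

def alberts_test(event):
    return event["generates_acoustic_vibrations"]

def barrys_test(event):
    return event["generates_auditory_experience"]

print(alberts_test(deserted_forest_treefall))  # True:  "sound" in Albert's sense
print(barrys_test(deserted_forest_treefall))   # False: "sound" in Barry's sense
# Both answers are true of the same event; they only looked contradictory while
# the single label "sound" stood in for two different membership tests.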
But suppose, on the other hand, that Albert and Barry were to have the
argument:

Albert: "Socrates matches the concept [membership test: this person will die after drinking hemlock]."
Barry: “Socrates matches the concept [membership test: this
person will not die after drinking hemlock].”

Now Albert and Barry have a substantive clash of expectations; a difference in what they anticipate seeing after Socrates drinks hemlock. But they might not
notice this, if they happened to use the same word “human” for their different
concepts.
You get a very different picture of what people agree or disagree about, de-
pending on whether you take a label’s-eye-view (Albert says “sound” and Barry
says “not sound,” so they must disagree) or taking the test’s-eye-view (Albert’s
membership test is acoustic vibrations, Barry’s is auditory experience).
Get together a pack of soi-disant futurists and ask them if they believe we’ll
have Artificial Intelligence in thirty years, and I would guess that at least half
of them will say yes. If you leave it at that, they’ll shake hands and congratulate
themselves on their consensus. But make the term “Artificial Intelligence”
taboo, and ask them to describe what they expect to see, without ever using
words like “computers” or “think,” and you might find quite a conflict of
expectations hiding under that featureless standard word. See also Shane
Legg’s compilation of 71 definitions of “intelligence.”
The illusion of unity across religions can be dispelled by making the term
“God” taboo, and asking them to say what it is they believe in; or making the
word “faith” taboo, and asking them why they believe it. Though mostly they
won’t be able to answer at all, because it is mostly profession in the first place,
and you cannot cognitively zoom in on an audio recording.
When you find yourself in philosophical difficulties, the first line of defense
is not to define your problematic terms, but to see whether you can think without
using those terms at all. Or any of their short synonyms. And be careful not to
let yourself invent a new word to use instead. Describe outward observables
and interior mechanisms; don’t use a single handle, whatever that handle may
be.
Albert says that people have “free will.” Barry says that people don’t have
“free will.” Well, that will certainly generate an apparent conflict. Most philoso-
phers would advise Albert and Barry to try to define exactly what they mean by
“free will,” on which topic they will certainly be able to discourse at great length.
I would advise Albert and Barry to describe what it is that they think people
do, or do not have, without using the phrase “free will” at all. (If you want to
try this at home, you should also avoid the words “choose,” “act,” “decide,”
“determined,” “responsible,” or any of their synonyms.)
This is one of the nonstandard tools in my toolbox, and in my humble
opinion, it works way way better than the standard one. It also requires more
effort to use; you get what you pay for.

*
168
Replace the Symbol with the
Substance

What does it take to—as in the previous essay’s example—see a “baseball game”
as “An artificial group conflict in which you use a long wooden cylinder to
whack a thrown spheroid, and then run between four safe positions”? What
does it take to play the rationalist version of Taboo, in which the goal is not to
find a synonym that isn’t on the card, but to find a way of describing without
the standard concept-handle?
You have to visualize. You have to make your mind’s eye see the details, as
though looking for the first time. You have to perform an Original Seeing.
Is that a “bat”? No, it’s a long, round, tapering, wooden rod, narrowing at
one end so that a human can grasp and swing it.
Is that a “ball”? No, it’s a leather-covered spheroid with a symmetrical
stitching pattern, hard but not metal-hard, which someone can grasp and
throw, or strike with the wooden rod, or catch.
Are those “bases”? No, they’re fixed positions on a game field, that players
try to run to as quickly as possible because of their safety within the game’s
artificial rules.
The chief obstacle to performing an original seeing is that your mind already
has a nice neat summary, a nice little easy-to-use concept handle. Like the
word “baseball,” or “bat,” or “base.” It takes an effort to stop your mind from
sliding down the familiar path, the easy path, the path of least resistance, where
the small featureless word rushes in and obliterates the details you’re trying
to see. A word itself can have the destructive force of cliché; a word itself can
carry the poison of a cached thought.
Playing the game of Taboo—being able to describe without using the stan-
dard pointer/label/handle—is one of the fundamental rationalist capacities. It
occupies the same primordial level as the habit of constantly asking “Why?” or
“What does this belief make me anticipate?”
The art is closely related to:

• Pragmatism, because seeing in this way often gives you a much closer
connection to anticipated experience, rather than propositional belief;

• Reductionism, because seeing in this way often forces you to drop down
to a lower level of organization, look at the parts instead of your eye
skipping over the whole;

• Hugging the query, because words often distract you from the question
you really want to ask;

• Avoiding cached thoughts, which will rush in using standard words, so you can block them by tabooing standard words;

• The writer’s rule of “Show, don’t tell!,” which has power among ratio-
nalists;

• And not losing sight of your original purpose.

How could tabooing a word help you keep your purpose? From Lost Purposes:

As you read this, some young man or woman is sitting at a desk in a university, earnestly studying material they have no intention of
ever using, and no interest in knowing for its own sake. They want
a high-paying job, and the high-paying job requires a piece of
paper, and the piece of paper requires a previous master’s degree,
and the master’s degree requires a bachelor’s degree, and the
university that grants the bachelor’s degree requires you to take
a class in twelfth-century knitting patterns to graduate. So they
diligently study, intending to forget it all the moment the final
exam is administered, but still seriously working away, because
they want that piece of paper.

Why are you going to "school"? To get an "education" ending in a "degree." Blank out the forbidden words and all their obvious synonyms, visualize the
actual details, and you’re much more likely to notice that “school” currently
seems to consist of sitting next to bored teenagers listening to material you
already know, that a “degree” is a piece of paper with some writing on it, and
that “education” is forgetting the material as soon as you’re tested on it.
Leaky generalizations often manifest through categorizations: People who
actually learn in classrooms are categorized as “getting an education,” so “get-
ting an education” must be good; but then anyone who actually shows up at a
college will also match against the concept “getting an education,” whether or
not they learn.
Students who understand math will do well on tests, but if you require
schools to produce good test scores, they’ll spend all their time teaching to the
test. A mental category, that imperfectly matches your goal, can produce the
same kind of incentive failure internally. You want to learn, so you need an
“education”; and then as long as you’re getting anything that matches against
the category “education,” you may not notice whether you’re learning or not.
Or you’ll notice, but you won’t realize you’ve lost sight of your original purpose,
because you’re “getting an education” and that’s how you mentally described
your goal.
To categorize is to throw away information. If you’re told that a falling
tree makes a “sound,” you don’t know what the actual sound is; you haven’t
actually heard the tree falling. If a coin lands “heads,” you don’t know its
radial orientation. A blue egg-shaped thing may be a “blegg,” but what if the
exact egg shape varies, or the exact shade of blue? You want to use categories
to throw away irrelevant information, to sift gold from dust, but often the
standard categorization ends up throwing out relevant information too. And
when you end up in that sort of mental trouble, the first and most obvious
solution is to play Taboo.
For example: “Play Taboo” is itself a leaky generalization. Hasbro’s version
is not the rationalist version; they only list five additional banned words on
the card, and that’s not nearly enough coverage to exclude thinking in familiar
old words. What rationalists do would count as playing Taboo—it would
match against the “play Taboo” concept—but not everything that counts as
playing Taboo works to force original seeing. If you just think “play Taboo to
force original seeing,” you’ll start thinking that anything that counts as playing
Taboo must count as original seeing.
The rationalist version isn’t a game, which means that you can’t win by
trying to be clever and stretching the rules. You have to play Taboo with a
voluntary handicap: Stop yourself from using synonyms that aren’t on the card.
You also have to stop yourself from inventing a new simple word or phrase
that functions as an equivalent mental handle to the old one. You are trying
to zoom in on your map, not rename the cities; dereference the pointer, not
allocate a new pointer; see the events as they happen, not rewrite the cliché in
a different wording.
By visualizing the problem in more detail, you can see the lost purpose:
Exactly what do you do when you “play Taboo”? What purpose does each and
every part serve?
If you see your activities and situation originally, you will be able to origi-
nally see your goals as well. If you can look with fresh eyes, as though for the
first time, you will see yourself doing things that you would never dream of
doing if they were not habits.
Purpose is lost whenever the substance (learning, knowledge, health) is
displaced by the symbol (a degree, a test score, medical care). To heal a lost
purpose, or a lossy categorization, you must do the reverse:
Replace the symbol with the substance; replace the signifier with the sig-
nified; replace the property with the membership test; replace the word with
the meaning; replace the label with the concept; replace the summary with
the details; replace the proxy question with the real question; dereference the
pointer; drop into a lower level of organization; mentally simulate the process
instead of naming it; zoom in on your map.
The Simple Truth was generated by an exercise of this discipline to describe
“truth” on a lower level of organization, without invoking terms like “accurate,”
“correct,” “represent,” “reflect,” “semantic,” “believe,” “knowledge,” “map,”
or “real.” (And remember that the goal is not really to play Taboo—the word
“true” appears in the text, but not to define truth. It would get a buzzer in
Hasbro’s game, but we’re not actually playing that game. Ask yourself whether
the document fulfilled its purpose, not whether it followed the rules.)
Bayes’s Rule itself describes “evidence” in pure math, without using words
like “implies,” “means,” “supports,” “proves,” or “justifies.” Set out to define
such philosophical terms, and you’ll just go in circles.
And then there’s the most important word of all to Taboo. I’ve often warned
that you should be careful not to overuse it, or even avoid the concept in certain
cases. Now you know the real reason why. It’s not a bad subject to think about.
But your true understanding is measured by your ability to describe what
you’re doing and why, without using that word or any of its synonyms.

*
169
Fallacies of Compression

“The map is not the territory,” as the saying goes. The only life-size, atomically
detailed, 100% accurate map of California is California. But California has im-
portant regularities, such as the shape of its highways, that can be described us-
ing vastly less information—not to mention vastly less physical material—than
it would take to describe every atom within the state borders. Hence the other
saying: “The map is not the territory, but you can’t fold up the territory and
put it in your glove compartment.”
A paper map of California, at a scale of 10 kilometers to 1 centimeter (a
million to one), doesn’t have room to show the distinct position of two fallen
leaves lying a centimeter apart on the sidewalk. Even if the map tried to show
the leaves, the leaves would appear as the same point on the map; or rather the
map would need a feature size of 10 nanometers, which is a finer resolution
than most book printers handle, not to mention human eyes.
Reality is very large—just the part we can see is billions of lightyears across.
But your map of reality is written on a few pounds of neurons, folded up
to fit inside your skull. I don’t mean to be insulting, but your skull is tiny.
Comparatively speaking.
Inevitably, then, certain things that are distinct in reality, will be compressed
into the same point on your map.
But what this feels like from inside is not that you say, “Oh, look, I’m
compressing two things into one point on my map.” What it feels like from
inside is that there is just one thing, and you are seeing it.
A sufficiently young child, or a sufficiently ancient Greek philosopher,
would not know that there were such things as “acoustic vibrations” or “audi-
tory experiences.” There would just be a single thing that happened when a
tree fell; a single event called “sound.”
To realize that there are two distinct events, underlying one point on your
map, is an essentially scientific challenge—a big, difficult scientific challenge.
Sometimes fallacies of compression result from confusing two known things
under the same label—you know about acoustic vibrations, and you know
about auditory processing in brains, but you call them both “sound” and so
confuse yourself. But the more dangerous fallacy of compression arises from
having no idea whatsoever that two distinct entities even exist. There is just one
mental folder in the filing system, labeled “sound,” and everything thought
about “sound” drops into that one folder. It’s not that there are two folders with
the same label; there’s just a single folder. By default, the map is compressed;
why would the brain create two mental buckets where one would serve?
Or think of a mystery novel in which the detective’s critical insight is that
one of the suspects has an identical twin. In the course of the detective’s
ordinary work, their job is just to observe that Carol is wearing red, that
she has black hair, that her sandals are leather—but all these are facts about
Carol. It’s easy enough to question an individual fact, like WearsRed(Carol)
or BlackHair(Carol). Maybe BlackHair(Carol) is false. Maybe Carol dyes her
hair. Maybe BrownHair(Carol). But it takes a subtler detective to wonder if the
Carol in WearsRed(Carol) and BlackHair(Carol)—the Carol file into which
their observations drop—should be split into two files. Maybe there are two
Carols, so that the Carol who wore red is not the same woman as the Carol
who had black hair.
Here it is the very act of creating two different buckets that is the stroke of
genius insight. ’Tis easier to question one’s facts than one’s ontology.
The map of reality contained in a human brain, unlike a paper map of
California, can expand dynamically when we write down more detailed de-
scriptions. But what this feels like from inside is not so much zooming in on a
map, as fissioning an indivisible atom—taking one thing (it felt like one thing)
and splitting it into two or more things.
Often this manifests in the creation of new words, like “acoustic vibrations”
and “auditory experiences” instead of just “sound.” Something about creating
the new name seems to allocate the new bucket. The detective is liable to start
calling one of their suspects “Carol-2” or “the Other Carol” almost as soon as
they realize that there are two Carols.
But expanding the map isn’t always as simple as generating new city names.
It is a stroke of scientific insight to realize that such things as acoustic vibrations,
or auditory experiences, even exist.
The obvious modern-day illustration would be words like “intelligence” or
“consciousness.” Every now and then one sees a press release claiming that a
research study has “explained consciousness” because a team of neurologists
investigated a 40Hz electrical rhythm that might have something to do with
cross-modality binding of sensory information, or because they investigated
the reticular activating system that keeps humans awake. That’s an extreme
example, and the usual failures are more subtle, but they are of the same kind.
The part of “consciousness” that people find most interesting is reflectivity,
self-awareness, realizing that the person I see in the mirror is “me”; that and the
hard problem of subjective experience as distinguished by David Chalmers.
We also label “conscious” the state of being awake, rather than asleep, in our
daily cycle. But they are all different concepts going under the same name, and
the underlying phenomena are different scientific puzzles. You can explain
being awake without explaining reflectivity or subjectivity.
Fallacies of compression also underlie the bait-and-switch technique in
philosophy—you argue about “consciousness” under one definition (like the
ability to think about thinking) and then apply the conclusions to “conscious-
ness” under a different definition (like subjectivity). Of course it may be that
the two are the same thing, but if so, genuinely understanding this fact would
require first a conceptual split and then a genius stroke of reunification.
Expanding your map is (I say again) a scientific challenge: part of the art of
science, the skill of inquiring into the world. (And of course you cannot solve
a scientific challenge by appealing to dictionaries, nor master a complex skill
of inquiry by saying “I can define a word any way I like.”) Where you see a
single confusing thing, with protean and self-contradictory attributes, it is a
good guess that your map is cramming too much into one point—you need to
pry it apart and allocate some new buckets. This is not like defining the single
thing you see, but it does often follow from figuring out how to talk about the
thing without using a single mental handle.
So the skill of prying apart the map is linked to the rationalist version of
Taboo, and to the wise use of words; because words often represent the points
on our map, the labels under which we file our propositions and the buckets
into which we drop our information. Avoiding a single word, or allocating
new ones, is often part of the skill of expanding the map.

*
170
Categorizing Has Consequences

Among the many genetic variations and mutations you carry in your genome,
there are a very few alleles you probably know—including those determining
your blood type: the presence or absence of the A, B, and + antigens. If you
receive a blood transfusion containing an antigen you don’t have, it will trigger
an allergic reaction. It was Karl Landsteiner’s discovery of this fact, and how
to test for compatible blood types, that made it possible to transfuse blood
without killing the patient. (1930 Nobel Prize in Medicine.) Also, if a mother
with blood type A (for example) bears a child with blood type A+, the mother
may acquire an allergic reaction to the + antigen; if she has another child with
blood type A+, the child will be in danger, unless the mother takes an allergic
suppressant during pregnancy. Thus people learn their blood types before they
marry.
Oh, and also: people with blood type A are earnest and creative, while peo-
ple with blood type B are wild and cheerful. People with type O are agreeable
and sociable, while people with type AB are cool and controlled. (You would
think that O would be the absence of A and B, while AB would just be A plus B,
but no . . .) All this, according to the Japanese blood type theory of personality.
It would seem that blood type plays the role in Japan that astrological signs
play in the West, right down to blood type horoscopes in the daily newspaper.
This fad is especially odd because blood types have never been mysterious,
not in Japan and not anywhere. We only know blood types even exist thanks
to Karl Landsteiner. No mystic witch doctor, no venerable sorcerer, ever said a
word about blood types; there are no ancient, dusty scrolls to shroud the error
in the aura of antiquity. If the medical profession claimed tomorrow that it
had all been a colossal hoax, we layfolk would not have one scrap of evidence
from our unaided senses to contradict them.
There’s never been a war between blood types. There’s never even been a
political conflict between blood types. The stereotypes must have arisen strictly
from the mere existence of the labels.
Now, someone is bound to point out that this is a story of categorizing
humans. Does the same thing happen if you categorize plants, or rocks, or
office furniture? I can’t recall reading about such an experiment, but of course,
that doesn’t mean one hasn’t been done. (I’d expect the chief difficulty of
doing such an experiment would be finding a protocol that didn’t mislead the
subjects into thinking that, since the label was given you, it must be significant
somehow.) So while I don’t mean to update on imaginary evidence, I would
predict a positive result for the experiment: I would expect them to find that
mere labeling had power over all things, at least in the human imagination.
You can see this in terms of similarity clusters: once you draw a bound-
ary around a group, the mind starts trying to harvest similarities from the
group. And unfortunately the human pattern-detectors seem to operate in
such overdrive that we see patterns whether they’re there or not; a weakly nega-
tive correlation can be mistaken for a strong positive one with a bit of selective
memory.
You can see this in terms of neural algorithms: creating a name for a set of
things is like allocating a subnetwork to find patterns in them.
You can see this in terms of a compression fallacy: things given the same
name end up dumped into the same mental bucket, blurring them together
into the same point on the map.
Or you can see this in terms of the boundless human ability to make stuff
up out of thin air and believe it because no one can prove it’s wrong. As soon
as you name the category, you can start making up stuff about it. The named
thing doesn’t have to be perceptible; it doesn’t have to exist; it doesn’t even
have to be coherent.
And no, it’s not just Japan: Here in the West, a blood-type-based diet book
called Eat Right 4 Your Type was a bestseller.
Any way you look at it, drawing a boundary in thingspace is not a neutral
act. Maybe a more cleanly designed, more purely Bayesian AI could ponder
an arbitrary class and not be influenced by it. But you, a human, do not have
that option. Categories are not static things in the context of a human brain; as
soon as you actually think of them, they exert force on your mind. One more
reason not to believe you can define a word any way you like.

*
171
Sneaking in Connotations

In the previous essay, we saw that in Japan, blood types have taken the place of
astrology—if your blood type is AB, for example, you’re supposed to be “cool
and controlled.”
So suppose we decided to invent a new word, “wiggin,” and defined this
word to mean people with green eyes and black hair—

A green-eyed man with black hair walked into a restaurant.


“Ha,” said Danny, watching from a nearby table, “did you
see that? A wiggin just walked into the room. Bloody wiggins.
Commit all sorts of crimes, they do.”
His sister Erda sighed. “You haven’t seen him commit any
crimes, have you, Danny?”
“Don’t need to,” Danny said, producing a dictionary. “See,
it says right here in the Oxford English Dictionary. ‘Wiggin. (1)
A person with green eyes and black hair.’ He’s got green eyes
and black hair, he’s a wiggin. You’re not going to argue with the
Oxford English Dictionary, are you? By definition, a green-eyed
black-haired person is a wiggin.”
“But you called him a wiggin,” said Erda. “That’s a nasty
thing to say about someone you don’t even know. You’ve got no
evidence that he puts too much ketchup on his burgers, or that as
a kid he used his slingshot to launch baby squirrels.”
“But he is a wiggin,” Danny said patiently. “He’s got green
eyes and black hair, right? Just you watch, as soon as his burger
arrives, he’s reaching for the ketchup.”

The human mind passes from observed characteristics to inferred characteristics via the medium of words. In “All humans are mortal, Socrates is a human,
therefore Socrates is mortal,” the observed characteristics are Socrates’s clothes,
speech, tool use, and generally human shape; the categorization is “human”;
the inferred characteristic is poisonability by hemlock.
Of course there’s no hard distinction between “observed characteristics”
and “inferred characteristics.” If you hear someone speak, they’re probably
shaped like a human, all else being equal. If you see a human figure in the
shadows, then ceteris paribus it can probably speak.
And yet some properties do tend to be more inferred than observed. You’re
more likely to decide that someone is human, and will therefore burn if exposed
to open flame, than carry through the inference the other way around.
If you look in a dictionary for the definition of “human,” you’re more
likely to find characteristics like “intelligence” and “featherless biped”—
characteristics that are useful for quickly eyeballing what is and isn’t a human—
rather than the ten thousand connotations, from vulnerability to hemlock, to
overconfidence, that we can infer from someone’s being human. Why? Per-
haps dictionaries are intended to let you match up labels to similarity groups,
and so are designed to quickly isolate clusters in thingspace. Or perhaps the
big, distinguishing characteristics are the most salient, and therefore first to
pop into a dictionary editor’s mind. (I’m not sure how aware dictionary editors
are of what they really do.)
But the upshot is that when Danny pulls out his OED to look up “wiggin,”
he sees listed only the first-glance characteristics that distinguish a wiggin:
Green eyes and black hair. The OED doesn’t list the many minor connotations
that have come to attach to this term, such as criminal proclivities, culinary
peculiarities, and some unfortunate childhood activities.
How did those connotations get there in the first place? Maybe there was
once a famous wiggin with those properties. Or maybe someone made stuff
up at random, and wrote a series of bestselling books about it (The Wiggin,
Talking to Wiggins, Raising Your Little Wiggin, Wiggins in the Bedroom). Maybe
even the wiggins believe it now, and act accordingly. As soon as you call some
people “wiggins,” the word will begin acquiring connotations.
But remember the Parable of Hemlock: If we go by the logical class defini-
tions, we can never class Socrates as a “human” until after we observe him to
be mortal. Whenever someone pulls out a dictionary, they’re generally trying to
sneak in a connotation, not the actual definition written down in the dictionary.
After all, if the only meaning of the word “wiggin” is “green-eyed black-
haired person,” then why not just call those people “green-eyed black-haired
people”? And if you’re wondering whether someone is a ketchup-reacher, why
not ask directly, “Is he a ketchup-reacher?” rather than “Is he a wiggin?” (Note
substitution of substance for symbol.)
Oh, but arguing the real question would require work. You’d have to actually
watch the wiggin to see if he reached for the ketchup. Or maybe see if you
can find statistics on how many green-eyed black-haired people actually like
ketchup. At any rate, you wouldn’t be able to do it sitting in your living room
with your eyes closed. And people are lazy. They’d rather argue “by definition,”
especially since they think “you can define a word any way you like.”
But of course the real reason they care whether someone is a “wiggin” is
a connotation—a feeling that comes along with the word—that isn’t in the
definition they claim to use.
Imagine Danny saying, “Look, he’s got green eyes and black hair. He’s a
wiggin! It says so right there in the dictionary!—therefore, he’s got black hair.
Argue with that, if you can!”
Doesn’t have much of a triumphant ring to it, does it? If the real point
of the argument actually was contained in the dictionary definition—if the
argument genuinely was logically valid—then the argument would feel empty;
it would either say nothing new, or beg the question.
It’s only the attempt to smuggle in connotations not explicitly listed in the
definition, that makes anyone feel they can score a point that way.

*
172
Arguing “By Definition”

“This plucked chicken has two legs and no feathers—therefore, by definition, it is a human!”
When people argue definitions, they usually start with some visible, known,
or at least widely believed set of characteristics; then pull out a dictionary,
and point out that these characteristics fit the dictionary definition; and so
conclude, “Therefore, by definition, atheism is a religion!”
But visible, known, widely believed characteristics are rarely the real point
of a dispute. Just the fact that someone thinks Socrates’s two legs are evident
enough to make a good premise for the argument, “Therefore, by definition,
Socrates is human!” indicates that bipedalism probably isn’t really what’s at
stake—or the listener would reply, “Whaddaya mean Socrates is bipedal? That’s
what we’re arguing about in the first place!”
Now there is an important sense in which we can legitimately move from
evident characteristics to not-so-evident ones. You can, legitimately, see that
Socrates is human-shaped, and predict his vulnerability to hemlock. But this
probabilistic inference does not rely on dictionary definitions or common usage;
it relies on the universe containing empirical clusters of similar things.
This cluster structure is not going to change depending on how you define
your words. Even if you look up the dictionary definition of “human” and
it says “all featherless bipeds except Socrates,” that isn’t going to change the
actual degree to which Socrates is similar to the rest of us featherless bipeds.
When you are arguing correctly from cluster structure, you’ll say something
like, “Socrates has two arms, two feet, a nose and tongue, speaks fluent Greek,
uses tools, and in every aspect I’ve been able to observe him, seems to have
every major and minor property that characterizes Homo sapiens; so I’m going
to guess that he has human DNA, human biochemistry, and is vulnerable to
hemlock just like all other Homo sapiens in whom hemlock has been clinically
tested for lethality.”
And suppose I reply, “But I saw Socrates out in the fields with some herbol-
ogists; I think they were trying to prepare an antidote. Therefore I don’t expect
Socrates to keel over after he drinks the hemlock—he will be an exception to
the general behavior of objects in his cluster: they did not take an antidote,
and he did.”
Now there’s not much point in arguing over whether Socrates is “human”
or not. The conversation has to move to a more detailed level, poke around
inside the details that make up the “human” category—talk about human
biochemistry, and specifically, the neurotoxic effects of coniine.
If you go on insisting, “But Socrates is a human and humans, by definition,
are mortal!” then what you’re really trying to do is blur out everything you know
about Socrates except the fact of his humanity—insist that the only correct
prediction is the one you would make if you knew nothing about Socrates
except that he was human.
Which is like insisting that a coin is 50% likely to be showing heads or
tails, because it is a “fair coin,” after you’ve actually looked at the coin and
it’s showing heads. It’s like insisting that Frodo has ten fingers, because most
hobbits have ten fingers, after you’ve already looked at his hands and seen nine
fingers. Naturally this is illegal under Bayesian probability theory: You can’t
just refuse to condition on new evidence.
And you can’t just keep one categorization and make estimates based on
that, while deliberately throwing out everything else you know.
Not every piece of new evidence makes a significant difference, of course.
If I see that Socrates has nine fingers, this isn’t going to noticeably change
my estimate of his vulnerability to hemlock, because I’ll expect that the way
Socrates lost his finger didn’t change the rest of his biochemistry. And this is
true, whether or not the dictionary’s definition says that human beings have ten
fingers. The legal inference is based on the cluster structure of the environment,
and the causal structure of biology; not what the dictionary editor writes down,
nor even “common usage.”
Now ordinarily, when you’re doing this right—in a legitimate way—you
just say, “The coniine alkaloid found in hemlock produces muscular paralysis
in humans, resulting in death by asphyxiation.” Or more simply, “Humans are
vulnerable to hemlock.” That’s how it’s usually said in a legitimate argument.
When would someone feel the need to strengthen the argument with the
emphatic phrase “by definition”? (E.g. “Humans are vulnerable to hemlock
by definition!”) Why, when the inferred characteristic has been called into
doubt—Socrates has been seen consulting herbologists—and so the speaker
feels the need to tighten the vise of logic.
So when you see “by definition” used like this, it usually means: “For-
get what you’ve heard about Socrates consulting herbologists—humans, by
definition, are mortal!”
People feel the need to squeeze the argument onto a single course by saying
“Any P, by definition, has property Q!,” on exactly those occasions when they
see, and prefer to dismiss out of hand, additional arguments that call into doubt
the default inference based on clustering.
So too with the argument “X, by definition, is a Y !” E.g., “Atheists believe
that God doesn’t exist; therefore atheists have beliefs about God, because a
negative belief is still a belief; therefore atheism asserts answers to theological
questions; therefore atheism is, by definition, a religion.”
You wouldn’t feel the need to say, “Hinduism, by definition, is a religion!”
because, well, of course Hinduism is a religion. It’s not just a religion “by
definition,” it’s, like, an actual religion.
Atheism does not resemble the central members of the “religion” cluster,
so if it wasn’t for the fact that atheism is a religion by definition, you might go
around thinking that atheism wasn’t a religion. That’s why you’ve got to crush
all opposition by pointing out that “Atheism is a religion” is true by definition,
because it isn’t true any other way.
Which is to say: People insist that “X, by definition, is a Y !” on those
occasions when they’re trying to sneak in a connotation of Y that isn’t directly
in the definition, and X doesn’t look all that much like other members of the
Y cluster.
Over the last thirteen years I’ve been keeping track of how often this phrase
is used correctly versus incorrectly—though not with literal statistics, I fear.
But eyeballing suggests that using the phrase by definition, anywhere outside of
math, is among the most alarming signals of flawed argument I’ve ever found.
It’s right up there with “Hitler,” “God,” “absolutely certain,” and “can’t prove
that.”
This heuristic of failure is not perfect—the first time I ever spotted a cor-
rect usage outside of math, it was by Richard Feynman; and since then I’ve
spotted more. But you’re probably better off just deleting the phrase “by defi-
nition” from your vocabulary—and always on any occasion where you might
be tempted to say it in italics or followed with an exclamation mark. That’s a
bad idea by definition!

*
173
Where to Draw the Boundary?

The one comes to you and says:

Long have I pondered the meaning of the word “Art,” and at last
I’ve found what seems to me a satisfactory definition: “Art is that
which is designed for the purpose of creating a reaction in an
audience.”

Just because there’s a word “art” doesn’t mean that it has a meaning, floating
out there in the void, which you can discover by finding the right definition.
It feels that way, but it is not so.
Wondering how to define a word means you’re looking at the problem
the wrong way—searching for the mysterious essence of what is, in fact, a
communication signal.
Now, there is a real challenge which a rationalist may legitimately attack,
but the challenge is not to find a satisfactory definition of a word. The real
challenge can be played as a single-player game, without speaking aloud. The
challenge is figuring out which things are similar to each other—which things
are clustered together—and sometimes, which things have a common cause.
If you define “eluctromugnetism” to include lightning, include compasses,
exclude light, and include Mesmer’s “animal magnetism” (what we now
call hypnosis), then you will have some trouble asking “How does eluctro-
mugnetism work?” You have lumped together things which do not belong
together, and excluded others that would be needed to complete a set. (This
example is historically plausible; Mesmer came before Faraday.)
We could say that eluctromugnetism is a wrong word, a boundary in
thingspace that loops around and swerves through the clusters, a cut that
fails to carve reality along its natural joints.
Figuring where to cut reality in order to carve along the joints—this is the
problem worthy of a rationalist. It is what people should be trying to do, when
they set out in search of the floating essence of a word.
And make no mistake: it is a scientific challenge to realize that you need
a single word to describe breathing and fire. So do not think to consult the
dictionary editors, for that is not their job.
What is “art”? But there is no essence of the word, floating in the void.
Perhaps you come to me with a long list of the things that you call “art” and
“not art”:

The Little Fugue in G Minor: Art.
A punch in the nose: Not art.
Escher’s Relativity: Art.
A flower: Not art.
The Python programming language: Art.
A cross floating in urine: Not art.
Jack Vance’s Tschai novels: Art.
Modern Art: Not art.

And you say to me: “It feels intuitive to me to draw this boundary, but I don’t
know why—can you find me an intension that matches this extension? Can
you give me a simple description of this boundary?”
So I reply: “I think it has to do with admiration of craftsmanship: work
going in and wonder coming out. What the included items have in common
is the similar aesthetic emotions that they inspire, and the deliberate human
effort that went into them with the intent of producing such an emotion.”
Is this helpful, or is it just cheating at Taboo? I would argue that the list of
which human emotions are or are not aesthetic is far more compact than the
list of everything that is or isn’t art. You might be able to see those emotions
lighting up an fMRI scan—I say this by way of emphasizing that emotions are
not ethereal.
But of course my definition of art is not the real point. The real point is that
you could well dispute either the intension or the extension of my definition.
You could say, “Aesthetic emotion is not what these things have in common;
what they have in common is an intent to inspire any complex emotion for
the sake of inspiring it.” That would be disputing my intension, my attempt
to draw a curve through the data points. You would say, “Your equation may
roughly fit those points, but it is not the true generating distribution.”
Or you could dispute my extension by saying, “Some of these things do
belong together—I can see what you’re getting at—but the Python language
shouldn’t be on the list, and Modern Art should be.” (This would mark you as a
philistine, but you could argue it.) Here, the presumption is that there is indeed
an underlying curve that generates this apparent list of similar and dissimilar
things—that there is a rhyme and reason, even though you haven’t said yet
where it comes from—but I have unwittingly lost the rhythm and included some
data points from a different generator.
Long before you know what it is that electricity and magnetism have in
common, you might still suspect—based on surface appearances—that “animal
magnetism” does not belong on the list.
Once upon a time it was thought that the word “fish” included dolphins.
Now you could play the oh-so-clever arguer, and say, “The list: {Salmon, gup-
pies, sharks, dolphins, trout} is just a list—you can’t say that a list is wrong. I
can prove in set theory that this list exists. So my definition of fish, which is
simply this extensional list, cannot possibly be ‘wrong’ as you claim.”
Or you could stop playing games and admit that dolphins don’t belong on
the fish list.
You come up with a list of things that feel similar, and take a guess at why
this is so. But when you finally discover what they really have in common, it
may turn out that your guess was wrong. It may even turn out that your list
was wrong.
You cannot hide behind a comforting shield of correct-by-definition. Both
extensional definitions and intensional definitions can be wrong, can fail to
carve reality at the joints.
Categorizing is a guessing endeavor, in which you can make mistakes; so
it’s wise to be able to admit, from a theoretical standpoint, that your definition-
guesses can be “mistaken.”

*
174
Entropy, and Short Codes

(If you aren’t familiar with Bayesian inference, this may be a good time to read
An Intuitive Explanation of Bayes’s Theorem.)
Suppose you have a system X that’s equally likely to be in any of 8 possible
states:
{X1 , X2 , X3 , X4 , X5 , X6 , X7 , X8 } .

There’s an extraordinarily ubiquitous quantity—in physics, mathematics, and even biology—called entropy; and the entropy of X is 3 bits. This means that,
on average, we’ll have to ask 3 yes-or-no questions to find out X’s value. For
example, someone could tell us X’s value using this code:

X1 : 001   X2 : 010   X3 : 011   X4 : 100
X5 : 101   X6 : 110   X7 : 111   X8 : 000 .

So if I asked “Is the first symbol 1?” and heard “yes,” then asked “Is the second
symbol 1?” and heard “no,” then asked “Is the third symbol 1?” and heard “no,”
I would know that X was in state 4.
Now suppose that the system Y has four possible states with the following
probabilities:
Y1 : 1/2 (50%) Y2 : 1/4 (25%)
Y3 : 1/8 (12.5%) Y4 : 1/8 (12.5%) .
Then the entropy of Y would be 1.75 bits, meaning that we can find out its
value by asking 1.75 yes-or-no questions.
What does it mean to talk about asking one and three-fourths of a question?
Imagine that we designate the states of Y using the following code:

Y1 : 1 Y2 : 01 Y3 : 001 Y4 : 000 .

First you ask, “Is the first symbol 1?” If the answer is “yes,” you’re done: Y is
in state 1. This happens half the time, so 50% of the time, it takes 1 yes-or-no
question to find out Y ’s state.
Suppose that instead the answer is “No.” Then you ask, “Is the second
symbol 1?” If the answer is “yes,” you’re done: Y is in state 2. The system Y is
in state 2 with probability 1/4, and each time Y is in state 2 we discover this
fact using two yes-or-no questions, so 25% of the time it takes 2 questions to
discover Y ’s state.
If the answer is “No” twice in a row, you ask “Is the third symbol 1?” If
“yes,” you’re done and Y is in state 3; if “no,” you’re done and Y is in state 4.
The 1/8 of the time that Y is in state 3, it takes three questions; and the 1/8 of
the time that Y is in state 4, it takes three questions.

(1/2 × 1) + (1/4 × 2) + (1/8 × 3) + (1/8 × 3)
= 0.5 + 0.5 + 0.375 + 0.375
= 1.75 .

The general formula for the entropy H(S) of a system S is the sum, over all
Si , of −P (Si ) log2 (P (Si )).
For example, the log (base 2) of 1/8 is −3. So −(1/8 × −3) = 0.375 is
the contribution of state S4 to the total entropy: 1/8 of the time, we have to
ask 3 questions.
You can’t always devise a perfect code for a system, but if you have to tell
someone the state of arbitrarily many copies of S in a single message, you can
get arbitrarily close to a perfect code. (Google “arithmetic coding” for a simple
method.)
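For readers who want to check the arithmetic, here is a short Python sketch (my own; the dictionary names are illustrative rather than anything from the text) that computes the entropy of Y and the expected length of the prefix code above:

```python
# A minimal sketch: entropy of Y and expected length of the prefix code above.
import math

probs = {"Y1": 1/2, "Y2": 1/4, "Y3": 1/8, "Y4": 1/8}
code = {"Y1": "1", "Y2": "01", "Y3": "001", "Y4": "000"}

# H(Y) = sum over states of -P(s) * log2(P(s)).
entropy = sum(-p * math.log2(p) for p in probs.values())

# Expected number of yes-or-no questions = expected code length.
expected_length = sum(p * len(code[s]) for s, p in probs.items())

print(entropy)          # 1.75
print(expected_length)  # 1.75 -- this code is optimal for this distribution
```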
Now, you might ask: “Why not use the code 10 for Y4 , instead of 000?
Wouldn’t that let us transmit messages more quickly?”
But if you use the code 10 for Y4 , then when someone answers “Yes” to
the question “Is the first symbol 1?,” you won’t know yet whether the system
state is Y1 (1) or Y4 (10). In fact, if you change the code this way, the whole
system falls apart—because if you hear “1001,” you don’t know if it means “Y4 ,
followed by Y2 ” or “Y1 , followed by Y3 .”
The moral is that short words are a conserved resource.
The key to creating a good code—a code that transmits messages as com-
pactly as possible—is to reserve short words for things that you’ll need to say
frequently, and use longer words for things that you won’t need to say as often.
When you take this art to its limit, the length of the message you need
to describe something corresponds exactly or almost exactly to its probabil-
ity. This is the Minimum Description Length or Minimum Message Length
formalization of Occam’s Razor.
And so even the labels that we use for words are not quite arbitrary. The
sounds that we attach to our concepts can be better or worse, wiser or more
foolish. Even apart from considerations of common usage!
I say all this, because the idea that “You can X any way you like” is a huge
obstacle to learning how to X wisely. “It’s a free country; I have a right to my
own opinion” obstructs the art of finding truth. “I can define a word any way
I like” obstructs the art of carving reality at its joints. And even the sensible-
sounding “The labels we attach to words are arbitrary” obstructs awareness
of compactness. Prosody too, for that matter—Tolkien once observed what a
beautiful sound the phrase “cellar door” makes; that is the kind of awareness it
takes to use language like Tolkien.
The length of words also plays a nontrivial role in the cognitive science of
language:
Consider the phrases “recliner,” “chair,” and “furniture.” Recliner is a more
specific category than chair; furniture is a more general category than chair.
But the vast majority of chairs have a common use—you use the same sort of
motor actions to sit down in them, and you sit down in them for the same sort
of purpose (to take your weight off your feet while you eat, or read, or type,
or rest). Recliners do not depart from this theme. “Furniture,” on the other
hand, includes things like beds and tables which have different uses, and call
up different motor functions, from chairs.
In the terminology of cognitive psychology, “chair” is a basic-level category.
People have a tendency to talk, and presumably think, at the basic level of
categorization—to draw the boundary around “chairs,” rather than around
the more specific category “recliner,” or the more general category “furniture.”
People are more likely to say “You can sit in that chair” than “You can sit in
that recliner” or “You can sit in that furniture.”
And it is no coincidence that the word for “chair” contains fewer syllables
than either “recliner” or “furniture.” Basic-level categories, in general, tend
to have short names; and nouns with short names tend to refer to basic-level
categories. Not a perfect rule, of course, but a definite tendency. Frequent use
goes along with short words; short words go along with frequent use.
Or as Douglas Hofstadter put it, there’s a reason why the English language
uses “the” to mean “the” and “antidisestablishmentarianism” to mean “antidis-
establishmentarianism” instead of antidisestablishmentarianism other way
around.

*
175
Mutual Information, and Density in
Thingspace

Suppose you have a system X that can be in any of 8 states, which are all
equally probable (relative to your current state of knowledge), and a system Y
that can be in any of 4 states, all equally probable.
The entropy of X, as defined in the previous essay, is 3 bits; we’ll need to
ask 3 yes-or-no questions to find out X’s exact state. The entropy of Y is 2
bits; we have to ask 2 yes-or-no questions to find out Y ’s exact state. This
may seem obvious since 2^3 = 8 and 2^2 = 4, so 3 questions can distinguish
8 possibilities and 2 questions can distinguish 4 possibilities; but remember
that if the possibilities were not all equally likely, we could use a more clever
code to discover Y ’s state using e.g. 1.75 questions on average. In this case,
though, X’s probability mass is evenly distributed over all its possible states,
and likewise Y, so we can’t use any clever codes.
What is the entropy of the combined system (X, Y )?
You might be tempted to answer, “It takes 3 questions to find out X, and
then 2 questions to find out Y, so it takes 5 questions total to find out the state
of X and Y. ”
But what if the two variables are entangled, so that learning the state of Y
tells us something about the state of X?
In particular, let’s suppose that X and Y are either both odd or both even.
Now if we receive a 3-bit message (ask 3 questions) and learn that X is in
state X5 , we know that Y is in state Y1 or state Y3 , but not state Y2 or state
Y4 . So the single additional question “Is Y in state Y3 ?,” answered “No,” tells
us the entire state of (X, Y ): X = X5 , Y = Y1 . And we learned this with a
total of 4 questions.
Conversely, if we learn that Y is in state Y4 using two questions, it will take
us only an additional two questions to learn whether X is in state X2 , X4 , X6 ,
or X8 . Again, four questions to learn the state of the joint system.
The mutual information of two variables is defined as the difference between
the entropy of the joint system and the entropy of the independent systems:
I(X; Y ) = H(X) + H(Y ) − H(X, Y ).
Here there is one bit of mutual information between the two systems: Learn-
ing X tells us one bit of information about Y (cuts down the space of pos-
sibilities from 4 possibilities to 2, a factor-of-2 decrease in the volume) and
learning Y tells us one bit of information about X (cuts down the possibility
space from 8 possibilities to 4).
What about when probability mass is not evenly distributed? Last essay, for
example, we discussed the case in which Y had the probabilities 1/2, 1/4, 1/8,
1/8 for its four states. Let us take this to be our probability distribution over
Y, considered independently—if we saw Y, without seeing anything else, this
is what we’d expect to see. And suppose the variable Z has two states, Z1 and
Z2 , with probabilities 3/8 and 5/8 respectively.
Then if and only if the joint distribution of Y and Z is as follows, there is
zero mutual information between Y and Z:
Z1 Y1 : 3/16 Z1 Y2 : 3/32 Z1 Y3 : 3/64 Z1 Y4 : 3/64
Z2 Y1 : 5/16 Z2 Y2 : 5/32 Z2 Y3 : 5/64 Z2 Y4 : 5/64 .
This distribution obeys the law

P (Y, Z) = P (Y )P (Z) .

For example, P (Z1 Y2 ) = P (Z1 )P (Y2 ) = 3/8 × 1/4 = 3/32.


And observe that we can recover the marginal (independent) probabilities
of Y and Z just by looking at the joint distribution:

P (Y1 ) = total probability of all the different ways Y1 can happen
= P (Z1 Y1 ) + P (Z2 Y1 )
= 3/16 + 5/16
= 1/2 .

So, just by inspecting the joint distribution, we can determine whether the
marginal variables Y and Z are independent; that is, whether the joint distri-
bution factors into the product of the marginal distributions; whether, for all
Y and Z, we have P (Y, Z) = P (Y )P (Z).
This last is significant because, by Bayes’s Rule,

P (Zj Yi ) = P (Yi )P (Zj )
P (Zj Yi )/P (Zj ) = P (Yi )
P (Yi |Zj ) = P (Yi ) .

In English: “After you learn Zj , your belief about Yi is just what it was before.”
So when the distribution factorizes—when P (Y, Z) = P (Y )P (Z)—this
is equivalent to “Learning about Y never tells us anything about Z or vice
versa.”
From which you might suspect, correctly, that there is no mutual informa-
tion between Y and Z. Where there is no mutual information, there is no
Bayesian evidence, and vice versa.
Suppose that in the distribution (Y, Z) above, we treated each possible
combination of Y and Z as a separate event—so that the distribution (Y, Z)
would have a total of 8 possibilities, with the probabilities shown—and then
we calculated the entropy of the distribution (Y, Z) the same way we would
calculate the entropy of any distribution:

−P (Z1 Y1 ) log2 (P (Z1 Y1 )) − P (Z1 Y2 ) log2 (P (Z1 Y2 ))
− P (Z1 Y3 ) log2 (P (Z1 Y3 )) − . . . − P (Z2 Y4 ) log2 (P (Z2 Y4 ))
= −(3/16) log2 (3/16) − (3/32) log2 (3/32)
− (3/64) log2 (3/64) − . . . − (5/64) log2 (5/64) .

You would end up with the same total you would get if you separately calculated
the entropy of Y plus the entropy of Z. There is no mutual information
between the two variables, so our uncertainty about the joint system is not any
less than our uncertainty about the two systems considered separately. (I am
not showing the calculations, but you are welcome to do them; and I am not
showing the proof that this is true in general, but you are welcome to Google
on “Shannon entropy” and “mutual information.”)
What if the joint distribution doesn’t factorize? For example:

Z1 Y1 : 12/64 Z1 Y2 : 8/64 Z1 Y3 : 1/64 Z1 Y4 : 3/64
Z2 Y1 : 20/64 Z2 Y2 : 8/64 Z2 Y3 : 7/64 Z2 Y4 : 5/64 .

If you add up the joint probabilities to get marginal probabilities, you should
find that P (Y1 ) = 1/2, P (Z1 ) = 3/8, and so on—the marginal probabilities
are the same as before.
But the joint probabilities do not always equal the product of the marginal
probabilities. For example, the probability P (Z1 Y2 ) equals 8/64, where
P (Z1 )P (Y2 ) would equal 3/8 × 1/4 = 6/64. That is, the probability of
running into Z1 Y2 together is greater than you’d expect based on the proba-
bilities of running into Z1 or Y2 separately.
Which in turn implies:

P (Z1 Y2 ) > P (Z1 )P (Y2 )
P (Z1 Y2 )/P (Y2 ) > P (Z1 )
P (Z1 |Y2 ) > P (Z1 ) .

Since there’s an “unusually high” probability for P (Z1 Y2 )—defined as a probability higher than the marginal probabilities would indicate by default—it
follows that observing Y2 is evidence that increases the probability of Z1 . And
by a symmetrical argument, observing Z1 must favor Y2 .
As there are at least some values of Y that tell us about Z (and vice versa)
there must be mutual information between the two variables; and so you will
find—I am confident, though I haven’t actually checked—that calculating the
entropy of (Y, Z) yields less total uncertainty than the sum of the independent
entropies of Y and Z. That is, H(Y, Z) = H(Y ) + H(Z) − I(Y ; Z), with
all quantities necessarily positive.
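Since the text notes that the arithmetic hasn’t actually been checked, here is a small Python sketch (my own code; the variable names are illustrative) that runs the numbers for the non-factorizing table above. It reports roughly 0.04 bits of mutual information between Y and Z, positive as claimed:

```python
# Entropies and mutual information for the non-factorizing (Z, Y) table above.
import math

joint = {
    ("Z1", "Y1"): 12/64, ("Z1", "Y2"): 8/64, ("Z1", "Y3"): 1/64, ("Z1", "Y4"): 3/64,
    ("Z2", "Y1"): 20/64, ("Z2", "Y2"): 8/64, ("Z2", "Y3"): 7/64, ("Z2", "Y4"): 5/64,
}

def entropy(probabilities):
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

# Recover the marginals by summing the joint distribution over the other variable.
p_y, p_z = {}, {}
for (z, y), p in joint.items():
    p_y[y] = p_y.get(y, 0) + p
    p_z[z] = p_z.get(z, 0) + p

h_y = entropy(p_y.values())     # 1.75 bits
h_z = entropy(p_z.values())     # about 0.954 bits
h_yz = entropy(joint.values())  # about 2.664 bits, less than h_y + h_z

print(h_y + h_z - h_yz)         # I(Y;Z): roughly 0.04 bits
```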
(I digress here to remark that the symmetry of the expression for the mu-
tual information shows that Y must tell us as much about Z, on average, as Z
tells us about Y. I leave it as an exercise to the reader to reconcile this with any-
thing they were taught in logic class about how, if all ravens are black, being
allowed to reason Raven(x) ⇒ Black(x) doesn’t mean you’re allowed to rea-
son Black(x) ⇒ Raven(x). How different seem the symmetrical probability
flows of the Bayesian, from the sharp lurches of logic—even though the latter
is just a degenerate case of the former.)
“But,” you ask, “what has all this to do with the proper use of words?”
In Empty Labels and then Replace the Symbol with the Substance, we saw
the technique of replacing a word with its definition—the example being given:

All [mortal, ¬feathers, bipedal] are mortal.
Socrates is a [mortal, ¬feathers, bipedal].
Therefore, Socrates is mortal.

Why, then, would you even want to have a word for “human”? Why not just
say “Socrates is a mortal featherless biped”?
Because it’s helpful to have shorter words for things that you encounter
often. If your code for describing single properties is already efficient, then
there will not be an advantage to having a special word for a conjunction—like
“human” for “mortal featherless biped”—unless things that are mortal and
featherless and bipedal, are found more often than the marginal probabilities
would lead you to expect.
In efficient codes, word length corresponds to probability—so the code
for Z1 Y2 will be just as long as the code for Z1 plus the code for Y2 , unless
P (Z1 Y2 ) > P (Z1 )P (Y2 ), in which case the code for the word can be shorter
than the codes for its parts.
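As a concrete illustration (my own arithmetic, reusing the probabilities from the (Y, Z) table earlier in this essay), compare the ideal code lengths, which are −log2 of the probabilities:

```python
# Ideal code lengths for Z1 and Y2 coded separately versus the joint event Z1 Y2.
import math

p_z1, p_y2, p_z1y2 = 3/8, 1/4, 8/64   # note 8/64 > (3/8) * (1/4) = 6/64

separate = -math.log2(p_z1) - math.log2(p_y2)  # about 3.42 bits
together = -math.log2(p_z1y2)                  # exactly 3 bits

print(separate, together)  # the "word" for the conjunction earns a shorter code
```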
And this in turn corresponds exactly to the case where we can infer some
of the properties of the thing from seeing its other properties. It must be more
likely than the default that featherless bipedal things will also be mortal.
Of course the word “human” really describes many, many more properties—
when you see a human-shaped entity that talks and wears clothes, you can infer
whole hosts of biochemical and anatomical and cognitive facts about it. To
replace the word “human” with a description of everything we know about hu-
mans would require us to spend an inordinate amount of time talking. But this
is true only because a featherless talking biped is far more likely than default to
be poisonable by hemlock, or have broad nails, or be overconfident.
Having a word for a thing, rather than just listing its properties, is a more
compact code precisely in those cases where we can infer some of those prop-
erties from the other properties. (With the exception perhaps of very primitive
words, like “red,” that we would use to send an entirely uncompressed de-
scription of our sensory experiences. But by the time you encounter a bug, or
even a rock, you’re dealing with nonsimple property collections, far above the
primitive level.)
So having a word “wiggin” for green-eyed black-haired people is more
useful than just saying “green-eyed black-haired person” precisely when:

1. Green-eyed people are more likely than average to be black-haired (and
vice versa), meaning that we can probabilistically infer green eyes from
black hair or vice versa; or

2. Wiggins share other properties that can be inferred at greater-than-
default probability. In this case we have to separately observe the green
eyes and black hair; but then, after observing both these properties in-
dependently, we can probabilistically infer other properties (like a taste
for ketchup).

One may even consider the act of defining a word as a promise to this effect.
Telling someone, “I define the word ‘wiggin’ to mean a person with green eyes
and black hair,” by Gricean implication, asserts that the word “wiggin” will
somehow help you make inferences / shorten your messages.
If green-eyes and black hair have no greater than default probability to
be found together, nor does any other property occur at greater than default
probability along with them, then the word “wiggin” is a lie: The word claims
that certain people are worth distinguishing as a group, but they’re not.
In this case the word “wiggin” does not help describe reality more
compactly—it is not defined by someone sending the shortest message—it has
no role in the simplest explanation. Equivalently, the word “wiggin” will be
of no help to you in doing any Bayesian inference. Even if you do not call the
word a lie, it is surely an error.
And the way to carve reality at its joints is to draw your boundaries around
concentrations of unusually high probability density in Thingspace.

*
176
Superexponential Conceptspace, and
Simple Words

Thingspace, you might think, is a rather huge space. Much larger than reality,
for where reality only contains things that actually exist, Thingspace contains
everything that could exist.
Actually, the way I “defined” Thingspace to have dimensions for every
possible attribute—including correlated attributes like density and volume and
mass—Thingspace may be too poorly defined to have anything you could call
a size. But it’s important to be able to visualize Thingspace anyway. Surely,
no one can really understand a flock of sparrows if all they see is a cloud of
flapping cawing things, rather than a cluster of points in Thingspace.
But as vast as Thingspace may be, it doesn’t hold a candle to the size of
Conceptspace.
“Concept,” in machine learning, means a rule that includes or excludes
examples. If you see the data {2:+, 3:-, 14:+, 23:-, 8:+, 9:-} then
you might guess that the concept was “even numbers.” There is a rather large
literature (as one might expect) on how to learn concepts from data . . . given
random examples, given chosen examples . . . given possible errors in classifi-
cation . . . and most importantly, given different spaces of possible rules.
Suppose, for example, that we want to learn the concept “good days on
which to play tennis.” The possible attributes of Days are

Sky: {Sunny, Cloudy, Rainy}
AirTemp: {Warm, Cold}
Humidity: {Normal, High}
Wind: {Strong, Weak}.
We’re then presented with the following data, where + indicates a positive
example of the concept, and - indicates a negative classification:

+ Sky: Sunny; AirTemp: Warm;
Humidity: High; Wind: Strong.
- Sky: Rainy; AirTemp: Cold;
Humidity: High; Wind: Strong.
+ Sky: Sunny; AirTemp: Warm;
Humidity: High; Wind: Weak.
What should an algorithm infer from this?
A machine learner might represent one concept that fits this data as follows:

{Sky: ?; AirTemp: Warm; Humidity: High; Wind: ?}.

In this format, to determine whether this concept accepts or rejects an example, we compare element-by-element: ? accepts anything, but a specific value
accepts only that specific value.
So the concept above will accept only Days with AirTemp = Warm and
Humidity = High, but the Sky and the Wind can take on any value. This fits
both the negative and the positive classifications in the data so far—though it
isn’t the only concept that does so.
We can also simplify the above concept representation to

{?, Warm, High, ?} .

Without going into details, the classic algorithm would be:

• Maintain the set of the most general hypotheses that fit the data—those
that positively classify as many examples as possible, while still fitting
the facts.
• Maintain another set of the most specific hypotheses that fit the data—
those that negatively classify as many examples as possible, while still
fitting the facts.

• Each time we see a new negative example, we strengthen all the most
general hypotheses as little as possible, so that the new set is again as
general as possible while fitting the facts.

• Each time we see a new positive example, we relax all the most specific
hypotheses as little as possible, so that the new set is again as specific as
possible while fitting the facts.

• We continue until we have only a single hypothesis left. This will be the
answer if the target concept was in our hypothesis space at all.

In the case above, the set of most general hypotheses would be

{?, Warm, ?, ?}, {Sunny, ?, ?, ?} ,

while the set of most specific hypotheses contains the single member
{Sunny, Warm, High, ?}.
Any other concept you can find that fits the data will be strictly more specific
than one of the most general hypotheses, and strictly more general than the
most specific hypothesis.
(For more on this, I recommend Tom Mitchell’s Machine Learning, from
which this example was adapted.1 )
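Here is a minimal brute-force sketch of the same idea (my own code, not Mitchell’s candidate-elimination procedure itself): enumerate all 109 hypotheses in this representation and keep the ones consistent with the three examples. It recovers exactly the hypotheses lying between the most specific hypothesis and the two most general ones.

```python
# Brute-force version space for the "good days to play tennis" example.
from itertools import product

attribute_values = [
    ["Sunny", "Cloudy", "Rainy"],  # Sky
    ["Warm", "Cold"],              # AirTemp
    ["Normal", "High"],            # Humidity
    ["Strong", "Weak"],            # Wind
]

# A hypothesis is a tuple of required values, with "?" accepting anything.
hypotheses = list(product(*[values + ["?"] for values in attribute_values]))
hypotheses.append(("#", "#", "#", "#"))  # "never classify anything positively"

def accepts(hypothesis, day):
    return all(h == "?" or h == d for h, d in zip(hypothesis, day))

data = [
    (("Sunny", "Warm", "High", "Strong"), True),
    (("Rainy", "Cold", "High", "Strong"), False),
    (("Sunny", "Warm", "High", "Weak"), True),
]

version_space = [h for h in hypotheses
                 if all(accepts(h, day) == label for day, label in data)]

print(len(hypotheses))     # 109, matching the count given below
print(len(version_space))  # 6 hypotheses remain consistent with the data
print(version_space)       # everything between {Sunny, Warm, High, ?}
                           # and the two most general hypotheses
```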
Now you may notice that the format above cannot represent all possible
concepts. E.g., “Play tennis when the sky is sunny or the air is warm.” That fits
the data, but in the concept representation defined above, there’s no quadruplet
of values that describes the rule.
Clearly our machine learner is not very general. Why not allow it to rep-
resent all possible concepts, so that it can learn with the greatest possible
flexibility?
Days are composed of these four variables, one variable with 3 values and
three variables with 2 values. So there are 3 × 2 × 2 × 2 = 24 possible Days
that we could encounter.
The format given for representing Concepts allows us to require any of these
values for a variable, or leave the variable open. So there are 4×3×3×3 = 108
concepts in that representation. For the most-general/most-specific algorithm
to work, we need to start with the most specific hypothesis “no example is ever
positively classified.” If we add that, it makes a total of 109 concepts.
Is it suspicious that there are more possible concepts than possible Days?
Surely not: After all, a concept can be viewed as a collection of Days. A concept
can be viewed as the set of days that it classifies positively, or isomorphically,
the set of days that it classifies negatively.
So the space of all possible concepts that classify Days is the set of all possible
sets of Days, whose size is 2^24 = 16,777,216.
This complete space includes all the concepts we have discussed so far. But
it also includes concepts like “Positively classify only the examples {Sunny,
Warm, High, Strong} and {Sunny, Warm, High, Weak} and reject ev-
erything else” or “Negatively classify only the example {Rainy, Cold, High,
Strong} and accept everything else.” It includes concepts with no compact
representation, just a flat list of what is and isn’t allowed.
That’s the problem with trying to build a “fully general” inductive learner: it
can’t learn concepts until it’s seen every possible example in the instance
space.
If we add on more attributes to Days—like the Water temperature, or
the Forecast for tomorrow—then the number of possible days will grow
exponentially in the number of attributes. But this isn’t a problem with our
restricted concept space, because you can narrow down a large space using a
logarithmic number of examples.
Let’s say we add the Water: {Warm, Cold} attribute to days, which will
make for 48 possible Days and 325 possible concepts. Let’s say that each Day
we see is, usually, classified positive by around half of the currently-plausible
concepts, and classified negative by the other half. Then when we learn the
actual classification of the example, it will cut the space of compatible concepts
in half. So it might only take 9 examples (2^9 = 512) to narrow 325 possible
concepts down to one.
Even if Days had forty binary attributes, it should still only take a manage-
able amount of data to narrow down the possible concepts to one. Sixty-four
examples, if each example is classified positive by half the remaining concepts.
Assuming, of course, that the actual rule is one we can represent at all!
If you want to think of all the possibilities, well, good luck with that. The
space of all possible concepts grows superexponentially in the number of at-
tributes.
By the time you’re talking about data with forty binary attributes, the num-
ber of possible examples is past a trillion—but the number of possible concepts
is past two-to-the-trillionth-power. To narrow down that superexponential
concept space, you’d have to see over a trillion examples before you could say
what was In, and what was Out. You’d have to see every possible example, in
fact.
That’s with forty binary attributes, mind you. Forty bits, or 5 bytes, to be
classified simply “Yes” or “No.” Forty bits implies 2^40 possible examples, and
2^(2^40) possible concepts that classify those examples as positive or negative.
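To put rough numbers on this, here is a quick sketch of my own that just restates the counts in the text; since the number of concepts is far too large to compute in full, it reports how many decimal digits it would take to write it down:

```python
# Examples versus concepts for forty binary attributes.
import math

attributes = 40
examples = 2 ** attributes           # 1,099,511,627,776 possible examples
# 2 ** examples cannot be written out, so count its decimal digits instead.
digits = math.floor(examples * math.log10(2)) + 1

print(examples)  # a bit over a trillion
print(digits)    # about 3.3e11 digits just to write down the number of concepts
```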
So, here in the real world, where objects take more than 5 bytes to describe
and a trillion examples are not available and there is noise in the training data,
we only even think about highly regular concepts. A human mind—or the
whole observable universe—is not nearly large enough to consider all the other
hypotheses.
From this perspective, learning doesn’t just rely on inductive bias, it is
nearly all inductive bias—when you compare the number of concepts ruled
out a priori, to those ruled out by mere evidence.
But what has this (you inquire) to do with the proper use of words?
It’s the whole reason that words have intensions as well as extensions.
In the last essay, I concluded:

The way to carve reality at its joints is to draw boundaries around
concentrations of unusually high probability density.

I deliberately left out a key qualification in that (slightly edited) statement,
because I couldn’t explain it until now. A better statement would be:

The way to carve reality at its joints, is to draw simple boundaries
around concentrations of unusually high probability density in
Thingspace.

Otherwise you would just gerrymander Thingspace. You would create really
odd noncontiguous boundaries that collected the observed examples, examples
that couldn’t be described in any shorter message than your observations
themselves, and say: “This is what I’ve seen before, and what I expect to see
more of in the future.”
In the real world, nothing above the level of molecules repeats itself exactly.
Socrates is shaped a lot like all those other humans who were vulnerable to
hemlock, but he isn’t shaped exactly like them. So your guess that Socrates is
a “human” relies on drawing simple boundaries around the human cluster in
Thingspace. Rather than, “Things shaped exactly like [5-megabyte shape speci-
fication 1] and with [lots of other characteristics], or exactly like [5-megabyte
shape specification 2] and [lots of other characteristics], . . . , are human.”
If you don’t draw simple boundaries around your experiences, you can’t do
inference with them. So you try to describe “art” with intensional definitions
like “that which is intended to inspire any complex emotion for the sake of
inspiring it,” rather than just pointing at a long list of things that are, or aren’t
art.
In fact, the above statement about “how to carve reality at its joints” is a bit
chicken-and-eggish: You can’t assess the density of actual observations until
you’ve already done at least a little carving. And the probability distribution
comes from drawing the boundaries, not the other way around—if you already
had the probability distribution, you’d have everything necessary for inference,
so why would you bother drawing boundaries?
And this suggests another—yes, yet another—reason to be suspicious of
the claim that “you can define a word any way you like.” When you consider
the superexponential size of Conceptspace, it becomes clear that singling out
one particular concept for consideration is an act of no small audacity—not
just for us, but for any mind of bounded computing power.
Presenting us with the word “wiggin,” defined as “a black-haired green-eyed
person,” without some reason for raising this particular concept to the level
of our deliberate attention, is rather like a detective saying: “Well, I haven’t
the slightest shred of support one way or the other for who could’ve murdered
those orphans . . . not even an intuition, mind you . . . but have we considered
John Q. Wiffleheim of 1234 Norkle Rd as a suspect?”

1. Tom M. Mitchell, Machine Learning (McGraw-Hill Science/Engineering/Math, 1997).


177
Conditional Independence, and
Naive Bayes

Previously I spoke of mutual information between X and Y, written I(X; Y ), which is the difference between the entropy of the joint probability distribution,
H(X, Y ), and the entropies of the marginal distributions, H(X) + H(Y ).
I gave the example of a variable X, having eight states, X1 through X8 ,
which are all equally probable if we have not yet encountered any evidence;
and a variable Y, with states Y1 through Y4 , which are all equally probable if
we have not yet encountered any evidence. Then if we calculate the marginal
entropies H(X) and H(Y ), we will find that X has 3 bits of entropy, and Y
has 2 bits.
However, we also know that X and Y are both even or both odd; and this
is all we know about the relation between them. So for the joint distribution
(X, Y ) there are only 16 possible states, all equally probable, for a joint entropy
of 4 bits. This is a 1-bit entropy defect, compared to 5 bits of entropy if X
and Y were independent. This entropy defect is the mutual information—the
information that X tells us about Y, or vice versa, so that we are not as uncertain
about one after having learned the other.
Suppose, however, that there exists a third variable Z. The variable Z has
two states, “even” and “odd,” perfectly correlated to the evenness or oddness
of (X, Y ). In fact, we’ll suppose that Z is just the question “Are X and Y
even or odd?”
If we have no evidence about X and Y, then Z itself necessarily has 1 bit
of entropy on the information given. There is 1 bit of mutual information
between Z and X, and 1 bit of mutual information between Z and Y. And, as
previously noted, 1 bit of mutual information between X and Y. So how much
entropy for the whole system (X, Y, Z)? You might naively expect that

H(X, Y, Z) = H(X) + H(Y ) + H(Z) − I(X; Z) − I(Z; Y ) − I(X; Y ) ,

but this turns out not to be the case.


The joint system (X, Y, Z) only has 16 possible states—since Z is just the
question “Are X and Y even or odd?”—so H(X, Y, Z) = 4 bits.
But if you calculate the formula just given, you get

(3 + 2 + 1 − 1 − 1 − 1) bits = 3 bits = Wrong!

Why? Because if you have the mutual information between X and Z, and
the mutual information between Z and Y, that may include some of the same
mutual information that we’ll calculate exists between X and Y. In this case,
for example, knowing that X is even tells us that Z is even, and knowing that
Z is even tells us that Y is even, but this is the same information that X would
tell us about Y. We double-counted some of our knowledge, and so came up
with too little entropy.
The correct formula is (I believe):

H(X, Y, Z) = H(X)+H(Y )+H(Z)−I(X; Z)−I(Z; Y )−I(X; Y |Z) .

Here the last term, I(X; Y |Z), means, “the information that X tells us about
Y, given that we already know Z.” In this case, X doesn’t tell us anything
about Y, given that we already know Z, so the term comes out as zero—and
the equation gives the correct answer. There, isn’t that nice?
“No,” you correctly reply, “for you have not told me how to calculate
I(X; Y |Z), only given me a verbal argument that it ought to be zero.”
We calculate I(X; Y |Z) just the way you would expect. We know
I(X; Y ) = H(X) + H(Y ) − H(X, Y ), so

I(X; Y |Z) = H(X|Z) + H(Y |Z) − H(X, Y |Z) .

And now, I suppose, you want to know how to calculate the conditional en-
tropy? Well, the original formula for the entropy is
H(S) = Σ_i −P(Si) × log2(P(Si)) .

If we then learned a new fact Z0, our remaining uncertainty about S would be

H(S|Z0) = Σ_i −P(Si|Z0) × log2(P(Si|Z0)) .

So if we’re going to learn a new fact Z, but we don’t know which Z yet, then, on average, we expect to be around this uncertain of S afterward:

H(S|Z) = Σ_j P(Zj) [ Σ_i −P(Si|Zj) × log2(P(Si|Zj)) ] .

And that’s how one calculates conditional entropies; from which, in turn, we
can get the conditional mutual information.
There are all sorts of ancillary theorems here, like

H(X|Y ) = H(X, Y ) − H(Y )

and
if I(X; Z) = 0 and I(Y ; X|Z) = 0 then I(X; Y ) = 0 ,

but I’m not going to go into those.
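To see all of these quantities at work on the X, Y, Z example above, here is a minimal Python sketch (just an illustration of the formulas, with the joint distribution written out explicitly as the 16 equally probable parity-matched states):

import itertools
import math

def entropy(dist):
    """Shannon entropy, in bits, of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def project(dist, *indices):
    """Marginalize a joint distribution onto the given coordinates."""
    out = {}
    for outcome, p in dist.items():
        key = tuple(outcome[i] for i in indices)
        out[key] = out.get(key, 0.0) + p
    return out

# Joint distribution over (X, Y, Z): X has states 1..8, Y has states 1..4,
# X and Y share the same parity, and Z is just that parity.
joint = {(x, y, "even" if x % 2 == 0 else "odd"): 1 / 16
         for x, y in itertools.product(range(1, 9), range(1, 5))
         if x % 2 == y % 2}

def H(*indices):
    return entropy(project(joint, *indices))

I_XZ = H(0) + H(2) - H(0, 2)   # 1 bit
I_ZY = H(2) + H(1) - H(1, 2)   # 1 bit
I_XY = H(0) + H(1) - H(0, 1)   # 1 bit

# I(X; Y | Z) = H(X|Z) + H(Y|Z) - H(X,Y|Z), using H(A|Z) = H(A, Z) - H(Z).
I_XY_given_Z = (H(0, 2) - H(2)) + (H(1, 2) - H(2)) - (H(0, 1, 2) - H(2))

print(H(0), H(1), H(2))                                   # 3.0 2.0 1.0
print(I_XY_given_Z)                                       # 0.0: Z screens off X from Y
print(H(0) + H(1) + H(2) - I_XZ - I_ZY - I_XY)            # 3.0: the naive, wrong answer
print(H(0) + H(1) + H(2) - I_XZ - I_ZY - I_XY_given_Z)    # 4.0: the corrected formula
print(H(0, 1, 2))                                         # 4.0: the joint entropy, computed directly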


“But,” you ask, “what does this have to do with the nature of words and
their hidden Bayesian structure?”
I am just so unspeakably glad that you asked that question, because I was
planning to tell you whether you liked it or not. But first there are a couple
more preliminaries.
You will remember—yes, you will remember—that there is a duality be-
tween mutual information and Bayesian evidence. Mutual information is
positive if and only if the probability of at least some joint events P (x, y) does
not equal the product of the probabilities of the separate events P (x)P (y).
This, in turn, is exactly equivalent to the condition that Bayesian evidence
exists between x and y:

I(X; Y) > 0 ⇒
P(x, y) ≠ P(x) P(y)
P(x, y)/P(y) ≠ P(x)
P(x|y) ≠ P(x) .

If you’re conditioning on Z, you just adjust the whole derivation accordingly:

I(X; Y|Z) > 0 ⇒
P(x, y|z) ≠ P(x|z) P(y|z)
P(x, y|z)/P(y|z) ≠ P(x|z)
(P(x, y, z)/P(z)) / (P(y, z)/P(z)) ≠ P(x|z)
P(x, y, z)/P(y, z) ≠ P(x|z)
P(x|y, z) ≠ P(x|z) .

Which last line reads “Even knowing Z, learning Y still changes our beliefs
about X.”
Conversely, as in our original case of Z being “even” or “odd,” Z screens
off X from Y—that is, if we know that Z is “even,” learning that Y is in state
Y4 tells us nothing more about whether X is X2 , X4 , X6 , or X8 . Or if we know
that Z is “odd,” then learning that X is X5 tells us nothing more about whether
Y is Y1 or Y3 . Learning Z has rendered X and Y conditionally independent.
Conditional independence is a hugely important concept in probability
theory—to cite just one example, without conditional independence, the uni-
verse would have no structure.
Here, though, I only intend to talk about one particular kind of conditional
independence—the case of a central variable that screens off other variables
surrounding it, like a central body with tentacles.
Let there be five variables U, V, W, X, and Y; and moreover, suppose that
for every pair of these variables, one variable is evidence about the other. If you
select U and W, for example, then learning U = U1 will tell you something
you didn’t know before about the probability that W = W1 .
An unmanageable inferential mess? Evidence gone wild? Not necessarily.
Maybe U is “Speaks a language,” V is “Two arms and ten digits,” W is
“Wears clothes,” X is “Poisonable by hemlock,” and Y is “Red blood.” Now if
you encounter a thing-in-the-world, that might be an apple and might be a
rock, and you learn that this thing speaks Chinese, you are liable to assess a
much higher probability that it wears clothes; and if you learn that the thing is
not poisonable by hemlock, you will assess a somewhat lower probability that
it has red blood.
Now some of these rules are stronger than others. There is the case of Fred,
who is missing a finger due to a volcano accident, and the case of Barney the
Baby who doesn’t speak yet, and the case of Irving the IRCBot who emits
sentences but has no blood. So if we learn that a certain thing is not wearing
clothes, that doesn’t screen off everything that its speech capability can tell us
about its blood color. If the thing doesn’t wear clothes but does talk, maybe it’s
Nude Nellie.
This makes the case more interesting than, say, five integer variables that are
all odd or all even, but otherwise uncorrelated. In that case, knowing any one
of the variables would screen off everything that knowing a second variable
could tell us about a third variable.
But here, we have dependencies that don’t go away as soon as we learn
just one variable, as the case of Nude Nellie shows. So is it an unmanageable
inferential inconvenience?
Fear not! For there may be some sixth variable Z, which, if we knew it,
really would screen off every pair of variables from each other. There may
be some variable Z—even if we have to construct Z rather than observing it
directly—such that:

P(U|V, W, X, Y, Z) = P(U|Z)
P(V|U, W, X, Y, Z) = P(V|Z)
P(W|U, V, X, Y, Z) = P(W|Z)
. . .

Perhaps, given that a thing is “human,” then the probabilities of it speaking, wearing clothes, and having the standard number of fingers, are all independent.
Fred may be missing a finger—but he is no more likely to be a nudist than the
next person; Nude Nellie never wears clothes, but knowing this doesn’t make
it any less likely that she speaks; and Baby Barney doesn’t talk yet, but is not
missing any limbs.
This is called the “Naive Bayes” method, because it usually isn’t quite true,
but pretending that it’s true can simplify the living daylights out of your calcula-
tions. We don’t keep separate track of the influence of clothed-ness on speech
capability given finger number. We just use all the information we’ve observed
to keep track of the probability that this thingy is a human (or alternatively,
something else, like a chimpanzee or robot) and then use our beliefs about
the central class to predict anything we haven’t seen yet, like vulnerability to
hemlock.
Any observations of U, V, W, X, and Y just act as evidence for the central
class variable Z, and then we use the posterior distribution on Z to make any
predictions that need making about unobserved variables in U, V, W, X, and
Y.
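In code, that bookkeeping is only a few lines. Here is a minimal Python sketch of the idea, with a hypothetical class variable (“human” vs. “robot”) and invented numbers, meant only to illustrate the structure:

# Hypothetical Naive Bayes sketch: a central class variable with each observed
# feature treated as conditionally independent given the class.
priors = {"human": 0.5, "robot": 0.5}
likelihoods = {
    # P(feature = True | class); all numbers are invented for illustration.
    "speaks":        {"human": 0.90, "robot": 0.30},
    "wears_clothes": {"human": 0.95, "robot": 0.05},
    "ten_fingers":   {"human": 0.97, "robot": 0.50},
    "red_blood":     {"human": 0.99, "robot": 0.01},
}

def posterior(observations):
    """P(class | observed features), under the Naive Bayes assumption."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for feature, value in observations.items():
            p = likelihoods[feature][cls]
            score *= p if value else (1 - p)
        scores[cls] = score
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}

# Observe some variables, then predict an unobserved one through the class.
post = posterior({"speaks": True, "wears_clothes": False})
p_red_blood = sum(post[cls] * likelihoods["red_blood"][cls] for cls in post)
print(post, p_red_blood)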
Sound familiar? It should; see Figure 177.1.
As a matter of fact, if you use the right kind of neural network units, this
“neural network” ends up exactly, mathematically equivalent to Naive Bayes.
The central unit just needs a logistic threshold—an S-curve response—and the
weights of the inputs just need to match the logarithms of the likelihood ratios,
et cetera. In fact, it’s a good guess that this is one of the reasons why logistic
response often works so well in neural networks—it lets the algorithm sneak
in a little Bayesian reasoning while the designers aren’t looking.
[Figure 177.1: Network 2. A central Category unit (+BLEGG / −RUBE) connected to five peripheral units: Color (+blue / −red), Shape (+egg / −cube), Luminance (+glow / −dark), Texture (+furred / −smooth), and Interior (+vanadium / −palladium).]
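That equivalence is easy to check numerically. Below is a small Python sketch (with invented blegg/rube numbers) in which a single logistic unit, whose bias is the log prior odds and whose input weights are log likelihood ratios, produces exactly the Naive Bayes posterior:

import math

# Two classes (blegg vs. rube) and binary features; the numbers are invented.
prior_blegg = 0.5
p_feat = {  # P(feature = 1 | class)
    "blue": {"blegg": 0.95, "rube": 0.05},
    "egg":  {"blegg": 0.90, "rube": 0.10},
    "glow": {"blegg": 0.80, "rube": 0.20},
}

def naive_bayes(obs):
    num = prior_blegg
    den = 1 - prior_blegg
    for f, v in obs.items():
        num *= p_feat[f]["blegg"] if v else 1 - p_feat[f]["blegg"]
        den *= p_feat[f]["rube"] if v else 1 - p_feat[f]["rube"]
    return num / (num + den)

def logistic_unit(obs):
    # Bias = log prior odds; each input contributes the log likelihood ratio
    # for its observed value; the S-curve converts summed evidence to a probability.
    total = math.log(prior_blegg / (1 - prior_blegg))
    for f, v in obs.items():
        pb, pr = p_feat[f]["blegg"], p_feat[f]["rube"]
        total += math.log(pb / pr) if v else math.log((1 - pb) / (1 - pr))
    return 1 / (1 + math.exp(-total))

obs = {"blue": 1, "egg": 1, "glow": 0}
print(naive_bayes(obs), logistic_unit(obs))  # the same number, up to rounding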

Just because someone is presenting you with an algorithm that they call a
“neural network” with buzzwords like “scruffy” and “emergent” plastered all
over it, disclaiming proudly that they have no idea how the learned network
works—well, don’t assume that their little AI algorithm really is Beyond the
Realms of Logic. For this paradigm of adhockery, if it works, will turn out to
have Bayesian structure; it may even be exactly equivalent to an algorithm of
the sort called “Bayesian.”
Even if it doesn’t look Bayesian, on the surface.
And then you just know that the Bayesians are going to start explaining
exactly how the algorithm works, what underlying assumptions it reflects,
which environmental regularities it exploits, where it works and where it fails,
and even attaching understandable meanings to the learned network weights.
Disappointing, isn’t it?

*
178
Words as Mental Paintbrush Handles

Suppose I tell you: “It’s the strangest thing: The lamps in this hotel have
triangular lightbulbs.”
You may or may not have visualized it—if you haven’t done it yet, do so
now—what, in your mind’s eye, does a “triangular lightbulb” look like?
In your mind’s eye, did the glass have sharp edges, or smooth?
When the phrase “triangular lightbulb” first crossed my mind—no, the
hotel doesn’t have them—then as best as my introspection could determine, I
first saw a pyramidal lightbulb with sharp edges, then (almost immediately)
the edges were smoothed, and then my mind generated a loop of fluorescent
bulb in the shape of a smooth triangle as an alternative.
As far as I can tell, no deliberative/verbal thoughts were involved—just
wordless reflex flinch away from the imaginary mental vision of sharp glass,
which design problem was solved before I could even think in words.
Believe it or not, for some decades, there was a serious debate about whether
people really had mental images in their mind—an actual picture of a chair
somewhere—or if people just naively thought they had mental images (having
been misled by “introspection,” a very bad forbidden activity), while actually
just having a little “chair” label, like a lisp token, active in their brain.
I am trying hard not to say anything like “How spectacularly silly,” because
there is always the hindsight effect to consider, but: how spectacularly silly.
This academic paradigm, I think, was mostly a deranged legacy of behaviorism, which denied the existence of thoughts in humans, and sought to explain all human phenomena as “reflex,” including speech. Behaviorism probably deserves its own essay at some point, as it was a perversion of rationalism; but this is not that essay.
“You call it ‘silly,’ ” you inquire, “but how do you know that your brain
represents visual images? Is it merely that you can close your eyes and see
them?”
This question used to be harder to answer, back in the day of the controversy.
If you wanted to prove the existence of mental imagery “scientifically,” rather
than just by introspection, you had to infer the existence of mental imagery
from experiments like this: Show subjects two objects and ask them if one can
be rotated into correspondence with the other. The response time is linearly
proportional to the angle of rotation required. This is easy to explain if you are
actually visualizing the image and continuously rotating it at a constant speed,
but hard to explain if you are just checking propositional features of the image.
Today we can actually neuroimage the little pictures in the visual cortex. So,
yes, your brain really does represent a detailed image of what it sees or imagines.
See Stephen Kosslyn’s Image and Brain: The Resolution of the Imagery Debate.1
Part of the reason people get in trouble with words, is that they do not
realize how much complexity lurks behind words.
Can you visualize a “green dog”? Can you visualize a “cheese apple”?
“Apple” isn’t just a sequence of two syllables or five letters. That’s a shadow.
That’s the tip of the tiger’s tail.
Words, or rather the concepts behind them, are paintbrushes—you can
use them to draw images in your own mind. Literally draw, if you employ
concepts to make a picture in your visual cortex. And by the use of shared
labels, you can reach into someone else’s mind, and grasp their paintbrushes
to draw pictures in their minds—sketch a little green dog in their visual cortex.
But don’t think that, because you send syllables through the air, or letters
through the Internet, it is the syllables or the letters that draw pictures in the
visual cortex. That takes some complex instructions that wouldn’t fit in the
sequence of letters. “Apple” is 5 bytes, and drawing a picture of an apple from
scratch would take more data than that.
“Apple” is merely the tag attached to the true and wordless apple concept,
which can paint a picture in your visual cortex, or collide with “cheese,” or
recognize an apple when you see one, or taste its archetype in apple pie, maybe
even send out the motor behavior for eating an apple . . .
And it’s not as simple as just calling up a picture from memory. Or how
would you be able to visualize combinations like a “triangular lightbulb”—
imposing triangleness on lightbulbs, keeping the essence of both, even if you’ve
never seen such a thing in your life?
Don’t make the mistake the behaviorists made. There’s far more to speech
than sound in air. The labels are just pointers—“look in memory area 1387540.”
Sooner or later, when you’re handed a pointer, it comes time to dereference it,
and actually look in memory area 1387540.
What does a word point to?

1. Stephen M. Kosslyn, Image and Brain: The Resolution of the Imagery Debate (Cambridge, MA: MIT
Press, 1994).
179
Variable Question Fallacies

Albert: “Every time I’ve listened to a tree fall, it made a sound, so I’ll guess that other trees falling also make sounds. I don’t believe the world changes around when I’m not looking.”
Barry: “Wait a minute. If no one hears it, how can it be a sound?”

While writing the dialogue of Albert and Barry in their dispute over whether
a falling tree in a deserted forest makes a sound, I sometimes found myself
losing empathy with my characters. I would start to lose the gut feel of why
anyone would ever argue like that, even though I’d seen it happen many times.
On these occasions, I would repeat to myself, “Either the falling tree makes
a sound, or it does not!” to restore my borrowed sense of indignation.
(P or ¬P ) is not always a reliable heuristic, if you substitute arbitrary
English sentences for P. “This sentence is false” cannot be consistently viewed
as true or false. And then there’s the old classic, “Have you stopped beating
your wife?”
Now if you are a mathematician, and one who believes in classical (rather
than intuitionistic) logic, there are ways to continue insisting that (P or ¬P )
is a theorem: for example, saying that “This sentence is false” is not a sentence.
But such resolutions are subtle, which suffices to demonstrate a need for
subtlety. You cannot just bull ahead on every occasion with “Either it does or
it doesn’t!”
So does the falling tree make a sound, or not, or . . . ?
Surely, 2 + 2 = X or it does not? Well, maybe, if it’s really the same X,
the same 2, and the same + and = . If X evaluates to 5 on some occasions
and 4 on another, your indignation may be misplaced.
To even begin claiming that (P or ¬P ) ought to be a necessary truth,
the symbol P must stand for exactly the same thing in both halves of the
dilemma. “Either the fall makes a sound, or not!”—but if Albert::sound is not
the same as Barry::sound, there is nothing paradoxical about the tree making
an Albert::sound but not a Barry::sound.
(The :: idiom is something I picked up in my C++ days for avoiding names-
pace collisions. If you’ve got two different packages that define a class Sound,
you can write Package1::Sound to specify which Sound you mean. The idiom
is not widely known, I think; which is a pity, because I often wish I could use it
in writing.)
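For illustration, here is a rough Python analogue of the same disambiguation trick (the class names are just stand-ins for the two speakers’ concepts, not anything from the dialogue itself):

# Two namespaces, each with its own "sound" concept (illustrative stand-ins).
class Albert:
    sound = "acoustic vibrations travelling through the air"

class Barry:
    sound = "an auditory experience in someone's brain"

# "The falling tree makes a sound" is two different claims, depending on
# which namespace's "sound" gets dereferenced:
print("Albert::sound ->", Albert.sound)
print("Barry::sound  ->", Barry.sound)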
The variability may be subtle: Albert and Barry may carefully verify that it
is the same tree, in the same forest, and the same occasion of falling, just to
ensure that they really do have a substantive disagreement about exactly the
same event. And then forget to check that they are matching this event against
exactly the same concept.
Think about the grocery store that you visit most often: Is it on the left
side of the street, or the right? But of course there is no “the left side” of the
street, only your left side, as you travel along it from some particular direction.
Many of the words we use are really functions of implicit variables supplied by
context.
It’s actually one heck of a pain, requiring one heck of a lot of work, to handle
this kind of problem in an Artificial Intelligence program intended to parse
language—the phenomenon going by the name of “speaker deixis.”
“Martin told Bob the building was on his left.” But “left” is a function-word
that evaluates with a speaker-dependent variable invisibly grabbed from the
surrounding context. Whose “left” is meant, Bob’s or Martin’s?
The variables in a variable question fallacy often aren’t neatly labeled—it’s
not as simple as “Say, do you think Z + 2 equals 6?”
If a namespace collision introduces two different concepts that look like
“the same concept” because they have the same name—or a map compression
introduces two different events that look like the same event because they
don’t have separate mental files—or the same function evaluates in different
contexts—then reality itself becomes protean, changeable. At least that’s what
the algorithm feels like from inside. Your mind’s eye sees the map, not the
territory directly.
If you have a question with a hidden variable, that evaluates to different
expressions in different contexts, it feels like reality itself is unstable—what
your mind’s eye sees, shifts around depending on where it looks.
This often confuses undergraduates (and postmodernist professors) who
discover a sentence with more than one interpretation; they think they have
discovered an unstable portion of reality.
“Oh my gosh! ‘The Sun goes around the Earth’ is true for Hunga Hunter-
gatherer, but for Amara Astronomer, ‘The Sun goes around the Earth’ is false!
There is no fixed truth!” The deconstruction of this sophomoric nitwittery is
left as an exercise to the reader.
And yet, even I initially found myself writing “If X is 5 on some occasions
and 4 on another, the sentence ‘2 + 2 = X’ may have no fixed truth-value.”
There is not one sentence with a variable truth-value. “2 + 2 = X” has
no truth-value. It is not a proposition, not yet, not as mathematicians define
proposition-ness, any more than “2 + 2 =” is a proposition, or “Fred jumped
over the” is a grammatical sentence.
But this fallacy tends to sneak in, even when you allegedly know better,
because, well, that’s how the algorithm feels from inside.

*
180
37 Ways That Words Can Be Wrong

Some reader is bound to declare that a better title for this essay would be “37
Ways That You Can Use Words Unwisely,” or “37 Ways That Suboptimal Use
Of Categories Can Have Negative Side Effects On Your Cognition.”
But one of the primary lessons of this gigantic list is that saying “There’s
no way my choice of X can be ‘wrong’ ” is nearly always an error in practice,
whatever the theory. You can always be wrong. Even when it’s theoretically
impossible to be wrong, you can still be wrong. There is never a Get Out of Jail
Free card for anything you do. That’s life.
Besides, I can define the word “wrong” to mean anything I like—it’s not
like a word can be wrong.
Personally, I think it quite justified to use the word “wrong” when:

1. A word fails to connect to reality in the first place. Is Socrates a framster? Yes or no? (The Parable of the Dagger)

2. Your argument, if it worked, could coerce reality to go a different way by choosing a different word definition. Socrates is a human, and humans,
by definition, are mortal. So if you defined humans to not be mortal,
would Socrates live forever? (The Parable of Hemlock)
3. You try to establish any sort of empirical proposition as being true “by
definition.” Socrates is a human, and humans, by definition, are mortal.
So is it a logical truth if we empirically predict that Socrates should keel
over if he drinks hemlock? It seems like there are logically possible,
non-self-contradictory worlds where Socrates doesn’t keel over—where
he’s immune to hemlock by a quirk of biochemistry, say. Logical truths
are true in all possible worlds, and so never tell you which possible world
you live in—and anything you can establish “by definition” is a logical
truth. (The Parable of Hemlock)

4. You unconsciously slap the conventional label on something, without actually using the verbal definition you just gave. You know perfectly well
that Bob is “human,” even though, by your definition, you can never call
Bob “human” without first observing him to be mortal. (The Parable of
Hemlock)

5. The act of labeling something with a word disguises a challengeable inductive inference you are making. If the last 11 egg-shaped objects drawn have
been blue, and the last 8 cubes drawn have been red, it is a matter of
induction to say this rule will hold in the future. But if you call the
blue eggs “bleggs” and the red cubes “rubes,” you may reach into the
barrel, feel an egg shape, and think “Oh, a blegg.” (Words as Hidden
Inferences)

6. You try to define a word using words, in turn defined with ever-more-
abstract words, without being able to point to an example. “What is red?”
“Red is a color.” “What’s a color?” “It’s a property of a thing.” “What’s a
thing? What’s a property?” It never occurs to you to point to a stop sign
and an apple. (Extensions and Intensions)

7. The extension doesn’t match the intension. We aren’t consciously aware of our identification of a red light in the sky as “Mars,” which will probably
happen regardless of your attempt to define “Mars” as “The God of War.”
(Extensions and Intensions)
8. Your verbal definition doesn’t capture more than a tiny fraction of the
category’s shared characteristics, but you try to reason as if it does. When
the philosophers of Plato’s Academy claimed that the best definition of
a human was a “featherless biped,” Diogenes the Cynic is said to have
exhibited a plucked chicken and declared “Here is Plato’s Man.” The
Platonists promptly changed their definition to “a featherless biped with
broad nails.” (Similarity Clusters)

9. You try to treat category membership as all-or-nothing, ignoring the existence of more and less typical subclusters. Ducks and penguins are less
typical birds than robins and pigeons. Interestingly, a between-groups
experiment showed that subjects thought a disease was more likely to
spread from robins to ducks on an island, than from ducks to robins.
(Typicality and Asymmetrical Similarity)

10. A verbal definition works well enough in practice to point out the intended
cluster of similar things, but you nitpick exceptions. Not every human
has ten fingers, or wears clothes, or uses language; but if you look for
an empirical cluster of things which share these characteristics, you’ll
get enough information that the occasional nine-fingered human won’t
fool you. (The Cluster Structure of Thingspace)

11. You ask whether something “is” or “is not” a category member but can’t
name the question you really want answered. What is a “man”? Is Barney
the Baby Boy a “man”? The “correct” answer may depend considerably
on whether the query you really want answered is “Would hemlock be a
good thing to feed Barney?” or “Will Barney make a good husband?”
(Disguised Queries)

12. You treat intuitively perceived hierarchical categories like the only correct
way to parse the world, without realizing that other forms of statistical
inference are possible even though your brain doesn’t use them. It’s much
easier for a human to notice whether an object is a “blegg” or “rube”;
than for a human to notice that red objects never glow in the dark, but
red furred objects have all the other characteristics of bleggs. Other
statistical algorithms work differently. (Neural Categories)
13. You talk about categories as if they are manna fallen from the Platonic
Realm, rather than inferences implemented in a real brain. The ancient
philosophers said “Socrates is a man,” not, “My brain perceptually
classifies Socrates as a match against the ‘human’ concept.” (How An
Algorithm Feels From Inside)

14. You argue about a category membership even after screening off all ques-
tions that could possibly depend on a category-based inference. After you
observe that an object is blue, egg-shaped, furred, flexible, opaque, lu-
minescent, and palladium-containing, what’s left to ask by arguing, “Is
it a blegg?” But if your brain’s categorizing neural network contains a
(metaphorical) central unit corresponding to the inference of blegg-ness,
it may still feel like there’s a leftover question. (How An Algorithm Feels
From Inside)

15. You allow an argument to slide into being about definitions, even though
it isn’t what you originally wanted to argue about. If, before a dispute
started about whether a tree falling in a deserted forest makes a “sound,”
you asked the two soon-to-be arguers whether they thought a “sound”
should be defined as “acoustic vibrations” or “auditory experiences,”
they’d probably tell you to flip a coin. Only after the argument starts
does the definition of a word become politically charged. (Disputing
Definitions)

16. You think a word has a meaning, as a property of the word itself; rather
than there being a label that your brain associates to a particular concept.
When someone shouts “Yikes! A tiger!,” evolution would not favor
an organism that thinks, “Hm . . . I have just heard the syllables ‘Tie’
and ‘Grr’ which my fellow tribemembers associate with their internal
analogues of my own tiger concept and which aiiieeee crunch crunch
gulp.” So the brain takes a shortcut, and it seems that the meaning of
tigerness is a property of the label itself. People argue about the correct
meaning of a label like “sound.” (Feel the Meaning)

17. You argue over the meanings of a word, even after all sides understand
perfectly well what the other sides are trying to say. The human ability
to associate labels to concepts is a tool for communication. When peo-
ple want to communicate, we’re hard to stop; if we have no common
language, we’ll draw pictures in sand. When you each understand what
is in the other’s mind, you are done. (The Argument From Common
Usage)

18. You pull out a dictionary in the middle of an empirical or moral argument.
Dictionary editors are historians of usage, not legislators of language. If
the common definition contains a problem—if “Mars” is defined as the
God of War, or a “dolphin” is defined as a kind of fish, or “Negroes” are
defined as a separate category from humans, the dictionary will reflect
the standard mistake. (The Argument From Common Usage)

19. You pull out a dictionary in the middle of any argument ever. Seriously,
what the heck makes you think that dictionary editors are an authority
on whether “atheism” is a “religion” or whatever? If you have any
substantive issue whatsoever at stake, do you really think dictionary
editors have access to ultimate wisdom that settles the argument? (The
Argument From Common Usage)

20. You defy common usage without a reason, making it gratuitously hard for
others to understand you. Fast stand up plutonium, with bagels without
handle. (The Argument From Common Usage)

21. You use complex renamings to create the illusion of inference. Is a “hu-
man” defined as a “mortal featherless biped”? Then write: “All [mortal
featherless bipeds] are mortal; Socrates is a [mortal featherless biped];
therefore, Socrates is mortal.” Looks less impressive that way, doesn’t it?
(Empty Labels)

22. You get into arguments that you could avoid if you just didn’t use the
word. If Albert and Barry aren’t allowed to use the word “sound,” then
Albert will have to say “A tree falling in a deserted forest generates
acoustic vibrations,” and Barry will say “A tree falling in a deserted forest
generates no auditory experiences.” When a word poses a problem, the
simplest solution is to eliminate the word and its synonyms. (Taboo
Your Words)

23. The existence of a neat little word prevents you from seeing the details of
the thing you’re trying to think about. What actually goes on in schools
once you stop calling it “education”? What’s a degree, once you stop
calling it a “degree”? If a coin lands “heads,” what’s its radial orientation?
What is “truth,” if you can’t say “accurate” or “correct” or “represent” or
“reflect” or “semantic” or “believe” or “knowledge” or “map” or “real”
or any other simple term? (Replace the Symbol with the Substance)

24. You have only one word, but there are two or more different things-in-
reality, so that all the facts about them get dumped into a single undifferen-
tiated mental bucket. It’s part of a detective’s ordinary work to observe
that Carol wore red last night, or that she has black hair; and it’s part
of a detective’s ordinary work to wonder if maybe Carol dyes her hair.
But it takes a subtler detective to wonder if there are two Carols, so that
the Carol who wore red is not the same as the Carol who had black hair.
(Fallacies of Compression)

25. You see patterns where none exist, harvesting other characteristics from
your definitions even when there is no similarity along that dimension. In
Japan, it is thought that people of blood type A are earnest and creative,
blood type Bs are wild and cheerful, blood type Os are agreeable and
sociable, and blood type ABs are cool and controlled. (Categorizing
Has Consequences)

26. You try to sneak in the connotations of a word, by arguing from a defini-
tion that doesn’t include the connotations. A “wiggin” is defined in the
dictionary as a person with green eyes and black hair. The word “wig-
gin” also carries the connotation of someone who commits crimes and
launches cute baby squirrels, but that part isn’t in the dictionary. So you
point to someone and say: “Green eyes? Black hair? See, told you he’s
a wiggin! Watch, next he’s going to steal the silverware.” (Sneaking in
Connotations)
27. You claim “X, by definition, is a Y !” On such occasions you’re almost
certainly trying to sneak in a connotation of Y that wasn’t in your given
definition. You define “human” as a “featherless biped,” and point to
Socrates and say, “No feathers—two legs—he must be human!” But what
you really care about is something else, like mortality. If what was in
dispute was Socrates’s number of legs, the other fellow would just reply,
“Whaddaya mean, Socrates’s got two legs? That’s what we’re arguing
about in the first place!” (Arguing “By Definition”)

28. You claim “Ps, by definition, are Qs!” If you see Socrates out in the field
with some biologists, gathering herbs that might confer resistance to
hemlock, there’s no point in arguing “Men, by definition, are mortal!”
The main time you feel the need to tighten the vise by insisting that
something is true “by definition” is when there’s other information that
calls the default inference into doubt. (Arguing “By Definition”)

29. You try to establish membership in an empirical cluster “by definition.” You wouldn’t feel the need to say, “Hinduism, by definition, is a religion!”
because, well, of course Hinduism is a religion. It’s not just a religion “by
definition,” it’s, like, an actual religion. Atheism does not resemble the
central members of the “religion” cluster, so if it wasn’t for the fact that
atheism is a religion by definition, you might go around thinking that
atheism wasn’t a religion. That’s why you’ve got to crush all opposition
by pointing out that “Atheism is a religion” is true by definition, because
it isn’t true any other way. (Arguing “By Definition”)

30. Your definition draws a boundary around things that don’t really belong
together. You can claim, if you like, that you are defining the word “fish”
to refer to salmon, guppies, sharks, dolphins, and trout, but not jellyfish
or algae. You can claim, if you like, that this is merely a list, and there is
no way a list can be “wrong.” Or you can stop playing games and admit
that you made a mistake and that dolphins don’t belong on the fish list.
(Where to Draw the Boundary?)

31. You use a short word for something that you won’t need to describe often,
or a long word for something you’ll need to describe often. This can result
in inefficient thinking, or even misapplications of Occam’s Razor, if your
mind thinks that short sentences sound “simpler.” Which sounds more
plausible, “God did a miracle” or “A supernatural universe-creating
entity temporarily suspended the laws of physics”? (Entropy, and Short
Codes)

32. You draw your boundary around a volume of space where there is no
greater-than-usual density, meaning that the associated word does not
correspond to any performable Bayesian inferences. Since green-eyed
people are not more likely to have black hair, or vice versa, and they
don’t share any other characteristics in common, why have a word for
“wiggin”? (Mutual Information, and Density in Thingspace)

33. You draw an unsimple boundary without any reason to do so. The act
of defining a word to refer to all humans, except black people, seems
kind of suspicious. If you don’t present reasons to draw that particular
boundary, trying to create an “arbitrary” word in that location is like
a detective saying: “Well, I haven’t the slightest shred of support one
way or the other for who could’ve murdered those orphans . . . but have
we considered John Q. Wiffleheim as a suspect?” (Superexponential
Conceptspace, and Simple Words)

34. You use categorization to make inferences about properties that don’t have
the appropriate empirical structure, namely, conditional independence
given knowledge of the class, to be well-approximated by Naive Bayes. No
way am I trying to summarize this one. Just read the essay. (Conditional
Independence, and Naive Bayes)

35. You think that words are like tiny little lisp symbols in your mind, rather
than words being labels that act as handles to direct complex mental
paintbrushes that can paint detailed pictures in your sensory workspace.
Visualize a “triangular lightbulb.” What did you see? (Words as Mental
Paintbrush Handles)

36. You use a word that has different meanings in different places as though
it meant the same thing on each occasion, possibly creating the illusion
of something protean and shifting. “Martin told Bob the building was
on his left.” But “left” is a function-word that evaluates with a speaker-
dependent variable grabbed from the surrounding context. Whose “left”
is meant, Bob’s or Martin’s? (Variable Question Fallacies)

37. You think that definitions can’t be “wrong,” or that “I can define a word any
way I like!” This kind of attitude teaches you to indignantly defend your
past actions, instead of paying attention to their consequences, or fessing
up to your mistakes. (37 Ways That Suboptimal Use Of Categories Can
Have Negative Side Effects On Your Cognition)

Everything you do in the mind has an effect, and your brain races ahead
unconsciously without your supervision.
Saying “Words are arbitrary; I can define a word any way I like” makes
around as much sense as driving a car over thin ice with the accelerator floored
and saying, “Looking at this steering wheel, I can’t see why one radial angle is
special—so I can turn the steering wheel any way I like.”
If you’re trying to go anywhere, or even just trying to survive, you had
better start paying attention to the three or six dozen optimality criteria that
control how you use words, definitions, categories, classes, boundaries, labels,
and concepts.

*
Interlude
An Intuitive Explanation of Bayes’s
Theorem

[Editor’s Note: This is an abridgement of the original version of this essay, which contained many interactive elements.]

Your friends and colleagues are talking about something called “Bayes’s Theo-
rem” or “Bayes’s Rule,” or something called Bayesian reasoning. They sound
really enthusiastic about it, too, so you google and find a web page about Bayes’s
Theorem and . . .
It’s this equation. That’s all. Just one equation. The page you found gives
a definition of it, but it doesn’t say what it is, or why it’s useful, or why your
friends would be interested in it. It looks like this random statistics thing.
Why does a mathematical concept generate this strange enthusiasm in its
students? What is the so-called Bayesian Revolution now sweeping through
the sciences, which claims to subsume even the experimental method itself as
a special case? What is the secret that the adherents of Bayes know? What is
the light that they have seen?
Soon you will know. Soon you will be one of us.
While there are a few existing online explanations of Bayes’s Theorem,
my experience with trying to introduce people to Bayesian reasoning is that
the existing online explanations are too abstract. Bayesian reasoning is very
counterintuitive. People do not employ Bayesian reasoning intuitively, find
it very difficult to learn Bayesian reasoning when tutored, and rapidly forget
Bayesian methods once the tutoring is over. This holds equally true for novice
students and highly trained professionals in a field. Bayesian reasoning is
apparently one of those things which, like quantum mechanics or the Wason
Selection Test, is inherently difficult for humans to grasp with our built-in
mental faculties.
Or so they claim. Here you will find an attempt to offer an intuitive ex-
planation of Bayesian reasoning—an excruciatingly gentle introduction that
invokes all the human ways of grasping numbers, from natural frequencies to
spatial visualization. The intent is to convey, not abstract rules for manipulat-
ing numbers, but what the numbers mean, and why the rules are what they are
(and cannot possibly be anything else). When you are finished reading this,
you will see Bayesian problems in your dreams.
And let’s begin.

Here’s a story problem about a situation that doctors often encounter:

1% of women at age forty who participate in routine screening have breast cancer. 80% of women with breast cancer will get
positive mammographies. 9.6% of women without breast cancer
will also get positive mammographies. A woman in this age group
had a positive mammography in a routine screening. What is the
probability that she actually has breast cancer?

What do you think the answer is? If you haven’t encountered this kind of
problem before, please take a moment to come up with your own answer
before continuing.

Next, suppose I told you that most doctors get the same wrong answer on this
problem—usually, only around 15% of doctors get it right. (“Really? 15%?
Is that a real number, or an urban legend based on an Internet poll?” It’s a
real number. See Casscells, Schoenberger, and Graboys 1978;1 Eddy 1982;2
Gigerenzer and Hoffrage 1995;3 and many other studies. It’s a surprising result
which is easy to replicate, so it’s been extensively replicated.)
On the story problem above, most doctors estimate the probability to be
between 70% and 80%, which is wildly incorrect.
Here’s an alternate version of the problem on which doctors fare somewhat
better:

10 out of 1,000 women at age forty who participate in routine screening have breast cancer. 800 out of 1,000 women with breast
cancer will get positive mammographies. 96 out of 1,000 women
without breast cancer will also get positive mammographies. If
1,000 women in this age group undergo a routine screening, about
what fraction of women with positive mammographies will actu-
ally have breast cancer?

And finally, here’s the problem on which doctors fare best of all, with 46%—
nearly half—arriving at the correct answer:

100 out of 10,000 women at age forty who participate in routine screening have breast cancer. 80 of every 100 women with breast
cancer will get a positive mammography. 950 out of 9,900 women
without breast cancer will also get a positive mammography. If
10,000 women in this age group undergo a routine screening,
about what fraction of women with positive mammographies will
actually have breast cancer?

The correct answer is 7.8%, obtained as follows: Out of 10,000 women, 100 have
breast cancer; 80 of those 100 have positive mammographies. From the same
10,000 women, 9,900 will not have breast cancer and of those 9,900 women, 950
will also get positive mammographies. This makes the total number of women
with positive mammographies 950 + 80 or 1,030. Of those 1,030 women with
positive mammographies, 80 will have cancer. Expressed as a proportion, this
is 80/1,030 or 0.07767 or 7.8%.
To put it another way, before the mammography screening, the 10,000
women can be divided into two groups:

• Group 1: 100 women with breast cancer.

• Group 2: 9,900 women without breast cancer.

Summing these two groups gives a total of 10,000 patients, confirming that
none have been lost in the math. After the mammography, the women can be
divided into four groups:

• Group A: 80 women with breast cancer and a positive mammography.

• Group B: 20 women with breast cancer and a negative mammography.

• Group C: 950 women without breast cancer and a positive mammography.

• Group D: 8,950 women without breast cancer and a negative mammography.

The sum of groups A and B, the groups with breast cancer, corresponds to
group 1; and the sum of groups C and D, the groups without breast cancer,
corresponds to group 2. If you administer a mammography to 10,000 pa-
tients, then out of the 1,030 with positive mammographies, eighty of those
positive-mammography patients will have cancer. This is the correct answer,
the answer a doctor should give a positive-mammography patient if she asks
about the chance she has breast cancer; if thirteen patients ask this question,
roughly one out of those thirteen will have cancer.
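If you would rather let a few lines of code do the counting, here is the same calculation as a small Python sketch, first with the natural-frequency numbers and then with the original percentages:

# The natural-frequency version, done by counting (100 of 10,000 women have
# breast cancer; 80 of those test positive; 950 of the 9,900 healthy women
# also test positive).
positive_and_cancer = 80
positive_and_healthy = 950
all_positives = positive_and_cancer + positive_and_healthy        # 1,030
print(positive_and_cancer / all_positives)                        # ~0.0777, i.e. about 7.8%

# The same answer from the probability version of the problem.
prior = 0.01                      # P(cancer)
p_pos_given_cancer = 0.80         # P(positive | cancer)
p_pos_given_healthy = 0.096       # P(positive | no cancer)
posterior = (prior * p_pos_given_cancer) / (
    prior * p_pos_given_cancer + (1 - prior) * p_pos_given_healthy)
print(posterior)                  # ~0.078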

The most common mistake is to ignore the original fraction of women with
breast cancer, and the fraction of women without breast cancer who receive
false positives, and focus only on the fraction of women with breast cancer
who get positive results. For example, the vast majority of doctors in these
studies seem to have thought that if around 80% of women with breast cancer
have positive mammographies, then the probability of a woman with a positive
mammography having breast cancer must be around 80%.
Figuring out the final answer always requires all three pieces of
information—the percentage of women with breast cancer, the percentage
of women without breast cancer who receive false positives, and the percentage
of women with breast cancer who receive (correct) positives.
The original proportion of patients with breast cancer is known as the prior
probability. The chance that a patient with breast cancer gets a positive mam-
mography, and the chance that a patient without breast cancer gets a positive
mammography, are known as the two conditional probabilities. Collectively,
this initial information is known as the priors. The final answer—the estimated
probability that a patient has breast cancer, given that we know she has a posi-
tive result on her mammography—is known as the revised probability or the
posterior probability. What we’ve just seen is that the posterior probability
depends in part on the prior probability.
To see that the final answer always depends on the original fraction of
women with breast cancer, consider an alternate universe in which only one
woman out of a million has breast cancer. Even if mammography in this world
detects breast cancer in 8 out of 10 cases, while returning a false positive on
a woman without breast cancer in only 1 out of 10 cases, there will still be a
hundred thousand false positives for every real case of cancer detected. The
original probability that a woman has cancer is so extremely low that, although
a positive result on the mammography does increase the estimated probability,
the probability isn’t increased to certainty or even “a noticeable chance”; the
probability goes from 1:1,000,000 to 1:100,000.
What this demonstrates is that the mammography result doesn’t replace
your old information about the patient’s chance of having cancer; the mam-
mography slides the estimated probability in the direction of the result. A
positive result slides the original probability upward; a negative result slides
the probability downward. For example, in the original problem where 1%
of the women have cancer, 80% of women with cancer get positive mammo-
graphies, and 9.6% of women without cancer get positive mammographies, a
positive result on the mammography slides the 1% chance upward to 7.8%.
Most people encountering problems of this type for the first time carry
out the mental operation of replacing the original 1% probability with the
80% probability that a woman with cancer gets a positive mammography. It
may seem like a good idea, but it just doesn’t work. “The probability that
a woman with a positive mammography has breast cancer” is not at all the
same thing as “the probability that a woman with breast cancer has a positive
mammography”; they are as unlike as apples and cheese.

Q. Why did the Bayesian reasoner cross the road?
A. You need more information to answer this question.

Suppose that a barrel contains many small plastic eggs. Some eggs are painted
red and some are painted blue. 40% of the eggs in the bin contain pearls, and
60% contain nothing. 30% of eggs containing pearls are painted blue, and 10%
of eggs containing nothing are painted blue. What is the probability that a blue
egg contains a pearl? For this example the arithmetic is simple enough that
you may be able to do it in your head, and I would suggest trying to do so.
A more compact way of specifying the problem:

P (pearl) = 40%
P (blue|pearl) = 30%
P (blue|¬pearl) = 10%
P (pearl|blue) = ?

The symbol “¬” is shorthand for “not,” so ¬pearl reads “not pearl.”
The notation P (blue|pearl) is shorthand for “the probability of blue given
pearl” or “the probability that an egg is painted blue, given that the egg con-
tains a pearl.” The item on the right side is what you already know or the
premise, and the item on the left side is the implication or conclusion. If we have
P (blue|pearl) = 30%, and we already know that some egg contains a pearl,
then we can conclude there is a 30% chance that the egg is painted blue. Thus,
the final fact we’re looking for—“the chance that a blue egg contains a pearl”
or “the probability that an egg contains a pearl, if we know the egg is painted
blue”—reads P (pearl|blue).
40% of the eggs contain pearls, and 60% of the eggs contain nothing. 30%
of the eggs containing pearls are painted blue, so 12% of the eggs altogether
contain pearls and are painted blue. 10% of the eggs containing nothing are
painted blue, so altogether 6% of the eggs contain nothing and are painted
blue. A total of 18% of the eggs are painted blue, and a total of 12% of the eggs
are painted blue and contain pearls, so the chance a blue egg contains a pearl
is 12/18 or 2/3 or around 67%.
As before, we can see the necessity of all three pieces of information by
considering extreme cases. In a (large) barrel in which only one egg out of
a thousand contains a pearl, knowing that an egg is painted blue slides the
probability from 0.1% to 0.3% (instead of sliding the probability from 40% to
67%). Similarly, if 999 out of 1,000 eggs contain pearls, knowing that an egg is
blue slides the probability from 99.9% to 99.966%; the probability that the egg
does not contain a pearl goes from 1/1,000 to around 1/3,000.
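The same check works for the pearl-egg problem; a few lines of Python reproduce both the 2/3 answer and the extreme-prior cases just described:

def p_pearl_given_blue(p_pearl, p_blue_given_pearl=0.30, p_blue_given_empty=0.10):
    """Posterior probability that an egg contains a pearl, given that it is blue."""
    blue_and_pearl = p_pearl * p_blue_given_pearl
    blue_and_empty = (1 - p_pearl) * p_blue_given_empty
    return blue_and_pearl / (blue_and_pearl + blue_and_empty)

print(p_pearl_given_blue(0.40))    # 0.666..., the 2/3 answer
print(p_pearl_given_blue(0.001))   # ~0.003: the one-pearl-in-a-thousand barrel
print(p_pearl_given_blue(0.999))   # ~0.99967: the 999-in-1,000 barrel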
On the pearl-egg problem, most respondents unfamiliar with Bayesian
reasoning would probably respond that the probability a blue egg contains a
pearl is 30%, or perhaps 20% (the 30% chance of a true positive minus the 10%
chance of a false positive). Even if this mental operation seems like a good
idea at the time, it makes no sense in terms of the question asked. It’s like the
experiment in which you ask a second-grader: “If eighteen people get on a bus,
and then seven more people get on the bus, how old is the bus driver?” Many
second-graders will respond: “Twenty-five.” They understand when they’re
being prompted to carry out a particular mental procedure, but they haven’t
quite connected the procedure to reality. Similarly, to find the probability that
a woman with a positive mammography has breast cancer, it makes no sense
whatsoever to replace the original probability that the woman has cancer with
the probability that a woman with breast cancer gets a positive mammography.
Neither can you subtract the probability of a false positive from the probability
of the true positive. These operations are as wildly irrelevant as adding the
number of people on the bus to find the age of the bus driver.
A study by Gigerenzer and Hoffrage in 1995 showed that some ways of phrasing
story problems are much more evocative of correct Bayesian reasoning.4 The
least evocative phrasing used probabilities. A slightly more evocative phrasing
used frequencies instead of probabilities; the problem remained the same, but
instead of saying that 1% of women had breast cancer, one would say that 1 out
of 100 women had breast cancer, that 80 out of 100 women with breast cancer
would get a positive mammography, and so on. Why did a higher proportion
of subjects display Bayesian reasoning on this problem? Probably because
saying “1 out of 100 women” encourages you to concretely visualize X women
with cancer, leading you to visualize X women with cancer and a positive
mammography, etc.
The most effective presentation found so far is what’s known as natural
frequencies—saying that 40 out of 100 eggs contain pearls, 12 out of 40 eggs
containing pearls are painted blue, and 6 out of 60 eggs containing nothing
are painted blue. A natural frequencies presentation is one in which the infor-
mation about the prior probability is included in presenting the conditional
probabilities. If you were just learning about the eggs’ conditional probabilities
through natural experimentation, you would—in the course of cracking open
a hundred eggs—crack open around 40 eggs containing pearls, of which 12
eggs would be painted blue, while cracking open 60 eggs containing nothing,
of which about 6 would be painted blue. In the course of learning the condi-
tional probabilities, you’d see examples of blue eggs containing pearls about
twice as often as you saw examples of blue eggs containing nothing.
Unfortunately, while natural frequencies are a step in the right direction, it
probably won’t be enough. When problems are presented in natural frequen-
cies, the proportion of people using Bayesian reasoning rises to around half. A
big improvement, but not big enough when you’re talking about real doctors
and real patients.

Q. How can I find the priors for a problem?
A. Many commonly used priors are listed in the Handbook of
Chemistry and Physics.
Q. Where do priors originally come from?
A. Never ask that question.
Q. Uh huh. Then where do scientists get their priors?
A. Priors for scientific problems are established by annual vote of
the AAAS. In recent years the vote has become fractious and con-
troversial, with widespread acrimony, factional polarization, and
several outright assassinations. This may be a front for infighting
within the Bayes Council, or it may be that the disputants have
too much spare time. No one is really sure.
Q. I see. And where does everyone else get their priors?
A. They download their priors from Kazaa.
Q. What if the priors I want aren’t available on Kazaa?
A. There’s a small, cluttered antique shop in a back alley of San
Francisco’s Chinatown. Don’t ask about the bronze rat.

Actually, priors are true or false just like the final answer—they reflect reality
and can be judged by comparing them against reality. For example, if you think
that 920 out of 10,000 women in a sample have breast cancer, and the actual
number is 100 out of 10,000, then your priors are wrong. For our particular
problem, the priors might have been established by three studies—a study on
the case histories of women with breast cancer to see how many of them tested
positive on a mammography, a study on women without breast cancer to see
how many of them test positive on a mammography, and an epidemiological
study on the prevalence of breast cancer in some specific demographic.

The probability P (A, B) is the same as P (B, A), but P (A|B) is not the same
thing as P (B|A), and P (A, B) is completely different from P (A|B). It’s a
common confusion to mix up some or all of these quantities.
To get acquainted with all the relationships between them, we’ll play “fol-
low the degrees of freedom.” For example, the two quantities P (cancer) and
P (¬cancer) have one degree of freedom between them, because of the gen-
eral law P (A) + P (¬A) = 1. If you know that P (¬cancer) = 0.99, you can
obtain P (cancer) = 1 − P (¬cancer) = 0.01.
The quantities P (positive|cancer) and P (¬positive|cancer) also have only
one degree of freedom between them; either a woman with breast cancer gets a
positive mammography or she doesn’t. On the other hand, P (positive|cancer)
and P (positive|¬cancer) have two degrees of freedom. You can have a mam-
mography test that returns positive for 80% of cancer patients and 9.6% of
healthy patients, or that returns positive for 70% of cancer patients and 2% of
healthy patients, or even a health test that returns “positive” for 30% of can-
cer patients and 92% of healthy patients. The two quantities, the output of
the mammography test for cancer patients and the output of the mammog-
raphy test for healthy patients, are in mathematical terms independent; one
cannot be obtained from the other in any way, and so they have two degrees of
freedom between them.
What about P (positive, cancer), P (positive|cancer), and P (cancer)?
Here we have three quantities; how many degrees of freedom are there? In this
case the equation that must hold is

P (positive, cancer) = P (positive|cancer) × P (cancer) .

This equality reduces the degrees of freedom by one. If we know the fraction
of patients with cancer, and the chance that a cancer patient has a positive
mammography, we can deduce the fraction of patients who have breast cancer
and a positive mammography by multiplying.
Similarly, if we know the number of patients with breast cancer and positive
mammographies, and also the number of patients with breast cancer, we can
estimate the chance that a woman with breast cancer gets a positive mammog-
raphy by dividing: P (positive|cancer) = P (positive, cancer)/P (cancer). In
fact, this is exactly how such medical diagnostic tests are calibrated; you do a
study on 8,520 women with breast cancer and see that there are 6,816 (or there-
abouts) women with breast cancer and positive mammographies, then divide
6,816 by 8,520 to find that 80% of women with breast cancer had positive mam-
mographies. (Incidentally, if you accidentally divide 8,520 by 6,816 instead of
the other way around, your calculations will start doing strange things, such
as insisting that 125% of women with breast cancer and positive mammogra-
phies have breast cancer. This is a common mistake in carrying out Bayesian
arithmetic, in my experience.) And finally, if you know P (positive, cancer)
and P (positive|cancer), you can deduce how many cancer patients there must
have been originally. There are two degrees of freedom shared out among the
three quantities; if we know any two, we can deduce the third.
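As a small illustrative sketch (mine, not the text's), the multiply-and-divide relationship can be checked directly with the numbers quoted above:

```python
# Multiplying the prior by the conditional gives the joint probability:
p_cancer = 0.01                      # prior fraction of patients with cancer
p_positive_given_cancer = 0.80       # chance a cancer patient tests positive
p_positive_and_cancer = p_positive_given_cancer * p_cancer
print(round(p_positive_and_cancer, 3))   # 0.008, i.e. 0.8% of all patients

# Dividing recovers the conditional, which is how the test is calibrated:
cancer_patients = 8_520              # study counts quoted above
positive_and_cancer = 6_816
print(positive_and_cancer / cancer_patients)   # 0.8 -- the 80% hit rate
```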
How about P (positive), P (positive, cancer), and P (positive, ¬cancer)?
Again there are only two degrees of freedom among these three variables. The
equation occupying the extra degree of freedom is

P (positive) = P (positive, cancer) + P (positive, ¬cancer) .

This is how P (positive) is computed to begin with; we figure out the num-
ber of women with breast cancer who have positive mammographies, and
the number of women without breast cancer who have positive mammo-
graphies, then add them together to get the total number of women with
positive mammographies. It would be very strange to go out and conduct a
study to determine the number of women with positive mammographies—
just that one number and nothing else—but in theory you could do so.
And if you then conducted another study and found the number of those
women who had positive mammographies and breast cancer, you would also
know the number of women with positive mammographies and no breast
cancer—either a woman with a positive mammography has breast cancer or
she doesn’t. In general, P (A, B) + P (A, ¬B) = P (A). Symmetrically,
P (A, B) + P (¬A, B) = P (B).
What about P (positive, cancer), P (positive, ¬cancer), P (¬positive,
cancer), and P (¬positive, ¬cancer)? You might at first be tempted to think
that there are only two degrees of freedom for these four quantities—that
you can, for example, get P (positive, ¬cancer) by multiplying P (positive) ×
P (¬cancer), and thus that all four quantities can be found given only
the two quantities P (positive) and P (cancer). This is not the case!
P (positive, ¬cancer) = P (positive) × P (¬cancer) only if the two proba-
bilities are statistically independent—if the chance that a woman has breast
cancer has no bearing on whether she has a positive mammography. This
amounts to requiring that the two conditional probabilities be equal to each
other—a requirement which would eliminate one degree of freedom. If you
remember that these four quantities are the groups A, B, C, and D, you can
look over those four groups and realize that, in theory, you can put any num-
ber of people into the four groups. If you start with a group of 80 women with
breast cancer and positive mammographies, there’s no reason why you can’t
add another group of 500 women with breast cancer and negative mammo-
graphies, followed by a group of 3 women without breast cancer and negative
mammographies, and so on. So now it seems like the four quantities have
four degrees of freedom. And they would, except that in expressing them as
probabilities, we need to normalize them to fractions of the complete group,
which adds the constraint that P (positive, cancer) + P (positive, ¬cancer) +
P (¬positive, cancer) + P (¬positive, ¬cancer) = 1. This equation takes up
one degree of freedom, leaving three degrees of freedom among the four quan-
tities. If you specify the fractions of women in groups A, B, and D, you can
deduce the fraction of women in group C.
Given the four groups A, B, C, and D, it is very straightforward to compute
everything else:
P (cancer) = (A + B) / (A + B + C + D)

P (¬positive|cancer) = B / (A + B) ,
and so on. Since {A, B, C, D} contains three degrees of freedom, it follows
that the entire set of probabilities relating cancer rates to test results contains
only three degrees of freedom. Remember that in our problems we always
needed three pieces of information—the prior probability and the two con-
ditional probabilities—which, indeed, have three degrees of freedom among
them. Actually, for Bayesian problems, any three quantities with three degrees
of freedom between them should logically specify the entire problem.
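Here is a minimal sketch (the variable names are mine) of how the three familiar quantities pin down the whole 2×2 table of groups A, B, C, and D, and therefore everything else:

```python
p_cancer = 0.01                     # prior
p_pos_given_cancer = 0.80           # hit rate
p_pos_given_healthy = 0.096         # false positive rate

# The four groups as fractions of all women:
A = p_pos_given_cancer * p_cancer                 # cancer, positive
B = (1 - p_pos_given_cancer) * p_cancer           # cancer, negative
C = p_pos_given_healthy * (1 - p_cancer)          # healthy, positive
D = (1 - p_pos_given_healthy) * (1 - p_cancer)    # healthy, negative

assert abs(A + B + C + D - 1.0) < 1e-12           # normalization constraint

p_positive = A + C
p_cancer_given_positive = A / (A + C)
print(round(p_cancer_given_positive, 3))          # ~0.078, i.e. 7.8%
```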

The probability that a test gives a true positive divided by the probability that
a test gives a false positive is known as the likelihood ratio of that test. The
likelihood ratio for a positive result summarizes how much a positive result
will slide the prior probability. Does the likelihood ratio of a medical test then
sum up everything there is to know about the usefulness of the test?
No, it does not! The likelihood ratio sums up everything there is to know
about the meaning of a positive result on the medical test, but the meaning of a
negative result on the test is not specified, nor is the frequency with which the
test is useful. For example, a mammography with a hit rate of 80% for patients
with breast cancer and a false positive rate of 9.6% for healthy patients has the
same likelihood ratio as a test with an 8% hit rate and a false positive rate of
0.96%. Although these two tests have the same likelihood ratio, the first test is
more useful in every way—it detects disease more often, and a negative result
is stronger evidence of health.
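A quick sketch (function names are mine) makes the point concrete: the two tests give identical posteriors after a positive result, but very different posteriors after a negative one.

```python
def posterior(prior, p_pos_given_sick, p_pos_given_healthy):
    """P(sick | positive) by Bayes's Theorem."""
    num = p_pos_given_sick * prior
    return num / (num + p_pos_given_healthy * (1 - prior))

def posterior_negative(prior, p_pos_given_sick, p_pos_given_healthy):
    """P(sick | negative): how reassuring a negative result is."""
    num = (1 - p_pos_given_sick) * prior
    return num / (num + (1 - p_pos_given_healthy) * (1 - prior))

prior = 0.01
# Test 1: 80% hit rate, 9.6% false positives.  Test 2: 8% and 0.96%.
print(posterior(prior, 0.80, 0.096))           # ~0.078 after a positive
print(posterior(prior, 0.08, 0.0096))          # ~0.078 too -- same likelihood ratio

print(posterior_negative(prior, 0.80, 0.096))  # ~0.0022 -- strong evidence of health
print(posterior_negative(prior, 0.08, 0.0096)) # ~0.0093 -- barely moves the prior
```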

Suppose that you apply two tests for breast cancer in succession—say, a standard
mammography and also some other test which is independent of mammogra-
phy. Since I don’t know of any such test that is independent of mammography,
I’ll invent one for the purpose of this problem, and call it the Tams-Braylor
Division Test, which checks to see if any cells are dividing more rapidly than
other cells. We’ll suppose that the Tams-Braylor gives a true positive for 90%
of patients with breast cancer, and gives a false positive for 5% of patients with-
out cancer. Let’s say the prior prevalence of breast cancer is 1%. If a patient
gets a positive result on her mammography and her Tams-Braylor, what is the
revised probability she has breast cancer?
One way to solve this problem would be to take the revised probability for
a positive mammography, which we already calculated as 7.8%, and plug that
into the Tams-Braylor test as the new prior probability. If we do this, we find
that the result comes out to 60%.
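The chaining step can be sketched in a few lines (the Tams-Braylor numbers are the invented ones from the text; the helper function is mine):

```python
def update(prior, p_pos_given_sick, p_pos_given_healthy):
    num = p_pos_given_sick * prior
    return num / (num + p_pos_given_healthy * (1 - prior))

after_mammography = update(0.01, 0.80, 0.096)           # ~0.078
after_tams_braylor = update(after_mammography, 0.90, 0.05)
print(round(after_tams_braylor, 3))                     # ~0.602, i.e. roughly 60%
```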
Suppose that the prior prevalence of breast cancer in a demographic is 1%.
Suppose that we, as doctors, have a repertoire of three independent tests for
breast cancer. Our first test, test A, a mammography, has a likelihood ratio
of 80%/9.6% = 8.33. The second test, test B, has a likelihood ratio of 18.0
(for example, from 90% versus 5%); and the third test, test C, has a likelihood
ratio of 3.5 (which could be from 70% versus 20%, or from 35% versus 10%; it
makes no difference). Suppose a patient gets a positive result on all three tests.
What is the probability the patient has breast cancer?
Here’s a fun trick for simplifying the bookkeeping. If the prior prevalence
of breast cancer in a demographic is 1%, then 1 out of 100 women have breast
cancer, and 99 out of 100 women do not have breast cancer. So if we rewrite
the probability of 1% as an odds ratio, the odds are 1:99.
And the likelihood ratios of the three tests A, B, and C are:

8.33 : 1 = 25 : 3
18.0 : 1 = 18 : 1
3.5 : 1 = 7 : 2 .

The odds for women with breast cancer who score positive on all three tests,
versus women without breast cancer who score positive on all three tests, will
equal:
1 × 25 × 18 × 7 : 99 × 3 × 1 × 2 = 3150 : 594 .
To recover the probability from the odds, we just write:

3150/(3150 + 594) = 84% .

This always works regardless of how the odds ratios are written; i.e., 8.33:1
is just the same as 25:3 or 75:9. It doesn’t matter in what order the tests are
administered, or in what order the results are computed. The proof is left as an
exercise for the reader.
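As a sketch of the bookkeeping trick (not a proof), multiply the prior odds by each likelihood ratio in turn; the order of the tests makes no difference:

```python
prior_odds = (1, 99)                        # 1% prior written as odds
tests = [(25, 3), (18, 1), (7, 2)]          # likelihood ratios of tests A, B, C

num, den = prior_odds
for lr_num, lr_den in tests:
    num *= lr_num
    den *= lr_den

print(num, ":", den)                        # 3150 : 594
print(round(num / (num + den), 2))          # 0.84, i.e. 84%
```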

E. T. Jaynes, in Probability Theory With Applications in Science and Engineering, suggests that credibility and evidence should be measured in decibels.5
Decibels?
Decibels are used for measuring exponential differences of intensity. For
example, if the sound from an automobile horn carries 10,000 times as much
energy (per square meter per second) as the sound from an alarm clock, the
automobile horn would be 40 decibels louder. The sound of a bird singing
might carry 1,000 times less energy than an alarm clock, and hence would be
30 decibels softer. To get the number of decibels, you take the logarithm base
10 and multiply by 10:

decibels = 10 log10 (intensity)

or

intensity = 10^(decibels/10) .

Suppose we start with a prior probability of 1% that a woman has breast cancer,
corresponding to an odds ratio of 1:99. And then we administer three tests of
likelihood ratios 25:3, 18:1, and 7:2. You could multiply those numbers . . . or
you could just add their logarithms:

10 log10 (1/99) ≈ −20
10 log10 (25/3) ≈ 9
10 log10 (18/1) ≈ 13
10 log10 (7/2) ≈ 5 .

It starts out as fairly unlikely that a woman has breast cancer—our credibility
level is at −20 decibels. Then three test results come in, corresponding to 9,
13, and 5 decibels of evidence. This raises the credibility level by a total of 27
decibels, meaning that the prior credibility of −20 decibels goes to a posterior
credibility of 7 decibels. So the odds go from 1:99 to 5:1, and the probability
goes from 1% to around 83%.
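A short sketch of the decibel bookkeeping (function names are mine): convert odds to decibels, add, and convert back. Rounding to whole decibels reproduces the "around 83%" figure; carrying full precision gives the exact 84%.

```python
from math import log10

def to_decibels(odds):
    return 10 * log10(odds)

def from_decibels(db):
    return 10 ** (db / 10)

prior_db = to_decibels(1 / 99)                              # ~ -20
evidence_db = [to_decibels(r) for r in (25/3, 18/1, 7/2)]   # ~ 9, 13, 5

posterior_db = prior_db + sum(evidence_db)
posterior_odds = from_decibels(posterior_db)                # ~ 5.3 : 1
print(round(posterior_odds / (1 + posterior_odds), 3))      # ~ 0.841
```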

You are a mechanic for gizmos. When a gizmo stops working, it is
due to a blocked hose 30% of the time. If a gizmo’s hose is blocked,
there is a 45% probability that prodding the gizmo will produce
sparks. If a gizmo’s hose is unblocked, there is only a 5% chance
that prodding the gizmo will produce sparks. A customer brings
you a malfunctioning gizmo. You prod the gizmo and find that it
produces sparks. What is the probability that a spark-producing
gizmo has a blocked hose?

What is the sequence of arithmetical operations that you performed to solve this problem?

(45% × 30%)/(45% × 30% + 5% × 70%)
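As a quick check of that arithmetic (a sketch of mine; the text leaves the numerical answer implicit, and it works out to about 79%):

```python
p_blocked = 0.30
p_sparks_given_blocked = 0.45
p_sparks_given_unblocked = 0.05

numerator = p_sparks_given_blocked * p_blocked
denominator = numerator + p_sparks_given_unblocked * (1 - p_blocked)
print(round(numerator / denominator, 3))   # ~0.794, i.e. about 79%
```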


Similarly, to find the chance that a woman with positive mammography has
breast cancer, we computed:

P (positive|cancer) × P (cancer)
/ [P (positive|cancer) × P (cancer) + P (positive|¬cancer) × P (¬cancer)]

which is

P (positive, cancer) / [P (positive, cancer) + P (positive, ¬cancer)]

which is

P (positive, cancer) / P (positive)

which is

P (cancer|positive) .

The fully general form of this calculation is known as Bayes’s Theorem or Bayes’s
Rule.

P (A|X) = P (X|A) × P (A) / [P (X|A) × P (A) + P (X|¬A) × P (¬A)] .

When there is some phenomenon A that we want to investigate, and an observation X that is evidence about A—for example, in the previous example, A
is breast cancer and X is a positive mammography—Bayes’s Theorem tells us
how we should update our probability of A, given the new evidence X.
By this point, Bayes’s Theorem may seem blatantly obvious or even tau-
tological, rather than exciting and new. If so, this introduction has entirely
succeeded in its purpose.
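Written out as a function (the names are mine, not the text's), the whole rule is a few lines:

```python
def bayes(prior_A, p_X_given_A, p_X_given_not_A):
    """P(A | X) from the prior and the two conditional probabilities."""
    numerator = p_X_given_A * prior_A
    return numerator / (numerator + p_X_given_not_A * (1 - prior_A))

# The mammography example: A = breast cancer, X = positive mammography.
print(round(bayes(0.01, 0.80, 0.096), 3))   # ~0.078
```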

Bayes’s Theorem describes what makes something “evidence” and how much
evidence it is. Statistical models are judged by comparison to the Bayesian
method because, in statistics, the Bayesian method is as good as it gets—the
Bayesian method defines the maximum amount of mileage you can get out
of a given piece of evidence, in the same way that thermodynamics defines
the maximum amount of work you can get out of a temperature differential.
This is why you hear cognitive scientists talking about Bayesian reasoners. In
cognitive science, Bayesian reasoner is the technically precise code word that
we use to mean rational mind.
There are also a number of general heuristics about human reasoning that
you can learn from looking at Bayes’s Theorem.
For example, in many discussions of Bayes’s Theorem, you may hear cogni-
tive psychologists saying that people do not take prior frequencies sufficiently
into account, meaning that when people approach a problem where there’s
some evidence X indicating that condition A might hold true, they tend to
judge A’s likelihood solely by how well the evidence X seems to match A,
without taking into account the prior frequency of A. If you think, for exam-
ple, that under the mammography example, the woman’s chance of having
breast cancer is in the range of 70%–80%, then this kind of reasoning is insen-
sitive to the prior frequency given in the problem; it doesn’t notice whether
1% of women or 10% of women start out having breast cancer. “Pay more at-
tention to the prior frequency!” is one of the many things that humans need to
bear in mind to partially compensate for our built-in inadequacies.
A related error is to pay too much attention to P (X|A) and not enough to
P (X|¬A) when determining how much evidence X is for A. The degree to
which a result X is evidence for A depends not only on the strength of the state-
ment we’d expect to see result X if A were true, but also on the strength of the
statement we wouldn’t expect to see result X if A weren’t true. For example, if it
is raining, this very strongly implies the grass is wet—P (wetgrass|rain) ≈ 1—
but seeing that the grass is wet doesn’t necessarily mean that it has just rained;
perhaps the sprinkler was turned on, or you’re looking at the early morning dew.
Since P (wetgrass|¬rain) is substantially greater than zero, P (rain|wetgrass)
is substantially less than one. On the other hand, if the grass was never wet
when it wasn’t raining, then knowing that the grass was wet would always show
that it was raining, P (rain|wetgrass) ≈ 1, even if P (wetgrass|rain) = 50%;
that is, even if the grass only got wet 50% of the times it rained. Evidence is
always the result of the differential between the two conditional probabilities.
Strong evidence is not the product of a very high probability that A leads to X,
but the product of a very low probability that not-A could have led to X.
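A sketch of the wet-grass point, with an assumed 20% prior for rain that is not from the text, shows that the strength of the evidence tracks the gap between the two conditional probabilities:

```python
def p_rain_given_wet(prior_rain, p_wet_given_rain, p_wet_given_no_rain):
    num = p_wet_given_rain * prior_rain
    return num / (num + p_wet_given_no_rain * (1 - prior_rain))

prior = 0.20   # assumed prior probability of rain (illustrative only)

# Grass is almost always wet when it rains, but sprinklers and dew also wet it:
print(p_rain_given_wet(prior, 0.99, 0.30))   # ~0.45 -- wet grass is weak evidence

# Grass is wet only half the time it rains, but never otherwise:
print(p_rain_given_wet(prior, 0.50, 0.0))    # 1.0 -- wet grass proves rain
```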
The Bayesian revolution in the sciences is fueled, not only by more and more
cognitive scientists suddenly noticing that mental phenomena have Bayesian
structure in them; not only by scientists in every field learning to judge their
statistical methods by comparison with the Bayesian method; but also by the
idea that science itself is a special case of Bayes’s Theorem; experimental evidence
is Bayesian evidence. The Bayesian revolutionaries hold that when you perform
an experiment and get evidence that “confirms” or “disconfirms” your theory,
this confirmation and disconfirmation is governed by the Bayesian rules. For
example, you have to take into account not only whether your theory predicts
the phenomenon, but whether other possible explanations also predict the
phenomenon.
Previously, the most popular philosophy of science was probably Karl Pop-
per’s falsificationism—this is the old philosophy that the Bayesian revolution is
currently dethroning. Karl Popper’s idea that theories can be definitely falsi-
fied, but never definitely confirmed, is yet another special case of the Bayesian
rules; if P (X|A) ≈ 1—if the theory makes a definite prediction—then ob-
serving ¬X very strongly falsifies A. On the other hand, if P (X|A) ≈ 1, and
we observe X, this doesn’t definitely confirm the theory; there might be some
other condition B such that P (X|B) ≈ 1, in which case observing X doesn’t
favor A over B. For observing X to definitely confirm A, we would have to
know, not that P (X|A) ≈ 1, but that P (X|¬A) ≈ 0, which is something
that we can’t know because we can’t range over all possible alternative expla-
nations. For example, when Einstein’s theory of General Relativity toppled
Newton’s incredibly well-confirmed theory of gravity, it turned out that all of
Newton’s predictions were just a special case of Einstein’s predictions.
You can even formalize Popper’s philosophy mathematically. The likeli-
hood ratio for X, the quantity P (X|A)/P (X|¬A), determines how much
observing X slides the probability for A; the likelihood ratio is what says how
strong X is as evidence. Well, in your theory A, you can predict X with prob-
ability 1, if you like; but you can’t control the denominator of the likelihood
ratio, P (X|¬A)—there will always be some alternative theories that also pre-
dict X, and while we go with the simplest theory that fits the current evidence,
you may someday encounter some evidence that an alternative theory predicts
but your theory does not. That’s the hidden gotcha that toppled Newton’s
theory of gravity. So there’s a limit on how much mileage you can get from
successful predictions; there’s a limit on how high the likelihood ratio goes for
confirmatory evidence.
On the other hand, if you encounter some piece of evidence Y that is
definitely not predicted by your theory, this is enormously strong evidence
against your theory. If P (Y |A) is infinitesimal, then the likelihood ratio will
also be infinitesimal. For example, if P (Y |A) is 0.0001%, and P (Y |¬A) is
1%, then the likelihood ratio P (Y |A)/P (Y |¬A) will be 1:10,000. That’s −40
decibels of evidence! Or, flipping the likelihood ratio, if P (Y |A) is very small,
then P (Y |¬A)/P (Y |A) will be very large, meaning that observing Y greatly
favors ¬A over A. Falsification is much stronger than confirmation. This is a
consequence of the earlier point that very strong evidence is not the product
of a very high probability that A leads to X, but the product of a very low
probability that not-A could have led to X. This is the precise Bayesian rule
that underlies the heuristic value of Popper’s falsificationism.
Similarly, Popper’s dictum that an idea must be falsifiable can be inter-
preted as a manifestation of the Bayesian conservation-of-probability rule; if a
result X is positive evidence for the theory, then the result ¬X would have
disconfirmed the theory to some extent. If you try to interpret both X and
¬X as “confirming” the theory, the Bayesian rules say this is impossible! To
increase the probability of a theory you must expose it to tests that can po-
tentially decrease its probability; this is not just a rule for detecting would-be
cheaters in the social process of science, but a consequence of Bayesian proba-
bility theory. On the other hand, Popper’s idea that there is only falsification
and no such thing as confirmation turns out to be incorrect. Bayes’s Theorem
shows that falsification is very strong evidence compared to confirmation, but
falsification is still probabilistic in nature; it is not governed by fundamentally
different rules from confirmation, as Popper argued.
So we find that many phenomena in the cognitive sciences, plus the statisti-
cal methods used by scientists, plus the scientific method itself, are all turning
out to be special cases of Bayes’s Theorem. Hence the Bayesian revolution.

Having introduced Bayes’s Theorem explicitly, we can explicitly discuss its components.

P (A|X) = P (X|A) × P (A) / [P (X|A) × P (A) + P (X|¬A) × P (¬A)]

We’ll start with P (A|X). If you ever find yourself getting confused about
what’s A and what’s X in Bayes’s Theorem, start with P (A|X) on the left
side of the equation; that’s the simplest part to interpret. In P (A|X), A is the
thing we want to know about. X is how we’re observing it; X is the evidence
we’re using to make inferences about A. Remember that for every expression
P (Q|P ), we want to know about the probability for Q given P, the degree
to which P implies Q—a more sensible notation, which it is now too late to
adopt, would be P (Q ← P ).
P (Q|P ) is closely related to P (Q, P ), but they are not identical. Expressed
as a probability or a fraction, P (Q, P ) is the proportion of things that have
property Q and property P among all things; e.g., the proportion of “women
with breast cancer and a positive mammography” within the group of all
women. If the total number of women is 10,000, and 80 women have breast
cancer and a positive mammography, then P (Q, P ) is 80/10,000 = 0.8%. You
might say that the absolute quantity, 80, is being normalized to a probability
relative to the group of all women. Or to make it clearer, suppose that there’s a
group of 641 women with breast cancer and a positive mammography within
a total sample group of 89,031 women. Six hundred and forty-one is the
absolute quantity. If you pick out a random woman from the entire sample,
then the probability you’ll pick a woman with breast cancer and a positive
mammography is P (Q, P ), or 0.72% (in this example).
On the other hand, P (Q|P ) is the proportion of things that have property
Q and property P among all things that have P ; e.g., the proportion of women
with breast cancer and a positive mammography within the group of all women
with positive mammographies. If there are 641 women with breast cancer and
positive mammographies, 7,915 women with positive mammographies, and
89,031 women, then P (Q, P ) is the probability of getting one of those 641
women if you’re picking at random from the entire group of 89,031, while
P (Q|P ) is the probability of getting one of those 641 women if you’re picking
at random from the smaller group of 7,915.
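The distinction can be sketched directly from those counts (the variable names are mine):

```python
cancer_and_positive = 641
positive = 7_915
total = 89_031

p_joint = cancer_and_positive / total           # P(Q, P): pick from everyone
p_conditional = cancer_and_positive / positive  # P(Q | P): pick from positives only

print(round(p_joint, 4))        # 0.0072, i.e. about 0.72%
print(round(p_conditional, 3))  # 0.081, i.e. about 8.1%
```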
In a sense, P (Q|P ) really means P (Q, P |P ), but specifying the extra P
all the time would be redundant. You already know it has property P, so the
property you’re investigating is Q—even though you’re looking at the size
of group (Q, P ) within group P, not the size of group Q within group P
(which would be nonsense). This is what it means to take the property on
the right-hand side as given; it means you know you’re working only within
the group of things that have property P. When you constrict your focus of
attention to see only this smaller group, many other probabilities change. If
you’re taking P as given, then P (Q, P ) equals just P (Q)—at least, relative
to the group P . The old P (Q), the frequency of “things that have property
Q within the entire sample,” is revised to the new frequency of “things that
have property Q within the subsample of things that have property P. ” If P is
given, if P is our entire world, then looking for (Q, P ) is the same as looking
for just Q.
If you constrict your focus of attention to only the population of eggs that
are painted blue, then suddenly “the probability that an egg contains a pearl”
becomes a different number; this proportion is different for the population of
blue eggs than the population of all eggs. The given, the property that constricts
our focus of attention, is always on the right side of P (Q|P ); the P becomes
our world, the entire thing we see, and on the other side of the “given,” P
always has probability 1—that is what it means to take P as given. So P (Q|P )
means “If P has probability 1, what is the probability of Q?” or “If we constrict
our attention to only things or events where P is true, what is the probability
of Q?” The statement Q, on the other side of the given, is not certain—its
probability may be 10% or 90% or any other number. So when you use Bayes’s
Theorem, and you write the part on the left side as P (A|X)—how to update
the probability of A after seeing X, the new probability of A given that we
know X, the degree to which X implies A—you can tell that X is always the
observation or the evidence, and A is the property being investigated, the thing
you want to know about.

The right side of Bayes’s Theorem is derived from the left side through these
steps:

P (A|X) = P (A|X)

P (A|X) = P (X, A) / P (X)

P (A|X) = P (X, A) / [P (X, A) + P (X, ¬A)]

P (A|X) = P (X|A) × P (A) / [P (X|A) × P (A) + P (X|¬A) × P (¬A)] .
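As a numerical check (mine, using the mammography numbers), each line of the derivation evaluates to the same posterior:

```python
p_A = 0.01                 # P(cancer)
p_X_given_A = 0.80         # P(positive | cancer)
p_X_given_not_A = 0.096    # P(positive | ~cancer)

p_XA = p_X_given_A * p_A                       # P(X, A)
p_X_notA = p_X_given_not_A * (1 - p_A)         # P(X, ~A)
p_X = p_XA + p_X_notA                          # P(X)

line2 = p_XA / p_X
line3 = p_XA / (p_XA + p_X_notA)
line4 = (p_X_given_A * p_A) / (p_X_given_A * p_A + p_X_given_not_A * (1 - p_A))
assert abs(line2 - line3) < 1e-12 and abs(line3 - line4) < 1e-12
print(round(line4, 3))   # ~0.078
```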

Once the derivation is finished, all the implications on the right side of the equa-
tion are of the form P (X|A) or P (X|¬A), while the implication on the left
side is P (A|X). The symmetry arises because the elementary causal relations
are generally implications from facts to observations, e.g., from breast cancer to
positive mammography. The elementary steps in reasoning are generally impli-
cations from observations to facts, e.g., from a positive mammography to breast
cancer. The left side of Bayes’s Theorem is an elementary inferential step from
the observation of positive mammography to the conclusion of an increased
probability of breast cancer. Implication is written right-to-left, so we write
P (cancer|positive) on the left side of the equation. The right side of Bayes’s
Theorem describes the elementary causal steps—for example, from breast can-
cer to a positive mammography—and so the implications on the right side of
Bayes’s Theorem take the form P (positive|cancer) or P (positive|¬cancer).
And that’s Bayes’s Theorem. Rational inference on the left end, physical
causality on the right end; an equation with mind on one side and reality on
the other. Remember how the scientific method turned out to be a special
case of Bayes’s Theorem? If you wanted to put it poetically, you could say that
Bayes’s Theorem binds reasoning into the physical universe.
Okay, we’re done.

Reverend Bayes says:

You are now an initiate of the Bayesian Conspiracy.


1. Ward Casscells, Arno Schoenberger, and Thomas Graboys, “Interpretation by Physicians of Clinical Laboratory Results,” New England Journal of Medicine 299 (1978): 999–1001.

2. David M. Eddy, “Probabilistic Reasoning in Clinical Medicine: Problems and Opportunities,” in Judgement Under Uncertainty: Heuristics and Biases, ed. Daniel Kahneman, Paul Slovic, and Amos Tversky (Cambridge University Press, 1982).

3. Gerd Gigerenzer and Ulrich Hoffrage, “How to Improve Bayesian Reasoning without Instruction: Frequency Formats,” Psychological Review 102 (1995): 684–704.

4. Ibid.

5. Edwin T. Jaynes, “Probability Theory, with Applications in Science and Engineering,” Unpublished manuscript (1974).
