Book 3
The Machine in the Ghost
Rebuilding Intelligence
Yudkowsky is a decision theorist and mathematician who works on founda-
tional issues in Artificial General Intelligence (AGI), the theoretical study of
domain-general problem-solving systems. Yudkowsky’s work in AI has been a
major driving force behind his exploration of the psychology of human ratio-
nality, as he noted in his very first blog post on Overcoming Bias, The Martial
Art of Rationality:
Interlude
The Power of Intelligence
In our skulls we carry around three pounds of slimy, wet, grayish tissue, corru-
gated like crumpled toilet paper.
You wouldn’t think, to look at the unappetizing lump, that it was some of
the most powerful stuff in the known universe. If you’d never seen an anatomy
textbook, and you saw a brain lying in the street, you’d say “Yuck!” and try not
to get any of it on your shoes. Aristotle thought the brain was an organ that
cooled the blood. It doesn’t look dangerous.
Five million years ago, the ancestors of lions ruled the day, the ancestors
of wolves roamed the night. The ruling predators were armed with teeth and
claws—sharp, hard cutting edges, backed up by powerful muscles. Their prey,
in self-defense, evolved armored shells, sharp horns, toxic venoms, camouflage.
The war had gone on through hundreds of eons and countless arms races.
Many a loser had been removed from the game, but there was no sign of a
winner. Where one species had shells, another species would evolve to crack
them; where one species became poisonous, another would evolve to tolerate
the poison. Each species had its private niche—for who could live in the seas
and the skies and the land at once? There was no ultimate weapon and no
ultimate defense and no reason to believe any such thing was possible.
Then came the Day of the Squishy Things.
They had no armor. They had no claws. They had no venoms.
If you saw a movie of a nuclear explosion going off, and you were told an
Earthly life form had done it, you would never in your wildest dreams imagine
that the Squishy Things could be responsible. After all, Squishy Things aren’t
radioactive.
In the beginning, the Squishy Things had no fighter jets, no machine guns,
no rifles, no swords. No bronze, no iron. No hammers, no anvils, no tongs,
no smithies, no mines. All the Squishy Things had were squishy fingers—too
weak to break a tree, let alone a mountain. Clearly not dangerous. To cut stone
you would need steel, and the Squishy Things couldn’t excrete steel. In the
environment there were no steel blades for Squishy fingers to pick up. Their
bodies could not generate temperatures anywhere near hot enough to melt
metal. The whole scenario was obviously absurd.
And as for the Squishy Things manipulating DNA—that would have been
beyond ridiculous. Squishy fingers are not that small. There is no access to
DNA from the Squishy level; it would be like trying to pick up a hydrogen
atom. Oh, technically it’s all one universe, technically the Squishy Things and
DNA are part of the same world, the same unified laws of physics, the same
great web of causality. But let’s be realistic: you can’t get there from here.
Even if Squishy Things could someday evolve to do any of those feats, it
would take thousands of millennia. We have watched the ebb and flow of Life
through the eons, and let us tell you, a year is not even a single clock tick of
evolutionary time. Oh, sure, technically a year is six hundred trillion trillion
trillion trillion Planck intervals. But nothing ever happens in less than six
hundred million trillion trillion trillion trillion Planck intervals, so it’s a moot
point. The Squishy Things, as they run across the savanna now, will not fly
across continents for at least another ten million years; no one could have that
much sex.
Now explain to me again why an Artificial Intelligence can’t do anything
interesting over the Internet unless a human programmer builds it a robot
body.
I have observed that someone’s flinch-reaction to “intelligence”—the
thought that crosses their mind in the first half-second after they hear the
word “intelligence”—often determines their flinch-reaction to the notion of
an intelligence explosion. Often they look up the keyword “intelligence” and
retrieve the concept booksmarts—a mental image of the Grand Master chess
player who can’t get a date, or a college professor who can’t survive outside
academia.
“It takes more than intelligence to succeed professionally,” people say, as
if charisma resided in the kidneys, rather than the brain. “Intelligence is no
match for a gun,” they say, as if guns had grown on trees. “Where will an
Artificial Intelligence get money?” they ask, as if the first Homo sapiens had
found dollar bills fluttering down from the sky, and used them at convenience
stores already in the forest. The human species was not born into a market
economy. Bees won’t sell you honey if you offer them an electronic funds
transfer. The human species imagined money into existence, and it exists—for
us, not mice or wasps—because we go on believing in it.
I keep trying to explain to people that the archetype of intelligence is not
Dustin Hoffman in Rain Man. It is a human being, period. It is squishy things
that explode in a vacuum, leaving footprints on their moon. Within that gray
wet lump is the power to search paths through the great web of causality, and
find a road to the seemingly impossible—the power sometimes called creativity.
People—venture capitalists in particular—sometimes ask how, if the Ma-
chine Intelligence Research Institute successfully builds a true AI, the results
will be commercialized. This is what we call a framing problem.
Or maybe it’s something deeper than a simple clash of assumptions. With
a bit of creative thinking, people can imagine how they would go about trav-
elling to the Moon, or curing smallpox, or manufacturing computers. To
imagine a trick that could accomplish all these things at once seems downright
impossible—even though such a power resides only a few centimeters behind
their own eyes. The gray wet thing still seems mysterious to the gray wet thing.
And so, because people can’t quite see how it would all work, the power of
intelligence seems less real; harder to imagine than a tower of fire sending a
ship to Mars. The prospect of visiting Mars captures the imagination. But if
one should promise a Mars visit, and also a grand unified theory of physics,
and a proof of the Riemann Hypothesis, and a cure for obesity, and a cure
for cancer, and a cure for aging, and a cure for stupidity—well, it just sounds
wrong, that’s all.
And well it should. It’s a serious failure of imagination to think that intelli-
gence is good for so little. Who could have imagined, ever so long ago, what
minds would someday do? We may not even know what our real problems are.
But meanwhile, because it’s hard to see how one process could have such
diverse powers, it’s hard to imagine that one fell swoop could solve even such
prosaic problems as obesity and cancer and aging.
Well, one trick cured smallpox and built airplanes and cultivated wheat
and tamed fire. Our current science may not agree yet on how exactly the
trick works, but it works anyway. If you are temporarily ignorant about a
phenomenon, that is a fact about your current state of mind, not a fact about
the phenomenon. A blank map does not correspond to a blank territory. If
one does not quite understand that power which put footprints on the Moon,
nonetheless, the footprints are still there—real footprints, on a real Moon, put
there by a real power. If one were to understand deeply enough, one could
create and shape that power. Intelligence is as real as electricity. It’s merely
far more powerful, far more dangerous, has far deeper implications for the
unfolding story of life in the universe—and it’s a tiny little bit harder to figure
out how to build a generator.
*
Part L
The Simple Math of Evolution

131
An Alien God
“A curious aspect of the theory of evolution,” said Jacques Monod, “is that
everybody thinks he understands it.”
A human being, looking at the natural world, sees a thousand times pur-
pose. A rabbit’s legs, built and articulated for running; a fox’s jaws, built and
articulated for tearing. But what you see is not exactly what is there . . .
In the days before Darwin, the cause of all this apparent purposefulness
was a very great puzzle unto science. The Goddists said “God did it,” because
you get 50 bonus points each time you use the word “God” in a sentence. Yet
perhaps I’m being unfair. In the days before Darwin, it seemed like a much
more reasonable hypothesis. Find a watch in the desert, said William Paley,
and you can infer the existence of a watchmaker.
But when you look at all the apparent purposefulness in Nature, rather than
picking and choosing your examples, you start to notice things that don’t fit
the Judeo-Christian concept of one benevolent God. Foxes seem well-designed
to catch rabbits. Rabbits seem well-designed to evade foxes. Was the Creator
having trouble making up Its mind?
When I design a toaster oven, I don’t design one part that tries to get
electricity to the coils and a second part that tries to prevent electricity from
getting to the coils. It would be a waste of effort. Who designed the ecosystem,
with its predators and prey, viruses and bacteria? Even the cactus plant, which
you might think well-designed to provide water and fruit to desert animals, is
covered with inconvenient spines.
The ecosystem would make much more sense if it wasn’t designed by a
unitary Who, but, rather, created by a horde of deities—say from the Hindu or
Shinto religions. This handily explains both the ubiquitous purposefulnesses,
and the ubiquitous conflicts: More than one deity acted, often at cross-purposes.
The fox and rabbit were both designed, but by distinct competing deities. I
wonder if anyone ever remarked on the seemingly excellent evidence thus
provided for Hinduism over Christianity. Probably not.
Similarly, the Judeo-Christian God is alleged to be benevolent—well, sort
of. And yet much of nature’s purposefulness seems downright cruel. Darwin
suspected a non-standard Creator from studying Ichneumon wasps, whose paralyzing
stings preserve their prey to be eaten alive by their larvae: "I cannot persuade
myself,” wrote Darwin, “that a beneficent and omnipotent God would have de-
signedly created the Ichneumonidae with the express intention of their feeding
within the living bodies of Caterpillars, or that a cat should play with mice.”1 I
wonder if any earlier thinker remarked on the excellent evidence thus provided
for Manichaean religions over monotheistic ones.
By now we all know the punchline: you just say “evolution.”
I worry that’s how some people are absorbing the “scientific” explanation,
as a magical purposefulness factory in Nature. I’ve previously discussed the
case of Storm from the movie X-Men, who in one mutation gets the ability
to throw lightning bolts. Why? Well, there’s this thing called “evolution”
that somehow pumps a lot of purposefulness into Nature, and the changes
happen through “mutations.” So if Storm gets a really large mutation, she can
be redesigned to throw lightning bolts. Radioactivity is a popular super origin:
radiation causes mutations, so more powerful radiation causes more powerful
mutations. That’s logic.
But evolution doesn’t allow just any kind of purposefulness to leak into
Nature. That’s what makes evolution a success as an empirical hypothesis. If
evolutionary biology could explain a toaster oven, not just a tree, it would be
worthless. There’s a lot more to evolutionary theory than pointing at Nature
and saying, “Now purpose is allowed,” or “Evolution did it!” The strength of a
theory is not what it allows, but what it prohibits; if you can invent an equally
persuasive explanation for any outcome, you have zero knowledge.
“Many non-biologists,” observed George Williams, “think that it is for their
benefit that rattles grow on rattlesnake tails.”2 Bzzzt! This kind of purposeful-
ness is not allowed. Evolution doesn’t work by letting flashes of purposefulness
creep in at random—reshaping one species for the benefit of a random recipi-
ent.
Evolution is powered by a systematic correlation between the different
ways that different genes construct organisms, and how many copies of those
genes make it into the next generation. For rattles to grow on rattlesnake tails,
rattle-growing genes must become more and more frequent in each successive
generation. (Actually genes for incrementally more complex rattles, but if I
start describing all the fillips and caveats to evolutionary biology, we really will
be here all day.)
There isn’t an Evolution Fairy that looks over the current state of Nature,
decides what would be a “good idea,” and chooses to increase the frequency of
rattle-constructing genes.
I suspect this is where a lot of people get stuck, in evolutionary biology.
They understand that “helpful” genes become more common, but “helpful”
lets any sort of purpose leak in. They don’t think there’s an Evolution Fairy,
yet they ask which genes will be “helpful” as if a rattlesnake gene could “help”
non-rattlesnakes.
The key realization is that there is no Evolution Fairy. There’s no outside
force deciding which genes ought to be promoted. Whatever happens, happens
because of the genes themselves.
Genes for constructing (incrementally better) rattles must have somehow
ended up more frequent in the rattlesnake gene pool, because of the rattle.
In this case it’s probably because rattlesnakes with better rattles survive more
often—rather than mating more successfully, or having brothers that reproduce
more successfully, etc.
Maybe predators are wary of rattles and don’t step on the snake. Or maybe
the rattle diverts attention from the snake’s head. (As George Williams suggests,
“The outcome of a fight between a dog and a viper would depend very much
on whether the dog initially seized the reptile by the head or by the tail.”)
But that’s just a snake’s rattle. There are much more complicated ways that a
gene can cause copies of itself to become more frequent in the next generation.
Your brother or sister shares half your genes. A gene that sacrifices one unit of
resources to bestow three units of resource on a brother, may promote some
copies of itself by sacrificing one of its constructed organisms. (If you really
want to know all the details and caveats, buy a book on evolutionary biology;
there is no royal road.)
The main point is that the gene’s effect must cause copies of that gene to
become more frequent in the next generation. There’s no Evolution Fairy that
reaches in from outside. There’s nothing which decides that some genes are
“helpful” and should, therefore, increase in frequency. It’s just cause and effect,
starting from the genes themselves.
This explains the strange conflicting purposefulness of Nature, and its
frequent cruelty. It explains even better than a horde of Shinto deities.
Why is so much of Nature at war with other parts of Nature? Because there
isn’t one Evolution directing the whole process. There’s as many different
“evolutions” as reproducing populations. Rabbit genes are becoming more
or less frequent in rabbit populations. Fox genes are becoming more or less
frequent in fox populations. Fox genes which construct foxes that catch rabbits,
insert more copies of themselves in the next generation. Rabbit genes which
construct rabbits that evade foxes are naturally more common in the next
generation of rabbits. Hence the phrase “natural selection.”
Why is Nature cruel? You, a human, can look at an Ichneumon wasp,
and decide that it’s cruel to eat your prey alive. You can decide that if you’re
going to eat your prey alive, you can at least have the decency to stop it from
hurting. It would scarcely cost the wasp anything to anesthetize its prey as well
as paralyze it. Or what about old elephants, who die of starvation when their
last set of teeth fall out? These elephants aren’t going to reproduce anyway.
What would it cost evolution—the evolution of elephants, rather—to ensure
that the elephant dies right away, instead of slowly and in agony? What would
it cost evolution to anesthetize the elephant, or give it pleasant dreams before
it dies? Nothing; that elephant won’t reproduce more or less either way.
If you were talking to a fellow human, trying to resolve a conflict of interest,
you would be in a good negotiating position—would have an easy job of
persuasion. It would cost so little to anesthetize the prey, to let the elephant
die without agony! Oh please, won’t you do it, kindly . . . um . . .
There’s no one to argue with.
Human beings fake their justifications, figure out what they want using one
method, and then justify it using another method. There’s no Evolution of
Elephants Fairy that’s trying to (a) figure out what’s best for elephants, and then
(b) figure out how to justify it to the Evolutionary Overseer, who (c) doesn’t
want to see reproductive fitness decreased, but is (d) willing to go along with
the painless-death idea, so long as it doesn’t actually harm any genes.
There’s no advocate for the elephants anywhere in the system.
Humans, who are often deeply concerned for the well-being of animals,
can be very persuasive in arguing how various kindnesses wouldn’t harm
reproductive fitness at all. Sadly, the evolution of elephants doesn’t use a
similar algorithm; it doesn’t select nice genes that can plausibly be argued to
help reproductive fitness. Simply: genes that replicate more often become
more frequent in the next generation. Like water flowing downhill, and equally
benevolent.
A human, looking over Nature, starts thinking of all the ways we would de-
sign organisms. And then we tend to start rationalizing reasons why our design
improvements would increase reproductive fitness—a political instinct, trying
to sell your own preferred option as matching the boss’s favored justification.
And so, amateur evolutionary biologists end up making all sorts of won-
derful and completely mistaken predictions. Because the amateur biologists are
drawing their bottom line—and more importantly, locating their prediction
in hypothesis-space—using a different algorithm than evolutions use to draw
their bottom lines.
A human engineer would have designed human taste buds to measure how
much of each nutrient we had, and how much we needed. When fat was scarce,
almonds or cheeseburgers would taste delicious. But if you started to become
obese, or if vitamins were lacking, lettuce would taste delicious. But there is no
Evolution of Humans Fairy, which intelligently planned ahead and designed a
general system for every contingency. It was a reliable invariant of humans’
ancestral environment that calories were scarce. So genes whose organisms
loved calories, became more frequent. Like water flowing downhill.
We are simply the embodied history of which organisms did in fact survive
and reproduce, not which organisms ought prudentially to have survived and
reproduced.
The human retina is constructed backward: The light-sensitive cells are at
the back, and the nerves emerge from the front and go back through the retina
into the brain. Hence the blind spot. To a human engineer, this looks simply
stupid—and other organisms have independently evolved retinas the right way
around. Why not redesign the retina?
The problem is that no single mutation will reroute the whole retina simul-
taneously. A human engineer can redesign multiple parts simultaneously, or
plan ahead for future changes. But if a single mutation breaks some vital part
of the organism, it doesn’t matter what wonderful things a Fairy could build
on top of it—the organism dies and the gene decreases in frequency.
If you turn around the retina’s cells without also reprogramming the nerves
and optic cable, the system as a whole won’t work. It doesn’t matter that,
to a Fairy or a human engineer, this is one step forward in redesigning the
retina. The organism is blind. Evolution has no foresight, it is simply the frozen
history of which organisms did in fact reproduce. Evolution is as blind as a
halfway-redesigned retina.
Find a watch in a desert, said William Paley, and you can infer the watch-
maker. There were once those who denied this, who thought that life “just
happened” without need of an optimization process, mice being spontaneously
generated from straw and dirty shirts.
If we ask who was more correct—the theologians who argued for a Creator-
God, or the intellectually unfulfilled atheists who argued that mice sponta-
neously generated—then the theologians must be declared the victors: evo-
lution is not God, but it is closer to God than it is to pure random entropy.
Mutation is random, but selection is non-random. This doesn’t mean an intel-
ligent Fairy is reaching in and selecting. It means there’s a non-zero statistical
correlation between the gene and how often the organism reproduces. Over a
few million years, that non-zero statistical correlation adds up to something
very powerful. It’s not a god, but it’s more closely akin to a god than it is to
snow on a television screen.
In a lot of ways, evolution is like unto theology. “Gods are ontologically
distinct from creatures,” said Damien Broderick, “or they’re not worth the
paper they’re written on.” And indeed, the Shaper of Life is not itself a creature.
Evolution is bodiless, like the Judeo-Christian deity. Omnipresent in Nature,
immanent in the fall of every leaf. Vast as a planet’s surface. Billions of years
old. Itself unmade, arising naturally from the structure of physics. Doesn’t
that all sound like something that might have been said about God?
And yet the Maker has no mind, as well as no body. In some ways, its
handiwork is incredibly poor design by human standards. It is internally
divided. Most of all, it isn’t nice.
In a way, Darwin discovered God—a God that failed to match the precon-
ceptions of theology, and so passed unheralded. If Darwin had discovered that
life was created by an intelligent agent—a bodiless mind that loves us, and will
smite us with lightning if we dare say otherwise—people would have said “My
gosh! That’s God!”
But instead Darwin discovered a strange alien God—not comfortably “in-
effable,” but really genuinely different from us. Evolution is not a God, but if it
were, it wouldn’t be Jehovah. It would be H. P. Lovecraft’s Azathoth, the blind
idiot God burbling chaotically at the center of everything, surrounded by the
thin monotonous piping of flutes.
Which you might have predicted, if you had really looked at Nature.
So much for the claim some religionists make, that they believe in a vague
deity with a correspondingly high probability. Anyone who really believed
in a vague deity, would have recognized their strange inhuman creator when
Darwin said “Aha!”
So much for the claim some religionists make, that they are waiting
innocently curious for Science to discover God. Science has already discov-
ered the sort-of-godlike maker of humans—but it wasn’t what the religionists
wanted to hear. They were waiting for the discovery of their God, the highly spe-
cific God they want to be there. They shall wait forever, for the great discovery
has already taken place, and the winner is Azathoth.
Well, more power to us humans. I like having a Creator I can outwit. Beats
being a pet. I’m glad it was Azathoth and not Odin.
1. Francis Darwin, ed., The Life and Letters of Charles Darwin, vol. 2 (John Murray, 1887).
2. George C. Williams, Adaptation and Natural Selection: A Critique of Some Current Evolutionary
Thought, Princeton Science Library (Princeton, NJ: Princeton University Press, 1966).
132
The Wonder of Evolution
Let us understand, once and for all, that the ethical progress of
society depends, not on imitating the cosmic process, still less in
running away from it, but in combating it.
1. Thomas Henry Huxley, Evolution and Ethics and Other Essays (Macmillan, 1894).
133
Evolutions Are Stupid (But Work
Anyway)
Generations to fixation ≈ 2 ln(N)/s,

where N is the population size and (1 + s) is the fitness. (If each bearer of the
gene has 1.03 times as many children as a non-bearer, s = 0.03.)
Thus, if the population size were 1,000,000—the estimated population in
hunter-gatherer times—then it would require 2,763 generations for a gene
conveying a 1% advantage to spread through the gene pool.1
This should not be surprising; genes have to do all their own work of spread-
ing. There’s no Evolution Fairy who can watch the gene pool and say, “Hm,
that gene seems to be spreading rapidly—I should distribute it to everyone.”
In a human market economy, someone who is legitimately getting 20% re-
turns on investment—especially if there’s an obvious, clear mechanism behind
it—can rapidly acquire more capital from other investors; and others will start
duplicate enterprises. Genes have to spread without stock markets or banks or
imitators—as if Henry Ford had to make one car, sell it, buy the parts for 1.01
more cars (on average), sell those cars, and keep doing this until he was up to
a million cars.
All this assumes that the gene spreads in the first place. Here the equation
is simpler and ends up not depending at all on population size:
Probability of fixation = 2s.
A mutation conveying a 3% advantage (which is pretty darned large, as muta-
tions go) has a 6% chance of spreading, at least on that occasion.2 Mutations
can happen more than once, but in a population of a million with a copying
fidelity of 10⁻⁸ errors per base per generation, you may have to wait a hun-
dred generations for another chance, and then it still has only a 6% chance of
fixating.
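
As a quick sanity check, here is a minimal sketch in Python (the function names are mine) of the two quantities just quoted—the 2 ln(N)/s fixation time and the 2s fixation probability—reproducing the numbers above:

import math

def generations_to_fixation(population_size, s):
    # Mean generations for an advantageous gene to spread through the gene
    # pool, using the 2 ln(N) / s approximation quoted above.
    return 2 * math.log(population_size) / s

def probability_of_fixation(s):
    # Haldane's approximation: a new advantageous mutation fixes with
    # probability roughly 2s.
    return 2 * s

print(round(generations_to_fixation(1_000_000, 0.01)))  # 2763 generations
print(probability_of_fixation(0.03))                    # 0.06, i.e. a 6% chance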
Still, in the long run, an evolution has a good shot at getting there eventually.
(This is going to be a running theme.)
Complex adaptations take a very long time to evolve. First comes allele A,
which is advantageous of itself, and requires a thousand generations to fixate
in the gene pool. Only then can another allele B, which depends on A, begin
rising to fixation. A fur coat is not a strong advantage unless the environment
has a statistically reliable tendency to throw cold weather at you. Well, genes
form part of the environment of other genes, and if B depends on A, then
B will not have a strong advantage unless A is reliably present in the genetic
environment.
Let’s say that B confers a 5% advantage in the presence of A, no advantage
otherwise. Then while A is still at 1% frequency in the population, B only
confers its advantage 1 out of 100 times, so the average fitness advantage of B is
0.05%, and B’s probability of fixation is 0.1%. With a complex adaptation, first
A has to evolve over a thousand generations, then B has to evolve over another
thousand generations, then A∗ evolves over another thousand generations . . .
and several million years later, you’ve got a new complex adaptation.
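
The arithmetic for the dependent allele B, spelled out as a sketch (under the simplifying assumption that B encounters A at random; names are mine):

def average_advantage(s_given_A, frequency_of_A):
    # B only pays off when A is present, so its average advantage is
    # diluted by A's current frequency (random assortment assumed).
    return s_given_A * frequency_of_A

s_B = average_advantage(0.05, 0.01)  # 0.0005, i.e. a 0.05% average advantage
p_fix_B = 2 * s_B                    # 0.001, i.e. a 0.1% chance of fixation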
Then other evolutions don’t imitate it. If snake evolution develops an
amazing new venom, it doesn’t help fox evolution or lion evolution.
Contrast all this to a human programmer, who can design a new complex
mechanism with a hundred interdependent parts over the course of a single
afternoon. How is this even possible? I don’t know all the answer, and my
guess is that neither does science; human brains are much more complicated
than evolutions. I could wave my hands and say something like “goal-directed
backward chaining using combinatorial modular representations,” but you
would not thereby be enabled to design your own human. Still: Humans can
foresightfully design new parts in anticipation of later designing other new
parts; produce coordinated simultaneous changes in interdependent machin-
ery; learn by observing single test cases; zero in on problem spots and think
abstractly about how to solve them; and prioritize which tweaks are worth try-
ing, rather than waiting for a cosmic ray strike to produce a good one. By the
standards of natural selection, this is simply magic.
Humans can do things that evolutions probably can't do, period, over the
expected lifetime of the universe. As the eminent biologist Cynthia Kenyon
once put it at a dinner I had the honor of attending, “One grad student can do
things in an hour that evolution could not do in a billion years.” According to
biologists’ best current knowledge, evolutions have invented a fully rotating
wheel on a grand total of three occasions.
And don’t forget the part where the programmer posts the code snippet to
the Internet.
Yes, some evolutionary handiwork is impressive even by comparison to the
best technology of Homo sapiens. But our Cambrian explosion only started,
we only really began accumulating knowledge, around . . . what, four hundred
years ago? In some ways, biology still excels over the best human technology:
we can’t build a self-replicating system the size of a butterfly. In other ways,
human technology leaves biology in the dust. We got wheels, we got steel, we
got guns, we got knives, we got pointy sticks; we got rockets, we got transistors,
we got nuclear power plants. With every passing decade, that balance tips
further.
So, once again: for a human to look to natural selection as inspiration on the
art of design is like a sophisticated modern bacterium trying to imitate the first
awkward replicator’s biochemistry. The first replicator would be eaten instantly
if it popped up in today’s competitive ecology. The same fate would accrue
to any human planner who tried making random point mutations to their
strategies and waiting 768 iterations of testing to adopt a 3% improvement.
Don’t praise evolutions one millimeter more than they deserve.
Coming up next: More exciting mathematical bounds on evolution!
1. Dan Graur and Wen-Hsiung Li, Fundamentals of Molecular Evolution, 2nd ed. (Sunderland, MA:
Sinauer Associates, 2000).
2. John B. S. Haldane, “A Mathematical Theory of Natural and Artificial Selection,” Math-
ematical Proceedings of the Cambridge Philosophical Society 23 (5 1927): 607–615,
doi:10.1017/S0305004100011750.
134
No Evolutions for Corporations or
Nanodevices
The laws of physics and the rules of math don’t cease to apply.
That leads me to believe that evolution doesn’t stop. That further
leads me to believe that nature—bloody in tooth and claw, as
some have termed it—will simply be taken to the next level . . .
[Getting rid of Darwinian evolution is] like trying to get rid
of gravitation. So long as there are limited resources and multiple
competing actors capable of passing on characteristics, you have
selection pressure.
—Perry Metzger, predicting that the reign of natural selection
would continue into the indefinite future
Price's Equation is a very powerful and general formula. For example, a particular gene
for height can be the Z, the characteristic that changes, in which case Price’s
Equation says that the change in the probability of possessing this gene equals
the covariance of the gene with reproductive fitness. Or you can consider
height in general as the characteristic Z, apart from any particular genes, and
Price’s Equation says that the change in height in the next generation will equal
the covariance of height with relative reproductive fitness.
(At least, this is true so long as height is straightforwardly heritable. If
nutrition improves, so that a fixed genotype becomes taller, you have to add a
correction term to Price’s Equation. If there are complex nonlinear interactions
between many genes, you have to either add a correction term, or calculate the
equation in such a complicated way that it ceases to enlighten.)
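
For reference, the equation under discussion is not displayed in this excerpt; the standard simple form of Price's Equation (standard notation, not drawn from the text: w_i is the reproductive fitness of the i-th organism, z_i its characteristic, and bars denote population averages) is

    \Delta \bar{z} = \operatorname{Cov}\!\left( \frac{w_i}{\bar{w}},\, z_i \right),

and the full form, whose second term is the "correction term" mentioned above for characteristics that change in transmission, is

    \bar{w}\, \Delta \bar{z} = \operatorname{Cov}(w_i, z_i) + \operatorname{E}(w_i\, \Delta z_i).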
Many enlightenments may be attained by studying the different forms
and derivations of Price’s Equation. For example, the final equation says that
the average characteristic changes according to its covariance with relative
fitness, rather than its absolute fitness. This means that if a Frodo gene saves
its whole species from extinction, the average Frodo characteristic does not
increase, since Frodo’s act benefited all genotypes equally and did not covary
with relative fitness.
It is said that Price became so disturbed with the implications of his equation
for altruism that he committed suicide, though he may have had other issues.
(Overcoming Bias does not advocate committing suicide after studying Price’s
Equation.)
One of the enlightenments which may be gained by meditating upon Price’s
Equation is that “limited resources” and “multiple competing actors capable of
passing on characteristics” are not sufficient to give rise to an evolution. “Things
that replicate themselves” is not a sufficient condition. Even “competition
between replicating things” is not sufficient.
Do corporations evolve? They certainly compete. They occasionally spin
off children. Their resources are limited. They sometimes die.
But how much does the child of a corporation resemble its parents? Much
of the personality of a corporation derives from key officers, and CEOs cannot
divide themselves by fission. Price’s Equation only operates to the extent that
characteristics are heritable across generations. If great-great-grandchildren
don’t much resemble their great-great-grandparents, you won’t get more than
four generations’ worth of cumulative selection pressure—anything that hap-
pened more than four generations ago will blur itself out. Yes, the personality
of a corporation can influence its spinoff—but that’s nothing like the heritabil-
ity of DNA, which is digital rather than analog, and can transmit itself with
10⁻⁸ errors per base per generation.
With DNA you have heritability lasting for millions of generations. That’s
how complex adaptations can arise by pure evolution—the digital DNA lasts
long enough for a gene conveying 3% advantage to spread itself over 768 gener-
ations, and then another gene dependent on it can arise. Even if corporations
replicated with digital fidelity, they would currently be at most ten generations
into the RNA World.
Now, corporations are certainly selected, in the sense that incompetent
corporations go bust. This should logically make you more likely to observe
corporations with features contributing to competence. And in the same sense,
any star that goes nova shortly after it forms, is less likely to be visible when
you look up at the night sky. But if an accident of stellar dynamics makes
one star burn longer than another star, that doesn’t make it more likely that
future stars will also burn longer—the feature will not be copied onto other
stars. We should not expect future astrophysicists to discover complex internal
features of stars which seem designed to help them burn longer. That kind of
mechanical adaptation requires much larger cumulative selection pressures
than a once-off winnowing.
Think of the principle introduced in Einstein’s Arrogance—that the vast
majority of the evidence required to think of General Relativity had to go into
raising that one particular equation to the level of Einstein’s personal attention;
the amount of evidence required to raise it from a deliberately considered possi-
bility to 99.9% certainty was trivial by comparison. In the same sense, complex
features of corporations that require hundreds of bits to specify are produced
primarily by human intelligence, not a handful of generations of low-fidelity
evolution. In biology, the mutations are purely random and evolution supplies
thousands of bits of cumulative selection pressure. In corporations, humans
offer up thousand-bit intelligently designed complex “mutations,” and then
the further selection pressure of “Did it go bankrupt or not?” accounts for a
handful of additional bits in explaining what you see.
Advanced molecular nanotechnology—the artificial sort, not biology—
should be able to copy itself with digital fidelity through thousands of genera-
tions. Would Price’s Equation thereby gain a foothold?
Correlation is covariance divided by the product of the standard deviations,
so if A is highly predictive of B, there can be a strong "correlation" between
them even if A ranges from 0 to 9 while B only ranges from 50.0001 to 50.0009.
Price's Equation
runs on covariance of characteristics with reproduction—not correlation! If
you can compress variance in characteristics into a tiny band, the covariance
goes way down, and so does the cumulative change in the characteristic.
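
A toy numerical illustration of this distinction (the linear relationship between A and B is my own assumption, chosen only to match the ranges in the text; requires Python 3.10+ for the statistics helpers):

import statistics

A = list(range(10))                        # A ranges from 0 to 9
B = [50.0001 + 0.0000888 * a for a in A]   # B stays between 50.0001 and ~50.0009

print(statistics.correlation(A, B))  # ~1.0: A is essentially perfectly predictive of B
print(statistics.covariance(A, B))   # ~0.0008: almost no covariance, so almost no
                                     # cumulative change in the characteristic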
The Foresight Institute suggests, among other sensible proposals, that the
replication instructions for any nanodevice should be encrypted. Moreover,
encrypted such that flipping a single bit of the encoded instructions will en-
tirely scramble the decrypted output. If all nanodevices produced are precise
molecular copies, and moreover, any mistakes on the assembly line are not
heritable because the offspring got a digital copy of the original encrypted in-
structions for use in making grandchildren, then your nanodevices ain’t gonna
be doin’ much evolving.
You’d still have to worry about prions—self-replicating assembly errors
apart from the encrypted instructions, where a robot arm fails to grab a carbon
atom that is used in assembling a homologue of itself, and this causes the
offspring’s robot arm to likewise fail to grab a carbon atom, etc., even with
all the encrypted instructions remaining constant. But how much correlation
is there likely to be, between this sort of transmissible error, and a higher
reproductive rate? Let’s say that one nanodevice produces a copy of itself every
1,000 seconds, and the new nanodevice is magically more efficient (it not only
has a prion, it has a beneficial prion) and copies itself every 999.99999 seconds.
It needs one less carbon atom attached, you see. That’s not a whole lot of
variance in reproduction, so it’s not a whole lot of covariance either.
And how often will these nanodevices need to replicate? Unless they’ve
got more atoms available than exist in the solar system, or for that matter,
the visible Universe, only a small number of generations will pass before they
hit the resource wall. “Limited resources” are not a sufficient condition for
evolution; you need the frequently iterated death of a substantial fraction of
the population to free up resources. Indeed, “generations” is not so much an
integer as an integral over the fraction of the population that consists of newly
created individuals.
This is, to me, the most frightening thing about gray goo or nanotechnolog-
ical weapons—that they could eat the whole Earth and then that would be it,
nothing interesting would happen afterward. Diamond is stabler than proteins
held together by van der Waals forces, so the goo would only need to reassem-
ble some pieces of itself when an asteroid hit. Even if prions were a powerful
enough idiom to support evolution at all—evolution is slow enough with digi-
tal DNA!—fewer than 1.0 generations might pass between when the goo ate
the Earth and when the Sun died.
To sum up, if you have all of the following properties:

- Entities that replicate;
- Substantial variation in their characteristics;
- Substantial variation in their reproduction;
- Persistent covariance between the characteristics and the reproduction;
- High-fidelity, long-range heritability of characteristics;
- Frequent birth of a significant fraction of the population;
- And all of this iterated over many generations . . .

Then you will have significant cumulative selection pressures, enough to produce
complex adaptations by the force of evolution.
*
135
Evolving to Extinction
It is a very common misconception that an evolution works for the good of its
species. Can you remember hearing someone talk about two rabbits breeding
eight rabbits and thereby “contributing to the survival of their species”? A
modern evolutionary biologist would never say such a thing; they’d sooner
breed with a rabbit.
It’s yet another case where you’ve got to simultaneously consider multiple
abstract concepts and keep them distinct. Evolution doesn’t operate on partic-
ular individuals; individuals keep whatever genes they’re born with. Evolution
operates on a reproducing population, a species, over time. There’s a natural
tendency to think that if an Evolution Fairy is operating on the species, she
must be optimizing for the species. But what really changes are the gene fre-
quencies, and frequencies don’t increase or decrease according to how much
the gene helps the species as a whole. As we shall later see, it’s quite possible
for a species to evolve to extinction.
Why are boys and girls born in roughly equal numbers? (Leaving aside
crazy countries that use artificial gender selection technologies.) To see why
this is surprising, consider that 1 male can impregnate 2, 10, or 100 females; it
wouldn’t seem that you need the same number of males as females to ensure
the survival of the species. This is even more surprising in the vast majority of
animal species where the male contributes very little to raising the children—
humans are extraordinary, even among primates, for their level of paternal
investment. Balanced gender ratios are found even in species where the male
impregnates the female and vanishes into the mist.
Consider two groups on different sides of a mountain; in group A, each
mother gives birth to 2 males and 2 females; in group B, each mother gives
birth to 3 females and 1 male. Group A and group B will have the same number
of children, but group B will have 50% more grandchildren and 125% more
great-grandchildren. You might think this would be a significant evolutionary
advantage.
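
A quick sketch verifying that arithmetic (counting descendants through daughters only, so that each descendant is counted exactly once; the function is mine):

def descendants(daughters_per_mother, children_per_mother, generations):
    # Count the descendants in a given generation, tracing lineages
    # through mothers only.
    mothers, count = 1, 0
    for _ in range(generations):
        count = mothers * children_per_mother
        mothers *= daughters_per_mother
    return count

# Group A: 2 daughters and 2 sons per mother; Group B: 3 daughters and 1 son.
print(descendants(2, 4, 2), descendants(3, 4, 2))  # 8 vs 12 grandchildren (+50%)
print(descendants(2, 4, 3), descendants(3, 4, 3))  # 16 vs 36 great-grandchildren (+125%)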
But consider: The rarer males become, the more reproductively valuable
they become—not to the group, but to the individual parent. Every child has
one male and one female parent. Then in every generation, the total genetic
contribution from all males equals the total genetic contribution from all
females. The fewer males, the greater the individual genetic contribution per
male. If all the females around you are doing what’s good for the group, what’s
good for the species, and birthing 1 male per 10 females, you can make a
genetic killing by birthing all males, each of whom will have (on average) ten
times as many grandchildren as their female cousins.
So while group selection ought to favor more girls, individual selection fa-
vors equal investment in male and female offspring. Looking at the statistics
of a maternity ward, you can see at a glance that the quantitative balance be-
tween group selection forces and individual selection forces is overwhelmingly
tilted in favor of individual selection in Homo sapiens.
(Technically, this isn’t quite a glance. Individual selection favors equal
parental investments in male and female offspring. If males cost half as much
to birth and/or raise, twice as many males as females will be born at the evolu-
tionarily stable equilibrium. If the same number of males and females were
born in the population at large, but males were twice as cheap to birth, then
you could again make a genetic killing by birthing more males. So the ma-
ternity ward should reflect the balance of parental opportunity costs, in a
hunter-gatherer society, between raising boys and raising girls; and you’d have
to assess that somehow. But ya know, it doesn’t seem all that much more
reproductive-opportunity-costly for a hunter-gatherer family to raise a girl, so
it’s kinda suspicious that around the same number of boys are born as girls.)
Natural selection isn’t about groups, or species, or even individuals. In a sex-
ual species, an individual organism doesn’t evolve; it keeps whatever genes it’s
born with. An individual is a once-off collection of genes that will never reap-
pear; how can you select on that? When you consider that nearly all of your
ancestors are dead, it’s clear that “survival of the fittest” is a tremendous mis-
nomer. “Replication of the fitter” would be more accurate, although technically
fitness is defined only in terms of replication.
Natural selection is really about gene frequencies. To get a complex adap-
tation, a machine with multiple dependent parts, each new gene as it evolves
depends on the other genes being reliably present in its genetic environment.
They must have high frequencies. The more complex the machine, the higher
the frequencies must be. The signature of natural selection occurring is a gene
rising from 0.00001% of the gene pool to 99% of the gene pool. This is the in-
formation, in an information-theoretic sense; and this is what must happen
for large complex adaptations to evolve.
The real struggle in natural selection is not the competition of organisms
for resources; this is an ephemeral thing when all the participants will vanish in
another generation. The real struggle is the competition of alleles for frequency
in the gene pool. This is the lasting consequence that creates lasting information.
The two rams bellowing and locking horns are only passing shadows.
It’s perfectly possible for an allele to spread to fixation by outcompeting
an alternative allele which was “better for the species.” If the Flying Spaghetti
Monster magically created a species whose gender mix was perfectly opti-
mized to ensure the survival of the species—the optimal gender mix to bounce
back reliably from near-extinction events, adapt to new niches, et cetera—
then the evolution would rapidly degrade this species optimum back into
the individual-selection optimum of equal parental investment in males and
females.
Imagine a “Frodo gene” that sacrifices its vehicle to save its entire species
from an extinction event. What happens to the allele frequency as a result? It
goes down. Kthxbye.
If species-level extinction threats occur regularly (call this a “Buffy envi-
ronment”) then the Frodo gene will systematically decrease in frequency and
vanish, and soon thereafter, so will the species.
A hypothetical example? Maybe. If the human species was going to stay
biological for another century, it would be a good idea to start cloning Gandhi.
In viruses, there’s the tension between individual viruses replicating as fast
as possible, versus the benefit of leaving the host alive long enough to transmit
the illness. This is a good real-world example of group selection, and if the
virus evolves to a point on the fitness landscape where the group selection
pressures fail to overcome individual pressures, the virus could vanish shortly
thereafter. I don’t know if a disease has ever been caught in the act of evolving
to extinction, but it’s probably happened any number of times.
Segregation-distorters subvert the mechanisms that usually guarantee fair-
ness of sexual reproduction. For example, there is a segregation-distorter on
the male sex chromosome of some mice which causes only male children to
be born, all carrying the segregation-distorter. Then these males impregnate
females, who give birth to only male children, and so on. You might cry “This
is cheating!” but that’s a human perspective; the reproductive fitness of this
allele is extremely high, since it produces twice as many copies of itself in the
succeeding generation as its nonmutant alternative. Even as females become
rarer and rarer, males carrying this gene are no less likely to mate than any
other male, and so the segregation-distorter remains twice as fit as its alterna-
tive allele. It’s speculated that real-world group selection may have played a
role in keeping the frequency of this gene as low as it seems to be. In which
case, if mice were to evolve the ability to fly and migrate for the winter, they
would probably form a single reproductive population, and would evolve to
extinction as the segregation-distorter evolved to fixation.
Around 50% of the total genome of maize consists of transposons, DNA
elements whose primary function is to copy themselves into other locations of
DNA. A class of transposons called “P elements” seem to have first appeared
in Drosophila only in the middle of the twentieth century, and spread to every
population of the species within 50 years. The “Alu sequence” in humans,
a 300-base transposon, is repeated between 300,000 and a million times in
the human genome. This may not extinguish a species, but it doesn’t help
it; transposons cause more mutations, which are (as always) mostly harmful, and
decrease the effective copying fidelity of DNA. Yet such cheaters are extremely
fit.
Suppose that in some sexually reproducing species, a perfect DNA-copying
mechanism is invented. Since most mutations are detrimental, this gene com-
plex is an advantage to its holders. Now you might wonder about beneficial
mutations—they do happen occasionally, so wouldn’t the unmutable be at
a disadvantage? But in a sexual species, a beneficial mutation that began in
a mutable can spread to the descendants of unmutables as well. The muta-
bles suffer from degenerate mutations in each generation; and the unmutables
can sexually acquire, and thereby benefit from, any beneficial mutations that
occur in the mutables. Thus the mutables have a pure disadvantage. The per-
fect DNA-copying mechanism rises in frequency to fixation. Ten thousand
years later there’s an ice age and the species goes out of business. It evolved to
extinction.
The “bystander effect” is that, when someone is in trouble, solitary individ-
uals are more likely to intervene than groups. A college student apparently
having an epileptic seizure was helped 85% of the time by a single bystander,
and 31% of the time by five bystanders. I speculate that even if the kinship rela-
tion in a hunter-gatherer tribe was strong enough to create a selection pressure
for helping individuals not directly related, when several potential helpers were
present, a genetic arms race might occur to be the last one to step forward.
Everyone delays, hoping that someone else will do it. Humanity is facing mul-
tiple species-level extinction threats right now, and I gotta tell ya, there ain’t
a lot of people steppin’ forward. If we lose this fight because virtually no one
showed up on the battlefield, then—like a probably-large number of species
which we don’t see around today—we will have evolved to extinction.
Cancerous cells do pretty well in the body, prospering and amassing more
resources, far outcompeting their more obedient counterparts. For a while.
Multicellular organisms can only exist because they’ve evolved powerful
internal mechanisms to outlaw evolution. If the cells start evolving, they rapidly
evolve to extinction: the organism dies.
So praise not evolution for the solicitous concern it shows for the individual;
nearly all of your ancestors are dead. Praise not evolution for the solicitous
concern it shows for a species; no one has ever found a complex adaptation
which can only be interpreted as operating to preserve a species, and the
mathematics would seem to indicate that this is virtually impossible. Indeed,
it’s perfectly possible for a species to evolve to extinction. Humanity may be
finishing up the process right now. You can’t even praise evolution for the
solicitous concern it shows for genes; the battle between two alternative alleles
at the same location is a zero-sum game for frequency.
Fitness is not always your friend.
*
136
The Tragedy of Group Selectionism
Before 1966, it was not unusual to see serious biologists advocating evolution-
ary hypotheses that we would now regard as magical thinking. These muddled
notions played an important historical role in the development of later evo-
lutionary theory, error calling forth correction; like the folly of English kings
provoking into existence the Magna Carta and constitutional democracy.
As an example of romance, Vero Wynne-Edwards, Warder Allee, and J. L. Br-
ereton, among others, believed that predators would voluntarily restrain their
breeding to avoid overpopulating their habitat and exhausting the prey popu-
lation.
But evolution does not open the floodgates to arbitrary purposes. You
cannot explain a rattlesnake’s rattle by saying that it exists to benefit other
animals who would otherwise be bitten. No outside Evolution Fairy decides
when a gene ought to be promoted; the gene’s effect must somehow directly
cause the gene to be more prevalent in the next generation. It’s clear why our
human sense of aesthetics, witnessing a population crash of foxes who’ve eaten
all the rabbits, cries “Something should’ve been done!” But how would a gene
complex for restraining reproduction—of all things!—cause itself to become
more frequent in the next generation?
A human being designing a neat little toy ecology—for entertainment
purposes, like a model railroad—might be annoyed if their painstakingly
constructed fox and rabbit populations self-destructed by the foxes eating all
the rabbits and then dying of starvation themselves. So the human would
tinker with the toy ecology—a fox-breeding-restrainer is the obvious solution
that leaps to our human minds—until the ecology looked nice and neat. Nature
has no human, of course, but that needn’t stop us—now that we know what we
want on aesthetic grounds, we just have to come up with a plausible argument
that persuades Nature to want the same thing on evolutionary grounds.
Obviously, selection on the level of the individual won’t produce individual
restraint in breeding. Individuals who reproduce unrestrainedly will, naturally,
produce more offspring than individuals who restrain themselves.
(Individual selection will not produce individual sacrifice of breeding op-
portunities. Individual selection can certainly produce individuals who, after
acquiring all available resources, use those resources to produce four big eggs
instead of eight small eggs—not to conserve social resources, but because that
is the individual sweet spot for (number of eggs)×(egg survival probability).
This does not get rid of the commons problem.)
But suppose that the species population was broken up into subpopulations,
which were mostly isolated, and only occasionally interbred. Then, surely,
subpopulations that restrained their breeding would be less likely to go extinct,
and would send out more messengers, and create new colonies to reinhabit
the territories of crashed populations.
The problem with this scenario wasn’t that it was mathematically impossible.
The problem was that it was possible but very difficult.
The fundamental problem is that it’s not only restrained breeders who reap
the benefits of restrained breeding. If some foxes refrain from spawning cubs
who eat rabbits, then the uneaten rabbits don’t go to only cubs who carry the
restrained-breeding adaptation. The unrestrained foxes, and their many more
cubs, will happily eat any rabbits left unhunted. The only way the restraining
gene can survive against this pressure, is if the benefits of restraint preferentially
go to restrainers.
Specifically, the requirement is C/B < F_ST, where C is the cost of altruism
to the donor, B is the benefit of altruism to the recipient, and F_ST is the
spatial structure of the population: the average relatedness between a randomly
selected organism and its randomly selected neighbor, where a “neighbor” is
any other fox who benefits from an altruistic fox’s restraint.1
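
Expressed as a one-line check (the example numbers below are purely illustrative, not taken from the text):

def group_selection_can_favor_restraint(cost, benefit, f_st):
    # The condition above: the altruist's cost-to-benefit ratio must be
    # smaller than F_ST, the average relatedness to a random neighbor.
    return cost / benefit < f_st

print(group_selection_can_favor_restraint(0.03, 0.10, 0.02))  # False: 0.3 is not < 0.02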
So is the cost of restrained breeding sufficiently small, and the empirical
benefit of less famine sufficiently large, compared to the empirical spatial
structure of fox populations and rabbit populations, that the group selection
argument can work?
The math suggests this is pretty unlikely. In this simulation, for example,
the cost to altruists is 3% of fitness, pure altruist groups have a fitness twice as
great as pure selfish groups, the subpopulation size is 25, and 20% of all deaths
are replaced with messengers from another group: the result is polymorphic for
selfishness and altruism. If the subpopulation size is doubled to 50, selfishness
is fixed; if the cost to altruists is increased to 6%, selfishness is fixed; if the
altruistic benefit is decreased by half, selfishness is fixed or in large majority.
Neighborhood-groups must be very small, with only around 5 members, for
group selection to operate when the cost of altruism exceeds 10%. This doesn’t
seem plausibly true of foxes restraining their breeding.
You can guess by now, I think, that the group selectionists ultimately lost
the scientific argument. The kicker was not the mathematical argument, but
empirical observation: foxes didn’t restrain their breeding (I forget the exact
species of dispute; it wasn’t foxes and rabbits), and indeed, predator-prey
systems crash all the time. Group selectionism would later revive, somewhat,
in drastically different form—mathematically speaking, there is neighborhood
structure, which implies nonzero group selection pressure not necessarily
capable of overcoming countervailing individual selection pressure, and if you
don’t take it into account your math will be wrong, full stop. And evolved
enforcement mechanisms (not originally postulated) change the game entirely.
So why is this now-historical scientific dispute worthy material for Overcoming
Bias?
A decade after the controversy, a biologist had a fascinating idea. The
mathematical conditions for group selection overcoming individual selection
were too extreme to be found in Nature. Why not create them artificially, in
the laboratory? Michael J. Wade proceeded to do just that, repeatedly selecting
populations of insects for low numbers of adults per subpopulation.2 And what
was the result? Did the insects restrain their breeding and live in quiet peace
with enough food for all?
No; the adults adapted to cannibalize eggs and larvae, especially female
larvae.
Of course selecting for small subpopulation sizes would not select for indi-
viduals who restrained their own breeding; it would select for individuals who
ate other individuals’ children. Especially the girls.
Once you have that experimental result in hand—and it’s massively ob-
vious in retrospect—then it suddenly becomes clear how the original group
selectionists allowed romanticism, a human sense of aesthetics, to cloud their
predictions of Nature.
This is an archetypal example of a missed Third Alternative, resulting
from a rationalization of a predetermined bottom line that produced a fake
justification and then motivatedly stopped. The group selectionists didn’t start
with clear, fresh minds, happen upon the idea of group selection, and neu-
trally extrapolate forward the probable outcome. They started out with the
beautiful idea of fox populations voluntarily restraining their reproduction to
what the rabbit population would bear, Nature in perfect harmony; then they
searched for a reason why this would happen, and came up with the idea of
group selection; then, since they knew what they wanted the outcome of group
selection to be, they didn’t look for any less beautiful and aesthetic adaptations
that group selection would be more likely to promote instead. If they’d really
been trying to calmly and neutrally predict the result of selecting for small sub-
population sizes resistant to famine, they would have thought of cannibalizing
other organisms’ children or some similarly “ugly” outcome—long before they
imagined anything so evolutionarily outré as individual restraint in breeding!
This also illustrates the point I was trying to make in Einstein’s Arrogance:
With large answer spaces, nearly all of the real work goes into promoting one
possible answer to the point of being singled out for attention. If a hypothesis
is improperly promoted to your attention—your sense of aesthetics suggests
a beautiful way for Nature to be, and yet natural selection doesn’t involve an
Evolution Fairy who shares your appreciation—then this alone may seal your
doom, unless you can manage to clear your mind entirely and start over.
In principle, the world’s stupidest person may say the Sun is shining, but
that doesn’t make it dark out. Even if an answer is suggested by a lunatic on
LSD, you should be able to neutrally calculate the evidence for and against,
and if necessary, un-believe.
In practice, the group selectionists were doomed because their bottom line
was originally suggested by their sense of aesthetics, and Nature’s bottom line
was produced by natural selection. These two processes had no principled
reason for their outputs to correlate, and indeed they didn’t. All the furious
argument afterward didn’t change that.
If you start with your own desires for what Nature should do, consider
Nature’s own observed reasons for doing things, and then rationalize an ex-
tremely persuasive argument for why Nature should produce your preferred
outcome for Nature’s own reasons, then Nature, alas, still won’t listen. The
universe has no mind and is not subject to clever political persuasion. You can
argue all day why gravity should really make water flow uphill, and the water
just ends up in the same place regardless. It’s like the universe plain isn’t lis-
tening. J. R. Molloy said: “Nature is the ultimate bigot, because it is obstinately
and intolerantly devoted to its own prejudices and absolutely refuses to yield
to the most persuasive rationalizations of humans.”
I often recommend evolutionary biology to friends just because the modern
field tries to train its students against rationalization, error calling forth correc-
tion. Physicists and electrical engineers don’t have to be carefully trained to
avoid anthropomorphizing electrons, because electrons don’t exhibit mindish
behaviors. Natural selection creates purposefulnesses which are alien to hu-
mans, and students of evolutionary theory are warned accordingly. It’s good
training for any thinker, but it is especially important if you want to think
clearly about other weird mindish processes that do not work like you do.
1. David Sloan Wilson, “A Theory of Group Selection,” Proceedings of the National Academy of
Sciences of the United States of America 72, no. 1 (1975): 143–146.
2. Michael J. Wade, “Group selections among laboratory populations of Tribolium,” Proceedings of
the National Academy of Sciences of the United States of America 73, no. 12 (1976): 4604–4607,
doi:10.1073/pnas.73.12.4604.
137
Fake Optimization Criteria
*
138
Adaptation-Executers, Not
Fitness-Maximizers
Fifty thousand years ago, the taste buds of Homo sapiens directed their bearers
to the scarcest, most critical food resources—sugar and fat. Calories, in a word.
Today, the context of a taste bud’s function has changed, but the taste buds
themselves have not. Calories, far from being scarce (in First World countries),
are actively harmful. Micronutrients that were reliably abundant in leaves and
nuts are absent from bread, but our taste buds don’t complain. A scoop of ice
cream is a superstimulus, containing more sugar, fat, and salt than anything in
the ancestral environment.
No human being with the deliberate goal of maximizing their alleles’ in-
clusive genetic fitness would ever eat a cookie unless they were starving. But
individual organisms are best thought of as adaptation-executers, not fitness-
maximizers.1
A Phillips-head screwdriver, though its designer intended it to turn screws,
won’t reconform itself to a flat-head screw to fulfill its function. We created
these tools, but they exist independently of us, and they continue independently
of us.
The atoms of a screwdriver don’t have tiny little XML tags inside describing
their “objective” purpose. The designer had something in mind, yes, but that’s
not the same as what happens in the real world. If you forgot that the designer
is a separate entity from the designed thing, you might think, “The purpose of
the screwdriver is to drive screws”—as though this were an explicit property
of the screwdriver itself, rather than a property of the designer’s state of mind.
You might be surprised that the screwdriver didn’t reconfigure itself to the
flat-head screw, since, after all, the screwdriver’s purpose is to turn screws.
The cause of the screwdriver’s existence is the designer’s mind, which
imagined an imaginary screw, and imagined an imaginary handle turning.
The actual operation of the screwdriver, its actual fit to an actual screw head,
cannot be the objective cause of the screwdriver’s existence: The future cannot
cause the past. But the designer’s brain, as an actually existent thing within
the past, can indeed be the cause of the screwdriver.
The consequence of the screwdriver’s existence may not correspond to the
imaginary consequences in the designer’s mind. The screwdriver blade could
slip and cut the user’s hand.
And the meaning of the screwdriver—why, that’s something that exists in
the mind of a user, not in tiny little labels on screwdriver atoms. The designer
may intend it to turn screws. A murderer may buy it to use as a weapon. And
then accidentally drop it, to be picked up by a child, who uses it as a chisel.
So the screwdriver’s cause, and its shape, and its consequence, and its various
meanings, are all different things; and only one of these things is found within
the screwdriver itself.
Where do taste buds come from? Not from an intelligent designer visual-
izing their consequences, but from a frozen history of ancestry: Adam liked
sugar and ate an apple and reproduced, Barbara liked sugar and ate an apple
and reproduced, Charlie liked sugar and ate an apple and reproduced, and 2763
generations later, the allele became fixed in the population. For convenience of
thought, we sometimes compress this giant history and say: “Evolution did it.”
But it’s not a quick, local event like a human designer visualizing a screwdriver.
This is the objective cause of a taste bud.
What is the objective shape of a taste bud? Technically, it’s a molecular
sensor connected to reinforcement circuitry. This adds another level of indi-
rection, because the taste bud isn’t directly acquiring food. It’s influencing the
organism’s mind, making the organism want to eat foods that are similar to
the food just eaten.
What is the objective consequence of a taste bud? In a modern First World
human, it plays out in multiple chains of causality: from the desire to eat more
chocolate, to the plan to eat more chocolate, to eating chocolate, to getting fat,
to getting fewer dates, to reproducing less successfully. This consequence is
directly opposite the key regularity in the long chain of ancestral successes that
caused the taste bud’s shape. But, since overeating has only recently become
a problem, no significant evolution (compressed regularity of ancestry) has
further influenced the taste bud’s shape.
What is the meaning of eating chocolate? That’s between you and your
moral philosophy. Personally, I think chocolate tastes good, but I wish it were
less harmful; acceptable solutions would include redesigning the chocolate or
redesigning my biochemistry.
Smushing several of the concepts together, you could sort-of-say, “Modern
humans do today what would have propagated our genes in a hunter-gatherer
society, whether or not it helps our genes in a modern society.” But this still
isn’t quite right, because we’re not actually asking ourselves which behaviors
would maximize our ancestors’ inclusive fitness. And many of our activities
today have no ancestral analogue. In the hunter-gatherer society there wasn’t
any such thing as chocolate.
So it’s better to view our taste buds as an adaptation fitted to ancestral
conditions that included near-starvation and apples and roast rabbit, which
modern humans execute in a new context that includes cheap chocolate and
constant bombardment by advertisements.
Therefore it is said: Individual organisms are best thought of as adaptation-
executers, not fitness-maximizers.
1. John Tooby and Leda Cosmides, “The Psychological Foundations of Culture,” in The Adapted Mind:
Evolutionary Psychology and the Generation of Culture, ed. Jerome H. Barkow, Leda Cosmides,
and John Tooby (New York: Oxford University Press, 1992), 19–136.
139
Evolutionary Psychology
*
140
An Especially Elegant Evolutionary
Psychology Experiment
1. Robert Wright, The Moral Animal: Why We Are the Way We Are: The New Science of Evolutionary
Psychology (Pantheon Books, 1994); Charles B. Crawford, Brenda E. Salter, and Kerry L. Jang,
“Human Grief: Is Its Intensity Related to the Reproductive Value of the Deceased?,” Ethology and
Sociobiology 10, no. 4 (1989): 297–307.
2. People who say that evolutionary psychology hasn’t made any advance predictions are (ironically)
mere victims of “no one knows what science doesn’t know” syndrome. You wouldn’t even think of
this as an experiment to be performed if not for evolutionary psychology.
141
Superstimuli and the Collapse of
Western Civilization
At least three people have died playing online games for days without rest.
People have lost their spouses, jobs, and children to World of Warcraft. If
people have the right to play video games—and it’s hard to imagine a more
fundamental right—then the market is going to respond by supplying the most
engaging video games that can be sold, to the point that exceptionally engaged
consumers are removed from the gene pool.
How does a consumer product become so involving that, after 57 hours of
using the product, the consumer would rather use the product for one more
hour than eat or sleep? (I suppose one could argue that the consumer makes a
rational decision that they’d rather play Starcraft for the next hour than live
out the rest of their life, but let’s just not go there. Please.)
A candy bar is a superstimulus: it contains more concentrated sugar, salt, and
fat than anything that exists in the ancestral environment. A candy bar matches
taste buds that evolved in a hunter-gatherer environment, but it matches those
taste buds much more strongly than anything that actually existed in the hunter-
gatherer environment. The signal that once reliably correlated to healthy food
has been hijacked, blotted out with a point in tastespace that wasn’t in the train-
ing dataset—an impossibly distant outlier on the old ancestral graphs. Tastiness,
formerly representing the evolutionarily identified correlates of healthiness,
has been reverse-engineered and perfectly matched with an artificial substance.
Unfortunately there’s no equally powerful market incentive to make the re-
sulting food item as healthy as it is tasty. We can’t taste healthfulness, after
all.
The now-famous Dove Evolution video shows the painstaking construction
of another superstimulus: an ordinary woman transformed by makeup, careful
photography, and finally extensive Photoshopping, into a billboard model—a
beauty impossible, unmatchable by human women in the unretouched real
world. Actual women are killing themselves (e.g., supermodels using cocaine
to keep their weight down) to keep up with competitors that literally don’t
exist.
And likewise, a video game can be so much more engaging than mere reality,
even through a simple computer monitor, that someone will play it without
food or sleep until they literally die. I don’t know all the tricks used in video
games, but I can guess some of them—challenges poised at the critical point
between ease and impossibility, intermittent reinforcement, feedback showing
an ever-increasing score, social involvement in massively multiplayer games.
Is there a limit to the market incentive to make video games more engaging?
You might hope there’d be no incentive past the point where the players lose
their jobs; after all, they must be able to pay their subscription fee. This would
imply a “sweet spot” for the addictiveness of games, where the mode of the
bell curve is having fun, and only a few unfortunate souls on the tail become
addicted to the point of losing their jobs. As of 2007, playing World of Warcraft
for 58 hours straight until you literally die is still the exception rather than the
rule. But video game manufacturers compete against each other, and if you
can make your game 5% more addictive, you may be able to steal 50% of your
competitor’s customers. You can see how this problem could get a lot worse.
If people have the right to be tempted—and that’s what free will is all
about—the market is going to respond by supplying as much temptation as
can be sold. The incentive is to make your stimuli 5% more tempting than
those of your current leading competitors. This continues well beyond the
point where the stimuli become ancestrally anomalous superstimuli. Consider
how our standards of product-selling feminine beauty have changed since
the advertisements of the 1950s. And as candy bars demonstrate, the market
incentive also continues well beyond the point where the superstimulus begins
wreaking collateral damage on the consumer.
So why don’t we just say no? A key assumption of free-market economics
is that, in the absence of force and fraud, people can always refuse to engage in
a harmful transaction. (To the extent this is true, a free market would be, not
merely the best policy on the whole, but a policy with few or no downsides.)
An organism that regularly passes up food will die, as some video game
players found out the hard way. But, on some occasions in the ancestral
environment, a typically beneficial (and therefore tempting) act may in fact be
harmful. Humans, as organisms, have an unusually strong ability to perceive
these special cases using abstract thought. On the other hand we also tend to
imagine lots of special-case consequences that don’t exist, like ancestor spirits
commanding us not to eat perfectly good rabbits.
Evolution seems to have struck a compromise, or perhaps just aggregated
new systems on top of old. Homo sapiens are still tempted by food, but our
oversized prefrontal cortices give us a limited ability to resist temptation. Not
unlimited ability—our ancestors with too much willpower probably starved
themselves to sacrifice to the gods, or failed to commit adultery one too many
times. The video game players who died must have exercised willpower (in
some sense) to keep playing for so long without food or sleep; the evolutionary
hazard of self-control.
Resisting any temptation takes conscious expenditure of an exhaustible
supply of mental energy. It is not in fact true that we can “just say no”—not
just say no, without cost to ourselves. Even humans who won the birth lottery
for willpower or foresightfulness still pay a price to resist temptation. The price
is just more easily paid.
Our limited willpower evolved to deal with ancestral temptations; it may not
operate well against enticements beyond anything known to hunter-gatherers.
Even where we successfully resist a superstimulus, it seems plausible that the
effort required would deplete willpower much faster than resisting ancestral
temptations.
Is public display of superstimuli a negative externality, even to the people
who say no? Should we ban chocolate cookie ads, or storefronts that openly
say “Ice Cream”?
Just because a problem exists doesn’t show (without further justification
and a substantial burden of proof) that the government can fix it. The regu-
lator’s career incentive does not focus on products that combine low-grade
consumer harm with addictive superstimuli; it focuses on products with failure
modes spectacular enough to get into the newspaper. Conversely, just because
the government may not be able to fix something, doesn’t mean it isn’t going
wrong.
I leave you with a final argument from fictional evidence: Simon Funk’s
online novel After Life depicts (among other plot points) the planned exter-
mination of biological Homo sapiens—not by marching robot armies, but by
artificial children that are much cuter and sweeter and more fun to raise than
real children. Perhaps the demographic collapse of advanced societies hap-
pens because the market supplies ever-more-tempting alternatives to having
children, while the attractiveness of changing diapers remains constant over
time. Where are the advertising billboards that say “Breed”? Who will pay
professional image consultants to make arguing with sullen teenagers seem
more alluring than a vacation in Tahiti?
“In the end,” Simon Funk wrote, “the human species was simply marketed
out of existence.”
*
142
Thou Art Godshatter
Before the twentieth century, not a single human being had an explicit concept
of “inclusive genetic fitness,” the sole and absolute obsession of the blind idiot
god. We have no instinctive revulsion of condoms or oral sex. Our brains,
those supreme reproductive organs, don’t perform a check for reproductive
efficacy before granting us sexual pleasure.
Why not? Why aren’t we consciously obsessed with inclusive genetic fit-
ness? Why did the Evolution-of-Humans Fairy create brains that would invent
condoms? “It would have been so easy,” thinks the human, who can design
new complex systems in an afternoon.
The Evolution Fairy, as we all know, is obsessed with inclusive genetic fitness.
When she decides which genes to promote to universality, she doesn’t seem
to take into account anything except the number of copies a gene produces.
(How strange!)
But since the maker of intelligence is thus obsessed, why not create intel-
ligent agents—you can’t call them humans—who would likewise care purely
about inclusive genetic fitness? Such agents would have sex only as a means of
reproduction, and wouldn’t bother with sex that involved birth control. They
could eat food out of an explicitly reasoned belief that food was necessary to
reproduce, not because they liked the taste, and so they wouldn’t eat candy if
it became detrimental to survival or reproduction. Post-menopausal women
would babysit grandchildren until they became sick enough to be a net drain
on resources, and would then commit suicide.
It seems like such an obvious design improvement—from the Evolution
Fairy’s perspective.
Now it’s clear that it’s hard to build a powerful enough consequentialist.
Natural selection sort-of reasons consequentially, but only by depending on
the actual consequences. Human evolutionary theorists have to do really high-
falutin’ abstract reasoning in order to imagine the links between adaptations
and reproductive success.
But human brains clearly can imagine these links in protein. So when the
Evolution Fairy made humans, why did It bother with any motivation except
inclusive genetic fitness?
It’s been less than two centuries since a protein brain first represented
the concept of natural selection. The modern notion of “inclusive genetic
fitness” is even more subtle, a highly abstract concept. What matters is not
the number of shared genes. Chimpanzees share 95% of your genes. What
matters is shared genetic variance, within a reproducing population—your
sister is one-half related to you, because any variations in your genome, within
the human species, are 50% likely to be shared by your sister.
Only in the last century—arguably only in the last fifty years—have evolu-
tionary biologists really begun to understand the full range of causes of repro-
ductive success, things like reciprocal altruism and costly signaling. Without
all this highly detailed knowledge, an intelligent agent that set out to “maximize
inclusive fitness” would fall flat on its face.
So why not preprogram protein brains with the knowledge? Why wasn’t a
concept of “inclusive genetic fitness” programmed into us, along with a library
of explicit strategies? Then you could dispense with all the reinforcers. The
organism would be born knowing that, with high probability, fatty foods would
lead to fitness. If the organism later learned that this was no longer the case,
it would stop eating fatty foods. You could refactor the whole system. And it
wouldn’t invent condoms or cookies.
This looks like it should be quite possible in principle. I occasionally run
into people who don’t quite understand consequentialism, who say, “But if
the organism doesn’t have a separate drive to eat, it will starve, and so fail
to reproduce.” So long as the organism knows this very fact, and has a utility
function that values reproduction, it will automatically eat. In fact, this is
exactly the consequentialist reasoning that natural selection itself used to build
automatic eaters.
What about curiosity? Wouldn’t a consequentialist only be curious when
it saw some specific reason to be curious? And wouldn’t this cause it to miss
out on lots of important knowledge that came with no specific reason for
investigation attached? Again, a consequentialist will investigate given only
the knowledge of this very same fact. If you consider the curiosity drive of a
human—which is not undiscriminating, but responds to particular features of
problems—then this complex adaptation is purely the result of consequentialist
reasoning by DNA, an implicit representation of knowledge: Ancestors who
engaged in this kind of inquiry left more descendants.
So in principle, the pure reproductive consequentialist is possible. In prin-
ciple, all the ancestral history implicitly represented in cognitive adaptations
can be converted to explicitly represented knowledge, running on a core conse-
quentialist.
But the blind idiot god isn’t that smart. Evolution is not a human program-
mer who can simultaneously refactor whole code architectures. Evolution is
not a human programmer who can sit down and type out instructions at sixty
words per minute.
For millions of years before hominid consequentialism, there was rein-
forcement learning. The reward signals were events that correlated reliably to
reproduction. You can’t ask a nonhominid brain to foresee that a child eat-
ing fatty foods now will live through the winter. So the DNA builds a protein
brain that generates a reward signal for eating fatty food. Then it’s up to the
organism to learn which prey animals are tastiest.
DNA constructs protein brains with reward signals that have a long-distance
correlation to reproductive fitness, but a short-distance correlation to organism
behavior. You don’t have to figure out that eating sugary food in the fall will
lead to digesting calories that can be stored as fat to help you survive the winter
so that you mate in spring to produce offspring in summer. An apple simply
tastes good, and your brain just has to plot out how to get more apples off the
tree.
And so organisms evolve rewards for eating, and building nests, and scaring
off competitors, and helping siblings, and discovering important truths, and
forming strong alliances, and arguing persuasively, and of course having sex . . .
When hominid brains capable of cross-domain consequential reasoning
began to show up, they reasoned consequentially about how to get the existing
reinforcers. It was a relatively simple hack, vastly simpler than rebuilding an
“inclusive fitness maximizer” from scratch. The protein brains plotted how
to acquire calories and sex, without any explicit cognitive representation of
“inclusive fitness.”
A human engineer would have said, “Whoa, I’ve just invented a conse-
quentialist! Now I can take all my previous hard-won knowledge about which
behaviors improve fitness, and declare it explicitly! I can convert all this compli-
cated reinforcement learning machinery into a simple declarative knowledge
statement that ‘fatty foods and sex usually improve your inclusive fitness.’ Con-
sequential reasoning will automatically take care of the rest. Plus, it won’t have
the obvious failure mode where it invents condoms!”
But then a human engineer wouldn’t have built the retina backward, either.
The blind idiot god is not a unitary purpose, but a many-splintered attention.
Foxes evolve to catch rabbits, rabbits evolve to evade foxes; there are as many
evolutions as species. But within each species, the blind idiot god is purely
obsessed with inclusive genetic fitness. No quality is valued, not even survival,
except insofar as it increases reproductive fitness. There’s no point in an
organism with steel skin if it ends up having 1% less reproductive capacity.
Yet when the blind idiot god created protein computers, its monomaniacal
focus on inclusive genetic fitness was not faithfully transmitted. Its optimiza-
tion criterion did not successfully quine. We, the handiwork of evolution, are
as alien to evolution as our Maker is alien to us. One pure utility function
splintered into a thousand shards of desire.
Why? Above all, because evolution is stupid in an absolute sense. But also
because the first protein computers weren’t anywhere near as general as the
blind idiot god, and could only utilize short-term desires.
In the final analysis, asking why evolution didn’t build humans to maxi-
mize inclusive genetic fitness is like asking why evolution didn’t hand humans
a ribosome and tell them to design their own biochemistry. Because evolution
can’t refactor code that fast, that’s why. But maybe in a billion years of con-
tinued natural selection that’s exactly what would happen, if intelligence were
foolish enough to allow the idiot god continued reign.
The Mote in God’s Eye by Niven and Pournelle depicts an intelligent species
that stayed biological a little too long, slowly becoming truly enslaved by
evolution, gradually turning into true fitness maximizers obsessed with outre-
producing each other. But thankfully that’s not what happened. Not here on
Earth. At least not yet.
So humans love the taste of sugar and fat, and we love our sons and daugh-
ters. We seek social status, and sex. We sing and dance and play. We learn for
the love of learning.
A thousand delicious tastes, matched to ancient reinforcers that once cor-
related with reproductive fitness—now sought whether or not they enhance
reproduction. Sex with birth control, chocolate, the music of long-dead Bach
on a CD.
And when we finally learn about evolution, we think to ourselves: “Obsess
all day about inclusive genetic fitness? Where’s the fun in that?”
The blind idiot god’s single monomaniacal goal splintered into a thousand
shards of desire. And this is well, I think, though I’m a human who says so.
Or else what would we do with the future? What would we do with the billion
galaxies in the night sky? Fill them with maximally efficient replicators? Should
our descendants deliberately obsess about maximizing their inclusive genetic
fitness, regarding all else only as a means to that end?
Being a thousand shards of desire isn’t always fun, but at least it’s not boring.
Somewhere along the line, we evolved tastes for novelty, complexity, elegance,
and challenge—tastes that judge the blind idiot god’s monomaniacal focus,
and find it aesthetically unsatisfying.
And yes, we got those very same tastes from the blind idiot’s godshatter.
So what?
*
Part M
Fragile Purposes
143
Belief in Intelligence
I don’t know what moves Garry Kasparov would make in a chess game. What,
then, is the empirical content of my belief that “Kasparov is a highly intelligent
chess player”? What real-world experience does my belief tell me to anticipate?
Is it a cleverly masked form of total ignorance?
To sharpen the dilemma, suppose Kasparov plays against some mere chess
grandmaster Mr. G, who’s not in the running for world champion. My own
ability is far too low to distinguish between these levels of chess skill. When I
try to guess Kasparov’s move, or Mr. G’s next move, all I can do is try to guess
“the best chess move” using my own meager knowledge of chess. Then I would
produce exactly the same prediction for Kasparov’s move or Mr. G’s move in
any particular chess position. So what is the empirical content of my belief
that “Kasparov is a better chess player than Mr. G”?
The empirical content of my belief is the testable, falsifiable prediction
that the final chess position will occupy the class of chess positions that are
wins for Kasparov, rather than drawn games or wins for Mr. G. (Counting
resignation as a legal move that leads to a chess position classified as a loss.)
The degree to which I think Kasparov is a “better player” is reflected in the
amount of probability mass I concentrate into the “Kasparov wins” class of
outcomes, versus the “drawn game” and “Mr. G wins” class of outcomes. These
classes are extremely vague in the sense that they refer to vast spaces of possible
chess positions—but “Kasparov wins” is more specific than maximum entropy,
because it can be definitely falsified by a vast set of chess positions.
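Concretely, here is a minimal sketch with assumed numbers (nothing from real
chess ratings): the belief is a probability distribution over outcome classes,
and its empirical content shows up in how it scores against the observed
result compared with total ignorance.

import math

belief = {"kasparov_wins": 0.85, "draw": 0.10, "mr_g_wins": 0.05}   # assumed numbers
ignorance = {"kasparov_wins": 1/3, "draw": 1/3, "mr_g_wins": 1/3}   # maximum entropy

def log_score(distribution, observed):
    # Log probability assigned to what actually happened; higher is better.
    return math.log(distribution[observed])

observed = "kasparov_wins"   # hypothetical game result
print(log_score(belief, observed) - log_score(ignorance, observed))  # positive: the belief paid rent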
The outcome of Kasparov’s game is predictable because I know, and un-
derstand, Kasparov’s goals. Within the confines of the chess board, I know
Kasparov’s motivations—I know his success criterion, his utility function, his
target as an optimization process. I know where Kasparov is ultimately trying
to steer the future and I anticipate he is powerful enough to get there, although
I don’t anticipate much about how Kasparov is going to do it.
Imagine that I’m visiting a distant city, and a local friend volunteers to
drive me to the airport. I don’t know the neighborhood. Each time my friend
approaches a street intersection, I don’t know whether my friend will turn
left, turn right, or continue straight ahead. I can’t predict my friend’s move
even as we approach each individual intersection—let alone predict the whole
sequence of moves in advance.
Yet I can predict the result of my friend’s unpredictable actions: we will
arrive at the airport. Even if my friend’s house were located elsewhere in
the city, so that my friend made a completely different sequence of turns, I
would just as confidently predict our arrival at the airport. I can predict this
long in advance, before I even get into the car. My flight departs soon, and
there’s no time to waste; I wouldn’t get into the car in the first place, if I
couldn’t confidently predict that the car would travel to the airport along an
unpredictable pathway.
Isn’t this a remarkable situation to be in, from a scientific perspective? I
can predict the outcome of a process, without being able to predict any of the
intermediate steps of the process.
How is this even possible? Ordinarily one predicts by imagining the present
and then running the visualization forward in time. If you want a precise model
of the Solar System, one that takes into account planetary perturbations, you
must start with a model of all major objects and run that model forward in
time, step by step.
Sometimes simpler problems have a closed-form solution, where calculat-
ing the future at time T takes the same amount of work regardless of T. A coin
rests on a table, and after each minute, the coin turns over. The coin starts
out showing heads. What face will it show a hundred minutes later? Obvi-
ously you did not answer this question by visualizing a hundred intervening
steps. You used a closed-form solution that worked to predict the outcome,
and would also work to predict any of the intervening steps.
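In code, the closed form is just a parity check; a trivial sketch, but it shows
that the cost of the prediction does not grow with the number of minutes:

def coin_face(minutes, start="heads"):
    # The coin flips once per minute, so only the parity of the elapsed time matters.
    if minutes % 2 == 0:
        return start
    return "tails" if start == "heads" else "heads"

print(coin_face(100))  # heads
print(coin_face(57))   # tails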
But when my friend drives me to the airport, I can predict the outcome
successfully using a strange model that won’t work to predict any of the interme-
diate steps. My model doesn’t even require me to input the initial conditions—I
don’t need to know where we start out in the city!
I do need to know something about my friend. I must know that my friend
wants me to make my flight. I must credit that my friend is a good enough
planner to successfully drive me to the airport (if he wants to). These are
properties of my friend’s initial state—properties which let me predict the final
destination, though not any intermediate turns.
I must also credit that my friend knows enough about the city to drive
successfully. This may be regarded as a relation between my friend and the
city; hence, a property of both. But an extremely abstract property, which does
not require any specific knowledge about either the city, or about my friend’s
knowledge about the city.
This is one way of viewing the subject matter to which I’ve devoted my
life—these remarkable situations which place us in such odd epistemic positions.
And my work, in a sense, can be viewed as unraveling the exact form of that
strange abstract knowledge we can possess; whereby, not knowing the actions,
we can justifiably know the consequence.
“Intelligence” is too narrow a term to describe these remarkable situations
in full generality. I would say rather “optimization process.” A similar situation
accompanies the study of biological natural selection, for example; we can’t
predict the exact form of the next organism observed.
But my own specialty is the kind of optimization process called “intelli-
gence”; and even narrower, a particular kind of intelligence called “Friendly
Artificial Intelligence”—of which, I hope, I will be able to obtain especially
precise abstract knowledge.
*
144
Humans in Funny Suits
Many times the human species has travelled into space, only to find the stars
inhabited by aliens who look remarkably like humans in funny suits—or even
humans with a touch of makeup and latex—or just beige Caucasians in fee
simple.
*
145
Optimization and the Intelligence
Explosion
Among the topics I haven’t delved into here is the notion of an optimization
process. Roughly, this is the idea that your power as a mind is your ability to
hit small targets in a large search space—this can be either the space of possible
futures (planning) or the space of possible designs (invention).
Suppose you have a car, and suppose we already know that your preferences
involve travel. Now suppose that you take all the parts in the car, or all the
atoms, and jumble them up at random. It’s very unlikely that you’ll end up with
a travel-artifact at all, even so much as a wheeled cart; let alone a travel-artifact
that ranks as high in your preferences as the original car. So, relative to your
preference ordering, the car is an extremely improbable artifact. The power of
an optimization process is that it can produce this kind of improbability.
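A toy sketch, invented purely for illustration, gives the flavor of that
improbability: a random shuffle of forty bits almost never matches a target
pattern, while even the crudest hill-climber, which keeps a mutation whenever
it is no worse, finds the target reliably.

import random

TARGET = [1] * 40   # one "preferred" configuration out of 2**40

def score(candidate):
    return sum(1 for a, b in zip(candidate, TARGET) if a == b)

def random_search(tries=10_000):
    return max(score([random.randint(0, 1) for _ in range(40)]) for _ in range(tries))

def hill_climb(steps=10_000):
    current = [random.randint(0, 1) for _ in range(40)]
    for _ in range(steps):
        mutant = current[:]
        mutant[random.randrange(40)] ^= 1      # flip one random bit
        if score(mutant) >= score(current):    # keep it if it is no worse
            current = mutant
    return score(current)

print("best random score:", random_search())   # typically around 30 out of 40
print("hill-climber score:", hill_climb())     # typically 40 out of 40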
You can view both intelligence and natural selection as special cases of opti-
mization: processes that hit, in a large search space, very small targets defined
by implicit preferences. Natural selection prefers more efficient replicators.
Human intelligences have more complex preferences. Neither evolution nor
humans have consistent utility functions, so viewing them as “optimization
processes” is understood to be an approximation. You’re trying to get at the sort
of work being done, not claim that humans or evolution do this work perfectly.
This is how I see the story of life and intelligence—as a story of improbably
good designs being produced by optimization processes. The “improbability”
here is improbability relative to a random selection from the design space,
not improbability in an absolute sense—if you have an optimization process
around, then “improbably” good designs become probable.
Looking over the history of optimization on Earth up until now, the first
step is to conceptually separate the meta level from the object level—separate
the structure of optimization from that which is optimized.
If you consider biology in the absence of hominids, then on the object
level we have things like dinosaurs and butterflies and cats. On the meta level
we have things like sexual recombination and natural selection of asexual
populations. The object level, you will observe, is rather more complicated
than the meta level. Natural selection is not an easy subject and it involves
math. But if you look at the anatomy of a whole cat, the cat has dynamics
immensely more complicated than “mutate, recombine, reproduce.”
This is not surprising. Natural selection is an accidental optimization pro-
cess, that basically just started happening one day in a tidal pool somewhere.
A cat is the subject of millions of generations and billions of years of evolution.
Cats have brains, of course, which operate to learn over a lifetime; but at
the end of the cat’s lifetime, that information is thrown away, so it does not
accumulate. The cumulative effects of cat-brains upon the world as optimizers,
therefore, are relatively small.
Or consider a bee brain, or a beaver brain. A bee builds hives, and a beaver
builds dams; but they didn’t figure out how to build them from scratch. A
beaver can’t figure out how to build a hive, a bee can’t figure out how to build
a dam.
So animal brains—up until recently—were not major players in the plan-
etary game of optimization; they were pieces but not players. Compared to
evolution, brains lacked both generality of optimization power (they could
not produce the amazing range of artifacts produced by evolution) and cu-
mulative optimization power (their products did not accumulate complexity
over time). For more on this theme see Protein Reinforcement and DNA
Consequentialism.
Very recently, certain animal brains have begun to exhibit both generality
of optimization power (producing an amazingly wide range of artifacts, in
time scales too short for natural selection to play any significant role) and
cumulative optimization power (artifacts of increasing complexity, as a result
of skills passed on through language and writing).
Natural selection takes hundreds of generations to do anything and mil-
lions of years for de novo complex designs. Human programmers can design a
complex machine with a hundred interdependent elements in a single after-
noon. This is not surprising, since natural selection is an accidental optimiza-
tion process that basically just started happening one day, whereas humans are
optimized optimizers handcrafted by natural selection over millions of years.
The wonder of evolution is not how well it works, but that it works at all
without being optimized. This is how optimization bootstrapped itself into
the universe—starting, as one would expect, from an extremely inefficient
accidental optimization process. Which is not the accidental first replicator,
mind you, but the accidental first process of natural selection. Distinguish the
object level and the meta level!
Since the dawn of optimization in the universe, a certain structural com-
monality has held across both natural selection and human intelligence . . .
Natural selection selects on genes, but generally speaking, the genes do not
turn around and optimize natural selection. The invention of sexual recombi-
nation is an exception to this rule, and so is the invention of cells and DNA.
And you can see both the power and the rarity of such events, by the fact that
evolutionary biologists structure entire histories of life on Earth around them.
But if you step back and take a human standpoint—if you think like a
programmer—then you can see that natural selection is still not all that com-
plicated. We’ll try bundling different genes together? We’ll try separating
information storage from moving machinery? We’ll try randomly recombin-
ing groups of genes? On an absolute scale, these are the sort of bright ideas
that any smart hacker comes up with during the first ten minutes of thinking
about system architectures.
Because natural selection started out so inefficient (as a completely acci-
dental process), this tiny handful of meta-level improvements feeding back
in from the replicators—nowhere near as complicated as the structure of a
cat—structure the evolutionary epochs of life on Earth.
And after all that, natural selection is still a blind idiot of a god. Gene pools
can evolve to extinction, despite all cells and sex.
Now natural selection does feed on itself in the sense that each new adapta-
tion opens up new avenues of further adaptation; but that takes place on the
object level. The gene pool feeds on its own complexity—but only thanks to
the protected interpreter of natural selection that runs in the background, and
that is not itself rewritten or altered by the evolution of species.
Likewise, human beings invent sciences and technologies, but we have not
yet begun to rewrite the protected structure of the human brain itself. We have
a prefrontal cortex and a temporal cortex and a cerebellum, just like the first
inventors of agriculture. We haven’t started to genetically engineer ourselves.
On the object level, science feeds on science, and each new discovery paves the
way for new discoveries—but all that takes place with a protected interpreter,
the human brain, running untouched in the background.
We have meta-level inventions like science, that try to instruct humans in
how to think. But the first person to invent Bayes’s Theorem did not become a
Bayesian; they could not rewrite themselves, lacking both that knowledge and
that power. Our significant innovations in the art of thinking, like writing and
science, are so powerful that they structure the course of human history; but
they do not rival the brain itself in complexity, and their effect upon the brain
is comparatively shallow.
The present state of the art in rationality training is not sufficient to turn
an arbitrarily selected mortal into Albert Einstein, which shows the power of a
few minor genetic quirks of brain design compared to all the self-help books
ever written in the twentieth century.
Because the brain hums away invisibly in the background, people tend
to overlook its contribution and take it for granted; and talk as if the simple
instruction to “Test ideas by experiment,” or the p < 0.05 significance rule,
were the same order of contribution as an entire human brain. Try telling
chimpanzees to test their ideas by experiment and see how far you get.
Now . . . some of us want to intelligently design an intelligence that would
be capable of intelligently redesigning itself, right down to the level of machine
code.
The machine code at first, and the laws of physics later, would be a protected
level of a sort. But that “protected level” would not contain the dynamic of
optimization; the protected levels would not structure the work. The human
brain does quite a bit of optimization on its own, and screws up on its own,
no matter what you try to tell it in school. But this fully wraparound recursive
optimizer would have no protected level that was optimizing. All the structure
of optimization would be subject to optimization itself.
And that is a sea change which breaks with the entire past since the first
replicator, because it breaks the idiom of a protected meta level.
The history of Earth up until now has been a history of optimizers spinning
their wheels at a constant rate, generating a constant optimization pressure.
And creating optimized products, not at a constant rate, but at an accelerating
rate, because of how object-level innovations open up the pathway to other
object-level innovations. But that acceleration is taking place with a protected
meta level doing the actual optimizing. Like a search that leaps from island to
island in the search space, and good islands tend to be adjacent to even better
islands, but the jumper doesn’t change its legs. Occasionally, a few tiny little
changes manage to hit back to the meta level, like sex or science, and then
the history of optimization enters a new epoch and everything proceeds faster
from there.
Imagine an economy without investment, or a university without language,
a technology without tools to make tools. Once in a hundred million years, or
once in a few centuries, someone invents a hammer.
That is what optimization has been like on Earth up until now.
When I look at the history of Earth, I don’t see a history of optimization
over time. I see a history of optimization power in, and optimized products out.
Up until now, thanks to the existence of almost entirely protected meta-levels,
it’s been possible to split up the history of optimization into epochs, and, within
each epoch, graph the cumulative object-level optimization over time, because
the protected level is running in the background and is not itself changing
within an epoch.
What happens when you build a fully wraparound, recursively self-
improving AI? Then you take the graph of “optimization in, optimized out,”
and fold the graph in on itself. Metaphorically speaking.
If the AI is weak, it does nothing, because it is not powerful enough to
significantly improve itself—like telling a chimpanzee to rewrite its own brain.
If the AI is powerful enough to rewrite itself in a way that increases its
ability to make further improvements, and this reaches all the way down to
the AI’s full understanding of its own source code and its own design as an
optimizer . . . then even if the graph of “optimization power in” and “optimized
product out” looks essentially the same, the graph of optimization over time is
going to look completely different from Earth’s history so far.
People often say something like, “But what if it requires exponentially
greater amounts of self-rewriting for only a linear improvement?” To this
the obvious answer is, “Natural selection exerted roughly constant optimiza-
tion power on the hominid line in the course of coughing up humans; and
this doesn’t seem to have required exponentially more time for each linear
increment of improvement.”
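A deliberately silly numerical sketch, with arbitrary units and constants and
no claim about real AI, shows the shape of the disagreement: constant external
optimization pressure accumulates capability roughly linearly, while a
wraparound optimizer whose pressure scales with its own capability compounds.

def external_pressure(steps=10, pressure=1.0):
    capability = 1.0
    for _ in range(steps):
        capability += pressure                 # the optimizer itself never improves
    return capability

def wraparound(steps=10, efficiency=0.5):
    capability = 1.0
    for _ in range(steps):
        capability += efficiency * capability  # improvements feed back into the optimizer
    return capability

print(external_pressure())  # 11.0 (linear accumulation)
print(wraparound())         # about 57.7 (compounding)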
All of this is still mere analogic reasoning. A full Artificial General Intelli-
gence thinking about the nature of optimization and doing its own AI research
and rewriting its own source code, is not really like a graph of Earth’s history
folded in on itself. It is a different sort of beast. These analogies are at best
good for qualitative predictions, and even then, I have a large amount of other
beliefs I haven’t yet explained, which are telling me which analogies to make,
et cetera.
But if you want to know why I might be reluctant to extend the graph of
biological and economic growth over time, into the future and over the horizon
of an AI that thinks at transistor speeds and invents self-replicating molecular
nanofactories and improves its own source code, then there is my reason: you
are drawing the wrong graph, and it should be optimization power in versus
optimized product out, not optimized product versus time.
*
146
Ghosts in the Machine
People hear about Friendly AI and say—this is one of the top three initial
reactions:
“Oh, you can try to tell the AI to be Friendly, but if the AI can modify its
own source code, it’ll just remove any constraints you try to place on it.”
And where does that decision come from?
Does it enter from outside causality, rather than being an effect of a lawful
chain of causes that started with the source code as originally written? Is the
AI the ultimate source of its own free will?
A Friendly AI is not a selfish AI constrained by a special extra conscience
module that overrides the AI’s natural impulses and tells it what to do. You just
build the conscience, and that is the AI. If you have a program that computes
which decision the AI should make, you’re done. The buck stops immediately.
At this point, I shall take a moment to quote some case studies from the
Computer Stupidities site and Programming subtopic. (I am not linking to
this, because it is a fearsome time-trap; you can Google if you dare.)
begin
read("Number of Apples", apples)
read("Number of Carrots", carrots)
read("Price for 1 Apple", a_price)
read("Price for 1 Carrot", c_price)
write("Total for Apples", a_total)
write("Total for Carrots", c_total)
write("Total", total)
total = a_total + c_total
a_total = apples * a_price
c_total = carrots * c_price
end
Me: “Well, your program can’t print correct results before they’re
computed.”
Him: “Huh? It’s logical what the right solution is, and the com-
puter should reorder the instructions the right way.”
*
147
Artificial Addition
Suppose that human beings had absolutely no idea how they performed arith-
metic. Imagine that human beings had evolved, rather than having learned,
the ability to count sheep and add sheep. People using this built-in ability have
no idea how it worked, the way Aristotle had no idea how his visual cortex sup-
ported his ability to see things. Peano Arithmetic as we know it has not been
invented. There are philosophers working to formalize numerical intuitions,
but they employ notations such as
Plus-Of(Seven, Six) = Thirteen
to formalize the intuitively obvious fact that when you add “seven” plus “six,”
of course you get “thirteen.”
In this world, pocket calculators work by storing a giant lookup table of
arithmetical facts, entered manually by a team of expert Artificial Arithmeti-
cians, for starting values that range between zero and one hundred. While
these calculators may be helpful in a pragmatic sense, many philosophers argue
that they’re only simulating addition, rather than really adding. No machine
can really count—that’s why humans have to count thirteen sheep before typ-
ing “thirteen” into the calculator. Calculators can recite back stored facts, but
they can never know what the statements mean—if you type in “two hundred
plus two hundred” the calculator says “Error: Outrange,” when it’s intuitively
obvious, if you know what the words mean, that the answer is “four hundred.”
Some philosophers, of course, are not so naive as to be taken in by these in-
tuitions. Numbers are really a purely formal system—the label “thirty-seven” is
meaningful, not because of any inherent property of the words themselves, but
because the label refers to thirty-seven sheep in the external world. A number
is given this referential property by its semantic network of relations to other
numbers. That’s why, in computer programs, the Lisp token for “thirty-seven”
doesn’t need any internal structure—it’s only meaningful because of reference
and relation, not some computational property of “thirty-seven” itself.
No one has ever developed an Artificial General Arithmetician, though of
course there are plenty of domain-specific, narrow Artificial Arithmeticians
that work on numbers between “twenty” and “thirty,” and so on. And if you
look at how slow progress has been on numbers in the range of “two hundred,”
then it becomes clear that we’re not going to get Artificial General Arithmetic
any time soon. The best experts in the field estimate it will be at least a hundred
years before calculators can add as well as a human twelve-year-old.
But not everyone agrees with this estimate, or with merely conventional
beliefs about Artificial Arithmetic. It’s common to hear statements such as the
following:
• “You’re all wrong. Past efforts to create machine arithmetic were futile
from the start, because they just didn’t have enough computing power.
If you look at how many trillions of synapses there are in the human
brain, it’s clear that calculators don’t have lookup tables anywhere near
that large. We need calculators as powerful as a human brain. According
to Moore’s Law, this will occur in the year 2031 on April 27 between
4:00 and 4:30 in the morning.”
• “But Gödel’s Theorem shows that no formal system can ever capture the
basic properties of arithmetic. Classical physics is formalizable, so to
add two and two, the brain must take advantage of quantum physics.”
There is more than one moral to this parable, and I have told it with different
morals in different contexts. It illustrates the idea of levels of organization,
for example—a CPU can add two large numbers because the numbers aren’t
black-box opaque objects, they’re ordered structures of 32 bits.
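A small sketch, invented here and not part of the original parable, makes the
contrast concrete: a lookup table treats numbers as opaque labels and fails
outside its entered range, while ripple-carry addition works on any inputs
because it manipulates the numbers’ internal structure as ordered bits.

LOOKUP = {(a, b): a + b for a in range(101) for b in range(101)}  # expert-entered "facts", 0..100

def lookup_add(a, b):
    if (a, b) not in LOOKUP:
        raise ValueError("Error: Outrange")
    return LOOKUP[(a, b)]

def to_bits(n, width=32):
    return [(n >> i) & 1 for i in range(width)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_carry_add(a_bits, b_bits):
    # Add two little-endian bit lists the way a hardware adder circuit does.
    result, carry = [], 0
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        result.append(total % 2)
        carry = total // 2
    result.append(carry)
    return result

print(from_bits(ripple_carry_add(to_bits(200), to_bits(200))))  # 400, no table required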
But for purposes of overcoming bias, let us draw two morals:
• First, the danger of believing assertions you can’t regenerate from your
own knowledge.
*
148
Terminal Values and Instrumental Values
If you can’t read the type system directly, don’t worry, I’ll always translate into
English. For programmers, seeing it described in distinct statements helps to
set up distinct mental objects.
And the decision system itself?
• Expected_Utility : Action A ->
(Sum O in Outcomes: Utility(O) * Probability(O|A))
– The “expected utility” of an action equals the sum, over all out-
comes, of the utility of that outcome times the conditional proba-
bility of that outcome given that action.
– E.g., EU(administer penicillin) = 0.9, while EU(don’t administer
penicillin) = 0.3.
• Choose :
-> (Argmax A in Actions: Expected_Utility(A))
For every action, calculate the conditional probability of all the consequences
that might follow, then add up the utilities of those consequences times their
conditional probability. Then pick the best action.
This is a mathematically simple sketch of a decision system. It is not an
efficient way to compute decisions in the real world.
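Still, the sketch is directly executable. A minimal Python transcription, with
the utilities and conditional probabilities assumed so as to reproduce the
example numbers above:

def expected_utility(action, outcomes, utility, probability):
    # Sum over outcomes of Utility(O) * Probability(O|A).
    return sum(utility[o] * probability[(o, action)] for o in outcomes)

def choose(actions, outcomes, utility, probability):
    # Argmax over actions of Expected_Utility(A).
    return max(actions, key=lambda a: expected_utility(a, outcomes, utility, probability))

actions = ["administer penicillin", "don't administer penicillin"]
outcomes = ["patient recovers", "patient dies"]
utility = {"patient recovers": 1.0, "patient dies": 0.0}   # assumed utilities
probability = {                                            # assumed P(outcome | action)
    ("patient recovers", "administer penicillin"): 0.9,
    ("patient dies", "administer penicillin"): 0.1,
    ("patient recovers", "don't administer penicillin"): 0.3,
    ("patient dies", "don't administer penicillin"): 0.7,
}
print(choose(actions, outcomes, utility, probability))     # administer penicillin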
What if, for example, you need a sequence of acts to carry out a plan? The
formalism can easily represent this by letting each Action stand for a whole
sequence. But this creates an exponentially large space, like the space of all
sentences you can type in 100 letters. As a simple example, if one of the possible
acts on the first turn is “Shoot my own foot off,” a human planner will decide
this is a bad idea generally—eliminate all sequences beginning with this action.
But we’ve flattened this structure out of our representation. We don’t have
sequences of acts, just flat “actions.”
So, yes, there are a few minor complications. Obviously so, or we’d just run
out and build a real AI this way. In that sense, it’s much the same as Bayesian
probability theory itself.
But this is one of those times when it’s a surprisingly good idea to consider
the absurdly simple version before adding in any high-falutin’ complications.
Consider the philosopher who asserts, “All of us are ultimately selfish; we
care only about our own states of mind. The mother who claims to care about
her son’s welfare, really wants to believe that her son is doing well—this belief is
what makes the mother happy. She helps him for the sake of her own happiness,
not his.” You say, “Well, suppose the mother sacrifices her life to push her son
out of the path of an oncoming truck. That’s not going to make her happy, just
dead.” The philosopher stammers for a few moments, then replies, “But she
still did it because she valued that choice above others—because of the feeling
of importance she attached to that decision.”
So you say,
*
149
Leaky Generalizations
Are apples good to eat? Usually, but some apples are rotten.
Do humans have ten fingers? Most of us do, but plenty of people have lost
a finger and nonetheless qualify as “human.”
Unless you descend to a level of description far below any macroscopic
object—below societies, below people, below fingers, below tendon and bone,
below cells, all the way down to particles and fields where the laws are truly
universal—practically every generalization you use in the real world will be
leaky.
(Though there may, of course, be some exceptions to the above rule . . .)
Mostly, the way you deal with leaky generalizations is that, well, you just
have to deal. If the cookie market almost always closes at 10 p.m., except on
Thanksgiving it closes at 6 p.m., and today happens to be National Native
American Genocide Day, you’d better show up before 6 p.m. or you won’t get
a cookie.
Our ability to manipulate leaky generalizations is opposed by need for
closure, the degree to which we want to say once and for all that humans have
ten fingers, and get frustrated when we have to tolerate continued ambiguity.
Raising the value of the stakes can increase need for closure—which shuts
down complexity tolerance when complexity tolerance is most needed.
Life would be complicated even if the things we wanted were simple (they
aren’t). The leakiness of leaky generalizations about what-to-do-next would
leak in from the leaky structure of the real world. Or to put it another way:
Instrumental values often have no specification that is both compact and
local.
Suppose there’s a box containing a million dollars. The box is locked,
not with an ordinary combination lock, but with a dozen keys controlling a
machine that can open the box. If you know how the machine works, you can
deduce which sequences of key-presses will open the box. There’s more than
one key sequence that can trigger the machine to open the box. But if you
press a sufficiently wrong sequence, the machine incinerates the money. And
if you don’t know about the machine, there are no simple rules like “Pressing
any key three times opens the box” or “Pressing five different keys with no
repetitions incinerates the money.”
There’s a compact nonlocal specification of which keys you want to press:
You want to press keys such that they open the box. You can write a compact
computer program that computes which key sequences are good, bad or neutral,
but the computer program will need to describe the machine, not just the keys
themselves.
There’s likewise a local noncompact specification of which keys to press:
a giant lookup table of the results for each possible key sequence. It’s a very
large computer program, but it makes no mention of anything except the keys.
But there’s no way to describe which key sequences are good, bad, or neutral,
which is both simple and phrased only in terms of the keys themselves.
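Here is a toy version of that contrast in Python. The machine’s internal rule is invented purely for illustration; the point is only the shape of the two specifications. The compact one has to describe the machine, while the local one is nothing but a huge table phrased in terms of the keys.

    from itertools import product

    KEYS = range(12)   # a dozen keys

    def machine_outcome(sequence):
        # Compact but NONLOCAL: this function describes the machine, not just the keys.
        # (The rule itself is invented for the sake of the example.)
        total = sum(sequence)
        if total % 7 == 0:
            return "open"
        if total % 7 == 1:
            return "incinerate"
        return "nothing"

    # Local but NONCOMPACT: a giant lookup table mentioning nothing except key sequences.
    # Even restricted to three presses it has 12**3 = 1,728 entries.
    lookup_table = {seq: machine_outcome(seq) for seq in product(KEYS, repeat=3)}

    print(machine_outcome((3, 4, 7)))   # open        (3 + 4 + 7 = 14, a multiple of 7)
    print(lookup_table[(0, 0, 1)])      # incinerate  (sum is one more than a multiple of 7)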
It may be even worse if there are tempting local generalizations which turn
out to be leaky. Pressing most keys three times in a row will open the box,
but there’s a particular key that incinerates the money if you press it just once.
You might think you had found a perfect generalization—a locally describable
class of sequences that always opened the box—when you had merely failed to
visualize all the possible paths of the machine, or failed to value all the side
effects.
The machine represents the complexity of the real world. The openness
of the box (which is good) and the incinerator (which is bad) represent the
thousand shards of desire that make up our terminal values. The keys represent
the actions and policies and strategies available to us.
When you consider how many different ways we value outcomes, and how
complicated are the paths we take to get there, it’s a wonder that there exists
any such thing as helpful ethical advice. (Of which the strangest advice of all,
and yet still helpful, is that “the end does not justify the means.”)
But conversely, the complicatedness of action need not say anything about
the complexity of goals. You often find people who smile wisely, and say, “Well,
morality is complicated, you know, female circumcision is right in one culture
and wrong in another, it’s not always a bad thing to torture people. How naive
you are, how full of need for closure, that you think there are any simple rules.”
You can say, unconditionally and flatly, that killing anyone is a huge dose of
negative terminal utility. Yes, even Hitler. That doesn’t mean you shouldn’t
shoot Hitler. It means that the net instrumental utility of shooting Hitler carries
a giant dose of negative utility from Hitler’s death, and a hugely larger dose of
positive utility from all the other lives that would be saved as a consequence.
Many commit the type error that I warned against in Terminal Values
and Instrumental Values, and think that if the net consequential expected
utility of Hitler’s death is conceded to be positive, then the immediate local
terminal utility must also be positive, meaning that the moral principle “Death
is always a bad thing” is itself a leaky generalization. But this is double counting,
with utilities instead of probabilities; you’re setting up a resonance between
the expected utility and the utility, instead of a one-way flow from utility to
expected utility.
Or maybe it’s just the urge toward a one-sided policy debate: the best policy
must have no drawbacks.
In my moral philosophy, the local negative utility of Hitler’s death is stable,
no matter what happens to the external consequences and hence to the expected
utility.
Of course, you can set up a moral argument that it’s an inherently good
thing to punish evil people, even with capital punishment for sufficiently evil
people. But you can’t carry this moral argument by pointing out that the
consequence of shooting a man holding a leveled gun may be to save other lives.
This is appealing to the value of life, not appealing to the value of death. If
expected utilities are leaky and complicated, it doesn’t mean that utilities must
be leaky and complicated as well. They might be! But it would be a separate
argument.
*
150
The Hidden Complexity of Wishes
There are three kinds of genies: Genies to whom you can safely say, “I wish for
you to do what I should wish for”; genies for which no wish is safe; and genies
that aren’t very powerful or intelligent.
Suppose your aged mother is trapped in a burning building, and it so
happens that you’re in a wheelchair; you can’t rush in yourself. You could cry,
“Get my mother out of that building!” but there would be no one to hear.
Luckily you have, in your pocket, an Outcome Pump. This handy device
squeezes the flow of time, pouring probability into some outcomes, draining it
from others.
The Outcome Pump is not sentient. It contains a tiny time machine, which
resets time unless a specified outcome occurs. For example, if you hooked up
the Outcome Pump’s sensors to a coin, and specified that the time machine
should keep resetting until it sees the coin come up heads, and then you actually
flipped the coin, you would see the coin come up heads. (The physicists say that
any future in which a “reset” occurs is inconsistent, and therefore never happens
in the first place—so you aren’t actually killing any versions of yourself.)
Whatever proposition you can manage to input into the Outcome Pump
somehow happens, though not in a way that violates the laws of physics. If you
try to input a proposition that’s too unlikely, the time machine will suffer a
spontaneous mechanical failure before that outcome ever occurs.
You can also redirect probability flow in more quantitative ways, using the
“future function” to scale the temporal reset probability for different outcomes.
If the temporal reset probability is 99% when the coin comes up heads, and
1% when the coin comes up tails, the odds will go from 1:1 to 99:1 in favor of
tails. If you had a mysterious machine that spit out money, and you wanted to
maximize the amount of money spit out, you would use reset probabilities that
diminished as the amount of money increased. For example, spitting out $10
might have a 99.999999% reset probability, and spitting out $100 might have a
99.99999% reset probability. This way you can get an outcome that tends to be
as high as possible in the future function, even when you don’t know the best
attainable maximum.
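A toy simulation of that mechanism, in Python, may make the reset logic clearer. The “world” here is just a random draw, and the future function is the user-supplied reset probability; the numbers reproduce the coin example above.

    import random

    def outcome_pump(sample_world, reset_probability, max_resets=1_000_000):
        # Keep "resetting time" until some outcome survives the reset check.
        # sample_world() draws one outcome from the ordinary course of events;
        # reset_probability(o) is the user-defined future function.
        for _ in range(max_resets):
            outcome = sample_world()
            if random.random() >= reset_probability(outcome):
                return outcome   # this is the future that "actually happens"
        raise RuntimeError("spontaneous mechanical failure")

    # The coin example: a 99% reset on heads and a 1% reset on tails
    # shifts the odds from 1:1 to 99:1 in favor of tails.
    coin = lambda: random.choice(["heads", "tails"])
    future_function = lambda o: 0.99 if o == "heads" else 0.01

    flips = [outcome_pump(coin, future_function) for _ in range(10_000)]
    print(flips.count("tails") / len(flips))   # roughly 0.99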
So you desperately yank the Outcome Pump from your pocket—your
mother is still trapped in the burning building, remember?—and try to describe
your goal: get your mother out of the building!
The user interface doesn’t take English inputs. The Outcome Pump isn’t
sentient, remember? But it does have 3D scanners for the near vicinity, and
built-in utilities for pattern matching. So you hold up a photo of your mother’s
head and shoulders; match on the photo; use object contiguity to select your
mother’s whole body (not just her head and shoulders); and define the future
function using your mother’s distance from the building’s center. The further
she gets from the building’s center, the less the time machine’s reset probability.
You cry “Get my mother out of the building!,” for luck, and press Enter.
For a moment it seems like nothing happens. You look around, waiting
for the fire truck to pull up, and rescuers to arrive—or even just a strong, fast
runner to haul your mother out of the building—
Boom! With a thundering roar, the gas main under the building explodes.
As the structure comes apart, in what seems like slow motion, you glimpse
your mother’s shattered body being hurled high into the air, traveling fast,
rapidly increasing its distance from the former center of the building.
On the side of the Outcome Pump is an Emergency Regret Button. All
future functions are automatically defined with a huge negative value for the
Regret Button being pressed—a temporal reset probability of nearly 1—so that
the Outcome Pump is extremely unlikely to do anything which upsets the
user enough to make them press the Regret Button. You can’t ever remember
pressing it. But you’ve barely started to reach for the Regret Button (and what
good will it do now?) when a flaming wooden beam drops out of the sky and
smashes you flat.
Which wasn’t really what you wanted, but scores very high in the defined
future function . . .
The Outcome Pump is a genie of the second class. No wish is safe.
If someone asked you to get their poor aged mother out of a burning
building, you might help, or you might pretend not to hear. But it wouldn’t
even occur to you to explode the building. “Get my mother out of the building”
sounds like a much safer wish than it really is, because you don’t even consider
the plans that you assign extreme negative values.
Consider again the Tragedy of Group Selectionism: Some early biologists
asserted that group selection for low subpopulation sizes would produce in-
dividual restraint in breeding; and yet actually enforcing group selection in
the laboratory produced cannibalism, especially of immature females. It’s
obvious in hindsight that, given strong selection for small subpopulation sizes,
cannibals will outreproduce individuals who voluntarily forego reproductive
opportunities. But eating little girls is such an un-aesthetic solution that Wynne-
Edwards, Allee, Brereton, and the other group-selectionists simply didn’t think
of it. They only saw the solutions they would have used themselves.
Suppose you try to patch the future function by specifying that the Out-
come Pump should not explode the building: outcomes in which the building
materials are distributed over too much volume will have ∼1 temporal reset
probabilities.
So your mother falls out of a second-story window and breaks her neck.
The Outcome Pump took a different path through time that still ended up with
your mother outside the building, and it still wasn’t what you wanted, and it
still wasn’t a solution that would occur to a human rescuer.
If only the Open-Source Wish Project had developed a Wish To Get Your
Mother Out Of A Burning Building:
All these special cases, the seemingly unlimited number of required patches,
should remind you of the parable of Artificial Addition—programming an
Arithmetic Expert System by explicitly adding ever more assertions like
“fifteen plus fifteen equals thirty, but fifteen plus sixteen equals thirty-one
instead.”
How do you exclude the outcome where the building explodes and flings
your mother into the sky? You look ahead, and you foresee that your mother
would end up dead, and you don’t want that consequence, so you try to forbid
the event leading up to it.
Your brain isn’t hardwired with a specific, prerecorded statement that
“Blowing up a burning building containing my mother is a bad idea.” And
yet you’re trying to prerecord that exact specific statement in the Outcome
Pump’s future function. So the wish is exploding, turning into a giant lookup
table that records your judgment of every possible path through time.
You failed to ask for what you really wanted. You wanted your mother to
go on living, but you wished for her to become more distant from the center of
the building.
Except that’s not all you wanted. If your mother was rescued from the
building but was horribly burned, that outcome would rank lower in your
preference ordering than an outcome where she was rescued safe and sound.
So you not only value your mother’s life, but also her health.
And you value not just her bodily health, but her state of mind. Being
rescued in a fashion that traumatizes her—for example, a giant purple monster
roaring up out of nowhere and seizing her—is inferior to a fireman showing
up and escorting her out through a non-burning route. (Yes, we’re supposed
to stick with physics, but maybe a powerful enough Outcome Pump has aliens
coincidentally showing up in the neighborhood at exactly that moment.) You
would certainly prefer her being rescued by the monster to her being roasted
alive, however.
How about a wormhole spontaneously opening and swallowing her to a
desert island? Better than her being dead; but worse than her being alive,
well, healthy, untraumatized, and in continual contact with you and the other
members of her social network.
Would it be okay to save your mother’s life at the cost of the family dog’s
life, if it ran to alert a fireman but then got run over by a car? Clearly yes, but
it would be better ceteris paribus to avoid killing the dog. You wouldn’t want
to swap a human life for hers, but what about the life of a convicted murderer?
Does it matter if the murderer dies trying to save her, from the goodness of
his heart? How about two murderers? If the cost of your mother’s life was
the destruction of every extant copy, including the memories, of Bach’s Little
Fugue in G Minor, would that be worth it? How about if she had a terminal
illness and would die anyway in eighteen months?
If your mother’s foot is crushed by a burning beam, is it worthwhile to
extract the rest of her? What if her head is crushed, leaving her body? What if
her body is crushed, leaving only her head? What if there’s a cryonics team
waiting outside, ready to suspend the head? Is a frozen head a person? Is Terri
Schiavo a person? How much is a chimpanzee worth?
Your brain is not infinitely complicated; there is only a finite Kolmogorov
complexity / message length which suffices to describe all the judgments you
would make. But just because this complexity is finite does not make it small.
We value many things, and no, they are not reducible to valuing happiness or
valuing reproductive fitness.
There is no safe wish smaller than an entire human morality. There are
too many possible paths through Time. You can’t visualize all the roads that
lead to the destination you give the genie. “Maximizing the distance between
your mother and the center of the building” can be done even more effectively
by detonating a nuclear weapon. Or, at higher levels of genie power, flinging
her body out of the Solar System. Or, at higher levels of genie intelligence,
doing something that neither you nor I would think of, just like a chimpanzee
wouldn’t think of detonating a nuclear weapon. You can’t visualize all the
paths through time, any more than you can program a chess-playing machine
by hardcoding a move for every possible board position.
And real life is far more complicated than chess. You cannot predict, in
advance, which of your values will be needed to judge the path through time
that the genie takes. Especially if you wish for something longer-term or
wider-range than rescuing your mother from a burning building.
I fear the Open-Source Wish Project is futile, except as an illustration of
how not to think about genie problems. The only safe genie is a genie that
shares all your judgment criteria, and at that point, you can just say “I wish
for you to do what I should wish for.” Which simply runs the genie’s should
function.
Indeed, it shouldn’t be necessary to say anything. To be a safe fulfiller of
a wish, a genie must share the same values that led you to make the wish.
Otherwise the genie may not choose a path through time that leads to the
destination you had in mind, or it may fail to exclude horrible side effects that
would lead you to not even consider a plan in the first place. Wishes are leaky
generalizations, derived from the huge but finite structure that is your entire
morality; only by including this entire structure can you plug all the leaks.
With a safe genie, wishing is superfluous. Just run the genie.
*
152
Lost Purposes
It was in either kindergarten or first grade that I was first asked to pray, given
a transliteration of a Hebrew prayer. I asked what the words meant. I was
told that so long as I prayed in Hebrew, I didn’t need to know what the words
meant, it would work anyway.
That was the beginning of my break with Judaism.
As you read this, some young man or woman is sitting at a desk in a
university, earnestly studying material they have no intention of ever using,
and no interest in knowing for its own sake. They want a high-paying job, and
the high-paying job requires a piece of paper, and the piece of paper requires a
previous master’s degree, and the master’s degree requires a bachelor’s degree,
and the university that grants the bachelor’s degree requires you to take a
class in twelfth-century knitting patterns to graduate. So they diligently study,
intending to forget it all the moment the final exam is administered, but still
seriously working away, because they want that piece of paper.
Maybe you realized it was all madness, but I bet you did it anyway. You
didn’t have a choice, right? A recent study here in the Bay Area showed that
80% of teachers in K-5 reported spending less than one hour per week on
science, and 16% said they spend no time on science. Why? I’m given to
understand the proximate cause is the No Child Left Behind Act and similar
legislation. Virtually all classroom time is now spent on preparing for tests
mandated at the state or federal level. I seem to recall (though I can’t find the
source) that just taking mandatory tests was 40% of classroom time in one
school.
The old Soviet bureaucracy was famous for being more interested in ap-
pearances than reality. One shoe factory overfulfilled its quota by producing
lots of tiny shoes. Another shoe factory reported cut but unassembled leather
as a “shoe.” The superior bureaucrats weren’t interested in looking too hard,
because they also wanted to report quota overfulfillments. All this was a great
help to the comrades freezing their feet off.
It is now being suggested in several sources that an actual majority of pub-
lished findings in medicine, though “statistically significant with p < 0.05,”
are untrue. But so long as p < 0.05 remains the threshold for publication, why
should anyone hold themselves to higher standards, when that requires bigger
research grants for larger experimental groups, and decreases the likelihood
of getting a publication? Everyone knows that the whole point of science is to
publish lots of papers, just as the whole point of a university is to print certain
pieces of parchment, and the whole point of a school is to pass the mandatory
tests that guarantee the annual budget. You don’t get to set the rules of the
game, and if you try to play by different rules, you’ll just lose.
(Though for some reason, physics journals require a threshold of
p < 0.0001. It’s as if they conceive of some other purpose to their existence
than publishing physics papers.)
There’s chocolate at the supermarket, and you can get to the supermarket by
driving, and driving requires that you be in the car, which means opening your
car door, which needs keys. If you find there’s no chocolate at the supermarket,
you won’t stand around opening and slamming your car door because the
car door still needs opening. I rarely notice people losing track of plans they
devised themselves.
It’s another matter when incentives must flow through large organizations—
or worse, many different organizations and interest groups, some of them gov-
ernmental. Then you see behaviors that would mark literal insanity, if they
were born from a single mind. Someone gets paid every time they open a car
door, because that’s what’s measurable; and this person doesn’t care whether
the driver ever gets paid for arriving at the supermarket, let alone whether the
buyer purchases the chocolate, or whether the eater is happy or starving.
From a Bayesian perspective, subgoals are epiphenomena of conditional
probability functions. There is no expected utility without utility. How silly
would it be to think that instrumental value could take on a mathematical life of
its own, leaving terminal value in the dust? It’s not sane by decision-theoretical
criteria of sanity.
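As a toy version of that point, using the chocolate-and-car-door example above (all numbers invented): the “value” of opening the car door is nothing but expected utility flowing back from the chocolate; set the utility of chocolate to zero and the subgoal’s value vanishes with it.

    # Invented numbers: probability of ending up with chocolate, given each plan.
    p_chocolate = {"stay home": 0.0, "open car door and drive to the store": 0.9}

    def expected_utility(plan, utility_of_chocolate):
        # The subgoal has no utility of its own; its value is inherited from the terminal goal.
        return p_chocolate[plan] * utility_of_chocolate

    print(expected_utility("open car door and drive to the store", utility_of_chocolate=1.0))  # 0.9
    print(expected_utility("open car door and drive to the store", utility_of_chocolate=0.0))  # 0.0
    # With the chocolate gone (utility 0), opening the car door is worth exactly nothing.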
But consider the No Child Left Behind Act. The politicians want to look
like they’re doing something about educational difficulties; the politicians have
to look busy to voters this year, not fifteen years later when the kids are looking
for jobs. The politicians are not the consumers of education. The bureaucrats
have to show progress, which means that they’re only interested in progress
that can be measured this year. They aren’t the ones who’ll end up ignorant of
science. The publishers who commission textbooks, and the committees that
purchase textbooks, don’t sit in the classrooms bored out of their skulls.
The actual consumers of knowledge are the children—who can’t pay, can’t
vote, can’t sit on the committees. Their parents care for them, but don’t sit in
the classes themselves; they can only hold politicians responsible according
to surface images of “tough on education.” Politicians are too busy being
re-elected to study all the data themselves; they have to rely on surface images
of bureaucrats being busy and commissioning studies—it may not work to
help any children, but it works to let politicians appear caring. Bureaucrats
don’t expect to use textbooks themselves, so they don’t care if the textbooks
are hideous to read, so long as the process by which they are purchased looks
good on the surface. The textbook publishers have no motive to produce
bad textbooks, but they know that the textbook purchasing committee will be
comparing textbooks based on how many different subjects they cover, and that
the fourth-grade purchasing committee isn’t coordinated with the third-grade
purchasing committee, so they cram as many subjects into one textbook as
possible. Teachers won’t get through a fourth of the textbook before the end
of the year, and then the next year’s teacher will start over. Teachers might
complain, but they aren’t the decision-makers, and ultimately, it’s not their
future on the line, which puts sharp bounds on how much effort they’ll spend
on unpaid altruism . . .
It’s amazing, when you look at it that way—consider all the lost informa-
tion and lost incentives—that anything at all remains of the original purpose,
gaining knowledge. Though many educational systems seem to be currently
in the process of collapsing into a state not much better than nothing.
Want to see the problem really solved? Make the politicians go to school.
A single human mind can track a probabilistic expectation of utility as
it flows through the conditional chances of a dozen intermediate events—
including nonlocal dependencies, places where the expected utility of opening
the car door depends on whether there’s chocolate in the supermarket. But
organizations can only reward today what is measurable today, what can be
written into legal contract today, and this means measuring intermediate events
rather than their distant consequences. These intermediate measures, in turn,
are leaky generalizations—often very leaky. Bureaucrats are untrustworthy
genies, for they do not share the values of the wisher.
Miyamoto Musashi said:1
The primary thing when you take a sword in your hands is your
intention to cut the enemy, whatever the means. Whenever you
parry, hit, spring, strike or touch the enemy’s cutting sword, you
must cut the enemy in the same movement. It is essential to
attain this. If you think only of hitting, springing, striking or
touching the enemy, you will not be able actually to cut him. More
than anything, you must be thinking of carrying your movement
through to cutting him. You must thoroughly research this.
(I wish I lived in an era where I could just tell my readers they have to thoroughly
research something, without giving insult.)
Why would any individual lose track of their purposes in a swordfight? If
someone else had taught them to fight, if they had not generated the entire art
from within themselves, they might not understand the reason for parrying at
one moment, or springing at another moment; they might not realize when
the rules had exceptions, fail to see the times when the usual method won’t
cut through.
The essential thing in the art of epistemic rationality is to understand how
every rule is cutting through to the truth in the same movement. The corre-
sponding essential of pragmatic rationality—decision theory, versus probability
theory—is to always see how every expected utility cuts through to utility. You
must thoroughly research this.
C. J. Cherryh said:2
Your sword has no blade. It has only your intention. When that
goes astray you have no weapon.
I have seen many people go astray when they wish to the genie of an imagined
AI, dreaming up wish after wish that seems good to them, sometimes with
many patches and sometimes without even that pretense of caution. And they
don’t jump to the meta-level. They don’t instinctively look-to-purpose, the
instinct that started me down the track to atheism at the age of five. They do
not ask, as I reflexively ask, “Why do I think this wish is a good idea? Will the
genie judge likewise?” They don’t see the source of their judgment, hovering
behind the judgment as its generator. They lose track of the ball; they know the
ball bounced, but they don’t instinctively look back to see where it bounced
from—the criterion that generated their judgments.
Likewise with people not automatically noticing when supposedly selfish
people give altruistic arguments in favor of selfishness, or when supposedly
altruistic people give selfish arguments in favor of altruism.
People can handle goal-tracking for driving to the supermarket just fine,
when it’s all inside their own heads, and no genies or bureaucracies or philoso-
phies are involved. The trouble is that real civilization is immensely more
complicated than this. Dozens of organizations, and dozens of years, inter-
vene between the child suffering in the classroom, and the new-minted college
graduate not being very good at their job. (But will the interviewer or man-
ager notice, if the college graduate is good at looking busy?) With every new
link that intervenes between the action and its consequence, intention has one
more chance to go astray. With every intervening link, information is lost, in-
centive is lost. And this bothers most people a lot less than it bothers me, or
why were all my classmates willing to say prayers without knowing what they
meant? They didn’t feel the same instinct to look-to-the-generator.
Can people learn to keep their eye on the ball? To keep their intention from
going astray? To never spring or strike or touch, without knowing the higher
goal they will complete in the same movement? People do often want to do
their jobs, all else being equal. Can there be such a thing as a sane corporation?
A sane civilization, even? That’s only a distant dream, but it’s what I’ve been
getting at with all of these essays on the flow of intentions (a.k.a. expected
utility, a.k.a. instrumental value) without losing purpose (a.k.a. utility, a.k.a.
terminal value). Can people learn to feel the flow of parent goals and child
goals? To know consciously, as well as implicitly, the distinction between
expected utility and utility?
Do you care about threats to your civilization? The worst metathreat to
complex civilization is its own complexity, for that complication leads to the
loss of many purposes.
I look back, and I see that more than anything, my life has been driven by an
exceptionally strong abhorrence to lost purposes. I hope it can be transformed
to a learnable skill.
*
153
The Parable of the Dagger
The jester presented the king with two boxes. Upon the first box was inscribed:
Either this box contains an angry frog, or the box with a false
inscription contains an angry frog, but not both.
And upon the second box was inscribed:
Either this box contains gold and the box with a false inscription
contains an angry frog, or this box contains an angry frog and
the box with a true inscription contains gold.
And the jester said to the king: “One box contains an angry frog, the other box
gold; and one, and only one, of the inscriptions is true.”
The king opened the wrong box, and was savaged by an angry frog.
“You see,” the jester said, “let us hypothesize that the first inscription is
the true one. Then suppose the first box contains gold. Then the other box
would have an angry frog, while the box with a true inscription would contain
gold, which would make the second statement true as well. Now hypothesize
that the first inscription is false, and that the first box contains gold. Then the
second inscription would be—”
The king ordered the jester thrown in the dungeons.
A day later, the jester was brought before the king in chains and shown two
boxes.
“One box contains a key,” said the king, “to unlock your chains; and if you
find the key you are free. But the other box contains a dagger for your heart if
you fail.”
And the first box was inscribed:
Either both inscriptions are true, or both inscriptions are false.
And the second box was inscribed:
This box contains the key.
The jester reasoned thusly: “Suppose the first inscription is true. Then the
second inscription must also be true. Now suppose the first inscription is
false. Then again the second inscription must be true. So the second box must
contain the key, if the first inscription is true, and also if the first inscription is
false. Therefore, the second box must logically contain the key.”
The jester opened the second box, and found a dagger.
“How?!” cried the jester in horror, as he was dragged away. “It’s logically
impossible!”
“It is entirely possible,” replied the king. “I merely wrote those inscriptions
on two boxes, and then I put the dagger in the second one.”
1. Raymond M. Smullyan, What Is the Name of This Book?: The Riddle of Dracula and Other Logical
Puzzles (Penguin Books, 1990).
154
The Parable of Hemlock
*
155
Words as Hidden Inferences
Suppose I find a barrel, sealed at the top, but with a hole large enough for a
hand. I reach in and feel a small, curved object. I pull the object out, and it’s
blue—a bluish egg. Next I reach in and feel something hard and flat, with
edges—which, when I extract it, proves to be a red cube. I pull out 11 eggs and
8 cubes, and every egg is blue, and every cube is red.
Now I reach in and I feel another egg-shaped object. Before I pull it out
and look, I have to guess: What will it look like?
The evidence doesn’t prove that every egg in the barrel is blue and every
cube is red. The evidence doesn’t even argue this all that strongly: 19 is not a
large sample size. Nonetheless, I’ll guess that this egg-shaped object is blue—or
as a runner-up guess, red. If I guess anything else, there’s as many possibilities
as distinguishable colors—and for that matter, who says the egg has to be a
single shade? Maybe it has a picture of a horse painted on.
So I say “blue,” with a dutiful patina of humility. For I am a sophis-
ticated rationalist-type person, and I keep track of my assumptions and
dependencies—I guess, but I’m aware that I’m guessing . . . right?
But when a large yellow striped feline-shaped object leaps out at me from
the shadows, I think, “Yikes! A tiger!” Not, “Hm . . . objects with the properties
of largeness, yellowness, stripedness, and feline shape, have previously often
possessed the properties ‘hungry’ and ‘dangerous,’ and thus, although it is not
logically necessary, it may be an empirically good guess that aaauuughhhh
crunch crunch gulp.”
The human brain, for some odd reason, seems to have been adapted to
make this inference quickly, automatically, and without keeping explicit track
of its assumptions.
And if I name the egg-shaped objects “bleggs” (for blue eggs) and the red
cubes “rubes,” then, when I reach in and feel another egg-shaped object, I may
think, Oh, it’s a blegg, rather than considering all that problem-of-induction
stuff.
It is a common misconception that you can define a word any way you like.
This would be true if the brain treated words as purely logical constructs,
Aristotelian classes, and you never took out any more information than you
put in.
Yet the brain goes on about its work of categorization, whether or not we
consciously approve. “All humans are mortal; Socrates is a human; there-
fore Socrates is mortal”—thus spake the ancient Greek philosophers. Well, if
mortality is part of your logical definition of “human,” you can’t logically clas-
sify Socrates as human until you observe him to be mortal. But—this is the
problem—Aristotle knew perfectly well that Socrates was a human. Aristo-
tle’s brain placed Socrates in the “human” category as efficiently as your own
brain categorizes tigers, apples, and everything else in its environment: Swiftly,
silently, and without conscious approval.
Aristotle laid down rules under which no one could conclude Socrates was
“human” until after he died. Nonetheless, Aristotle and his students went on
concluding that living people were humans and therefore mortal; they saw
distinguishing properties such as human faces and human bodies, and their
brains made the leap to inferred properties such as mortality.
Misunderstanding the working of your own mind does not, thankfully,
prevent the mind from doing its work. Otherwise Aristotelians would have
starved, unable to conclude that an object was edible merely because it looked
and felt like a banana.
So the Aristotelians went on classifying environmental objects on the basis
of partial information, the way people had always done. Students of Aris-
totelian logic went on thinking exactly the same way, but they had acquired an
erroneous picture of what they were doing.
If you asked an Aristotelian philosopher whether Carol the grocer was
mortal, they would say “Yes.” If you asked them how they knew, they would
say “All humans are mortal; Carol is human; therefore Carol is mortal.” Ask
them whether it was a guess or a certainty, and they would say it was a certainty
(if you asked before the sixteenth century, at least). Ask them how they knew
that humans were mortal, and they would say it was established by definition.
The Aristotelians were still the same people, they retained their original
natures, but they had acquired incorrect beliefs about their own functioning.
They looked into the mirror of self-awareness, and saw something unlike their
true selves: they reflected incorrectly.
Your brain doesn’t treat words as logical definitions with no empirical
consequences, and so neither should you. The mere act of creating a word
can cause your mind to allocate a category, and thereby trigger unconscious
inferences of similarity. Or block inferences of similarity; if I create two labels
I can get your mind to allocate two categories. Notice how I said “you” and
“your brain” as if they were different things?
Making errors about the inside of your head doesn’t change what’s there;
otherwise Aristotle would have died when he concluded that the brain was
an organ for cooling the blood. Philosophical mistakes usually don’t interfere
with blink-of-an-eye perceptual inferences.
But philosophical mistakes can severely mess up the deliberate thinking
processes that we use to try to correct our first impressions. If you believe that
you can “define a word any way you like,” without realizing that your brain
goes on categorizing without your conscious oversight, then you won’t make
the effort to choose your definitions wisely.
*
156
Extensions and Intensions
“What is red?”
“Red is a color.”
“What’s a color?”
“A color is a property of a thing.”
But what is a thing? And what’s a property? Soon the two are lost in a maze of
words defined in other words, the problem that Steven Harnad once described
as trying to learn Chinese from a Chinese/Chinese dictionary.
Alternatively, if you asked me “What is red?” I could point to a stop sign,
then to someone wearing a red shirt, and a traffic light that happens to be red,
and blood from where I accidentally cut myself, and a red business card, and
then I could call up a color wheel on my computer and move the cursor to the
red area. This would probably be sufficient, though if you know what the word
“No” means, the truly strict would insist that I point to the sky and say “No.”
I think I stole this example from S. I. Hayakawa—though I’m really not
sure, because I heard this way back in the indistinct blur of my childhood.
(When I was twelve, my father accidentally deleted all my computer files. I
have no memory of anything before that.)
But that’s how I remember first learning about the difference between
intensional and extensional definition. To give an “intensional definition” is
to define a word or phrase in terms of other words, as a dictionary does. To
give an “extensional definition” is to point to examples, as adults do when
teaching children. The preceding sentence gives an intensional definition of
“extensional definition,” which makes it an extensional example of “intensional
definition.”
In Hollywood Rationality and popular culture generally, “rationalists” are
depicted as word-obsessed, floating in endless verbal space disconnected from
reality.
But the actual Traditional Rationalists have long insisted on maintaining a
tight connection to experience:
Once upon a time, the philosophers of Plato’s Academy claimed that the best
definition of human was a “featherless biped.” Diogenes of Sinope, also called
Diogenes the Cynic, is said to have promptly exhibited a plucked chicken
and declared “Here is Plato’s man.” The Platonists promptly changed their
definition to “a featherless biped with broad nails.”
No dictionary, no encyclopedia, has ever listed all the things that humans
have in common. We have red blood, five fingers on each of two hands, bony
skulls, 23 pairs of chromosomes—but the same might be said of other animal
species. We make complex tools to make complex tools, we use syntactical
combinatorial language, we harness critical fission reactions as a source of
energy: these things may serve to single out only humans, but not all
humans—many of us have never built a fission reactor. With the right set of
necessary-and-sufficient gene sequences you could single out all humans, and
only humans—at least for now—but it would still be far from all that humans
have in common.
But so long as you don’t happen to be near a plucked chicken, saying “Look
for featherless bipeds” may serve to pick out a few dozen of the particular
things that are humans, as opposed to houses, vases, sandwiches, cats, colors,
or mathematical theorems.
Once the definition “featherless biped” has been bound to some particular
featherless bipeds, you can look over the group, and begin harvesting some
of the other characteristics—beyond mere featherfree twolegginess—that the
“featherless bipeds” seem to share in common. The particular featherless
bipeds that you see seem to also use language, build complex tools, speak
combinatorial language with syntax, bleed red blood if poked, die when they
drink hemlock.
Thus the category “human” grows richer, and adds more and more charac-
teristics; and when Diogenes finally presents his plucked chicken, we are not
fooled: This plucked chicken is obviously not similar to the other “featherless
bipeds.”
(If Aristotelian logic were a good model of human psychology, the Platonists
would have looked at the plucked chicken and said, “Yes, that’s a human; what’s
your point?”)
If the first featherless biped you see is a plucked chicken, then you may end
up thinking that the verbal label “human” denotes a plucked chicken; so I can
modify my treasure map to point to “featherless bipeds with broad nails,” and
if I am wise, go on to say, “See Diogenes over there? That’s a human, and I’m
a human, and you’re a human; and that chimpanzee is not a human, though
fairly close.”
The initial clue only has to lead the user to the similarity cluster—the group
of things that have many characteristics in common. After that, the initial clue
has served its purpose, and I can go on to convey the new information “humans
are currently mortal,” or whatever else I want to say about us featherless bipeds.
A dictionary is best thought of, not as a book of Aristotelian class definitions,
but a book of hints for matching verbal labels to similarity clusters, or matching
labels to properties that are useful in distinguishing similarity clusters.
*
158
Typicality and Asymmetrical
Similarity
Birds fly. Well, except ostriches don’t. But which is a more typical bird—a
robin, or an ostrich?
Which is a more typical chair: a desk chair, a rocking chair, or a beanbag
chair?
Most people would say that a robin is a more typical bird, and a desk chair
is a more typical chair. The cognitive psychologists who study this sort of thing
experimentally, do so under the heading of “typicality effects” or “prototype
effects.”1 For example, if you ask subjects to press a button to indicate “true”
or “false” in response to statements like “A robin is a bird” or “A penguin is a
bird,” reaction times are faster for more central examples.2 Typicality measures
correlate well using different investigative methods—reaction times are one
example; you can also ask people to directly rate, on a scale of 1 to 10, how
well an example (like a specific robin) fits a category (like “bird”).
So we have a mental measure of typicality—which might, perhaps, function
as a heuristic—but is there a corresponding bias we can use to pin it down?
Well, which of these statements strikes you as more natural: “98 is approxi-
mately 100,” or “100 is approximately 98”? If you’re like most people, the first
statement seems to make more sense.3 For similar reasons, people asked to rate
how similar Mexico is to the United States, gave consistently higher ratings
than people asked to rate how similar the United States is to Mexico.4
And if that still seems harmless, a study by Rips showed that people were
more likely to expect a disease would spread from robins to ducks on an island,
than from ducks to robins.5 Now this is not a logical impossibility, but in a
pragmatic sense, whatever difference separates a duck from a robin and would
make a disease less likely to spread from a duck to a robin, must also be a
difference between a robin and a duck, and would make a disease less likely to
spread from a robin to a duck.
Yes, you can come up with rationalizations, like “Well, there could be more
neighboring species of the robins, which would make the disease more likely
to spread initially, etc.,” but be careful not to try too hard to rationalize the
probability ratings of subjects who didn’t even realize there was a comparison
going on. And don’t forget that Mexico is more similar to the United States
than the United States is to Mexico, and that 98 is closer to 100 than 100 is
to 98. A simpler interpretation is that people are using the (demonstrated)
similarity heuristic as a proxy for the probability that a disease spreads, and
this heuristic is (demonstrably) asymmetrical.
Kansas is unusually close to the center of the United States, and Alaska is
unusually far from the center of the United States; so Kansas is probably closer
to most places in the US and Alaska is probably farther. It does not follow,
however, that Kansas is closer to Alaska than is Alaska to Kansas. But people
seem to reason (metaphorically speaking) as if closeness is an inherent property
of Kansas and distance is an inherent property of Alaska; so that Kansas is still
close, even to Alaska; and Alaska is still distant, even from Kansas.
So once again we see that Aristotle’s notion of categories—logical classes
with membership determined by a collection of properties that are individually
strictly necessary, and together strictly sufficient—is not a good model of
human cognitive psychology. (Science’s view has changed somewhat over
the last 2,350 years? Who would’ve thought?) We don’t even reason as if set
membership is a true-or-false property: statements of set membership can
be more or less true. (Note: This is not the same thing as being more or less
probable.)
One more reason not to pretend that you, or anyone else, is really going to
treat words as Aristotelian logical classes.
1. Eleanor Rosch, “Principles of Categorization,” in Cognition and Categorization, ed. Eleanor Rosch
and Barbara B. Lloyd (Hillsdale, NJ: Lawrence Erlbaum, 1978).
2. George Lakoff, Women, Fire, and Dangerous Things: What Categories Reveal about the Mind
(Chicago: University of Chicago Press, 1987).
3. Jerrold Sadock, “Truth and Approximations,” Papers from the Third Annual Meeting of the Berkeley
Linguistics Society (1977): 430–439.
4. Amos Tversky and Itamar Gati, “Studies of Similarity,” in Cognition and Categorization, ed. Eleanor
Rosch and Barbara Lloyd (Hillsdale, NJ: Lawrence Erlbaum Associates, Inc., 1978), 79–98.
5. Lance J. Rips, “Inductive Judgments about Natural Categories,” Journal of Verbal Learning and
Verbal Behavior 14 (1975): 665–681.
159
The Cluster Structure of Thingspace
*
160
Disguised Queries
Imagine that you have a peculiar job in a peculiar factory: Your task is to
take objects from a mysterious conveyor belt, and sort the objects into two
bins. When you first arrive, Susan the Senior Sorter explains to you that blue
egg-shaped objects are called “bleggs” and go in the “blegg bin,” while red
cubes are called “rubes” and go in the “rube bin.”
Once you start working, you notice that bleggs and rubes differ in ways
besides color and shape. Bleggs have fur on their surface, while rubes are
smooth. Bleggs flex slightly to the touch; rubes are hard. Bleggs are opaque,
the rube’s surface slightly translucent.
Soon after you begin working, you encounter a blegg shaded an unusually
dark blue—in fact, on closer examination, the color proves to be purple, halfway
between red and blue.
Yet wait! Why are you calling this object a “blegg”? A “blegg” was originally
defined as blue and egg-shaped—the qualification of blueness appears in the
very name “blegg,” in fact. This object is not blue. One of the necessary
qualifications is missing; you should call this a “purple egg-shaped object,” not
a “blegg.”
But it so happens that, in addition to being purple and egg-shaped, the
object is also furred, flexible, and opaque. So when you saw the object, you
thought, “Oh, a strangely colored blegg.” It certainly isn’t a rube . . . right?
Still, you aren’t quite sure what to do next. So you call over Susan the Senior
Sorter.
“Oh, yes, it’s a blegg,” Susan says, “you can put it in the blegg
bin.”
You start to toss the purple blegg into the blegg bin, but pause
for a moment. “Susan,” you say, “how do you know this is a
blegg?”
Susan looks at you oddly. “Isn’t it obvious? This object may
be purple, but it’s still egg-shaped, furred, flexible, and opaque,
like all the other bleggs. You’ve got to expect a few color defects.
Or is this one of those philosophical conundrums, like ‘How do
you know the world wasn’t created five minutes ago complete
with false memories?’ In a philosophical sense I’m not absolutely
certain that this is a blegg, but it seems like a good guess.”
“No, I mean . . .” You pause, searching for words. “Why is
there a blegg bin and a rube bin? What’s the difference between
bleggs and rubes?”
“Bleggs are blue and egg-shaped, rubes are red and cube-
shaped,” Susan says patiently. “You got the standard orientation
lecture, right?”
“Why do bleggs and rubes need to be sorted?”
“Er . . . because otherwise they’d be all mixed up?” says Susan.
“Because nobody will pay us to sit around all day and not sort
bleggs and rubes?”
“Who originally determined that the first blue egg-shaped
object was a ‘blegg,’ and how did they determine that?”
Susan shrugs. “I suppose you could just as easily call the
red cube-shaped objects ‘bleggs’ and the blue egg-shaped objects
‘rubes,’ but it seems easier to remember this way.”
You think for a moment. “Suppose a completely mixed-up
object came off the conveyor. Like, an orange sphere-shaped
furred translucent object with writhing green tentacles. How
could I tell whether it was a blegg or a rube?”
“Wow, no one’s ever found an object that mixed up,” says
Susan, “but I guess we’d take it to the sorting scanner.”
“How does the sorting scanner work?” you inquire. “X-rays?
Magnetic resonance imaging? Fast neutron transmission spec-
troscopy?”
“I’m told it works by Bayes’s Rule, but I don’t quite understand
how,” says Susan. “I like to say it, though. Bayes Bayes Bayes Bayes
Bayes.”
“What does the sorting scanner tell you?”
“It tells you whether to put the object into the blegg bin or the
rube bin. That’s why it’s called a sorting scanner.”
At this point you fall silent.
“Incidentally,” Susan says casually, “it may interest you to
know that bleggs contain small nuggets of vanadium ore, and
rubes contain shreds of palladium, both of which are useful in-
dustrially.”
“Susan, you are pure evil.”
“Thank you.”
So now it seems we’ve discovered the heart and essence of bleggness: a blegg
is an object that contains a nugget of vanadium ore. Surface characteristics,
like blue color and furredness, do not determine whether an object is a blegg;
surface characteristics only matter because they help you infer whether an
object is a blegg, that is, whether the object contains vanadium.
Containing vanadium is a necessary and sufficient definition: all bleggs
contain vanadium and everything that contains vanadium is a blegg: “blegg”
is just a shorthand way of saying “vanadium-containing object.” Right?
Not so fast, says Susan: Around 98% of bleggs contain vanadium, but 2%
contain palladium instead. To be precise (Susan continues) around 98% of
blue egg-shaped furred flexible opaque objects contain vanadium. For unusual
bleggs, it may be a different percentage: 95% of purple bleggs contain vanadium,
92% of hard bleggs contain vanadium, etc.
Now suppose you find a blue egg-shaped furred flexible opaque object, an
ordinary blegg in every visible way, and just for kicks you take it to the sorting
scanner, and the scanner says “palladium”—this is one of the rare 2%. Is it a
blegg?
At first you might answer that, since you intend to throw this object in the
rube bin, you might as well call it a “rube.” However, it turns out that almost
all bleggs, if you switch off the lights, glow faintly in the dark, while almost all
rubes do not glow in the dark. And the percentage of bleggs that glow in the
dark is not significantly different for blue egg-shaped furred flexible opaque
objects that contain palladium, instead of vanadium. Thus, if you want to guess
whether the object glows like a blegg, or remains dark like a rube, you should
guess that it glows like a blegg.
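To put rough numbers on that guess: only the 98%/2% vanadium/palladium split below comes from the description above; the glow rates are invented stand-ins for “almost all bleggs glow, almost all rubes don’t.”

    # Invented illustrative glow rates; the 98% / 2% interior split is from the text.
    p_interior_given_blegg_cluster = {"vanadium": 0.98, "palladium": 0.02}
    p_glow = {
        ("blegg-cluster", "vanadium"):  0.99,
        ("blegg-cluster", "palladium"): 0.98,   # "not significantly different"
        ("rube-cluster",  "vanadium"):  0.02,
        ("rube-cluster",  "palladium"): 0.01,
    }

    # Chance that an ordinary-looking blegg glows, before any scanner reading:
    prior_glow = sum(p * p_glow[("blegg-cluster", interior)]
                     for interior, p in p_interior_given_blegg_cluster.items())
    print(round(prior_glow, 4))                    # 0.9898
    # After the scanner says "palladium", the answer barely moves:
    print(p_glow[("blegg-cluster", "palladium")])  # 0.98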
So is the object really a blegg or a rube?
On one hand, you’ll throw the object in the rube bin no matter what else
you learn. On the other hand, if there are any unknown characteristics of the
object you need to infer, you’ll infer them as if the object were a blegg, not a
rube—group it into the similarity cluster of blue egg-shaped furred flexible
opaque things, and not the similarity cluster of red cube-shaped smooth hard
translucent things.
The question “Is this object a blegg?” may stand in for different queries on
different occasions.
If it weren’t standing in for some query, you’d have no reason to care.
Is atheism a “religion”? Is transhumanism a “cult”? People who argue that
atheism is a religion “because it states beliefs about God” are really trying to
argue (I think) that the reasoning methods used in atheism are on a par with
the reasoning methods used in religion, or that atheism is no safer than religion
in terms of the probability of causally engendering violence, etc. . . . What’s
really at stake is an atheist’s claim of substantial difference and superiority
relative to religion, which the religious person is trying to reject by denying
the difference rather than the superiority(!).
But that’s not the a priori irrational part: The a priori irrational part is where,
in the course of the argument, someone pulls out a dictionary and looks up
the definition of “atheism” or “religion.” (And yes, it’s just as silly whether an
atheist or religionist does it.) How could a dictionary possibly decide whether
an empirical cluster of atheists is really substantially different from an empirical
cluster of theologians? How can reality vary with the meaning of a word? The
points in thingspace don’t move around when we redraw a boundary.
But people often don’t realize that their argument about where to draw a
definitional boundary, is really a dispute over whether to infer a characteristic
shared by most things inside an empirical cluster . . .
Hence the phrase, “disguised query.”
*
161
Neural Categories
[Figure 161.1: Network 1, a fully connected network of observable nodes: Color (+blue / −red), Shape (+egg / −cube), Texture (+furred / −smooth), Luminance (+glow / −dark), Interior (+vanadium / −palladium).]
[Figure 161.2: the same network design applied to distinguishing humans from Space Monsters, with input from Aristotle (“All men are mortal”) and Plato’s Academy (“A featherless biped with broad nails”).]
A neural network needs a learning rule. The obvious idea is that when two
nodes are often active at the same time, we should strengthen the connection
between them—this is one of the first rules ever proposed for training a neural
network, known as Hebb’s Rule.
Thus, if you often saw things that were both blue and furred—thus simul-
taneously activating the “color” node in the + state and the “texture” node
in the + state—the connection would strengthen between color and texture,
so that + colors activated + textures, and vice versa. If you saw things that
were blue and egg-shaped and vanadium-containing, that would strengthen
positive mutual connections between color and shape and interior.
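A minimal sketch of that learning rule in Python, with ±1 activation levels; the nodes follow the figures above, but the learning rate and everything else about the representation are arbitrary modeling choices, not anything specified in the text.

    NODES = ["color", "shape", "texture", "luminance", "interior"]

    # Symmetric connection weights for the fully connected Network 1, all starting at zero.
    weights = {(a, b): 0.0 for a in NODES for b in NODES if a < b}

    def hebb_update(observation, rate=0.1):
        # Hebb's Rule (sketch): when two nodes are active together, strengthen their link.
        # observation maps each node to +1 or -1, e.g. color +1 = blue, -1 = red.
        for (a, b) in weights:
            weights[(a, b)] += rate * observation[a] * observation[b]

    # Seeing many ordinary bleggs (+1 on every node) and many ordinary rubes (-1 on every node)
    # strengthens every pairwise connection in the positive direction.
    for _ in range(50):
        hebb_update({n: +1 for n in NODES})
        hebb_update({n: -1 for n in NODES})

    print(weights[("color", "texture")])   # strongly positive: + colors now activate + textures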
[Figure 161.2 node labels: Lifespan (+mortal / −immortal), Nails (+broad / −talons), Feathers (+no / −yes), Legs (+2 / −17), Blood (+red / −glows green).]
Let’s say you’ve already seen plenty of bleggs and rubes come off the conveyor
belt. But now you see something that’s furred, egg-shaped, and—gasp!—
reddish purple (which we’ll model as a “color” activation level of −2/3). You
haven’t yet tested the luminance, or the interior. What to predict, what to
predict?
What happens then is that the activation levels in Network 1 bounce around
a bit. Positive activation flows to luminance from shape, negative activation
flows to interior from color, negative activation flows from interior to lumi-
nance . . . Of course all these messages are passed in parallel!! and asyn-
chronously!! just like the human brain . . .
Finally Network 1 settles into a stable state, which has high positive acti-
vation for “luminance” and “interior.” The network may be said to “expect”
(though it has not yet seen) that the object will glow in the dark, and that it
contains vanadium.
And lo, Network 1 exhibits this behavior even though there’s no explicit
node that says whether the object is a blegg or not. The judgment is implicit
in the whole network!! Bleggness is an attractor!! which arises as the result of
emergent behavior!! from the distributed!! learning rule.
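The settling process itself might be sketched like this, again with illustrative node names and weights; clamped (observed) nodes keep their values while the rest relax to match their weighted input:

    # Inference by settling (sketch): clamp what you've observed, let the rest
    # of the nodes relax until nothing changes.  Names and weights are illustrative.
    NODES = ["color", "shape", "texture", "luminance", "interior"]
    # Assume Hebbian training (as in the sketch above) has left a positive
    # weight between every pair of observables.
    weights = {(i, j): 1.0 for i in NODES for j in NODES if i != j}

    def settle(observed, steps=20):
        """Update unclamped nodes to match the sign of their weighted input."""
        state = {n: observed.get(n, 0.0) for n in NODES}
        for _ in range(steps):
            changed = False
            for n in NODES:
                if n in observed:
                    continue  # clamped: keep the observed activation
                total = sum(weights[(m, n)] * state[m] for m in NODES if m != n)
                new = 1.0 if total > 0 else (-1.0 if total < 0 else 0.0)
                if new != state[n]:
                    state[n], changed = new, True
            if not changed:
                break
        return state

    # Furred, egg-shaped, reddish purple (color activation -2/3); luminance and
    # interior are unobserved.
    print(settle({"texture": +1, "shape": +1, "color": -2/3}))
    # Both unobserved nodes settle to +1: the network "expects" the object to
    # glow in the dark and to contain vanadium.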
Now in real life, this kind of network design—however faddish it may
sound—runs into all sorts of problems. Recurrent networks don’t always settle
right away: They can oscillate, or exhibit chaotic behavior, or just take a very
long time to settle down. This is a Bad Thing when you see something big
and yellow and striped, and you have to wait five minutes for your distributed
neural network to settle into the “tiger” attractor. Asynchronous and parallel
it may be, but it’s not real-time.
And there are other problems, like double-counting the evidence when
messages bounce back and forth: If you suspect that an object glows in the
dark, your suspicion will activate belief that the object contains vanadium,
which in turn will activate belief that the object glows in the dark.
Plus if you try to scale up the Network 1 design, it requires O(N²) connec-
tions, where N is the total number of observables.
So what might be a more realistic neural network design?
In Network 2 of Figure 161.3, a wave of activation converges on the central
node from any clamped (observed) nodes, and then surges back out again
to any unclamped (unobserved) nodes. Which means we can compute the
answer in one step, rather than waiting for the network to settle—an important
requirement in biology when the neurons only run at 20 Hz. And the network
architecture scales as O(N), rather than O(N²).
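A corresponding sketch of the Network 2 idea: one assumed weight per observable feeding a single central category unit, so that categorization and prediction take a single pass rather than a settling process (everything named here is illustrative):

    # "Network 2"-style categorization (sketch): O(N) weights, one pass, no settling.

    OBSERVABLES = ["color", "shape", "texture", "luminance", "interior"]
    # One weight per observable-to-category link; +1 means "points toward BLEGG".
    category_weights = {n: +1.0 for n in OBSERVABLES}

    def categorize_and_predict(observed):
        """Activation converges on the central node, then surges back out."""
        # Inward wave: the central node sums evidence from whatever was observed.
        category = sum(category_weights[n] * v for n, v in observed.items())
        category_sign = 1.0 if category > 0 else -1.0
        # Outward wave: each unobserved node is predicted from the category alone.
        predictions = {n: category_sign * category_weights[n]
                       for n in OBSERVABLES if n not in observed}
        return category_sign, predictions

    print(categorize_and_predict({"texture": +1, "shape": +1, "color": -2/3}))
    # -> (1.0, {'luminance': 1.0, 'interior': 1.0}): classified as BLEGG,
    #    predicted to glow in the dark and to contain vanadium, in one pass.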
Admittedly, there are some things you can notice more easily with the first
network architecture than the second. Network 1 has a direct connection
between every two nodes. So if red objects never glow in the dark, but red
furred objects usually have the other blegg characteristics like egg-shape and
vanadium, Network 1 can easily represent this: it just takes a very strong direct
negative connection from color to luminance, but more powerful positive
connections from texture to all other nodes except luminance.
[Figure 161.3: Network 2, with a central Category node (+BLEGG / −RUBE) connected to each observable: Color (+blue / −red), Shape (+egg / −cube), Texture (+furred / −smooth), Luminance (+glow / −dark), and Interior (+vanadium / −palladium).]
Nor is this a “special exception” to the general rule that bleggs glow—remember,
in Network 1, there is no unit that represents blegg-ness; blegg-ness emerges
as an attractor in the distributed network.
So yes, those O(N²) connections were buying us something. But not very
much. Network 1 is not more useful on most real-world problems, where you
rarely find an animal stuck halfway between being a cat and a dog.
(There are also facts that you can’t easily represent in Network 1 or Network
2. Let’s say sea-blue color and spheroid shape, when found together, always
indicate the presence of palladium; but when found individually, without
the other, they are each very strong evidence for vanadium. This is hard to
represent, in either architecture, without extra nodes. Both Network 1 and
Network 2 embody implicit assumptions about what kind of environmental
structure is likely to exist; the ability to read this off is what separates the adults
from the babes, in machine learning.)
Make no mistake: Neither Network 1 nor Network 2 is biologically realistic.
But it still seems like a fair guess that however the brain really works, it is in
some sense closer to Network 2 than Network 1. Fast, cheap, scalable, works
well to distinguish dogs and cats: natural selection goes for that sort of thing
like water running down a fitness landscape.
It seems like an ordinary enough task to classify objects as either bleggs or
rubes, tossing them into the appropriate bin. But would you notice if sea-blue
objects never glowed in the dark?
Maybe, if someone presented you with twenty objects that were alike only in
being sea-blue, and then switched off the light, and none of the objects glowed.
If you got hit over the head with it, in other words. Perhaps by presenting you
with all these sea-blue objects in a group, your brain forms a new subcategory,
and can detect the “doesn’t glow” characteristic within that subcategory. But
you probably wouldn’t notice if the sea-blue objects were scattered among a
hundred other bleggs and rubes. It wouldn’t be easy or intuitive to notice, the
way that distinguishing cats and dogs is easy and intuitive.
Or: “Socrates is human, all humans are mortal, therefore Socrates is mortal.”
How did Aristotle know that Socrates was human? Well, Socrates had no
feathers, and broad nails, and walked upright, and spoke Greek, and, well,
was generally shaped like a human and acted like one. So the brain decides,
once and for all, that Socrates is human; and from there, infers that Socrates is
mortal like all other humans thus yet observed. It doesn’t seem easy or intuitive
to ask how much wearing clothes, as opposed to using language, is associated
with mortality. Just, “things that wear clothes and use language are human”
and “humans are mortal.”
Are there biases associated with trying to classify things into categories
once and for all? Of course there are. See e.g. Cultish Countercultishness.
*
162
How An Algorithm Feels From
Inside
“If a tree falls in the forest, and no one hears it, does it make a sound?” I
remember seeing an actual argument get started on this subject—a fully naive
argument that went nowhere near Berkeleian subjectivism. Just:

    “It makes a sound, just like any other falling tree!”
    “But how can there be a sound that no one hears?”
The standard rationalist view would be that the first person is speaking as if
“sound” means acoustic vibrations in the air; the second person is speaking
as if “sound” means an auditory experience in a brain. If you ask “Are there
acoustic vibrations?” or “Are there auditory experiences?,” the answer is at
once obvious. And so the argument is really about the definition of the word
“sound.”
I think the standard analysis is essentially correct. So let’s accept that as
a premise, and ask: Why do people get into such arguments? What’s the
underlying psychology?
A key idea of the heuristics and biases program is that mistakes are often
more revealing of cognition than correct answers. Getting into a heated dispute
about whether, if a tree falls in a deserted forest, it makes a sound, is traditionally
considered a mistake.
So what kind of mind design corresponds to that error?
In Disguised Queries I introduced the blegg/rube classification task, in
which Susan the Senior Sorter explains that your job is to sort objects coming
off a conveyor belt, putting the blue eggs or “bleggs” into one bin, and the red
cubes or “rubes” into the rube bin. This, it turns out, is because bleggs contain
small nuggets of vanadium ore, and rubes contain small shreds of palladium,
both of which are useful industrially.
Except that around 2% of blue egg-shaped objects contain palladium instead.
So if you find a blue egg-shaped thing that contains palladium, should you call
it a “rube” instead? You’re going to put it in the rube bin—why not call it a
“rube”?
But when you switch off the light, nearly all bleggs glow faintly in the dark.
And blue egg-shaped objects that contain palladium are just as likely to glow
in the dark as any other blue egg-shaped object.
So if you find a blue egg-shaped object that contains palladium and you ask
“Is it a blegg?,” the answer depends on what you have to do with the answer. If
you ask “Which bin does the object go in?,” then you choose as if the object
is a rube. But if you ask “If I turn off the light, will it glow?,” you predict as if
the object is a blegg. In one case, the question “Is it a blegg?” stands in for the
disguised query, “Which bin does it go in?” In the other case, the question “Is
it a blegg?” stands in for the disguised query, “Will it glow in the dark?”
Now suppose that you have an object that is blue and egg-shaped and
contains palladium; and you have already observed that it is furred, flexible,
opaque, and glows in the dark.
This answers every query, observes every observable introduced. There’s
nothing left for a disguised query to stand for.
So why might someone feel an impulse to go on arguing whether the object
is really a blegg?
These diagrams from Neural Categories show two different neural networks
that might be used to answer questions about bleggs and rubes. Network 1
(Figure 162.1) has a number of disadvantages—such as potentially oscillating or chaotic behavior, and O(N²) connections—but every one of its units corresponds to a directly testable observable. Network 2 (Figure 162.2) is fast and cheap, but it has one extra unit, the central Category node, which is not itself an observable.
[Figure 162.1: Network 1, with observable nodes Color (+blue / −red), Shape (+egg / −cube), Texture (+furred / −smooth), Luminance (+glow / −dark), and Interior (+vanadium / −palladium), each directly connected to every other.]
[Figure 162.2: Network 2, with a central Category node (+BLEGG / −RUBE) connected to Shape (+egg / −cube), Texture (+furred / −smooth), Luminance (+glow / −dark), and Interior (+vanadium / −palladium).]
Even after you have observed every observable, clamping color, shape, texture, luminance, and interior, that central unit is still there, and its activation can still vary. That dangling unit is what the leftover question “But is it really a blegg?” feels like from inside.
*
163
Disputing Definitions
If a tree falls in the forest, and no one hears it, does it make a sound?
Albert: “Of course it does. What kind of silly question is
that? Every time I’ve listened to a tree fall, it made a sound, so
I’ll guess that other trees falling also make sounds. I don’t believe
the world changes around when I’m not looking.”
Barry: “Wait a minute. If no one hears it, how can it be a
sound?”
Albert and Barry recruit arguments that feel like support for their respec-
tive positions, describing in more detail the thoughts that caused their
“sound”-detectors to fire or stay silent. But so far the conversation has still fo-
cused on the forest, rather than definitions. And note that they don’t actually
disagree on anything that happens in the forest.
Insult has been proffered and accepted; now neither party can back down
without losing face. Technically, this isn’t part of the argument, as rationalists
account such things; but it’s such an important part of the Standard Dispute
that I’m including it anyway.
Albert deploys an argument that feels like support for the word “sound” having
a particular meaning. This is a different kind of question from whether acoustic
vibrations take place in a forest—but the shift usually passes unnoticed.
Barry: “Oh, yeah? Let’s just see if the dictionary agrees with
you.”
There are quite a lot of rationality errors in the Standard Dispute. Some of them
I’ve already covered, and some of them I’ve yet to cover; likewise the remedies.
But for now, I would just like to point out—in a mournful sort of way—that
Albert and Barry seem to agree on virtually every question of what is actually
going on inside the forest, and yet it doesn’t seem to generate any feeling of
agreement.
Arguing about definitions is a garden path; people wouldn’t go down the
path if they saw at the outset where it led. If you asked Albert (Barry) why he’s
still arguing, he’d probably say something like: “Barry (Albert) is trying to
sneak in his own definition of ‘sound,’ the scurvy scoundrel, to support his
ridiculous point; and I’m here to defend the standard definition.”
But suppose I went back in time to before the start of the argument:
This remedy doesn’t destroy every dispute over categorizations. But it destroys
a substantial fraction.
*
164
Feel the Meaning
When I hear someone say, “Oh, look, a butterfly,” the spoken phonemes “but-
terfly” enter my ear and vibrate on my ear drum, being transmitted to the
cochlea, tickling auditory nerves that transmit activation spikes to the auditory
cortex, where phoneme processing begins, along with recognition of words,
and reconstruction of syntax (a by no means serial process), and all manner of
other complications.
But at the end of the day, or rather, at the end of the second, I am primed to
look where my friend is pointing and see a visual pattern that I will recognize
as a butterfly; and I would be quite surprised to see a wolf instead.
My friend looks at a butterfly, his throat vibrates and lips move, the pressure
waves travel invisibly through the air, my ear hears and my nerves transduce
and my brain reconstructs, and lo and behold, I know what my friend is looking
at. Isn’t that marvelous? If we didn’t know about the pressure waves in the
air, it would be a tremendous discovery in all the newspapers: Humans are
telepathic! Human brains can transfer thoughts to each other!
Well, we are telepathic, in fact; but magic isn’t exciting when it’s merely
real, and all your friends can do it too.
Think telepathy is simple? Try building a computer that will be telepathic
with you. Telepathy, or “language,” or whatever you want to call our partial
thought transfer ability, is more complicated than it looks.
But it would be quite inconvenient to go around thinking, “Now I shall
partially transduce some features of my thoughts into a linear sequence of
phonemes which will invoke similar thoughts in my conversational partner . . .”
So the brain hides the complexity—or rather, never represents it in the first
place—which leads people to think some peculiar thoughts about words.
As I remarked earlier, when a large yellow striped object leaps at me, I
think “Yikes! A tiger!” not “Hm . . . objects with the properties of largeness,
yellowness, and stripedness have previously often possessed the properties
‘hungry’ and ‘dangerous,’ and therefore, although it is not logically necessary,
auughhhh crunch crunch gulp.”
Similarly, when someone shouts “Yikes! A tiger!,” natural selection would
not favor an organism that thought, “Hm . . . I have just heard the syllables
‘Tie’ and ‘Grr’ which my fellow tribe members associate with their internal
analogues of my own tiger concept, and which they are more likely to utter if
they see an object they categorize as aiiieeee crunch crunch help it’s got my
arm crunch gulp.”
Considering this as a design constraint on the human cognitive architecture,
you wouldn’t want any extra steps between when your auditory cortex recog-
nizes the syllables “tiger,” and when the tiger concept gets activated.
Going back to the parable of bleggs and rubes, and the centralized network
that categorizes quickly and cheaply, you might visualize a direct connection
running from the unit that recognizes the syllable “blegg” to the unit at the
center of the blegg network. The central unit, the blegg concept, gets activated
almost as soon as you hear Susan the Senior Sorter say, “Blegg!”
Or, for purposes of talking—which also shouldn’t take eons—as soon as
you see a blue egg-shaped thing and the central blegg unit fires, you holler
“Blegg!” to Susan.
And what that algorithm feels like from inside is that the label, and the
concept, are very nearly identified; the meaning feels like an intrinsic property
of the word itself.
[Figure: Network 2 augmented with a verbal label. A unit that recognizes the syllables “Blegg!” connects directly to the central Category node (+BLEGG / −RUBE), which in turn connects to Color (+blue / −red), Shape (+egg / −cube), Texture (+furred / −smooth), Luminance (+glow / −dark), and Interior (+vanadium / −palladium).]
Albert feels intuitively that the word “sound” has a meaning and that the
meaning is acoustic vibrations. Just as Albert feels that a tree falling in the
forest makes a sound (rather than causing an event that matches the sound
category).
Barry likewise feels that:

    sound.meaning == auditory experiences
    forest.sound == false .

Rather than:

    myBrain.FindConcept("sound") == concept_AuditoryExperience
    concept_AuditoryExperience.match(forest) == false .
Which is closer to what’s really going on; but humans have not evolved to know
this, any more than humans instinctively know the brain is made of neurons.
Albert and Barry’s conflicting intuitions provide the fuel for continuing the
argument in the phase of arguing over what the word “sound” means—which
feels like arguing over a fact like any other fact, like arguing over whether the
sky is blue or green.
You may not even notice that anything has gone astray, until you try to
perform the rationalist ritual of stating a testable experiment whose result
depends on the facts you’re so heatedly disputing . . .
*
165
The Argument from Common Usage
Not all definitional disputes progress as far as recognizing the notion of com-
mon usage. More often, I think, someone picks up a dictionary because they
believe that words have meanings, and the dictionary faithfully records what
this meaning is. Some people even seem to believe that the dictionary deter-
mines the meaning—that the dictionary editors are the Legislators of Language.
Maybe because back in elementary school, their authority-teacher said that
they had to obey the dictionary, that it was a mandatory rule rather than an
optional one?
Dictionary editors read what other people write, and record what the words
seem to mean; they are historians. The Oxford English Dictionary may be
comprehensive, but never authoritative.
But surely there is a social imperative to use words in a commonly under-
stood way? Does not our human telepathy, our valuable power of language,
rely on mutual coordination to work? Perhaps we should voluntarily treat dic-
tionary editors as supreme arbiters—even if they prefer to think of themselves
as historians—in order to maintain the quiet cooperation on which all speech
depends.
The phrase “authoritative dictionary” is almost never used correctly, an
example of proper usage being The Authoritative Dictionary of IEEE Standards
Terms. The IEEE is a body of voting members who have a professional need for
exact agreement on terms and definitions, and so The Authoritative Dictionary
of IEEE Standards Terms is actual, negotiated legislation, which exerts whatever
authority one regards as residing in the IEEE.
In everyday life, shared language usually does not arise from a deliberate
agreement, as of the IEEE. It’s more a matter of infection, as words are invented
and diffuse through the culture. (A “meme,” one might say, following Richard
Dawkins forty years ago—but you already know what I mean, and if not, you
can look it up on Google, and then you too will have been infected.)
Yet as the example of the IEEE shows, agreement on language can also
be a cooperatively established public good. If you and I wish to undergo an
exchange of thoughts via language, the human telepathy, then it is in our mutual
interest that we use the same word for similar concepts—preferably, concepts
similar to the limit of resolution in our brain’s representation thereof—even
though we have no obvious mutual interest in using any particular word for a
concept.
We have no obvious mutual interest in using the word “oto” to mean sound,
or “sound” to mean oto; but we have a mutual interest in using the same word,
whichever word it happens to be. (Preferably, words we use frequently should
be short, but let’s not get into information theory just yet.)
But, while we have a mutual interest, it is not strictly necessary that you
and I use the similar labels internally; it is only convenient. If I know that, to
you, “oto” means sound—that is, you associate “oto” to a concept very similar
to the one I associate to “sound”—then I can say “Paper crumpling makes a
crackling oto.” It requires extra thought, but I can do it if I want.
Similarly, if you say “What is the walking-stick of a bowling ball dropping
on the floor?” and I know which concept you associate with the syllables
“walking-stick,” then I can figure out what you mean. It may require some
thought, and give me pause, because I ordinarily associate “walking-stick” with
a different concept. But I can do it just fine.
When humans really want to communicate with each other, we’re hard to
stop! If we’re stuck on a deserted island with no common language, we’ll take
up sticks and draw pictures in sand.
Albert’s appeal to the Argument from Common Usage assumes that agree-
ment on language is a cooperatively established public good. Yet Albert as-
sumes this for the sole purpose of rhetorically accusing Barry of breaking
the agreement, and endangering the public good. Now the falling-tree argu-
ment has gone all the way from botany to semantics to politics; and so Barry
responds by challenging Albert for the authority to define the word.
A rationalist, with the discipline of hugging the query active, would notice
that the conversation had gone rather far astray.
Oh, dear reader, is it all really necessary? Albert knows what Barry means by
“sound.” Barry knows what Albert means by “sound.” Both Albert and Barry
have access to words, such as “acoustic vibrations” or “auditory experience,”
which they already associate to the same concepts, and which can describe
events in the forest without ambiguity. If they were stuck on a deserted island,
trying to communicate with each other, their work would be done.
When both sides know what the other side wants to say, and both sides
accuse the other side of defecting from “common usage,” then whatever it is
they are about, it is clearly not working out a way to communicate with each
other. But this is the whole benefit that common usage provides in the first
place.
Why would you argue about the meaning of a word, two sides trying to
wrest it back and forth? If it’s just a namespace conflict that has gotten blown
out of proportion, and nothing more is at stake, then the two sides need merely
generate two new words and use them consistently.
Yet often categorizations function as hidden inferences and disguised
queries. Is atheism a “religion”? If someone is arguing that the reasoning
methods used in atheism are on a par with the reasoning methods used in Ju-
daism, or that atheism is on a par with Islam in terms of causally engendering
violence, then they have a clear argumentative stake in lumping it all together
into an indistinct gray blur of “faith.”
Or consider the fight to blend together blacks and whites as “people.” This
would not be a time to generate two words—what’s at stake is exactly the idea
that you shouldn’t draw a moral distinction.
But once any empirical proposition is at stake, or any moral proposition,
you can no longer appeal to common usage.
If the question is how to cluster together similar things for purposes of
inference, empirical predictions will depend on the answer; which means that
definitions can be wrong. A conflict of predictions cannot be settled by an
opinion poll.
If you want to know whether atheism should be clustered with supernatural-
ist religions for purposes of some particular empirical inference, the dictionary
can’t answer you.
If you want to know whether blacks are people, the dictionary can’t answer
you.
If everyone believes that the red light in the sky is Mars the God of War, the
dictionary will define “Mars” as the God of War. If everyone believes that fire
is the release of phlogiston, the dictionary will define “fire” as the release of
phlogiston.
There is an art to using words; even when definitions are not literally true
or false, they are often wiser or more foolish. Dictionaries are mere histories
of past usage; if you treat them as supreme arbiters of meaning, it binds you to
the wisdom of the past, forbidding you to do better.
Though do take care to ensure (if you must depart from the wisdom of the
past) that people can figure out what you’re trying to swim.
*
166
Empty Labels
Consider (yet again) the Aristotelian idea of categories. Let’s say that there’s
some object with properties A, B, C, D, and E, or at least it looks E-ish.
Fred: “You mean that thing over there is blue, round, fuzzy,
and—”
Me: “In Aristotelian logic, it’s not supposed to make a differ-
ence what the properties are, or what I call them. That’s why I’m
just using the letters.”
Next, I invent the Aristotelian category “zawa,” which describes those objects,
all those objects, and only those objects, that have properties A, C, and D.
Then I add another word, “yokie,” which describes all and only objects that are
B and E; and the word “xippo,” which describes all and only objects which
are E but not D.
Me: “Object 1 is zawa and yokie, but not xippo.”
Fred: “Wait, is it luminescent? I mean, is it E?”
Me: “Yes. That is the only possibility on the information
given.”
Fred: “I’d rather you spelled it out.”
Me: “Fine: Object 1 is A, zawa, B, yokie, C, D, E, and not
xippo.”
Fred: “Amazing! You can tell all that just by looking?”
Impressive, isn’t it? Let’s invent even more new words: “Bolo” is A, C, and
yokie; “mun” is A, C, and xippo; and “merlacdonian” is bolo and mun.
Pointlessly confusing? I think so too. Let’s replace the labels with the
definitions:

    “Zawa” is [A, C, D]; “yokie” is [B, E]; “xippo” is [E, ¬D]; “bolo” is
    [A, C, [B, E]]; “mun” is [A, C, [E, ¬D]]; and “merlacdonian” is
    [A, C, [B, E]] and [A, C, [E, ¬D]].
And the thing to remember about the Aristotelian idea of categories is that
[A, C, D] is the entire information of “zawa.” It’s not just that I can vary the
label, but that I can get along just fine without any label at all—the rules for
Aristotelian classes work purely on structures like [A, C, D]. To call one of
these structures “zawa,” or attach any other label to it, is a human convenience
(or inconvenience) which makes not the slightest difference to the Aristotelian
rules.
Let’s say that “human” is to be defined as a mortal featherless biped. Then
the classic syllogism would have the form:

    All [mortal, ¬feathers, biped] are mortal.
    Socrates is a [mortal, ¬feathers, biped].
    Therefore, Socrates is mortal.
The feat of reasoning looks a lot less impressive now, doesn’t it?
Here the illusion of inference comes from the labels, which conceal the
premises, and pretend to novelty in the conclusion. Replacing labels with
definitions reveals the illusion, making visible the tautology’s empirical un-
helpfulness. You can never say that Socrates is a [mortal, ¬feathers, biped]
until you have observed him to be mortal.
There’s an idea, which you may have noticed I hate, that “you can de-
fine a word any way you like.” This idea came from the Aristotelian notion
of categories; since, if you follow the Aristotelian rules exactly and without
flaw—which humans never do; Aristotle knew perfectly well that Socrates was
human, even though that wasn’t justified under his rules—but, if some imagi-
nary nonhuman entity were to follow the rules exactly, they would never arrive
at a contradiction. They wouldn’t arrive at much of anything: they couldn’t
say that Socrates is a [mortal, ¬feathers, biped] until they observed him to be
mortal.
But it’s not so much that labels are arbitrary in the Aristotelian system, as
that the Aristotelian system works fine without any labels at all—it cranks out
exactly the same stream of tautologies, they just look a lot less impressive. The
labels are only there to create the illusion of inference.
So if you’re going to have an Aristotelian proverb at all, the proverb should
be, not “I can define a word any way I like,” nor even, “Defining a word never
has any consequences,” but rather, “Definitions don’t need words.”
*
167
Taboo Your Words
In the game Taboo (by Hasbro), the objective is for a player to have their partner
guess a word written on a card, without using that word or five additional words
listed on the card. For example, you might have to get your partner to say
“baseball” without using the words “sport,” “bat,” “hit,” “pitch,” “base” or of
course “baseball.”
As soon as I see a problem like that, I at once think, “An artificial group
conflict in which you use a long wooden cylinder to whack a thrown spheroid,
and then run between four safe positions.” It might not be the most efficient
strategy to convey the word “baseball” under the stated rules—that might be,
“It’s what the Yankees play”—but the general skill of blanking a word out of my
mind was one I’d practiced for years, albeit with a different purpose.
In the previous essay we saw how replacing terms with definitions could
reveal the empirical unproductivity of the classical Aristotelian syllogism. All
humans are mortal (and also, apparently, featherless bipeds); Socrates is hu-
man; therefore Socrates is mortal. When we replace the word “human” by its
apparent definition, the following underlying reasoning is revealed:

    All [mortal, ¬feathers, biped] are mortal.
    Socrates is a [mortal, ¬feathers, biped].
    Therefore, Socrates is mortal.
But the principle of replacing words by definitions applies much more broadly:

    Albert: “A tree falling in a deserted forest makes a sound.”
    Barry: “A tree falling in a deserted forest does not make a sound.”

Clearly, since one says “sound” and one says “not sound,” we must have a
contradiction, right? But suppose that they both dereference their pointers
before speaking:

    Albert: “A tree falling in a deserted forest generates acoustic vibrations.”
    Barry: “A tree falling in a deserted forest generates no auditory experiences.”

Now there is no longer an apparent collision; the two statements are about
different tests, and both can be true at once.
*
168
Replace the Symbol with the
Substance
What does it take to—as in the previous essay’s example—see a “baseball game”
as “An artificial group conflict in which you use a long wooden cylinder to
whack a thrown spheroid, and then run between four safe positions”? What
does it take to play the rationalist version of Taboo, in which the goal is not to
find a synonym that isn’t on the card, but to find a way of describing without
the standard concept-handle?
You have to visualize. You have to make your mind’s eye see the details, as
though looking for the first time. You have to perform an Original Seeing.
Is that a “bat”? No, it’s a long, round, tapering, wooden rod, narrowing at
one end so that a human can grasp and swing it.
Is that a “ball”? No, it’s a leather-covered spheroid with a symmetrical
stitching pattern, hard but not metal-hard, which someone can grasp and
throw, or strike with the wooden rod, or catch.
Are those “bases”? No, they’re fixed positions on a game field, that players
try to run to as quickly as possible because of their safety within the game’s
artificial rules.
The chief obstacle to performing an original seeing is that your mind already
has a nice neat summary, a nice little easy-to-use concept handle. Like the
word “baseball,” or “bat,” or “base.” It takes an effort to stop your mind from
sliding down the familiar path, the easy path, the path of least resistance, where
the small featureless word rushes in and obliterates the details you’re trying
to see. A word itself can have the destructive force of cliché; a word itself can
carry the poison of a cached thought.
Playing the game of Taboo—being able to describe without using the stan-
dard pointer/label/handle—is one of the fundamental rationalist capacities. It
occupies the same primordial level as the habit of constantly asking “Why?” or
“What does this belief make me anticipate?”
The art is closely related to:
• Pragmatism, because seeing in this way often gives you a much closer
connection to anticipated experience, rather than propositional belief;
• Reductionism, because seeing in this way often forces you to drop down
to a lower level of organization, look at the parts instead of your eye
skipping over the whole;
• Hugging the query, because words often distract you from the question
you really want to ask;
• The writer’s rule of “Show, don’t tell!,” which has power among ratio-
nalists;
*
169
Fallacies of Compression
“The map is not the territory,” as the saying goes. The only life-size, atomically
detailed, 100% accurate map of California is California. But California has im-
portant regularities, such as the shape of its highways, that can be described us-
ing vastly less information—not to mention vastly less physical material—than
it would take to describe every atom within the state borders. Hence the other
saying: “The map is not the territory, but you can’t fold up the territory and
put it in your glove compartment.”
A paper map of California, at a scale of 10 kilometers to 1 centimeter (a
million to one), doesn’t have room to show the distinct position of two fallen
leaves lying a centimeter apart on the sidewalk. Even if the map tried to show
the leaves, the leaves would appear as the same point on the map; or rather the
map would need a feature size of 10 nanometers, which is a finer resolution
than most book printers handle, not to mention human eyes.
Reality is very large—just the part we can see is billions of lightyears across.
But your map of reality is written on a few pounds of neurons, folded up
to fit inside your skull. I don’t mean to be insulting, but your skull is tiny.
Comparatively speaking.
Inevitably, then, certain things that are distinct in reality, will be compressed
into the same point on your map.
But what this feels like from inside is not that you say, “Oh, look, I’m
compressing two things into one point on my map.” What it feels like from
inside is that there is just one thing, and you are seeing it.
A sufficiently young child, or a sufficiently ancient Greek philosopher,
would not know that there were such things as “acoustic vibrations” or “audi-
tory experiences.” There would just be a single thing that happened when a
tree fell; a single event called “sound.”
To realize that there are two distinct events, underlying one point on your
map, is an essentially scientific challenge—a big, difficult scientific challenge.
Sometimes fallacies of compression result from confusing two known things
under the same label—you know about acoustic vibrations, and you know
about auditory processing in brains, but you call them both “sound” and so
confuse yourself. But the more dangerous fallacy of compression arises from
having no idea whatsoever that two distinct entities even exist. There is just one
mental folder in the filing system, labeled “sound,” and everything thought
about “sound” drops into that one folder. It’s not that there are two folders with
the same label; there’s just a single folder. By default, the map is compressed;
why would the brain create two mental buckets where one would serve?
Or think of a mystery novel in which the detective’s critical insight is that
one of the suspects has an identical twin. In the course of the detective’s
ordinary work, their job is just to observe that Carol is wearing red, that
she has black hair, that her sandals are leather—but all these are facts about
Carol. It’s easy enough to question an individual fact, like WearsRed(Carol)
or BlackHair(Carol). Maybe BlackHair(Carol) is false. Maybe Carol dyes her
hair. Maybe BrownHair(Carol). But it takes a subtler detective to wonder if the
Carol in WearsRed(Carol) and BlackHair(Carol)—the Carol file into which
their observations drop—should be split into two files. Maybe there are two
Carols, so that the Carol who wore red is not the same woman as the Carol
who had black hair.
Here it is the very act of creating two different buckets that is the stroke of
genius insight. ’Tis easier to question one’s facts than one’s ontology.
The map of reality contained in a human brain, unlike a paper map of
California, can expand dynamically when we write down more detailed de-
scriptions. But what this feels like from inside is not so much zooming in on a
map, as fissioning an indivisible atom—taking one thing (it felt like one thing)
and splitting it into two or more things.
Often this manifests in the creation of new words, like “acoustic vibrations”
and “auditory experiences” instead of just “sound.” Something about creating
the new name seems to allocate the new bucket. The detective is liable to start
calling one of their suspects “Carol-2” or “the Other Carol” almost as soon as
they realize that there are two Carols.
But expanding the map isn’t always as simple as generating new city names.
It is a stroke of scientific insight to realize that such things as acoustic vibrations,
or auditory experiences, even exist.
The obvious modern-day illustration would be words like “intelligence” or
“consciousness.” Every now and then one sees a press release claiming that a
research study has “explained consciousness” because a team of neurologists
investigated a 40Hz electrical rhythm that might have something to do with
cross-modality binding of sensory information, or because they investigated
the reticular activating system that keeps humans awake. That’s an extreme
example, and the usual failures are more subtle, but they are of the same kind.
The part of “consciousness” that people find most interesting is reflectivity,
self-awareness, realizing that the person I see in the mirror is “me”; that and the
hard problem of subjective experience as distinguished by David Chalmers.
We also label “conscious” the state of being awake, rather than asleep, in our
daily cycle. But they are all different concepts going under the same name, and
the underlying phenomena are different scientific puzzles. You can explain
being awake without explaining reflectivity or subjectivity.
Fallacies of compression also underlie the bait-and-switch technique in
philosophy—you argue about “consciousness” under one definition (like the
ability to think about thinking) and then apply the conclusions to “conscious-
ness” under a different definition (like subjectivity). Of course it may be that
the two are the same thing, but if so, genuinely understanding this fact would
require first a conceptual split and then a genius stroke of reunification.
Expanding your map is (I say again) a scientific challenge: part of the art of
science, the skill of inquiring into the world. (And of course you cannot solve
a scientific challenge by appealing to dictionaries, nor master a complex skill
of inquiry by saying “I can define a word any way I like.”) Where you see a
single confusing thing, with protean and self-contradictory attributes, it is a
good guess that your map is cramming too much into one point—you need to
pry it apart and allocate some new buckets. This is not like defining the single
thing you see, but it does often follow from figuring out how to talk about the
thing without using a single mental handle.
So the skill of prying apart the map is linked to the rationalist version of
Taboo, and to the wise use of words; because words often represent the points
on our map, the labels under which we file our propositions and the buckets
into which we drop our information. Avoiding a single word, or allocating
new ones, is often part of the skill of expanding the map.
*
170
Categorizing Has Consequences
Among the many genetic variations and mutations you carry in your genome,
there are a very few alleles you probably know—including those determining
your blood type: the presence or absence of the A, B, and + antigens. If you
receive a blood transfusion containing an antigen you don’t have, it will trigger
an allergic reaction. It was Karl Landsteiner’s discovery of this fact, and how
to test for compatible blood types, that made it possible to transfuse blood
without killing the patient. (1930 Nobel Prize in Medicine.) Also, if a mother
with blood type A (for example) bears a child with blood type A+, the mother
may acquire an allergic reaction to the + antigen; if she has another child with
blood type A+, the child will be in danger, unless the mother takes an allergic
suppressant during pregnancy. Thus people learn their blood types before they
marry.
Oh, and also: people with blood type A are earnest and creative, while peo-
ple with blood type B are wild and cheerful. People with type O are agreeable
and sociable, while people with type AB are cool and controlled. (You would
think that O would be the absence of A and B, while AB would just be A plus B,
but no . . .) All this, according to the Japanese blood type theory of personality.
It would seem that blood type plays the role in Japan that astrological signs
play in the West, right down to blood type horoscopes in the daily newspaper.
This fad is especially odd because blood types have never been mysterious,
not in Japan and not anywhere. We only know blood types even exist thanks
to Karl Landsteiner. No mystic witch doctor, no venerable sorcerer, ever said a
word about blood types; there are no ancient, dusty scrolls to shroud the error
in the aura of antiquity. If the medical profession claimed tomorrow that it
had all been a colossal hoax, we layfolk would not have one scrap of evidence
from our unaided senses to contradict them.
There’s never been a war between blood types. There’s never even been a
political conflict between blood types. The stereotypes must have arisen strictly
from the mere existence of the labels.
Now, someone is bound to point out that this is a story of categorizing
humans. Does the same thing happen if you categorize plants, or rocks, or
office furniture? I can’t recall reading about such an experiment, but of course,
that doesn’t mean one hasn’t been done. (I’d expect the chief difficulty of
doing such an experiment would be finding a protocol that didn’t mislead the
subjects into thinking that, since the label was given you, it must be significant
somehow.) So while I don’t mean to update on imaginary evidence, I would
predict a positive result for the experiment: I would expect them to find that
mere labeling had power over all things, at least in the human imagination.
You can see this in terms of similarity clusters: once you draw a bound-
ary around a group, the mind starts trying to harvest similarities from the
group. And unfortunately the human pattern-detectors seem to operate in
such overdrive that we see patterns whether they’re there or not; a weakly nega-
tive correlation can be mistaken for a strong positive one with a bit of selective
memory.
You can see this in terms of neural algorithms: creating a name for a set of
things is like allocating a subnetwork to find patterns in them.
You can see this in terms of a compression fallacy: things given the same
name end up dumped into the same mental bucket, blurring them together
into the same point on the map.
Or you can see this in terms of the boundless human ability to make stuff
up out of thin air and believe it because no one can prove it’s wrong. As soon
as you name the category, you can start making up stuff about it. The named
thing doesn’t have to be perceptible; it doesn’t have to exist; it doesn’t even
have to be coherent.
And no, it’s not just Japan: Here in the West, a blood-type-based diet book
called Eat Right 4 Your Type was a bestseller.
Any way you look at it, drawing a boundary in thingspace is not a neutral
act. Maybe a more cleanly designed, more purely Bayesian AI could ponder
an arbitrary class and not be influenced by it. But you, a human, do not have
that option. Categories are not static things in the context of a human brain; as
soon as you actually think of them, they exert force on your mind. One more
reason not to believe you can define a word any way you like.
*
171
Sneaking in Connotations
In the previous essay, we saw that in Japan, blood types have taken the place of
astrology—if your blood type is AB, for example, you’re supposed to be “cool
and controlled.”
So suppose we decided to invent a new word, “wiggin,” and defined this
word to mean people with green eyes and black hair—
*
172
Arguing “By Definition”
*
173
Where to Draw the Boundary?
Long have I pondered the meaning of the word “Art,” and at last
I’ve found what seems to me a satisfactory definition: “Art is that
which is designed for the purpose of creating a reaction in an
audience.”
Just because there’s a word “art” doesn’t mean that it has a meaning, floating
out there in the void, which you can discover by finding the right definition.
It feels that way, but it is not so.
Wondering how to define a word means you’re looking at the problem
the wrong way—searching for the mysterious essence of what is, in fact, a
communication signal.
Now, there is a real challenge which a rationalist may legitimately attack,
but the challenge is not to find a satisfactory definition of a word. The real
challenge can be played as a single-player game, without speaking aloud. The
challenge is figuring out which things are similar to each other—which things
are clustered together—and sometimes, which things have a common cause.
If you define “eluctromugnetism” to include lightning, include compasses,
exclude light, and include Mesmer’s “animal magnetism” (what we now
call hypnosis), then you will have some trouble asking “How does eluctro-
mugnetism work?” You have lumped together things which do not belong
together, and excluded others that would be needed to complete a set. (This
example is historically plausible; Mesmer came before Faraday.)
We could say that eluctromugnetism is a wrong word, a boundary in
thingspace that loops around and swerves through the clusters, a cut that
fails to carve reality along its natural joints.
Figuring where to cut reality in order to carve along the joints—this is the
problem worthy of a rationalist. It is what people should be trying to do, when
they set out in search of the floating essence of a word.
And make no mistake: it is a scientific challenge to realize that you need
a single word to describe breathing and fire. So do not think to consult the
dictionary editors, for that is not their job.
What is “art”? But there is no essence of the word, floating in the void.
Perhaps you come to me with a long list of the things that you call “art” and
“not art”:
And you say to me: “It feels intuitive to me to draw this boundary, but I don’t
know why—can you find me an intension that matches this extension? Can
you give me a simple description of this boundary?”
So I reply: “I think it has to do with admiration of craftsmanship: work
going in and wonder coming out. What the included items have in common
is the similar aesthetic emotions that they inspire, and the deliberate human
effort that went into them with the intent of producing such an emotion.”
Is this helpful, or is it just cheating at Taboo? I would argue that the list of
which human emotions are or are not aesthetic is far more compact than the
list of everything that is or isn’t art. You might be able to see those emotions
lighting up an fMRI scan—I say this by way of emphasizing that emotions are
not ethereal.
But of course my definition of art is not the real point. The real point is that
you could well dispute either the intension or the extension of my definition.
You could say, “Aesthetic emotion is not what these things have in common;
what they have in common is an intent to inspire any complex emotion for
the sake of inspiring it.” That would be disputing my intension, my attempt
to draw a curve through the data points. You would say, “Your equation may
roughly fit those points, but it is not the true generating distribution.”
Or you could dispute my extension by saying, “Some of these things do
belong together—I can see what you’re getting at—but the Python language
shouldn’t be on the list, and Modern Art should be.” (This would mark you as a
philistine, but you could argue it.) Here, the presumption is that there is indeed
an underlying curve that generates this apparent list of similar and dissimilar
things—that there is a rhyme and reason, even though you haven’t said yet
where it comes from—but I have unwittingly lost the rhythm and included some
data points from a different generator.
Long before you know what it is that electricity and magnetism have in
common, you might still suspect—based on surface appearances—that “animal
magnetism” does not belong on the list.
Once upon a time it was thought that the word “fish” included dolphins.
Now you could play the oh-so-clever arguer, and say, “The list: {Salmon, gup-
pies, sharks, dolphins, trout} is just a list—you can’t say that a list is wrong. I
can prove in set theory that this list exists. So my definition of fish, which is
simply this extensional list, cannot possibly be ‘wrong’ as you claim.”
Or you could stop playing games and admit that dolphins don’t belong on
the fish list.
You come up with a list of things that feel similar, and take a guess at why
this is so. But when you finally discover what they really have in common, it
may turn out that your guess was wrong. It may even turn out that your list
was wrong.
You cannot hide behind a comforting shield of correct-by-definition. Both
extensional definitions and intensional definitions can be wrong, can fail to
carve reality at the joints.
Categorizing is a guessing endeavor, in which you can make mistakes; so
it’s wise to be able to admit, from a theoretical standpoint, that your definition-
guesses can be “mistaken.”
*
174
Entropy, and Short Codes
(If you aren’t familiar with Bayesian inference, this may be a good time to read
An Intuitive Explanation of Bayes’s Theorem.)
Suppose you have a system X that’s equally likely to be in any of 8 possible
states:
{X1 , X2 , X3 , X4 , X5 , X6 , X7 , X8 } .
The entropy of this system, it turns out, is 3 bits: on average, we’ll have to ask
3 yes-or-no questions to find out X’s exact state. One way to do this is to
designate each state by a 3-bit binary code, for instance:

X1: 001  X2: 010  X3: 011  X4: 100  X5: 101  X6: 110  X7: 111  X8: 000 .

So if I asked “Is the first symbol 1?” and heard “yes,” then asked “Is the second
symbol 1?” and heard “no,” then asked “Is the third symbol 1?” and heard “no,”
I would know that X was in state 4.
Now suppose that the system Y has four possible states with the following
probabilities:
Y1 : 1/2 (50%) Y2 : 1/4 (25%)
Y3 : 1/8 (12.5%) Y4 : 1/8 (12.5%) .
Then the entropy of Y would be 1.75 bits, meaning that we can find out its
value by asking 1.75 yes-or-no questions.
What does it mean to talk about asking one and three-fourths of a question?
Imagine that we designate the states of Y using the following code:
Y1 : 1 Y2 : 01 Y3 : 001 Y4 : 000 .
First you ask, “Is the first symbol 1?” If the answer is “yes,” you’re done: Y is
in state 1. This happens half the time, so 50% of the time, it takes 1 yes-or-no
question to find out Y ’s state.
Suppose that instead the answer is “No.” Then you ask, “Is the second
symbol 1?” If the answer is “yes,” you’re done: Y is in state 2. The system Y is
in state 2 with probability 1/4, and each time Y is in state 2 we discover this
fact using two yes-or-no questions, so 25% of the time it takes 2 questions to
discover Y ’s state.
If the answer is “No” twice in a row, you ask “Is the third symbol 1?” If
“yes,” you’re done and Y is in state 3; if “no,” you’re done and Y is in state 4.
The 1/8 of the time that Y is in state 3, it takes three questions; and the 1/8 of
the time that Y is in state 4, it takes three questions. So the expected number of
questions is (1/2 × 1) + (1/4 × 2) + (1/8 × 3) + (1/8 × 3) = 1.75, which is why
we say the entropy of Y is 1.75 bits.
The general formula for the entropy H(S) of a system S is the sum, over all
Si , of −P (Si ) log2 (P (Si )).
For example, the log (base 2) of 1/8 is −3. So −(1/8 × −3) = 0.375 is
the contribution of state S4 to the total entropy: 1/8 of the time, we have to
ask 3 questions.
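Checking that formula in code, for the distribution over Y given above (the function below is just the standard entropy sum; nothing in it is specific to this example):

    import math

    def entropy(probs):
        """Shannon entropy in bits: the sum over states of -p * log2(p)."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    p_Y = [1/2, 1/4, 1/8, 1/8]
    print(entropy(p_Y))            # 1.75 bits
    print(entropy([1/8] * 8))      # 3.0 bits, the 8-state system X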
You can’t always devise a perfect code for a system, but if you have to tell
someone the state of arbitrarily many copies of S in a single message, you can
get arbitrarily close to a perfect code. (Google “arithmetic coding” for a simple
method.)
Now, you might ask: “Why not use the code 10 for Y4 , instead of 000?
Wouldn’t that let us transmit messages more quickly?”
But if you use the code 10 for Y4 , then when someone answers “Yes” to
the question “Is the first symbol 1?,” you won’t know yet whether the system
state is Y1 (1) or Y4 (10). In fact, if you change the code this way, the whole
system falls apart—because if you hear “1001,” you don’t know if it means “Y4 ,
followed by Y2 ” or “Y1 , followed by Y3 .”
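To see the ambiguity concretely, here is a small decoding sketch (illustrative only): the original code is prefix-free, so a greedy decoder recovers the states unambiguously, while the altered code admits two readings of the same bit string:

    def decode(bits, code):
        """Greedily decode a bit string using a code mapping state -> codeword."""
        inverse = {w: s for s, w in code.items()}
        out, buf = [], ""
        for b in bits:
            buf += b
            if buf in inverse:
                out.append(inverse[buf])
                buf = ""
        return out, buf  # leftover buffer is empty when decoding succeeds

    prefix_free = {"Y1": "1", "Y2": "01", "Y3": "001", "Y4": "000"}
    print(decode("1001", prefix_free))   # (['Y1', 'Y3'], ''): unambiguous

    broken = {"Y1": "1", "Y2": "01", "Y3": "001", "Y4": "10"}
    # "1001" was meant as Y4 then Y2, but it reads just as well as Y1 then Y3;
    # the receiver has no way to tell which was intended.
    print(decode("1001", broken))        # (['Y1', 'Y3'], '')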
The moral is that short words are a conserved resource.
The key to creating a good code—a code that transmits messages as com-
pactly as possible—is to reserve short words for things that you’ll need to say
frequently, and use longer words for things that you won’t need to say as often.
When you take this art to its limit, the length of the message you need
to describe something corresponds exactly or almost exactly to its probabil-
ity. This is the Minimum Description Length or Minimum Message Length
formalization of Occam’s Razor.
And so even the labels that we use for words are not quite arbitrary. The
sounds that we attach to our concepts can be better or worse, wiser or more
foolish. Even apart from considerations of common usage!
I say all this, because the idea that “You can X any way you like” is a huge
obstacle to learning how to X wisely. “It’s a free country; I have a right to my
own opinion” obstructs the art of finding truth. “I can define a word any way
I like” obstructs the art of carving reality at its joints. And even the sensible-
sounding “The labels we attach to words are arbitrary” obstructs awareness
of compactness. Prosody too, for that matter—Tolkien once observed what a
beautiful sound the phrase “cellar door” makes; that is the kind of awareness it
takes to use language like Tolkien.
The length of words also plays a nontrivial role in the cognitive science of
language:
Consider the phrases “recliner,” “chair,” and “furniture.” Recliner is a more
specific category than chair; furniture is a more general category than chair.
But the vast majority of chairs have a common use—you use the same sort of
motor actions to sit down in them, and you sit down in them for the same sort
of purpose (to take your weight off your feet while you eat, or read, or type,
or rest). Recliners do not depart from this theme. “Furniture,” on the other
hand, includes things like beds and tables which have different uses, and call
up different motor functions, from chairs.
In the terminology of cognitive psychology, “chair” is a basic-level category.
People have a tendency to talk, and presumably think, at the basic level of
categorization—to draw the boundary around “chairs,” rather than around
the more specific category “recliner,” or the more general category “furniture.”
People are more likely to say “You can sit in that chair” than “You can sit in
that recliner” or “You can sit in that furniture.”
And it is no coincidence that the word for “chair” contains fewer syllables
than either “recliner” or “furniture.” Basic-level categories, in general, tend
to have short names; and nouns with short names tend to refer to basic-level
categories. Not a perfect rule, of course, but a definite tendency. Frequent use
goes along with short words; short words go along with frequent use.
Or as Douglas Hofstadter put it, there’s a reason why the English language
uses “the” to mean “the” and “antidisestablishmentarianism” to mean “antidis-
establishmentarianism” instead of antidisestablishmentarianism other way
around.
*
175
Mutual Information, and Density in
Thingspace
Suppose you have a system X that can be in any of 8 states, which are all
equally probable (relative to your current state of knowledge), and a system Y
that can be in any of 4 states, all equally probable.
The entropy of X, as defined in the previous essay, is 3 bits; we’ll need to
ask 3 yes-or-no questions to find out X’s exact state. The entropy of Y is 2
bits; we have to ask 2 yes-or-no questions to find out Y ’s exact state. This
may seem obvious since 2³ = 8 and 2² = 4, so 3 questions can distinguish
8 possibilities and 2 questions can distinguish 4 possibilities; but remember
that if the possibilities were not all equally likely, we could use a more clever
code to discover Y ’s state using e.g. 1.75 questions on average. In this case,
though, X’s probability mass is evenly distributed over all its possible states,
and likewise Y, so we can’t use any clever codes.
What is the entropy of the combined system (X, Y )?
You might be tempted to answer, “It takes 3 questions to find out X, and
then 2 questions to find out Y, so it takes 5 questions total to find out the state
of X and Y. ”
But what if the two variables are entangled, so that learning the state of Y
tells us something about the state of X?
In particular, let’s suppose that X and Y are either both odd or both even.
Now if we receive a 3-bit message (ask 3 questions) and learn that X is in
state X5 , we know that Y is in state Y1 or state Y3 , but not state Y2 or state
Y4 . So the single additional question “Is Y in state Y3 ?,” answered “No,” tells
us the entire state of (X, Y ): X = X5 , Y = Y1 . And we learned this with a
total of 4 questions.
Conversely, if we learn that Y is in state Y4 using two questions, it will take
us only an additional two questions to learn whether X is in state X2 , X4 , X6 ,
or X8 . Again, four questions to learn the state of the joint system.
The mutual information of two variables is defined as the difference between
the entropy of the joint system and the entropy of the independent systems:
I(X; Y ) = H(X) + H(Y ) − H(X, Y ).
Here there is one bit of mutual information between the two systems: Learn-
ing X tells us one bit of information about Y (cuts down the space of pos-
sibilities from 4 possibilities to 2, a factor-of-2 decrease in the volume) and
learning Y tells us one bit of information about X (cuts down the possibility
space from 8 possibilities to 4).
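The same numbers can be checked directly from the joint distribution of this example (taking the allowed same-parity pairs to be equally likely, as the question-counting above assumes), using nothing beyond the definitions already given:

    import math
    from itertools import product

    # X has 8 states, Y has 4; only same-parity pairs occur, all equally likely.
    joint = {}
    for x, y in product(range(1, 9), range(1, 5)):
        joint[(x, y)] = 1/16 if x % 2 == y % 2 else 0.0

    def H(dist):
        return -sum(p * math.log2(p) for p in dist.values() if p > 0)

    p_x = {x: sum(joint[(x, y)] for y in range(1, 5)) for x in range(1, 9)}
    p_y = {y: sum(joint[(x, y)] for x in range(1, 9)) for y in range(1, 5)}

    print(H(p_x), H(p_y), H(joint))      # 3.0, 2.0, 4.0
    print(H(p_x) + H(p_y) - H(joint))    # 1.0 bit of mutual information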
What about when probability mass is not evenly distributed? Last essay, for
example, we discussed the case in which Y had the probabilities 1/2, 1/4, 1/8,
1/8 for its four states. Let us take this to be our probability distribution over
Y, considered independently—if we saw Y, without seeing anything else, this
is what we’d expect to see. And suppose the variable Z has two states, Z1 and
Z2 , with probabilities 3/8 and 5/8 respectively.
Then if and only if the joint distribution of Y and Z is as follows, there is
zero mutual information between Y and Z:
Z1 Y1 : 3/16 Z1 Y2 : 3/32 Z1 Y3 : 3/64 Z1 Y4 : 3/64
Z2 Y1 : 5/16 Z2 Y2 : 5/32 Z2 Y3 : 5/64 Z2 Y4 : 5/64 .
This distribution obeys the law
P (Y, Z) = P (Y )P (Z) .
So, just by inspecting the joint distribution, we can determine whether the
marginal variables Y and Z are independent; that is, whether the joint distri-
bution factors into the product of the marginal distributions; whether, for all
Y and Z, we have P (Y, Z) = P (Y )P (Z).
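As a routine check, the table above really does factorize into its marginals:

    from itertools import product

    p_y = {"Y1": 1/2, "Y2": 1/4, "Y3": 1/8, "Y4": 1/8}
    p_z = {"Z1": 3/8, "Z2": 5/8}

    # The joint table given above for the independent case.
    joint = {("Z1", "Y1"): 3/16, ("Z1", "Y2"): 3/32, ("Z1", "Y3"): 3/64, ("Z1", "Y4"): 3/64,
             ("Z2", "Y1"): 5/16, ("Z2", "Y2"): 5/32, ("Z2", "Y3"): 5/64, ("Z2", "Y4"): 5/64}

    # Every cell equals the product of its marginals, so Y and Z are independent.
    print(all(abs(joint[(z, y)] - p_z[z] * p_y[y]) < 1e-12
              for z, y in product(p_z, p_y)))   # True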
This last is significant because, by Bayes’s Rule,

P(Yi|Zj) = P(Yi,Zj)/P(Zj) = P(Yi)P(Zj)/P(Zj) = P(Yi) .

In English: “After you learn Zj, your belief about Yi is just what it was before.”
So when the distribution factorizes—when P (Y, Z) = P (Y )P (Z)—this
is equivalent to “Learning about Y never tells us anything about Z or vice
versa.”
From which you might suspect, correctly, that there is no mutual informa-
tion between Y and Z. Where there is no mutual information, there is no
Bayesian evidence, and vice versa.
Suppose that in the distribution (Y, Z) above, we treated each possible
combination of Y and Z as a separate event—so that the distribution (Y, Z)
would have a total of 8 possibilities, with the probabilities shown—and then
we calculated the entropy of the distribution (Y, Z) the same way we would
calculate the entropy of any distribution:
H(Y, Z) = Σ_ij −P(Yi, Zj) × log2(P(Yi, Zj)) .
You would end up with the same total you would get if you separately calculated
the entropy of Y plus the entropy of Z. There is no mutual information
between the two variables, so our uncertainty about the joint system is not any
less than our uncertainty about the two systems considered separately. (I am
not showing the calculations, but you are welcome to do them; and I am not
showing the proof that this is true in general, but you are welcome to Google
on “Shannon entropy” and “mutual information.”)
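If you do want to run the numbers, a short sketch along these lines will do; it rebuilds the factorized table above and confirms that H(Y, Z) equals H(Y) + H(Z).

    from math import log2

    def entropy(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    p_y = [1/2, 1/4, 1/8, 1/8]
    p_z = [3/8, 5/8]

    # The factorized joint distribution from the table above: P(Y,Z) = P(Y)P(Z).
    joint = {(j, i): p_z[j] * p_y[i] for j in range(2) for i in range(4)}

    H_Y, H_Z = entropy(p_y), entropy(p_z)
    H_YZ = entropy(list(joint.values()))
    print(H_Y + H_Z, H_YZ)        # both about 2.704 bits
    print(H_Y + H_Z - H_YZ)       # about 0.0: no mutual information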
What if the joint distribution doesn’t factorize? For example:
If you add up the joint probabilities to get marginal probabilities, you should
find that P (Y1 ) = 1/2, P (Z1 ) = 3/8, and so on—the marginal probabilities
are the same as before.
But the joint probabilities do not always equal the product of the marginal
probabilities. For example, the probability P (Z1 Y2 ) equals 8/64, where
P (Z1 )P (Y2 ) would equal 3/8 × 1/4 = 6/64. That is, the probability of
running into Z1 Y2 together is greater than you’d expect based on the proba-
bilities of running into Z1 or Y2 separately.
Which in turn implies
P(Z1, Y2) > P(Z1)P(Y2)
P(Z1, Y2)/P(Y2) > P(Z1)
P(Z1|Y2) > P(Z1) ,
so that seeing Y2 makes Z1 more probable: there is Bayesian evidence between
the two variables, and hence mutual information.
Why, then, would you even want to have a word for “human”? Why not just
say “Socrates is a mortal featherless biped”?
Because it’s helpful to have shorter words for things that you encounter
often. If your code for describing single properties is already efficient, then
there will not be an advantage to having a special word for a conjunction—like
“human” for “mortal featherless biped”—unless things that are mortal and
featherless and bipedal, are found more often than the marginal probabilities
would lead you to expect.
In efficient codes, word length corresponds to probability—so the code
for Z1 Y2 will be just as long as the code for Z1 plus the code for Y2 , unless
P (Z1 Y2 ) > P (Z1 )P (Y2 ), in which case the code for the word can be shorter
than the codes for its parts.
And this in turn corresponds exactly to the case where we can infer some
of the properties of the thing from seeing its other properties. It must be more
likely than the default that featherless bipedal things will also be mortal.
Of course the word “human” really describes many, many more properties—
when you see a human-shaped entity that talks and wears clothes, you can infer
whole hosts of biochemical and anatomical and cognitive facts about it. To
replace the word “human” with a description of everything we know about hu-
mans would require us to spend an inordinate amount of time talking. But this
is true only because a featherless talking biped is far more likely than default to
be poisonable by hemlock, or have broad nails, or be overconfident.
Having a word for a thing, rather than just listing its properties, is a more
compact code precisely in those cases where we can infer some of those prop-
erties from the other properties. (With the exception perhaps of very primitive
words, like “red,” that we would use to send an entirely uncompressed de-
scription of our sensory experiences. But by the time you encounter a bug, or
even a rock, you’re dealing with nonsimple property collections, far above the
primitive level.)
So having a word “wiggin” for green-eyed black-haired people is more
useful than just saying “green-eyed black-haired person” precisely when:
• Green-eyed people are more likely than average to be black-haired (and
  vice versa), so that one property can be probabilistically inferred from
  the other; or
• People who are both green-eyed and black-haired share other properties
  at greater than default probability.
One may even consider the act of defining a word as a promise to this effect.
Telling someone, “I define the word ‘wiggin’ to mean a person with green eyes
and black hair,” by Gricean implication, asserts that the word “wiggin” will
somehow help you make inferences / shorten your messages.
If green-eyes and black hair have no greater than default probability to
be found together, nor does any other property occur at greater than default
probability along with them, then the word “wiggin” is a lie: The word claims
that certain people are worth distinguishing as a group, but they’re not.
In this case the word “wiggin” does not help describe reality more
compactly—it is not defined by someone sending the shortest message—it has
no role in the simplest explanation. Equivalently, the word “wiggin” will be
of no help to you in doing any Bayesian inference. Even if you do not call the
word a lie, it is surely an error.
And the way to carve reality at its joints is to draw your boundaries around
concentrations of unusually high probability density in Thingspace.
*
176
Superexponential Conceptspace, and
Simple Words
Thingspace, you might think, is a rather huge space. Much larger than reality,
for where reality only contains things that actually exist, Thingspace contains
everything that could exist.
Actually, the way I “defined” Thingspace to have dimensions for every
possible attribute—including correlated attributes like density and volume and
mass—Thingspace may be too poorly defined to have anything you could call
a size. But it’s important to be able to visualize Thingspace anyway. Surely,
no one can really understand a flock of sparrows if all they see is a cloud of
flapping cawing things, rather than a cluster of points in Thingspace.
But as vast as Thingspace may be, it doesn’t hold a candle to the size of
Conceptspace.
“Concept,” in machine learning, means a rule that includes or excludes
examples. If you see the data {2:+, 3:-, 14:+, 23:-, 8:+, 9:-} then
you might guess that the concept was “even numbers.” There is a rather large
literature (as one might expect) on how to learn concepts from data . . . given
random examples, given chosen examples . . . given possible errors in classifi-
cation . . . and most importantly, given different spaces of possible rules.
Suppose, for example, that we want to learn the concept “good days on
which to play tennis.” The possible attributes of Days in this example are
Sky (Sunny, Cloudy, or Rainy), AirTemp (Warm or Cold), Humidity (Normal or
High), and Wind (Strong or Weak). A hypothesis requires particular values
for some of these attributes and leaves the others open, written as “?”. One
classic procedure for learning such a concept from labeled example Days is to:
• Maintain the set of the most general hypotheses that fit the data—those
that positively classify as many examples as possible, while still fitting
the facts.
• Maintain another set of the most specific hypotheses that fit the data—
those that negatively classify as many examples as possible, while still
fitting the facts.
• Each time we see a new negative example, we strengthen all the most
general hypotheses as little as possible, so that the new set is again as
general as possible while fitting the facts.
• Each time we see a new positive example, we relax all the most specific
hypotheses as little as possible, so that the new set is again as specific as
possible while fitting the facts.
• We continue until we have only a single hypothesis left. This will be the
answer if the target concept was in our hypothesis space at all.
After seeing a few labeled Days, the set of most general hypotheses that fit
the data may contain several members, while the set of most specific
hypotheses contains the single member {Sunny, Warm, High, ?}.
Any other concept you can find that fits the data will be strictly more specific
than one of the most general hypotheses, and strictly more general than the
most specific hypothesis.
(For more on this, I recommend Tom Mitchell’s Machine Learning, from
which this example was adapted.1 )
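Mitchell's algorithm tracks only the most-general and most-specific boundary sets, but for a space this small one can be cruder. The sketch below (a rough illustration in Python, with made-up training Days rather than Mitchell's data) simply enumerates every hypothesis expressible in this representation—each attribute pinned to a value or left as “?”, plus the reject-everything hypothesis—and discards whichever hypotheses disagree with an observed classification.

    from itertools import product

    # Attribute values for the Day example, as listed above.
    ATTRIBUTES = [("Sunny", "Cloudy", "Rainy"),
                  ("Warm", "Cold"),
                  ("Normal", "High"),
                  ("Strong", "Weak")]

    def matches(hypothesis, day):
        # A hypothesis is a 4-tuple of required values, with "?" meaning "any value".
        return all(h == "?" or h == d for h, d in zip(hypothesis, day))

    # The restricted concept space: 4 * 3 * 3 * 3 = 108 conjunctive hypotheses,
    # plus the "never classify positively" hypothesis, for 109 total.
    NEVER = None
    concept_space = list(product(*[vals + ("?",) for vals in ATTRIBUTES])) + [NEVER]
    print(len(concept_space))   # 109

    def consistent(hypothesis, example, label):
        positive = False if hypothesis is NEVER else matches(hypothesis, example)
        return positive == label

    # Made-up training data, purely for illustration.
    data = [(("Sunny", "Warm", "High", "Strong"), True),
            (("Rainy", "Cold", "High", "Strong"), False),
            (("Sunny", "Warm", "High", "Weak"), True)]

    surviving = concept_space
    for example, label in data:
        surviving = [h for h in surviving if consistent(h, example, label)]
        print(len(surviving), "hypotheses remain")   # 16, then 12, then 6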
Now you may notice that the format above cannot represent all possible
concepts. E.g., “Play tennis when the sky is sunny or the air is warm.” That fits
the data, but in the concept representation defined above, there’s no quadruplet
of values that describes the rule.
Clearly our machine learner is not very general. Why not allow it to rep-
resent all possible concepts, so that it can learn with the greatest possible
flexibility?
Days are composed of these four variables, one variable with 3 values and
three variables with 2 values. So there are 3 × 2 × 2 × 2 = 24 possible Days
that we could encounter.
The format given for representing Concepts allows us to require any of these
values for a variable, or leave the variable open. So there are 4×3×3×3 = 108
concepts in that representation. For the most-general/most-specific algorithm
to work, we need to start with the most specific hypothesis “no example is ever
positively classified.” If we add that, it makes a total of 109 concepts.
Is it suspicious that there are more possible concepts than possible Days?
Surely not: After all, a concept can be viewed as a collection of Days. A concept
can be viewed as the set of days that it classifies positively, or isomorphically,
the set of days that it classifies negatively.
So the space of all possible concepts that classify Days is the set of all possible
sets of Days, whose size is 2^24 = 16,777,216.
This complete space includes all the concepts we have discussed so far. But
it also includes concepts like “Positively classify only the examples {Sunny,
Warm, High, Strong} and {Sunny, Warm, High, Weak} and reject ev-
erything else” or “Negatively classify only the example {Rainy, Cold, High,
Strong} and accept everything else.” It includes concepts with no compact
representation, just a flat list of what is and isn’t allowed.
That’s the problem with trying to build a “fully general” inductive learner:
it can’t learn concepts until it has seen every possible example in the
instance space.
If we add on more attributes to Days—like the Water temperature, or
the Forecast for tomorrow—then the number of possible days will grow
exponentially in the number of attributes. But this isn’t a problem with our
restricted concept space, because you can narrow down a large space using a
logarithmic number of examples.
Let’s say we add the Water: {Warm, Cold} attribute to days, which will
make for 48 possible Days and 325 possible concepts. Let’s say that each Day
we see is, usually, classified positive by around half of the currently-plausible
concepts, and classified negative by the other half. Then when we learn the
actual classification of the example, it will cut the space of compatible concepts
in half. So it might only take 9 examples (2^9 = 512) to narrow 325 possible
concepts down to one.
Even if Days had forty binary attributes, it should still only take a manage-
able amount of data to narrow down the possible concepts to one. Sixty-four
examples, if each example is classified positive by half the remaining concepts.
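The counting behind these claims is easy to reproduce; a quick sketch:

    from math import log2

    def counts(value_counts):
        days = 1
        restricted = 1
        for v in value_counts:
            days *= v
            restricted *= v + 1          # each attribute: a required value, or "?"
        restricted += 1                  # plus "never classify positively"
        return days, restricted, 2 ** days

    for attrs in ([3, 2, 2, 2],          # Sky, AirTemp, Humidity, Wind
                  [3, 2, 2, 2, 2]):      # ... plus Water
        days, restricted, unrestricted = counts(attrs)
        print(days, restricted, unrestricted, round(log2(restricted), 1))
    # 24  109        16777216  6.8
    # 48  325  281474976710656  8.3

    # Forty binary attributes: over a trillion possible examples, yet the
    # restricted (conjunctive) space still only takes about 64 halvings.
    print(2 ** 40, log2(3 ** 40 + 1))    # 1099511627776 examples; ~63.4 bits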
Assuming, of course, that the actual rule is one we can represent at all!
If you want to think of all the possibilities, well, good luck with that. The
space of all possible concepts grows superexponentially in the number of at-
tributes.
By the time you’re talking about data with forty binary attributes, the num-
ber of possible examples is past a trillion—but the number of possible concepts
is past two-to-the-trillionth-power. To narrow down that superexponential
concept space, you’d have to see over a trillion examples before you could say
what was In, and what was Out. You’d have to see every possible example, in
fact.
That’s with forty binary attributes, mind you. Forty bits, or 5 bytes, to be
classified simply “Yes” or “No.” Forty bits implies 2^40 possible examples,
and 2^(2^40) possible concepts that classify those examples as positive or
negative.
So, here in the real world, where objects take more than 5 bytes to describe
and a trillion examples are not available and there is noise in the training data,
we only even think about highly regular concepts. A human mind—or the
whole observable universe—is not nearly large enough to consider all the other
hypotheses.
From this perspective, learning doesn’t just rely on inductive bias, it is
nearly all inductive bias—when you compare the number of concepts ruled
out a priori, to those ruled out by mere evidence.
But what has this (you inquire) to do with the proper use of words?
It’s the whole reason that words have intensions as well as extensions.
In the last essay, I concluded:
    And the way to carve reality at its joints is to draw your boundaries
    around concentrations of unusually high probability density in Thingspace.
Otherwise you would just gerrymander Thingspace. You would create really
odd noncontiguous boundaries that collected the observed examples, examples
that couldn’t be described in any shorter message than your observations
themselves, and say: “This is what I’ve seen before, and what I expect to see
more of in the future.”
In the real world, nothing above the level of molecules repeats itself exactly.
Socrates is shaped a lot like all those other humans who were vulnerable to
hemlock, but he isn’t shaped exactly like them. So your guess that Socrates is
a “human” relies on drawing simple boundaries around the human cluster in
Thingspace. Rather than, “Things shaped exactly like [5-megabyte shape speci-
fication 1] and with [lots of other characteristics], or exactly like [5-megabyte
shape specification 2] and [lots of other characteristics], . . . , are human.”
If you don’t draw simple boundaries around your experiences, you can’t do
inference with them. So you try to describe “art” with intensional definitions
like “that which is intended to inspire any complex emotion for the sake of
inspiring it,” rather than just pointing at a long list of things that are, or aren’t
art.
In fact, the above statement about “how to carve reality at its joints” is a bit
chicken-and-eggish: You can’t assess the density of actual observations until
you’ve already done at least a little carving. And the probability distribution
comes from drawing the boundaries, not the other way around—if you already
had the probability distribution, you’d have everything necessary for inference,
so why would you bother drawing boundaries?
And this suggests another—yes, yet another—reason to be suspicious of
the claim that “you can define a word any way you like.” When you consider
the superexponential size of Conceptspace, it becomes clear that singling out
one particular concept for consideration is an act of no small audacity—not
just for us, but for any mind of bounded computing power.
Presenting us with the word “wiggin,” defined as “a black-haired green-eyed
person,” without some reason for raising this particular concept to the level
of our deliberate attention, is rather like a detective saying: “Well, I haven’t
the slightest shred of support one way or the other for who could’ve murdered
those orphans . . . not even an intuition, mind you . . . but have we considered
John Q. Wiffleheim of 1234 Norkle Rd as a suspect?”
*
177
Conditional Independence, and Naive Bayes
Suppose now that there is a third variable Z with two states, “even” and
“odd,” perfectly correlated with the shared parity of X and Y in the mutual-
information example above. If you tried to compute the joint entropy
H(X, Y, Z) by taking H(X) + H(Y) + H(Z) and subtracting every pairwise
mutual information—I(X; Z), I(Z; Y), and I(X; Y)—you would get too small
an answer.
Why? Because if you have the mutual information between X and Z, and
the mutual information between Z and Y, that may include some of the same
mutual information that we’ll calculate exists between X and Y. In this case,
for example, knowing that X is even tells us that Z is even, and knowing that
Z is even tells us that Y is even, but this is the same information that X would
tell us about Y. We double-counted some of our knowledge, and so came up
with too little entropy.
The correct formula is (I believe):
H(X, Y, Z) = H(X) + H(Y) + H(Z) − I(X; Z) − I(Z; Y) − I(X; Y|Z) .
Here the last term, I(X; Y |Z), means, “the information that X tells us about
Y, given that we already know Z.” In this case, X doesn’t tell us anything
about Y, given that we already know Z, so the term comes out as zero—and
the equation gives the correct answer. There, isn’t that nice?
“No,” you correctly reply, “for you have not told me how to calculate
I(X; Y |Z), only given me a verbal argument that it ought to be zero.”
We calculate I(X; Y |Z) just the way you would expect. We know
I(X; Y) = H(X) + H(Y) − H(X, Y), so
I(X; Y|Z) = H(X|Z) + H(Y|Z) − H(X, Y|Z) .
And now, I suppose, you want to know how to calculate the conditional en-
tropy? Well, the original formula for the entropy is
H(S) = Σ_i −P(Si) × log2(P(Si)) .
So if we’re going to learn a new fact Z, but we don’t know which Z yet, then,
on average, we expect to be around this uncertain of S afterward:
H(S|Z) = Σ_j P(Zj) ( Σ_i −P(Si|Zj) log2(P(Si|Zj)) ) .
And that’s how one calculates conditional entropies; from which, in turn, we
can get the conditional mutual information.
There are all sorts of ancillary theorems here, like
H(X|Y) = H(X, Y) − H(Y)
and
if I(X; Z) = 0 and I(Y; X|Z) = 0 then I(X; Y) = 0 .
Recall that mutual information is positive exactly when Bayesian evidence
exists:
I(X; Y) > 0 ⇒
P(x, y) ≠ P(x)P(y)
P(x, y)/P(y) ≠ P(x)
P(x|y) ≠ P(x) .
And conditioning everything on Z gives the same duality for conditional
mutual information:
I(X; Y|Z) > 0 ⇒
P(x, y|z) ≠ P(x|z)P(y|z)
P(x, y|z)/P(y|z) ≠ P(x|z)
P(x|y, z) ≠ P(x|z) .
Which last line reads “Even knowing Z, learning Y still changes our beliefs
about X.”
Conversely, as in our original case of Z being “even” or “odd,” Z screens
off X from Y—that is, if we know that Z is “even,” learning that Y is in state
Y4 tells us nothing more about whether X is X2 , X4 , X6 , or X8 . Or if we know
that Z is “odd,” then learning that X is X5 tells us nothing more about whether
Y is Y1 or Y3 . Learning Z has rendered X and Y conditionally independent.
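Here is a small sketch of that screening-off, using the identity H(A|Z) = H(A, Z) − H(Z) to get the conditional entropies: the same parity-matched joint distribution gives I(X; Y) = 1 bit but I(X; Y|Z) = 0.

    from math import log2

    def entropy(probs):
        return -sum(p * log2(p) for p in probs if p > 0)

    # X in 1..8, Y in 1..4, parities forced equal; Z is that shared parity.
    joint = {(x, y, x % 2): 1/16
             for x in range(1, 9) for y in range(1, 5) if x % 2 == y % 2}

    def marginal(keep):
        dist = {}
        for state, p in joint.items():
            key = tuple(state[i] for i in keep)
            dist[key] = dist.get(key, 0) + p
        return dist

    def H(keep):
        return entropy(marginal(keep).values())

    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    print(H([0]) + H([1]) - H([0, 1]))                  # 1.0 bit

    # I(X;Y|Z) = H(X|Z) + H(Y|Z) - H(X,Y|Z), with H(A|Z) = H(A,Z) - H(Z)
    H_Z = H([2])
    I_cond = (H([0, 2]) - H_Z) + (H([1, 2]) - H_Z) - (H([0, 1, 2]) - H_Z)
    print(I_cond)                                       # 0.0: Z screens off X from Y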
Conditional independence is a hugely important concept in probability
theory—to cite just one example, without conditional independence, the uni-
verse would have no structure.
Here, though, I only intend to talk about one particular kind of conditional
independence—the case of a central variable that screens off other variables
surrounding it, like a central body with tentacles.
Let there be five variables U, V, W, X, and Y; and moreover, suppose that
for every pair of these variables, one variable is evidence about the other. If you
select U and W, for example, then learning U = U1 will tell you something
you didn’t know before about the probability that W = W1 .
An unmanageable inferential mess? Evidence gone wild? Not necessarily.
Maybe U is “Speaks a language,” V is “Two arms and ten digits,” W is
“Wears clothes,” X is “Poisonable by hemlock,” and Y is “Red blood.” Now if
you encounter a thing-in-the-world, that might be an apple and might be a
rock, and you learn that this thing speaks Chinese, you are liable to assess a
much higher probability that it wears clothes; and if you learn that the thing is
not poisonable by hemlock, you will assess a somewhat lower probability that
it has red blood.
Now some of these rules are stronger than others. There is the case of Fred,
who is missing a finger due to a volcano accident, and the case of Barney the
Baby who doesn’t speak yet, and the case of Irving the IRCBot who emits
sentences but has no blood. So if we learn that a certain thing is not wearing
clothes, that doesn’t screen off everything that its speech capability can tell us
about its blood color. If the thing doesn’t wear clothes but does talk, maybe it’s
Nude Nellie.
This makes the case more interesting than, say, five integer variables that are
all odd or all even, but otherwise uncorrelated. In that case, knowing any one
of the variables would screen off everything that knowing a second variable
could tell us about a third variable.
But here, we have dependencies that don’t go away as soon as we learn
just one variable, as the case of Nude Nellie shows. So is it an unmanageable
inferential inconvenience?
Fear not! For there may be some sixth variable Z, which, if we knew it,
really would screen off every pair of variables from each other. There may
be some variable Z—even if we have to construct Z rather than observing it
directly—such that:
P (U |V, W, X, Y, Z) = P (U |Z)
P (V |U, W, X, Y, Z) = P (V |Z)
P (W |U, V, X, Y, Z) = P (W |Z)
and so on for X and Y.
[Figure: a central Category node (+BLEGG / −RUBE) linked to four observables—
Shape (+egg / −cube), Luminance (+glow / −dark), Texture (+furred / −smooth),
and Interior (+vanadium / −palladium).]
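A Naive Bayes classifier is just this picture turned into arithmetic: assume the observables are conditionally independent given the Category, multiply, and normalize. A minimal sketch, with probabilities invented purely for illustration:

    # P(category) and P(observable value | category); the numbers are made up.
    P_CATEGORY = {"BLEGG": 0.5, "RUBE": 0.5}
    P_GIVEN = {
        "BLEGG": {"egg": 0.98, "glow": 0.95, "furred": 0.97, "vanadium": 0.96},
        "RUBE":  {"egg": 0.02, "glow": 0.05, "furred": 0.03, "vanadium": 0.04},
    }

    def posterior(observed):
        # P(category | observations), assuming the observables are
        # conditionally independent given the category.
        scores = {}
        for category, prior in P_CATEGORY.items():
            likelihood = prior
            for feature, present in observed.items():
                p = P_GIVEN[category][feature]
                likelihood *= p if present else (1 - p)
            scores[category] = likelihood
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}

    # An egg-shaped furred object that doesn't glow and hasn't been opened yet:
    print(posterior({"egg": True, "furred": True, "glow": False}))   # mostly BLEGG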
Just because someone is presenting you with an algorithm that they call a
“neural network” with buzzwords like “scruffy” and “emergent” plastered all
over it, disclaiming proudly that they have no idea how the learned network
works—well, don’t assume that their little AI algorithm really is Beyond the
Realms of Logic. For this paradigm of adhockery, if it works, will turn out to
have Bayesian structure; it may even be exactly equivalent to an algorithm of
the sort called “Bayesian.”
Even if it doesn’t look Bayesian, on the surface.
And then you just know that the Bayesians are going to start explaining
exactly how the algorithm works, what underlying assumptions it reflects,
which environmental regularities it exploits, where it works and where it fails,
and even attaching understandable meanings to the learned network weights.
Disappointing, isn’t it?
*
178
Words as Mental Paintbrush Handles
Suppose I tell you: “It’s the strangest thing: The lamps in this hotel have
triangular lightbulbs.”
You may or may not have visualized it—if you haven’t done it yet, do so
now—what, in your mind’s eye, does a “triangular lightbulb” look like?
In your mind’s eye, did the glass have sharp edges, or smooth?
When the phrase “triangular lightbulb” first crossed my mind—no, the
hotel doesn’t have them—then as best as my introspection could determine, I
first saw a pyramidal lightbulb with sharp edges, then (almost immediately)
the edges were smoothed, and then my mind generated a loop of fluorescent
bulb in the shape of a smooth triangle as an alternative.
As far as I can tell, no deliberative/verbal thoughts were involved—just
wordless reflex flinch away from the imaginary mental vision of sharp glass,
which design problem was solved before I could even think in words.
Believe it or not, for some decades, there was a serious debate about whether
people really had mental images in their mind—an actual picture of a chair
somewhere—or if people just naively thought they had mental images (having
been misled by “introspection,” a very bad forbidden activity), while actually
just having a little “chair” label, like a lisp token, active in their brain.
I am trying hard not to say anything like “How spectacularly silly,” because
there is always the hindsight effect to consider, but: how spectacularly silly.
This academic paradigm, I think, was mostly a deranged legacy of behavior-
ism, which denied the existence of thoughts in humans, and sought to explain
all human phenomena as “reflex,” including speech. Behaviorism probably de-
serves its own essay at some point, as it was a perversion of rationalism; but
this is not that essay.
“You call it ‘silly,’ ” you inquire, “but how do you know that your brain
represents visual images? Is it merely that you can close your eyes and see
them?”
This question used to be harder to answer, back in the day of the controversy.
If you wanted to prove the existence of mental imagery “scientifically,” rather
than just by introspection, you had to infer the existence of mental imagery
from experiments like this: Show subjects two objects and ask them if one can
be rotated into correspondence with the other. The response time is linearly
proportional to the angle of rotation required. This is easy to explain if you are
actually visualizing the image and continuously rotating it at a constant speed,
but hard to explain if you are just checking propositional features of the image.
Today we can actually neuroimage the little pictures in the visual cortex. So,
yes, your brain really does represent a detailed image of what it sees or imagines.
See Stephen Kosslyn’s Image and Brain: The Resolution of the Imagery Debate.1
Part of the reason people get in trouble with words, is that they do not
realize how much complexity lurks behind words.
Can you visualize a “green dog”? Can you visualize a “cheese apple”?
“Apple” isn’t just a sequence of two syllables or five letters. That’s a shadow.
That’s the tip of the tiger’s tail.
Words, or rather the concepts behind them, are paintbrushes—you can
use them to draw images in your own mind. Literally draw, if you employ
concepts to make a picture in your visual cortex. And by the use of shared
labels, you can reach into someone else’s mind, and grasp their paintbrushes
to draw pictures in their minds—sketch a little green dog in their visual cortex.
But don’t think that, because you send syllables through the air, or letters
through the Internet, it is the syllables or the letters that draw pictures in the
visual cortex. That takes some complex instructions that wouldn’t fit in the
sequence of letters. “Apple” is 5 bytes, and drawing a picture of an apple from
scratch would take more data than that.
“Apple” is merely the tag attached to the true and wordless apple concept,
which can paint a picture in your visual cortex, or collide with “cheese,” or
recognize an apple when you see one, or taste its archetype in apple pie, maybe
even send out the motor behavior for eating an apple . . .
And it’s not as simple as just calling up a picture from memory. Or how
would you be able to visualize combinations like a “triangular lightbulb”—
imposing triangleness on lightbulbs, keeping the essence of both, even if you’ve
never seen such a thing in your life?
Don’t make the mistake the behaviorists made. There’s far more to speech
than sound in air. The labels are just pointers—“look in memory area 1387540.”
Sooner or later, when you’re handed a pointer, it comes time to dereference it,
and actually look in memory area 1387540.
What does a word point to?
1. Stephen M. Kosslyn, Image and Brain: The Resolution of the Imagery Debate (Cambridge, MA: MIT
Press, 1994).
179
Variable Question Fallacies
While writing the dialogue of Albert and Barry in their dispute over whether
a falling tree in a deserted forest makes a sound, I sometimes found myself
losing empathy with my characters. I would start to lose the gut feel of why
anyone would ever argue like that, even though I’d seen it happen many times.
On these occasions, I would repeat to myself, “Either the falling tree makes
a sound, or it does not!” to restore my borrowed sense of indignation.
(P or ¬P ) is not always a reliable heuristic, if you substitute arbitrary
English sentences for P. “This sentence is false” cannot be consistently viewed
as true or false. And then there’s the old classic, “Have you stopped beating
your wife?”
Now if you are a mathematician, and one who believes in classical (rather
than intuitionistic) logic, there are ways to continue insisting that (P or ¬P )
is a theorem: for example, saying that “This sentence is false” is not a sentence.
But such resolutions are subtle, which suffices to demonstrate a need for
subtlety. You cannot just bull ahead on every occasion with “Either it does or
it doesn’t!”
So does the falling tree make a sound, or not, or . . . ?
Surely, 2 + 2 = X or it does not? Well, maybe, if it’s really the same X,
the same 2, and the same + and = . If X evaluates to 5 on some occasions
and 4 on another, your indignation may be misplaced.
To even begin claiming that (P or ¬P ) ought to be a necessary truth,
the symbol P must stand for exactly the same thing in both halves of the
dilemma. “Either the fall makes a sound, or not!”—but if Albert::sound is not
the same as Barry::sound, there is nothing paradoxical about the tree making
an Albert::sound but not a Barry::sound.
(The :: idiom is something I picked up in my C++ days for avoiding names-
pace collisions. If you’ve got two different packages that define a class Sound,
you can write Package1::Sound to specify which Sound you mean. The idiom
is not widely known, I think; which is a pity, because I often wish I could use it
in writing.)
The variability may be subtle: Albert and Barry may carefully verify that it
is the same tree, in the same forest, and the same occasion of falling, just to
ensure that they really do have a substantive disagreement about exactly the
same event. And then forget to check that they are matching this event against
exactly the same concept.
Think about the grocery store that you visit most often: Is it on the left
side of the street, or the right? But of course there is no “the left side” of the
street, only your left side, as you travel along it from some particular direction.
Many of the words we use are really functions of implicit variables supplied by
context.
It’s actually one heck of a pain, requiring one heck of a lot of work, to handle
this kind of problem in an Artificial Intelligence program intended to parse
language—the phenomenon going by the name of “speaker deixis.”
“Martin told Bob the building was on his left.” But “left” is a function-word
that evaluates with a speaker-dependent variable invisibly grabbed from the
surrounding context. Whose “left” is meant, Bob’s or Martin’s?
The variables in a variable question fallacy often aren’t neatly labeled—it’s
not as simple as “Say, do you think Z + 2 equals 6?”
If a namespace collision introduces two different concepts that look like
“the same concept” because they have the same name—or a map compression
introduces two different events that look like the same event because they
don’t have separate mental files—or the same function evaluates in different
contexts—then reality itself becomes protean, changeable. At least that’s what
the algorithm feels like from inside. Your mind’s eye sees the map, not the
territory directly.
If you have a question with a hidden variable, that evaluates to different
expressions in different contexts, it feels like reality itself is unstable—what
your mind’s eye sees, shifts around depending on where it looks.
This often confuses undergraduates (and postmodernist professors) who
discover a sentence with more than one interpretation; they think they have
discovered an unstable portion of reality.
“Oh my gosh! ‘The Sun goes around the Earth’ is true for Hunga Hunter-
gatherer, but for Amara Astronomer, ‘The Sun goes around the Earth’ is false!
There is no fixed truth!” The deconstruction of this sophomoric nitwittery is
left as an exercise to the reader.
And yet, even I initially found myself writing “If X is 5 on some occasions
and 4 on another, the sentence ‘2 + 2 = X’ may have no fixed truth-value.”
There is not one sentence with a variable truth-value. “2 + 2 = X” has
no truth-value. It is not a proposition, not yet, not as mathematicians define
proposition-ness, any more than “2 + 2 =” is a proposition, or “Fred jumped
over the” is a grammatical sentence.
But this fallacy tends to sneak in, even when you allegedly know better,
because, well, that’s how the algorithm feels from inside.
*
180
37 Ways That Words Can Be Wrong
Some reader is bound to declare that a better title for this essay would be “37
Ways That You Can Use Words Unwisely,” or “37 Ways That Suboptimal Use
Of Categories Can Have Negative Side Effects On Your Cognition.”
But one of the primary lessons of this gigantic list is that saying “There’s
no way my choice of X can be ‘wrong’ ” is nearly always an error in practice,
whatever the theory. You can always be wrong. Even when it’s theoretically
impossible to be wrong, you can still be wrong. There is never a Get Out of Jail
Free card for anything you do. That’s life.
Besides, I can define the word “wrong” to mean anything I like—it’s not
like a word can be wrong.
Personally, I think it quite justified to use the word “wrong” when:
6. You try to define a word using words, in turn defined with ever-more-
abstract words, without being able to point to an example. “What is red?”
“Red is a color.” “What’s a color?” “It’s a property of a thing.” “What’s a
thing? What’s a property?” It never occurs to you to point to a stop sign
and an apple. (Extensions and Intensions)
10. A verbal definition works well enough in practice to point out the intended
cluster of similar things, but you nitpick exceptions. Not every human
has ten fingers, or wears clothes, or uses language; but if you look for
an empirical cluster of things which share these characteristics, you’ll
get enough information that the occasional nine-fingered human won’t
fool you. (The Cluster Structure of Thingspace)
11. You ask whether something “is” or “is not” a category member but can’t
name the question you really want answered. What is a “man”? Is Barney
the Baby Boy a “man”? The “correct” answer may depend considerably
on whether the query you really want answered is “Would hemlock be a
good thing to feed Barney?” or “Will Barney make a good husband?”
(Disguised Queries)
12. You treat intuitively perceived hierarchical categories like the only correct
way to parse the world, without realizing that other forms of statistical
inference are possible even though your brain doesn’t use them. It’s much
easier for a human to notice whether an object is a “blegg” or “rube”;
than for a human to notice that red objects never glow in the dark, but
red furred objects have all the other characteristics of bleggs. Other
statistical algorithms work differently. (Neural Categories)
13. You talk about categories as if they are manna fallen from the Platonic
Realm, rather than inferences implemented in a real brain. The ancient
philosophers said “Socrates is a man,” not, “My brain perceptually
classifies Socrates as a match against the ‘human’ concept.” (How An
Algorithm Feels From Inside)
14. You argue about a category membership even after screening off all ques-
tions that could possibly depend on a category-based inference. After you
observe that an object is blue, egg-shaped, furred, flexible, opaque, lu-
minescent, and palladium-containing, what’s left to ask by arguing, “Is
it a blegg?” But if your brain’s categorizing neural network contains a
(metaphorical) central unit corresponding to the inference of blegg-ness,
it may still feel like there’s a leftover question. (How An Algorithm Feels
From Inside)
15. You allow an argument to slide into being about definitions, even though
it isn’t what you originally wanted to argue about. If, before a dispute
started about whether a tree falling in a deserted forest makes a “sound,”
you asked the two soon-to-be arguers whether they thought a “sound”
should be defined as “acoustic vibrations” or “auditory experiences,”
they’d probably tell you to flip a coin. Only after the argument starts
does the definition of a word become politically charged. (Disputing
Definitions)
16. You think a word has a meaning, as a property of the word itself; rather
than there being a label that your brain associates to a particular concept.
When someone shouts “Yikes! A tiger!,” evolution would not favor
an organism that thinks, “Hm . . . I have just heard the syllables ‘Tie’
and ‘Grr’ which my fellow tribemembers associate with their internal
analogues of my own tiger concept and which aiiieeee crunch crunch
gulp.” So the brain takes a shortcut, and it seems that the meaning of
tigerness is a property of the label itself. People argue about the correct
meaning of a label like “sound.” (Feel the Meaning)
17. You argue over the meanings of a word, even after all sides understand
perfectly well what the other sides are trying to say. The human ability
to associate labels to concepts is a tool for communication. When peo-
ple want to communicate, we’re hard to stop; if we have no common
language, we’ll draw pictures in sand. When you each understand what
is in the other’s mind, you are done. (The Argument From Common
Usage)
18. You pull out a dictionary in the middle of an empirical or moral argument.
Dictionary editors are historians of usage, not legislators of language. If
the common definition contains a problem—if “Mars” is defined as the
God of War, or a “dolphin” is defined as a kind of fish, or “Negroes” are
defined as a separate category from humans, the dictionary will reflect
the standard mistake. (The Argument From Common Usage)
19. You pull out a dictionary in the middle of any argument ever. Seriously,
what the heck makes you think that dictionary editors are an authority
on whether “atheism” is a “religion” or whatever? If you have any
substantive issue whatsoever at stake, do you really think dictionary
editors have access to ultimate wisdom that settles the argument? (The
Argument From Common Usage)
20. You defy common usage without a reason, making it gratuitously hard for
others to understand you. Fast stand up plutonium, with bagels without
handle. (The Argument From Common Usage)
21. You use complex renamings to create the illusion of inference. Is a “hu-
man” defined as a “mortal featherless biped”? Then write: “All [mortal
featherless bipeds] are mortal; Socrates is a [mortal featherless biped];
therefore, Socrates is mortal.” Looks less impressive that way, doesn’t it?
(Empty Labels)
22. You get into arguments that you could avoid if you just didn’t use the
word. If Albert and Barry aren’t allowed to use the word “sound,” then
Albert will have to say “A tree falling in a deserted forest generates
acoustic vibrations,” and Barry will say “A tree falling in a deserted forest
generates no auditory experiences.” When a word poses a problem, the
simplest solution is to eliminate the word and its synonyms. (Taboo
Your Words)
23. The existence of a neat little word prevents you from seeing the details of
the thing you’re trying to think about. What actually goes on in schools
once you stop calling it “education”? What’s a degree, once you stop
calling it a “degree”? If a coin lands “heads,” what’s its radial orientation?
What is “truth,” if you can’t say “accurate” or “correct” or “represent” or
“reflect” or “semantic” or “believe” or “knowledge” or “map” or “real”
or any other simple term? (Replace the Symbol with the Substance)
24. You have only one word, but there are two or more different things-in-
reality, so that all the facts about them get dumped into a single undifferen-
tiated mental bucket. It’s part of a detective’s ordinary work to observe
that Carol wore red last night, or that she has black hair; and it’s part
of a detective’s ordinary work to wonder if maybe Carol dyes her hair.
But it takes a subtler detective to wonder if there are two Carols, so that
the Carol who wore red is not the same as the Carol who had black hair.
(Fallacies of Compression)
25. You see patterns where none exist, harvesting other characteristics from
your definitions even when there is no similarity along that dimension. In
Japan, it is thought that people of blood type A are earnest and creative,
blood type Bs are wild and cheerful, blood type Os are agreeable and
sociable, and blood type ABs are cool and controlled. (Categorizing
Has Consequences)
26. You try to sneak in the connotations of a word, by arguing from a defini-
tion that doesn’t include the connotations. A “wiggin” is defined in the
dictionary as a person with green eyes and black hair. The word “wig-
gin” also carries the connotation of someone who commits crimes and
launches cute baby squirrels, but that part isn’t in the dictionary. So you
point to someone and say: “Green eyes? Black hair? See, told you he’s
a wiggin! Watch, next he’s going to steal the silverware.” (Sneaking in
Connotations)
27. You claim “X, by definition, is a Y !” On such occasions you’re almost
certainly trying to sneak in a connotation of Y that wasn’t in your given
definition. You define “human” as a “featherless biped,” and point to
Socrates and say, “No feathers—two legs—he must be human!” But what
you really care about is something else, like mortality. If what was in
dispute was Socrates’s number of legs, the other fellow would just reply,
“Whaddaya mean, Socrates’s got two legs? That’s what we’re arguing
about in the first place!” (Arguing “By Definition”)
28. You claim “Ps, by definition, are Qs!” If you see Socrates out in the field
with some biologists, gathering herbs that might confer resistance to
hemlock, there’s no point in arguing “Men, by definition, are mortal!”
The main time you feel the need to tighten the vise by insisting that
something is true “by definition” is when there’s other information that
calls the default inference into doubt. (Arguing “By Definition”)
30. Your definition draws a boundary around things that don’t really belong
together. You can claim, if you like, that you are defining the word “fish”
to refer to salmon, guppies, sharks, dolphins, and trout, but not jellyfish
or algae. You can claim, if you like, that this is merely a list, and there is
no way a list can be “wrong.” Or you can stop playing games and admit
that you made a mistake and that dolphins don’t belong on the fish list.
(Where to Draw the Boundary?)
31. You use a short word for something that you won’t need to describe often,
or a long word for something you’ll need to describe often. This can result
in inefficient thinking, or even misapplications of Occam’s Razor, if your
mind thinks that short sentences sound “simpler.” Which sounds more
plausible, “God did a miracle” or “A supernatural universe-creating
entity temporarily suspended the laws of physics”? (Entropy, and Short
Codes)
32. You draw your boundary around a volume of space where there is no
greater-than-usual density, meaning that the associated word does not
correspond to any performable Bayesian inferences. Since green-eyed
people are not more likely to have black hair, or vice versa, and they
don’t share any other characteristics in common, why have a word for
“wiggin”? (Mutual Information, and Density in Thingspace)
33. You draw an unsimple boundary without any reason to do so. The act
of defining a word to refer to all humans, except black people, seems
kind of suspicious. If you don’t present reasons to draw that particular
boundary, trying to create an “arbitrary” word in that location is like
a detective saying: “Well, I haven’t the slightest shred of support one
way or the other for who could’ve murdered those orphans . . . but have
we considered John Q. Wiffleheim as a suspect?” (Superexponential
Conceptspace, and Simple Words)
34. You use categorization to make inferences about properties that don’t have
the appropriate empirical structure, namely, conditional independence
given knowledge of the class, to be well-approximated by Naive Bayes. No
way am I trying to summarize this one. Just read the essay. (Conditional
Independence, and Naive Bayes)
35. You think that words are like tiny little lisp symbols in your mind, rather
than words being labels that act as handles to direct complex mental
paintbrushes that can paint detailed pictures in your sensory workspace.
Visualize a “triangular lightbulb.” What did you see? (Words as Mental
Paintbrush Handles)
36. You use a word that has different meanings in different places as though
it meant the same thing on each occasion, possibly creating the illusion
of something protean and shifting. “Martin told Bob the building was
on his left.” But “left” is a function-word that evaluates with a speaker-
dependent variable grabbed from the surrounding context. Whose “left”
is meant, Bob’s or Martin’s? (Variable Question Fallacies)
37. You think that definitions can’t be “wrong,” or that “I can define a word any
way I like!” This kind of attitude teaches you to indignantly defend your
past actions, instead of paying attention to their consequences, or fessing
up to your mistakes. (37 Ways That Suboptimal Use Of Categories Can
Have Negative Side Effects On Your Cognition)
Everything you do in the mind has an effect, and your brain races ahead
unconsciously without your supervision.
Saying “Words are arbitrary; I can define a word any way I like” makes
around as much sense as driving a car over thin ice with the accelerator floored
and saying, “Looking at this steering wheel, I can’t see why one radial angle is
special—so I can turn the steering wheel any way I like.”
If you’re trying to go anywhere, or even just trying to survive, you had
better start paying attention to the three or six dozen optimality criteria that
control how you use words, definitions, categories, classes, boundaries, labels,
and concepts.
*
Interlude
An Intuitive Explanation of Bayes’s
Theorem
Your friends and colleagues are talking about something called “Bayes’s Theo-
rem” or “Bayes’s Rule,” or something called Bayesian reasoning. They sound
really enthusiastic about it, too, so you google and find a web page about Bayes’s
Theorem and . . .
It’s this equation. That’s all. Just one equation. The page you found gives
a definition of it, but it doesn’t say what it is, or why it’s useful, or why your
friends would be interested in it. It looks like this random statistics thing.
Why does a mathematical concept generate this strange enthusiasm in its
students? What is the so-called Bayesian Revolution now sweeping through
the sciences, which claims to subsume even the experimental method itself as
a special case? What is the secret that the adherents of Bayes know? What is
the light that they have seen?
Soon you will know. Soon you will be one of us.
While there are a few existing online explanations of Bayes’s Theorem,
my experience with trying to introduce people to Bayesian reasoning is that
the existing online explanations are too abstract. Bayesian reasoning is very
counterintuitive. People do not employ Bayesian reasoning intuitively, find
it very difficult to learn Bayesian reasoning when tutored, and rapidly forget
Bayesian methods once the tutoring is over. This holds equally true for novice
students and highly trained professionals in a field. Bayesian reasoning is
apparently one of those things which, like quantum mechanics or the Wason
Selection Test, is inherently difficult for humans to grasp with our built-in
mental faculties.
Or so they claim. Here you will find an attempt to offer an intuitive ex-
planation of Bayesian reasoning—an excruciatingly gentle introduction that
invokes all the human ways of grasping numbers, from natural frequencies to
spatial visualization. The intent is to convey, not abstract rules for manipulat-
ing numbers, but what the numbers mean, and why the rules are what they are
(and cannot possibly be anything else). When you are finished reading this,
you will see Bayesian problems in your dreams.
And let’s begin with a story problem:
    1% of women at age forty who participate in routine screening have
    breast cancer. 80% of women with breast cancer will get positive
    mammographies. 9.6% of women without breast cancer will also get
    positive mammographies. A woman in this age group had a positive
    mammography in a routine screening. What is the probability that she
    actually has breast cancer?
What do you think the answer is? If you haven’t encountered this kind of
problem before, please take a moment to come up with your own answer
before continuing.
Next, suppose I told you that most doctors get the same wrong answer on this
problem—usually, only around 15% of doctors get it right. (“Really? 15%?
Is that a real number, or an urban legend based on an Internet poll?” It’s a
real number. See Casscells, Schoenberger, and Graboys 1978;1 Eddy 1982;2
Gigerenzer and Hoffrage 1995;3 and many other studies. It’s a surprising result
which is easy to replicate, so it’s been extensively replicated.)
On the story problem above, most doctors estimate the probability to be
between 70% and 80%, which is wildly incorrect.
Here’s an alternate version of the problem on which doctors fare somewhat
better:
And finally, here’s the problem on which doctors fare best of all, with 46%—
nearly half—arriving at the correct answer:
The correct answer is 7.8%, obtained as follows: Out of 10,000 women, 100 have
breast cancer; 80 of those 100 have positive mammographies. From the same
10,000 women, 9,900 will not have breast cancer and of those 9,900 women, 950
will also get positive mammographies. This makes the total number of women
with positive mammographies 950 + 80 or 1,030. Of those 1,030 women with
positive mammographies, 80 will have cancer. Expressed as a proportion, this
is 80/1,030 or 0.07767 or 7.8%.
To put it another way, before the mammography screening, the 10,000
women can be divided into two groups:
    Group 1: 100 women with breast cancer.
    Group 2: 9,900 women without breast cancer.
Summing these two groups gives a total of 10,000 patients, confirming that
none have been lost in the math. After the mammography, the women can be
divided into four groups:
    Group A: 80 women with breast cancer and a positive mammography.
    Group B: 20 women with breast cancer and a negative mammography.
    Group C: 950 women without breast cancer and a positive mammography.
    Group D: 8,950 women without breast cancer and a negative mammography.
The sum of groups A and B, the groups with breast cancer, corresponds to
group 1; and the sum of groups C and D, the groups without breast cancer,
corresponds to group 2. If you administer a mammography to 10,000 pa-
tients, then out of the 1,030 with positive mammographies, eighty of those
positive-mammography patients will have cancer. This is the correct answer,
the answer a doctor should give a positive-mammography patient if she asks
about the chance she has breast cancer; if thirteen patients ask this question,
roughly one out of those thirteen will have cancer.
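The same computation, as a short sketch that takes the prior probability, the true-positive rate, and the false-positive rate:

    def posterior(prior, p_positive_given_cancer, p_positive_given_healthy):
        # P(cancer | positive) from the three pieces of information.
        positive_and_cancer = prior * p_positive_given_cancer
        positive_and_healthy = (1 - prior) * p_positive_given_healthy
        return positive_and_cancer / (positive_and_cancer + positive_and_healthy)

    print(posterior(0.01, 0.80, 0.096))   # 0.0776..., i.e. about 7.8%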
The most common mistake is to ignore the original fraction of women with
breast cancer, and the fraction of women without breast cancer who receive
false positives, and focus only on the fraction of women with breast cancer
who get positive results. For example, the vast majority of doctors in these
studies seem to have thought that if around 80% of women with breast cancer
have positive mammographies, then the probability of a women with a positive
mammography having breast cancer must be around 80%.
Figuring out the final answer always requires all three pieces of
information—the percentage of women with breast cancer, the percentage
of women without breast cancer who receive false positives, and the percentage
of women with breast cancer who receive (correct) positives.
The original proportion of patients with breast cancer is known as the prior
probability. The chance that a patient with breast cancer gets a positive mam-
mography, and the chance that a patient without breast cancer gets a positive
mammography, are known as the two conditional probabilities. Collectively,
this initial information is known as the priors. The final answer—the estimated
probability that a patient has breast cancer, given that we know she has a posi-
tive result on her mammography—is known as the revised probability or the
posterior probability. What we’ve just seen is that the posterior probability
depends in part on the prior probability.
To see that the final answer always depends on the original fraction of
women with breast cancer, consider an alternate universe in which only one
woman out of a million has breast cancer. Even if mammography in this world
detects breast cancer in 8 out of 10 cases, while returning a false positive on
a woman without breast cancer in only 1 out of 10 cases, there will still be a
hundred thousand false positives for every real case of cancer detected. The
original probability that a woman has cancer is so extremely low that, although
a positive result on the mammography does increase the estimated probability,
the probability isn’t increased to certainty or even “a noticeable chance”; the
probability goes from 1:1,000,000 to 1:100,000.
What this demonstrates is that the mammography result doesn’t replace
your old information about the patient’s chance of having cancer; the mam-
mography slides the estimated probability in the direction of the result. A
positive result slides the original probability upward; a negative result slides
the probability downward. For example, in the original problem where 1%
of the women have cancer, 80% of women with cancer get positive mammo-
graphies, and 9.6% of women without cancer get positive mammographies, a
positive result on the mammography slides the 1% chance upward to 7.8%.
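One conventional way to picture the sliding (a standard restatement, not the presentation used here) is in odds form: posterior odds equal prior odds times the likelihood ratio of the test.

    def slide(prior_odds, likelihood_ratio):
        # Posterior odds = prior odds * likelihood ratio.
        return prior_odds * likelihood_ratio

    # Original problem: prior odds 1:99, likelihood ratio 0.80 / 0.096.
    print(slide(1 / 99, 0.80 / 0.096))     # ~0.084 odds, i.e. the same 7.8%
    # Alternate universe: prior odds 1:999,999, likelihood ratio 0.8 / 0.1.
    print(slide(1 / 999_999, 0.8 / 0.1))   # ~8e-06 odds: nudged upward, still tiny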
Most people encountering problems of this type for the first time carry
out the mental operation of replacing the original 1% probability with the
80% probability that a woman with cancer gets a positive mammography. It
may seem like a good idea, but it just doesn’t work. “The probability that
a woman with a positive mammography has breast cancer” is not at all the
same thing as “the probability that a woman with breast cancer has a positive
mammography”; they are as unlike as apples and cheese.
Suppose that a barrel contains many small plastic eggs. Some eggs are painted
red and some are painted blue. 40% of the eggs in the bin contain pearls, and
60% contain nothing. 30% of eggs containing pearls are painted blue, and 10%
of eggs containing nothing are painted blue. What is the probability that a blue
egg contains a pearl? For this example the arithmetic is simple enough that
you may be able to do it in your head, and I would suggest trying to do so.
A more compact way of specifying the problem:
P (pearl) = 40%
P (blue|pearl) = 30%
P (blue|¬pearl) = 10%
P (pearl|blue) = ?
The symbol “¬” is shorthand for “not,” so ¬pearl reads “not pearl.”
The notation P (blue|pearl) is shorthand for “the probability of blue given
pearl” or “the probability that an egg is painted blue, given that the egg con-
tains a pearl.” The item on the right side is what you already know or the
premise, and the item on the left side is the implication or conclusion. If we have
P (blue|pearl) = 30%, and we already know that some egg contains a pearl,
then we can conclude there is a 30% chance that the egg is painted blue. Thus,
the final fact we’re looking for—“the chance that a blue egg contains a pearl”
or “the probability that an egg contains a pearl, if we know the egg is painted
blue”—reads P (pearl|blue).
40% of the eggs contain pearls, and 60% of the eggs contain nothing. 30%
of the eggs containing pearls are painted blue, so 12% of the eggs altogether
contain pearls and are painted blue. 10% of the eggs containing nothing are
painted blue, so altogether 6% of the eggs contain nothing and are painted
blue. A total of 18% of the eggs are painted blue, and a total of 12% of the eggs
are painted blue and contain pearls, so the chance a blue egg contains a pearl
is 12/18 or 2/3 or around 67%.
As before, we can see the necessity of all three pieces of information by
considering extreme cases. In a (large) barrel in which only one egg out of
a thousand contains a pearl, knowing that an egg is painted blue slides the
probability from 0.1% to 0.3% (instead of sliding the probability from 40% to
67%). Similarly, if 999 out of 1,000 eggs contain pearls, knowing that an egg is
blue slides the probability from 99.9% to 99.966%; the probability that the egg
does not contain a pearl goes from 1/1,000 to around 1/3,000.
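The egg arithmetic, and the two extreme cases, in sketch form:

    p_pearl = 0.40
    p_blue_given_pearl = 0.30
    p_blue_given_empty = 0.10

    blue_and_pearl = p_pearl * p_blue_given_pearl            # 0.12
    blue_and_empty = (1 - p_pearl) * p_blue_given_empty      # 0.06
    print(blue_and_pearl / (blue_and_pearl + blue_and_empty))  # 0.666..., about 67%

    # The extreme barrels mentioned above:
    for p_pearl in (1 / 1000, 999 / 1000):
        bp = p_pearl * p_blue_given_pearl
        be = (1 - p_pearl) * p_blue_given_empty
        print(p_pearl, "->", bp / (bp + be))   # ~0.003 and ~0.99966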
On the pearl-egg problem, most respondents unfamiliar with Bayesian
reasoning would probably respond that the probability a blue egg contains a
pearl is 30%, or perhaps 20% (the 30% chance of a true positive minus the 10%
chance of a false positive). Even if this mental operation seems like a good
idea at the time, it makes no sense in terms of the question asked. It’s like the
experiment in which you ask a second-grader: “If eighteen people get on a bus,
and then seven more people get on the bus, how old is the bus driver?” Many
second-graders will respond: “Twenty-five.” They understand when they’re
being prompted to carry out a particular mental procedure, but they haven’t
quite connected the procedure to reality. Similarly, to find the probability that
a woman with a positive mammography has breast cancer, it makes no sense
whatsoever to replace the original probability that the woman has cancer with
the probability that a woman with breast cancer gets a positive mammography.
Neither can you subtract the probability of a false positive from the probability
of the true positive. These operations are as wildly irrelevant as adding the
number of people on the bus to find the age of the bus driver.
A study by Gigerenzer and Hoffrage in 1995 showed that some ways of phrasing
story problems are much more evocative of correct Bayesian reasoning.4 The
least evocative phrasing used probabilities. A slightly more evocative phrasing
used frequencies instead of probabilities; the problem remained the same, but
instead of saying that 1% of women had breast cancer, one would say that 1 out
of 100 women had breast cancer, that 80 out of 100 women with breast cancer
would get a positive mammography, and so on. Why did a higher proportion
of subjects display Bayesian reasoning on this problem? Probably because
saying “1 out of 100 women” encourages you to concretely visualize X women
with cancer, leading you to visualize X women with cancer and a positive
mammography, etc.
The most effective presentation found so far is what’s known as natural
frequencies—saying that 40 out of 100 eggs contain pearls, 12 out of 40 eggs
containing pearls are painted blue, and 6 out of 60 eggs containing nothing
are painted blue. A natural frequencies presentation is one in which the infor-
mation about the prior probability is included in presenting the conditional
probabilities. If you were just learning about the eggs’ conditional probabilities
through natural experimentation, you would—in the course of cracking open
a hundred eggs—crack open around 40 eggs containing pearls, of which 12
eggs would be painted blue, while cracking open 60 eggs containing nothing,
of which about 6 would be painted blue. In the course of learning the condi-
tional probabilities, you’d see examples of blue eggs containing pearls about
twice as often as you saw examples of blue eggs containing nothing.
Unfortunately, while natural frequencies are a step in the right direction, it
probably won’t be enough. When problems are presented in natural frequen-
cies, the proportion of people using Bayesian reasoning rises to around half. A
big improvement, but not big enough when you’re talking about real doctors
and real patients.
Actually, priors are true or false just like the final answer—they reflect reality
and can be judged by comparing them against reality. For example, if you think
that 920 out of 10,000 women in a sample have breast cancer, and the actual
number is 100 out of 10,000, then your priors are wrong. For our particular
problem, the priors might have been established by three studies—a study on
the case histories of women with breast cancer to see how many of them tested
positive on a mammography, a study on women without breast cancer to see
how many of them test positive on a mammography, and an epidemiological
study on the prevalence of breast cancer in some specific demographic.
The probability P (A, B) is the same as P (B, A), but P (A|B) is not the same
thing as P (B|A), and P (A, B) is completely different from P (A|B). It’s a
common confusion to mix up some or all of these quantities.
To get acquainted with all the relationships between them, we’ll play “fol-
low the degrees of freedom.” For example, the two quantities P (cancer) and
P (¬cancer) have one degree of freedom between them, because of the gen-
eral law P (A) + P (¬A) = 1. If you know that P (¬cancer) = 0.99, you can
obtain P (cancer) = 1 − P (¬cancer) = 0.01.
The quantities P (positive|cancer) and P (¬positive|cancer) also have only
one degree of freedom between them; either a woman with breast cancer gets a
positive mammography or she doesn’t. On the other hand, P (positive|cancer)
and P (positive|¬cancer) have two degrees of freedom. You can have a mam-
mography test that returns positive for 80% of cancer patients and 9.6% of
healthy patients, or that returns positive for 70% of cancer patients and 2% of
healthy patients, or even a health test that returns “positive” for 30% of can-
cer patients and 92% of healthy patients. The two quantities, the output of
the mammography test for cancer patients and the output of the mammog-
raphy test for healthy patients, are in mathematical terms independent; one
cannot be obtained from the other in any way, and so they have two degrees of
freedom between them.
What about P (positive, cancer), P (positive|cancer), and P (cancer)?
Here we have three quantities; how many degrees of freedom are there? In this
case the equation that must hold is
P (positive, cancer) = P (positive|cancer) × P (cancer) .
This equality reduces the degrees of freedom by one. If we know the fraction
of patients with cancer, and the chance that a cancer patient has a positive
mammography, we can deduce the fraction of patients who have breast cancer
and a positive mammography by multiplying.
Similarly, if we know the number of patients with breast cancer and positive
mammographies, and also the number of patients with breast cancer, we can
estimate the chance that a woman with breast cancer gets a positive mammog-
raphy by dividing: P (positive|cancer) = P (positive, cancer)/P (cancer). In
fact, this is exactly how such medical diagnostic tests are calibrated; you do a
study on 8,520 women with breast cancer and see that there are 6,816 (or there-
abouts) women with breast cancer and positive mammographies, then divide
6,816 by 8,520 to find that 80% of women with breast cancer had positive mam-
mographies. (Incidentally, if you accidentally divide 8,520 by 6,816 instead of
the other way around, your calculations will start doing strange things, such
as insisting that 125% of women with breast cancer and positive mammogra-
phies have breast cancer. This is a common mistake in carrying out Bayesian
arithmetic, in my experience.) And finally, if you know P (positive, cancer)
and P (positive|cancer), you can deduce how many cancer patients there must
have been originally. There are two degrees of freedom shared out among the
three quantities; if we know any two, we can deduce the third.
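Here is a minimal sketch of that bookkeeping in Python, using the calibration-study numbers above and the 1% prior used elsewhere in this chapter; knowing any two of the three quantities recovers the third.

    # Calibration study from the text: 8,520 women with breast cancer, 6,816 positive mammographies
    n_cancer = 8520
    n_cancer_and_positive = 6816

    p_positive_given_cancer = n_cancer_and_positive / n_cancer      # 0.8, by dividing
    print(p_positive_given_cancer)

    # With a prior P(cancer), any two of the three quantities give the third:
    p_cancer = 0.01
    p_positive_and_cancer = p_positive_given_cancer * p_cancer      # multiply: the joint probability
    print(p_positive_and_cancer)                                    # 0.008
    print(p_positive_and_cancer / p_positive_given_cancer)          # divide back: recovers P(cancer) = 0.01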
How about P (positive), P (positive, cancer), and P (positive, ¬cancer)?
Again there are only two degrees of freedom among these three variables. The
equation occupying the extra degree of freedom is
P (positive) = P (positive, cancer) + P (positive, ¬cancer) .
This is how P (positive) is computed to begin with; we figure out the num-
ber of women with breast cancer who have positive mammographies, and
the number of women without breast cancer who have positive mammo-
graphies, then add them together to get the total number of women with
positive mammographies. It would be very strange to go out and conduct a
study to determine the number of women with positive mammographies—
just that one number and nothing else—but in theory you could do so.
And if you then conducted another study and found the number of those
women who had positive mammographies and breast cancer, you would also
know the number of women with positive mammographies and no breast
cancer—either a woman with a positive mammography has breast cancer or
she doesn’t. In general, P (A, B) + P (A, ¬B) = P (A). Symmetrically,
P (A, B) + P (¬A, B) = P (B).
What about P (positive, cancer), P (positive, ¬cancer), P (¬positive,
cancer), and P (¬positive, ¬cancer)? You might at first be tempted to think
that there are only two degrees of freedom for these four quantities—that
you can, for example, get P (positive, ¬cancer) by multiplying P (positive) ×
P (¬cancer), and thus that all four quantities can be found given only
the two quantities P (positive) and P (cancer). This is not the case!
P (positive, ¬cancer) = P (positive) × P (¬cancer) only if the two proba-
bilities are statistically independent—if the chance that a woman has breast
cancer has no bearing on whether she has a positive mammography. This
amounts to requiring that the two conditional probabilities be equal to each
other—a requirement which would eliminate one degree of freedom. If you
remember that these four quantities are the groups A, B, C, and D, you can
look over those four groups and realize that, in theory, you can put any num-
ber of people into the four groups. If you start with a group of 80 women with
breast cancer and positive mammographies, there’s no reason why you can’t
add another group of 500 women with breast cancer and negative mammo-
graphies, followed by a group of 3 women without breast cancer and negative
mammographies, and so on. So now it seems like the four quantities have
four degrees of freedom. And they would, except that in expressing them as
probabilities, we need to normalize them to fractions of the complete group,
which adds the constraint that P (positive, cancer) + P (positive, ¬cancer) +
P (¬positive, cancer) + P (¬positive, ¬cancer) = 1. This equation takes up
one degree of freedom, leaving three degrees of freedom among the four quan-
tities. If you specify the fractions of women in groups A, B, and D, you can
deduce the fraction of women in group C.
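Here is a small illustration of the normalization constraint. The first three group counts are the ones from the example above; the fourth is an arbitrary number chosen for illustration.

    # Raw group counts (four degrees of freedom as counts)
    counts = {
        "cancer, positive": 80,       # from the example above
        "cancer, negative": 500,      # from the example above
        "no cancer, negative": 3,     # from the example above
        "no cancer, positive": 417,   # arbitrary fourth group, for illustration only
    }

    total = sum(counts.values())                        # 1,000
    probs = {k: v / total for k, v in counts.items()}   # normalizing uses up one degree of freedom

    print(sum(probs.values()))   # 1.0: the four probabilities are constrained to sum to 1
    # So any three of them determine the fourth:
    print(1 - probs["cancer, positive"] - probs["cancer, negative"] - probs["no cancer, positive"])
    print(probs["no cancer, negative"])                 # same number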
Given the four groups A, B, C, and D, it is very straightforward to compute
everything else:
P (cancer) = (A + B)/(A + B + C + D) ,
P (¬positive|cancer) = B/(A + B) ,
and so on. Since {A, B, C, D} contains three degrees of freedom, it follows
that the entire set of probabilities relating cancer rates to test results contains
only three degrees of freedom. Remember that in our problems we always
needed three pieces of information—the prior probability and the two con-
ditional probabilities—which, indeed, have three degrees of freedom among
them. Actually, for Bayesian problems, any three quantities with three degrees
of freedom between them should logically specify the entire problem.
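As a sketch of this point, the snippet below builds the four groups for a sample of 10,000 women from the figures used in this chapter (1% prevalence, 80% hit rate, 9.6% false-positive rate), taking A and B as the cancer groups with positive and negative mammographies respectively, and C and D as the corresponding no-cancer groups; everything else follows by counting.

    # Group counts out of 10,000 women (1% prevalence, 80% hit rate, 9.6% false-positive rate)
    A = 10_000 * 0.01 * 0.80        # cancer, positive mammography      = 80
    B = 10_000 * 0.01 * 0.20        # cancer, negative mammography      = 20
    C = 10_000 * 0.99 * 0.096       # no cancer, positive mammography   = 950.4
    D = 10_000 * 0.99 * 0.904       # no cancer, negative mammography   = 8,949.6

    p_cancer = (A + B) / (A + B + C + D)          # prior: 0.01
    p_positive_given_cancer = A / (A + B)         # hit rate: 0.80
    p_positive_given_no_cancer = C / (C + D)      # false-positive rate: 0.096
    p_cancer_given_positive = A / (A + C)         # posterior: about 0.078

    print(p_cancer, p_positive_given_cancer, p_positive_given_no_cancer, p_cancer_given_positive)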
The probability that a test gives a true positive divided by the probability that
a test gives a false positive is known as the likelihood ratio of that test. The
likelihood ratio for a positive result summarizes how much a positive result
will slide the prior probability. Does the likelihood ratio of a medical test then
sum up everything there is to know about the usefulness of the test?
No, it does not! The likelihood ratio sums up everything there is to know
about the meaning of a positive result on the medical test, but the meaning of a
negative result on the test is not specified, nor is the frequency with which the
test is useful. For example, a mammography with a hit rate of 80% for patients
with breast cancer and a false positive rate of 9.6% for healthy patients has the
same likelihood ratio as a test with an 8% hit rate and a false positive rate of
0.96%. Although these two tests have the same likelihood ratio, the first test is
more useful in every way—it detects disease more often, and a negative result
is stronger evidence of health.
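Here is a small sketch comparing the two tests just described. Both yield the same 7.8% posterior after a positive result (with the 1% prior used in this chapter), but a negative result is far more reassuring on the first test; the helper function below is only illustrative.

    def posterior(prior, p_pos_given_sick, p_pos_given_healthy, result_positive):
        """Update a prior through one test result (positive or negative)."""
        if result_positive:
            num = p_pos_given_sick * prior
            den = num + p_pos_given_healthy * (1 - prior)
        else:
            num = (1 - p_pos_given_sick) * prior
            den = num + (1 - p_pos_given_healthy) * (1 - prior)
        return num / den

    prior = 0.01
    # Test 1: 80% hit rate, 9.6% false positives.  Test 2: 8% hit rate, 0.96% false positives.
    print(posterior(prior, 0.80, 0.096, True),  posterior(prior, 0.08, 0.0096, True))   # both ~7.8%
    print(posterior(prior, 0.80, 0.096, False), posterior(prior, 0.08, 0.0096, False))  # ~0.2% vs ~0.9%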
Suppose that you apply two tests for breast cancer in succession—say, a standard
mammography and also some other test which is independent of mammogra-
phy. Since I don’t know of any such test that is independent of mammography,
I’ll invent one for the purpose of this problem, and call it the Tams-Braylor
Division Test, which checks to see if any cells are dividing more rapidly than
other cells. We’ll suppose that the Tams-Braylor gives a true positive for 90%
of patients with breast cancer, and gives a false positive for 5% of patients with-
out cancer. Let’s say the prior prevalence of breast cancer is 1%. If a patient
gets a positive result on her mammography and her Tams-Braylor, what is the
revised probability she has breast cancer?
One way to solve this problem would be to take the revised probability for
a positive mammography, which we already calculated as 7.8%, and plug that
into the Tams-Braylor test as the new prior probability. If we do this, we find
that the result comes out to 60%.
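Here is a minimal sketch of that chaining in Python: run the 1% prior through the mammography, then feed the resulting 7.8% into the Tams-Braylor test as the new prior; the update function is just the standard Bayesian update for a positive result.

    def update(prior, p_pos_given_sick, p_pos_given_healthy):
        """Posterior probability of disease after one positive test result."""
        num = p_pos_given_sick * prior
        return num / (num + p_pos_given_healthy * (1 - prior))

    after_mammography = update(0.01, 0.80, 0.096)                # ~0.078
    after_tams_braylor = update(after_mammography, 0.90, 0.05)   # ~0.60
    print(after_mammography, after_tams_braylor)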
Suppose that the prior prevalence of breast cancer in a demographic is 1%.
Suppose that we, as doctors, have a repertoire of three independent tests for
breast cancer. Our first test, test A, a mammography, has a likelihood ratio
of 80%/9.6% = 8.33. The second test, test B, has a likelihood ratio of 18.0
(for example, from 90% versus 5%); and the third test, test C, has a likelihood
ratio of 3.5 (which could be from 70% versus 20%, or from 35% versus 10%; it
makes no difference). Suppose a patient gets a positive result on all three tests.
What is the probability the patient has breast cancer?
Here’s a fun trick for simplifying the bookkeeping. If the prior prevalence
of breast cancer in a demographic is 1%, then 1 out of 100 women have breast
cancer, and 99 out of 100 women do not have breast cancer. So if we rewrite
the probability of 1% as an odds ratio, the odds are 1:99.
And the likelihood ratios of the three tests A, B, and C are:
8.33 : 1 = 25 : 3
18.0 : 1 = 18 : 1
3.5 : 1 = 7 : 2 .
The odds for women with breast cancer who score positive on all three tests,
versus women without breast cancer who score positive on all three tests, will
equal:
1 × 25 × 18 × 7 : 99 × 3 × 1 × 2 = 3150 : 594 .
To recover the probability from the odds, we just write:
3,150/(3,150 + 594) = 84% .
This always works regardless of how the odds ratios are written; i.e., 8.33:1
is just the same as 25:3 or 75:9. It doesn’t matter in what order the tests are
administered, or in what order the results are computed. The proof is left as an
exercise for the reader.
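A minimal sketch of the odds bookkeeping, multiplying the prior odds through the three likelihood ratios; the fractions module is used only to keep the arithmetic exact.

    from fractions import Fraction

    # Prior odds 1:99, then likelihood ratios 25:3, 18:1, 7:2
    sick, healthy = 1, 99
    for lr_sick, lr_healthy in [(25, 3), (18, 1), (7, 2)]:
        sick *= lr_sick
        healthy *= lr_healthy

    print(sick, healthy)                          # 3150 : 594
    print(float(Fraction(sick, sick + healthy)))  # ~0.84, the posterior probability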
E. T. Jaynes suggests that credibility and evidence should be measured in decibels.5 The conversion is decibels = 10 log10 (intensity), or equivalently intensity = 10^(decibels/10).
Suppose we start with a prior probability of 1% that a woman has breast cancer,
corresponding to an odds ratio of 1:99. And then we administer three tests of
likelihood ratios 25:3, 18:1, and 7:2. You could multiply those numbers . . . or
you could just add their logarithms:
10 log10 (1/99) ≈ −20
10 log10 (25/3) ≈ 9
10 log10 (18/1) ≈ 13
10 log10 (7/2) ≈ 5 .
It starts out as fairly unlikely that a woman has breast cancer—our credibility
level is at −20 decibels. Then three test results come in, corresponding to 9,
13, and 5 decibels of evidence. This raises the credibility level by a total of 27
decibels, meaning that the prior credibility of −20 decibels goes to a posterior
credibility of 7 decibels. So the odds go from 1:99 to 5:1, and the probability
goes from 1% to around 83%.
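The same bookkeeping can be done in logarithms. Here is a sketch that converts each odds ratio to decibels with 10 log10, adds them, and converts back.

    import math

    def decibels(odds):
        return 10 * math.log10(odds)

    prior_db = decibels(1 / 99)                                           # about -20 dB
    evidence_db = [decibels(25 / 3), decibels(18 / 1), decibels(7 / 2)]   # about 9, 13, 5 dB
    posterior_db = prior_db + sum(evidence_db)                            # about 7 dB

    posterior_odds = 10 ** (posterior_db / 10)                    # back to odds, about 5.3 : 1
    print(posterior_db, posterior_odds / (1 + posterior_odds))    # probability about 0.84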
The calculation we have been performing all along is
P (positive|cancer) × P (cancer) / [P (positive|cancer) × P (cancer) + P (positive|¬cancer) × P (¬cancer)] ,
which is
P (positive, cancer) / [P (positive, cancer) + P (positive, ¬cancer)] ,
which is
P (positive, cancer) / P (positive) ,
which is
P (cancer|positive) .
The fully general form of this calculation is known as Bayes’s Theorem or Bayes’s
Rule.
P (A|X) = P (X|A) × P (A) / [P (X|A) × P (A) + P (X|¬A) × P (¬A)] .
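As a quick numerical check, here is the earlier mammography problem (1% prior, 80% hit rate, 9.6% false-positive rate) plugged directly into the formula; it reproduces the 7.8% figure from before.

    def bayes(p_a, p_x_given_a, p_x_given_not_a):
        """P(A|X) by Bayes's Theorem."""
        return (p_x_given_a * p_a) / (p_x_given_a * p_a + p_x_given_not_a * (1 - p_a))

    print(bayes(0.01, 0.80, 0.096))   # ~0.078: P(cancer | positive mammography)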
Bayes’s Theorem describes what makes something “evidence” and how much
evidence it is. Statistical models are judged by comparison to the Bayesian
method because, in statistics, the Bayesian method is as good as it gets—the
Bayesian method defines the maximum amount of mileage you can get out
of a given piece of evidence, in the same way that thermodynamics defines
the maximum amount of work you can get out of a temperature differential.
This is why you hear cognitive scientists talking about Bayesian reasoners. In
cognitive science, Bayesian reasoner is the technically precise code word that
we use to mean rational mind.
There are also a number of general heuristics about human reasoning that
you can learn from looking at Bayes’s Theorem.
For example, in many discussions of Bayes’s Theorem, you may hear cogni-
tive psychologists saying that people do not take prior frequencies sufficiently
into account, meaning that when people approach a problem where there’s
some evidence X indicating that condition A might hold true, they tend to
judge A’s likelihood solely by how well the evidence X seems to match A,
without taking into account the prior frequency of A. If you think, for exam-
ple, that under the mammography example, the woman’s chance of having
breast cancer is in the range of 70%–80%, then this kind of reasoning is insen-
sitive to the prior frequency given in the problem; it doesn’t notice whether
1% of women or 10% of women start out having breast cancer. “Pay more at-
tention to the prior frequency!” is one of the many things that humans need to
bear in mind to partially compensate for our built-in inadequacies.
A related error is to pay too much attention to P (X|A) and not enough to
P (X|¬A) when determining how much evidence X is for A. The degree to
which a result X is evidence for A depends not only on the strength of the state-
ment we’d expect to see result X if A were true, but also on the strength of the
statement we wouldn’t expect to see result X if A weren’t true. For example, if it
is raining, this very strongly implies the grass is wet—P (wetgrass|rain) ≈ 1—
but seeing that the grass is wet doesn’t necessarily mean that it has just rained;
perhaps the sprinkler was turned on, or you’re looking at the early morning dew.
Since P (wetgrass|¬rain) is substantially greater than zero, P (rain|wetgrass)
is substantially less than one. On the other hand, if the grass was never wet
when it wasn’t raining, then knowing that the grass was wet would always show
that it was raining, P (rain|wetgrass) ≈ 1, even if P (wetgrass|rain) = 50%;
that is, even if the grass only got wet 50% of the times it rained. Evidence is
always the result of the differential between the two conditional probabilities.
Strong evidence is not the product of a very high probability that A leads to X,
but the product of a very low probability that not-A could have led to X.
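Here is a small sketch of the wet-grass arithmetic. The prior probability of rain is not given in the text, so the 30% used below is an arbitrary illustrative value; the point is how the answer swings with P (wetgrass|¬rain).

    def p_rain_given_wet(p_rain, p_wet_given_rain, p_wet_given_no_rain):
        num = p_wet_given_rain * p_rain
        return num / (num + p_wet_given_no_rain * (1 - p_rain))

    p_rain = 0.3   # arbitrary prior, for illustration only

    # Sprinklers and dew exist: P(wet|rain) ~ 1 but P(wet|no rain) is well above zero
    print(p_rain_given_wet(p_rain, 1.0, 0.4))    # ~0.52: wet grass is weak evidence of rain

    # Grass is never wet without rain: even a 50% hit rate gives certainty
    print(p_rain_given_wet(p_rain, 0.5, 0.0))    # 1.0: wet grass now proves rain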
The Bayesian revolution in the sciences is fueled, not only by more and more
cognitive scientists suddenly noticing that mental phenomena have Bayesian
structure in them; not only by scientists in every field learning to judge their
statistical methods by comparison with the Bayesian method; but also by the
idea that science itself is a special case of Bayes’s Theorem; experimental evidence
is Bayesian evidence. The Bayesian revolutionaries hold that when you perform
an experiment and get evidence that “confirms” or “disconfirms” your theory,
this confirmation and disconfirmation is governed by the Bayesian rules. For
example, you have to take into account not only whether your theory predicts
the phenomenon, but whether other possible explanations also predict the
phenomenon.
Previously, the most popular philosophy of science was probably Karl Pop-
per’s falsificationism—this is the old philosophy that the Bayesian revolution is
currently dethroning. Karl Popper’s idea that theories can be definitely falsi-
fied, but never definitely confirmed, is yet another special case of the Bayesian
rules; if P (X|A) ≈ 1—if the theory makes a definite prediction—then ob-
serving ¬X very strongly falsifies A. On the other hand, if P (X|A) ≈ 1, and
we observe X, this doesn’t definitely confirm the theory; there might be some
other condition B such that P (X|B) ≈ 1, in which case observing X doesn’t
favor A over B. For observing X to definitely confirm A, we would have to
know, not that P (X|A) ≈ 1, but that P (X|¬A) ≈ 0, which is something
that we can’t know because we can’t range over all possible alternative expla-
nations. For example, when Einstein’s theory of General Relativity toppled
Newton’s incredibly well-confirmed theory of gravity, it turned out that all of
Newton’s predictions were just a special case of Einstein’s predictions.
You can even formalize Popper’s philosophy mathematically. The likeli-
hood ratio for X, the quantity P (X|A)/P (X|¬A), determines how much
observing X slides the probability for A; the likelihood ratio is what says how
strong X is as evidence. Well, in your theory A, you can predict X with prob-
ability 1, if you like; but you can’t control the denominator of the likelihood
ratio, P (X|¬A)—there will always be some alternative theories that also pre-
dict X, and while we go with the simplest theory that fits the current evidence,
you may someday encounter some evidence that an alternative theory predicts
but your theory does not. That’s the hidden gotcha that toppled Newton’s
theory of gravity. So there’s a limit on how much mileage you can get from
successful predictions; there’s a limit on how high the likelihood ratio goes for
confirmatory evidence.
On the other hand, if you encounter some piece of evidence Y that is
definitely not predicted by your theory, this is enormously strong evidence
against your theory. If P (Y |A) is infinitesimal, then the likelihood ratio will
also be infinitesimal. For example, if P (Y |A) is 0.0001%, and P (Y |¬A) is
1%, then the likelihood ratio P (Y |A)/P (Y |¬A) will be 1:10,000. That’s −40
decibels of evidence! Or, flipping the likelihood ratio, if P (Y |A) is very small,
then P (Y |¬A)/P (Y |A) will be very large, meaning that observing Y greatly
favors ¬A over A. Falsification is much stronger than confirmation. This is a
consequence of the earlier point that very strong evidence is not the product
of a very high probability that A leads to X, but the product of a very low
probability that not-A could have led to X. This is the precise Bayesian rule
that underlies the heuristic value of Popper’s falsificationism.
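A sketch of that last calculation: with P (Y |A) at 0.0001% and P (Y |¬A) at 1%, the likelihood ratio is 1:10,000, which is −40 decibels of evidence for A.

    import math

    p_y_given_a = 0.000001      # 0.0001%: the theory essentially forbids Y
    p_y_given_not_a = 0.01      # 1%: alternatives allow Y occasionally

    likelihood_ratio = p_y_given_a / p_y_given_not_a
    print(likelihood_ratio)                      # 0.0001, i.e. 1:10,000
    print(10 * math.log10(likelihood_ratio))     # -40 decibels of evidence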
Similarly, Popper’s dictum that an idea must be falsifiable can be inter-
preted as a manifestation of the Bayesian conservation-of-probability rule; if a
result X is positive evidence for the theory, then the result ¬X would have
disconfirmed the theory to some extent. If you try to interpret both X and
¬X as “confirming” the theory, the Bayesian rules say this is impossible! To
increase the probability of a theory you must expose it to tests that can po-
tentially decrease its probability; this is not just a rule for detecting would-be
cheaters in the social process of science, but a consequence of Bayesian proba-
bility theory. On the other hand, Popper’s idea that there is only falsification
and no such thing as confirmation turns out to be incorrect. Bayes’s Theorem
shows that falsification is very strong evidence compared to confirmation, but
falsification is still probabilistic in nature; it is not governed by fundamentally
different rules from confirmation, as Popper argued.
So we find that many phenomena in the cognitive sciences, plus the statisti-
cal methods used by scientists, plus the scientific method itself, are all turning
out to be special cases of Bayes’s Theorem. Hence the Bayesian revolution.
P (A|X) = P (X|A) × P (A) / [P (X|A) × P (A) + P (X|¬A) × P (¬A)]
We’ll start with P (A|X). If you ever find yourself getting confused about
what’s A and what’s X in Bayes’s Theorem, start with P (A|X) on the left
side of the equation; that’s the simplest part to interpret. In P (A|X), A is the
thing we want to know about. X is how we’re observing it; X is the evidence
we’re using to make inferences about A. Remember that for every expression
P (Q|P ), we want to know about the probability for Q given P, the degree
to which P implies Q—a more sensible notation, which it is now too late to
adopt, would be P (Q ← P ).
P (Q|P ) is closely related to P (Q, P ), but they are not identical. Expressed
as a probability or a fraction, P (Q, P ) is the proportion of things that have
property Q and property P among all things; e.g., the proportion of “women
with breast cancer and a positive mammography” within the group of all
women. If the total number of women is 10,000, and 80 women have breast
cancer and a positive mammography, then P (Q, P ) is 80/10,000 = 0.8%. You
might say that the absolute quantity, 80, is being normalized to a probability
relative to the group of all women. Or to make it clearer, suppose that there’s a
group of 641 women with breast cancer and a positive mammography within
a total sample group of 89,031 women. Six hundred and forty-one is the
absolute quantity. If you pick out a random woman from the entire sample,
then the probability you’ll pick a woman with breast cancer and a positive
mammography is P (Q, P ), or 0.72% (in this example).
On the other hand, P (Q|P ) is the proportion of things that have property
Q and property P among all things that have P ; e.g., the proportion of women
with breast cancer and a positive mammography within the group of all women
with positive mammographies. If there are 641 women with breast cancer and
positive mammographies, 7,915 women with positive mammographies, and
89,031 women, then P (Q, P ) is the probability of getting one of those 641
women if you’re picking at random from the entire group of 89,031, while
P (Q|P ) is the probability of getting one of those 641 women if you’re picking
at random from the smaller group of 7,915.
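The distinction can be checked directly with the sample numbers above; the joint and conditional probabilities come from dividing the same count of 641 women by two different groups.

    n_cancer_and_positive = 641     # women with breast cancer and a positive mammography
    n_positive = 7_915              # all women with a positive mammography
    n_total = 89_031                # the whole sample

    p_joint = n_cancer_and_positive / n_total            # P(Q, P): ~0.0072, picking from everyone
    p_conditional = n_cancer_and_positive / n_positive   # P(Q|P): ~0.081, picking from the positives
    print(p_joint, p_conditional)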
In a sense, P (Q|P ) really means P (Q, P |P ), but specifying the extra P
all the time would be redundant. You already know it has property P, so the
property you’re investigating is Q—even though you’re looking at the size
of group (Q, P ) within group P, not the size of group Q within group P
(which would be nonsense). This is what it means to take the property on
the right-hand side as given; it means you know you’re working only within
the group of things that have property P. When you constrict your focus of
attention to see only this smaller group, many other probabilities change. If
you’re taking P as given, then P (Q, P ) equals just P (Q)—at least, relative
to the group P . The old P (Q), the frequency of “things that have property
Q within the entire sample,” is revised to the new frequency of “things that
have property Q within the subsample of things that have property P. ” If P is
given, if P is our entire world, then looking for (Q, P ) is the same as looking
for just Q.
If you constrict your focus of attention to only the population of eggs that
are painted blue, then suddenly “the probability that an egg contains a pearl”
becomes a different number; this proportion is different for the population of
blue eggs than the population of all eggs. The given, the property that constricts
our focus of attention, is always on the right side of P (Q|P ); the P becomes
our world, the entire thing we see, and on the other side of the “given” P
always has probability 1—that is what it means to take P as given. So P (Q|P )
means “If P has probability 1, what is the probability of Q?” or “If we constrict
our attention to only things or events where P is true, what is the probability
of Q?” The statement Q, on the other side of the given, is not certain—its
probability may be 10% or 90% or any other number. So when you use Bayes’s
Theorem, and you write the part on the left side as P (A|X)—how to update
the probability of A after seeing X, the new probability of A given that we
know X, the degree to which X implies A—you can tell that X is always the
observation or the evidence, and A is the property being investigated, the thing
you want to know about.
The right side of Bayes’s Theorem is derived from the left side through these
steps:
P (A|X) = P (A|X)
P (A|X) = P (X, A)/P (X)
P (A|X) = P (X, A)/[P (X, A) + P (X, ¬A)]
P (A|X) = P (X|A) × P (A)/[P (X|A) × P (A) + P (X|¬A) × P (¬A)] .
Once the derivation is finished, all the implications on the right side of the equa-
tion are of the form P (X|A) or P (X|¬A), while the implication on the left
side is P (A|X). The symmetry arises because the elementary causal relations
are generally implications from facts to observations, e.g., from breast cancer to
positive mammography. The elementary steps in reasoning are generally impli-
cations from observations to facts, e.g., from a positive mammography to breast
cancer. The left side of Bayes’s Theorem is an elementary inferential step from
the observation of positive mammography to the conclusion of an increased
probability of breast cancer. Implication is written right-to-left, so we write
P (cancer|positive) on the left side of the equation. The right side of Bayes’s
Theorem describes the elementary causal steps—for example, from breast can-
cer to a positive mammography—and so the implications on the right side of
Bayes’s Theorem take the form P (positive|cancer) or P (positive|¬cancer).
And that’s Bayes’s Theorem. Rational inference on the left end, physical
causality on the right end; an equation with mind on one side and reality on
the other. Remember how the scientific method turned out to be a special
case of Bayes’s Theorem? If you wanted to put it poetically, you could say that
Bayes’s Theorem binds reasoning into the physical universe.
Okay, we’re done.
3. Gerd Gigerenzer and Ulrich Hoffrage, “How to Improve Bayesian Reasoning without Instruction:
Frequency Formats,” Psychological Review 102 (1995): 684–704.
4. Ibid.
5. Edwin T. Jaynes, “Probability Theory, with Applications in Science and Engineering,” Unpublished
manuscript (1974).