Koponen - Data Visualization Handbook - Introduction

DATA VISUALIZATION HANDBOOK

CONTENTS

FOREWORD 13

I INTRODUCTION 19
++ Vision is the strongest human sense 20
++ What is information design? 23
++ When should data be presented in visual form and when not? 29
++ The grammar of information graphics 31
++ Simplify, compare and organize 33
++ The golden rule of information design 42

II VISUAL PERCEPTION IN ACTION 47
++ Visual queries 47
++ The process of visual perception 49
++ Depth perception 63
++ Color 64

III GENERAL PRINCIPLES OF VISUALIZATION DESIGN 83
++ Consistency 83
++ Narrative structures 87
++ Layout 88
++ Combination charts 93
++ Organization and categorization of data 94
++ Visualizing multivariate data using small multiples 100
++ Importance of esthetic choices 101

IV INFORMATION DESIGN GENRES 111
++ The origin of information design genres 111
++ The emergence of writing systems: from pictograms to alphabets 114

Information illustration 119
++ Annotated photograph 122
++ Artist’s rendering 122
++ Field guide illustration 124
++ Diagram 126
++ Pictograms 126
++ Cultural stereotypes in pictograms 130
++ Technical drawing 132
++ Cutaway drawing 134
++ Step-by-step diagram 134

Maps 137
++ Data maps 142
++ Tabula Peutingeriana 158
++ Geographical terms 161
++ Map design 162
++ What is GIS? 163
++ Map symbols 170
++ Map projections 172
++ Maps as means of wayshowing 175

Statistical graphics 179
++ Bar chart 180
++ Dot plot 183
++ Line chart 184
++ Pie and donut charts 188
++ Scatterplot 190
++ Correlation 192
++ Visualizing uncertainty and variation 196
++ Pictorial unit chart 198
++ Isotype 200
++ Chart design 207

Concept graphics 219
++ Periodic table 220
++ Matrix 222
++ Quad chart 222
++ Venn diagram 223
++ Timeline 223
++ Word cloud and word bubbles 224

Network diagrams 227
++ Force-directed algorithms 228
++ Tree structure diagrams 230

Scientific visualizations 235
++ Three-dimensional scientific visualizations 236
++ The structure of the Drosophila larva wing imaginal disc 237
++ HemoVis 238
++ Tractographic visualization using Google Maps 239
++ Detailed visualization of the synaptic structure 240
++ Three-dimensional structure of the local universe 241

V TEXT AND TYPOGRAPHY 245
++ Body size and x-height 246
++ Typefaces 249
++ Upper- and lower-case letters 256
++ Text typography 256
++ Romanization 262
++ Map typography 264

VI INFORMATION DESIGN WORKFLOW 277
++ One, two or three roles? 278
++ Project phases 283
++ The designer – a resource or a journalist? 292
++ The New York Times graphics desk 294
++ Information design in a public sector organization: Algemene Rekenkamer 296
++ Malofiej infographics competition 298
++ The ethics of information design 299
++ Guidelines for visual journalists 309
++ Guidelines for journalists 310

APPENDICES 315

Map projections 316
++ Choice of map projection 321

Interaction 325
++ General design principles for interactive user interfaces 325
++ Basic interactions 326
++ Interactions typical of visualizations 329

Index 332
Bibliography 340
Picture sources and credits 348

Acknowledgements

This book would never have been written without Tapio Vapaasalo.
Neither would it ever have been finished without the hard work of our
production editor, Pia Alapeteri.

We are also deeply indebted to Alberto Cairo, who has been amazingly
supportive in our undertaking, as well as an inspiration.

Annu Ahonen, Iida Turpeinen, Kimmo Vehkalahti, and Jussi Tuulensuu
have been immensely helpful in commenting on the manuscript
at its various stages.

We want to thank everyone who agreed to be interviewed for the book:
Fernando Goméz Baptista, Bonnie Berkowitz, Jonathon Berlin,
Brian Boyer, Jen Christiansen, Ben Fry, Martine Hendriksen,
Scott Klein, Miska Knapek, Vesa Kuusela, Charlie Loyd, Damon
Burgett, Stefanie Posavec, Kim Rees, Brett Johnson, Jon
Schwabish, Harri Siirtola, Robert Simmon, Yrjö Sucksdorff,
Mikko Hynninen, Jan Willem Tulp, Ben Welsh, Sakke Yrjölä, and
Javier Zarracina.

Likewise, we thank everyone who gave us permission to feature
their work in the book, especially Anssi Arte, Nadieh Bremer, Juan
Colombato, Stewart Gray, Sami Heiskanen, Benjamin Hennig,
Hannu Kyyriäinen, Viktor Landström, Johannes Nieminen,
Tuomas Siitonen, Janne Pulkkinen, Pekka Veikkolainen, Hannes
Vartiainen, Lauri Vanhala, and Aljaž Vindiš.

Paula Ahonen-Rainio, Keetos Balion, Otto Donner, Juuli
Hurskainen, Tiina Koivusalo, Tommi Kovala, Esa Lehtinen,
Eemeli Nieminen, Eeva Rautio, Ville Tikkanen, and Tuukka
Ylä-Anttila have helped us in other important ways.

We wish to thank our friends and families for their patience and support.

FOREWORD

Every few days, we generate more data than had been
produced since the dawn of history up to the year 2000.
As Google’s chief economist Hal Varian has stated, we “have
essentially free and ubiquitous data … [now] the scarce factor is
the ability to understand that data and extract value from it.” In
the end, data is valuable only to the extent that it can, in one way
or another, be transformed into knowledge and wisdom.
Although artificial intelligence and other automated methods
for analyzing data have seen impressive improvements during
the past years, computers are not about to replace people as
the final arbiter of meaning and relevance. For the foreseeable
future, human beings—not machines—are going to be the
end-users of data, and any insights that might arise from it.
Consequently, the relevant question is: how can a person make
sense of this deluge of data, and gain actionable insights from it?
However versatile and adaptable we are as a species, humans
were not built for absorbing raw data. We have evolved to
survive in an environment very different from the one most
of us currently inhabit, engaged in tasks very different from
combing through databases or spreadsheets for nuggets of
information. Improvements in nutrition, health, and education
notwithstanding, our brains and sensory systems have remained
essentially unchanged for the 300,000 or more years our species
has existed. This sets limits to our ability to absorb and process
information.
Sensory stimulation is the only way for information from the
outside world to enter our brain, and by far, vision is the most
important of our senses. “The surface is the only way in,” as
professor emeritus Tapio Vapaasalo (our co-author in Tieto näkyväksi,
the Finnish-language version of this book) has put it. Information
design is the art and science of transforming data into visual
structures from which we can extract meaning: make comparisons,
see trends, recognize patterns, spot exceptions and outliers,
understand causal relationships, and much more.

As more and more activities become increasingly data-intensive,
the ability to analyze data and communicate data-driven
insights effectively becomes an essential skill for people in a wide
range of professions. Although graphic designers, communicators,
and all those who are professionally involved in information
dissemination have an important role to play, data-driven
communication cannot remain the exclusive purview of such
professionals. Instead, we believe it is highly useful—maybe even
necessary—for all knowledge workers to understand the basics
of information design, to be able to create clear, compelling,
and insightful presentations of data—for their own use or for
informing others.
Information graphics shape our understanding of reality.
Charts are seen as authoritative, and a well-made chart can help
convince people to change their views on important societal
issues. Maps, which were once rare artifacts, mainly used by
specialists, are now regularly perused by most smartphone
users—a group of people numbering some 4 billion in 2019.
Formerly seen mainly as an implement of science and engineering,
visualization is today applied to a myriad of different
uses in fields as disparate as sports and agriculture.
Yet understanding such displays of data is not an innate skill,
but one that has to be learned and honed. Visualization literacy
should, in our opinion, not be seen as limited just to the ability
to read charts and maps, but also to include the ability to make
use of tools readily available for creating them. Taking control
of chart- and map-making can be an agent of democracy for
common citizens, especially those belonging to marginalized
groups, to better participate in public discourse previously
governed by an elite group of highly trained professionals.
This book is the culmination of a decade of teaching, writing
about, and professionally pursuing information design. We hope
it will provide a good overview of this varied and ever-evolving
topic, and prove useful and enlightening for anyone wishing
to communicate data—whether they are design professionals,
analysts, journalists, scientists, civil servants, or practitioners of
any other profession.
And above all, we hope you enjoy reading this book as much
as we have enjoyed writing it.
In Helsinki,
January 6th 2019
The authors

I INTRODUCTION

“The purpose of visualization is insight, not pictures”
— Ben Shneiderman

Humankind has been creating visual documents of its
surroundings for tens of thousands of years, long before
the invention of writing. Alongside naturalistic imagery that
attempts to capture what can be seen with our eyes, other visual
representation methods, such as maps, were developed early in
human history. Although maps resemble miniature images of
terrain, their purpose is not to accurately reproduce a bird’s-eye
view of the landscape. They are conceptual representations based
on established visualization conventions such as a scale factor.

◀ A still from the animation The next big spill – the Baltic Sea traffic visualized (2013), created by Lauri Vanhala for HELCOM (Baltic Marine Environment Protection Commission).

Developments in science, technology and society have brought
about a need to present visually things that are not normally
visible. Our eyes cannot directly see bodily functions or social
and economic structures, but when presented in visual form,
they are easier to understand.
According to the visualization researcher Colin Ware, most
of our thinking happens as a kind of interaction with a variety
of methods and tools that enhance our brains’ data processing
ability. Alongside traditional methods such as drawing and
writing, rapid technological progress constantly produces new,
better tools, the most recent of which are computers and
smart devices. In almost all fields of work, thinking is “carried
out through distributed cognitive systems” in social networks,
in which information is often processed by a large number
of people.¹ Interaction is communication, communication is
thinking, and visualization is in many cases the most effective
way of communicating information.

1. Ware 2004/2013, p. 2. Ware is here building on the work of anthropologist Edwin Hutchins, especially his book Cognition in the wild (1995).


Despite the development of tools, methods and processes,
the fundamental laws governing the visual presentation of
information have not changed. Our eyes and brain still function
in the same way as they did for the first humans, hundreds of
thousands of years ago. Understanding the capabilities and
limitations of the visual system helps to identify a suitable visual
presentation method for each communication need. The selection
of the presentation method and a layout for the information,
taking into account the capabilities and limitations of the visual
system, is called information design.
Every knowledge worker’s job description includes situations
in which data has to be converted into visual form. Common
tasks include giving presentations and creating presentation
graphics and charts using spreadsheet software. In scientific
publications, visualizing key research results in some form is
a basic requirement. It is hard to imagine a modern school
without educational materials, created by teachers or by textbook
publishers, that use a variety of different graphic presentations.
The data visualization handbook is written for journalists, as well
as graphic designers and other visual professionals, but also
for people working in academia, education, public relations,
government, and politics, who want to use information graphics
and visualizations to support communication, analysis or
decision-making. As the name suggests, the book can be used as
a practical guide to creating visual presentations, but it is also
suitable for use as a textbook.
The book introduces the reader to the general principles of
information design, the limitations and strengths of human
perception, and the requirements they place on visual presentation
methods. The book explains the various genres of data
visualization and includes a chapter on typographic issues related
to visualization. The book ends with a discussion of workflows,
ethics, and good professional practices in information design.

+
Vision is the strongest human sense
Humans and other apes have exceptionally well-developed
eyesight. This distinguishes us from most other mammals, which
rely much more on smell and hearing in perceiving the world
around them. More than a quarter of the cells in our cerebral
cortex are specialized for processing signals from the eyes.²
In fact, the human visual cortex is larger than all the other parts
of the brain used for processing sensory information combined.

2. Van Essen 2004

We are able to visually perceive our environment, and
changes in it, considerably more quickly and precisely than using
other senses. It is estimated that, each moment, our visual system
sends our brains around eight times more information than all
the other senses combined.³

3. Zimmermann 1989

[Figure: Estimated information flow by sense. Seeing 10 Mb/s; touch 1 Mb/s; hearing and smell 100,000 b/s each; taste 1,000 b/s.]

The amount of information processed by the sensory system
cannot be measured directly, so such an estimate is inevitably
imprecise. It is also worth pointing out that only a tiny fraction of
the information transmitted by the senses reaches the conscious
mind. If we look only at what is processed consciously, sight is
not quite so overwhelmingly superior compared to the other
senses, especially hearing. It is nevertheless obvious that vision
is the strongest of human senses, and new information is usually
adopted most readily when presented in visual form.
The supremacy of sight over the other senses is easy to see in
many everyday situations. When we walk through a door, we can
instantly form an overview of the space we have entered: how
large it is, where the doors, windows and furniture are located,
what surface materials have been used, whether there are other
people or animals in the room, and so on. Forming a similar
overview relying on the other senses would be much slower. It is
no coincidence that the languages we speak include many idioms
and expressions in which knowledge formation is described in
terms related to seeing: I see, an overview, a vision, to get the picture.
The statistician John Tukey has said that “[t]he greatest value
of a picture is when it forces us to notice what we never expected
to see.”⁴ When shown in visual form, the data often reveals
features that would remain hidden in a text or a table.

4. Tukey 1977, p. vi. Emphasis removed.
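
The "around eight times" estimate quoted earlier in this section can be checked against the per-sense figures given with it (seeing 10 Mb/s, touch 1 Mb/s, hearing and smell 100,000 b/s each, taste 1,000 b/s). The following is only a minimal arithmetic sketch; it assumes that "Mb" means one million bits, and the names `BANDWIDTH_BPS` and `vision_ratio` are made up for this illustration.

```python
# Approximate sensory bandwidths in bits per second, taken from the figures
# quoted in this section. Assumption: 1 Mb/s = 1,000,000 b/s.
BANDWIDTH_BPS = {
    "seeing": 10_000_000,
    "touch": 1_000_000,
    "hearing": 100_000,
    "smell": 100_000,
    "taste": 1_000,
}


def vision_ratio(bandwidth):
    """Return the bandwidth of seeing divided by the sum of all other senses."""
    others = sum(bps for sense, bps in bandwidth.items() if sense != "seeing")
    return bandwidth["seeing"] / others


ratio = vision_ratio(BANDWIDTH_BPS)
print(f"Seeing carries about {ratio:.1f} times the other senses combined")
# prints "Seeing carries about 8.3 times the other senses combined"
```

The other senses add up to 1,201,000 b/s, so the ratio comes out slightly above eight, consistent with the rounded claim in the text. As the text itself stresses, the underlying figures are rough estimates.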


[Figure: Vote shares in European Parliament election 2014 and referendum results by precinct in Denmark. Scatterplot; horizontal axis: vote share of FB and DF parties (20% to 60%); vertical axis: “no” votes in the referendum (20% to 80%). Source: Open Knowledge Danmark 2014, Valgdata workshop.]

In 2014, Danish voters cast their votes both in an election for
the European Parliament and in a referendum on whether
the country should join the pan-European Unified Patent
Court. Immediately after election day, activists belonging to the
non-profit Open Knowledge Denmark analyzed the vote. In the
adjacent scatterplot based on their work, each dot shows one of
the 1,396 voting precincts in Denmark. Their position along the
horizontal axis shows the percentage of votes received by the two
parties, People’s Movement against the EU (Folkebevægelsen
mod EU, FB) and the Danish People’s Party (Dansk Folkeparti,
DF), which advocated a “no” vote in the referendum, and the
vertical axis shows the share of “no” votes cast.
Look at the figure for a moment. Do you notice anything odd?
The points cluster quite densely around the diagonal, which
means that there is a strong correlation between the two
variables. This is not surprising: in precincts where a high number of
votes were cast for the parties that opposed joining the Court, a
high number of “no” votes were cast, and vice versa. The figure,
however, reveals one precinct that is completely different from
the others. The Taarbæk precinct shown in the upper left corner
stands out dramatically from all the other dots in the figure.
When the issue was investigated, it was found that election
officials in Taarbæk had, when reporting the result to the central
electoral board, inadvertently reported the “yes” and “no” votes
cast in the referendum the wrong way around. Despite multiple
checks, no one noticed the error, because the results were
processed in the form of numbers in a table. However, when the
data is converted into visual form, the outlying data point is
impossible to miss. We can instantly see that one data point stands
out in some way from the others.
It is worth remembering, however, that a visualization can
only show us certain features in the data, so obtaining
a more detailed explanation for the deviation requires
familiarity with the original tables and texts. Image and text are
consequently mutually supportive, not substitutive, means of
communication.
In his book Thinking, fast and slow, psychologist Daniel
Kahneman, a Nobel laureate in economics, describes two
systems that govern our thinking. System 1 is fast and intuitive,
while system 2 is slow and analytical. Visual communication
particularly supports the fast system 1 thinking, while language
and other conceptual structures, such as mathematics, support
the slow system 2 thinking. Both systems have their strengths
and weaknesses, and often the best result is obtained by using
both in conjunction.
Visualizations have been described⁵ as a mixture of image
and text. Indeed, a well-designed visualization supports both fast
and slow thinking. Visual elements help viewers to understand
the structure of data and to form a quick, intuitive overall picture
of a phenomenon, even a complex one. The strength of text, on
the other hand, lies in conveying precise, abstract, and
analytical information. It focuses on details and enables a more
thorough interpretation.

5. E.g. Huovila 1996

Using different cognitive systems in parallel helps in
processing, internalizing, and memorizing information. Although
images alone are recalled better than text, what is recalled best
are compositions that include the same information in both
visual and verbal form.⁶ According to the dual coding theory,
developed by the psychologist Allan Paivio, this is due to the
fact that visual information is stored as images in one part of the
brain, the non-verbal part, and verbal information as concepts
in another, the verbal part. A message is dual coded when it is
perceived as both words and images, that is, processed in both
systems. The resulting memory trace becomes stronger than
when the message is perceived in just one of the two ways.

6. E.g. Paivio 1991, Levie & Lentz 1982, Atkinson et al. 1999

+
What is information design?
Information design is about presenting information in the
clearest way possible. According to the definition suggested by the
researcher Robert E. Horn,⁷ information design is “the art and
science of preparing information so that it can be used by human
beings with efficiency and effectiveness.” Clarifying the structure
of data presented in written form or in a table also falls under
this definition, but the concept refers above all to creating visual
displays of information. Information design consists of selecting,
organizing and presenting information, taking into account the
needs and characteristics of the selected target audience, and the
context of use.

7. Horn 2000

Is graphic design information design? This is a legitimate
question, as graphic design (or visual communication design,
as it is also known) is in essence about designing the visual
presentation of information. When the graphic design firm


Pentagram first coined the term information design in the 1970s to
describe its work, the intention was to highlight the fundamental
purpose of graphic design as supporting information and
communication. With the establishment of Information Design Journal
(1979), however, information design has become established as its
own area of expertise, separate from graphic design.
An example of the divergent aims and methods of graphic
design and information design worth mentioning here is the
set of problems related to pharmaceutical packaging
design. According to a survey by a Finnish pharmacists’ trade
magazine, patient safety is compromised, or at risk of being
compromised, on a weekly basis in 30% of pharmacies, due
to packaging for different medicines being too similar.⁸ One of
the reasons for this concerning situation is probably that the
graphic or packaging designer responsible for the look of the
package usually focuses primarily on the consistent application
of the manufacturer’s brand identity, instead of first considering
the visual distinctiveness of packages in different use scenarios.
The perspective of information design is different: brand and
esthetics are always subordinate to the communicative function
of packaging or graphics.

8. Kairenius 2012

Terminology and typology
The origins of information design and data visualization lie in
several disparate fields. The methods and practices discussed
in this book originate in a number of differing disciplines:
graphic design and illustration, journalism, cartography,
computer science, business administration, and statistics, among
others.
A side effect of this diverse heritage is that the terminology
used by different visualization practitioners, authors, and
researchers varies, at times widely. Different terms are
sometimes used to describe the same thing, or the same word to
describe different things.
Because of this, we have paid special attention to terminology
in this book. We have tried to be unambiguous in our choice of
words and give our definitions of specialist terms when they first
appear in the text. Some of the principal terminology is outlined
below.

+++
In his book, Good charts,⁹ Scott Berinato presents a typology of
charts based on two fundamental questions about each chart:

1. Is the information conceptual or data-driven?
2. Is the purpose to declare (or explain) or to explore the
information?

9. Berinato 2016, pp. 54–63

Although Berinato mostly discusses the typology in terms of
business graphics, we believe it can be applied to any type of
visual presentation of information. We call the two dimensions
conceptual–measurable and explanatory–exploratory (which we
discuss below).

[Figure: The structure of a supercell cloud. Side view with height scales in kilometers (0 to 15) and feet (10,000 to 50,000) and a 10 km (6 mi) scale bar. Labeled parts: overshooting top, anvil, rear flank, flanking line, tornado, storm movement, thunder, light, moderate and heavy rain, small and large hail. Inset text: A tornado travels an average of 8 km (5 mi), at the speed of 50 km/h, before dissipating, causing significant damage on the ground. Speeds in the funnel cloud can reach 400 km/h (250 mph). The average diameter at the base of the funnel is about 150 m (500 ft).]
This explanatory graphic, adapted from a high school geography textbook, shows the structure of a supercell, a type of storm cloud. The choice of data presented here derives directly from the high school curriculum, and the design has been chosen to best convey that information.

The primary purpose of explanatory graphics is to
communicate information between people. They are used to declare,
explain and affirm the facts. The creator of the graphic already
knows the information at hand, and the primary design challenge
is to find a way to convey that knowledge to the audience.
In contrast, the purpose of exploratory graphics is primarily
to facilitate discovery and analysis of the information. They can
be used in communication, but the primary aim of exploratory
graphics is not to convey a message that has been determined in
advance by the creator of the graphic. Their function is to act as
a tool that enables the reader to find interesting features in the
data,¹⁰ and the creator of the graphic does not know in advance
what the visualization will reveal.

10. See Cairo 2016, p. 31
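
The Danish referendum example earlier in this chapter is a case of a feature surfacing without anyone looking for it. What the eye does in that scatterplot can be roughly imitated in code. The sketch below is not the method used by Open Knowledge Denmark. It assumes that points normally lie at a roughly constant vertical distance from the diagonal, and it flags any point whose distance is far from the median. The data are synthetic, and the function name `flag_outliers` and the threshold `k` are hypothetical choices made for this illustration.

```python
from statistics import median


def flag_outliers(points, k=5.0):
    """Return the names of points whose offset from the diagonal is unusual.

    points: list of (name, x, y) tuples, with x and y in percent.
    k: how many median absolute deviations count as unusual (a judgment call).
    """
    offsets = [y - x for _, x, y in points]
    center = median(offsets)
    # Median absolute deviation: a spread measure that a single extreme
    # value cannot inflate, unlike the standard deviation.
    mad = median(abs(d - center) for d in offsets)
    return [name for (name, _, _), d in zip(points, offsets)
            if abs(d - center) > k * mad]


# Synthetic precincts: "no" share tracks party share plus 3 to 7 points.
precincts = [(f"precinct-{i}", 20 + i, 25 + i + (i % 5 - 2)) for i in range(41)]
# One made-up precinct whose "yes" and "no" shares were swapped (30 -> 70).
precincts.append(("swapped-report", 25, 70))

print(flag_outliers(precincts))  # prints ['swapped-report']
```

A rule like this only finds the kind of deviation it was written to find. The strength of the scatterplot is that the viewer notices the odd point without having to decide in advance what "odd" means.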


[Figure: Aircraft in European and North American airspaces. Each symbol is a single aircraft. Tuesday, 18 June 2013 at 14:00 UTC. Major cities are labeled. Source: Flightradar24.com.]
This exploratory graphic has also been adapted from a geography textbook. It shows the positions and bearings of the aircraft flying in the European and North American airspaces at a specific time in 2013. Besides showing the incredible number of simultaneously airborne planes, the visualization reveals interesting patterns. Not surprisingly, densely populated areas get more air traffic, but there are also exceptions to the rule. For example, Denver, a city in the relatively sparsely populated Mountain West part of the United States, stands out on the map thanks to its airport serving as a hub for several airlines.

Exploratory graphics are usually created with a computer
and they are often, but not always, interactive. According to the
typology outlined above, the information shown tends to be
measurable. The graphic is based on a scheme, that is, an agreed-upon
set of rules for converting the information into visual form, as
opposed to hand-crafting each element, which is common in
explanatory graphics.
The difference between the categories is that an explanatory
graphic tells a story, while an exploratory graphic is a tool for the
reader to find their own story in the data.
There is a longstanding tradition of using the term
infographics for explanatory graphics and the term visualization for
exploratory graphics. This is for example how Alberto Cairo, a
foremost authority on the topic, defines the terms in his book
The functional art.¹¹ However, this distinction is less than universal,
and the word infographics is also used to mean other things,
such as to describe what in this book are termed combination
charts (see p. 93).¹² Sometimes the term is also used to describe
graphics showing conceptual as opposed to measurable
information (according to our typology).

11. Cairo 2012, pp. xv–xvi & 18. See also Cairo 2016, pp. 27–40.
12. As does, indeed, Alberto Cairo in his more recent book The truthful art (2016).

+++
In this book, we consider visualization to be a catch-all term for
all graphics showing measurable information, whether
explanatory or exploratory, and infographics to be a partially overlapping
term that encompasses all explanatory graphics. The hypernym,
or generic, overarching term for both is information graphics.
Whatever terminology is used, the division into explanatory
and exploratory graphics is not black and white. They are better
viewed as a continuum, rather than two clearly defined
categories. Graphics often have both explanatory and exploratory
features. These categories should therefore be considered as two
approaches to presenting information, with both common and
divergent features.

[Figure: Typology of information graphics. Rows: purpose of the graphic (explanatory, exploratory). Columns: type of information (conceptual, measurable). Infographics cover the explanatory row and visualization covers the measurable column; both fall under information graphics. Note in the figure: Purely conceptual information cannot be visualized, only illustrated.]

The visualization researcher Robert Kosara has suggested¹³
the following definition for a visualization (exploratory, in
particular):
• It is based on (non-visual) data,
• it produces an image, and
• the result is readable and recognizable.

13. Kosara 2007b

Visualizations are therefore based on data that is abstract in
nature, such as statistics, or that is not visible under normal
circumstances, such as the structure of internal organs. An image
that directly imitates visual perception, such as a landscape
painting, is not a visualization.¹⁴ The main result of the
visualization process is always an image, though it can be supplemented
with explanatory text, for example. The image should be
recognizable as an information graphic, and the information it
contains should be presented as unambiguously as possible. This
definition excludes, for example, data art (see next spread).

14. A representational image may, however, be part of an information graphic. It has been said (for example, Sullivan 1987, p. 41) that the simplest form of information graphic is a photograph with an arrow or a circle marked on it, indicating a feature of interest in the image. Some types of information graphics (see pp. 119–125), such as artist’s renderings and identification pictures, again, come very close to a style that directly imitates visual perception, and therefore do not meet the definition presented by Kosara. They can be considered information graphics, but not visualizations.
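
The idea of a scheme, an agreed-upon set of rules for converting information into visual form, can be made concrete with a very small example. The sketch below is not taken from the book or from Kosara. It assumes positive numeric values, and the function name `bar_chart`, the `#` mark and the sample data are all hypothetical. The single rule is that bar length is proportional to the value.

```python
def bar_chart(data, width=40):
    """Render (label, value) pairs as horizontal text bars using one fixed rule.

    The scheme: bar length = value / largest value * width, rounded to whole
    characters. Every row is produced by the same rule; nothing is hand-drawn.
    Assumes all values are positive.
    """
    largest = max(value for _, value in data)
    label_width = max(len(label) for label, _ in data)
    lines = []
    for label, value in data:
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<{label_width}} {bar} {value}")
    return "\n".join(lines)


# Hypothetical values, chosen only to exercise the function.
sample = [("A", 10), ("B", 5), ("C", 20)]
print(bar_chart(sample, width=20))
```

The output meets the three conditions listed above in a modest way: it is derived from non-visual data, it produces an image of sorts, and the result can be read back, because a bar half as long stands for a value half as large. Feeding in different data changes the picture, but not the rule.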

Not all visual communication aims to convey information

Information design, then, is about presenting information in the clearest possible way. Nevertheless, not all communication is about conveying information. Messaging can aim to convey feelings, moods, values, or other less tangible things instead. If the aim is not to communicate concrete data, information graphics are the wrong tools for the job. According to Alberto Cairo, the purpose of an infographic or visualization “is not to make numbers ‘interesting,’ but to transform those numbers (or other phenomena) into visual shapes from which the human brain can extract meaning.”15

15. Kuenn 2013

This does not, of course, mean that information graphics could not also function as a visual element that adds color to a page, evokes a feeling, and attracts the reader. These tasks are, however, subordinate to the main purpose of information graphics as a tool for conveying information. (See pp. 101–103.)

[Figure: illustrated brochure cover.] The difference between an infographic and an illustration is one of substance and function, not of style. The illustration on the cover of this brochure has a visual style similar to many infographics, but it does not convey any facts. What it aims to do instead is to convey a feeling or an impression.

In the minds of many communicators, information graphics are regrettably often confused with illustrations. The concept is quite loosely defined. The word “illustration” may also refer to one of the genres of information design (see information illustration, pp. 119–135), but here the term refers to a type of image which Harold Evans, a pioneer in research on visual journalism, calls flavor graphic, differentiating it from fact graphic, that is, information graphics and visualizations.16 Defined in this way, illustrations seek to communicate not concrete facts, but things that are more difficult to identify, such as emotions and values, or simply to attract people to read the publication. In journalism, especially in periodicals, illustrations have been a key part of the visual style since the emergence of modern magazines, and they have a clearly defined role, which differs from that of fact graphics.

16. Salo 2000, p. 155

Data art, in turn, uses data as its starting point, but deals with that data as material for artistic expression. Although it often applies presentation methods that resemble those used in visualization, data art does not seek to provide an unambiguous interpretation of data, but to create esthetic experiences and to test new methods of presentation. The new presentation methods that emerge within data art are often later adopted as part of the toolkit of visualizations.

The boundary between data art and visualization is not always clear-cut, but as a rule of thumb, visualization seeks to explain the structures in data, while data art takes the data and creates new kinds of visual structures from it. If, then, the presentation method is decided based on data, it is a visualization, but if data is picked to fit a chosen presentation method, it is data art.17

17. It is not possible to discuss data art more extensively in this book, but for those interested in the topic, we recommend Casey Reas and Chandler McWilliams’ book Form+Code (2010).

[Figure: Roads to Rome.] Roads to Rome (2013) is a data art project that explores the question of whether “all roads lead to Rome,” as the saying goes. Benedikt Groß, Philip Schmitt, and Raphael Reimann created a computer program which used OpenStreetMap data to calculate the shortest route by road to Rome from each point in the network. The end result resembles a web of blood vessels branching out from the heart of the ancient Roman Empire all the way to its periphery. The group has since utilized the same technique in more practical applications, such as for analyzing urban structures. roadstorome.moovellab.com

+ When should data be presented in visual form and when not?

When should data be visualized, and when is text enough? In short, if something can be expressed just as clearly or even more clearly in words, visualization is unnecessary, sometimes even counterproductive. In 1801, the inventor of statistical charts, William Playfair, wrote that a visual presentation “gives a simple, accurate, and permanent idea, by giving form and shape to a number of separate ideas, which are otherwise abstract and unconnected.”18 When this is indeed the case, the data usually should be visualized instead of using just text.

18. Playfair 1801/2005, p. 30

The success of a visualization is ultimately determined by whether its form and shape help the reader to perceive the data better, or whether it just adds an extra layer of code to be deciphered. A figure is clear when the viewer understands what it shows and, through it, finds answers to questions or gains insights about features of the data.

If these criteria are not met, the graphic is unsuccessful. The reason for this may be that the choice of presentation method


is wrong or that it is poorly executed. Information graphics work best for showing spatial and geographical relationships, processes, chronologies and above all numerical data. The visualization of ideas, values, and abstract concepts is much more difficult, often even impossible. To paraphrase philosopher Ludwig Wittgenstein:19 Everything that can be shown can be shown clearly. If a graphic seems hopelessly unclear, the reason can be that it attempts to convert into visual form something that cannot in fact be visualized.

19. Wittgenstein 1922, p. 45

The graphic below comes from a presentation discussing the United States Department of Defense Architecture Framework. It is an example of a visualization that is hopelessly confusing, because the topic is not one that can be fully converted into visual form. Blogger Paul Ford describes it as follows: “… this image could be used anywhere in any paper or presentation and make perfect sense. This is a graphic that defines a way of describing anything that has ever existed and everything that has ever happened, in any situation.”20 (It is an example of what cartographer Jacques Bertin calls a pansemic graphic: “In its attempt to signify ‘everything’ it no longer signifies anything precise.”)21

20. Ford 2014. Emphasis in original
21. Bertin 1983/2011, p. 2

[Figure: diagram from the Department of Defense Architecture Framework presentation. Boxes labeled Condition, Guidance, Rule, Standard, Agreement, Activity, Capability, Project, Resource, Information, Materiel, Data, Performer, System, Organization, Location, GeoPolitical, Service, and PersonRole are linked by labeled relations such as “constrains,” “is-performable-under,” and “is-at.”]

+ The grammar of information graphics

According to the computer scientist Leland Wilkinson, a language consisting only of words and no grammar expresses only as many ideas as there are words.22 Grammar enables the formation of multi-word expressions and dramatically expands the scope of a language. (Almost) anything can be expressed by combining words, in accordance with the rules of grammar.

22. Wilkinson 1999/2006, p. 1

Seeing is pre-lexical: we usually identify what we see immediately without linguistic interpretation. For example, research23 has shown that humans can reliably identify other individuals merely based on their gait, that is, the way they walk. Few of us, however, are able to describe someone’s gait in any detail to another person. Visually identifying something or someone does not require the viewer to be able to describe what they see verbally.

23. E.g. Stevenage, Nixon & Vince 1999

In information design, our ability to intuitively understand what we see is crucial. The visualization of data helps us to identify features in the data that we may not necessarily be able to name. We do not need to know what words such as correlation or outlier mean to be able to visually identify the phenomena that they describe in a scatterplot, for example.

The pre-lexical understanding of what is seen is nevertheless only the first level in interpreting perception. Understanding complex data, irrespective of the presentation method, always requires active interpretation, which is a much slower process than forming an initial overview based on perception.24

24. According to Zimmerman (1989), perception is mostly processed unconsciously, and only a very small part of all sensory information is consciously processed: 40 bits per second for sight, 30 b/s for hearing, 5 b/s for touch, and only 1 b/s for taste and smell.

Symbols and glyphs

In his book The design of everyday things, Donald A. Norman, a pioneer of user-centered design, distinguishes between additive and substitutive dimensions in user interfaces.25 Additive values such as length can be changed incrementally and are used to indicate varying amounts of something. Substitutive values such as color hue26 cannot be changed in size but only substituted for another value and are used to indicate categories.

25. Norman 1988/2002, p. 23. Additive and substitutive scales were originally identified by psychologist Stanley Smith Stevens (see Norman 1991).
26. Hue means the shade of a color, such as red or green. See p. 66.

This basic logic is an example of what Norman calls natural mapping: “taking advantage of physical analogies and cultural standards” to visually encode abstract information. It is easy to understand that elements that may fluctuate in size are used to show measurable, quantitative values, while things that can only be substituted by one another correspond to differences in kind,


namely categories. We intuitively understand that a higher bar or a bigger circle indicates a greater number or amount when compared to a smaller element, and that a different-colored bar differs somehow from the other bars in a group. The interpretation of additive features is largely intuitive and based on the operational logic of lower-level visual perception (for more information, see visual variables, pp. 58–62).

Visual elements that include only substitutive dimensions are called symbols. According to Colin Ware, they represent an object through a simple “this is X” relation.27 The expressive power of a visual code that only uses symbols is rather limited. In the vein of Wilkinson’s definition presented above, it can be viewed as a grammarless language, in which all possible expressions and their meanings must be defined beforehand.

27. Ware 2004/2013, p. 140

Visual elements that include at least one additive dimension are called glyphs.28 An example of a glyph is a bar in a bar chart. The meaning of a data visualization emerges above all from the interrelationships between symbols and glyphs, not from these parts themselves.

28. Ward 2008. In visualization research, the term glyph is often used specifically to mean multivariate glyphs: visual objects that are used to show several data dimensions at once. Probably the best-known example of glyphs in this narrower sense are the so-called Chernoff faces (see, e.g., Kosara 2007a). Here we will use the term mostly for simple glyphs, which form most of the basic building blocks used in more complex visual presentations of data.

[Figure: examples of symbols (map pictograms such as Mosque, Other place of worship, Airport, Hospital, and Museum) and glyphs (circles scaled to town population: 1,000, 2,000, 5,000, and 10,000).]

+++

The cartographer Jacques Bertin, who developed much of the theoretical foundation of information design in the 1960s, divides visual signs into two main categories: monosemic (having a single meaning) and polysemic (having multiple meanings). The meaning of monosemic signs such as mathematical symbols is given (known in advance) and their interpretation is therefore unambiguous.29 The meaning of polysemic signs, such as the individual details that make up a drawing, follows from the other signs that are present: a circle may denote very different things from one picture to another. Polysemic signs are read “between the sign and its meaning,” and thus their interpretation is always ambiguous and debatable to some extent.

29. Bertin 1983/2011, p. 2

According to Bertin, information graphics are read “among the given meanings” of monosemic signs.30 The meaning of each element in isolation is known in advance thanks to legends, color keys, and so on, but most of the actual information in the graphic is embedded in the relationships between the signs (symbols and glyphs). This is called induction:31 the reader is able to induce from a graphic much more than merely the individual pieces of information its designer has specifically included in it.

30. Bertin 1983/2011, pp. 2–3
31. Robinson et al. 1995, p. 451

+ Simplify, compare and organize

Three factors above all define how easy or hard a visualization is to read and what insights can be gleaned from it: how the data is simplified, compared and organized.

Simplify

When irrelevant details are left out from a visualization, the remaining elements are easier to identify and their mutual comparison is easier. It is believed this is due to a decreased need for visual decoding, which frees up more working memory capacity to process relevant information.

[Figure: aerial photo and street map of the same area, with a 500 m (1,640 ft) scale bar.] The aerial photo and map above both show the same area in Auckland, New Zealand. The photo includes much, much more information than the map. Most of this information is, however, useless in the majority of use cases. By leaving out most of the information in the photo, the map becomes more useful in its primary purpose of locating religious buildings in the area. For some other purposes, however, such as differentiating between residential and industrial areas, the map is useless.

Consider maps: although a map also includes information that is missing from a photograph (place names, for example), an aerial photograph almost always includes vastly more information about the terrain than a map of the same area. Most of this information is useless to the map’s user: the individual trees, cars, boats, roofing materials of buildings, weather, and the direction of light at the moment the photograph was taken, and so on. When such unnecessary information is removed, the information left in the map acquires greater visual prominence and reading the map becomes easier. It is often said that the most important decision in making a map is not what to put in, but what to leave out. The same is true of all information graphics.

It is a common mistake to cram so much information into a visualization that the eye is no longer able to pick out visual structures in the image. The information extractable from this crammed network diagram (see pp. 227–229), a “hairball” in industry jargon, can be summarized in one sentence:


“There are many connections.” Something that can be expressed in one sentence does not deserve to be visualized.

[Figure: two maps of the same place.] The makers of these two maps showing the same place in South-Western Finland, a “terrain map” on the left and a nautical chart on the right, have made very different choices in selecting what to leave out of their map.

On the other hand, data can also be too narrow or simple to be visualized. Because visualizations are based on visual comparison, a single number cannot be visualized.32 If the data only includes a few numbers or other facts, in many cases a few sentences or a single table can convey the information as well as, or better than, a graphic. Edward Tufte, Professor Emeritus of statistics at Yale University, has suggested as a rule of thumb that “[t]ables usually outperform graphs in reporting small sets of 20 numbers or less.”33 The creation of a graphic requires more effort than a text or a table, so in cases like this, it often makes sense to allocate limited resources to something other than creating information graphics.

32. To visualize a single number, a designer can choose a familiar object or comparison level as a reference point; for example, how many football fields would fit inside an area, or how much the share of a specific demographic in a city’s population differs from the national average. In this way, it is no longer a question of visualizing one but two numbers, which is very much achievable.
33. Tufte 1983/2002, p. 56. The rule should not be taken too literally. There are many examples of datasets of merely two or three numbers that benefit from being visualized.

[Figure: bar chart “Legal drinking ages in the Nordics” with bars for Iceland, Finland, Norway, Sweden, Faroe Islands, Greenland, and Denmark on an axis from 0 to 20 years.] On the left is an example of a data set that is too simple to benefit from being visualized. The interesting thing here is the most common value (18 years), and the exceptions from it (Iceland and Denmark), not the percentage difference between the numbers 18 and 20, or 18 and 16. The same information would be easier to read if presented as a list or a table.

It is worth noting that simplification is something you do to the underlying data, not to the graphic itself. Simplified data does not necessarily require a minimalistic style of presentation. A decorative graphic can be clear and illustrative, as long as the reader is able to easily identify which of the elements in the image carry information and which are just decoration. The decorative elements should never fight for attention with the actual content. Often, however, a minimalist style is a safer choice than a decorative one. It means there are fewer opportunities for mistakes that compromise the clarity of the visualization, as it is unfortunately often the case that graphics are decorated at the expense of clarity. The importance of esthetic choices in the design of visualizations is discussed on pages 101–103.

[Figure: infographic on the making of a tsantsa.] This infographic, created by illustrator Viktor Landström, explains the making of an Ecuadorean tsantsa (a shrunken head). The content is simplified, but the presentation style is anything but minimalistic.

Compare

Edward Tufte writes that “At the heart of quantitative reasoning is a single question: Compared to what?”34 A visualization is, at its heart, a tool for making comparisons. Thus the single most important question to answer when designing one is: what comparisons should be enabled?

34. Tufte 1990, p. 67

Written and spoken language is linear in nature. Speech, text, and video proceed from beginning to end, at the pace and in the order determined by their creator. By contrast, data visualization hands control over to the viewer. They can explore the content at their own pace, only superficially or drilling down to details, and going through the elements in any order they like. Someone reading a text or listening to a speech needs to rely on their working memory and mental calculation in order to be able to compare the numbers presented or the connections explained. Someone viewing an information graphic does not need to


engage in such brain-taxing activities, as a visual comparison of elements in the image takes place almost automatically.

On the other hand, text can use many different types of content structures, while an abstract visualization mainly just presents various relationships between data points.35 For this reason, a single bar, map symbol or shape does not convey information, but only becomes meaningful by its relationship with other elements in the image; in other words, it is polysemic. As Vesa Kuusela, an expert on statistical charts, has said, “A chart acquires its meaning from comparison.”

35. This is not necessarily true of information graphics that visualize conceptual information. See pp. 219–225.

[Figure: a single bar and a single map symbol, each labeled “York.”] Meaningless bar chart and map.

Visual comparability requires that the visual encoding methods, such as colors, symbols and scales, are used consistently. (For more information, see pp. 83–86.)

A well-made visualization facilitates a fairly large number of different visual comparisons. Our visual system and cognitive capacity, however, set an upper limit to the complexity of presentations we can interpret. Because not all the methods available for visually encoding data are equally suitable (see pp. 58–62), the designer must also decide which of the comparisons are the most important. The best encoding methods, such as the positions of the glyphs, should be used for the most important comparisons; and the less effective ones, such as their surface area, to encode secondary or supplementary data.

+++

A visualization can show many types of relationships, which can be roughly divided into six main groups:

Numbers refers to either quantities (amounts) or sizes. Relationships between numbers are by far the most common type shown in information graphics. Most visualizations include at least some comparisons between numbers.

Rank (or ordering) is different from numbers only in that the hierarchical relationship of the data points (larger or smaller rank) is known, not the actual magnitudes or differences in values. (This has a variety of practical implications. See ordinal scale, p. 94.)

Location (or position) is the basic type of relationship shown on maps. Though it usually refers to geographic and astronomical position, location can also refer to locations such as those within the human body from which a biological sample was taken.

Time refers to relations between positions in time, such as dates. It is one of the basic relationships shown in time-series charts, such as line charts, and in timelines.

Category refers to some qualitative similarity between data points. For example, companies can be categorized by industry, or municipalities by region.

Connection refers to various links between the data points. The connections can be hierarchical (directed), as in a food web, or non-hierarchical (undirected), as in a network of friends. (For further information, see network diagrams, pp. 227–229.)

Qualitative data that does not include these relationships must, in order to be visualized, first be transformed to fit one of the above groups, for example by giving it a classification or score. An example of this is the Martin–Quinn score, developed by the political scientists Andrew D. Martin and Kevin M. Quinn. It is a scoring system for U.S. Supreme Court justices based on their voting record on the bench, giving each an ideological score ranging from –5 (extremely liberal) to 5 (extremely conservative). This conversion from a highly qualitative data set (the court’s decisions) to numbers enables the data set to be analyzed using statistical methods and visualized.

[Figure: “The ideological spread of SCOTUS justices has slightly narrowed since the late 1970s.” Martin-Quinn scores for Supreme Court justices on a scale from -5 (more liberal) to 5 (more conservative), shown for 1976–1981 (Marshall, Brennan, Stevens, White, Stewart, Burger, Rehnquist, Blackmun, Powell) and 2011–2016 (Sotomayor, Kagan, Kennedy, Roberts, Scalia, Alito, Thomas, Ginsburg, Breyer), with the chief justice and the median justice marked. Source: Martin, Andrew D. & Quinn, Kevin M. 2018. Martin-Quinn scores. Center for Empirical Research in the Law, Washington University.]

Organize

Besides simplification and comparison, the third defining characteristic in how a visualization is read and understood is how the data is visually organized.

In his book Information architects,36 Richard Saul Wurman, designer, author, and the founder of the TED conference, has proposed a set of organizing principles for information, which he calls the “Five Hat Racks,” and provides the acronym LATCH as a mnemonic. Wurman himself has since disowned37 this system, but it serves as a good starting point for an improved taxonomy.

36. Wurman 1996
37. Wurman & Grimwade 2016


[Figure: map of the Danube region from Germany to the Black Sea, with a 250 km (155 mi) scale bar.] A map of cities located along the river Danube. The size of the bubble indicates the population of the city. On the next spread, the populations are visualized as a set of horizontal bar charts which have been ordered using each of the different organizing principles outlined in the text. Depending on the organizing principle used, each chart gives a different impression of the data and reveals different patterns in it.

Wurman’s Five Hat Racks are:
• Location
• Alphabet
• Time
• Category
• Hierarchy

In an earlier book, Information anxiety, published in 1989, Wurman uses the term continuum instead of hierarchy. The term has apparently been changed mainly so that the initials of the organizing principles form the memorable acronym LATCH. Below, we use the original term continuum, as it better captures the nature of this organizing principle.

An attentive reader may notice that the organizing principles bear similarities to the comparable relationships in data described above. This is no coincidence. The relationships are closely related to the organizing principles, so that organizing the data in a particular way in a visualization strongly emphasizes one relationship in the data over others.

By making a few additions and clarifications to Wurman’s Five Hat Racks, we can match each organizing principle with one of the relations, and create our own mnemonic, the Seven C’s:

Relationship   Five Hat Racks          Seven C’s                Based on
Numbers        Continuum (Hierarchy)   Continuum by magnitude   data
Rank           Continuum (Hierarchy)   Continuum by rank        data
Location       Location                Coordinates              data
Time           Time                    Chronology               data
Categories     Category                Categories               data
Connection     (none)                  Connection               data
(none)         Alphabet                Convention               convention

Continuum is the most typical way of organizing elements in a graphic. It is usually the most natural way of organizing data points in a visualization, unless there is a specific reason to do otherwise. Continuum has two subtypes:

Continuum by magnitude, in which the elements are ordered in a descending or ascending order based on the magnitude of their value, in other words, from the largest to the smallest value or vice versa. The particular advantage in organizing the data like this is that the values with the smallest differences are always located adjacent to each other, which enables very small differences to be detected.

Continuum by rank, in which the ordering is based on the structure of the scale itself. For example, if the data deals with educational attainment, it usually makes sense to order the data by the level of education from primary to secondary to tertiary level (continuum by rank), not the number of people at each level (continuum by magnitude).

Coordinates (location) is the organizing principle used in maps. It is less common in other types of information graphics, but can be used as an organizing principle in many types of statistical charts, for example.

Chronology is the organizing principle used in line charts and other time series. In a graphic created for a Western reader, time should usually be shown as moving from left to right.38 In the domain of statistical graphics, the main exception to this rule is the graphics used immediately next to tables in financial statements, such as annual reports. In such tables, time can also be presented as moving from top to bottom or from right to left, and this same order can also be used for related graphics, excluding line charts, in which the direction of time should always move from left to right. In some visualizations other than statistical graphics, such as timelines, time can also be presented as moving from top to bottom.

38. The direction of time in graphics is the same as the writing direction of the Latin alphabet. In the case of, for example, Arabic, Hebrew or other right-to-left languages, time can also be shown as moving from right to left in figures. In these languages, however, practice varies more than in Western languages; Israeli newspapers feature time series that are ordered both from right to left and from left to right.
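The difference between the two subtypes of continuum can be made concrete with a small sketch. The Python below is only an illustration, not a method taken from the book: it uses the educational attainment example from the text, but the counts, the function names (by_magnitude, by_rank, text_bars) and the plain-text rendering are all assumptions introduced here.

```python
# Hypothetical counts of people per education level (illustrative numbers only,
# not data from the book).
counts = {"secondary": 520, "tertiary": 310, "primary": 170}

# The structure of the scale itself: primary < secondary < tertiary.
LEVEL_ORDER = ["primary", "secondary", "tertiary"]


def by_magnitude(data, descending=True):
    """Continuum by magnitude: order the items by their value."""
    return sorted(data.items(), key=lambda kv: kv[1], reverse=descending)


def by_rank(data, scale):
    """Continuum by rank: order the items by their position on the scale."""
    return sorted(data.items(), key=lambda kv: scale.index(kv[0]))


def text_bars(rows, width=30):
    """Render (label, value) rows as a crude horizontal bar chart in plain text."""
    top = max(value for _, value in rows)
    return "\n".join(
        f"{label:<10} {'#' * round(width * value / top)} {value}"
        for label, value in rows
    )


magnitude_rows = by_magnitude(counts)
rank_rows = by_rank(counts, LEVEL_ORDER)

print("Continuum by magnitude")
print(text_bars(magnitude_rows))
print()
print("Continuum by rank")
print(text_bars(rank_rows))
```

The same values appear in both charts; only the sort key changes. Ordering by magnitude puts similar values next to each other, while ordering by rank keeps the natural sequence of the scale, which is usually what a reader of attainment data expects.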


continuum
continuumby magnitude
by magnitude coordinates
coordinates connection
connection convention
convention
By population
By population By location
By location
along the
along
Danube
the Danube
downstream
downstream
from Ulm
from Ulm By railroad
By railroad
connections
connections Alphabetical
Alphabetical
order by
order
name
by name

ViennaVienna 2,587km2,587km Ulm


· 1,607mi· 1,607mi Ulm connected
connected Ulm Ulm Belgrade
Belgrade
via thevia the
BudapestBudapest Ingolstadt
Ingolstadt Ingolstadt
Ingolstadt Bratislava
Bratislava
trans- trans-
Belgrade
Belgrade Regensburg
Regensburg european Regensburg
european Regensburg Brăila Brăila
rail rail
Bratislava
Bratislava Straubing
Straubing Straubing
Straubing BudapestBudapest
networknetwork
Galați Galați Passau Passau Passau Passau Călăraşi
Călăraşi
Brăila Brăila Linz Linz Linz Linz Drobeta-Turnu
Drobeta-Turnu
Severin Severin
Novi Sad
Novi Sad ViennaVienna ViennaVienna Dunaújváros
Dunaújváros
Linz Linz Bratislava
Bratislava Bratislava
Bratislava Galați Galați
Ruse Ruse Győr Győr Győr Győr Giurgiu Giurgiu
Regensburg
Regensburg BudapestBudapest Budapest
Budapest Győr Győr
Győr Győr Dunaújváros
Dunaújváros Drobeta-Turnu
Drobeta-Turnu
Severin Severin Ingolstadt
Ingolstadt
Ingolstadt
Ingolstadt Novi Sad
Novi Sad Giurgiu Giurgiu Izmail Izmail
Ulm Ulm Belgrade
Belgrade Brăila Brăila Linz Linz
Drobeta-Turnu Severin
Drobeta-Turnu Severin Pančevo Pančevo Galați Galați Novi Sad
Novi Sad
TulceaTulcea Smederevo
Smederevo connected
connected Novi Sad
Novi Sad Pančevo Pančevo
via thevia
žs the žs
Călăraşi
Călăraşi Drobeta-Turnu
Drobeta-Turnu
Severin Severin main line
main line
Belgrade
Belgrade Passau Passau
Pančevo Pančevo Giurgiu Giurgiu neworknework Pančevo Pančevo Regensburg
Regensburg
Giurgiu Giurgiu Ruse Ruse not connected Călăraşi
not connected Călăraşi Ruse Ruse
via either
via either
Izmail Izmail SilistraSilistra Dunaújváros
Dunaújváros SilistraSilistra
Smederevo
Smederevo Călăraşi
Călăraşi Izmail Izmail Smederevo
Smederevo
Passau Passau Brăila Brăila Ruse Ruse Straubing
Straubing
SilistraSilistra Galați Galați SilistraSilistra TulceaTulcea
Dunaújváros
Dunaújváros Izmail Izmail Smederevo
[Figure: cities ordered by two organizing principles. Panel "chronology": by city founding year, earliest to latest (100 CE to 1949 CE). Panel "categories": cities in European Union member countries (in EU / not in EU). Labels carried over from the preceding panel: distance scale 70 km · 44 mi; the Black Sea.]

Category as an organizing principle means grouping data points based on some kind of similarity in content. Usually, the grouping is based on values on a nominal scale (see p. 94): people can be divided into employed, unemployed, and out of the workforce, for example. Quantitative values can, however, also be classified into categories. For example, companies can be classified into small, medium-sized, and large, based on the number of employees or annual turnover.

The visual encoding of categorical data can generally involve the use of visual variables (see pp. 58–62) that are ill-suited for showing other classifications, such as color hue or shapes. For this reason, categorical ordering can often easily be combined with other organizing principles. Even when position is used for encoding categories, the items within the categories can be organized by position based on some of the other principles, such as continuum, as in the example on the left.

Connection is an organizing principle mostly used in network diagrams (see pp. 227–229). It is based on the links between the data points, which are often called nodes in this context. Often, nodes that are connected are also shown close to each other (see p. 228).


[Figure: chronology. Cities by founding year, earliest to latest, positioned along a timeline from 1 CE to 2000 CE (approximate founding year or earliest mention).]

Caption: Another example of using chronology as an organizing principle. Here the data points are positioned along an interval scale, as opposed to the ordinal scale used for positioning the bars in the bar charts on the previous spread. (For further information about scales of measurement, see pp. 94–96.)

Convention refers to organizing principles that are not based on characteristics of the data, like the other six principles, but on an agreed-upon convention, such as alphabetical order or, say, numbering of military units (1st Infantry Division, 2nd Infantry Division etc.). Organizing principles based on how the data has been collected, such as following the order in which questions were asked on a questionnaire, also fall in this category.

Convention-based orderings are better than a completely arbitrary order, but in almost all cases, the other six organizing principles are better options. The main advantage of alphabetical order is that the viewer can quickly look up an individual data point. This is useful in a table, but the purpose of a visualization is to reveal patterns and other larger-scale structures (and the exceptions to those patterns) in the data. A convention-based (as opposed to data-based) ordering does not reveal such patterns.

The various organizing principles are not mutually exclusive, and in many cases multiple principles are applied in parallel in the same presentation. Because different organizing principles reveal different patterns in the data, it is often a good idea to let the user select the order of items from a number of options, when technically feasible.

+ The golden rule of information design

In this book, we present a variety of guidelines and rules for the design of visualizations. Some of these are based on research, while others draw on the observations and experience of the authors of this book and of others working in the field. We believe that by observing them, designers can create clear, illustrative and interesting visualizations. Even as our understanding of human perception grows with new research, an information designer’s work is still more art than science. The rules and guidelines we propose do not cover all the potential issues that a designer has to address, and they can, in some cases, conflict with each other. The best result is not achieved by mechanically following the rules. The designer must always use independent judgment when creating visualizations.

There is, however, one rule that an information designer should always follow: choose the clearest presentation method available. Disregard any other rules and guidelines we present in this book when they conflict with this golden rule of information design. These rules should not, of course, be ignored without a well-founded reason, but such reasons may occasionally come up in a designer’s work, as a reader carefully studying the graphics in this book may indeed notice.

A visualization should only make true statements about the real world, in the clearest possible way. The data should be correct and from a trusted source. The presentation method should be selected so as not to mislead the reader or take attention away from the important features of the data. It should show large differences as large, and small differences as small. It should draw the reader’s attention first to the most significant features, and leave insignificant features in the background. A visualization should include as much information as possible, but not too much. Its visual style should be appropriate for the context, and it should be carefully executed down to the smallest detail. At its best, a visualization gives an overview of the topic quickly, but also rewards the reader who spends time exploring it in depth.[39]

In order for all these goals to be achievable, the various rules and guidelines set forth in this book should be applied on a case-by-case basis and should sometimes even be broken.

39. Edward Tufte calls this “micro/macro readings.” Tufte 1990, pp. 37–51.
