Visualization Basics
Data Visualization
Review
• What is the purpose of visualization?
• How do we accomplish that?
Basic Visualization Model
Goal
• Data transfer: Data → Insight (learning, knowledge extraction)
Method
• Map: data → visual
• Visual transfer: Visualization (communication bandwidth)
• ~Map⁻¹: visual → data insight
Visual Mappings
Data → visual mappings must be:
• Computable (math): visual = f(data)
• Comprehensible (invertible): data = f⁻¹(visual)
• Creative!
Visualization example: PolarEyes
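The "computable" and "invertible" requirements can be illustrated with a minimal sketch, assuming the simplest possible mapping: a linear scale from a data range to a pixel range. The function name `make_linear_map` and the year and pixel ranges are hypothetical, chosen only for illustration.

```python
def make_linear_map(d_min, d_max, v_min, v_max):
    """Return (f, f_inv): a linear data -> visual mapping and its inverse."""
    scale = (v_max - v_min) / (d_max - d_min)

    def f(data):
        # visual = f(data): computable
        return v_min + (data - d_min) * scale

    def f_inv(visual):
        # data = f^-1(visual): what the viewer does when reading the chart
        return d_min + (visual - v_min) / scale

    return f, f_inv


# Hypothetical ranges: years 1980..2010 mapped onto x pixels 0..600.
year_to_x, x_to_year = make_linear_map(1980, 2010, 0, 600)
x = year_to_x(1986)      # 120.0
year = x_to_year(x)      # 1986.0
```

A mapping that many data values share (for example, rounding to a few colors) is still computable but no longer fully invertible, which is one way a design loses comprehensibility.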
Visualization Pipeline
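Raw data (information) → Data tables → Visual structures → Visualization (views)
• Data transformations: raw data → data tables
• Visual mappings: data tables → visual structures
• View transformations: visual structures → views
• User interaction (driven by the task) feeds back into each step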
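One way to read the pipeline is as three functions composed in order. The sketch below is an assumption-laden illustration, not a reference implementation: the function names, the CSV text, and the idea of modeling user interaction as a changed `x_range` argument are all hypothetical.

```python
import csv
import io

# Hypothetical raw data, using the film values from this lecture's data table.
RAW = "Year,Length,Title\n1986,128,Terminator\n1993,120,T2\n2003,142,T3\n"


def data_transformation(raw):
    """Raw data -> data table: a list of row dicts with typed values."""
    rows = csv.DictReader(io.StringIO(raw))
    return [
        {"Year": int(r["Year"]), "Length": int(r["Length"]), "Title": r["Title"]}
        for r in rows
    ]


def visual_mapping(table):
    """Data table -> visual structure: one point mark per item."""
    return [
        {"mark": "point", "x": row["Year"], "y": row["Length"], "label": row["Title"]}
        for row in table
    ]


def view_transformation(marks, x_range):
    """Visual structure -> view: keep only marks inside the visible x range."""
    lo, hi = x_range
    return [m for m in marks if lo <= m["x"] <= hi]


# User interaction is modeled here as nothing more than a new x_range (a zoom).
view = view_transformation(visual_mapping(data_transformation(RAW)), x_range=(1990, 2010))
```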
Data Table: Canonical data model
• Visualization requires structure, data model
• (All?) information can be modeled as data tables
Data Table
Attributes (aka: dimensions, variables, fields, columns, …)
Values
Data types:
• Quantitative
• Ordinal
• Categorical
• Nominal
Items (aka: tuples, cases, records, data points, rows, …)
Attributes
• Dependent variables (measured)
• Independent variables (controlled)
ID  Year  Length  Title
0   1986  128     Terminator
1   1993  120     T2
2   2003  142     T3
…   …     …       …
Data Transformations
• Data table operations:
• Selection
• Projection
• Aggregation
– r = f(rows)
– c = f(cols)
• Join
• Transpose
• Sort
• …
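The table operations above can be sketched in plain Python on the film table from the Data Table slide. This is a minimal illustration that assumes the table is held as a list of row dictionaries; a real tool would more likely use a database or dataframe library.

```python
films = [
    {"ID": 0, "Year": 1986, "Length": 128, "Title": "Terminator"},
    {"ID": 1, "Year": 1993, "Length": 120, "Title": "T2"},
    {"ID": 2, "Year": 2003, "Length": 142, "Title": "T3"},
]

# Selection: keep the rows (items) that satisfy a predicate.
selected = [row for row in films if row["Year"] >= 1990]

# Projection: keep only some columns (attributes).
projected = [{"Title": r["Title"], "Length": r["Length"]} for r in films]

# Aggregation over rows, r = f(rows): here the mean of one attribute.
mean_length = sum(r["Length"] for r in films) / len(films)

# Sort: order the items by an attribute.
by_length = sorted(films, key=lambda r: r["Length"])

# Transpose: one list of values per attribute instead of one dict per item.
columns = {name: [r[name] for r in films] for name in films[0]}
```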
Visual Structure
• Spatial substrate
• Visual marks
• Visual properties
Visual Mapping: Step 1
1. Map: data items → visual marks
Visual marks:
• Points
• Lines
• Areas
• Volumes
• Glyphs
Visual Mapping: Step 2
1. Map: data items → visual marks
2. Map: data attributes → visual properties of marks
Visual properties of marks:
• Position, x, y, z
• Size, length, area, volume
• Orientation, angle, slope
• Color, gray scale, texture
• Shape
• Animation, time, blink, motion
Example: Spotfire
• Film database
• Year → x
• Length → y
• Popularity → size
• Subject → color
• Award? → shape
Visual Mapping Definition Language
• Films → dots
• Year → x
• Length → y
• Popularity → size
• Subject → color
• Award? → shape
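A mapping definition like the one above can be written down as data and applied mechanically. The sketch below assumes a dictionary from attribute name to visual property; the helper `to_marks` and the film records (including their Popularity, Subject, and Award values) are hypothetical and made up for illustration.

```python
# Declarative mapping, mirroring the slide: data attribute -> visual property.
mapping = {
    "Year": "x",
    "Length": "y",
    "Popularity": "size",
    "Subject": "color",
    "Award": "shape",
}

# Hypothetical film records; the values are invented for this example.
films = [
    {"Year": 1986, "Length": 128, "Popularity": 8, "Subject": "SciFi", "Award": False},
    {"Year": 1993, "Length": 120, "Popularity": 9, "Subject": "SciFi", "Award": True},
]


def to_marks(items, mapping, mark="dot"):
    """Step 1: one mark per item. Step 2: attributes become mark properties."""
    return [
        {"mark": mark, **{prop: item[attr] for attr, prop in mapping.items()}}
        for item in items
    ]


marks = to_marks(films, mapping)
```

The properties still hold raw data values here; a renderer would apply a scale to each (pixels for x and y, a palette for color, a symbol set for shape).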
The Simple Stuff
• Univariate
• Bivariate
• Trivariate
Univariate
• Dot plot
• Bar chart (item vs. attribute)
• Tukey box plot
• Histogram
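Two of these univariate views rest on small computations: the histogram on bin counts, and the Tukey box plot on a five-number summary. The sketch below assumes fixed-width bins and Tukey-style hinges (the median is included in both halves when the count is odd); the film lengths are hypothetical.

```python
from collections import Counter
from statistics import median


def histogram(values, bin_width):
    """Count values per bin; each bin is labeled by its lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def five_number_summary(values):
    """(min, lower hinge, median, upper hinge, max) for a box plot."""
    s = sorted(values)
    half = (len(s) + 1) // 2  # include the median in both halves when n is odd
    return (s[0], median(s[:half]), median(s), median(s[len(s) - half:]), s[-1])


lengths = [128, 120, 142, 95, 101, 133]  # hypothetical film lengths (minutes)
bins = histogram(lengths, 20)
summary = five_number_summary(lengths)
```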
Bivariate
• Scatterplot
Trivariate
• 3D scatterplot, spin plot
• 2D plot + size (or color…)
Visualization Design
HCI Design Process
Analyze Design Evaluate
• Iterative, progressive refinement
Analyze
• Data:
• Information types (multiD, tree, …)
• Scalability
• Semantics
• Users:
• Tasks
• Expertise
• …
• Existing solutions (literature review)
Data Scalability
• # of attributes (dimensionality)
• # of items
• Value range (e.g. bits/value)
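These three scalability measures can be read directly off a data table. The sketch below is one possible interpretation, assuming that "bits/value" means the bits needed to distinguish every integer between an attribute's minimum and maximum; the function name `describe_scale` is hypothetical.

```python
import math

films = [
    {"ID": 0, "Year": 1986, "Length": 128, "Title": "Terminator"},
    {"ID": 1, "Year": 1993, "Length": 120, "Title": "T2"},
    {"ID": 2, "Year": 2003, "Length": 142, "Title": "T3"},
]


def describe_scale(table):
    """Report item count, attribute count, and bits/value for integer attributes."""
    attrs = list(table[0])
    bits = {}
    for name in attrs:
        values = [row[name] for row in table]
        if all(isinstance(v, int) for v in values):
            span = max(values) - min(values) + 1  # distinct values in the range
            bits[name] = max(1, math.ceil(math.log2(span)))
    return {"items": len(table), "attributes": len(attrs), "bits_per_value": bits}


scale = describe_scale(films)
```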
User Tasks
• Easy stuff: Forms can do this
• Reduce to only 1 data item or value
• Stats: Min, max, average, %
• Search: known item
• Hard stuff: Visualization can do this!
• Require seeing the whole
• Patterns: distributions, trends, frequencies, structures
• Outliers: exceptions
• Relationships: correlations, multi-way interactions
• Tradeoffs: combined min/max
• Comparisons: choices (1:1), context (1:M), sets (M:M)
• Clusters: groups, similarities
• Anomalies: data errors
• Paths: distances, ancestors, decompositions, …
Design the Visualization Pipeline
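Raw data (information) → Data tables → Visual structures → Visualization (views)
• Data transformations: raw data → data tables
• Visual mappings: data tables → visual structures
• View transformations: visual structures → views
• User interaction (driven by the task) feeds back into each step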
Design
• Methods:
• Optimize tasks on data, scenarios
• Apply principles
• Build on existing solutions
• Brainstorm
• Artifacts:
• Paper sketches
• Mockups (PowerPoint, Macromedia, …)
• Prototypes (VB, …)
• Implementation
HCI UI Evaluation Metrics
• User learnability:
• Learning time
• Retention time
• User performance (measure while users perform benchmark tasks):
• Performance time
• Success rates
• Error rates, recovery
• Clicks, actions
• User satisfaction:
• Surveys
Not “user friendly”
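The performance metrics above are simple to compute once benchmark trials are logged. The sketch below assumes a hypothetical log format with one record per user trial; the field names and all values are invented for illustration.

```python
from statistics import mean

# Hypothetical benchmark-task log: one record per user trial.
trials = [
    {"user": "u1", "seconds": 30.0, "success": True, "errors": 0},
    {"user": "u2", "seconds": 50.0, "success": False, "errors": 2},
    {"user": "u3", "seconds": 40.0, "success": True, "errors": 1},
    {"user": "u4", "seconds": 40.0, "success": True, "errors": 1},
]


def summarize(trials):
    """Performance time, success rate, and error rate over all trials."""
    n = len(trials)
    return {
        "mean_time": mean(t["seconds"] for t in trials),
        "success_rate": sum(t["success"] for t in trials) / n,
        "errors_per_trial": sum(t["errors"] for t in trials) / n,
    }


metrics = summarize(trials)
```

Numbers like these are what make an evaluation claim testable, in contrast to an unmeasured label such as "user friendly".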
Some Visualization Design Principles
Effectiveness & Expressiveness
(Mackinlay)
• Effectiveness
• Cleveland’s rules
• Expressiveness
• Encodes all data
• Encodes only the data
Ranking Visual Properties
Quantitative data, most accurate first (Cleveland and McGill):
1. Position
2. Length
3. Angle, slope
4. Area, volume
5. Color
Categorical data (Mackinlay hypoth.):
1. Position
2. Color, shape
3. Length
4. Angle, slope
5. Area, volume
Design guideline:
• Map more important data attributes to more accurate visual attributes (based on user task)
Example
• Hard drives for sale: price ($), capacity (MB), quality rating (1-5)
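The design guideline can be sketched as a greedy assignment: take the attributes in order of importance and give each the most accurate visual property still free. The code below is a simplified illustration under several assumptions: position is split into x and y so it can be used twice, "color, shape" is split into two entries, the importance order (price, capacity, quality rating) is a guess at one plausible user task, and the 1-5 quality rating is treated as categorical.

```python
# Rankings from the slide, with position split into x and y (an assumption).
QUANTITATIVE = ["position x", "position y", "length", "angle/slope", "area/volume", "color"]
CATEGORICAL = ["position x", "position y", "color", "shape", "length", "angle/slope", "area/volume"]


def assign_channels(attributes):
    """attributes: list of (name, kind) pairs, most important first.

    Greedily gives each attribute the most accurate channel not yet used.
    Raises StopIteration if an attribute finds no free channel.
    """
    used = set()
    assignment = {}
    for name, kind in attributes:
        ranking = QUANTITATIVE if kind == "quantitative" else CATEGORICAL
        channel = next(c for c in ranking if c not in used)
        used.add(channel)
        assignment[name] = channel
    return assignment


hard_drives = [
    ("price", "quantitative"),
    ("capacity", "quantitative"),
    ("quality rating", "categorical"),  # assumption: 1-5 rating shown as categories
]
design = assign_channels(hard_drives)
```

Reordering the list changes the design, which is the point of "based on user task": a buyer who cares most about quality would put the rating first and give it a position axis.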
Pie vs. Bar
• Data: population of the 50 states
• Pie: state and population overloaded on the circumference
• Bar: state on x, population on y (x-axis labels: AK, AL, AR, CA, CO, …)
• Stacked bar
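The difference between the two charts comes down to which visual property carries the value: angle for the pie, length along a common axis for the bar. The sketch below computes both encodings for a few hypothetical values (not real state populations); the function names are invented for illustration.

```python
def pie_angles(values):
    """Pie: each value becomes a slice angle, as a share of 360 degrees."""
    total = sum(values)
    return [360 * v / total for v in values]


def bar_heights(values, max_px):
    """Bar: each value becomes a length, scaled so the largest bar is max_px."""
    top = max(values)
    return [max_px * v / top for v in values]


values = [10, 20, 30, 40]  # hypothetical values
angles = pie_angles(values)
heights = bar_heights(values, max_px=200)
```

With 50 states the average slice is only 360 / 50 = 7.2 degrees, so the pie leaves little room to judge individual values, while each bar keeps its own position on the x axis.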
Eliminate “Chart Junk” (Tufte)
• How much “ink” is used for non-data?
• Reclaim empty space (% screen empty)
• Attempt simplicity (e.g. am I using 3D just for coolness?)
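Both questions on this slide can be turned into rough numbers. The sketch below assumes that "ink" and screen usage are measured in pixels, which is only one possible way to estimate them; the function names and pixel counts are hypothetical.

```python
def non_data_ink_share(data_ink_px, total_ink_px):
    """Fraction of drawn pixels that do not encode data."""
    return 1 - data_ink_px / total_ink_px


def empty_screen_percent(used_px, screen_px):
    """Percentage of the screen area left empty."""
    return 100 * (screen_px - used_px) / screen_px


junk = non_data_ink_share(data_ink_px=300, total_ink_px=400)
empty = empty_screen_percent(used_px=150_000, screen_px=200_000)
```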
Increase Data Density (Tufte)
• Calculate data/pixel
“A pixel is a terrible thing to waste.” (Shneiderman)
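Data per pixel can be estimated with one division. The sketch below assumes that the amount of data shown is the number of items times the number of attributes encoded per item; the function name and the example figures are hypothetical.

```python
def data_density(n_items, n_attributes, width_px, height_px):
    """Data values shown per pixel of display area."""
    return (n_items * n_attributes) / (width_px * height_px)


# Hypothetical view: 1000 items with 5 encoded attributes in a 500 x 400 window.
density = data_density(n_items=1000, n_attributes=5, width_px=500, height_px=400)
```

Comparing this figure across candidate designs for the same data gives a quick check on which one wastes fewer pixels.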
Interaction Approach
• Direct Manipulation (Shneiderman)
• Visual representation
• Rapid, incremental, reversible actions
• Pointing instead of typing
• Immediate, continuous feedback
Information Visualization Mantra
(Shneiderman)
• Overview first, zoom and filter, then details on demand
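The three phases of the mantra correspond to three kinds of query on a data table. The sketch below is a minimal illustration using the film table from earlier in the lecture; the function names and the choice of "count per decade" as the overview are assumptions.

```python
from collections import Counter

films = [
    {"ID": 0, "Year": 1986, "Length": 128, "Title": "Terminator"},
    {"ID": 1, "Year": 1993, "Length": 120, "Title": "T2"},
    {"ID": 2, "Year": 2003, "Length": 142, "Title": "T3"},
]


def overview(items):
    """Overview first: summarize every item, here as counts per decade."""
    return dict(Counter(item["Year"] // 10 * 10 for item in items))


def zoom_and_filter(items, year_from, year_to):
    """Zoom and filter: narrow to the items inside a range of interest."""
    return [item for item in items if year_from <= item["Year"] <= year_to]


def details_on_demand(items, item_id):
    """Details on demand: the full record for the one item the user points at."""
    return next(item for item in items if item["ID"] == item_id)


summary = overview(films)
subset = zoom_and_filter(films, 1990, 2009)
detail = details_on_demand(subset, 1)
```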
Cost of Knowledge / Info Foraging
(Card, Pirolli, et al.)
• Frequently accessed info should be quick
• At expense of infrequently accessed info
• Bubble up “scent” of details to overview
The “Insight” Factor
• Avoid the temptation to design a form-based search engine
• More tasks than just “search”
• How do I know what to “search” for?
• What if there’s something better that I don’t know to search for?
• Hides the data
Break out of the Box
• Resistance is not futile!
• Creativity; Think bigger, broader
• Does the design help me explore, learn, understand?
• Reveal the data
Class Motto
Show me the data!
How (not) to Lie with Visualization
Information Types
• Multi-dimensional: databases,…
• 1D: timelines,…
• 2D: maps,…
• 3D: volumes,…
• Hierarchies/Trees: directories,…
• Networks/Graphs: web, communications,…
• Document collections: digital libraries,…