Slide02 Data Abstraction

The document discusses various data types and dataset types, including tables, networks, trees, fields, and geometry, along with their attributes and processing methods. It categorizes data attributes into qualitative and quantitative types, further detailing nominal, ordinal, discrete, and continuous attributes. Additionally, it covers data availability, transformation, and references relevant literature in the field of data visualization and analysis.

Uploaded by

manthanh350790

DATA ABSTRACTION

Bùi Tiến Lên

2023
Contents

1. Data Types

2. Dataset Types

3. Attribute Types

4. Data Processing
The Big Picture

[Figure: overview of the "What?" part of the What? / Why? / How? framework, in two columns (Datasets and Attributes).
Datasets: Data Types (Items, Attributes, Links, Positions, Grids); Data and Dataset Types; Dataset Types (Tables, Multidimensional Table, Networks, Trees, Fields (Continuous), Geometry (Spatial)); Dataset Availability (Static, Dynamic).
Attributes: Attribute Types (Categorical; Ordered: Ordinal, Quantitative); Ordering Direction (Sequential, Diverging, Cyclic).]

Data Types
Data Types

The five basic data types:

1. An item is an individual entity that is discrete, such as a row in a simple table or a node in a network
2. An attribute is some specific property that can be measured, observed, or logged
3. A link is a relationship between items, typically within a network
4. A position is spatial data, providing a location in two-dimensional (2D) or three-dimensional (3D) space
5. A grid specifies the strategy for sampling continuous data in terms of both geometric and topological relationships between its cells

Dataset Types
• Tables
• Networks and Trees
• Fields
• Geometry
• Other Combinations
• Dataset Availability
Dataset Types

Concept 1
A dataset is any collection of information that is the target of analysis. Datasets are made up of data objects.

• These basic dataset types arise from combinations of the data types of items, attributes, links, positions, and grids.

Data and Dataset Types

Dataset type            Data types
Tables                  Items, Attributes
Networks & Trees        Items (nodes), Links, Attributes
Fields                  Grids, Positions, Attributes
Geometry                Items, Positions
Clusters, Sets, Lists   Items

Dataset Types (cont.)

• The detailed structure of the four basic dataset types

[Figure: Dataset Types. Tables: items (rows) and attributes (columns), with each cell containing a value; Multidimensional Table: value in cell. Networks: nodes (items) joined by links; Trees. Fields (Continuous): grid of positions, cells, attributes (columns), value in cell. Geometry (Spatial): position.]

Tables

• Many datasets come in the form of tables that are made up of rows and columns, a familiar form to anybody who has used a spreadsheet
• For a simple flat table
   • Each row represents an item of data, and each column is an attribute of the dataset
   • Each cell in the table is fully specified by the combination of a row and a column (an item and an attribute) and contains a value for that pair
• A multidimensional table has a more complex structure for indexing into a cell, with multiple keys

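The difference between the two table kinds can be sketched in Python. This is only an illustration: the names `flat_table`, `multi_table`, `cell`, and `multi_cell`, and all sample values, are hypothetical and not taken from the slides.

```python
# Flat table: one key (the item) plus an attribute name identifies a cell.
flat_table = {
    1: {"name": "Amy", "age": 8},
    2: {"name": "Basil", "age": 7},
}


def cell(table, item, attribute):
    """Return the value stored for one (item, attribute) pair."""
    return table[item][attribute]


# Multidimensional table: several keys together identify a cell.
# Hypothetical example: a count indexed by (region, quarter).
multi_table = {
    ("north", "Q1"): {"orders": 12},
    ("north", "Q2"): {"orders": 15},
    ("south", "Q1"): {"orders": 9},
}


def multi_cell(table, keys, attribute):
    """Return the value addressed by a tuple of keys and an attribute."""
    return table[tuple(keys)][attribute]
```

For instance, `cell(flat_table, 2, "age")` looks up one row and one column, while `multi_cell(multi_table, ("north", "Q2"), "orders")` needs both keys before the attribute can be read.
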
Networks

The dataset type of networks is well suited for specifying that there is some kind of relationship between two or more items.

• An item in a network is often called a node.
• A link is a relation between two items.

Trees

• Networks with hierarchical structure are more specifically called trees.
• In contrast to a general network, trees do not have cycles: each child node has only one parent node pointing to it

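The tree property can be checked mechanically. The sketch below assumes a network given as a list of node names plus directed `(parent, child)` links, and additionally assumes a single root; the function name `is_tree` and this representation are illustrative choices, not part of the slides.

```python
def is_tree(nodes, links):
    """Check whether directed (parent, child) links form a rooted tree.

    Assumes every link endpoint appears in `nodes`.
    """
    parent = {}
    children = {n: [] for n in nodes}
    for p, c in links:
        if c in parent:
            # A second parent points to the same child: not a tree.
            return False
        parent[c] = p
        children[p].append(c)

    # Exactly one node may lack a parent: the root.
    roots = [n for n in nodes if n not in parent]
    if len(roots) != 1:
        return False

    # Every node must be reachable from the root; with at most one parent
    # per node, this also rules out cycles.
    seen = set()
    stack = [roots[0]]
    while stack:
        n = stack.pop()
        seen.add(n)
        stack.extend(children[n])
    return len(seen) == len(nodes)
```

A node with two parents fails the first check, and a detached cycle fails the reachability check, so both kinds of general network are rejected.
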
Fields

• The field dataset type also contains attribute values associated with cells
• Each cell in a field contains measurements or calculations from a continuous domain
• Continuous data requires careful treatment that takes into account the mathematical questions of sampling and interpolation
• In contrast, the table and network dataset types discussed above are examples of discrete data, where a finite number of individual items exist and interpolation between them is not a meaningful concept.

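To make the sampling and interpolation point concrete, here is a minimal sketch that estimates a field value between two samples. It assumes a one-dimensional field and linear interpolation, which is only one of several possible schemes; the function name `interpolate` and the sample format are assumptions made for illustration.

```python
def interpolate(samples, x):
    """Linearly interpolate a value at x.

    `samples` is a list of (position, value) pairs sorted by position,
    with strictly increasing positions.
    """
    for (x0, v0), (x1, v1) in zip(samples, samples[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)  # fraction of the way from x0 to x1
            return v0 + t * (v1 - v0)
    raise ValueError("x lies outside the sampled range")
```

No comparable operation exists for the discrete case: there is no meaningful value "halfway between" two rows of a table or two nodes of a network.
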
Spatial Fields

• Continuous data is often found in the form of a spatial field, where the cell structure of the field is based on sampling at spatial positions

Grid Types

• When a field contains data created by sampling at completely regular intervals, the cells form a uniform grid
• For a uniform grid, there is no need to explicitly store the grid geometry in terms of its location in space, or the grid topology in terms of how each cell connects with its neighboring cells

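A short sketch shows why a uniform grid needs no stored geometry or topology: both follow from the cell index. The class `UniformGrid2D` and its fields (`origin`, `spacing`, `shape`) are hypothetical names chosen for this illustration, and the grid is assumed to be two-dimensional with 4-connected cells.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UniformGrid2D:
    origin: tuple   # (x, y) position of cell (0, 0)
    spacing: tuple  # (dx, dy) distance between neighboring cells
    shape: tuple    # (number of columns, number of rows)

    def position(self, i, j):
        """Geometry: compute the cell position from its index."""
        ox, oy = self.origin
        dx, dy = self.spacing
        return (ox + i * dx, oy + j * dy)

    def neighbors(self, i, j):
        """Topology: compute the 4-connected neighbors from the index."""
        nx, ny = self.shape
        candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
        return [(a, b) for a, b in candidates if 0 <= a < nx and 0 <= b < ny]
```

Only three small tuples are stored, regardless of how many cells the grid has; a non-uniform grid would have to store positions or connectivity explicitly.
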
Geometry

• The geometry dataset type specifies information about the shape of items with explicit spatial positions.
• The items could be points, or one-dimensional lines or curves, or 2D surfaces or regions, or 3D volumes.
• Geometry datasets are intrinsically spatial. Spatial data often includes hierarchical structure at multiple scales.

Other Combinations

There are many ways to group multiple items together, including sets, lists, and clusters

• A set is simply an unordered group of items
• A group of items with a specified ordering could be called a list
• A cluster is a grouping based on attribute similarity

There are also more complex structures built on top of the basic network type

• A path through a network is an ordered set of segments formed by links connecting nodes
• A compound network is a network with an associated tree

Dataset Availability

• The default approach to vis assumes that the entire dataset is available all at once, as a static file
• Some datasets are dynamic streams, where the dataset information trickles in over the course of the vis session

[Figure: Dataset Availability: Static, Dynamic]

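One practical consequence of the static/dynamic distinction is that summaries of a stream have to be updated incrementally, because the full dataset is never at hand. The sketch below assumes a stream of numeric values delivered one at a time; `running_mean` is a hypothetical helper, not something the slides define.

```python
def running_mean(stream):
    """Yield the mean of all values seen so far, one update per arriving item."""
    total = 0.0
    for count, value in enumerate(stream, start=1):
        total += value
        yield total / count
```

With a static file the mean could be computed once over all rows; with a stream, each new item produces a revised value that a display would need to reflect.
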
Attribute Types
Attribute types

Concept 2
An attribute (also called dimension, feature, variable) is a data field, representing a characteristic or feature of a data object.

• At the top level,
   • we can differentiate qualitative (or categorical) and quantitative (or numerical) attributes.
• At a second level,
   • we can categorize qualitative data into nominal and ordinal attributes,
   • and quantitative data into discrete and continuous attributes.

Level of measurement

• Levels of measurement describe the nature of the information within the values assigned to variables

Type      Measure property            Mathematical operators  Advanced operations       Central tendency               Variability
Nominal   Classification, membership  =, ≠                    Grouping                  Mode                           Qualitative variation
Ordinal   Comparison, level           >, <                    Sorting                   Median                         Range, interquartile range
Interval  Difference, affinity        +, −                    Comparison to a standard  Arithmetic mean                Deviation
Ratio     Magnitude, amount           *, /                    Ratio                     Geometric mean, harmonic mean  Coefficient of variation, studentized range

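The central-tendency column can be tried out with Python's standard `statistics` module. This is a sketch under stated assumptions: all sample values and the category ordering `size_order` are invented for illustration, and the ordinal median is taken on ranks because the categories themselves have no arithmetic.

```python
import statistics

# Illustrative values only; none of them come from the slides.
fruits = ["Pear", "Apple", "Pear", "Orange"]  # nominal: only = and != apply
sizes = ["S", "M", "L", "M", "S"]             # ordinal: order matters
size_order = ["S", "M", "L"]                  # assumed ordering of the categories
temps_c = [10.0, 20.0, 30.0]                  # interval: differences are meaningful
weights_kg = [2.0, 8.0]                       # ratio: true zero, ratios are meaningful

# Nominal: the mode is the only sensible "center".
mode_fruit = statistics.mode(fruits)

# Ordinal: take the median of the ranks, then map back to a category.
ranks = [size_order.index(s) for s in sizes]
median_size = size_order[statistics.median_low(ranks)]

# Interval: the arithmetic mean is meaningful.
mean_temp = statistics.mean(temps_c)

# Ratio: the geometric mean is meaningful as well.
geo_mean_weight = statistics.geometric_mean(weights_kg)
```

Each level also admits the measures of the levels above it in the table (a ratio attribute has a mode, a median, and an arithmetic mean), but not the other way round.
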
Nominal Attribute

Concept 3
A nominal attribute represents things: its values are names or labels with no inherent order

• His name is Brent Spiner.
• By profession he is an actor.
• He played the character Data in the TV show Star Trek: The Next Generation.

Ordinal Attribute

Concept 4
An ordinal attribute is similar to a nominal attribute, except that its values have a clear order

• Brent Spiner’s date of birth is Wednesday, February 2, 1949.
• He appeared in all seven seasons of Star Trek: The Next Generation.
• Data’s rank was lieutenant commander.

Discrete Attribute

Concept 5
Discrete data are numeric data whose domain can be equated to the set of integers ℤ

An example of discrete data would be the number of people visiting a doctor.

Continuous Attribute

Concept 6
Continuous data are numeric data whose domain can be equated to the set of real numbers ℝ.

An example of continuous data would be temperature values as measured hourly by a weather station.

Sequential, Diverging and Cyclic

Ordered data can be

• sequential, where there is a homogeneous range from a minimum to a maximum value,
• diverging, which can be deconstructed into two sequences pointing in opposite directions that meet at a common zero point,
• cyclic, where the values wrap around back to a starting point rather than continuing to increase indefinitely.

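The diverging and cyclic cases each have a simple computational reading, sketched below. The helper names (`wrap`, `cyclic_distance`, `split_diverging`) are hypothetical, and the examples in the usage note (hours on a 24-hour clock, values around zero) are assumptions chosen for illustration.

```python
def wrap(value, period):
    """Map a cyclic value back into the range [0, period)."""
    return value % period


def cyclic_distance(a, b, period):
    """Shortest distance between two values on a cycle."""
    d = abs(a - b) % period
    return min(d, period - d)


def split_diverging(values, zero=0):
    """Split a diverging attribute into two sequences that meet at `zero`.

    Each returned sequence is ordered by increasing distance from zero.
    """
    below = sorted((v for v in values if v < zero), reverse=True)
    above = sorted(v for v in values if v > zero)
    return below, above
```

On a 24-hour clock, `cyclic_distance(23, 1, 24)` treats 23:00 and 01:00 as two hours apart rather than twenty-two, which is what distinguishes a cyclic attribute from a sequential one.
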
Hierarchical Attributes

• There may be hierarchical structure within an attribute or between multiple attributes.

The Big Picture

[Figure: overview of the "How?" part of the What? / Why? / How? framework.
Encode: Arrange (Express, Separate, Order, Align, Use); Map from categorical and ordered attributes (Color: Hue, Saturation, Luminance; Size, Angle, Curvature, ...; Shape; Motion: Direction, Rate, Frequency, ...).
Manipulate: Change, Select, Navigate.
Facet: Juxtapose, Partition, Superimpose.
Reduce: Filter, Aggregate, Embed.]

Data Processing
Dataset

ID  Name     Age  Shirt Size  Favorite Fruit
1   Amy      8    S           Apple
2   Basil    7    S           Pear
3   Clara    9    M           Durian
4   Desmond  13   L           Elderberry
5   Ernest   12   L           Peach
6   Fanny    10   S           Lychee
7   George   9    M           Orange
8   Hector   8    L           Loquat
9   Ida      10   M           Pear
10  Amy      12   M           Orange

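The table above can be loaded into Python to see how the attribute types behave. This is a sketch, not part of the slides: the classification in the comments (ID as key, Name and Favorite Fruit as nominal, Shirt Size as ordinal, Age as quantitative) and the `size_order` mapping are interpretations made for illustration.

```python
columns = ("ID", "Name", "Age", "Shirt Size", "Favorite Fruit")
rows = [
    (1, "Amy", 8, "S", "Apple"),
    (2, "Basil", 7, "S", "Pear"),
    (3, "Clara", 9, "M", "Durian"),
    (4, "Desmond", 13, "L", "Elderberry"),
    (5, "Ernest", 12, "L", "Peach"),
    (6, "Fanny", 10, "S", "Lychee"),
    (7, "George", 9, "M", "Orange"),
    (8, "Hector", 8, "L", "Loquat"),
    (9, "Ida", 10, "M", "Pear"),
    (10, "Amy", 12, "M", "Orange"),
]
table = [dict(zip(columns, row)) for row in rows]

# ID identifies each item uniquely; Name does not (two items are named Amy).
unique_ids = {r["ID"] for r in table}
unique_names = {r["Name"] for r in table}

# Shirt Size is ordered, but the order must be supplied explicitly:
# sorting the raw strings would give the alphabetical order L, M, S.
size_order = {"S": 0, "M": 1, "L": 2}
by_size_then_age = sorted(
    table, key=lambda r: (size_order[r["Shirt Size"]], r["Age"])
)
```

Sorting by `(size, age)` works because both attributes are ordered; sorting by Favorite Fruit would only give alphabetical order, which carries no meaning for a nominal attribute.
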
Functional dependency

$$f : (D_1 \times D_2 \times \cdots \times D_n) \to (A_1 \times A_2 \times \cdots \times A_m) \tag{1}$$

where $D_i$ denote the dimensions (independent variables) and $A_i$ the attributes (dependent variables)

$$
\begin{array}{cccc|cccc}
D_1 & D_2 & \cdots & D_n & A_1 & A_2 & \cdots & A_m \\
\hline
d_{1,1} & d_{1,2} & \cdots & d_{1,n} & a_{1,1} & a_{1,2} & \cdots & a_{1,m} \\
\vdots & & & & \vdots & & & \\
d_{k,1} & d_{k,2} & \cdots & d_{k,n} & a_{k,1} & a_{k,2} & \cdots & a_{k,m}
\end{array}
$$

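Equation (1) says that each combination of dimension values determines exactly one tuple of attribute values. A minimal sketch of that check follows; it assumes each row lists the `n` dimension values first and the attribute values after them, and the function name `as_function` is a hypothetical choice.

```python
def as_function(rows, n_dims):
    """Build f: (d_1, ..., d_n) -> (a_1, ..., a_m) from table rows.

    Raises ValueError if two rows share the same dimension values but
    disagree on the attribute values, since then no such f exists.
    """
    f = {}
    for row in rows:
        key, value = tuple(row[:n_dims]), tuple(row[n_dims:])
        if key in f and f[key] != value:
            raise ValueError(
                f"dimensions {key} map to two different attribute tuples"
            )
        f[key] = value
    return f
```

The resulting dictionary plays the role of $f$: looking up a dimension tuple returns the dependent attribute values for that row.
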
Data Transformation

[Figure: data transformation pipeline. Data Values → (Preprocessing Operators) → Analytical Abstractions → (Mapping Operators) → Visual Abstractions → (Rendering Operators) → Image Data. Each stage also has its own operators (Value Operators, Analytical Operators, Visual Operators, Image Operators). The stages run from data-oriented to graphics-oriented.]

