JYOTHISHMATHI INSTITUTE OF TECHNOLOGY AND SCIENCE
KARIMNAGAR - 505481
(Approved by AICTE, New Delhi and Affiliated to JNTU, Hyderabad)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
III BTECH I SEMESTER 2024-2025
DATA ANALYTICS UNIT-V
Data visualization
• Data visualization is the practice of translating information into a visual context, such as a map or
graph, to make data easier for the human brain to understand and pull insights from.
• The main goal of data visualization is to make it easier to identify patterns, trends and outliers in
large data sets. The term is often used interchangeably with others, including information graphics,
information visualization and statistical graphics.
• Data visualization is one of the steps of the data science process, which states that after data has been
collected, processed and modeled, it must be visualized for conclusions to be made. Data visualization
is also an element of the broader data presentation architecture (DPA) discipline, which aims to
identify, locate, manipulate, format and deliver data in the most efficient way possible.
Why is data visualization important?
Data visualization provides a quick and effective way to communicate information in a universal
manner using visual information.
The practice can also help businesses identify which factors affect customer behavior; pinpoint areas
that need to be improved or need more attention; make data more memorable for stakeholders;
understand when and where to place specific products; and predict sales volumes.
Other benefits of data visualization include the following:
the ability to absorb information quickly, improve insights and make faster decisions;
an increased understanding of the next steps that must be taken to improve the organization;
an improved ability to maintain the audience's interest with information they can understand;
an easy distribution of information that increases the opportunity to share insights with everyone
involved;
a reduced need for data scientists, since data is more accessible and understandable; and
an increased ability to act on findings quickly and, therefore, achieve success with greater speed and
fewer mistakes.
Categorization of visualization methods
Pixel-oriented visualization techniques
Prepared by E.Shireesha , Asst. Prof, CSE Dept. JITS - Karimnagar Page 1
Geometric projection visualization techniques
Icon-based visualization techniques
Hierarchical visualization techniques
Visualizing complex data and relations
Pixel-Oriented Visualization Techniques
A simple way to visualize the value of a dimension is to use a pixel whose color reflects the
dimension’s value.
For a data set of m dimensions, pixel-oriented techniques create m windows on the screen, one for
each dimension.
The m dimension values of a record are mapped to m pixels at the corresponding positions in the
windows.
The color of each pixel reflects the corresponding value. Inside a window, the data values are
arranged in some global order shared by all windows.
Example: All Electronics maintains a customer information table, which consists of 4 dimensions:
income, credit_limit, transaction_volume and age. We analyze the correlation between income
and the other attributes by visualization.
We sort all customers by income in ascending order and use this order to lay out the customer data in
the 4 visualization windows, as shown in the figure.
The pixel colors are chosen so that the smaller the value, the lighter the shading.
Using pixel-based visualization, we can easily observe that credit_limit increases as income increases,
that customers whose income is in the middle range are more likely to purchase more from All Electronics,
and that there is no clear correlation between income and age.
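The layout described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used to produce the figure: the six customer records are invented values, the names `customers`, `shade` and `pixel_windows` are hypothetical, and each window is rendered as a row of text characters in which lighter characters stand for smaller values.

```python
# Toy customer table (invented values, for illustration only).
customers = [
    {"income": 30, "credit_limit": 5,  "transaction_volume": 2, "age": 61},
    {"income": 90, "credit_limit": 20, "transaction_volume": 3, "age": 35},
    {"income": 50, "credit_limit": 10, "transaction_volume": 9, "age": 48},
    {"income": 70, "credit_limit": 15, "transaction_volume": 8, "age": 23},
    {"income": 40, "credit_limit": 8,  "transaction_volume": 6, "age": 30},
    {"income": 80, "credit_limit": 18, "transaction_volume": 5, "age": 55},
]
DIMENSIONS = ["income", "credit_limit", "transaction_volume", "age"]
SHADES = " .:*#"  # from lightest to darkest


def shade(value, lo, hi):
    """Map a value in [lo, hi] to a character; smaller values are lighter."""
    if hi == lo:
        return SHADES[0]
    level = (value - lo) / (hi - lo)  # normalise to [0, 1]
    return SHADES[round(level * (len(SHADES) - 1))]


def pixel_windows(records, order_by):
    """Return one window (a string of pixels) per dimension.

    All windows share the same global order: records sorted by `order_by`.
    """
    ordered = sorted(records, key=lambda r: r[order_by])
    windows = {}
    for dim in DIMENSIONS:
        values = [r[dim] for r in ordered]
        lo, hi = min(values), max(values)
        windows[dim] = "".join(shade(v, lo, hi) for v in values)
    return windows


windows = pixel_windows(customers, order_by="income")
for dim, pixels in windows.items():
    print(f"{dim:>20} |{pixels}|")
```

Because every window uses the income order, a window whose shading also darkens from left to right (here credit_limit) suggests a positive correlation with income, while an irregular pattern (here age) suggests none.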
Laying Out Pixels in Circle Segments
To save space and show the connections among multiple dimensions, space filling is often done in a
circle segment.
Geometric Projection Visualization Techniques
A drawback of pixel-oriented visualization techniques is that they cannot help us much in
understanding the distribution of data in a multidimensional space.
Geometric projection techniques help users find interesting projections of multidimensional data sets.
The techniques of geometric projection visualization include the following:
Direct visualization
Scatter plot and scatter-plot matrices
Landscapes
Projection pursuit technique: helps users find meaningful projections of multidimensional data
Prosection views
Hyperslice
Parallel coordinates
Direct visualization
Direct visualizations of image data make use of the images in their original visible format. The first technique,
the slice histogram, arranges slices of images as histograms, organized by both visual and non-visual variables.
Scatter Plots
A scatter plot displays 2-D data points using Cartesian coordinates.
A third dimension can be added using different colors or shapes to represent different data points.
Through this visualization, in the adjacent figure, we can see that points of types “+” and “×” tend to be
collocated.
Scatter plots show many points plotted in the Cartesian plane. Each point represents the values of two
variables: one variable is plotted on the horizontal axis and the other on the vertical axis.
The scatter plot becomes ineffective when a data set has more than four dimensions. An enhanced
technique for such data is the scatter-plot matrix.
Scatterplot Matrices
The scatter-plot matrix is an extension of the scatter plot.
If a data set contains k dimensions, a k × k grid of 2-D scatter plots forms the scatter-plot matrix.
It thus shows each dimension plotted against every other dimension.
We use a scatter-plot matrix when we have more than two variables and want to find the correlation
between one variable and each of the remaining ones.
For k-dimensional data, a minimum of (k² − k)/2 2-D scatter plots is required.
There can be a maximum of k² 2-D plots.
In the adjoining figure, there are k² plots.
Of these, k are X-X plots, and every X-Y plot (where X and Y are distinct dimensions) appears in 2
orientations (X vs. Y and Y vs. X).
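The plot counts above can be checked with a short Python sketch. The dimension names are borrowed from the earlier customer example purely for illustration, and the variable names are hypothetical.

```python
from itertools import combinations, product

dimensions = ["income", "credit_limit", "transaction_volume", "age"]  # k = 4
k = len(dimensions)

# Every cell of the k x k grid, including the k diagonal X-X cells.
all_cells = list(product(dimensions, repeat=2))

# Cells on the diagonal plot a dimension against itself.
diagonal = [(x, y) for x, y in all_cells if x == y]

# Unordered pairs of distinct dimensions: the minimum set of plots needed.
distinct_pairs = list(combinations(dimensions, 2))

print(len(all_cells))       # k**2
print(len(diagonal))        # k
print(len(distinct_pairs))  # (k**2 - k) // 2
```

The full grid is the diagonal plus both orientations of every distinct pair, which is why k² = k + 2 · (k² − k)/2.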
Parallel Coordinates
The scatter-plot matrix becomes less effective as the dimensionality increases.
Another technique, called parallel coordinates, can handle higher dimensionality.
To visualize n-dimensional data, the technique draws n equidistant axes, one for each attribute,
parallel to one of the screen axes.
Each axis is scaled to the [minimum, maximum] range of the corresponding attribute.
Every data item corresponds to a polygonal line that intersects each axis at the point
corresponding to the item's value for that attribute.
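The construction above can be sketched as a function that turns each record into the vertices of its polygonal line. This is a minimal sketch under stated assumptions: the student records are invented, the function name `polylines` is hypothetical, and the axes are placed at equal spacing along a unit-width screen with heights normalised to [0, 1].

```python
def polylines(records, attributes):
    """Return, for each record, the (axis_x, height) vertices of its polyline."""
    n = len(attributes)
    # Each axis is scaled to the [minimum, maximum] range of its attribute.
    ranges = {
        a: (min(r[a] for r in records), max(r[a] for r in records))
        for a in attributes
    }
    lines = []
    for r in records:
        vertices = []
        for i, a in enumerate(attributes):
            lo, hi = ranges[a]
            # A constant attribute has no range; put it mid-axis.
            height = 0.5 if hi == lo else (r[a] - lo) / (hi - lo)
            axis_x = i / (n - 1) if n > 1 else 0.0  # equidistant axes
            vertices.append((axis_x, height))
        lines.append(vertices)
    return lines


# Invented records, for illustration only.
students = [
    {"marks": 40, "attendance": 60, "projects": 1},
    {"marks": 70, "attendance": 90, "projects": 3},
    {"marks": 100, "attendance": 75, "projects": 2},
]
lines = polylines(students, ["marks", "attendance", "projects"])
for line in lines:
    print(line)
```

Each inner list can be passed directly to any line-drawing routine; records with similar values produce polylines that run close together across the axes.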
Icon-Based Visualization Techniques
Icon-based visualization techniques use small icons to represent multidimensional data values.
The data values are visualized as features of the icons.
Typical visualization methods
o Chernoff Faces
o Stick Figures
General techniques
o Shape coding: Use shape to represent certain information encoding
o Color icons: Use color icons to encode more information
o Tile bars: Use small icons to represent the relevant feature vectors in document retrieval
Chernoff Faces
Chernoff faces display multidimensional data of up to 18 dimensions in the form of a cartoon human face.
The dimension values are represented by the shape, position and orientation of facial components
such as the eyes, ears, mouth and nose.
Moreover, the technique exploits the human ability to recognize differences between facial
features.
It is a way to display variables on a two-dimensional surface, e.g., let x be eyebrow slant, y be eye size,
z be nose length, etc.
The figure shows faces produced using 10 characteristics (head eccentricity, eye size, eye spacing, eye
eccentricity, pupil size, eyebrow slant, nose size, mouth shape, mouth size, and mouth opening), each
assigned one of 10 possible values.
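The data-to-face mapping can be sketched as follows. This covers only the mapping step, not the drawing itself, and rests on assumptions: the function name `face_parameters` is hypothetical, the sample record is invented, and each of the 10 levels is scaled linearly to a drawing parameter in [0, 1].

```python
CHARACTERISTICS = [
    "head eccentricity", "eye size", "eye spacing", "eye eccentricity",
    "pupil size", "eyebrow slant", "nose size", "mouth shape",
    "mouth size", "mouth opening",
]
LEVELS = 10  # each characteristic takes one of 10 possible values


def face_parameters(levels):
    """Map one record (ten integer levels, 1..10) to parameters in [0, 1]."""
    if len(levels) != len(CHARACTERISTICS):
        raise ValueError("expected one level per characteristic")
    params = {}
    for name, level in zip(CHARACTERISTICS, levels):
        if not 1 <= level <= LEVELS:
            raise ValueError(f"{name}: level {level} is outside 1..{LEVELS}")
        params[name] = (level - 1) / (LEVELS - 1)
    return params


# Invented record, for illustration only.
face = face_parameters([1, 10, 5, 3, 7, 2, 9, 4, 6, 8])
for name, value in face.items():
    print(f"{name:>18}: {value:.2f}")
```

A drawing routine would then use each parameter to set the size, position or orientation of the corresponding facial feature.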
Stick Figure
It maps multidimensional data to a five-piece stick figure, where each figure has 4 limbs and a body.
Two dimensions are mapped to the display axes and the remaining dimensions are mapped to the angle
and/or length of the limbs.
Hierarchical Visualization Techniques
Hierarchical techniques visualize the data using a hierarchical partitioning into subspaces.
For a large data set of high dimensionality, it would be difficult to visualize all dimensions at the same
time.
Hierarchical visualization techniques partition all dimensions into subsets (i.e., subspaces).
The subspaces are visualized in a hierarchical manner.
“Worlds-within-Worlds,” also known as n-Vision, is a representative hierarchical visualization method.
Suppose we want to visualize a 6-D data set, where the dimensions are F, X1, X2, X3, X4, X5, and to
observe how F changes with respect to the other dimensions. We can fix the X3, X4, X5 dimensions to
selected values and visualize the changes to F with respect to X1 and X2.
Methods:
Dimensional Stacking
Worlds-Within-Worlds
Tree-Map
Cone Trees
InfoCube
Dimensional Stacking:
The n-dimensional attribute space is partitioned into 2-D subspaces, which are “stacked” into each
other.
Attribute value ranges are divided into classes.
Important attributes should be placed at the outer levels.
Suitable for ordinal attributes of low cardinality.
Limitation: Difficult to display beyond 9 dimensions.
Visualization example: In oil mining data, longitude and latitude are mapped to the outer axes,
while ore grade and depth are mapped to the inner axes.
Example: Retail Store Sales
Dataset dimensions:
Region (North, South, East, West)
Product Category (Electronics, Clothing, Grocery, Furniture)
Customer Age Group (Young, Adult, Senior)
Sales Performance (Low, Medium, High)
Dimensional Stacking Visualization:
1. Outer axes: Region (x-axis), Product Category (y-axis).
2. Inner grid inside each cell: Customer Age Group (x-axis) vs Sales Performance (y-axis).
3. Color coding: Revenue or Profit margin.
This allows managers to see which age group is buying more in which region and category
without creating multiple separate charts. The color-coded dimensional stacking diagram for Retail
Store Sales is laid out as follows:
Outer Grid: Region vs Product Category
Inner Grid: Customer Age Group (x-axis) vs Sales Performance (y-axis)
Colors:
o 🟥 Red = Low Sales
o 🟨 Yellow = Medium Sales
o 🟩 Green = High Sales
This way, each cell immediately shows which age group drives sales performance in each region and product
category, which is useful for business dashboards.
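The stacking of the inner grid inside the outer grid can be sketched as an index calculation. This is a minimal sketch: the function name `stacked_position` is hypothetical, and it assumes the outer attribute selects a block of cells while the inner attribute selects the offset within that block.

```python
REGIONS = ["North", "South", "East", "West"]                         # outer x
CATEGORIES = ["Electronics", "Clothing", "Grocery", "Furniture"]     # outer y
AGE_GROUPS = ["Young", "Adult", "Senior"]                            # inner x
PERFORMANCE = ["Low", "Medium", "High"]                              # inner y


def stacked_position(region, category, age_group, performance):
    """Return the (column, row) of a record in the stacked grid."""
    col = REGIONS.index(region) * len(AGE_GROUPS) + AGE_GROUPS.index(age_group)
    row = (CATEGORIES.index(category) * len(PERFORMANCE)
           + PERFORMANCE.index(performance))
    return col, row


# Overall size of the display grid: outer cells times inner cells per axis.
grid_size = (len(REGIONS) * len(AGE_GROUPS), len(CATEGORIES) * len(PERFORMANCE))

print(grid_size)
print(stacked_position("South", "Clothing", "Adult", "Medium"))
```

Four attributes are thus shown on a single 2-D grid; a cell's color can then encode a further value for the records that fall into it.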
Worlds-within-worlds
Assign the function and the two most important parameters to the innermost world.
Fix all other parameters at constant values, and draw the outer worlds (of 1, 2 or 3 dimensions)
using these parameters as the axes.
Software that uses this paradigm.
Real-World Example: Student Performance Analysis
Function (F): Student’s Final Exam Score
Dimensions:
Hours of Study (Study Time)
Attendance (%)
Participation in Activities
Type of Subject (Theory / Practical)
Year of Study (1st, 2nd, 3rd, 4th Year)
How Worlds-Within-Worlds works here
Innermost World (2D Plot):
X-axis = Study Time
Y-axis = Attendance
F (color/height) = Final Exam Score
Fixed Parameters:
Subject = "Mathematics"
Year = "2nd Year"
Next Outer World:
Type of Subject (Theory vs Practical)
Each subject type has its own innermost world (Study Time vs Attendance).
Another Outer World:
Year of Study (1st–4th Year)
Each year contains its subject-type worlds, which then contain the innermost Study vs Attendance
world.
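The nesting above can be sketched in Python by fixing the outer-world parameters and tabulating F over the two inner axes. This is a sketch under stated assumptions: `final_score` is an invented formula with no empirical basis, and the grid values and names (`inner_world`, `worlds`) are hypothetical.

```python
def final_score(study_hours, attendance, subject_type, year):
    """Hypothetical scoring function F, invented purely for illustration."""
    base = 4 * study_hours + 0.5 * attendance
    bonus = 5 if subject_type == "Practical" else 0
    return min(100.0, base + bonus - 2 * (year - 1))


def inner_world(subject_type, year, study_grid, attendance_grid):
    """Fix the outer parameters and tabulate F over the two inner axes.

    Rows follow attendance_grid (Y-axis); columns follow study_grid (X-axis).
    """
    return [
        [final_score(s, a, subject_type, year) for s in study_grid]
        for a in attendance_grid
    ]


study_grid = [2, 4, 6]            # X-axis: study time (hours)
attendance_grid = [60, 80, 100]   # Y-axis: attendance (%)

# Outer worlds: one inner world per (subject type, year) combination.
worlds = {
    (subject_type, year): inner_world(subject_type, year,
                                      study_grid, attendance_grid)
    for year in (1, 2, 3, 4)
    for subject_type in ("Theory", "Practical")
}

for row in worlds[("Theory", 2)]:
    print(row)
```

Choosing a different key in `worlds` corresponds to moving to a different position in the outer worlds, which redraws the inner Study Time vs Attendance plot.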
How to Draw the Diagram in PowerPoint
1. Insert 3 rectangles (Insert → Shapes → Rectangle).
o Place one big outer rectangle → label it Year of Study (1st–4th Year).
o Inside it, add a medium rectangle → label it Subject Type (Theory / Practical).
o Inside that, add a small rectangle → label it Study Time (X) vs Attendance (Y) → Final Exam
Score (F).
2. Apply colors:
o Outermost (Year) → light blue
o Middle (Subject Type) → light green
o Innermost (Study vs Attendance) → light yellow
3. Add bold titles inside each box.
o Example: use 16pt font for outer, 14pt for middle, 12pt for inner.
4. Optional styling:
o Add thicker borders for the outer box.
o Use alignment → center for neat text placement.
Tree-Map:
A Tree-Map is great for showing hierarchical data as nested rectangles, where the size of each rectangle
represents a quantitative value (e.g., sales, marks, revenue), and the color represents another attribute
(e.g., performance level).
It is a screen-filling method that uses a hierarchical partitioning of the screen into regions depending
on the attribute values.
The X and Y dimensions of the screen are partitioned alternately according to the attribute values
(classes).
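The alternating partitioning can be sketched with a simple recursive layout, often called slice-and-dice. This is a minimal sketch, not necessarily the algorithm behind the figure: the tree and its values are invented, the function names are hypothetical, and every leaf is assumed to have a positive value.

```python
def weight(node):
    """Total value of a node: its own value for a leaf, else the sum of its children."""
    if "children" in node:
        return sum(weight(c) for c in node["children"])
    return node["value"]


def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Lay out `node` in rectangle (x, y, w, h), alternating the split axis."""
    if out is None:
        out = {}
    out[node["name"]] = (x, y, w, h)
    total = weight(node)
    offset = 0.0
    for child in node.get("children", []):
        share = weight(child) / total  # fraction of the parent's area
        if depth % 2 == 0:   # even levels: partition along X
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:                # odd levels: partition along Y
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Invented hierarchy and values, for illustration only.
college = {"name": "College", "children": [
    {"name": "CSE", "children": [
        {"name": "CSE 1st Year", "value": 300},
        {"name": "CSE 2nd Year", "value": 100},
    ]},
    {"name": "ECE", "children": [
        {"name": "ECE 1st Year", "value": 400},
    ]},
]}

rects = slice_and_dice(college, 0.0, 0.0, 100.0, 100.0)
for name, rect in rects.items():
    print(name, rect)
```

Each rectangle's area is proportional to its node's total value, so a larger rectangle always means a larger total.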
Here:
[A] = Green rectangle (Good Grade)
[B] = Yellow rectangle (Average Grade)
[C] = Red rectangle (Poor Grade)
Real-World Example: Student Performance in a College
Hierarchy Levels:
1. Department (CSE, ECE, ME, CE)
2. Year (1st, 2nd, 3rd, 4th Year)
3. Individual Students
Metrics:
o Rectangle Size: Student’s total marks
o Rectangle Color: Grade (A=Green, B=Yellow, C=Red)
So, in the treemap:
Each big rectangle = Department
Inside, smaller rectangles = Year
Inside each year = students, with size = marks and color = grade.
This helps teachers quickly see:
Which department has overall better performance
Which year inside a department is weak
Which students are underperforming.
Info Cube
A 3-D visualization technique in which hierarchical information is displayed as nested semi-transparent
cubes.
The outermost cubes correspond to the top-level data, while the subnodes or lower-level data are
represented as smaller cubes inside the outermost cubes, and so on.
Example: University Data Visualization
We want to visualize university academic data in a hierarchical manner.
Outermost Cube (Level 1): University
Next Inner Cubes (Level 2): Faculties (Engineering, Science, Arts,
Commerce)
Next Inner Cubes (Level 3): Departments (e.g., CSE, ECE, Mechanical
inside Engineering)
Innermost Cubes (Level 4): Courses (Python, DBMS, AI under CSE)
So, the user can look at the big cube (University) and zoom inside to see
smaller cubes for Faculties, and further into Departments, and finally
Courses.
This way, hierarchical academic information is represented clearly in
nested semi-transparent 3D cubes.
Three-D Cone Trees
A 3D Cone Tree is a hierarchical visualization technique used to display large tree structures in 3D
space. It was introduced by Robertson et al. at Xerox PARC.
Key Points:
Nodes are arranged in 3D with the root at the top, and children placed in a circular layout
beneath it (like a cone).
When projected onto 2D screens, users can interact by rotating, zooming, or expanding nodes
to reduce overlap.
Works well for up to ~1000 nodes.
Useful for visualizing file systems, organizational structures, or course prerequisites.
Real-World Example: University Course Prerequisites
Root Node: Computer Science Program
Children Nodes: Semester 1, Semester 2, Semester 3 …
Leaf Nodes: Subjects (e.g., Programming, Data Structures, AI, ML, etc.)
Visualization: Each semester’s subjects are arranged around the cone base of that semester.
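The placement of nodes can be sketched as a recursive layout that puts each node at the apex of a cone and spreads its children evenly around the circular base below it. This is a minimal sketch under stated assumptions: the function name `cone_layout`, the radius and vertical drop values, and the halving of the radius at each level are all illustrative choices, and node names are assumed to be unique.

```python
import math


def cone_layout(node, center=(0.0, 0.0, 0.0), radius=4.0, drop=2.0, out=None):
    """Place `node` at `center` and its children on a circle one level below.

    Coordinates are (x, y, z) with y as the vertical axis.
    """
    if out is None:
        out = {}
    out[node["name"]] = center
    children = node.get("children", [])
    cx, cy, cz = center
    for i, child in enumerate(children):
        angle = 2 * math.pi * i / len(children)  # even spacing on the cone base
        child_center = (
            cx + radius * math.cos(angle),
            cy - drop,
            cz + radius * math.sin(angle),
        )
        # Deeper cones get a smaller base so that sibling cones overlap less.
        cone_layout(child, child_center, radius / 2, drop, out)
    return out


program = {"name": "Computer Science Program", "children": [
    {"name": "Semester 1", "children": [{"name": "Programming"}]},
    {"name": "Semester 2", "children": [{"name": "Data Structures"}]},
    {"name": "Semester 3", "children": [{"name": "AI"}, {"name": "ML"}]},
]}

positions = cone_layout(program)
for name, (x, y, z) in positions.items():
    print(f"{name:>26}: ({x:6.2f}, {y:6.2f}, {z:6.2f})")
```

Rotating the tree interactively amounts to adding an offset to `angle` before the positions are projected onto the 2D screen.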
The 3D cone tree visualization technique works well for up to a thousand nodes or so.
To build one, first construct a 2D circle tree that arranges its nodes in concentric circles centered
on the root node.
Overlaps cannot be avoided when the tree is projected to 2D.
Visualizing Complex Data and Relations
Most of the visualization techniques discussed so far were designed mainly for numeric data.
Recently, more and more non-numeric data, such as text and social networks, have become
available.
Many people on the Web tag various objects such as pictures, blog entries, and product reviews.
A tag cloud is a visualization of statistics of user-generated tags.
Often, in a tag cloud, tags are listed alphabetically or in a user-preferred order.
The importance of a tag is indicated by font size or color.
Visualizing non-numeric data: Text and social networks
Tag cloud: visualizing user-generated tags.
Importance of tag is represented by font size/color.
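A tag cloud can be sketched by counting tags and scaling each count to a font size. This is a minimal sketch: the tags are invented, the function name `tag_cloud` and the 10 to 32 point size range are assumptions, and importance is taken to be simple tag frequency.

```python
from collections import Counter


def tag_cloud(tags, min_size=10, max_size=32):
    """Return (tag, font_size) pairs, listed alphabetically."""
    counts = Counter(tags)
    if not counts:
        return []
    lo, hi = min(counts.values()), max(counts.values())
    cloud = []
    for tag in sorted(counts):
        if hi == lo:
            size = (min_size + max_size) / 2  # all tags equally important
        else:
            # Linear scaling: least frequent -> min_size, most frequent -> max_size.
            size = min_size + (counts[tag] - lo) * (max_size - min_size) / (hi - lo)
        cloud.append((tag, round(size)))
    return cloud


# Invented tags, for illustration only.
cloud = tag_cloud(["python", "data", "python", "charts", "data", "python"])
for tag, size in cloud:
    print(f"{tag}: {size}pt")
```

A renderer would then draw each tag at its computed size; color could be assigned from the same scaled value.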
Besides text data, there are also methods to visualize relationships, such as those in social networks.
Visualizing Complex Data and Relations
When datasets are large, high-dimensional, and interconnected, traditional charts (bar, line, pie) are
insufficient. Specialized data visualization techniques help reveal patterns, hierarchies, and
relationships effectively.
🔹 Techniques for Complex Data Visualization
Graph / Network Visualization
Use Case: Social media networks, citation graphs, communication links.
Example: Showing how students in a class are connected through group projects.
Hierarchical Visualization
Techniques: Cone Trees, Tree Maps, InfoCubes, Dimensional Stacking.
Example: Course → Department → Faculty → University.
Multi-Dimensional Visualization
Techniques: Parallel Coordinates, Scatterplot Matrix.
Example: Analyzing students’ performance by (Marks, Attendance, Assignments, Projects).
3D & Immersive Visualization
Techniques: Cone Trees, InfoCubes, VR-based exploration.
Example: 3D cube showing University → Departments → Courses → Students.
Pixel-Oriented & Icon-Based Visualization
Use Case: Very large datasets (millions of records).
Example: Heatmaps of student attendance across an academic year.
Geometric Projection Techniques
Techniques: PCA, MDS, t-SNE.
Use Case: Dimensionality reduction for clustering student performance patterns.
🔹 Real-World Example: Student Performance Dashboard
Input Data: Marks, Attendance, Age, Study Hours, Activities.
Visualization:
Tree Map → Subjects contribution to final grade.
Cone Tree → Course hierarchy with prerequisites.
Parallel Coordinates → Compare multiple attributes (marks vs. attendance vs. projects).
Network Graph → Group collaboration and peer influence.