CBSE XII – INFORMATICS PRACTICES(065) TUTORIAL
By:
Shabana Anvar
Bhavan’s Public School, Doha
“Data Visualization” refers to the graphical or visual
representation of data using visual elements such as charts, graphs,
and maps.
Data visualization plays an essential role in the representation of
both small and large-scale data.
Its main goal is to distill large datasets into visual graphics to allow
for easy understanding of complex relationships within the data.
Several data visualization libraries are available in Python, including
Matplotlib, Seaborn, and Folium.
PURPOSE OF DATA VISUALIZATION
• Better analysis
• Quick action
• Identifying patterns
• Finding errors
• Understanding the story
• Exploring business insights
• Grasping the Latest Trends
Matplotlib is a Python package (library) used to create 2D graphs
and plots.
Pyplot is a module in matplotlib that supports a wide variety of graphs
and plots, such as line charts, histograms, bar charts, and error charts. It is a
collection of functions within matplotlib that allow us to construct 2D plots
easily and interactively.
Used together with NumPy, it provides a plotting environment similar to MATLAB.
Pyplot provides the state-machine interface to the plotting library in
matplotlib. This means that figures and axes are created implicitly and
automatically to achieve the desired plot.
For example, calling plot() from pyplot automatically creates the necessary figure and axes.
Setting a title then applies that title to the current axes object.
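The implicit behaviour described above can be observed directly. The following is a minimal sketch, assuming an off-screen backend ("Agg") so that it runs without a display; the variable names figures_before and figures_after are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

plt.close("all")                       # make sure no figure is open
figures_before = plt.get_fignums()     # empty list: nothing has been created yet

plt.plot([1, 2, 3, 4], [2, 3, 6, 8])   # no figure or axes was created explicitly
figures_after = plt.get_fignums()      # pyplot has created one figure for us

ax = plt.gca()                         # the implicitly created "current" axes
plt.title("Variation in a and b")      # the title goes to the current axes
print(figures_before, figures_after)
print(ax.get_title())
```

To see the plot in a window, remove the matplotlib.use("Agg") line and call plt.show() at the end.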
Matplotlib – pyplot features
The following features are provided by the matplotlib library for data
visualization.
• Drawing – plots can be drawn based on passed data through
specific functions.
• Customization – plots can be customized as required through the
arguments of the plotting functions: color, style (dashed, dotted),
and width can be set, and labels, a title, and a legend can be added.
• Saving – after drawing and customization, plots can be saved for
future use.
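The three features can be sketched in one short script: draw a line, customize it, and save it with savefig(). This is only an illustration; the file name line_plot.png and the use of the system temporary folder are assumptions, not part of the tutorial.

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

a = [1, 2, 3, 4]
b = [2, 3, 6, 8]

# Drawing and customization: colour, line style, width and a legend label
plt.figure()
plt.plot(a, b, color="g", linestyle=":", linewidth=2, label="b values")
plt.title("Saved line plot")
plt.legend()

# Saving: hypothetical file name inside the system temporary folder
out_path = os.path.join(tempfile.gettempdir(), "line_plot.png")
plt.savefig(out_path, dpi=100)
plt.close()
print("Saved to", out_path)
```

The file format is chosen from the extension, so a name ending in .pdf would save a PDF instead.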
TYPES OF PLOT USING MATPLOTLIB
• LINE PLOT
• BAR GRAPH
• HISTOGRAM
• PIE CHART
• FREQUENCY POLYGON
• BOX PLOT
• SCATTER PLOT
A line chart or line graph is a type of chart which
displays information as a series of data points called ‘markers’
connected by straight line segments.
Line plots are generally used to display trends over time.
A line plot or line graph can be created using the plot() function
available in the pyplot module.
EXAMPLES
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,3,6,8]
plt.plot(a,b)
plt.show()
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,3,6,8]
plt.xlabel("a values")
plt.ylabel("b values")
plt.plot(a,b)
plt.show()
Line Plot customization
➢ Custom line color
➢ Custom line style
➢ Custom line width
➢ Title
➢ Label
➢ Legend
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,3,6,8]
plt.xlabel("a values")
plt.ylabel("b values")
plt.plot(a,b,color='r',linewidth=1,linestyle='--',
marker='d',markersize=8,markeredgecolor='b')
plt.legend(('a',))  # one-element tuple: one label for the one plotted line
plt.title("Variation in a and b")
plt.show()
Multiple Plots
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,3,6,8]
plt.plot(a,b,color='r',linewidth=1,linestyle='-',
marker='d',markersize=8,markeredgecolor='b')
a1=[1,2.5,3,4]
b1=[3,6,2.5,7]
plt.plot(a1,b1,color='b',linewidth=1,linestyle='--',
marker='d',markersize=8,markeredgecolor='b')
plt.xlabel("a values")
plt.ylabel("b values")
plt.legend(('aValues','bValues'))
plt.show()
Multiple Views
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,3,6,8]
plt.subplot(2,1,1)
plt.plot(a,b,color='r',linewidth=1,linestyle='-',
marker='d',markersize=8,markeredgecolor='b')
a1=[1,2.5,3,4]
b1=[3,6,2.5,7]
plt.subplot(2,1,2)
plt.plot(a1,b1,color='b',linewidth=1,linestyle='--',
marker='d',markersize=8,markeredgecolor='b')
plt.xlabel("a values")
plt.ylabel("b values")
plt.legend(('bValues',))  # this subplot contains only the second line
plt.show()
Multiple Views
import matplotlib.pyplot as plt
a=[1,2,3,4]
b=[2,3,6,8]
plt.subplot(2,1,1)
plt.plot(a,b,color='r',linewidth=1,linestyle='-',
marker='d',markersize=8,markeredgecolor='b')
a1=[1,2.5,3,4]
b1=[3,6,2.5,7]
plt.subplot(2,1,2)
plt.plot(a1,b1,color='b',linewidth=1,linestyle='--',
marker='d',markersize=8,markeredgecolor='b')
plt.xlabel("a values")
plt.ylabel("b values")
plt.grid(True)
plt.legend(('bValues',))  # this subplot contains only the second line
plt.show()
BAR CHART
A bar graph (or bar chart) is a graphical display of data using
bars of different heights.
Syntax–
matplotlib.pyplot.bar(a,b)
* The bars can be horizontal or vertical.
* A bar graph makes it easy to compare data between different groups at a
glance.
* A bar graph represents categories on one axis and discrete values on the other. The
goal of a bar graph is to show the relationship between the two axes. Bar graphs can also
show big changes in data over time.
BAR CHART
import matplotlib.pyplot as plt
import numpy as np
x=np.arange(62,70)
print(x)
y=[30,78,42,90,89,55,95,20]
plt.xlabel("x axis")
plt.ylabel("y axis")
plt.bar(x,y)
plt.title("BAR CHART")
plt.show()
Bar graph customization
Custom bar color
plt.bar(index, per,color="green",edgecolor="blue")
Custom line style
plt.bar(index, per,color="green",edgecolor="blue",linewidth=4,linestyle='--')
Custom line width
plt.bar(index, per,color="green",edgecolor="blue",linewidth=4)
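The fragments above use index and per without defining them. A runnable sketch with hypothetical values for both (the category names and percentages here are made up for illustration) could look like this:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

index = ["A", "B", "C", "D"]   # hypothetical categories
per = [94, 85, 45, 25]         # hypothetical percentages

# Green bars with a thick, dashed blue outline
bars = plt.bar(index, per, color="green", edgecolor="blue",
               linewidth=4, linestyle="--")
plt.title("Customized bar chart")

# plt.bar returns one rectangle per bar, which can be inspected
print(len(bars), bars[0].get_height(), bars[0].get_linewidth())
```

Here linewidth and linestyle apply to the outline of each bar, which is why they are usually combined with edgecolor.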
BAR CHART
import matplotlib.pyplot as plt
label = ['Anil', 'Vikas', 'Dharma', 'Mahen',
'Manish', 'Rajesh']
per = [94,85,45,25,50,54]
plt.bar(label, per,color='r',edgecolor='k',linewidth=3,linestyle='-')
plt.xlabel('Student Name', fontsize=5)
plt.ylabel('Percentage', fontsize=5)
plt.title('Percentage of Marks Achieved by Students of Class XII')
plt.show()
BAR CHART
import matplotlib.pyplot as plt
import numpy as np
x=np.arange(62,70)
y=[30,78,42,90,89,55,95,20]
plt.xlabel("x axis")
plt.ylabel("y axis")
plt.barh(x,y)
plt.title("BAR CHART")
plt.show()
HISTOGRAM
A histogram is a graphical representation which organizes a group
of data points into user-specified ranges.
Histogram provides a visual interpretation of numerical data by
showing the number of data points that fall within a specified
range of values (“bins”).
It is similar to a vertical bar graph but without gaps between the
bars.
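How values are counted into bins can be checked from the values that hist() returns. The sketch below uses the same small data set as the first histogram example in this tutorial with bin edges chosen here for illustration. Each bin includes its left edge and excludes its right edge, except the last bin, which includes both.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

y = [20, 22, 25, 24, 22, 25, 27, 24.5]
edges = [20, 22, 24, 26, 28]   # illustrative bin edges: 4 bins of width 2

# hist() returns the count per bin, the bin edges and the drawn bars
counts, bin_edges, patches = plt.hist(y, bins=edges, edgecolor="k")

# 20-22 holds 1 value, 22-24 holds 2, 24-26 holds 4, 26-28 holds 1
for left, right, n in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(left, "to", right, ":", int(n))
```

The counts always add up to the number of data points that fall inside the overall range of the bins.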
HISTOGRAM
import matplotlib.pyplot as plt
y=[20,22,25,24,22,25,27,24.5]
plt.hist(y)
plt.title("HISTOGRAM")
plt.show()
import matplotlib.pyplot as plt
import numpy as np
y=np.random.randn(100)
plt.hist(y)
plt.title("HISTOGRAM")
plt.show()
HISTOGRAM
import matplotlib.pyplot as plt
import numpy as np
y=np.random.randn(1000)
plt.hist(y,bins=25)
plt.title("HISTOGRAM")
plt.show()
import matplotlib.pyplot as plt
import numpy as np
y=np.random.randn(1000)
plt.hist(y,bins=25,edgecolor='r')
plt.title("HISTOGRAM")
plt.show()
HISTOGRAM
import matplotlib.pyplot as plt
y=[20,22,25,21,24,22,27,25,22]
plt.hist(y,bins=[20,21,22,23,24,25,26,27,28])
plt.title("HISTOGRAM")
plt.show()
import matplotlib.pyplot as plt
y=[20,22,25,24,25,27,21,23]
plt.hist(y,bins=[20,21,22,23,24,25,26,27,28],weights=[10,5,3,12,10,5,6,2],
edgecolor='r')
plt.title("HISTOGRAM")
plt.show()
HISTOGRAM
import matplotlib.pyplot as plt
y=[5,15,25,35,45,55]
plt.hist(y,bins=[0,10,20,30,40,50,60],weights=[20,10,45,33,6,8],edgecolor='r',facecolor='y')
plt.title("HISTOGRAM")
plt.xlabel("x axis")
plt.ylabel("y axis")
plt.show()
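The weights argument used in the last examples makes each data value contribute its weight to its bin instead of contributing 1. As a sketch of the effect, the same data can be plotted with and without weights and the returned bin heights compared; the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

y = [5, 15, 25, 35, 45, 55]
edges = [0, 10, 20, 30, 40, 50, 60]
weights = [20, 10, 45, 33, 6, 8]

# Without weights: every value counts once, so each bin has height 1
plain_counts, _, _ = plt.hist(y, bins=edges)

# With weights: each value adds its weight to the bin it falls in
plt.figure()
weighted_counts, _, _ = plt.hist(y, bins=edges, weights=weights,
                                 edgecolor='r', facecolor='y')

print(list(plain_counts))
print(list(weighted_counts))
```

This is a convenient way to draw a histogram from a frequency table: the values are the class midpoints and the weights are the frequencies.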
The Pandas Plot function (Pandas Visualization)
The plot() method of Pandas accepts a considerable number of arguments that
can be used to plot a variety of graphs.
It allows different plot types to be selected by supplying the kind keyword
argument.
The general syntax is:
df.plot(kind=...), where df is a DataFrame and kind accepts a string indicating
the type of plot (for example 'line', 'bar', 'barh', 'hist', 'box', 'pie' or 'scatter').
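As a small sketch of how kind switches the plot type, the same DataFrame can be drawn as a bar chart and as a line chart. The data reuses a few rows of the height and weight values from this tutorial; the titles and variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Name": ["Arnav", "Sheela", "Azhar"],
    "Height": [60, 61, 63],
    "Weight": [47, 89, 52],
})

# kind='bar': one group of bars per row, one bar per numeric column
ax_bar = df.plot(kind="bar", x="Name", y=["Height", "Weight"], title="Bar chart")

# kind='line': one line per numeric column
ax_line = df.plot(kind="line", y=["Height", "Weight"], title="Line chart")

print(len(ax_bar.patches), len(ax_line.lines))
```

Each call to df.plot() returns the matplotlib axes it drew on, so the usual pyplot functions such as plt.xlabel() and plt.show() can still be used afterwards.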
Plotting Histogram
import pandas as pd
import matplotlib.pyplot as plt
data = {'Name':['Arnav', 'Sheela', 'Azhar', 'Bincy', 'Yash',
'Nazar'],
'Height' : [60,61,63,65,61,60],
'Weight' : [47,89,52,58,50,47]}
df=pd.DataFrame(data)
df.plot(kind='hist',edgecolor='Green',linewidth=2,linestyle=':')
plt.show()