Matplotlib
What is Matplotlib?
Data visualization involves exploring data through visual
representations.
The matplotlib package helps you make visually appealing
representations of the data you’re working with.
Matplotlib is extremely flexible.
The following examples will help you get started with a few simple
visualizations.
Installing Matplotlib
Matplotlib runs on all systems, but setup is slightly different
depending on your OS.
If the minimal instructions here don’t work for you, see the more
detailed instructions at http://ehmatthes.github.io/pcc/.
You should also consider installing the Anaconda distribution of
Python from https://continuum.io/downloads/, which includes
matplotlib.
matplotlib on Linux
$ sudo apt-get install python3-matplotlib
matplotlib on OS X
Start a terminal session, open a Python prompt, and enter import matplotlib
to see if it’s already installed on your system.
If not, try this command:
$ pip install --user matplotlib
matplotlib on Windows
You first need to install Visual Studio, which you can do from
https://dev.windows.com/. The Community edition is free.
Then go to https://pypi.python.org/pypi/matplotlib/ or
http://www.lfd.uic.edu/~gohlke/pythonlibs/#matplotlib and download an
appropriate installer file.
Since we are using Anaconda, matplotlib may already be installed.
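Whichever route you take, a quick check from Python can confirm whether the installation worked. The following is a minimal sketch; the helper name is_installed is our own and is not part of matplotlib.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named package can be imported in this environment."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("matplotlib"):
    import matplotlib
    print("matplotlib version:", matplotlib.__version__)
else:
    print("matplotlib is not installed; see the instructions above.")
```

The check only looks the package up, so it is safe to run before anything is installed.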
Line graphs and scatter plots
Making a line graph
import matplotlib.pyplot as plt
x_values = [0, 1, 2, 3, 4, 5]
squares = [0, 1, 4, 9, 16, 25]
plt.plot(x_values, squares)
plt.show()
Making a scatter plot
The scatter() function takes a list of x values and a list of y values,
and a variety of optional arguments. The s=10 argument controls
the size of each point.
import matplotlib.pyplot as plt
x_values = list(range(1000))
squares = [x**2 for x in x_values]
plt.scatter(x_values, squares, s=10)
plt.show()
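The s argument does not have to be a single number. As a small sketch (the sizes list and the shorter data range are our own choices for illustration), passing one size per point makes each marker a different size. Sizes are given as marker area in points squared. The Agg backend line is an assumption so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove to use your viewer
import matplotlib.pyplot as plt

x_values = list(range(0, 100, 10))
squares = [x**2 for x in x_values]

# One size per point: later points are drawn larger.
sizes = [10 + 5 * i for i in range(len(x_values))]

fig, ax = plt.subplots()
points = ax.scatter(x_values, squares, s=sizes)
```

To see the result on screen, drop the matplotlib.use("Agg") line and call plt.show() at the end.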
Customizing plots
Plots can be customized in a wide variety of ways. Just about any
element of a plot can be customized.
Adding titles and labels, and scaling axes
import matplotlib.pyplot as plt
x_values = list(range(1000))
squares = [x**2 for x in x_values]
plt.scatter(x_values, squares, s=10)
plt.title("Square Numbers", fontsize=24)
plt.xlabel("Value", fontsize=18)
plt.ylabel("Square of Value", fontsize=18)
plt.tick_params(axis='both', which='major', labelsize=14)
plt.axis([0, 1100, 0, 1100000])
plt.show()
Using a colormap
A colormap varies the point colors from one shade to another,
based on a certain value for each point.
The value used to determine the color of each point is passed to the
c argument, and the cmap argument specifies which colormap to use.
The edgecolor='none' argument removes the black outline from
each point.
plt.scatter(x_values, squares, c=squares, cmap=plt.cm.Blues,
            edgecolor='none', s=10)
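To see what the colormap is doing, it can help to call it directly. The sketch below assumes the same squares data; the names norm, lightest, darkest, and points are ours, and the Agg backend line is an assumption so that the code runs without a display. It looks up the colors for the smallest and largest values, then draws the scatter plot with a colorbar as a legend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove to use your viewer
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

x_values = list(range(1000))
squares = [x**2 for x in x_values]

# By default, the c values are scaled linearly to 0..1 before the color lookup.
norm = Normalize(vmin=min(squares), vmax=max(squares))

# A colormap is callable: it maps a number in 0..1 to an RGBA tuple.
lightest = plt.cm.Blues(norm(squares[0]))
darkest = plt.cm.Blues(norm(squares[-1]))
print("smallest value ->", lightest)
print("largest value  ->", darkest)

fig, ax = plt.subplots()
points = ax.scatter(x_values, squares, c=squares, cmap=plt.cm.Blues,
                    edgecolor='none', s=10)
fig.colorbar(points, ax=ax, label="Square of Value")  # legend for the colors
```

With the Blues colormap, small values come out pale and large values come out dark blue.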
Emphasizing points
You can plot as much data as you want on one plot.
Here we replot the first and last points larger to emphasize them.
import matplotlib.pyplot as plt
x_values = list(range(1000))
squares = [x**2 for x in x_values]
plt.scatter(x_values, squares, c=squares, cmap=plt.cm.Blues,
            edgecolor='none', s=10)
plt.scatter(x_values[0], squares[0], c='green',
            edgecolor='none', s=100)
plt.scatter(x_values[-1], squares[-1], c='red',
            edgecolor='none', s=100)
plt.title("Square Numbers", fontsize=24)
plt.show()
Removing axes
You can customize or remove axes entirely.
Here’s how to access each axis, and hide it.
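# plt.gca() returns the current axes; in recent versions of matplotlib,
# plt.axes() would create a new, empty axes instead
plt.gca().get_xaxis().set_visible(False)
plt.gca().get_yaxis().set_visible(False)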
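The same idea can be written against an explicit Axes object, which makes it clear which plot is being changed. This is a minimal sketch, assuming the squares data from the earlier examples; the Agg backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove to use your viewer
import matplotlib.pyplot as plt

x_values = list(range(1000))
squares = [x**2 for x in x_values]

fig, ax = plt.subplots()
ax.scatter(x_values, squares, s=10)

# Hide the x and y axes (ticks and labels) on this specific Axes object.
ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)
```

This hides the ticks and labels but leaves the frame around the plot; ax.set_axis_off() hides the frame as well.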
Setting a custom figure size
You can make your plot as big or small as you want.
Before plotting your data, add the following code. The dpi argument
is optional; if you don’t know your system’s resolution you can omit
the argument and adjust the figsize argument accordingly.
plt.figure(dpi=128, figsize=(10, 6))
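The figsize values are in inches, so the size of the figure in pixels is figsize multiplied by dpi. A short sketch of that arithmetic follows; the variable names are ours, and the Agg backend line is an assumption so that the code runs without a display (some interactive backends adjust dpi for high-resolution screens).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove to use your viewer
import matplotlib.pyplot as plt

fig = plt.figure(dpi=128, figsize=(10, 6))

# figsize is in inches, so the size in pixels is inches multiplied by dpi.
width_in, height_in = fig.get_size_inches()
width_px = width_in * fig.dpi
height_px = height_in * fig.dpi
print(f"{width_in:.0f} x {height_in:.0f} inches at {fig.dpi:.0f} dpi "
      f"= {width_px:.0f} x {height_px:.0f} pixels")
```

Here 10 x 6 inches at 128 dpi gives a 1280 x 768 pixel figure.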
Saving a plot
The matplotlib viewer has an interactive save button, but you can
also save your visualizations programmatically.
To do so, replace plt.show() with plt.savefig().
The bbox_inches='tight' argument trims extra whitespace from the
plot.
plt.savefig('squares.png', bbox_inches='tight')
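Saving does not have to go to a named file. As a sketch (the buffer and png_bytes names are ours), savefig() can also write into an in-memory buffer, which may be convenient if the image is going to be passed on to other code rather than kept on disk. The Agg backend line is an assumption so that the code runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove to use your viewer
import matplotlib.pyplot as plt

x_values = [0, 1, 2, 3, 4, 5]
squares = [0, 1, 4, 9, 16, 25]

fig, ax = plt.subplots()
ax.plot(x_values, squares)

# savefig() accepts a file-like object as well as a filename.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
png_bytes = buffer.getvalue()
print(len(png_bytes), "bytes of PNG data")
```

The format argument is needed here because there is no filename extension to infer the file type from.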
Line plot (sine wave example)
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery')
# make data
x = np.linspace(0, 10, 100)
y = 4 + 2 * np.sin(2 * x)
# plot
fig, ax = plt.subplots()
ax.plot(x, y, linewidth=2.0)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
       ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Bar plot
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery')
# make data
np.random.seed(3)
x = 0.5 + np.arange(8)
y = np.random.uniform(2, 7, len(x))
# plot
fig, ax = plt.subplots()
ax.bar(x, y, width=1, edgecolor="white", linewidth=0.7)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
       ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Stem plot
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery')
# make data
np.random.seed(3)
x = 0.5 + np.arange(8)
y = np.random.uniform(2, 7, len(x))
# plot
fig, ax = plt.subplots()
ax.stem(x, y)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Step plot
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery')
# make data
np.random.seed(3)
x = 0.5 + np.arange(8)
y = np.random.uniform(2, 7, len(x))
# plot
fig, ax = plt.subplots()
ax.step(x, y, linewidth=2.5)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
       ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Fill between
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery')
# make data
np.random.seed(1)
x = np.linspace(0, 8, 16)
y1 = 3 + 4*x/8 + np.random.uniform(0.0, 0.5, len(x))
y2 = 1 + 2*x/8 + np.random.uniform(0.0, 0.5, len(x))
# plot
fig, ax = plt.subplots()
ax.fill_between(x, y1, y2, alpha=.5, linewidth=0)
ax.plot(x, (y1 + y2)/2, linewidth=2)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
       ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Stack plot
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery')
# make data
x = np.arange(0, 10, 2)
ay = [1, 1.25, 2, 2.75, 3]
by = [1, 1, 1, 1, 1]
cy = [2, 1, 2, 1, 2]
y = np.vstack([ay, by, cy])
# plot
fig, ax = plt.subplots()
ax.stackplot(x, y)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8), ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Histogram
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery')
# make data
np.random.seed(1)
x = 4 + np.random.normal(0, 1.5, 200)
# plot
fig, ax = plt.subplots()
ax.hist(x, bins=8, linewidth=0.5, edgecolor="white")
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
       ylim=(0, 56), yticks=np.linspace(0, 56, 9))
plt.show()
Box plot
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery')
# make data
np.random.seed(10)
D = np.random.normal((3, 5, 4), (1.25, 1.00, 1.25), (100, 3))
# plot
fig, ax = plt.subplots()
VP = ax.boxplot(D, positions=[2, 4, 6], widths=1.5, patch_artist=True,
                showmeans=False, showfliers=False,
                medianprops={"color": "white", "linewidth": 0.5},
                boxprops={"facecolor": "C0", "edgecolor": "white",
                          "linewidth": 0.5},
                whiskerprops={"color": "C0", "linewidth": 1.5},
                capprops={"color": "C0", "linewidth": 1.5})
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
       ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Violin plot
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery')
# make data
np.random.seed(10)
D = np.random.normal((3, 5, 4), (0.75, 1.00, 0.75), (200, 3))
# plot
fig, ax = plt.subplots()
vp = ax.violinplot(D, [2, 4, 6], widths=2, showmeans=False,
                   showmedians=False, showextrema=False)
# styling
for body in vp['bodies']:
    body.set_alpha(0.9)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8), ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Hist2d plot
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery-nogrid')
# make data: correlated + noise
np.random.seed(1)
x = np.random.randn(5000)
y = 1.2 * x + np.random.randn(5000) / 3
# plot:
fig, ax = plt.subplots()
ax.hist2d(x, y, bins=(np.arange(-3, 3, 0.1), np.arange(-3, 3, 0.1)))
ax.set(xlim=(-2, 2), ylim=(-3, 3))
plt.show()
Hexbin plot
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery-nogrid')
# make data: correlated + noise
np.random.seed(1)
x = np.random.randn(5000)
y = 1.2 * x + np.random.randn(5000) / 3
# plot:
fig, ax = plt.subplots()
ax.hexbin(x, y, gridsize=20)
ax.set(xlim=(-2, 2), ylim=(-3, 3))
plt.show()
Pie plot
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('_mpl-gallery-nogrid')
# make data
x = [1, 2, 3, 4]
colors = plt.get_cmap('Blues')(np.linspace(0.2, 0.7, len(x)))
# plot
fig, ax = plt.subplots()
ax.pie(x, colors=colors, radius=3, center=(4, 4),
       wedgeprops={"linewidth": 1, "edgecolor": "white"}, frame=True)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
       ylim=(0, 8), yticks=np.arange(1, 8))
plt.show()
Multiple plots in one
# importing libraries
import matplotlib.pyplot as plt
import numpy as np
import math
# Get the angles from 0 to 2 pi (360 degrees) in an ndarray object
X = np.arange(0, math.pi*2, 0.05)
# Using built-in trigonometric functions we can directly plot
# the given functions for the given angles
Y1 = np.sin(X)
Y2 = np.cos(X)
Y3 = np.tan(X)
Y4 = np.tanh(X)
# Initialise the subplot function using number of rows and columns
figure, axis = plt.subplots(2, 2)
# For Sine Function
axis[0, 0].plot(X, Y1)
axis[0, 0].set_title("Sine Function")
# For Cosine Function
axis[0, 1].plot(X, Y2)
axis[0, 1].set_title("Cosine Function")
# For Tangent Function
axis[1, 0].plot(X, Y3)
axis[1, 0].set_title("Tangent Function")
# For Tanh Function
axis[1, 1].plot(X, Y4)
axis[1, 1].set_title("Tanh Function")
# Combine all the operations and display
plt.show()
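When the subplots all follow the same pattern, a loop can replace the repeated plot and set_title calls. The following is one possible sketch; the functions list and the choice to limit the tangent plot to the range -10 to 10 are our own, and the Agg backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove to use your viewer
import matplotlib.pyplot as plt
import numpy as np

X = np.arange(0, np.pi * 2, 0.05)

# Pair each title with the function to plot (our own arrangement).
functions = [
    ("Sine Function", np.sin),
    ("Cosine Function", np.cos),
    ("Tangent Function", np.tan),
    ("Tanh Function", np.tanh),
]

figure, axis = plt.subplots(2, 2)

# axis is a 2x2 array of Axes; .flat walks through it row by row.
for ax, (title, func) in zip(axis.flat, functions):
    ax.plot(X, func(X))
    ax.set_title(title)

# Tangent grows very large near pi/2 and 3*pi/2, so clip its y range.
axis[1, 0].set_ylim(-10, 10)

figure.tight_layout()  # keep titles from overlapping neighbouring plots
```

To display the grid, drop the matplotlib.use("Agg") line and call plt.show() at the end.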
Seaborn
What is seaborn?
Seaborn is a visualization library for statistical graphics
in Python.
It provides beautiful default styles and color palettes to make
statistical plots more attractive.
It is built on top of the matplotlib library and is closely integrated
with the data structures from pandas.
Seaborn aims to make visualization the central part of exploring and
understanding data.
It provides dataset-oriented APIs, so that we can switch between
different visual representations of the same variables for a better
understanding of the dataset.
pip install seaborn
Different plots
Seaborn divides plots into the following categories:
Relational plots: These plots are used to understand the relationship between two
variables.
Categorical plots: These plots deal with categorical variables and how they can be
visualized.
Distribution plots: These plots are used for examining univariate and bivariate
distributions.
Regression plots: The regression plots in seaborn are primarily intended to add
a visual guide that helps to emphasize patterns in a dataset during exploratory data
analysis.
Matrix plots: These plots display matrix-like data as a color-encoded grid, for
example a heatmap or a cluster map.
Multi-plot grids: A useful approach is to draw multiple instances of the
same plot on different subsets of the dataset.