Lec-5 Seaborn

Seaborn

Uploaded by

Nure Hafsa
Introduction to Seaborn
Topics Covered
• Seaborn Intro
• Distribution Plots
• Categorical Plots
• Matrix Plots
• Grid Plots
• Regression Plots

Seaborn
• Uses Matplotlib underneath to draw plots
• Statistical data visualization plotting library
• Designed to work with pandas DataFrames
• Comes with built-in datasets
• Installation
  • conda install seaborn
  • pip install seaborn
• Example gallery
  • https://seaborn.pydata.org/examples/index.html

Distribution Plots
• distplot
• jointplot
• pairplot
• rugplot
• kdeplot

• import seaborn as sns
• sns.__version__
• %matplotlib inline  # notebook only
• Built-in datasets:
• tips = sns.load_dataset('tips')
• tips.head()

The distplot shows the distribution of a univariate set of observations
• `distplot` is deprecated and will be removed in seaborn v0.14.0; use histplot() (axes-level) or displot() (figure-level) instead
• https://seaborn.pydata.org/generated/seaborn.distplot.html
• sns.distplot(tips['total_bill'])
• To remove the KDE layer and keep only the histogram, use
• sns.distplot(tips['total_bill'], kde=False, bins=30)
• Kernel Density Estimation (KDE)
  • A way to estimate the probability density function of a continuous random variable
  • Non-parametric: it does not assume a particular distribution family for the data

jointplot
• Combines two univariate distribution plots (one on each margin) with a bivariate plot in the center; used to examine the relationship between two variables
• sns.jointplot(x='total_bill', y='tip', data=tips, kind='scatter')
• sns.jointplot(x='total_bill', y='tip', data=tips, kind='hex')
• sns.jointplot(x='total_bill', y='tip', data=tips, kind='reg')  # other kinds: 'resid', 'kde'

pairplot
• A pairplot visualizes pairwise relationships between the numerical columns in a DataFrame and can use the hue argument to color points based on a categorical column
• sns.pairplot(tips)  # normal
• sns.pairplot(tips, hue='sex', palette='coolwarm')  # with hue
• hue lets you visually encode an additional dimension of information

rugplot
• A rugplot draws dash marks for each point in
a univariate distribution and is a building
block for a KDE plot
• sns.rugplot(tips['total_bill'])

kdeplot
• A KDE plot replaces every observation with a Gaussian (normal) curve centered on that value, then sums the curves to estimate the density
• sns.kdeplot(tips['total_bill'])
• sns.rugplot(tips['total_bill'])
• import numpy as np
• x = np.random.randn(200)
• sns.kdeplot(x, fill=True)

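The idea of placing one Gaussian on each observation can be sketched by hand with NumPy. This is an illustration of the principle, not seaborn's implementation: the helper name `gaussian_kde_manual` is hypothetical, and the bandwidth uses a normal-reference rule of thumb chosen here as an assumption.

```python
import numpy as np

def gaussian_kde_manual(samples, grid, bandwidth):
    """Average one Gaussian bump per observation, evaluated on `grid`."""
    samples = np.asarray(samples, dtype=float)
    # z[i, j] = standardized distance from grid point i to observation j
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    # Averaging (sum divided by n) keeps the total area equal to 1
    return kernels.mean(axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
grid = np.linspace(-6, 6, 601)

# Rule-of-thumb bandwidth (an assumption; seaborn may choose a different one)
bandwidth = 1.06 * x.std(ddof=1) * len(x) ** (-1 / 5)

density = gaussian_kde_manual(x, grid, bandwidth)
area = np.sum(density) * (grid[1] - grid[0])  # numerical integral, close to 1
```

Plotting `grid` against `density` gives a curve comparable to the one kdeplot draws; a smaller bandwidth makes it bumpier, a larger one smoother.
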
Categorical Data Plots
• Main plots:
  • barplot
  • countplot
  • boxplot
  • violinplot
  • stripplot
  • swarmplot
  • catplot

• import seaborn as sns
• sns.__version__
• %matplotlib inline  # notebook only
• Built-in datasets:
• tips = sns.load_dataset('tips')
• tips.head()

barplot
• These plots provide a concise summary of aggregate data based on a categorical feature in the dataset
• sns.barplot(x='sex', y='total_bill', data=tips)
  • By default, shows the mean of total_bill for each value of the categorical column (sex)
• import numpy as np
• sns.barplot(x='sex', y='total_bill', data=tips, estimator=np.std)
  • The estimator is a function that converts a vector to a scalar
  • It is the statistical function applied within each categorical bin

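What the estimator does can be shown without drawing anything. The sketch below uses a small made-up DataFrame and a hypothetical helper, `bar_heights`, that mimics the per-category aggregation; it is not how seaborn is implemented internally.

```python
import numpy as np
import pandas as pd

# Made-up data with the same column names as the tips dataset
df = pd.DataFrame({
    "sex": ["Male", "Female", "Male", "Female", "Male", "Female"],
    "total_bill": [20.0, 10.0, 30.0, 14.0, 40.0, 18.0],
})

def bar_heights(frame, category, value, estimator=np.mean):
    """Apply `estimator` (vector -> scalar) to `value` within each category."""
    return {
        name: float(estimator(group[value].to_numpy()))
        for name, group in frame.groupby(category)
    }

mean_heights = bar_heights(df, "sex", "total_bill")                   # default: mean
std_heights = bar_heights(df, "sex", "total_bill", estimator=np.std)  # spread instead
```

Each dictionary value corresponds to the height of one bar. Note that `np.std` computes the population standard deviation (ddof=0) unless told otherwise.
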
countplot
• Same as barplot, except that the estimator explicitly counts the number of occurrences
• Pass only the x value (no y is needed)
• sns.countplot(x='sex', data=tips)

boxplot
• sns.boxplot(x="day", y="total_bill", data=tips, palette='rainbow')
• Reading each box from bottom to top:
  • Bottom whisker cap: the smallest value within 1.5 × IQR below the first quartile (the minimum when there are no outliers)
  • Bottom edge of the box: first quartile (25%)
  • Line inside the box: second quartile (50%), the median
  • Top edge of the box: third quartile (75%)
  • Top whisker cap: the largest value within 1.5 × IQR above the third quartile (the maximum when there are no outliers)
  • Small diamonds: outliers, which may be erroneous data
• sns.boxplot(x="day", y="total_bill", hue="smoker", data=tips, palette="coolwarm")
  # adding hue draws one box per hue level for each day (here, two boxes per day)

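The quantities a box plot displays can be computed directly. The sketch below uses a small made-up sample and a hypothetical helper, `box_stats`; it assumes the common 1.5 × IQR whisker rule and linear-interpolation percentiles, which may differ in detail from what a given plotting library does.

```python
import numpy as np

def box_stats(values, whis=1.5):
    """Quartiles, whisker ends and outliers under the whis * IQR rule."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    low_fence, high_fence = q1 - whis * iqr, q3 + whis * iqr
    inside = values[(values >= low_fence) & (values <= high_fence)]
    outliers = values[(values < low_fence) | (values > high_fence)]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whisker_low": inside.min(),    # bottom whisker cap
        "whisker_high": inside.max(),   # top whisker cap
        "outliers": outliers,           # drawn as separate markers
    }

bills = [10, 12, 14, 15, 16, 18, 20, 22, 60]  # made-up values; 60 is far from the rest
stats = box_stats(bills)
```

Here the top whisker stops at 22 rather than at the maximum, because 60 lies beyond the upper fence and is reported as an outlier.
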
violinplot
• Similar to a box plot, but with a mirrored, rotated kernel density estimate on both sides
• Used for comparing probability distributions across the levels of one or more categorical variables
• https://www.labxchange.org/library/items/lb:LabXchange:46f64d7a:html:1

violinplot (compare with boxplot)
• sns.violinplot(x="day", y="total_bill", data=tips, hue='sex', split=True, palette='Set1')

stripplot
• An effective complement to a box plot or violin plot when you want to show all observations alongside a summary of the distribution
• Draws a scatter plot in which one variable is categorical
• sns.stripplot(x="day", y="total_bill", data=tips)
• sns.stripplot(x="day", y="total_bill", data=tips, jitter=True, hue='sex', palette='Set1')
• jitter adds small random displacements along the categorical (here horizontal) axis so that points overlap less

swarmplot
• Similar to stripplot, but the points are adjusted so that they do not overlap
• sns.swarmplot(x="day", y="total_bill", data=tips)
• sns.swarmplot(x="day", y="total_bill", hue='sex', data=tips, palette="Set1", dodge=True)  # `split` was renamed to `dodge`

catplot
• Most general form of a categorical plot
• Takes a kind parameter to select the plot type
• sns.catplot(x='sex', y='total_bill', data=tips, kind='box')
• kind options: "strip", "swarm", "box", "violin", "boxen", "point", "bar", or "count"

Combining Categorical Plots
• sns.violinplot(x='day', y='total_bill', data=tips)
• sns.swarmplot(x='day', y='total_bill', data=tips, color='black')

Matrix Plots
• Matrix plots display data as color-coded grids, highlighting patterns or clusters.
• heatmap
• clustermap

• import seaborn as sns
• sns.__version__
• %matplotlib inline  # notebook only
• Built-in datasets:
• flights = sns.load_dataset('flights')
• tips = sns.load_dataset('tips')
• tips.head()
• flights.head()

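Heatmap
• Data should already be in matrix form
• sns.heatmap(tips.corr(numeric_only=True))  # numeric_only=True skips the categorical columns (needed on pandas 2.0+)
• sns.heatmap(tips.corr(numeric_only=True), cmap='coolwarm', annot=True)
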
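"Matrix form" means that both the row labels and the column labels are variables, with a value in each cell. A correlation matrix is one such form. The sketch below builds one from a small made-up DataFrame; it assumes pandas 1.5 or later, where `numeric_only` is available on `corr()`.

```python
import numpy as np
import pandas as pd

# Made-up rows shaped like the tips dataset, including a non-numeric column
df = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 40.0],
    "tip": [1.5, 3.5, 4.0, 6.5],
    "size": [2, 2, 3, 4],
    "day": ["Thur", "Fri", "Sat", "Sun"],
})

# Pairwise correlations of the numeric columns only; 'day' is left out
corr = df.corr(numeric_only=True)
```

The result is a square table indexed by column name on both axes, which is the shape `sns.heatmap(corr, annot=True)` expects.
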
Heatmap (Flights Dataset)
• pvflights = flights.pivot_table(values='passengers', index='month', columns='year')
• sns.heatmap(pvflights)
• sns.heatmap(pvflights, cmap='magma', linecolor='white', linewidths=1)

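The pivot_table call turns long-format records (one row per month and year) into the matrix that heatmap needs. A minimal sketch on made-up numbers, using the same column names as the flights dataset:

```python
import pandas as pd

# Long format: one row per (year, month) pair; the values are illustrative only
long_df = pd.DataFrame({
    "year": [1949, 1949, 1949, 1950, 1950, 1950],
    "month": ["Jan", "Feb", "Mar", "Jan", "Feb", "Mar"],
    "passengers": [100, 110, 120, 105, 115, 125],
})

# Wide format: months become rows, years become columns, passengers fill the cells
wide = long_df.pivot_table(values="passengers", index="month", columns="year")
```

With plain string months the rows come out in alphabetical order. The real flights dataset stores month as an ordered categorical, so the calendar order is preserved there.
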
clustermap
• A clustermap applies hierarchical clustering to create a grouped version of a heatmap.
• sns.clustermap(pvflights)
• sns.clustermap(pvflights, cmap='coolwarm', standard_scale=1)
• Years and months are reordered so that those with similar passenger counts sit next to each other.

Grids
• Grids map plot types onto the rows and columns of a grid, which helps create similar plots separated by features
• PairGrid
• pairplot
• FacetGrid
• JointGrid

• Built-in datasets:
• tips = sns.load_dataset('tips')
• iris = sns.load_dataset('iris')
• iris.head()

PairGrid and pairplot
• PairGrid is a subplot grid for plotting pairwise relationships in a dataset.
• import matplotlib.pyplot as plt
• g = sns.PairGrid(iris)  # just the grid
• g.map(plt.scatter)
• g.map_diag(plt.hist)
• g.map_upper(plt.scatter)
• g.map_lower(sns.kdeplot)
• pairplot (similar to PairGrid)
  • sns.pairplot(iris)
  • sns.pairplot(iris, hue='species', palette='rainbow')

Facet Grid
• FacetGrid is a versatile tool for creating plot grids based on a feature.
• import matplotlib.pyplot as plt
• g = sns.FacetGrid(tips, col="time", row="smoker")  # just the grid
• g = sns.FacetGrid(tips, col="time", row="smoker")
• g = g.map(plt.hist, "total_bill")
• g = sns.FacetGrid(tips, col="time", row="smoker", hue='sex')
• # Notice how the column names come after the plotting function in the map() call
• g = g.map(plt.scatter, "total_bill", "tip").add_legend()

JointGrid
• JointGrid is the general version for jointplot()-type grids
• g = sns.JointGrid(x="total_bill", y="tip", data=tips)
• g = g.plot(sns.regplot, sns.histplot)  # histplot used in place of the deprecated distplot

Regression Plots
• lmplot() visualizes linear models and enables splitting plots by features, while using hue for feature-based coloring.
• sns.lmplot(x='total_bill', y='tip', data=tips)
• sns.lmplot(x='total_bill', y='tip', data=tips, hue='sex')
• sns.lmplot(x='total_bill', y='tip', data=tips, hue='sex', palette='coolwarm')
• Working with markers
  • lmplot kwargs get passed through to regplot
  • regplot has a scatter_kws parameter that gets passed to plt.scatter
  • sns.lmplot(x='total_bill', y='tip', data=tips, hue='sex', palette='coolwarm', markers=['o', 'v'], scatter_kws={'s': 100})
• Using a grid (col or row argument)
  • sns.lmplot(x='total_bill', y='tip', data=tips, col='sex')
• Aspect and height
  • sns.lmplot(x='total_bill', y='tip', data=tips, col='day', hue='sex', palette='coolwarm', aspect=0.6, height=8)  # `size` was renamed to `height` in seaborn 0.9

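The col, height and aspect arguments can be tried on synthetic data. The sketch below is only an illustration under that assumption (made-up values standing in for the tips dataset) and uses a non-interactive backend.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the tips dataset: tip grows roughly linearly with the bill
rng = np.random.default_rng(2)
n = 120
df = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, n),
    "sex": rng.choice(["Male", "Female"], n),
})
df["tip"] = 1.0 + 0.15 * df["total_bill"] + rng.normal(0, 1, n)

# One regression panel per value of 'sex'; each panel is `height` inches tall
# and `height * aspect` inches wide
g = sns.lmplot(x="total_bill", y="tip", data=df, col="sex", height=4, aspect=0.6)

n_panels = g.axes.size
```

lmplot returns a FacetGrid, so the grid methods shown in the Facet Grid slide (for example add_legend) are available on `g` as well.
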
