Course Code L T P E C
EXPLORATORY DATA ANALYSIS
3 0 0 0 3
COURSE OUTCOMES
Upon completion of the course, students will be able to:
CO1: Understand an overview of exploratory data analysis (CDL1)
CO2: Implement data visualization using Python (CDL2)
CO3: Analyze data summarization using software tools (CDL2)
CO4: Perform univariate and bivariate analysis (CDL2)
CO5: Apply data exploration and visualization techniques for multivariate and time series data (CDL2)
CO1: Understand an overview of exploratory data analysis (CDL1)
Exploratory Data Analysis: Definition and importance - Role of EDA - Making sense of data -
EDA vs. Confirmatory Data Analysis - Software tools for EDA - Visual aids for EDA - Data
transformation techniques: merging databases, reshaping and pivoting, transformation techniques
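
As an illustration of the merging, pivoting and reshaping topics in this unit, the following is a minimal sketch using pandas. The tables, column names and values are hypothetical and exist only for this example; they do not come from the prescribed text.

```python
import pandas as pd

# Hypothetical toy tables, invented for illustration only.
students = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Asha", "Ben", "Chitra"],
})
scores = pd.DataFrame({
    "student_id": [1, 1, 2, 2, 3, 3],
    "subject": ["math", "physics", "math", "physics", "math", "physics"],
    "score": [82, 75, 91, 68, 77, 88],
})

# Merging: join the two tables on their shared key column.
merged = students.merge(scores, on="student_id", how="inner")

# Pivoting: one row per student, one column per subject (long to wide).
wide = merged.pivot(index="name", columns="subject", values="score")

# Reshaping back: melt the wide table into long form again.
long_again = wide.reset_index().melt(
    id_vars="name", var_name="subject", value_name="score"
)

print(wide)
print(long_again)
```

An inner join is used here because every student has scores; with incomplete data, the `how` argument would need to be chosen to suit the analysis.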
CO2: Implement data visualization using Matplotlib (CDL2)
Importance and types of Data Visualizations. Tools and Libraries: Python (Matplotlib), R (ggplot2),
etc. Visualizing Univariate Data: Histograms, Box Plots, Bar Charts. Visualizing Multivariate
Data: Scatter Plots, Pair Plots, Heatmaps.
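
A minimal Matplotlib sketch of three of the plot types listed in this unit (histogram, box plot, scatter plot) is given below. The sample values are hypothetical, and the non-interactive "Agg" backend is an assumption made so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumed backend: lets the script run without a display
import matplotlib.pyplot as plt

# Hypothetical sample values, invented for illustration only.
heights = [150, 152, 155, 158, 160, 160, 162, 165, 168, 170, 172, 185]
weights = [48, 50, 53, 55, 58, 60, 61, 64, 66, 70, 72, 90]

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# Univariate: histogram of one variable, split into 5 bins.
counts, bin_edges, _ = axes[0].hist(heights, bins=5)
axes[0].set_title("Histogram of height")

# Univariate: box plot of the same variable.
box = axes[1].boxplot(heights)
axes[1].set_title("Box plot of height")

# Bivariate: scatter plot of two variables against each other.
points = axes[2].scatter(heights, weights)
axes[2].set_xlabel("height")
axes[2].set_ylabel("weight")
axes[2].set_title("Height vs. weight")

fig.tight_layout()
# Call fig.savefig("plots.png") or plt.show() to view the figure.
```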
CO3: Analyze data summarization using software tools (CDL2)
Descriptive Statistics: Measures of Central Tendency (Mean, Median, Mode), Measures of
Dispersion (Range, Variance, Standard Deviation), Skewness and Kurtosis. Data Summarization:
Grouping and Aggregation, Pivot Tables.
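
The measures named in this unit can be sketched with Python's standard `statistics` module, as below. The data values and region labels are hypothetical. Skewness and kurtosis are computed here from population central moments, which is one common convention among several.

```python
import statistics
from collections import defaultdict

# Hypothetical data, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion (population versions).
value_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)

def central_moment(values, order):
    """Return the population central moment of the given order."""
    centre = statistics.mean(values)
    return sum((v - centre) ** order for v in values) / len(values)

# Moment-based skewness and excess kurtosis (assumed convention).
skewness = central_moment(scores, 3) / std_dev ** 3
excess_kurtosis = central_moment(scores, 4) / std_dev ** 4 - 3

# Grouping and aggregation: mean amount per group.
records = [("north", 10), ("south", 20), ("north", 30), ("south", 40)]
groups = defaultdict(list)
for region, amount in records:
    groups[region].append(amount)
group_means = {region: statistics.mean(vals) for region, vals in groups.items()}

print(mean, median, mode, value_range, variance, std_dev)
print(skewness, excess_kurtosis)
print(group_means)
```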
CO4: Perform univariate and bivariate analysis (CDL2)
Introduction to Single Variable: Distributions and Variables - Numerical Summaries of Level and
Spread - Scaling and Standardizing - Inequality - Relationships between Two Variables -
Percentage Tables - Analysing Contingency Tables - Handling Several Batches - Scatterplots and
Resistant Lines.
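
Two of the topics above, standardizing a single variable and building a percentage table from a contingency table, can be sketched with the standard library as follows. The values, group labels and responses are hypothetical.

```python
import statistics
from collections import Counter

# Standardizing: convert each value to a z-score (mean 0, spread 1).
values = [10, 20, 30, 40, 50]  # hypothetical measurements
centre = statistics.mean(values)
spread = statistics.pstdev(values)
z_scores = [(v - centre) / spread for v in values]

# Contingency table: counts of (group, response) pairs.
# Hypothetical survey records, invented for illustration only.
records = (
    [("A", "yes")] * 3 + [("A", "no")] * 1
    + [("B", "yes")] * 2 + [("B", "no")] * 2
)
counts = Counter(records)

# Percentage table: each cell as a percentage of its row (group) total.
row_totals = Counter(group for group, _ in records)
row_pct = {}
for (group, response), n in counts.items():
    row_pct.setdefault(group, {})[response] = 100 * n / row_totals[group]

print(z_scores)
print(row_pct)
```

Row percentages are used here so that groups of different sizes can be compared; column percentages would answer a different question.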
CO5: Apply data exploration and visualization techniques for multivariate and time series data (CDL2)
Introducing a Third Variable - Causal Explanations - Three-Variable Contingency Tables - Time
Series Analysis: Time Series Decomposition, Trend and Seasonality. Anomaly Detection:
Outliers.
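
The decomposition and outlier topics above can be sketched as follows. This is a simplified additive decomposition using a centred moving average, with the 1.5 x IQR rule for outliers. The series, the period of 4 and the function names are assumptions made for this example.

```python
import statistics

def centered_moving_average(series, period):
    """Estimate the trend; None where the window does not fit. Assumes an even period."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        # End points get half weight so the window spans exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend

def seasonal_component(series, trend, period):
    """Average the detrended values at each position within the period."""
    detrended = [[] for _ in range(period)]
    for t, (value, level) in enumerate(zip(series, trend)):
        if level is not None:
            detrended[t % period].append(value - level)
    means = [statistics.mean(vals) for vals in detrended]
    offset = statistics.mean(means)  # centre the seasonal effects on zero
    return [m - offset for m in means]

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

# Hypothetical series: linear trend plus a repeating pattern of period 4.
pattern = [2, -1, -2, 1]
series = [t + pattern[t % 4] for t in range(16)]

trend = centered_moving_average(series, period=4)
seasonal = seasonal_component(series, trend, period=4)

# Hypothetical readings containing one unusually large value.
readings = [10, 12, 11, 13, 12, 11, 10, 12, 45, 11]
outliers = iqr_outliers(readings)

print(trend)
print(seasonal)
print(outliers)
```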
L:45; TOTAL:45 PERIODS
TEXT BOOKS
1. Suresh Kumar Mukhiya, Usman Ahmed, “Hands-On Exploratory Data Analysis with
Python”, Packt Publishing, 2020. (Unit 1)
2. Jake VanderPlas, “Python Data Science Handbook: Essential Tools for Working with
Data”, First Edition, O’Reilly, 2017. (Unit 2)
REFERENCES
1. Eric Pimpler, “Data Visualization and Exploration with R”, Geospatial Training Services,
2017.
2. Claus O. Wilke, “Fundamentals of Data Visualization”, O’Reilly, 2019.
3. Matthew O. Ward, Georges Grinstein, Daniel Keim, “Interactive Data Visualization:
Foundations, Techniques, and Applications”, 2nd Edition, CRC Press, 2015.