DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
NAME OF THE LABORATORY : DATA SCIENCE LABORATORY
SUBJECT CODE : CS3361
YEAR/SEMESTER : II / III
REGULATION : R2021
COURSE OBJECTIVE
1. To understand the Python libraries for data science.
2. To understand the basic statistical and probability measures for data science.
3. To learn descriptive analytics on the benchmark data sets.
4. To apply correlation and regression analytics on standard data sets.
5. To present and interpret data using visualization packages in Python.
COURSE OUTCOMES
On completion of the course, the students will be able to:
CO    COURSE OUTCOME                                                                EXPT. NOS.
CO1   Make use of the Python libraries for data science.                            1, 2, 3, 8
CO2   Make use of the basic statistical and probability measures for data science.  4, 5, 9
CO3   Perform descriptive analytics on the benchmark data sets.                     4, 5
CO4   Perform correlation and regression analytics on standard data sets.           5
CO5   Present and interpret data using visualization packages in Python.            6, 7
LIST OF EXPERIMENTS
Each entry gives the experiment number and name, followed by the mapped CO, PO and PSO.

1. Download, install and explore the features of NumPy, SciPy, Jupyter, Statsmodels and Pandas packages.
   CO  : CO1
   PO  : PO1, PO2, PO3, PO4, PO9, PO10, PO11, PO12
   PSO : PSO1, PSO2, PSO3

2. Working with NumPy arrays
   CO  : CO1
   PO  : PO1, PO2, PO3, PO4, PO9, PO10, PO11, PO12
   PSO : PSO1, PSO2, PSO3

3. Working with Pandas data frames
   CO  : CO1
   PO  : PO1, PO2, PO3, PO4, PO9, PO10, PO11, PO12
   PSO : PSO1, PSO2, PSO3

4. Reading data from text files, Excel and the web and exploring various commands for doing descriptive analytics on the Iris data set.
   CO  : CO2, CO3
   PO  : PO1, PO2, PO3, PO4, PO5, PO9, PO10, PO11, PO12
   PSO : PSO1, PSO2, PSO3

5. Use the diabetes data set from UCI and Pima Indians Diabetes data set for performing the following:
   a. Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard Deviation, Skewness and Kurtosis.
   b. Bivariate analysis: Linear and logistic regression modeling
   c. Multiple Regression analysis
   d. Also compare the results of the above analysis for the two data sets.
   CO  : CO2, CO3, CO4
   PO  : PO1, PO2, PO3, PO4, PO5, PO9, PO10, PO11, PO12
   PSO : PSO1, PSO2, PSO3

6. Apply and explore various plotting functions on UCI data sets.
   a. Normal curves
   b. Density and contour plots
   c. Correlation and scatter plots
   d. Histograms
   e. Three dimensional plotting
   CO  : CO5
   PO  : PO1, PO2, PO3, PO4, PO5, PO9, PO10, PO11, PO12
   PSO : PSO1, PSO2, PSO3

7. Visualizing Geographic Data with Basemap
   CO  : CO5
   PO  : PO1, PO2, PO3, PO4, PO5, PO9, PO10, PO11, PO12
   PSO : PSO1, PSO2, PSO3
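
The univariate measures named in Experiment 5(a) can be sketched with Pandas as follows. This is only a minimal illustration, assuming Pandas is installed; the series name `glucose` and its values are hypothetical toy numbers chosen for readability and are not taken from either diabetes data set.

```python
import pandas as pd

# Hypothetical toy values for illustration only; NOT from the UCI or Pima data sets.
glucose = pd.Series([85, 90, 90, 100, 110, 120, 150], name="glucose")

summary = {
    "frequency": glucose.value_counts().to_dict(),  # count of each distinct value
    "mean": glucose.mean(),
    "median": glucose.median(),
    "mode": glucose.mode().tolist(),                # a list, since ties are possible
    "variance": glucose.var(),                      # sample variance (ddof=1)
    "std_dev": glucose.std(),                       # sample standard deviation (ddof=1)
    "skewness": glucose.skew(),
    "kurtosis": glucose.kurt(),                     # excess kurtosis (normal = 0)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

In the actual experiment the same calls would be applied to a column of the loaded data frame, so that the two data sets can be compared measure by measure.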
ADDITIONAL EXPERIMENTS
8. Perform feature engineering and handling missing values on a dataset using Scikit-learn.
   CO  : CO1
   PO  : PO1, PO2, PO3, PO4, PO9, PO10, PO11, PO12
   PSO : PSO1, PSO2, PSO3

9. Perform hypothesis testing on a given dataset using Python.
   CO  : CO2
   PO  : PO1, PO2, PO3, PO4, PO5, PO9, PO10, PO11, PO12
   PSO : PSO1, PSO2, PSO3
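
One possible shape for the hypothesis test in Experiment 9 is a two-sample t-test, sketched below. The syllabus does not prescribe a specific test, so the choice of Welch's t-test is an assumption, as are the group names, the sample values (hypothetical toy numbers) and the 0.05 significance level. SciPy is assumed to be installed.

```python
from scipy import stats

# Hypothetical toy measurements for two groups; illustration only.
group_a = [5.1, 4.9, 5.0, 5.2, 5.0]
group_b = [6.0, 6.1, 5.9, 6.2, 6.0]

# Null hypothesis: both groups have the same mean.
# equal_var=False selects Welch's t-test, which does not assume equal variances.
result = stats.ttest_ind(group_a, group_b, equal_var=False)

alpha = 0.05  # assumed significance level
print(f"t statistic: {result.statistic:.3f}")
print(f"p-value: {result.pvalue:.6f}")
if result.pvalue < alpha:
    print("Reject the null hypothesis of equal means.")
else:
    print("Fail to reject the null hypothesis of equal means.")
```

With real data, the two lists would be replaced by the relevant columns of the given dataset, and the test would be chosen to match the data type and the question asked.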