Lab 1 Data Visualization and Statistics From Data

The document outlines a laboratory assignment for the CS214 Artificial Intelligence course at IIT Dharwad, consisting of two main problems involving data visualization and statistical analysis using Python. Problem Statement 1 focuses on a diabetes dataset, requiring tasks such as data display, descriptive statistics, scatter plots, and boxplots. Problem Statement 2 involves a wine quality dataset, with similar tasks including data structure analysis, statistical measures, and visualizations, along with a requirement to submit completed Jupyter Notebooks for evaluation.

Indian Institute of Technology Dharwad

CS214: Artificial Intelligence Laboratory

Lab 1: Data Visualization and Statistics from Data

Note
There are two problems that need to be worked out in the lab. Problem
Statement 1 is compulsory. Problem Statement 2 is a bonus.

Problem Statement 1
The Pima Indians Diabetes dataset, containing medical attributes, is
provided in a CSV file ('pima-indians-diabetes.csv'). The dataset includes
various features that provide insights into the conditions leading to
diabetes. These features are: pregs, plas, pres, skin, test, BMI, pedi,
Age, and class.

Write a Python program to perform the following tasks:

1. Display the first 10 tuples of the given dataset using the 'head()'
function.

2. Display the structure of the data to provide details about the number of
entries, data types, and memory usage of the dataset using the 'info()'
function.

3. Generate the descriptive statistics for each numerical attribute using
the 'describe()' function to provide the count, mean, standard deviation,
minimum, 25th percentile (Q1), median (50th percentile or Q2),
75th percentile (Q3), and maximum of the columns.

4. Calculate and display the following statistical measures for each
attribute using the respective functions: mean using 'mean()', median
using 'median()', mode using 'mode()', minimum using 'min()', maximum
using 'max()', and standard deviation using 'std()'. Additionally,
compute the quartiles (Q1, Q2, and Q3) using 'quantile()' and the
Interquartile Range (IQR) by subtracting Q1 from Q3.

5. Generate scatter plots to explore the relationships between skin, Age,
BMI, pregs, and plas.

6. Plot histograms for all numerical attributes along with the kernel
density estimation (KDE) curve.

7. Group the data according to the attribute 'class' using the 'groupby()'
function and draw bar plots to analyze the distribution of the attributes
'BMI', 'Age', 'plas', and 'pres' for each value of the 'class' variable.

8. Create boxplots for all attributes to identify outliers and compare
distributions.
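
Task 4 could be approached along the following lines. This is only a
sketch: the function name summarize is hypothetical, it assumes the CSV has
already been read into a pandas DataFrame, and where a column has several
modes it keeps the smallest one.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per numeric column with the measures named in task 4."""
    num = df.select_dtypes(include="number")
    q1 = num.quantile(0.25)
    q2 = num.quantile(0.50)
    q3 = num.quantile(0.75)
    return pd.DataFrame({
        "mean": num.mean(),
        "median": num.median(),
        # mode() may return several values per column; row 0 is the smallest
        "mode": num.mode().iloc[0],
        "min": num.min(),
        "max": num.max(),
        "std": num.std(),  # sample standard deviation (ddof=1), the pandas default
        "Q1": q1,
        "Q2": q2,
        "Q3": q3,
        "IQR": q3 - q1,  # interquartile range: Q3 minus Q1
    })
```

With df = pd.read_csv('pima-indians-diabetes.csv'), and assuming the file
has a header row with the attribute names, df.head(10), df.info(), and
df.describe() cover tasks 1 to 3, and summarize(df) returns a table with one
row per attribute.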

Problem Statement 2
A dataset related to red variants of the Portuguese "Vinho Verde" wine is
given as a CSV file (winequality-red.csv). This dataset contains the values
of different physicochemical tests from each sample of red wine [1]. The
original goal of the dataset is to model wine quality based on
physicochemical tests. The attributes of the dataset based on
physicochemical tests are fixed acidity, volatile acidity, citric acid,
residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide,
density, pH value, sulphates, alcohol content, and the last attribute is
quality. Each expert graded the wine quality between 0 (very bad) and 10
(very excellent).
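
One way to read the file is sketched below. The helper name load_wine is
hypothetical, and the default separator is an assumption: the UCI copy of
winequality-red.csv uses semicolons, so pass sep=',' if the copy provided in
the lab is comma-separated.

```python
import pandas as pd


def load_wine(source, sep=";"):
    """Read the wine CSV (a path or file-like object) into a DataFrame.

    The separator is an assumption about the file; override it if needed.
    """
    return pd.read_csv(source, sep=sep)
```

Calling load_wine('winequality-red.csv').tail(10) would then show the last
10 tuples of the dataset.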

Write a Python program to perform the following tasks:

1. Convert the given CSV file to a DataFrame and display the last 10
tuples using the 'tail()' function.

2. Display the structure of the data to provide details about the number of
entries, data types, and memory usage of the dataset using the 'info()'
function.

3. Generate the descriptive statistics for each numerical attribute using
the 'describe()' function to provide the count, mean, standard deviation,
minimum, 25th percentile (Q1), median (50th percentile or Q2),
75th percentile (Q3), and maximum of the columns.

5. Generate scatter plots to explore the relationships between the
attributes citric acid, residual sugar, chlorides, pH, sulphates, and
alcohol.

6. Plot histograms for all numerical attributes along with the kernel
density estimation (KDE) curve.

7. Group the dataset by the variable quality using the 'groupby()'
function and draw bar plots to analyze the distribution of the attributes
citric acid, residual sugar, free sulfur dioxide, and total sulfur dioxide
for each quality group.

8. Create boxplots for all attributes to identify outliers and compare
distributions.
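
The plotting tasks (5 to 8) might be sketched with pandas and matplotlib as
follows. All function names are hypothetical; the bar plot shows the
per-group mean, which is only one of several reasonable ways to compare
groups; and the KDE curve drawn by pandas requires SciPy to be installed.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_scatter_matrix(df, columns):
    """Task 5: pairwise scatter plots of the selected attributes."""
    return pd.plotting.scatter_matrix(df[columns], figsize=(9, 9), diagonal="hist")


def plot_hist_kde(df, column, bins=20):
    """Task 6: density-scaled histogram with a KDE curve on the same axes."""
    fig, ax = plt.subplots()
    df[column].plot.hist(ax=ax, bins=bins, density=True, alpha=0.5)
    df[column].plot.kde(ax=ax)  # pandas delegates the estimate to SciPy
    ax.set_xlabel(column)
    return ax


def group_means(df, by, columns):
    """Task 7: mean of each selected attribute within every group."""
    return df.groupby(by)[columns].mean()


def plot_group_bars(df, by, columns):
    """Task 7: grouped bar chart of the per-group means."""
    fig, ax = plt.subplots()
    group_means(df, by, columns).plot.bar(ax=ax)
    ax.set_ylabel("mean value")
    return ax


def plot_boxes(df, columns):
    """Task 8: one boxplot per attribute, each with its own y-axis."""
    return df[columns].plot.box(
        subplots=True, sharey=False, figsize=(3 * len(columns), 4)
    )
```

For example, assuming the column names match the attribute names listed
above, plot_group_bars(df, 'quality', ['citric acid', 'residual sugar',
'free sulfur dioxide', 'total sulfur dioxide']) addresses task 7. In a
Jupyter Notebook the figures display inline.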

Note
Kindly upload the completed Jupyter Notebook for each of the
problem statements separately to Moodle for evaluation.

Deliverables
• Python scripts containing the implementation of all the specified tasks.

• Visualizations generated for scatter plots, histograms, bar plots, and
boxplots.

• A summary of findings from the statistical analysis and visualizations.

References
[1] P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, "Modeling
wine preferences by data mining from physicochemical properties," Decision
Support Systems, vol. 47, no. 4, pp. 547–553, 2009.
