Practise

The document outlines the analysis of a dataset containing student performance data, including demographics and test scores. It provides basic information about the dataset, such as its shape, data types, and summary statistics, and includes visualizations of math score distributions and comparisons based on gender, lunch type, and test preparation courses. The analysis concludes with the calculation of mean scores by gender and the saving of a cleaned dataset to a CSV file.

Uploaded by

Faheem Altaf


import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# load the dataset
df = pd.read_csv("StudentsPerformance.csv")
df.head()

   gender race/ethnicity parental level of education         lunch  \
0  female        group B           bachelor's degree      standard
1  female        group C                some college      standard
2  female        group B             master's degree      standard
3    male        group A          associate's degree  free/reduced
4    male        group C                some college      standard

  test preparation course  math score  reading score  writing score
0                    none          72             72             74
1               completed          69             90             88
2                    none          90             95             93
3                    none          47             57             44
4                    none          76             78             75

#basic info of the dataset

print(df.shape)

(1000, 8)

print(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 gender 1000 non-null object
1 race/ethnicity 1000 non-null object
2 parental level of education 1000 non-null object
3 lunch 1000 non-null object
4 test preparation course 1000 non-null object
5 math score 1000 non-null int64
6 reading score 1000 non-null int64
7 writing score 1000 non-null int64
dtypes: int64(3), object(5)
memory usage: 62.6+ KB
None

print(df.describe())
math score reading score writing score
count 1000.00000 1000.000000 1000.000000
mean 66.08900 69.169000 68.054000
std 15.16308 14.600192 15.195657
min 0.00000 17.000000 10.000000
25% 57.00000 59.000000 57.750000
50% 66.00000 70.000000 69.000000
75% 77.00000 79.000000 79.000000
max 100.00000 100.000000 100.000000

print(df.isnull().sum())

gender 0
race/ethnicity 0
parental level of education 0
lunch 0
test preparation course 0
math score 0
reading score 0
writing score 0
dtype: int64

# 'column_name' was a placeholder; substitute a real column such as 'gender'.
# The original cell raised "NameError: name 'df' is not defined" because it
# was run before the cell that loads df.
df['gender'].unique()

for col in df.columns:
    print(f"{col}: {df[col].unique()}")

gender: ['female' 'male']

race/ethnicity: ['group B' 'group C' 'group A' 'group D' 'group E']
parental level of education: ["bachelor's degree" 'some college'
"master's degree" "associate's degree"
'high school' 'some high school']
lunch: ['standard' 'free/reduced']
test preparation course: ['none' 'completed']
math score: [ 72  69  90  47  76  71  88  40  64  38  58  65  78  50  18  46  54  66
  44  74  73  67  70  62  63  56  97  81  75  57  55  53  59  82  77  33
  52   0  79  39  45  60  61  41  49  30  80  42  27  43  68  85  98  87
  51  99  84  91  83  89  22 100  96  94  48  35  34  86  92  37  28  24
  26  95  36  29  32  93  19  23   8]
reading score: [ 72  90  95  57  78  83  43  64  60  54  52  81  53  75  89  32  42  58
  69  73  71  74  70  65  87  56  61  84  55  44  41  85  59  17  39  80
  37  63  51  49  26  68  45  47  86  34  79  66  67  91 100  76  77  82
  92  93  62  88  50  28  48  46  23  38  94  97  99  31  96  24  29  40]
writing score: [ 74  88  93  44  75  78  92  39  67  50  52  43  73  70  58  86  28  46
  61  63  53  80  72  55  65  38  82  79  83  59  57  54  68  66  62  76
  48  42  87  49  10  34  71  37  56  41  22  81  45  36  89  47  90 100
  64  98  51  40  84  69  33  60  85  91  77  27  94  95  19  35  32  96
  97  99  15  30  23]

df[df["math score"]>70]
df[df["gender"]=='female']

     gender race/ethnicity parental level of education         lunch  \
0    female        group B           bachelor's degree      standard
1    female        group C                some college      standard
2    female        group B             master's degree      standard
5    female        group B          associate's degree      standard
6    female        group B                some college      standard
..      ...            ...                         ...           ...
993  female        group D           bachelor's degree  free/reduced
995  female        group E             master's degree      standard
997  female        group C                 high school  free/reduced
998  female        group D                some college      standard
999  female        group D                some college  free/reduced

    test preparation course  math score  reading score  writing score
0                      none          72             72             74
1                 completed          69             90             88
2                      none          90             95             93
5                      none          71             83             78
6                 completed          88             95             92
..                      ...         ...            ...            ...
993                    none          62             72             74
995               completed          88             99             95
997               completed          59             71             65
998               completed          68             78             77
999                    none          77             86             86

[518 rows x 8 columns]
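
The two filters in the cell above are evaluated separately, and a notebook cell displays only its last expression, which is why only the female subset appears. As a minimal sketch of combining both conditions into one boolean mask, the block below uses a hand-typed sample of the five rows printed by df.head(); the names `sample` and `high_math_female` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Hypothetical sample: the five rows shown by df.head(), typed in by hand.
sample = pd.DataFrame({
    "gender": ["female", "female", "female", "male", "male"],
    "math score": [72, 69, 90, 47, 76],
    "reading score": [72, 90, 95, 57, 78],
})

# Combine conditions with & (and) or | (or); each condition needs its own
# parentheses because & binds more tightly than == and >.
high_math_female = sample[(sample["gender"] == "female") & (sample["math score"] > 70)]

print(high_math_female.index.tolist())
```

On the full dataset the same mask would be written against df instead of the sample frame.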

df.groupby('gender')['math score'].mean()

gender
female 63.633205
male 68.728216
Name: math score, dtype: float64

sns.histplot(df['math score'])

<Axes: xlabel='math score', ylabel='Count'>

plt.figure(figsize=(8, 5))
sns.histplot(df['math score'], kde=True, bins=10)
plt.title("Distribution of Math Scores")
plt.xlabel("Math Score")
plt.ylabel("Frequency")
plt.show()

sns.boxplot(x='gender', y='math score', data=df)
plt.title("Gender vs Math Score")
plt.show()

sns.boxplot(x='lunch', y='math score', data=df)
plt.title("Lunch Type vs Math Score")
plt.show()

sns.boxplot(x='test preparation course', y='math score', data=df)
plt.title("Test Prep vs Math Score")
plt.show()

df.groupby('gender')[['math score', 'reading score', 'writing score']].mean()

        math score  reading score  writing score
gender
female   63.633205      72.608108      72.467181
male     68.728216      65.473029      63.311203
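
The box plots compare math scores by lunch type and test preparation course, but the notebook prints group means only for gender. A possible numeric counterpart, sketched on a hand-typed copy of the five df.head() rows (the `sample` frame and the `means_by_group` dictionary are assumptions made for illustration), is to repeat the same groupby for each categorical column. Five rows are far too few to be representative, so the printed values illustrate only the mechanics.

```python
import pandas as pd

# Hypothetical sample: three columns of the five rows shown by df.head().
sample = pd.DataFrame({
    "lunch": ["standard", "standard", "standard", "free/reduced", "standard"],
    "test preparation course": ["none", "completed", "none", "none", "none"],
    "math score": [72, 69, 90, 47, 76],
})

# One Series of mean math scores per grouping column.
means_by_group = {
    col: sample.groupby(col)["math score"].mean()
    for col in ["lunch", "test preparation course"]
}

for col, means in means_by_group.items():
    print(means, end="\n\n")
```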

df.to_csv("cleaned_dataset.csv", index=False)
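
Passing index=False keeps the row index out of the file, so reading the CSV back yields the same columns without an extra unnamed one. A small round-trip sketch, assuming an in-memory buffer in place of "cleaned_dataset.csv" and a two-row illustrative frame named `sample`:

```python
import io

import pandas as pd

# Hypothetical two-row frame standing in for df.
sample = pd.DataFrame({
    "gender": ["female", "male"],
    "math score": [72, 47],
})

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
sample.to_csv(buffer, index=False)

# Rewind and read the CSV text back into a new DataFrame.
buffer.seek(0)
restored = pd.read_csv(buffer)

print(restored.columns.tolist())
print(restored.shape)
```

Without index=False, the written file would gain a leading index column that read_csv loads as "Unnamed: 0".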
