Thanks to visit codestin.com
Credit goes to www.scribd.com

0% found this document useful (0 votes)
14 views3 pages

Practical 02

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views3 pages

Practical 02

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 3

Dr.

Rafiq Zakaria Campus

P.G. Dept. of Computer Science

M.Sc. III Semester

Data mining and Data warehousing

Practical 02

Aim : To study Exploratory data analysis using python.

Description :

E x p lo ra to ry D a ta A n a ly s is ( E D A ) is a c ru c ial s te p in th e d a ta m in in g p ro c e s s . It h e lp s
in u n d e rs ta n d in g th e u n d er ly in g p a tte r ns in th e d a ta a n d in fo r m s s u b s e q u e n t a n aly s is
o r m o d e lin g . H e re ’ s a p r a c tic a l g u id e o n p e rfo rm in g E D A u s in g P y th o n , u tiliz in g
p o p u la r lib ra rie s s u c h a s P a n d a s , M a tp lo tlib , a n d S e a b o r n

Step-by-Step EDA in Python

1. Import Libraries

S tar t b y im p o r tin g th e n e c e s s a ry lib ra r ie s .

2 . Load the Data

L o a d y o u r d a ta s e t in to a P a n d a s D a taF ra m e .
3 . Understand the Data Structure

U s e b a s ic c o m m a n d s to ex p lo re the d a ta s e t.

4 . Data Cleaning

H a n d le m is s in g va lu e s a n d d u p lic a te s .

# C h e c k fo r m is s in g v alu e s

p r in t( d f.is n ull( ). s um () )

# F ill o r d ro p m is s in g va lu e s

d f .filln a (m e th o d = 'ffill', in p la c e= T ru e ) # E x a m p le : fo rw a rd fill

d f .d ro p _ d u p lic a te s (in p la c e = T r u e)

5 . Univariate Analysis
A n a lyz e in d ivid u a l fe a tu re s .

● Categorical Features :

● Numerical Features

Example Dataset

Y o u c a n u s e an y d a ta s et fo r th is a n a ly s is . P o p u la r c h o ic e s in c lud e th e T ita n ic d a tas e t


o r Iris d a ta s e t, b o th a va ila b le in S e a b o rn o r c a n b e d o w n lo a d e d fro m K a g g le .

Conclusion

E D A is a n ite ra tive p ro c e s s ; yo u m a y n e e d to g o b a c k a n d fo rth to re fin e yo u r


a n a lys is b a s e d o n fin d in g s . T h e in s ig h ts g a in e d d u rin g E D A a re in va lu a b le fo r g u id in g
yo u r d a ta m in in g e ffo rts , m o d e l s e lec tio n , a nd fe a tu re en g in e e rin g .

Output :

E D A is s tu d ied u s in g p y th o n lib ra rie s .

Prepared By : Khan Shagufta (PG Dept Of Comp Sci)

You might also like