Dataframe in Pandas
Q1. Create the following dataframes:
i. CricketPlayers from a list of dictionaries containing the names of five cricket
players, the number of matches played and the average score.
ii. Items from a list of dictionaries containing the names of five items, cost
price, sales price and discount (if any).
iii. Result from a dictionary of series containing the roll numbers of 6 students
and their percentages in the last five years.
iv. Monuments from a dictionary of series containing the names of 10
monuments, the year each was built, its place and who built it.
v. Countries from a list of dictionaries containing the names of 10 countries and their
national animal, bird and currency.
Q2. Consider the following dataframe RESULTSHEET:
        UT1  Half Yearly  UT2  Final
Sharad   57           83   49     89
Mansi    86           67   87     90
Kanika   92           78   45     66
Ramesh   52           84   55     78
Ankita   93           75   87     69
Pranay   98           79   88     96
Here, Names of the students are row labels and term names (UT1, Half
Yearly, UT2 and Final) are the column labels. Answer the following questions
based on the above dataframe:
a. Change the row labels from student name to roll numbers from 1 to 6.
b. Change the column labels to Term1, Term2, Term3, Term4.
c. Add a new column Grade with values 'A', 'A', 'B', 'A', 'C', 'B'.
d. Add a new row for the student with row label 7, marks equal to
49, 56, 75, 58 and grade b.
e. Delete the first row
f. Delete the third column
g. Display 2nd row with all columns
h. Display students who have scored more than 50 in Final exam
i. Check students who have grade as A
j. Display marks in Half Yearly and Final of all students
k. Display marks of students from Mansi to Ankita
l. Display marks of Mansi to Ankita in UT1 and UT2
m. Display marks of Kanika and Ankita in Half Yearly and Final
n. Display first 3 records
o. Display last four records
Q3. Write a Python program to create the following dataframe DOCTOR using
the index values as 10,20,30,40, 50, 60, 70.
ID   NAME      DEPT        EXPERIENCE
101  JOHN      ENT         12
102  SMITH     ORTHOPEDIC  5
103  GEORGE    CARDIOLOGY  10
104  LARA      SKIN        3
105  K GEORGE  MEDICINE    9
106  JOHNSON   ORTHOPEDIC  10
107  LUCY      ENT         3
Q4. Give commands to perform the following operations on the dataframe
DOCTOR:
a. Write code to display the details of LARA using loc.
b. Write code to display the details of LARA and LUCY using iloc.
c. Write code to display all doctors' names.
d. Write code to display all doctors' names along with DEPT.
e. Write code to display the first 3 records from the dataframe.
f. Write code to display the last 4 records from the data frame.
g. Write code to display the department and experience of doctors with
names JOHN and SMITH.
h. Write code to display 2nd to 6th record
i. Display all the odd numbered records.
j. Write code to insert a column named “AGE” giving appropriate values
to each doctor.
Q5. Consider the following DataFrame Flight_Fare:
FL_NO  AIRLINES         FARE
IC701  INDIAN AIRLINES  6500
MU499  SAHARA           9400
AM501  JET AIRWAYS      13400
IC899  INDIAN AIRLINES  8300
IC302  INDIAN AIRLINES  4300
Give the output of the following commands:
a. Flight_Fare[Flight_Fare.index>1]
b. Flight_Fare[(Flight_Fare.FARE>=4000)&(Flight_Fare.FARE<=9000)]
c. Flight_Fare[(Flight_Fare.FL_NO=="IC701")|(Flight_Fare.FL_NO=="AM501")|
(Flight_Fare.FL_NO==" IC302")]
d. Flight_Fare[(Flight_Fare.FARE>=4000)&(Flight_Fare.FARE<=9000)][["FL_NO","FARE"]]
e. Flight_Fare [2:4]
f. Flight_Fare [:4]
g. Flight_Fare [::3]
h. Flight_Fare [:: -3]
i. Flight_Fare [3:]
j. Flight_Fare.loc[1:4,'FL_NO':'FARE']
k. Flight_Fare.loc[1:4,['FL_NO','FARE']]
l. Flight_Fare.iloc[[0,2,4]]
m. Flight_Fare.iloc[:,1:3]
n. Flight_Fare.iloc[1:2,1:3]
o. Flight_Fare.loc[1:3]
p. Flight_Fare.loc[:,'FL_NO':'FARE']
q. Flight_Fare ["Tax%"] = [10,8,9,5,7]
r. Flight_Fare.loc[5]=["MC101", "DECCAN AIRLINES", "3500", "10"]
s. Flight_Fare.loc[:,"Disc%"] = [2,3,2,4,2,3]
t. Flight_Fare =Flight_Fare.drop("Tax%", axis=1)
u. Flight_Fare =Flight_Fare.drop(4, axis=0)
v. Flight_Fare =Flight_Fare.drop([1,4] , axis=0)
w. Flight_Fare.loc[2]
x. Flight_Fare.loc[:,"FL_NO"]
y. Flight_Fare ["FARE"]>=6000
Solutions
Q1
(i) import pandas as pd
a=[{'name':'virat','matches played':180,'avg score':4500},
{'name':'rohit','matches played':150,'avg score':6000},
{'name':'ms dhoni','matches played':120,'avg score':2800}]
f=pd.DataFrame(a)
print(f)
OUTPUT
name matches played avg score
0 virat 180 4500
1 rohit 150 6000
2 ms dhoni 120 2800
(ii) import pandas as pd
m=[{'item':'charger','cost':500,'discount':'5%'},
{'item':'books','cost':750},
{'item':'clock','cost':1200,'discount':'10%'}]
df=pd.DataFrame(m)
print(df)
OUTPUT
item cost discount
0 charger 500 5%
1 books 750 NaN
2 clock 1200 10%
(iii) import pandas as pd
result = {'2015':pd.Series(('78%','56%','90%','79%','60%')),
'2016':pd.Series(('64%','85%','72%','56%','48%')),
'2017':pd.Series(('45%','66%','78%','88%','73%')),
'2018':pd.Series(('70%','56%','38%','89%','94%')),
'2019':pd.Series(('66%','78%','58%','90%','83%'))}
rs=pd.DataFrame(result)
rs.index=[1,2,3,4,5]
print(rs)
OUTPUT
2015 2016 2017 2018 2019
1 78% 64% 45% 70% 66%
2 56% 85% 66% 56% 78%
3 90% 72% 78% 38% 58%
4 79% 56% 88% 89% 90%
5 60% 48% 73% 94% 83%
(iv) import pandas as pd
mn={'monuments':pd.Series(['qutab minar','humayun tomb','lal qila','taj mahal',
'eiffel tower']),
'year':pd.Series([1193,1572,1683,1630,1887]),
'built':pd.Series(['qutab','bega begum','shah jahan','shah jahan','gustave eiffel'])}
pm=pd.DataFrame(mn)
print(pm)
OUTPUT
monuments year built
0 qutab minar 1193 qutab
1 humayun tomb 1572 bega begum
2 lal qila 1683 shah jahan
3 taj mahal 1630 shah jahan
4 eiffel tower 1887 gustave eiffel
(v) import pandas as pd
con={'country':pd.Series(['India','Australia', 'China']),
'national animal':pd.Series(['tiger','kangaroo','Chinese dragon']),
'national bird':pd.Series(['peacock','emu','red crowned crane']),
'currency':pd.Series(['rupee','dollar','renminbi'])}
pp=pd.DataFrame(con)
print(pp)
OUTPUT
country national animal national bird currency
0 India tiger peacock rupee
1 Australia kangaroo emu dollar
2 China Chinese dragon red crowned crane renminbi
Q2.
import pandas as pd
a=[[58,83,49,89],[86,67,87,90],[92,78,45,56],[52,84,55,78],[93,75,87,69],
[98,79,88,96]]
m=pd.DataFrame(a,index=['sharad','mansi','kanika','ramesh','ankita','pranay'],
columns=['ut1','halfyearly','ut2','final'])
print(m)
OUTPUT
ut1 halfyearly ut2 final
sharad 58 83 49 89
mansi 86 67 87 90
kanika 92 78 45 56
ramesh 52 84 55 78
ankita 93 75 87 69
pranay 98 79 88 96
(a) m=m.rename({'sharad':1,'mansi':2,'kanika':3,'ramesh':4,'ankita':5,'pranay':6},
axis="index")
print(m)
OUTPUT
ut1 halfyearly ut2 final
1 58 83 49 89
2 86 67 87 90
3 92 78 45 56
4 52 84 55 78
5 93 75 87 69
6 98 79 88 96
(b) m=m.rename({'ut1':"term1",'halfyearly':"term2",'ut2':"term3",'final':"term4"},
axis="columns")
print(m)
OUTPUT
term1 term2 term3 term4
1 58 83 49 89
2 86 67 87 90
3 92 78 45 56
4 52 84 55 78
5 93 75 87 69
6 98 79 88 96
(c) m['Grade']=['a','b','b','a','a','b']
print(m)
OUTPUT
term1 term2 term3 term4 Grade
1 58 83 49 89 a
2 86 67 87 90 b
3 92 78 45 56 b
4 52 84 55 78 a
5 93 75 87 69 a
6 98 79 88 96 b
(d) m.loc[7]=[49,56,75,58, 'b']
print(m)
OUTPUT
term1 term2 term3 term4 Grade
1 58 83 49 89 a
2 86 67 87 90 b
3 92 78 45 56 b
4 52 84 55 78 a
5 93 75 87 69 a
6 98 79 88 96 b
7 49 56 75 58 b
(e) m.drop(1,axis=0)
OUTPUT
term1 term2 term3 term4 Grade
2 86 67 87 90 b
3 92 78 45 56 b
4 52 84 55 78 a
5 93 75 87 69 a
6 98 79 88 96 b
7 49 56 75 58 b
(f) m.drop('term3',axis=1)
OUTPUT
term1 term2 term4 Grade
1 58 83 89 a
2 86 67 90 b
3 92 78 56 b
4 52 84 78 a
5 93 75 69 a
6 98 79 96 b
7 49 56 58 b
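Parts (e) and (f) call drop without assigning the result. As a minimal sketch (the small frame below is a hypothetical stand-in for m, not the full result sheet), drop returns a new DataFrame and leaves the original unchanged unless the result is assigned back:

```python
import pandas as pd

# Hypothetical stand-in for m: three students, two terms.
m = pd.DataFrame({'term1': [58, 86, 92], 'term2': [83, 67, 78]},
                 index=[1, 2, 3])

dropped = m.drop(1, axis=0)      # new frame without row label 1
print(list(m.index))             # m itself still has every row
print(list(dropped.index))

m = m.drop(1, axis=0)            # assign back to make the deletion persist
print(list(m.index))
```

The same applies to deleting a column with axis=1.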
(g) m.loc[2]
OUTPUT
term1 86
term2 67
term3 87
term4 90
Grade b
Name: 2, dtype: object
(h) m['term4']>50
OUTPUT
1 True
2 True
3 True
4 True
5 True
6 True
7 True
Name: term4, dtype: bool
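The comparison in (h) produces a boolean Series (a mask), not the matching rows. A minimal sketch, using a hypothetical three-row stand-in for m with made-up marks, of passing that mask back to the frame so that only the students who scored more than 50 are displayed:

```python
import pandas as pd

# Hypothetical stand-in for m; the marks are illustrative only.
m_demo = pd.DataFrame({'term4': [89, 45, 58]}, index=[1, 2, 3])

mask = m_demo['term4'] > 50      # boolean Series, one value per row
passed = m_demo[mask]            # keeps only the rows where mask is True
print(passed)
```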
(i) m.loc[:,'Grade']=='a'
OUTPUT
1 True
2 False
3 False
4 True
5 True
6 False
7 False
Name: Grade, dtype: bool
(j) m.loc[:,["term2","term4"]]
or
m[["term2","term4"]]
OUTPUT
term2 term4
1 83 89
2 67 90
3 78 56
4 84 78
5 75 69
6 79 96
7 56 58
(k) m.loc[2:5]
OUTPUT
term1 term2 term3 term4
2 86 67 87 90
3 92 78 45 56
4 52 84 55 78
5 93 75 87 69
(l) m.loc[2:5,['term1','term3']]
OUTPUT
term1 term3
2 86 87
3 92 45
4 52 55
5 93 87
(m) m.loc[[3,5],['term2','term4']]
OUTPUT
term2 term4
3 78 56
5 75 69
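After part (a) the row labels are the integers 1 to 6, so .loc has to be given integers; the strings '3' and '5' are different keys. A small sketch on a hypothetical one-column stand-in for m shows both forms:

```python
import pandas as pd

# Hypothetical stand-in for m with integer row labels.
m_demo = pd.DataFrame({'term2': [83, 67, 78, 84, 75]}, index=[1, 2, 3, 4, 5])

print(m_demo.loc[2:4])           # label slice, both end labels included
print(m_demo.loc[[3, 5]])        # list of integer labels

try:
    m_demo.loc[['3', '5']]       # string labels are not in an integer index
    string_lookup_failed = False
except KeyError:
    string_lookup_failed = True
print(string_lookup_failed)
```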
(n) m.head(3)
OUTPUT
term1 term2 term3 term4
1 58 83 49 89
2 86 67 87 90
3 92 78 45 56
(o) m.tail(4)
OUTPUT
term1 term2 term3 term4
3 92 78 45 56
4 52 84 55 78
5 93 75 87 69
6 98 79 88 96
Q3.
import pandas as pd
a={'ID':[101,102,103,104,105,106,107],
'NAME':['JOHN','SMITH','GEORGE','LARA','K GEORGE','JOHNSON','LUCY'],
'DEPT':['ENT','ORTHOPEDIC','CARDIOLOGY','SKIN','MEDICINE','ORTHOPEDIC','ENT'],
'EXPERIENCE':[12,5,10,3,9,10,3]}
df1=pd.DataFrame(a,index=[10,20,30,40, 50, 60, 70])
print(df1)
OUTPUT
ID NAME DEPT EXPERIENCE
10 101 JOHN ENT 12
20 102 SMITH ORTHOPEDIC 5
30 103 GEORGE CARDIOLOGY 10
40 104 LARA SKIN 3
50 105 K GEORGE MEDICINE 9
60 106 JOHNSON ORTHOPEDIC 10
70 107 LUCY ENT 3
Q4
(a) df1.loc[40]
OUTPUT
ID 104
NAME LARA
DEPT SKIN
EXPERIENCE 3
Name: 40, dtype: object
(b) df1.iloc[[3,6]]
OUTPUT
ID NAME DEPT EXPERIENCE
40 104 LARA SKIN 3
70 107 LUCY ENT 3
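iloc selects by integer position counted from 0, not by the index labels 10 to 70, so LARA (label 40) is position 3 and LUCY (label 70) is position 6. One way to avoid counting by hand, sketched here on a cut-down hypothetical copy of DOCTOR, is to look the positions up from the names with NumPy's flatnonzero:

```python
import numpy as np
import pandas as pd

# Cut-down, hypothetical copy of DOCTOR (four of the seven rows).
df_demo = pd.DataFrame({'ID': [101, 104, 105, 107],
                        'NAME': ['JOHN', 'LARA', 'K GEORGE', 'LUCY']},
                       index=[10, 40, 50, 70])

# Positions (not labels) of the rows whose NAME is LARA or LUCY.
is_wanted = df_demo['NAME'].isin(['LARA', 'LUCY']).to_numpy()
positions = np.flatnonzero(is_wanted)
print(positions)
print(df_demo.iloc[positions])
```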
(c) df1.NAME
OUTPUT
10 JOHN
20 SMITH
30 GEORGE
40 LARA
50 K GEORGE
60 JOHNSON
70 LUCY
Name: NAME, dtype: object
(d) df1.loc[:,["NAME","DEPT"]]
OUTPUT
NAME DEPT
10 JOHN ENT
20 SMITH ORTHOPEDIC
30 GEORGE CARDIOLOGY
40 LARA SKIN
50 K GEORGE MEDICINE
60 JOHNSON ORTHOPEDIC
70 LUCY ENT
(e) df1.head(3)
OUTPUT
ID NAME DEPT EXPERIENCE
10 101 JOHN ENT 12
20 102 SMITH ORTHOPEDIC 5
30 103 GEORGE CARDIOLOGY 10
(f) df1.tail(4)
OUTPUT
ID NAME DEPT EXPERIENCE
40 104 LARA SKIN 3
50 105 K GEORGE MEDICINE 9
60 106 JOHNSON ORTHOPEDIC 10
70 107 LUCY ENT 3
(g) df1.loc[[10,20],['DEPT','EXPERIENCE']]
OUTPUT
DEPT EXPERIENCE
10 ENT 12
20 ORTHOPEDIC 5
(h) df1.loc[20:60]
OUTPUT
ID NAME DEPT EXPERIENCE
20 102 SMITH ORTHOPEDIC 5
30 103 GEORGE CARDIOLOGY 10
40 104 LARA SKIN 3
50 105 K GEORGE MEDICINE 9
60 106 JOHNSON ORTHOPEDIC 10
(i) df1.iloc[::2]
OUTPUT
ID NAME DEPT EXPERIENCE
10 101 JOHN ENT 12
30 103 GEORGE CARDIOLOGY 10
50 105 K GEORGE MEDICINE 9
70 107 LUCY ENT 3
(j) df1["AGE"]=[50,65,45,38,45,39,52]
OUTPUT
ID NAME DEPT EXPERIENCE AGE
10 101 JOHN ENT 12 50
20 102 SMITH ORTHOPEDIC 5 65
30 103 GEORGE CARDIOLOGY 10 45
40 104 LARA SKIN 3 38
50 105 K GEORGE MEDICINE 9 45
60 106 JOHNSON ORTHOPEDIC 10 39
70 107 LUCY ENT 3 52
Q5
(a) Flight_Fare[Flight_Fare.index>1]
OUTPUT
FL_NO AIRLINES FARE
2 AM501 JET AIRWAYS 13400
3 IC899 INDIAN AIRLINES 8300
4 IC302 INDIAN AIRLINES 4300
(b) Flight_Fare [( Flight_Fare .FARE>=4000)&( Flight_Fare .FARE<=9000)]
OUTPUT
FL_NO AIRLINES FARE
0 IC701 INDIAN AIRLINES 6500
3 IC899 INDIAN AIRLINES 8300
4 IC302 INDIAN AIRLINES 4300
(c) Flight_Fare[(Flight_Fare.FL_NO=="IC701")|(Flight_Fare.FL_NO=="AM501")|
(Flight_Fare.FL_NO==" IC302")]
OUTPUT
FL_NO AIRLINES FARE
0 IC701 INDIAN AIRLINES 6500
2 AM501 JET AIRWAYS 13400
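IC302 is missing from the output of (c) because the command compares against " IC302" with a leading space, which is not equal to the stored value "IC302". A minimal sketch of the same filter written with isin and the literal without the space (the two-column frame and the name fares_c are assumptions for illustration):

```python
import pandas as pd

# Two columns of Flight_Fare, taken from the table in the question.
fares_c = pd.DataFrame({
    'FL_NO': ['IC701', 'MU499', 'AM501', 'IC899', 'IC302'],
    'FARE': [6500, 9400, 13400, 8300, 4300]})

# The literal with a leading space matches no row at all.
print((fares_c.FL_NO == " IC302").any())

# isin with the exact flight numbers selects all three flights.
wanted = fares_c[fares_c.FL_NO.isin(["IC701", "AM501", "IC302"])]
print(wanted)
```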
(d) Flight_Fare[(Flight_Fare.FARE>=4000)&(Flight_Fare.FARE<=9000)][["FL_NO",
"FARE"]]
OUTPUT
FL_NO FARE
0 IC701 6500
3 IC899 8300
4 IC302 4300
(e) Flight_Fare[2:4]
OUTPUT
FL_NO AIRLINES FARE
2 AM501 JET AIRWAYS 13400
3 IC899 INDIAN AIRLINES 8300
(f) Flight_Fare [:4]
OUTPUT
FL_NO AIRLINES FARE
0 IC701 INDIAN AIRLINES 6500
1 MU499 SAHARA 9400
2 AM501 JET AIRWAYS 13400
3 IC899 INDIAN AIRLINES 8300
(g) Flight_Fare[::3]
OUTPUT
FL_NO AIRLINES FARE
0 IC701 INDIAN AIRLINES 6500
3 IC899 INDIAN AIRLINES 8300
(h) Flight_Fare[::-3]
OUTPUT
FL_NO AIRLINES FARE
4 IC302 INDIAN AIRLINES 4300
1 MU499 SAHARA 9400
(i) Flight_Fare[3:]
OUTPUT
FL_NO AIRLINES FARE
3 IC899 INDIAN AIRLINES 8300
4 IC302 INDIAN AIRLINES 4300
(j) Flight_Fare.loc[1:4,'FL_NO':'FARE']
OUTPUT
FL_NO AIRLINES FARE
1 MU499 SAHARA 9400
2 AM501 JET AIRWAYS 13400
3 IC899 INDIAN AIRLINES 8300
4 IC302 INDIAN AIRLINES 4300
(k) Flight_Fare.loc[1:4,['FL_NO','FARE']]
OUTPUT
FL_NO FARE
1 MU499 9400
2 AM501 13400
3 IC899 8300
4 IC302 4300
(l) Flight_Fare.iloc[[0,2,4]]
OUTPUT
FL_NO AIRLINES FARE
0 IC701 INDIAN AIRLINES 6500
2 AM501 JET AIRWAYS 13400
4 IC302 INDIAN AIRLINES 4300
(m) Flight_Fare.iloc[:,1:3]
OUTPUT
AIRLINES FARE
0 INDIAN AIRLINES 6500
1 SAHARA 9400
2 JET AIRWAYS 13400
3 INDIAN AIRLINES 8300
4 INDIAN AIRLINES 4300
(n) Flight_Fare.iloc[1:2,1:3]
OUTPUT
AIRLINES FARE
1 SAHARA 9400
(o) Flight_Fare.loc[1:3]
OUTPUT
FL_NO AIRLINES FARE
1 MU499 SAHARA 9400
2 AM501 JET AIRWAYS 13400
3 IC899 INDIAN AIRLINES 8300
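Parts (e), (n) and (o) differ in how the end of the slice is treated: a .loc slice works on labels and includes the end label, while .iloc and a plain [start:stop] slice work on positions and exclude the stop. A short sketch on a two-column copy of the frame (the name fares_s is an assumption for illustration):

```python
import pandas as pd

# Two columns of Flight_Fare with the default index 0 to 4.
fares_s = pd.DataFrame({
    'FL_NO': ['IC701', 'MU499', 'AM501', 'IC899', 'IC302'],
    'FARE': [6500, 9400, 13400, 8300, 4300]})

by_label = fares_s.loc[1:3]      # labels 1, 2 and 3
by_position = fares_s.iloc[1:3]  # positions 1 and 2 only
plain = fares_s[1:3]             # plain slice behaves like iloc here

print(list(by_label.index))
print(list(by_position.index))
print(list(plain.index))
```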
(p) Flight_Fare.loc[:,'FL_NO':'FARE']
OUTPUT
FL_NO AIRLINES FARE
0 IC701 INDIAN AIRLINES 6500
1 MU499 SAHARA 9400
2 AM501 JET AIRWAYS 13400
3 IC899 INDIAN AIRLINES 8300
4 IC302 INDIAN AIRLINES 4300
(q) Flight_Fare ["Tax%"] = [10,8,9,5,7]
OUTPUT
FL_NO AIRLINES FARE Tax%
0 IC701 INDIAN AIRLINES 6500 10
1 MU499 SAHARA 9400 8
2 AM501 JET AIRWAYS 13400 9
3 IC899 INDIAN AIRLINES 8300 5
4 IC302 INDIAN AIRLINES 4300 7
(r) Flight_Fare.loc[5]=["MC101", "DECCAN AIRLINES", "3500", "10"]
OUTPUT
FL_NO AIRLINES FARE Tax%
0 IC701 INDIAN AIRLINES 6500 10
1 MU499 SAHARA 9400 8
2 AM501 JET AIRWAYS 13400 9
3 IC899 INDIAN AIRLINES 8300 5
4 IC302 INDIAN AIRLINES 4300 7
5 MC101 DECCAN AIRLINES 3500 10
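The command in (r) passes the fare and the tax as the strings "3500" and "10". The printed frame looks the same, but the FARE and Tax% columns then hold a mix of numbers and text, so numeric comparisons such as FARE>=4000 are no longer safe. A minimal sketch on a hypothetical two-row stand-in (the name ff is an assumption) that checks the dtype and converts the columns back with pd.to_numeric:

```python
import pandas as pd

# Hypothetical two-row stand-in for Flight_Fare after step (q).
ff = pd.DataFrame({'FL_NO': ['IC701', 'MU499'],
                   'AIRLINES': ['INDIAN AIRLINES', 'SAHARA'],
                   'FARE': [6500, 9400],
                   'Tax%': [10, 8]})

ff.loc[2] = ["MC101", "DECCAN AIRLINES", "3500", "10"]   # strings, as in (r)
dtype_after_insert = ff['FARE'].dtype
print(dtype_after_insert)        # no longer an integer dtype

# Convert the text back to numbers before comparing or calculating.
ff['FARE'] = pd.to_numeric(ff['FARE'])
ff['Tax%'] = pd.to_numeric(ff['Tax%'])
print(ff.dtypes)
print(ff[ff['FARE'] >= 4000])
```

Passing 3500 and 10 as plain integers in the list avoids the conversion step altogether.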
(s) Flight_Fare.loc[:,"Disc%"] = [2,3,2,4,2,3]
OUTPUT
FL_NO AIRLINES FARE Tax% Disc%
0 IC701 INDIAN AIRLINES 6500 10 2
1 MU499 SAHARA 9400 8 3
2 AM501 JET AIRWAYS 13400 9 2
3 IC899 INDIAN AIRLINES 8300 5 4
4 IC302 INDIAN AIRLINES 4300 7 2
5 MC101 DECCAN AIRLINES 3500 10 3
(t) Flight_Fare=Flight_Fare.drop("Tax%", axis=1)
OUTPUT
FL_NO AIRLINES FARE Disc%
0 IC701 INDIAN AIRLINES 6500 2
1 MU499 SAHARA 9400 3
2 AM501 JET AIRWAYS 13400 2
3 IC899 INDIAN AIRLINES 8300 4
4 IC302 INDIAN AIRLINES 4300 2
5 MC101 DECCAN AIRLINES 3500 3
(u) Flight_Fare=Flight_Fare.drop(4, axis=0)
OUTPUT
FL_NO AIRLINES FARE Disc%
0 IC701 INDIAN AIRLINES 6500 2
1 MU499 SAHARA 9400 3
2 AM501 JET AIRWAYS 13400 2
3 IC899 INDIAN AIRLINES 8300 4
5 MC101 DECCAN AIRLINES 3500 3
(v) Flight_Fare=Flight_Fare.drop([1,4], axis=0)
OUTPUT
FL_NO AIRLINES FARE Disc%
0 IC701 INDIAN AIRLINES 6500 2
2 AM501 JET AIRWAYS 13400 2
3 IC899 INDIAN AIRLINES 8300 4
5 MC101 DECCAN AIRLINES 3500 3
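drop raises a KeyError when any requested label is missing, which is what happens if (v) is run after the assignment in (u) has already removed label 4. A small sketch on a hypothetical one-column frame with labels 0, 1, 2, 3 and 5 shows the error and the errors='ignore' option that skips missing labels:

```python
import pandas as pd

# Hypothetical frame whose label 4 has already been dropped.
demo = pd.DataFrame({'FARE': [6500, 9400, 13400, 8300, 3500]},
                    index=[0, 1, 2, 3, 5])

try:
    demo.drop([1, 4], axis=0)    # label 4 is not present
    raised = False
except KeyError:
    raised = True
print(raised)

# errors='ignore' drops the labels that exist and skips the rest.
remaining = demo.drop([1, 4], axis=0, errors='ignore')
print(list(remaining.index))
```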
(w) Flight_Fare.loc[2]
OUTPUT
FL_NO AM501
AIRLINES JET AIRWAYS
FARE 13400
Name: 2, dtype: object
(x) Flight_Fare.loc[:,"FL_NO"]
OUTPUT
0 IC701
1 MU499
2 AM501
3 IC899
4 IC302
Name: FL_NO, dtype: object
(y) Flight_Fare["FARE"]>=6000
OUTPUT
0 True
1 True
2 True
3 True
4 False
Name: FARE, dtype: bool
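The Q5 commands assume that Flight_Fare already exists. A minimal sketch of building it from the table in the question, assuming the default index 0 to 4 that the outputs above rely on:

```python
import pandas as pd

# Data copied from the Flight_Fare table in Q5; the default
# RangeIndex supplies the row labels 0 to 4.
Flight_Fare = pd.DataFrame({
    'FL_NO': ['IC701', 'MU499', 'AM501', 'IC899', 'IC302'],
    'AIRLINES': ['INDIAN AIRLINES', 'SAHARA', 'JET AIRWAYS',
                 'INDIAN AIRLINES', 'INDIAN AIRLINES'],
    'FARE': [6500, 9400, 13400, 8300, 4300]})

print(Flight_Fare)
print(Flight_Fare[Flight_Fare.index > 1])   # part (a)
print(Flight_Fare[::-3])                    # part (h)
```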