Data frames
Pandas Data Structure
Methods to create data frames
● Using Lists
● Using Series
● Using Dictionary
● Using Numpy arrays
How to create a dataframe?
Syntax:
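pandas.DataFrame(data, index, columns, dtype, copy)
data : can be a list, Series, dictionary, constant or numpy array
index : the row labels. By default they are 0 to n-1
columns : the column labels. By default they are also 0 to n-1
dtype : data type to force on the data. By default it is None, in which case pandas infers the type of each column
copy : whether to copy the input data. The default is given here as False; recent pandas versions default to None, which chooses the behaviour from the type of data passed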
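The parameters above can be tried together in one call. The following is a minimal sketch; the values and the labels 'r1', 'r2', 'a', 'b' are made up for illustration.

```python
import pandas as pd

# Illustrative values only; the labels below are hypothetical.
df = pd.DataFrame(
    data=[[1, 2], [3, 4]],   # two rows, two columns
    index=['r1', 'r2'],      # row labels instead of the default 0..n-1
    columns=['a', 'b'],      # column labels
    dtype=float,             # force every column to float
)
print(df)
print(df.dtypes)
```

Leaving out index and columns gives the default labels 0 to n-1 for both.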
Creating from Lists
import pandas as pd
l = [10, 20, 30, 40]
df = pd.DataFrame(l)
print(df)
Output:
    0
0  10
1  20
2  30
3  40
Index automatically generated.
Creating dataframe from student list
import pandas as pd
data1 = [['Shreyas', 20], ['Risha', 18]]
df1 = pd.DataFrame(data1, columns=['name', 'age'])    # defining column names
print(df1)    # index is automatically generated
df1 = pd.DataFrame([['Shreyas', 20], ['Risha', 18]])    # default column names 0, 1
Creating dataframe from a dictionary having lists as values:
dict1 = {'Students': ['Ninu', 'Minu', 'Neha'],
         'Marks': [23, 24, 22],
         'Sports': ['Cricket', 'Kabbadi', 'Tennis']}
df = pd.DataFrame(dict1)
The created DataFrame has an automatically generated index, and the dictionary keys become the column names.
A new index can be specified by passing an index sequence.
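As a minimal sketch of specifying a new index, the dictionary above can be passed together with an index argument. The labels 'I', 'II', 'III' are hypothetical; any sequence with one label per row would do.

```python
import pandas as pd

dict1 = {'Students': ['Ninu', 'Minu', 'Neha'],
         'Marks': [23, 24, 22],
         'Sports': ['Cricket', 'Kabbadi', 'Tennis']}

# The index labels are made up for illustration; their count must match
# the length of the lists in the dictionary.
df = pd.DataFrame(dict1, index=['I', 'II', 'III'])
print(df)
```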
Inner dictionaries having non-matching inner keys:
All the inner keys become index labels.
NaN values are added where an inner dictionary does not have a given key.
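The behaviour with non-matching inner keys can be sketched as follows. The subject names and marks are made-up data: 'Maths' has no entry for 'Neha' and 'Eng' has no entry for 'Anu'.

```python
import pandas as pd

# Hypothetical data with non-matching inner keys.
marks = {'Maths': {'Anu': 22, 'Vinu': 24},
         'Eng': {'Vinu': 23, 'Neha': 25}}

df = pd.DataFrame(marks)
print(df)

# Every inner key appears in the index; missing cells hold NaN.
print(df.isna())
```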
Selecting/accessing data of a dataframe
Using square bracket/dot notation
Format: <dataframe object>[<column name>]
Or <dataframe object>.<column name>
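Both notations can be compared in a short sketch. The small DataFrame is rebuilt here so the example is self-contained; the variable names marks_a and marks_b are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Students': ['Ninu', 'Minu', 'Neha'],
                   'Marks': [23, 24, 22]})

marks_a = df['Marks']   # square bracket notation
marks_b = df.Marks      # dot notation

print(marks_a)
print(marks_a.equals(marks_b))
```

Dot notation only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute; square brackets work for any column name.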
Creating dataframe by passing a list of dictionaries
Case 1
import pandas as pd
nstudent = [{'Rinku': 23, 'Ajay': 24, 'Pankaj': 21},
            {'Rinku': 20, 'Ajay': 21, 'Pankaj': 24},
            {'Rinku': 24, 'Ajay': 20, 'Pankaj': 22}]
df = pd.DataFrame(nstudent, index=['m1', 'm2', 'm3'])
print(df)
Columns are formed by the keys:
    Ajay  Pankaj  Rinku
m1    24      21     23
m2    21      24     20
m3    20      22     24
The columns appear sorted here, as in older pandas versions; recent versions keep the order of the keys (Rinku, Ajay, Pankaj).
Case 2
Pankaj’s mark is missing
import pandas as pd
nstudent=[{'Rinku':23,'Ajay':24},
{'Rinku':20,'Ajay':21,'Pankaj':24},
{'Rinku':24,'Ajay':20,'Pankaj':22}]
df=pd.DataFrame(nstudent,index=['m1','m2','m3'])
print(df)
NaN is automatically added in the missing places.
Creating a dataframe from a numpy array
By giving a columns sequence and an index sequence, we can specify our own column and index names or labels.
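A minimal sketch of both variants, with and without custom labels, is given below. The array values and the labels 'x', 'y', 'z', 'first', 'second' are made up for illustration.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# Without labels: rows and columns are numbered from 0.
df_default = pd.DataFrame(arr)
print(df_default)

# With our own labels (hypothetical names).
df_named = pd.DataFrame(arr,
                        columns=['x', 'y', 'z'],
                        index=['first', 'second'])
print(df_named)
```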
Creating a dataframe from a dictionary of Series:
import pandas as pd
staff = pd.Series([20, 36, 44])
salaries = pd.Series([16000, 246000, 563000])
school = {'People': staff, 'Amount': salaries}
df2 = pd.DataFrame(school)
Here the dataframe object is created using multiple Series objects.
Creating DataFrame from dictionary & Series
import pandas as pd
marks = pd.Series({'Vijay': 22, 'Mina': 23, 'Renu': 24})
age = pd.Series({'Vijay': 17, 'Mina': 16, 'Renu': 17})
student_df = pd.DataFrame({'Marks': marks, 'Age': age})
print(student_df)
       Marks  Age
Vijay     22   17
Mina      23   16
Renu      24   17
Creating DataFrame from dictionary & Series
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df)
   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4
The index of the series forms the index here also.
The data of series 1 and 2 become the values of columns 1 and 2 respectively.
The keys of the dict become the column headings.
Dictionary & Series:-
import pandas as pd
nstudent={'name':pd.Series(['Anu','Vinu','Minu']),
'Eng':pd.Series([23,24,25]),
'Maths':pd.Series([22,24,25])}
df=pd.DataFrame(nstudent)
print(df)
Creating a DataFrame from another dataframe object
You can create a DataFrame object identical to df2 by passing its name to DataFrame(), i.e. df3 is identical to df2.
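The step described above can be sketched as follows. Here df2 is rebuilt from the staff and salaries Series used earlier so that the example runs on its own.

```python
import pandas as pd

staff = pd.Series([20, 36, 44])
salaries = pd.Series([16000, 246000, 563000])
df2 = pd.DataFrame({'People': staff, 'Amount': salaries})

# Pass the existing DataFrame to the constructor.
df3 = pd.DataFrame(df2)

print(df3)
print(df3.equals(df2))   # True: same labels and same values
```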
Q) What will be the output of the following code?
import pandas as pd
import numpy as np
arr1 = np.array([[11, 12], [13, 14], [15, 16]])
df2 = pd.DataFrame(arr1)
print(df2)
Q) Write a program to create a DataFrame to store the weight, age and names of 3 children.
import pandas as pd
df = pd.DataFrame({'weight': [42, 75, 66],
                   'Name': ['Arnav', 'Charles', 'Guru'],
                   'Age': [15, 12, 14]})
print(df)
   weight     Name  Age
0      42    Arnav   15
1      75  Charles   12
2      66     Guru   14
Creating dataframe from 2D dictionary
import pandas as pd
employees = {'Sales': {'name': 'Rohit', 'age': 24},
             'marketing': {'name': 'Neha', 'age': 30}}
Outer dictionary keys (Sales, marketing) become the columns.
Inner dictionary keys (name and age) become the index (labels created from keys are placed in sorted order in older pandas versions).
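A minimal sketch completing the 2D dictionary example is shown below. It assumes the dictionary is passed directly to pd.DataFrame(); the row order of the result may differ between pandas versions, so the lookup uses labels rather than positions.

```python
import pandas as pd

employees = {'Sales': {'name': 'Rohit', 'age': 24},
             'marketing': {'name': 'Neha', 'age': 30}}

df = pd.DataFrame(employees)
print(df)

# Outer keys are the columns, inner keys are the index labels.
print(df.loc['name', 'Sales'])       # Rohit
print(df.loc['age', 'marketing'])    # 30
```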