Pandas Series & DataFrame Tips

The document provides examples of operations that can be performed on pandas Series objects. Some key operations include: creating and manipulating Series, performing mathematical operations on Series, indexing and selecting Series values, converting between Series and other data types like NumPy arrays and DataFrames, and applying functions to modify Series values.

Uploaded by

Pragati jain

import numpy as np
import pandas as pd

df1= pd.Series([2,3,4,5,6])
type(df1)
df1.tolist()
df2= pd.Series([3,4,5,6,7])
df2+df1
df1-df2
df1*df2
df1/df2
To compare elements:
df1==df2
df1>df2
df1<df2
---------------------------------
to convert numpy array into pandas series:
array= np.array([1,2,3,4,5])
pf= pd.Series(array)

---------------------------------
to change the datatype of a column:
s1= pd.Series([1,2,3,4,'python'])
s2= pd.to_numeric(s1,errors='coerce')
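A minimal sketch of what errors='coerce' does, using the same sample values: entries that cannot be parsed become NaN, which is why the result ends up as a float Series.

```python
import pandas as pd

# Mixed Series: four integers and one string that is not a number.
s1 = pd.Series([1, 2, 3, 4, 'python'])

# errors='coerce' turns unparseable entries into NaN instead of raising.
s2 = pd.to_numeric(s1, errors='coerce')

print(s2.dtype)         # NaN forces a float dtype
print(s2.isna().sum())  # how many values could not be converted
```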

----------------------------------
to change one column of a dataframe into a series:
df1= {"col1":[1,2,3,4],"col2":[2,34,56,7],"col3":[23,4,5,6]}   # every column needs the same length
df1= pd.DataFrame(data=df1)
s1= df1.iloc[:,0]   # .ix was removed in pandas 1.0; use .iloc for positions

----------------------------------
to convert a series into a NumPy array:
s1= pd.Series([1,2,3,4,'python'])
nd= np.array(s1.values.tolist())

----------------------------------
to convert a series of lists into one series:
s1= pd.Series([[1,2,3,4],[5,6,7,8],[2,3,4]])
s1= s1.apply(pd.Series).stack().reset_index(drop=True)
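An alternative sketch for the same flattening, assuming a pandas version that has Series.explode (0.25 or later). The names nested and flat are only for this example.

```python
import pandas as pd

# A Series whose elements are lists of different lengths.
nested = pd.Series([[1, 2, 3, 4], [5, 6, 7, 8], [2, 3, 4]])

# explode() gives each list element its own row; the result has object dtype,
# so cast back to int and rebuild a 0..n-1 index.
flat = nested.explode().astype(int).reset_index(drop=True)

print(flat.tolist())
```

Unlike the apply(pd.Series).stack() route, this does not pad the shorter lists with NaN along the way, so the integers are not turned into floats.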

----------------------------------
to sort the values of a pandas series:
s1= pd.Series([1,2,3,4])
new_s1= pd.Series(s1).sort_values()

----------------------------------
to add elements into an existing pandas series:
s1= pd.Series([1,2,3,4])
new_s1= pd.concat([s1,pd.Series([45,'python'])]).reset_index(drop=True)   # Series.append was removed in pandas 2.0

----------------------------------
to create a subset of given series based on value and condition:
s1= pd.Series([0,1,2,3,4,5,6,7])
n=5
s_new= s1[s1>n]

----------------------------------
to change the order of index:
s1= pd.Series([0,1,2,3,4,5],index=['A','B','C','D','E','F'])
s1= s1.reindex(index=['B','A','D','E','F','C'])
----------------------------------
to calculate mean and standard deviation of Series:
s1= pd.Series([2,3,4,5,6])
mean= s1.mean()
st_dev= s1.std()

----------------------------------
to get the items of series which are not in another series
s1= pd.Series([1,2,3,4,5,6])
s2= pd.Series([2,4,6,8,10,12])
print("Element of s1 which are not in s2")
new_s1= s1[~s1.isin(s2)]

----------------------------------
To get the items which are not common in both the series:
s1= pd.Series([1,2,3,4,5,6])
s2=pd.Series([2,4,6,8,10,12])
print('Elements which are not in common')
s11= pd.Series(np.union1d(s1,s2))
s22= pd.Series(np.intersect1d(s1,s2))
uncommon_elements= s11[~s11.isin(s22)]
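The union-minus-intersection steps can also be done in one call. A short sketch using np.setxor1d, with the same sample values (the name uncommon is only for this example):

```python
import numpy as np
import pandas as pd

s1 = pd.Series([1, 2, 3, 4, 5, 6])
s2 = pd.Series([2, 4, 6, 8, 10, 12])

# np.setxor1d returns the sorted values that appear in exactly one input.
uncommon = pd.Series(np.setxor1d(s1, s2))

print(uncommon.tolist())
```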

----------------------------------
To compute the minimum, 25th, 50th and 75th percentiles and maximum of a series:
s1= np.random.RandomState(100)
num_series= pd.Series(s1.normal(10,4,20))   # mean=10, std_dev=4, total values=20
result= np.percentile(num_series,q=[0,25,50,75,100])
print(result)
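The same five numbers can be obtained from the Series itself. A sketch with Series.quantile, which takes fractions between 0 and 1 rather than percentages (rng and five_num are names made up for this example):

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(100)
num_series = pd.Series(rng.normal(10, 4, 20))

# Fractions 0, 0.25, 0.5, 0.75, 1 correspond to min, quartiles and max.
five_num = num_series.quantile([0, 0.25, 0.5, 0.75, 1])

print(five_num)
```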

----------------------------------
To get the frequency count of each unique value:
s1= pd.Series([2,3,2,4,5,6,4,2,2,3,4,3,2])
result= s1.value_counts()

----------------------------------
To keep the most frequent value and replace all other elements with 'Other':
s1= pd.Series(np.random.randint(1,5,[15]))
most_frequent= s1.value_counts().index[0]
s1= s1.where(s1==most_frequent,'Other')

----------------------------------
Print the position of number from series which are multiple of 5:
s1= pd.Series([1,2,3,5,10,15,30])
index= np.where(s1%5==0)
print(index)

----------------------------------
to extract the item at given position:
s1= pd.Series(list('23456789087633235'))
pos= [0,1,2,3,5,6]

extracted_item= s1.take(pos)

----------------------------------
to get the positions in s1 of the elements of s2:
s1= pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
s2= pd.Series([1, 3, 5, 7, 10])
result= [pd.Index(s1).get_loc(i) for i in s2]

----------------------------------
to make first and last letter of every word in series in upper case:
series1= pd.Series(['php','python'])
new_series= series1.map(lambda x: x[0].upper() + x[1:-1] + x[-1].upper())

----------------------------------
To calculate len of characters in each word:
series1= pd.Series(['php','python'])
new_= series1.map(lambda x: len(x))
print(new_)

----------------------------------
Difference between two consecutive number:
s1= pd.Series([1,2,3,4,5,6,7])
result= s1.diff().tolist()

-----------------------------------
to convert date strings into datetimes:
df= pd.Series(['01 Jan 2015', '10-02-2016', '20180307', '2014/05/06', '2016-04-12',
'2019-04-06T11:20'])
new_df= pd.to_datetime(df,format='mixed')   # format='mixed' needs pandas 2.0+; omit it on older versions

-----------------------------------
To get the day of month, day of year, week number and day of week from a date:
from dateutil.parser import parse

date_series= pd.Series(['01 Jan 2015', '10-02-2016', '20180307', '2014/05/06',
'2016-04-12', '2019-04-06T11:20'])
date_series= pd.to_datetime(date_series.map(parse))
print("Day of month:")
print(date_series.dt.day.tolist())
print("Day of year:")
print(date_series.dt.dayofyear.tolist())
print("Week number:")
print(date_series.dt.isocalendar().week.tolist())   # ISO week number
print("Day of week:")
print(date_series.dt.weekday.tolist())

------------------------------------
To get the count of words which have two or more vowels:
from collections import Counter
series= pd.Series(['Red', 'Green', 'Orange', 'Pink', 'Yellow', 'White'])
has_two_vowels= series.map(lambda x: sum([Counter(x.lower()).get(i,0) for i in
list('aeiou')])>=2)
count= has_two_vowels.sum()

------------------------------------
to get the Euclidean distance:
x= pd.Series([])   # placeholder: fill in numeric values
y= pd.Series([])   # placeholder: same length as x
distance= np.linalg.norm(x-y)
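A worked sketch of the distance calculation. The values of x and y are invented for illustration, and by_hand is a name used only here to show the formula behind the norm.

```python
import numpy as np
import pandas as pd

# Sample values chosen only for illustration.
x = pd.Series([1, 2, 3])
y = pd.Series([4, 6, 3])

# Norm of the element-wise difference ...
distance = np.linalg.norm(x - y)
# ... which equals the square root of the summed squared differences.
by_hand = np.sqrt(((x - y) ** 2).sum())

print(distance, by_hand)
```

The differences are -3, -4 and 0, so both expressions give sqrt(9 + 16 + 0) = 5.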

------------------------------------
Replace white space with least frequent character in the string
s= 'abc def abcdef icd'
s_series= pd.Series(list(s))
element_counts= s_series.value_counts()
current_freq= element_counts.dropna().index[-1]
result= "".join(s.replace(" ",current_freq))

-------------------------------------
Autocorrelation, also known as serial correlation, is the correlation of a signal
with a delayed copy of itself as a function of delay.
Informally, it is the similarity between observations as a function of the time lag
between them.
num_series = pd.Series(np.arange(15) + np.random.normal(1, 10, 15))
autocorrelations = [num_series.autocorr(i).round(2) for i in range(11)]
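To make the definition concrete, here is a small sketch showing that the lag-k autocorrelation is the Pearson correlation between the series and a copy shifted by k. The seed, the lag value and the names manual and builtin are assumptions for this example.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)  # fixed seed so the run is repeatable
num_series = pd.Series(np.arange(15) + rng.normal(1, 10, 15))

lag = 3
# Correlate the series with itself delayed by `lag` positions.
manual = num_series.corr(num_series.shift(lag))
builtin = num_series.autocorr(lag)

print(round(manual, 4), round(builtin, 4))
```

At lag 0 the series is compared with itself, so the autocorrelation is 1.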

-------------------------------------
Create a time series of every Sunday of a year:
result = pd.Series(pd.date_range('2022-01-01',periods=52,freq='W-SUN'))

--------------------------------------
To convert a series into Dataframe and taking index as another column
char_list = list('ABCDEFGHIJKLMNOP')
num_arra = np.arange(8)
num_dict = dict(zip(char_list, num_arra))
num_ser = pd.Series(num_dict)
df= num_ser.to_frame().reset_index()

---------------------------------------
To add two series vertically and horizontally:
Vertically:
series_vertical= pd.concat([series1,series2])   # Series.append was removed in pandas 2.0
Horizontally:
series_horizontal= pd.concat([series1,series2],axis=1)
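A short sketch of the two directions with made-up sample values, to show what shape each one returns:

```python
import pandas as pd

# Sample data, assumed for this example.
series1 = pd.Series([1, 2, 3])
series2 = pd.Series(['a', 'b', 'c'])

# axis=0 (the default) stacks one Series under the other; index labels repeat
# unless ignore_index=True is passed.
vertical = pd.concat([series1, series2], ignore_index=True)

# axis=1 aligns on the index and returns a DataFrame with one column per Series.
horizontal = pd.concat([series1, series2], axis=1)

print(vertical.shape, horizontal.shape)
```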

---------------------------------------
To get the min and max of a Series and the index labels where they occur:
num_series= pd.Series([1,2,3,4,5,5,6,7])
min_num= num_series.min()
max_num= num_series.max()
min_idx= num_series.idxmin()
max_idx= num_series.idxmax()

---------------------------------------
to get basic information about dataframe:
df= pd.DataFrame([])
df.info()

---------------------------------------
To get the first three rows:
df.iloc[:3]

---------------------------------------
To select two specified columns:
df[['name','score']]

---------------------------------------
To select specified rows and column from a dataframe:
df.iloc[[1,2,3,4],[1,3]]

---------------------------------------
To get rows where attempts is greater than two:
df[df['attempts']>2]

---------------------------------------
To get total number of rows and columns:
total_rows= len(df.axes[0])
total_cols= len(df.axes[1])

----------------------------------------
To get the rows where the data is missing:
df[df['score'].isnull()]
----------------------------------------
select the rows where attempts is less than 2 and score greater than 15:
df[(df['attempts']<2) & (df['score']>15)]

----------------------------------------
To select rows where the score is between 15 and 20 inclusive
df[df['score'].between(15,20)]

----------------------------------------
To change the score in row d to 15
df.loc['d','score']=15

----------------------------------------
Sum of examination attempt by students:
df['attempts'].sum()

----------------------------------------
to calculate the mean score across all students (use df.groupby('name')['score'].mean() for a per-student mean):
df['score'].mean()

----------------------------------------
To add a new row at k with given values:
df.loc['k']=[1,'Suresh','yes',15.5]

----------------------------------------
To drop the row by its index label:
df= df.drop('k')

----------------------------------------
Sort the column name by descending order and score by ascending order:
df.sort_values(by=['name','score'],ascending=[False,True])

----------------------------------------
Replace the qualify column yes with True and no with False:
df['qualify']= df['qualify'].map({'yes': True,'no': False})

----------------------------------------
To change the name in name column from James to Suresh
df['name']= df['name'].replace('James','Suresh')

----------------------------------------
To delete the column attempts
df.pop('attempts')

----------------------------------------
To insert a new column in existing dataframe
color=['Blue','Green','White','Red']
df['color']=color

----------------------------------------
To iterate over rows in dataframe:
for index,row in df.iterrows():
    print(index,row['name'],row['score'])

----------------------------------------
To get the column names of dataframe
df.columns.values

----------------------------------------
To rename the column:
df= df.rename(columns={'col1':'column1','col2':'column2','col3':'column3'})

----------------------------------------
To select rows based on some value from columns:
df.loc[df['col']==4]

----------------------------------------
To change the position of the columns:
df= df[['col3','col1','col2']]

----------------------------------------
To add row in existing dataframe:
df2= {'col1':2,'col2':'Suresh','col3':15.5}
df1= pd.concat([df1,pd.DataFrame([df2])],ignore_index=True)   # DataFrame.append was removed in pandas 2.0

----------------------------------------
To save the file as CSV with \t as separator
df1.to_csv('data.csv',sep='\t',index=False)

----------------------------------------
Count of people city wise:
count= df.groupby(['city']).size().reset_index(name='No of people')
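A small sketch of the city-wise count on made-up rows. Only the 'city' column matters here; the names and cities are invented for the example.

```python
import pandas as pd

# Made-up rows for illustration.
df = pd.DataFrame({'name': ['A', 'B', 'C', 'D'],
                   'city': ['Delhi', 'Pune', 'Delhi', 'Delhi']})

# size() counts rows per group; reset_index(name=...) turns the result
# into a two-column DataFrame.
count = df.groupby(['city']).size().reset_index(name='No of people')

print(count)
```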

----------------------------------------
To delete rows with a given value or condition
df= df[df.cols!=5]

----------------------------------------
To select the rows by integer index:
df1= df1.iloc[[2]]

----------------------------------------
To replace all the NaN value with 0
df= df.fillna(0)

----------------------------------------
To convert the index into a column of a data frame
df.reset_index(level=0,inplace=True)

----------------------------------------
To hide the index column when printing

print(df.to_string(index=False))

----------------------------------------
Set a particular value using index value
(DataFrame.set_value was removed in pandas 1.0; use .at instead)

df.at[8, 'score']= 12.0

df.at[index_row, column_name]= value

----------------------------------------
To count number of null values in one or more column
df.isnull().values.sum()

-----------------------------------------
To drop a list of rows using index from dataframe:
df= df.drop(df.index[[1,2,3,4]])

-----------------------------------------
To drop rows by position (drop first two rows)
df= df.drop(df.index[[0,1]])
-----------------------------------------
To reset index in a dataframe
df= df.reset_index()

-----------------------------------------
to divide the data frame into 70% and 30% parts
df= pd.DataFrame([])
part70= df.sample(frac=0.7,random_state=10)
part30= df.drop(part70.index)
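A sketch of the split on a small stand-in frame, to show that the two parts do not overlap and together cover every row. The column name 'value' is made up for this example.

```python
import pandas as pd

# Small stand-in frame with 10 rows.
df = pd.DataFrame({'value': range(10)})

part70 = df.sample(frac=0.7, random_state=10)  # 7 of the 10 rows, picked at random
part30 = df.drop(part70.index)                 # whatever rows were not sampled

print(len(part70), len(part30))
```

Dropping by part70.index works here because the index labels are unique. With duplicate labels, drop would remove more rows than intended.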

------------------------------------------
To concatenate two series into a Dataframe

s1= pd.Series([])
s2= pd.Series([])

df= pd.concat([s1,s2],axis=1)

-------------------------------------------
convert from string date time to data frame column
s= pd.Series([])
r= pd.to_datetime(s)
df= pd.DataFrame(r)

-------------------------------------------
To get list of specified column

col2= df['col2'].tolist()

-------------------------------------------
Find the row number where the value of the column is maximum

mx_index_col1= df['col1'].argmax()

-------------------------------------------
To check if the column is present in dataframe
if 'col1' in df.columns:
    print('col1 is present')

-------------------------------------------
To get the row at a specified position
df.iloc[3]   # returns the row at integer position 3 (the fourth row)

-------------------------------------------
To get the specified data types of the column:
df.dtypes

-------------------------------------------
To add the data into empty data frame:

data= pd.DataFrame({'col1': value1,
'col2': value2,
'col3': value3})

df= pd.DataFrame()
df= pd.concat([df,data])   # DataFrame.append was removed in pandas 2.0

-------------------------------------------
Sort the data frame by two or more columns
df.sort_values(by=['score','name'],ascending=[False,True])

-------------------------------------------
To convert the data type float to int

df['score']= df['score'].astype(int)   # astype returns a new Series, so assign it back

-------------------------------------------
To replace infinity to NaN

df= df.replace([np.inf, -np.inf],np.nan)   # np.NaN was removed in NumPy 2.0

-------------------------------------------
To add new column at specified index

index=0
col1= [1,2,3,4,5]
df.insert(loc=index,column='col1',value=col1)

-------------------------------------------
To convert the list of list into a dataframe

my_list= [['col1','col2'],[1,2],[3,4]]
headers= my_list.pop(0)
df= pd.DataFrame(my_list,columns= headers)

-------------------------------------------
To group the dataframe by column1 and get the column 2 values as list

df=df.groupby('col1')['col2'].apply(list)
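A sketch of the group-to-list step on invented data, to show the shape of the result: a Series indexed by the col1 values, with a Python list in each cell.

```python
import pandas as pd

# Sample frame, assumed for this example.
df = pd.DataFrame({'col1': ['x', 'y', 'x', 'y'],
                   'col2': [1, 2, 3, 4]})

# Each group's col2 values are collected into one list.
grouped = df.groupby('col1')['col2'].apply(list)

print(grouped.to_dict())
```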

--------------------------------------------
get the index of column
df.columns.get_loc('col1')

--------------------------------------------
To count the number of columns
len(df.columns)

--------------------------------------------
To select all columns except one column
df= df.loc[:, df.columns!='col3']

--------------------------------------------
To get first n records:
df.head(n)

--------------------------------------------
To get last n records:
df.tail(n)

--------------------------------------------
to get the three rows with the largest values in a column

df.nlargest(3,'col1')

--------------------------------------------
to get rows after removing first n rows (here n=3)
df1= df.iloc[3:]

--------------------------------------------
To get rows after removing last n rows (here n=3):
df1= df.iloc[:-3]
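A small sketch of slicing n rows off either end with iloc. The data and the names without_first and without_last are assumptions for this example.

```python
import pandas as pd

df = pd.DataFrame({'col1': [10, 20, 30, 40, 50]})
n = 2

# Skip the first n rows: positions n, n+1, ... are kept.
without_first = df.iloc[n:]

# Skip the last n rows: a negative stop counts from the end.
without_last = df.iloc[:-n]

print(without_first['col1'].tolist(), without_last['col1'].tolist())
```

Note that iloc[:-n] returns an empty frame when n is 0, so that case needs separate handling.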

--------------------------------------------
To add prefix and suffix in column name

df.add_prefix("A_")
df.add_suffix("_A")

-------------------------------------------
To select columns by datatype

df= df.select_dtypes(include='object')

-------------------------------------------
To divide values in different subset

df1= df.sample(frac=0.6)
df2= df.drop(df1.index)

-------------------------------------------
To convert continuous value column into a categorical column
df['Age_group']= pd.cut(df['age'],bins=[0,18,25,35],labels=['kids','adults','old'])
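A sketch of the binning with invented ages. By default pd.cut uses right-inclusive intervals, so the bins here are (0, 18], (18, 25] and (25, 35], and values outside every bin become NaN.

```python
import pandas as pd

# Ages invented for the example; 40 falls outside the last bin.
df = pd.DataFrame({'age': [5, 18, 22, 30, 40]})

df['Age_group'] = pd.cut(df['age'], bins=[0, 18, 25, 35],
                         labels=['kids', 'adults', 'old'])

print(df)
```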

-------------------------------------------
To use local variable in query
maxx= df['col'].max()

df= df.query("col< @maxx")
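A sketch of the @ syntax on sample data: inside the query string, @maxx refers to the Python variable maxx, while a bare name like col refers to a column. The name below_max is made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'col': [1, 5, 3, 5]})
maxx = df['col'].max()

# Keep only the rows whose value is strictly below the maximum.
below_max = df.query("col < @maxx")

print(below_max['col'].tolist())
```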

-------------------------------------------
to get index and distinct value of column

labels, names= pd.factorize(df['name'])
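A sketch of what factorize returns, on invented names: an array of integer codes, and the distinct values in order of first appearance. The names codes, uniques and names_col are only for this example.

```python
import pandas as pd

names_col = pd.Series(['Anna', 'Bob', 'Anna', 'Cara', 'Bob'])  # sample data

# codes[i] is the position of names_col[i] within uniques.
codes, uniques = pd.factorize(names_col)

print(codes.tolist(), list(uniques))
```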

-------------------------------------------
To read data copied from an Excel sheet (via the clipboard; use pd.read_excel for a file)

df= pd.read_clipboard()
-------------------------------------------

To compare two dataframes element-wise:
df1= pd.DataFrame()
df2= pd.DataFrame()

df1.ne(df2)   # element-wise inequality: True where the values differ

-------------------------------------------
To set the index :

df= pd.DataFrame()

df= df.set_index('col_name')   # set_index returns a new frame, so assign it back

-------------------------------------------
To remove the index and make the index as default
df1= df.reset_index(drop=True)   # drop=True discards the old index instead of adding it as a column
