Pandas DataFrame Notes - 12pages-Pages-4

The document provides various methods for selecting, modifying, and managing rows in a DataFrame using pandas. It covers techniques such as slicing by label/index, appending rows, dropping rows, boolean selection, and sorting. Additionally, it includes traps and considerations for handling row indices and duplicates.

Uploaded by

Sàazón Kasula


Working with rows

Get the row index and labels
idx = df.index                 # get row index
label = df.index[0]            # first row label
label = df.index[-1]           # last row label
l = df.index.tolist()          # get as a list
a = df.index.values            # get as an array

Change the (row) index
df.index = idx                 # new ad hoc index
df = df.set_index('A')         # col A new index
df = df.set_index(['A', 'B'])  # MultiIndex
df = df.reset_index()          # replace old w new
# note: old index stored as a col in df
df.index = range(len(df))      # set with list
df = df.reindex(index=range(len(df)))
df = df.set_index(keys=['r1','r2','etc'])
df.rename(index={'old':'new'}, inplace=True)

Select a slice of rows by label/index
[inclusive-from : inclusive-to [ : step]]
df = df['a':'c']               # rows 'a' through 'c'
Trap: this does not work for integer-labelled rows; see "Select a slice of rows by integer position".

Append a row of column totals to a DataFrame
# Option 1: use dictionary comprehension
sums = {col: df[col].sum() for col in df}
sums_df = DataFrame(sums, index=['Total'])
df = pd.concat([df, sums_df])  # DataFrame.append was removed in pandas 2.0

# Option 2: All done with pandas
df = pd.concat([df, DataFrame(df.sum(), columns=['Total']).T])

Iterating over DataFrame rows
for (index, row) in df.iterrows(): pass # ...
Trap: row data type may be coerced.
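
The coercion trap when iterating rows can be seen in a minimal sketch. The column names and values below are made up for illustration, and itertuples() is shown only as one possible alternative, not as something these notes prescribe.

```python
import pandas as pd

# Hypothetical data: one integer column and one float column.
df = pd.DataFrame({'n': [1, 2], 'x': [0.5, 1.5]}, index=['a', 'b'])

# iterrows() hands back each row as a Series. A Series has a single dtype,
# so the integer value is upcast to float to sit alongside 'x'.
coerced = [type(row['n']).__name__ for _, row in df.iterrows()]

# itertuples() yields namedtuples, so each field keeps its column's type.
preserved = [type(t.n).__name__ for t in df.itertuples()]

print(df.dtypes['n'])  # the column itself is still an integer dtype
print(coerced)         # type names seen via iterrows()
print(preserved)       # type names seen via itertuples()
```

If the per-column types matter inside the loop, iterating with itertuples() or working column-wise avoids the upcast.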
Adding rows
df = pd.concat([original_df, more_rows_in_df])  # DataFrame.append was removed in pandas 2.0
Hint: convert the row to a DataFrame and then concatenate.
Both DataFrames should have the same column labels.

Dropping rows (by name)
df = df.drop('row_label')
df = df.drop(['row1','row2'])  # multi-row

Sorting DataFrame rows by values
df = df.sort_values(by=df.columns[0], ascending=False)  # df.sort() no longer exists
df.sort_values(by=['col1', 'col2'], inplace=True)

Sort DataFrame by its row index
df.sort_index(inplace=True)    # sort by row
df = df.sort_index(ascending=False)
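
The "Adding rows" hint can be sketched end to end as follows. This assumes a recent pandas, where DataFrame.append is not available and pd.concat does the job; the row labels and values are invented for the example.

```python
import pandas as pd

original_df = pd.DataFrame({'col1': [1, 2], 'col2': [3.0, 4.0]},
                           index=['r1', 'r2'])

# A new row supplied as a plain dict (hypothetical values).
new_row = {'col1': 5, 'col2': 6.0}

# Convert the row to a one-row DataFrame with the same column labels
# and give it a row label of its own.
more_rows_in_df = pd.DataFrame([new_row], index=['r3'])

# Stack the two frames row-wise.
df = pd.concat([original_df, more_rows_in_df])

# Sort the result by a column, largest value first.
df_sorted = df.sort_values(by='col1', ascending=False)
print(df_sorted)
```

Giving the new row an explicit label keeps the row index free of accidental duplicates after concatenation.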
Boolean row selection by values in a column
df = df[df['col2'] >= 0.0]
df = df[(df['col3']>=1.0) | (df['col1']<0.0)]
df = df[df['col'].isin([1,2,5,7,11])]
df = df[~df['col'].isin([1,2,5,7,11])]
df = df[df['col'].str.contains('hello')]
Trap: bitwise "or", "and", "not" (i.e. | & ~) are co-opted to be Boolean operators on a Series of Booleans.
Trap: need parentheses around comparisons.

Selecting rows using isin over multiple columns
# fake up some data
data = {1:[1,2,3], 2:[1,4,9], 3:[1,8,27]}
df = DataFrame(data)
# multi-column isin
lf = {1:[1, 3], 3:[8, 27]}     # look for
f = df[df[list(lf)].isin(lf).all(axis=1)]

Selecting rows using an index
idx = df[df['col'] >= 2].index
print(df.loc[idx])             # .ix was removed in pandas 1.0

Random selection of rows
import random as r
k = 20                         # pick a number
selection = r.sample(range(len(df)), k)
df_sample = df.iloc[selection, :]  # get copy
Note: this randomly selected sample is not sorted

Drop duplicates in the row index
df['index'] = df.index         # 1 create new col
df = df.drop_duplicates(subset='index', keep='last')  # 2 use new col
del df['index']                # 3 del the col
df.sort_index(inplace=True)    # 4 tidy up

Test if two DataFrames have same row index
len(a)==len(b) and all(a.index==b.index)

Get the integer position of a row or col index label
i = df.index.get_loc('row_label')
Trap: index.get_loc() returns an integer for a unique match. If not a unique match, may return a slice/mask.

Get integer position of rows that meet condition
a = np.where(df['col'] >= 2)   # tuple holding one numpy array; positions are in a[0]
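
The multi-column isin one-liner is easier to follow when split into its two steps. The sketch below reuses the fake data from the notes; the intermediate names mask_per_col, row_mask and positions are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Same fake data as in the notes: columns are labelled 1, 2 and 3.
data = {1: [1, 2, 3], 2: [1, 4, 9], 3: [1, 8, 27]}
df = pd.DataFrame(data)
lf = {1: [1, 3], 3: [8, 27]}  # values to look for, per column

# Step 1: keep only the columns named in lf and test each column
# against its own list of wanted values.
mask_per_col = df[list(lf)].isin(lf)

# Step 2: a row qualifies only if every tested column matched.
row_mask = mask_per_col.all(axis=1)
f = df[row_mask]

# np.where with a single condition returns a tuple;
# element 0 holds the integer positions of the matching rows.
positions = np.where(df[1] >= 2)[0]

print(f)
print(positions)
```

Swapping all(axis=1) for any(axis=1) would instead keep rows where at least one of the listed columns matched.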
Select a slice of rows by integer position
[inclusive-from : exclusive-to [: step]]
Defaults: start is 0; end is len(df)
df = df[:]                     # copy entire DataFrame
df = df[0:2]                   # rows 0 and 1
df = df[2:3]                   # row 2 (the third row)
df = df[-1:]                   # the last row
df = df[:-1]                   # all but the last row
df = df[::2]                   # every 2nd row (0 2 ..)
Trap: a single integer without a colon is a column label for integer numbered columns.

Test if the row index values are unique/monotonic
if df.index.is_unique: pass # ...
b = df.index.is_monotonic_increasing
b = df.index.is_monotonic_decreasing

Find row index duplicates
if df.index.has_duplicates:
    print(df.index.duplicated())
Note: also similar for column label duplicates.
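
As a small worked example of finding row index duplicates, the sketch below builds a frame with one repeated label and inspects it. The data is invented, and filtering on Index.duplicated(keep='last') is shown as one possible way to keep only the last row per label, as an alternative to the temporary-column approach.

```python
import pandas as pd

# Hypothetical frame in which the label 'a' appears twice.
df = pd.DataFrame({'val': [10, 20, 30, 40]}, index=['a', 'b', 'a', 'c'])

# has_duplicates is True as soon as any label repeats.
has_dups = df.index.has_duplicates

# duplicated() marks every repeat after the first occurrence of a label.
dup_mask = df.index.duplicated()

# keep='last' marks the earlier occurrences instead, so negating the
# mask keeps only the last row seen for each label.
last_kept = df[~df.index.duplicated(keep='last')]

print(has_dups)
print(dup_mask)
print(last_kept)
```

The same calls are available on df.columns for column label duplicates.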
Version 30 April 2017 - [Draft – Mark Graph – mark dot the dot graph at gmail dot com – @Mark_Graph on twitter]