Pandas Basics Cheat Sheet Python For Data Science: Retrieving Series/Dataframe Information

This document provides a summary of key Pandas functions for working with DataFrames and Series. It covers reading and writing data to common file types like CSV and Excel. It also discusses selecting and filtering DataFrames, applying functions, descriptive statistics, and alignment of indexes during arithmetic operations. The Pandas library is built on NumPy and provides easy-to-use data structures and analysis tools for Python.

Uploaded by

locuto

Python For Data Science
Pandas Basics Cheat Sheet

Learn Pandas Basics online at www.DataCamp.com

> Pandas

The Pandas library is built on NumPy and provides easy-to-use data structures and data analysis tools for the Python programming language.

Use the following import convention:

>>> import pandas as pd

> I/O

Read and Write to CSV

>>> pd.read_csv('file.csv', header=None, nrows=5)
>>> df.to_csv('myDataFrame.csv')

Read and Write to Excel

>>> pd.read_excel('file.xlsx')
>>> df.to_excel('dir/myDataFrame.xlsx', sheet_name='Sheet1')

Read multiple sheets from the same file

>>> xlsx = pd.ExcelFile('file.xls')
>>> df = pd.read_excel(xlsx, 'Sheet1')

Read and Write to SQL Query or Database Table

>>> from sqlalchemy import create_engine
>>> engine = create_engine('sqlite:///:memory:')
>>> pd.read_sql("SELECT * FROM my_table;", engine)
>>> pd.read_sql_table('my_table', engine)
>>> pd.read_sql_query("SELECT * FROM my_table;", engine)

read_sql() is a convenience wrapper around read_sql_table() and read_sql_query()

>>> df.to_sql('myDf', engine)

> Retrieving Series/DataFrame Information

Basic Information

>>> df.shape #(rows,columns)
>>> df.index #Describe index
>>> df.columns #Describe DataFrame columns
>>> df.info() #Info on DataFrame
>>> df.count() #Number of non-NA values

Summary

>>> df.sum() #Sum of values
>>> df.cumsum() #Cumulative sum of values
>>> df.min() #Minimum values
>>> df.max() #Maximum values
>>> df.idxmin() #Index label of the minimum value
>>> df.idxmax() #Index label of the maximum value
>>> df.describe() #Summary statistics
>>> df.mean() #Mean of values
>>> df.median() #Median of values

> Applying Functions

>>> f = lambda x: x*2
>>> df.apply(f) #Apply function
>>> df.applymap(f) #Apply function element-wise (deprecated since pandas 2.1 in favor of df.map(f))
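
The I/O and function-application calls above can be tried without any files on disk. The following is a minimal sketch, not part of the original sheet: the two-column frame is made-up data, the CSV goes through an in-memory buffer, and the SQL round trip uses the standard-library `sqlite3` driver in place of the SQLAlchemy engine shown above. Element-wise application is written with `Series.map` so the sketch does not depend on which pandas version provides `applymap` or `DataFrame.map`.

```python
import io
import sqlite3

import pandas as pd

# Small illustrative frame (hypothetical data, not from the cheat sheet).
df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

# CSV round trip through an in-memory buffer instead of a file on disk.
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)
from_csv = pd.read_csv(buf)

# SQL round trip with an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
df.to_sql("my_table", conn, index=False)
from_sql = pd.read_sql_query("SELECT * FROM my_table;", conn)
conn.close()

# Column-wise apply: the function receives each column as a Series.
col_range = df.apply(lambda col: col.max() - col.min())

# Element-wise application: map the function over every value of every column.
f = lambda v: v * 2
doubled = df.apply(lambda col: col.map(f))

print(from_csv.equals(df), from_sql.equals(df))
print(col_range)
print(doubled)
```

Writing with `index=False` keeps the row labels out of the CSV and the SQL table, which is why both round trips return a frame equal to the original.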

> Pandas Data Structures

Series

A one-dimensional labeled array capable of holding any data type

Index
a     3
b    -5
c     7
d     4

>>> s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd'])

DataFrame

A two-dimensional labeled data structure with columns of potentially different types

Index   Country   Capital     Population   (Columns)
0       Belgium   Brussels    11190846
1       India     New Delhi   1303171035
2       Brazil    Brasília    207847528

>>> data = {'Country': ['Belgium', 'India', 'Brazil'],
...         'Capital': ['Brussels', 'New Delhi', 'Brasília'],
...         'Population': [11190846, 1303171035, 207847528]}
>>> df = pd.DataFrame(data,
...                   columns=['Country', 'Capital', 'Population'])

> Data Alignment

Internal Data Alignment

NA values are introduced in the indices that don't overlap:

>>> s3 = pd.Series([7, -2, 3], index=['a', 'c', 'd'])
>>> s + s3
a    10.0
b     NaN
c     5.0
d     7.0

Arithmetic Operations with Fill Methods

You can also do the internal data alignment yourself with the help of the fill methods:

>>> s.add(s3, fill_value=0)
a    10.0
b    -5.0
c     5.0
d     7.0
>>> s.sub(s3, fill_value=2)
>>> s.div(s3, fill_value=4)
>>> s.mul(s3, fill_value=3)

> Selection

Also see NumPy Arrays

Getting

>>> s['b'] #Get one element
-5
>>> df[1:] #Get subset of a DataFrame
  Country    Capital  Population
1   India  New Delhi  1303171035
2  Brazil   Brasília   207847528

Selecting, Boolean Indexing & Setting

By Position

>>> df.iloc[0, 0] #Select single value by row & column
'Belgium'
>>> df.iat[0, 0]
'Belgium'

By Label

>>> df.loc[0, 'Country'] #Select single value by row & column labels
'Belgium'
>>> df.at[0, 'Country']
'Belgium'
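
As a runnable check of the alignment and single-value selection calls above, the sketch below rebuilds `s`, `s3`, and `df` from the sheet and compares the results. The variable names `plain`, `filled`, `cell`, and `block` are introduced here for illustration only. It also shows why scalar indexers are used for a single value: list-style indexers return a 1x1 DataFrame rather than the value itself.

```python
import pandas as pd

s = pd.Series([3, -5, 7, 4], index=["a", "b", "c", "d"])
s3 = pd.Series([7, -2, 3], index=["a", "c", "d"])

# Label 'b' exists only in s, so plain addition yields NaN there.
plain = s + s3

# With fill_value=0 the missing s3['b'] is treated as 0 before adding.
filled = s.add(s3, fill_value=0)

df = pd.DataFrame(
    {
        "Country": ["Belgium", "India", "Brazil"],
        "Capital": ["Brussels", "New Delhi", "Brasília"],
        "Population": [11190846, 1303171035, 207847528],
    }
)

# Scalar indexers return the value itself.
cell = df.iloc[0, 0]            # by position
cell_iat = df.iat[0, 0]         # by position, scalar-only accessor
cell_at = df.at[0, "Country"]   # by label, scalar-only accessor

# List indexers keep the two-dimensional shape: a 1x1 DataFrame.
block = df.iloc[[0], [0]]

print(plain)
print(filled)
print(cell, cell_iat, cell_at, block.shape)
```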

By Label/Position

The .ix indexer was removed in pandas 1.0; the calls below use .loc, which selects by label (here the default integer index).

>>> df.loc[2] #Select single row or subset of rows
Country          Brazil
Capital        Brasília
Population    207847528
>>> df.loc[:, 'Capital'] #Select a single column or subset of columns
0     Brussels
1    New Delhi
2     Brasília
>>> df.loc[1, 'Capital'] #Select rows and columns
'New Delhi'

Boolean Indexing

>>> s[~(s > 1)] #Series s where value is not >1
>>> s[(s < -1) | (s > 2)] #s where value is <-1 or >2
>>> df[df['Population']>1200000000] #Use filter to adjust DataFrame

Setting

>>> s['a'] = 6 #Set index a of Series s to 6

> Dropping

>>> s.drop(['a', 'c']) #Drop values from rows (axis=0)
>>> df.drop('Country', axis=1) #Drop values from columns (axis=1)

> Sort & Rank

>>> df.sort_index() #Sort by labels along an axis
>>> df.sort_values(by='Country') #Sort by the values along an axis
>>> df.rank() #Assign ranks to entries

> Asking For Help

>>> help(pd.Series.loc)
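
The selection, dropping, and sorting calls above can be exercised on the sheet's own `s` and `df`. This is a minimal sketch under the assumption of a current pandas release, so `.loc` stands in for `.ix`; the result names (`not_gt1`, `big`, `no_country`, and so on) are hypothetical. Ranking is shown on the numeric `Population` column only, to keep the output easy to verify.

```python
import pandas as pd

s = pd.Series([3, -5, 7, 4], index=["a", "b", "c", "d"])
df = pd.DataFrame(
    {
        "Country": ["Belgium", "India", "Brazil"],
        "Capital": ["Brussels", "New Delhi", "Brasília"],
        "Population": [11190846, 1303171035, 207847528],
    }
)

# Label-based selection of one cell.
capital = df.loc[1, "Capital"]

# Boolean masks: ~ negates, | combines conditions element-wise.
not_gt1 = s[~(s > 1)]
big = df[df["Population"] > 1200000000]

# drop returns a new object; df itself keeps its 'Country' column.
no_country = df.drop("Country", axis=1)

# Sort rows by a column and rank one numeric column (1 = smallest).
by_country = df.sort_values(by="Country")
pop_rank = df["Population"].rank()

print(capital)
print(not_gt1)
print(big)
print(list(no_country.columns))
print(by_country["Country"].tolist())
print(pop_rank.tolist())
```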
