PANDAS DATA ANALYSIS: VIDEO
AN INDEX IN PANDAS CAN HOLD LABELS OF ANY HASHABLE TYPE
WE CAN ACCESS VALUES IN A SERIES BY USING EITHER THE INDEX LABEL OR THE POSITION
1. Pandas Series
1. ACCESSING VALUES IN A SERIES BY USING INDEX:
ACCESSING VALUES IN A SERIES BY USING POSITION:
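DATAFRAME.IX[] CAN BE USED FOR ACCESSING VALUES BY POSITION AS WELL AS BY INDEX LABEL. IF THE INDEX LABELS AND
POSITIONS ARE BOTH INTEGERS, THE INDEX LABEL TAKES PRIORITY, AS SHOWN ABOVE IN THE LAST EXAMPLE. NOTE THAT .IX WAS
DEPRECATED IN PANDAS 0.20 AND REMOVED IN 1.0; USE .loc[] FOR LABELS AND .iloc[] FOR POSITIONS.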
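A minimal sketch of label versus position access. Because .ix is not available in current pandas releases, it uses .loc[] (label) and .iloc[] (position) instead. The data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical Series whose integer labels differ from the positions.
s = pd.Series([10, 20, 30], index=[2, 0, 1])

by_label = s.loc[0]      # the row whose label is 0
by_position = s.iloc[0]  # the first row, whatever its label is

# With string labels the two kinds of access cannot be confused.
t = pd.Series([1, 2, 3], index=["a", "b", "c"])
same_by_label = t.loc["b"]
same_by_position = t.iloc[1]

print(by_label, by_position, same_by_label, same_by_position)
```

Using .loc[] and .iloc[] explicitly avoids having to remember which one wins when both labels and positions are integers.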
ARITHMETIC OPERATIONS IN SERIES
VALUES ARE ADDED ACCORDING TO THE INDEX/LABEL AND NOT THE POSITION
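A small sketch of label-based alignment, using two made-up Series that hold the same labels in a different order.

```python
import pandas as pd

# Same labels, different order (illustrative data).
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["z", "y", "x"])

# Addition pairs values by label, not by position.
total = a + b

print(total)
```

Here "x" pairs 1 with 30 and "z" pairs 3 with 10, even though they sit at opposite positions in the two Series.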
Series Objects with duplicated labels
CHANGING THE INDEX OF A SERIES
IF WE USE REINDEX TO CREATE A NEW SERIES AND THE NEW INDEX VALUES ARE NOT PRESENT IN THE ORIGINAL SERIES, THE
RESULT HAS THE SPECIFIED INDEX WITH NAN AS THE VALUES (BELOW).
WHILE ADDING TWO SERIES WHOSE INDEXES DO NOT MATCH (HERE THE INDEXES ARE OF DIFFERENT TYPES), THE VALUES WILL
BE NAN.
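A short sketch of this reindex behaviour with made-up labels. Labels that exist in the original Series keep their values, and labels that do not exist get NaN.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

# "a" exists in s; "x" and "y" do not, so they are filled with NaN.
r = s.reindex(["a", "x", "y"])

print(r)
```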
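A sketch of the mismatch case, assuming one Series is labelled with integers and the other with the same digits stored as strings (illustrative data only).

```python
import pandas as pd

a = pd.Series([1, 2], index=[0, 1])        # integer labels
b = pd.Series([10, 20], index=["0", "1"])  # string labels

# No label of a equals any label of b, so nothing lines up.
c = a + b

print(c)
```

The result carries all four labels, and every value is NaN because no pair of labels matched.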
Converting index from one type to another
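One possible way to convert an index is Index.astype. The sketch below assumes string labels that hold digits; the names are illustrative.

```python
import pandas as pd

s = pd.Series([10, 20], index=["0", "1"])  # string labels
other = pd.Series([1, 2], index=[0, 1])    # integer labels

# Convert the string labels to integers so the two indexes match.
s.index = s.index.astype(int)

total = s + other

print(total)
```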
REVERSING THE SERIES
2. Pandas DataFrame
1. CREATING DF FROM A SERIES
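Two common ways to build a DataFrame from Series, sketched with made-up data: Series.to_frame for a single column, and a dict of Series for several columns.

```python
import pandas as pd

prices = pd.Series([1.5, 2.5], index=["apple", "pear"], name="price")
stock = pd.Series([10, 20], index=["apple", "pear"])

# One Series becomes a one-column DataFrame named after the Series.
one_col = prices.to_frame()

# Several Series become columns; the dict keys are the column names.
two_col = pd.DataFrame({"price": prices, "stock": stock})

print(one_col)
print(two_col)
```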
CHANGING THE DATAFRAME COLUMN AND INDEX NAMES
ACCESSING SPECIFIC COLUMNS AND ROWS
Data type= DataFrame
Type= DataFrame
Type= Series
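The type notes above can be reproduced with a small made-up DataFrame. Selecting with a list of column names gives a DataFrame, while selecting a single name or a single row gives a Series.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

as_frame = df[["a"]]    # list of names: DataFrame
as_series = df["a"]     # single name: Series
row_series = df.loc[0]  # single row label: Series

print(type(as_frame).__name__, type(as_series).__name__, type(row_series).__name__)
```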
ANOTHER WAY OF ACCESSING THE COLUMN
SLICING THE DF
MODIFYING DATAFRAME
ADDING A NEW COLUMN TO EXISTING DATAFRAME
ADDING A COLUMN AT A SPECIFIC POSITION:
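A sketch of both ways of adding a column, with illustrative names: plain assignment puts the new column at the end, and DataFrame.insert places it at a given position.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Assignment appends the new column after the existing ones.
df["c"] = df["a"] + df["b"]

# insert(position, name, values) places the column at index 1.
df.insert(1, "mid", [0, 0])

print(df)
```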
MODIFYING EXISTING COLUMN VALUES:
DELETING THE COLUMNS IN DF
The pop method returns the deleted column as a Series, so it can be stored in a variable. The del statement returns nothing.
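A small sketch of the difference between pop and del, using a made-up DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# pop removes column "b" and hands it back as a Series.
popped = df.pop("b")

# del removes column "c" in place and gives nothing back.
del df["c"]

print(popped)
print(df)
```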
DROP METHOD FOR DELETING COLUMNS AND ROWS
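A minimal sketch of drop for columns and for rows. By default drop returns a new DataFrame and leaves the original untouched; the data is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

without_b = df.drop(columns=["b"])     # drop a column by name
without_first = df.drop(index=[0])     # drop a row by index label

print(without_b)
print(without_first)
```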
APPENDING IN PYTHON
- Appends the rows at the bottom of the other data frame
Appending with different column names for data frames
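DataFrame.append was removed in pandas 2.0, so this sketch uses pd.concat, which stacks rows in the same way. The frames and column names are made up; columns missing from one frame are filled with NaN.

```python
import pandas as pd

top = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
bottom = pd.DataFrame({"a": [5], "c": [6]})  # has "c" instead of "b"

# Rows of bottom go under the rows of top; ignore_index renumbers the rows.
stacked = pd.concat([top, bottom], ignore_index=True)

print(stacked)
```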
ADDING A NEW ROW TO THE EXISTING DATAFRAME
CHANGING THE VALUE IN A DATAFRAME
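One way to do both operations is label-based assignment with .loc[]. This is a sketch with made-up row labels, not necessarily the approach used in the video.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["r1", "r2"])

# Assigning to a label that does not exist yet adds a new row.
df.loc["r3"] = [5, 6]

# Assigning to an existing (row, column) pair changes one value.
df.loc["r1", "b"] = 30

print(df)
```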
ARITHMETIC OPERATIONS IN DF
This is a series
This is a dataframe and hence
aligns with the rows and columns
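A sketch of alignment on both axes, using two made-up DataFrames that share only one row label and one column name.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["r1", "r2"])
df2 = pd.DataFrame({"a": [10, 20], "c": [30, 40]}, index=["r2", "r3"])

# Values are added only where both the row label and the column name match.
total = df1 + df2

print(total)
```

Only the cell at row "r2", column "a" exists in both frames, so it is the only non-NaN value in the result.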
HIERARCHICAL INDEXING AND REINDEXING OF DF
IMPORTING DATA FROM FILES
HOW TO MAKE A COLUMN THE INDEX
SPECIFYING HEADERS OF FILE
REMOVING HEADERS FROM A FILE WHILE OPENING
READING ONLY SPECIFIC COLUMNS FROM A FILE
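A sketch of the read_csv options that correspond to the four headings above. It reads from an in-memory string so that it runs without a file; the column names are made up, and treating "removing headers" as header=None is an assumption.

```python
import io
import pandas as pd

csv_text = "id,name,score\n1,Ann,90\n2,Bob,85\n"

# Make a column the index.
indexed = pd.read_csv(io.StringIO(csv_text), index_col="id")

# Specify our own headers, replacing the header row in the file.
renamed = pd.read_csv(io.StringIO(csv_text), header=0,
                      names=["key", "person", "points"])

# Read without treating any row as a header (columns become 0, 1, 2).
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

# Read only specific columns.
subset = pd.read_csv(io.StringIO(csv_text), usecols=["name", "score"])

print(indexed, renamed, no_header, subset, sep="\n\n")
```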
TIDYING UP DATA
HANDLING MISSING DATA IN PYTHON
DELETING THE COLUMN THAT HAS ALL ROWS “NA”
REPLACING “NAN” VALUES
REPLACING WITH FORWARD FILLING/BACKWARD FILLING
TO REPLACE WITH MEAN OF THE COLUMN
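A sketch of the missing-data steps listed above on a small made-up DataFrame: dropping an all-NaN column, then filling with a constant, forward/backward filling, and filling with the column mean.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, np.nan],  # every row is missing
    "c": [np.nan, 5.0, 7.0],
})

# Delete columns in which all rows are NaN.
no_empty_cols = df.dropna(axis=1, how="all")

zero_filled = no_empty_cols.fillna(0)   # replace NaN with a constant
forward = no_empty_cols.ffill()         # copy the previous value down
backward = no_empty_cols.bfill()        # copy the next value up

# Replace NaN with the mean of each column.
mean_filled = no_empty_cols.fillna(no_empty_cols.mean())

print(mean_filled)
```

Forward filling leaves the first value of column "c" as NaN because there is no earlier value to copy.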
HANDLING DUPLICATE DATA
REPLACING VALUES IN A DF
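A brief sketch of duplicate handling and value replacement with illustrative data, using duplicated, drop_duplicates and replace.

```python
import pandas as pd

df = pd.DataFrame({"name": ["ann", "bob", "ann"],
                   "grade": ["A", "B", "A"]})

# True for every row that repeats an earlier row.
dup_mask = df.duplicated()

# Keep only the first occurrence of each row.
unique_rows = df.drop_duplicates()

# Replace one value with another everywhere in the frame.
replaced = unique_rows.replace("A", "excellent")

print(dup_mask.tolist())
print(replaced)
```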
LAMBDA FOR MATHEMATICAL FUNCTIONS
CALCULATING SUM OF COLUMNS AND ROWS
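A sketch of applying a lambda to a column and of summing along each axis. The squaring function is only an example of a mathematical function.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Apply a lambda to every value of column "a".
squared = df["a"].apply(lambda v: v ** 2)

col_sums = df.sum(axis=0)  # one total per column
row_sums = df.sum(axis=1)  # one total per row

print(squared.tolist(), col_sums.tolist(), row_sums.tolist())
```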
CONCATENATION FUNCTION
CONCATENATING BY ROWS
CONCATENATING BY COLUMNS
DATAFRAME KEYS
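A sketch of pd.concat by rows, by columns, and with the keys argument. Reading "DataFrame keys" as the keys parameter of concat is an assumption; the frames are made up.

```python
import pandas as pd

first = pd.DataFrame({"a": [1, 2]})
second = pd.DataFrame({"a": [3, 4]})

by_rows = pd.concat([first, second])          # stack vertically
by_cols = pd.concat([first, second], axis=1)  # place side by side

# keys adds an outer index level that records which frame a row came from.
keyed = pd.concat([first, second], keys=["first", "second"])
from_second = keyed.loc["second"]

print(by_rows, by_cols, keyed, sep="\n\n")
```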
MERGING DATA: INNER MERGE
MERGING WITH ONE COMMON COLUMN
MERGING WITH MULTIPLE COMMON COLUMNS
MERGING WITH THE ON PARAMETER
MERGING OF DATA: OUTER MERGE
RIGHT OUTER JOIN
JOINING BASED ON INDEX NUMBERS
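A sketch of the merge variants listed above, using two made-up frames that share a "key" column.

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "y": [20, 30, 40]})

# Inner merge (the default): only keys present in both frames.
inner = pd.merge(left, right, on="key")

# Outer merge: keys from either frame, NaN where a side is missing.
outer = pd.merge(left, right, on="key", how="outer")

# Right join: every key of the right frame.
right_join = pd.merge(left, right, on="key", how="right")

# Join on the row index instead of a column.
by_index = pd.merge(left, right, left_index=True, right_index=True)

print(inner, outer, right_join, by_index, sep="\n\n")
```

When the join is done on the index, the shared "key" column appears twice, and pandas adds the suffixes _x and _y to tell the copies apart.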
GROUPBY IN PYTHON
GROUPING BY ONE VARIABLE
GROUPING BY A SECOND VARIABLE
GROUPBY HIERARCHICAL INDEXING
AGGREGATE FUNCTIONS ON GROUPBY DATA
TRANSFORMATION FUNCTIONS
This is a groupby function; for a group whose values are all NaN, the group mean is also NaN.
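A sketch that ties the groupby headings together on made-up data: grouping by one and two keys, aggregating, and a transform that fills NaN with the group mean. Group "b" contains only NaN, so its mean is NaN and the fill changes nothing there.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "year": [1, 2, 1, 2],
    "score": [1.0, 3.0, np.nan, np.nan],
})

means = df.groupby("team")["score"].mean()            # one key
by_two = df.groupby(["team", "year"])["score"].sum()  # two keys: hierarchical index
summary = df.groupby("team")["score"].agg(["min", "max"])

# transform returns a result aligned with the original rows.
filled = df.groupby("team")["score"].transform(lambda g: g.fillna(g.mean()))

print(means, by_two, summary, filled, sep="\n\n")
```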
FILTERING IN PYTHON
REMOVING NULL VALUES
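A sketch of group filtering and of removing null values, assuming "filtering" refers to GroupBy.filter. The condition (groups with more than one row) is only an example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"team": ["a", "a", "b"],
                   "score": [1.0, np.nan, 5.0]})

# Keep only the rows of groups that satisfy the condition.
big_groups = df.groupby("team").filter(lambda g: len(g) > 1)

# Remove rows that contain a null value.
clean = df.dropna()

print(big_groups)
print(clean)
```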
TIME SERIES
TIME AS INDEX FOR A SERIES
M= MONTHS, B= BUSINESS DAYS, W= WEEKS
LAST BUSINESS DAY OF MONTH
LAST WORKING DAY OF THE MONTH.
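A sketch of date ranges at these frequencies. The "B" and "W" aliases are used directly; for month end and last business day of the month it passes offset objects (pd.offsets.MonthEnd, pd.offsets.BMonthEnd) because the string aliases for those have changed between pandas versions. The dates are arbitrary.

```python
import pandas as pd

business_days = pd.date_range("2024-01-05", periods=3, freq="B")
weekly = pd.date_range("2024-01-01", periods=3, freq="W")

month_ends = pd.date_range("2024-01-01", periods=3,
                           freq=pd.offsets.MonthEnd())
business_month_ends = pd.date_range("2024-01-01", periods=3,
                                    freq=pd.offsets.BMonthEnd())

# A Series indexed by dates can be looked up by timestamp.
ts = pd.Series([1, 2, 3], index=business_days)
value = ts.loc[pd.Timestamp("2024-01-08")]

print(business_days, weekly, month_ends, business_month_ends, sep="\n")
```

The business-day range skips the weekend of 6 and 7 January 2024, and the last business day of March 2024 falls on the 29th because the 31st is a Sunday.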
PERIOD FUNCTION
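A sketch of pd.Period and pd.period_range with monthly frequency. A Period represents a whole span of time (here one month) rather than a single instant; the dates are arbitrary.

```python
import pandas as pd

p = pd.Period("2024-01", freq="M")

next_p = p + 1          # arithmetic moves by whole periods
start = p.start_time    # first instant of the period

rng = pd.period_range("2024-01", periods=3, freq="M")

print(p, next_p, start, list(rng))
```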
HOLIDAY ASSISTANCE
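A sketch of holiday-aware business days, assuming this heading refers to the calendars in pandas.tseries.holiday. The choice of USFederalHolidayCalendar and of the dates is illustrative only.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

# Business-day offset that also skips US federal holidays.
us_business_day = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# 4 July 2024 is a holiday, so it is left out of the range.
days = pd.date_range("2024-07-03", periods=3, freq=us_business_day)

print(days)
```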
MATPLOTLIB