
Subject IP

The document provides an overview of the Pandas library in Python, detailing its purpose for data analysis and manipulation. It covers key concepts such as Series and DataFrames, their creation, and essential operations including indexing, slicing, and mathematical functions. Additionally, it outlines important questions for Class 12 Informatics Practices focusing on data handling using Pandas.

Uploaded by

krish040goyal


Subject IP

NOTES

PANDAS
1. Introduction to Pandas

Pandas is a powerful, open-source Python library used for data analysis and manipulation.

It provides high-performance, easy-to-use data structures and data analysis tools.

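The name "Pandas" is derived from "panel data," an econometrics term for data sets that record observations of the same subjects over multiple time periods.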

To use Pandas in your Python program, you need to import it:

import pandas as pd

2. Why Pandas?

Data Handling: Efficiently reads and writes data in various formats (CSV, Excel, etc.).

Data Analysis: Performs calculations, statistical analysis, and data aggregation.

Data Manipulation: Allows for easy selection, filtering, sorting, reshaping, and combining of data.

Missing Data Handling: Provides tools to deal with missing values (NaN).

Time-Series Functionality: Offers advanced features for working with time-series data.

3. Pandas Data Structures

Pandas primarily uses two fundamental data structures:

Series:

One-dimensional (1-D) labeled array.

Can hold data of any type, but all elements share a single data type (homogeneous data).

Data values are mutable (can be changed), but the size is immutable (cannot be changed after creation).

Can be thought of as a column in a spreadsheet or a single list with an index.

Creation: From lists, arrays, dictionaries, or scalar values.

Operations: Indexing, slicing, mathematical operations, statistical functions (e.g., sum())

DataFrame:

Two-dimensional (2-D) labeled data structure with columns of potentially different types (heterogeneous data).

Similar to a spreadsheet or a SQL table, with rows and columns.

The most commonly used Pandas object for tabular data.

Creation: From dictionaries of Series/lists, lists of dictionaries, NumPy arrays, or CSV/Excel files.

Operations:

Accessing Data: Using column names, row labels (index), loc (label-based), iloc (integer-location based).

Adding/Deleting Columns/Rows: Using assignment or methods like drop().

Data Manipulation: Filtering, sorting, grouping, merging.

Inspection and Descriptive Statistics: head(), tail(), describe(), info().

4. Key Concepts

Index: Labels used to identify rows in Series and DataFrames.

Column Names: Labels used to identify columns in DataFrames.

Missing Data (NaN): Represents "Not a Number" for missing values. Pandas provides methods to handle these.

Vectorization: Pandas operations are often optimized for performance through vectorized computations.
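As a minimal sketch of the last two concepts, the snippet below builds a small Series with one missing value and applies a vectorized operation to it. The variable names (marks, doubled, filled) and the data are illustrative assumptions, not part of the original notes.

```python
import numpy as np
import pandas as pd

# A Series with one missing value; np.nan marks the gap
marks = pd.Series([80.0, np.nan, 65.0], index=['ram', 'sham', 'ria'])

# Vectorized operation: every element is doubled without an explicit loop
doubled = marks * 2

# Detect and handle the missing value
missing_count = marks.isna().sum()   # number of NaN entries
filled = marks.fillna(0)             # copy of marks with NaN replaced by 0

print(doubled)
print("Missing values:", missing_count)
print(filled)
```

Note that NaN propagates through arithmetic: the doubled value for 'sham' is still NaN until it is filled or dropped.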

Creating a Series.

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

✓ A Series is a labeled one-dimensional array that can hold any type of data.

✓ The data in a Series is mutable: its values can be changed.

✓ The size of a Series is immutable: it cannot be changed after creation.

✓ A Series can be thought of as a data structure with two arrays: one works as the index (labels) and the other holds the data.

Creating a Series from a scalar value

To create a Series from a scalar value, an index must be provided. The scalar value is repeated to match the length of the index.

From a list.

Ser2 = pd.Series([12, 23, 34, 45, 67])

>>> print(Ser2)

0 12

1 23

2 34

3 45
4 67

From a range() object.

ser3 = pd.Series(range(4))

>>> print(ser3)

0 0

1 1

2 2

3 3

Customizing the index.

ser3.index = ['One', 'Two', 'Three', 'Four']

>>> print(ser3)

One 0

Two 1

Three 2

Four 3

You need to import the Pandas module first: import pandas as pd.

From a List/Array: s = pd.Series([10, 20, 30, 40])

From a Dictionary: s = pd.Series({'a': 10, 'b': 20, 'c': 30})

With a Specific Index: s = pd.Series([10, 20, 30], index=['x', 'y', 'z'])

From a Scalar Value: s = pd.Series(5, index=[1, 2, 3])

Examples of creation of series:-

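1. Creation of empty series:-

import pandas as pd
S1 = pd.Series(dtype='float64')  # an explicit dtype avoids the default-dtype warning raised by some Pandas 1.x versions
print(S1)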

2. Creation of series using list:-

import pandas as pd
S1=pd.Series([23,45,67,99])
print (S1)
output:-
0 23
1 45
2 67
3 99

We can also assign user-defined labels to the index:

import pandas as pd

S1 = pd.Series([34, 44, 23], index=["ram", "sham", "ria"])

print(S1)

Output:-

ram 34

sham 44

ria 23

3. Creation of series using Dictionaries:-

import pandas as pd

D = {2: "abc", 5: "qwe", 8: "tyu"}  # dictionary keys become the index labels

S2 = pd.Series(D)

print(S2)

Output:-

2 abc

5 qwe

8 tyu

4. Creation of series with scalar values:-

import pandas as pd

S3 = pd.Series(5, index=['YELLOW', 'RED', 'GREEN'])

print(S3)

OUTPUT:-

YELLOW 5

RED 5

GREEN 5

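Accessing Data in a Series

By Index Label: s['a'] (returns the value associated with label 'a')

By Position: s.iloc[0] (returns the element at the first position). With a non-integer index, s[0] also works in older Pandas versions, but this positional fallback is deprecated, so iloc is the safer choice.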
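A short sketch of label-based and position-based access, reusing the Series s created earlier in these notes. The slicing lines go slightly beyond the text and are included as an illustration; the variable names are assumptions.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

first_by_label = s['a']        # label-based access
first_by_pos = s.iloc[0]       # position-based access
label_slice = s.loc['b':'c']   # label slice: both end points are included
pos_slice = s.iloc[1:3]        # position slice: the stop position is excluded

print(first_by_label, first_by_pos)
print(label_slice)
print(pos_slice)
```

Both slices select the same two elements here, but for different reasons: the label slice includes 'c', while the position slice stops before position 3.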

Attributes of a Series

s.values: Returns the data as a NumPy array.

s.index: Returns the index labels.

s.dtype: Returns the data type of the Series elements.

s.shape: Returns the number of elements as a tuple (e.g., (5,)).

s.nbytes: Returns the memory occupied by the Series in bytes.

s.empty: Returns True if the Series is empty, False otherwise.
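The attributes listed above can be inspected on any Series. The sketch below assumes the same four-element Series used earlier; exact dtype and byte counts depend on the data, so only the values that are certain are noted in comments.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

print(s.values)   # underlying data as a NumPy array
print(s.index)    # index labels
print(s.dtype)    # data type of the elements
print(s.shape)    # (4,)
print(s.nbytes)   # bytes used by the data (element count x bytes per element)
print(s.empty)    # False
```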

Useful Methods

s.head(n): Returns the first n rows (default is 5).

s.tail(n): Returns the last n rows (default is 5).
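A minimal illustration of head() and tail() on a seven-element Series. The data and variable names are made up for this example.

```python
import pandas as pd

s = pd.Series(range(10, 80, 10))   # 10, 20, ..., 70 (seven elements)

first_five = s.head()    # no argument: first 5 elements
last_two = s.tail(2)     # last 2 elements

print(first_five)
print(last_two)
```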

Mathematical Operations

Series support various mathematical operations, which are often vectorized:

Addition: s1 + s2

Subtraction: s1 - s2

Multiplication: s1 * s2
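When two Series are combined, Pandas matches elements by index label, not by position. The sketch below uses two hypothetical Series whose labels only partly overlap to show the effect; the add() call with fill_value is one way to avoid the resulting NaN values.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['a', 'b', 'd'])

total = s1 + s2                          # 'c' and 'd' have no partner, so the result is NaN there
product = s1 * 2                         # a scalar is applied to every element
total_filled = s1.add(s2, fill_value=0)  # treat a missing partner as 0

print(total)
print(product)
print(total_filled)
```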

DataFrames in Python

1. Introduction to DataFrames:

A DataFrame is a two-dimensional, labeled, heterogeneous data structure in Pandas.

It is essentially a tabular data structure with rows and columns, similar to a spreadsheet or a database table.

Each column can hold data of a different data type, but all values within a single column must be of the same data type.

DataFrames have two indices: a row index (axis 0) and a column index (axis 1). The labels in these indices can be numbers or strings.

2. Characteristics of DataFrames:

Value Mutable: The values within a DataFrame can be changed or updated.

Size Mutable: Rows and columns can be added or deleted from a DataFrame.

Heterogeneous: Different columns can store different data types.

3. Creating DataFrames:
From Dictionary of Series.

import pandas as pd

data = {'Name': pd.Series(['Alice', 'Bob', 'Charlie']),
        'Age': pd.Series([25, 30, 22]),
        'City': pd.Series(['NY', 'LA', 'Chicago'])}

df = pd.DataFrame(data)

print(df)

From Dictionary of Lists/Arrays.

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 22],
        'City': ['NY', 'LA', 'Chicago']}

df = pd.DataFrame(data)

print(df)

From a List of Dictionaries.

import pandas as pd

data = [{'Name': 'Alice', 'Age': 25},
        {'Name': 'Bob', 'Age': 30},
        {'Name': 'Charlie', 'Age': 22}]

df = pd.DataFrame(data)

print(df)
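The notes also mention creating a DataFrame from a CSV file. The sketch below reads CSV text from an in-memory buffer so that it runs without a file on disk; the CSV content is made up, and in practice the argument to read_csv would be a file path (a name such as 'students.csv' is only a hypothetical example).

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = "Name,Age,City\nAlice,25,NY\nBob,30,LA\nCharlie,22,Chicago\n"

# read_csv accepts a file path or any file-like object
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv)
print(df_csv.dtypes)   # data type inferred for each column
```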

4. Accessing Data in DataFrames:

Accessing Columns.

print(df['Name']) # Using column name

print(df.Age) # Using dot notation (if column name is a valid identifier)

Accessing Rows by Label (loc).

print(df.loc[0]) # Accessing row with index label 0

print(df.loc[[0, 2]]) # Accessing multiple rows

Accessing Rows by Position (iloc)

print(df.iloc[0]) # Accessing row at positional index 0

print(df.iloc[[0, 2]]) # Accessing multiple rows by position

Accessing Specific Cells.

print(df.loc[0, 'Name']) # Accessing cell at row label 0, column 'Name'

print(df.iloc[1, 1]) # Accessing cell at row position 1, column position 1
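With the default index 0, 1, 2, loc and iloc look identical. The difference is easier to see with string row labels, as in this sketch. The labels 'r1' to 'r3' are hypothetical and chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22]},
                  index=['r1', 'r2', 'r3'])   # hypothetical row labels

by_label = df.loc['r2']                 # row whose label is 'r2'
by_position = df.iloc[1]                # second row, whatever its label
cell = df.loc['r3', 'Age']              # single cell selected by labels
subset = df.loc['r1':'r2', ['Name']]    # label slice includes both end points

print(by_label)
print(by_position)
print(cell)
print(subset)
```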

5. Important DataFrame Functions:

head(n): Returns the first n rows of the DataFrame (default n=5).

tail(n): Returns the last n rows of the DataFrame (default n=5).

shape: Returns a tuple representing the dimensions (rows, columns) of the DataFrame.

columns: Returns an Index object containing column labels.

index: Returns an Index object containing row labels.

sort_values(by='column_name', ascending=True/False): Sorts the DataFrame by the values in a specified column.

sort_index(ascending=True/False): Sorts the DataFrame by its index.

rename(columns={'old_name': 'new_name'}, index={'old_label': 'new_label'}): Renames columns or row labels.

pd.concat([df1, df2]): Concatenates DataFrames along an axis (rows by default). This is a top-level Pandas function, not a DataFrame method.

pd.merge(df1, df2, on='common_column'): Merges DataFrames based on a common column.
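The following sketch exercises several of the functions above on small made-up DataFrames. The names df1, df2, and cities, and their contents, are assumptions for the example.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df2 = pd.DataFrame({'Name': ['Charlie'], 'Age': [22]})
cities = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                       'City': ['NY', 'LA', 'Chicago']})

# Stack rows; ignore_index=True renumbers the rows from 0
combined = pd.concat([df1, df2], ignore_index=True)

# Sort by Age, youngest first
by_age = combined.sort_values(by='Age', ascending=True)

# Rename a column; a new DataFrame is returned and combined is unchanged
renamed = combined.rename(columns={'Age': 'Years'})

# Join the two tables on the shared 'Name' column
merged = pd.merge(combined, cities, on='Name')

print(by_age)
print(renamed)
print(merged)
```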

6. Modifying DataFrames:

Adding a new column.

df['Country'] = ['USA', 'USA', 'USA']

Modifying existing values.

df.loc[0, 'Age'] = 26

Adding a new row (use loc; the older append() method was deprecated in Pandas 1.4 and removed in Pandas 2.0):

df.loc[3] = ['David', 28, 'London', 'UK']

Deleting columns.

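del df['City'] # removes the column in place

# or
df = df.drop('City', axis=1) # drop() returns a new DataFrame, so assign the result

Deleting rows.

df = df.drop(0, axis=0) # Deleting the row with index label 0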
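Put together, the modifications in this section can be sketched as one runnable script. It assumes the three-column DataFrame built from a dictionary of lists earlier; the names without_city and without_first are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['NY', 'LA', 'Chicago']})

df['Country'] = ['USA', 'USA', 'USA']        # add a column
df.loc[0, 'Age'] = 26                        # change one cell
df.loc[3] = ['David', 28, 'London', 'UK']    # add a row with label 3

without_city = df.drop('City', axis=1)       # new DataFrame without the column
without_first = df.drop(0, axis=0)           # new DataFrame without row 0

print(df)
print(without_city)
print(without_first)
```

Because drop() returns a new DataFrame by default, df itself still contains the 'City' column and row 0 after these calls.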

Mathematical Functions on DataFrames

DataFrames in Pandas allow for various mathematical operations to be performed on their data, either element-wise or across entire rows/columns. These operations are fundamental for data analysis and manipulation.

1. Basic Arithmetic Operations:

These operations are performed element-wise on DataFrames or between a DataFrame and a scalar value.

Addition (+): Adds corresponding elements of two DataFrames or adds a scalar to each element.

Subtraction (-): Subtracts corresponding elements or subtracts a scalar from each element.

Multiplication (*): Multiplies corresponding elements or multiplies each element by a scalar.

Division (/): Divides corresponding elements or divides each element by a scalar.

Floor Division (//): Divides and rounds each result down to the nearest whole number.

Modulo (%): Returns the remainder of the division.
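The operators above can be tried on two small DataFrames with matching row and column labels. The data below is made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [10, 20], 'B': [30, 40]})
df2 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

added = df1 + df2        # element-wise addition
scaled = df1 * 2         # scalar applied to every element
divided = df1 / df2      # true division, results are floats
floored = df1 // 3       # floor division by a scalar
remainder = df1 % 3      # remainder after division by 3

print(added)
print(scaled)
print(divided)
print(floored)
print(remainder)
```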

Aggregation functions such as sum() take an axis parameter that sets the direction of the calculation:

axis=0 (default): Aggregates down each column, giving one result per column.

axis=1: Aggregates across each row, giving one result per row.

Example with axis:

import pandas as pd

df = pd.DataFrame({'A': [10, 20], 'B': [30, 40]})

# Sum of each row

row_sums = df.sum(axis=1)

print("Row Sums:\n", row_sums)
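For comparison, this sketch runs the same DataFrame through both axis settings. The use of mean() is an added illustration, not something the notes cover.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20], 'B': [30, 40]})

col_sums = df.sum(axis=0)    # one value per column: A=30, B=70
row_sums = df.sum(axis=1)    # one value per row: 40, 60
col_means = df.mean()        # axis=0 is the default

print("Column Sums:\n", col_sums)
print("Row Sums:\n", row_sums)
print("Column Means:\n", col_means)
```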

IMPORTANT QUESTIONS

Important questions for Class 12 IP (Informatics Practices) focusing on Data Handling using Pandas typically cover the following key areas:

1. Pandas Series:

Creation: Creating Series from lists, NumPy arrays, and dictionaries.

Attributes: Understanding index, values, dtype, size, nbytes, itemsize.

Indexing and Slicing: Accessing elements using integer-based indexing (iloc), label-based indexing (loc), and direct indexing with []. Slicing Series to extract subsets.

Operations: Performing mathematical operations (addition, subtraction, etc.), applying functions, and handling missing values (NaN).

2. Pandas DataFrame:

Creation:

Creating DataFrames from dictionaries of Series, lists of dictionaries, and external files (CSV).

Attributes:

Understanding index, columns, shape, size, dtypes.

Indexing and Slicing:

Accessing rows and columns using loc, iloc, and direct column selection with []. Slicing DataFrames by rows and columns.

Operations:

Adding and deleting rows and columns.

Modifying data in specific cells or entire rows/columns.

Performing calculations across rows/columns (e.g., sum()).

Note: This sample paper is based on the entire syllabus. You only need to practice questions on Pandas, data structures, Series, DataFrames, and MySQL.
