
FDS Exp4

Pandas is an open-source Python library essential for data manipulation and analysis, built on NumPy, and provides high-level data structures like Series and DataFrame. It offers features such as efficient data manipulation, handling missing data, and powerful group-by capabilities, making it a fundamental tool for data science. Users can easily install Pandas, import it, and perform various operations including reading data from different formats, filtering, modifying, and exporting data.

Uploaded by

harsh.pandey22


What is Pandas Library?

Pandas is an open-source Python library widely used for data manipulation, analysis, and
preprocessing tasks. It is a fundamental library for data science and analytics and is built on top
of NumPy, providing high-level data structures and methods to work with structured data
efficiently.

Pandas primarily offers two data structures for handling data:-

Series: A one-dimensional labeled array capable of holding any data type.

DataFrame: A two-dimensional labeled data structure, similar to a table in a relational database
or an Excel spreadsheet, consisting of rows and columns.

Features of Pandas:-

• Fast and Efficient Data Manipulation: Pandas provides a variety of functions to
manipulate, clean, and analyze data efficiently.
• Handling Missing Data: Pandas can detect, fill, or remove missing values, making it easier
to preprocess datasets.
• Data Alignment and Merging: It supports database-like operations, such as merging,
joining, and reshaping data.
• Label-Based Slicing and Indexing: Pandas allows access to data using row/column labels
as well as positional indexing.
• Group By Functionality: It provides powerful group-by capabilities, allowing you to split
data, apply functions, and combine results.
• Data Cleaning: Pandas simplifies tasks like renaming columns, handling missing values,
or removing duplicates.
• Support for Time-Series Data: Pandas provides specialized tools for handling time-series
data, including date-based indexing, resampling, and rolling-window calculations.
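
The time-series support named in the last bullet can be sketched as follows. This is a minimal illustration only: the dates and readings are made up, and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical daily readings, invented for illustration
dates = pd.date_range("2024-01-01", periods=6, freq="D")
readings = pd.Series([10, 12, 11, 15, 14, 16], index=dates)

# Date-based indexing: look up one day by its timestamp label
jan_3 = readings.loc[pd.Timestamp("2024-01-03")]

# Resampling: sum the readings in consecutive 2-day buckets
two_day_totals = readings.resample("2D").sum()

# Rolling window: mean over the current day and the two days before it
rolling_mean = readings.rolling(window=3).mean()

print(jan_3)
print(two_day_totals)
print(rolling_mean)
```

The first two entries of rolling_mean are NaN because a full 3-day window is not yet available for them.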

1.Installing Pandas:-

Before using Pandas, make sure it is installed. If it is not, you can install it with pip:

pip install pandas

Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (2.1.4)
Requirement already satisfied: numpy<2,>=1.22.4 in /usr/local/lib/python3.10/dist-packages (from pandas) (1.26.4)
Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas) (2024.2)
Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas) (2024.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)

2.Importing Pandas:-

To start working with Pandas, first import it into your Python script:

import pandas as pd

3.Pandas Data Structures:-

Series: A Series is essentially a one-dimensional array, similar to a column in a spreadsheet or a
list in Python, but with labels (called the index).

import pandas as pd

# Creating a Series
data = [1, 3, 5, 7, 9]
series = pd.Series(data)
print(series)

• Indexing: You can access the elements of a Series using its index.
print(series[2]) # Outputs: 5

• Custom Index: You can also define custom labels for the Series index.
series = pd.Series(data, index=['a', 'b', 'c', 'd', 'e'])
print(series['c']) # Outputs: 5
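
To make the difference between labels and positions more explicit, here is a small sketch that reuses the same data. The second Series (other) and its labels are hypothetical and serve only to show how arithmetic aligns on the index.

```python
import pandas as pd

data = [1, 3, 5, 7, 9]
series = pd.Series(data, index=['a', 'b', 'c', 'd', 'e'])

# .loc looks up by label, .iloc by integer position
by_label = series.loc['c']
by_position = series.iloc[2]

# A label slice with .loc includes the end label
label_slice = series.loc['b':'d']

# Arithmetic aligns on labels; labels present in only one Series give NaN
other = pd.Series([10, 20], index=['a', 'z'])  # hypothetical second Series
aligned_sum = series + other

print(by_label, by_position)
print(label_slice)
print(aligned_sum)
```

Only label 'a' exists in both Series, so it is the only entry of aligned_sum that is not NaN.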

DataFrame: A DataFrame is a two-dimensional data structure, similar to a table in a relational
database or an Excel spreadsheet. It consists of rows and columns.

# Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35],
'City': ['New York', 'Paris', 'London']}
df = pd.DataFrame(data)
print(df)

4.Reading Data into Pandas:- Pandas makes it easy to load data from various file formats, such
as CSV, Excel, and SQL databases.

• Reading CSV Files

# Reading data from a CSV file
df = pd.read_csv('data.csv')
print(df.head()) # Prints the first 5 rows of the DataFrame

• Reading Excel Files

# Reading data from an Excel file
# (.xlsx files need an Excel engine such as openpyxl to be installed)
df = pd.read_excel('data.xlsx', sheet_name='Sheet1')

• Reading from SQL Databases

import sqlite3

# Connecting to a SQL database and reading data into a DataFrame

conn = sqlite3.connect('database.db')
df = pd.read_sql_query('SELECT * FROM tablename', conn)
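
The snippets above depend on files that may not exist on your machine. The sketch below can be run without any files: it assumes an in-memory CSV string and an in-memory SQLite database as stand-ins for 'data.csv' and 'database.db'. The table name people and its rows are hypothetical.

```python
import io
import sqlite3

import pandas as pd

# In-memory CSV text standing in for 'data.csv' (hypothetical content)
csv_text = "Name,Age,City\nAlice,25,New York\nBob,30,Paris\nCharlie,35,London\n"
df_csv = pd.read_csv(io.StringIO(csv_text))
print(df_csv.head())

# In-memory SQLite database standing in for 'database.db'
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (Name TEXT, Age INTEGER, City TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Alice", 25, "New York"), ("Bob", 30, "Paris")],
)
conn.commit()

# A parameterised query keeps values out of the SQL string
df_sql = pd.read_sql_query("SELECT * FROM people WHERE Age > ?", conn, params=(26,))
conn.close()
print(df_sql)
```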

5.Basic Operations with DataFrames:-

• Viewing Data:

.head(): Displays the first few rows of the DataFrame.

.tail(): Displays the last few rows of the DataFrame.

print(df.head()) # View the first 5 rows

print(df.tail()) # View the last 5 rows

• Inspecting Data:

.shape: Returns the dimensions of the DataFrame (rows, columns).

.columns: Returns the column names.

.info(): Provides a concise summary of the DataFrame, including the data types and
non-null counts.

.describe(): Provides descriptive statistics for numeric columns.

print(df.shape) # Get the shape (rows, columns)

print(df.columns) # Get the column names
df.info() # Get information about the DataFrame
print(df.describe()) # Get summary statistics
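
As a concrete illustration, the following sketch applies these inspection tools to the small Name/Age/City DataFrame created earlier. The variable names are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Paris', 'London'],
})

# .shape is a (rows, columns) tuple
n_rows, n_cols = df.shape

# .columns is an Index object; list() turns it into a plain list
column_names = list(df.columns)

# .describe() summarises numeric columns only, so just 'Age' appears here
stats = df.describe()
mean_age = stats.loc['mean', 'Age']

print(n_rows, n_cols)
print(column_names)
print(stats)
```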

6.Selecting Data from a DataFrame:-

You can select columns with square brackets and select rows (optionally together with columns) using loc and iloc.

• Selecting Columns:
# Selecting a single column
age_column = df['Age']

# Selecting multiple columns

subset = df[['Name', 'City']]

• Selecting Rows:
– loc: Select rows and columns by label.
– iloc: Select rows and columns by position (index).
# Selecting rows by index using loc
row = df.loc[1] # Selects the second row by label (index 1)

# Selecting rows by index using iloc

row = df.iloc[1] # Selects the second row by position (index 1)

# Selecting a range of rows

subset = df.iloc[0:2] # Selects the first two rows
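
With the default index (0, 1, 2, ...), loc and iloc appear to behave the same. The sketch below assumes a hypothetical non-default index (10, 20, 30) to show where they differ.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=[10, 20, 30],  # hypothetical row labels, not positions
)

by_label = df.loc[20]      # the row whose label is 20
by_position = df.iloc[1]   # the second row, whatever its label

# loc slices include the end label; iloc slices exclude the end position
label_slice = df.loc[10:20]
position_slice = df.iloc[0:1]

# Both accept a row selector and a column selector together
age_of_bob = df.loc[20, 'Age']
first_cell = df.iloc[0, 0]

print(by_label['Name'], by_position['Name'])
print(len(label_slice), len(position_slice))
```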

7.Filtering and Querying Data:-

You can filter the rows of a DataFrame by applying conditions on the data.

# Filtering rows where Age > 30

filtered_df = df[df['Age'] > 30]

# Filtering rows with multiple conditions

filtered_df = df[(df['Age'] > 25) & (df['City'] == 'New York')]

You can also use the .query() method for filtering:

# Using query method

filtered_df = df.query('Age > 25 and City == "New York"')
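
A few more filtering patterns, sketched on the same sample data. The threshold variable min_age is a hypothetical name introduced here to show how query() can refer to a Python variable with the @ prefix.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Paris', 'London'],
})

# OR condition: each comparison needs its own parentheses
either = df[(df['Age'] < 30) | (df['City'] == 'London')]

# NOT condition with ~
not_paris = df[~(df['City'] == 'Paris')]

# Membership test against a list of values
in_cities = df[df['City'].isin(['Paris', 'London'])]

# query() can use a local Python variable via @
min_age = 28
older = df.query('Age > @min_age')

print(either)
print(older)
```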

8.Modifying Data:-

• Adding New Columns: You can add new columns to the DataFrame by assigning values
to a new column name.
# Adding a new column
df['Salary'] = [50000, 60000, 70000]

• Modifying Existing Columns: You can modify columns by applying operations on them.
# Updating an existing column
df['Age'] = df['Age'] + 1 # Increase each age by 1
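
Other common modifications can be sketched in the same way. The Bonus column, the 10% rate, and the 75000 figure are hypothetical values chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000],
})

# New column derived from an existing one
df['Bonus'] = df['Salary'] * 0.10

# Update only the rows that match a condition, using .loc
df.loc[df['Age'] > 30, 'Salary'] = 75000

# rename() and drop() return new DataFrames; df itself is left unchanged
renamed = df.rename(columns={'Salary': 'AnnualSalary'})
trimmed = df.drop(columns=['Bonus'])

print(df)
print(renamed.columns.tolist())
print(trimmed.columns.tolist())
```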

9.Handling Missing Data:- Pandas makes it easy to identify and handle missing data (NaN
values).

• Checking for Missing Data:

# Check for missing values in the DataFrame
print(df.isnull())
print(df.isnull().sum()) # Count missing values in each column

• Filling Missing Data: You can fill missing values using the .fillna() method.
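# Fill missing values with a default value
# (assign the result back; calling fillna(..., inplace=True) on a selected
# column is deprecated in recent pandas versions and may not update df)
df['Salary'] = df['Salary'].fillna(0)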

• Dropping Missing Data: You can drop rows or columns with missing values
using .dropna().
# Drop rows with missing data
df.dropna(inplace=True)
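
The sample DataFrame built earlier has no missing values, so the calls above have nothing to act on. The sketch below assumes a hypothetical DataFrame with two NaN entries to show the effect of each step.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing Age and one missing Salary
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, np.nan, 35],
    'Salary': [50000, 60000, np.nan],
})

# Number of missing values per column
missing_per_column = df.isnull().sum()

# Fill each column with its own default; a new DataFrame is returned
filled = df.fillna({'Age': df['Age'].mean(), 'Salary': 0})

# Drop only the rows where 'Age' is missing
age_known = df.dropna(subset=['Age'])

print(missing_per_column)
print(filled)
print(age_known)
```

Filling Age with the column mean is one possible choice among several; which default is appropriate depends on the dataset.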

10.Grouping and Aggregating Data:- You can group data based on specific columns and
perform aggregation operations like sum, mean, count, etc.

# Grouping by a column and calculating the mean of another column

grouped_df = df.groupby('City')['Age'].mean()
print(grouped_df)
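
In the three-row sample data every city appears once, so the group means simply repeat the individual ages. The sketch below assumes a hypothetical dataset with repeated cities and also shows several aggregations computed in one call. The output column names people, mean_age, and total_salary are illustrative.

```python
import pandas as pd

# Hypothetical data with repeated cities
df = pd.DataFrame({
    'City': ['Paris', 'Paris', 'London', 'London', 'London'],
    'Age': [30, 40, 20, 25, 30],
    'Salary': [60000, 80000, 40000, 50000, 60000],
})

# Split by City, apply mean to Age, combine into a Series indexed by City
mean_age = df.groupby('City')['Age'].mean()

# Several aggregations at once, each given an output column name
summary = df.groupby('City').agg(
    people=('Age', 'count'),
    mean_age=('Age', 'mean'),
    total_salary=('Salary', 'sum'),
)

print(mean_age)
print(summary)
```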

11.Merging and Joining DataFrames:- Pandas supports merging DataFrames using the pd.merge()
function or the equivalent DataFrame.merge() method (similar to SQL joins).

# Merging two DataFrames

# Inner join on the 'ID' column (df1 and df2 are assumed to be existing
# DataFrames that both contain an 'ID' column)
merged_df = pd.merge(df1, df2, on='ID', how='inner')
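
The merge call above refers to df1 and df2 without defining them. A runnable sketch, assuming two small hypothetical tables that share an 'ID' column, might look like this:

```python
import pandas as pd

# Hypothetical tables sharing an 'ID' column
df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [2, 3, 4], 'Salary': [60000, 70000, 80000]})

# Inner join: only IDs present in both tables
inner = pd.merge(df1, df2, on='ID', how='inner')

# Left join: every row of df1; Salary is NaN where df2 has no match
left = pd.merge(df1, df2, on='ID', how='left')

# Outer join: all IDs from both tables; indicator=True adds a '_merge'
# column recording which table each row came from
outer = pd.merge(df1, df2, on='ID', how='outer', indicator=True)

print(inner)
print(left)
print(outer)
```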

12.Exporting Data:- Pandas allows you to export DataFrames to various file formats.

• Exporting to CSV:
# Save DataFrame to a CSV file
df.to_csv('output.csv', index=False)

• Exporting to Excel:
# Save DataFrame to an Excel file
df.to_excel('output.xlsx', index=False)
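
To check an export without creating files, one option is to let to_csv() return the CSV text and read it straight back. This is a minimal sketch with a hypothetical two-row DataFrame.

```python
import io

import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = df.to_csv(index=False)

# Reading the text back gives the same columns and values
round_trip = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
print(round_trip)
```

Passing index=False keeps the row index out of the file, so reading it back does not produce an extra unnamed column.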
