NumPy and Pandas

The document provides an overview of NumPy and Pandas, two essential libraries for scientific computing and data manipulation in Python. It details key features, installation instructions, and basic operations for both libraries, including array creation, mathematical functions, data structures, and data cleaning techniques. The document serves as a guide for users to effectively utilize NumPy and Pandas for various data analysis tasks.

Uploaded by

Akshat Joshi

NumPy and Pandas

NumPy is a fundamental package for scientific computing with Python. It
provides support for arrays, matrices, and a large collection of
mathematical functions to operate on these data structures efficiently.

Key Features of NumPy

1. N-dimensional Array Object:
○ The core of NumPy is the ndarray, a powerful n-dimensional
array object.
○ Supports various data types and operations.
2. Universal Functions (ufuncs):
○ Functions that operate element-wise on arrays.
○ Includes mathematical, logical, bitwise, and other functions.
3. Broadcasting:
○ Allows arithmetic operations on arrays of different but
compatible shapes, such as a matrix and a single row.
○ Avoids explicit Python loops, which keeps code short and fast.
4. Linear Algebra:
○ Provides tools for performing linear algebra operations, such as
matrix multiplication, eigenvalues, and singular value
decomposition.
5. Random Number Generation:
○ Generates random numbers for various distributions.
○ Useful for simulations and statistical computations.
6. Integration with Other Libraries:
○ Works seamlessly with other scientific computing libraries like
SciPy, Pandas, and Matplotlib.
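
As a rough illustration of the broadcasting feature listed above, the sketch below adds a column and a row of different shapes. The array names and values are made up for this example.

```python
import numpy as np

# Hypothetical example data: a (3, 1) column and a (1, 4) row.
column = np.array([[0], [10], [20]])
row = np.array([[1, 2, 3, 4]])

# Broadcasting stretches both operands to a common (3, 4) shape,
# so no explicit loop is needed.
table = column + row
print(table.shape)  # (3, 4)
print(table)
```

Each entry of the result is the sum of one column value and one row value, as if both arrays had been tiled to the full shape first.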

Installing NumPy

You can install NumPy using pip:

pip install numpy

Basic Operations with NumPy

1. Creating Arrays

import numpy as np

# Creating a 1D array
array_1d = np.array([1, 2, 3, 4, 5])
print("1D Array:", array_1d)

# Creating a 2D array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2D Array:\n", array_2d)

# Creating arrays with zeros, ones, and a range of numbers
zeros_array = np.zeros((3, 3))
ones_array = np.ones((2, 2))
range_array = np.arange(10)
print("Zeros Array:\n", zeros_array)
print("Ones Array:\n", ones_array)
print("Range Array:", range_array)

Output:

1D Array: [1 2 3 4 5]
2D Array:
 [[1 2 3]
 [4 5 6]]
Zeros Array:
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
Ones Array:
 [[1. 1.]
 [1. 1.]]
Range Array: [0 1 2 3 4 5 6 7 8 9]
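
After creating an array, it often helps to inspect its shape and data type. The following is a minimal sketch using the same kinds of arrays as above; the name int_zeros is introduced only for this illustration.

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6]])
zeros_array = np.zeros((3, 3))

print(array_2d.shape)     # (2, 3): two rows, three columns
print(array_2d.ndim)      # 2: number of dimensions
print(zeros_array.dtype)  # float64: np.zeros defaults to floats

# The dtype argument requests a different element type.
int_zeros = np.zeros((3, 3), dtype=int)
print(int_zeros.dtype)
```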

2. Array Operations

# Arithmetic operations (each one is applied element-wise)
array = np.array([1, 2, 3, 4])
print("Original Array:", array)

# Addition
array_add = array + 10
print("Array + 10:", array_add)

# Multiplication
array_mult = array * 2
print("Array * 2:", array_mult)

# Exponentiation
array_square = array ** 2
print("Array squared:", array_square)

Output:

Original Array: [1 2 3 4]
Array + 10: [11 12 13 14]
Array * 2: [2 4 6 8]
Array squared: [ 1  4  9 16]
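
The same operators also work between two arrays of equal shape, and arrays offer aggregation methods such as sum and mean. A short sketch, with array names and values chosen only for illustration:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Operators pair up elements at the same position.
elementwise_sum = a + b      # [11 22 33 44]
elementwise_product = a * b  # [ 10  40  90 160]

# Aggregations reduce an array to a single value.
total = a.sum()      # 10
average = a.mean()   # 2.5

print(elementwise_sum, elementwise_product, total, average)
```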

3. Universal Functions (ufuncs)

# Using ufuncs for element-wise operations
array = np.array([1, 2, 3, 4])

# Sine function
array_sin = np.sin(array)
print("Sine of Array:", array_sin)

# Exponential function
array_exp = np.exp(array)
print("Exponential of Array:", array_exp)

# Square root function
array_sqrt = np.sqrt(array)
print("Square Root of Array:", array_sqrt)

Output:

Sine of Array: [ 0.84147098  0.90929743  0.14112001 -0.7568025 ]
Exponential of Array: [ 2.71828183  7.3890561  20.08553692 54.59815003]
Square Root of Array: [1.         1.41421356 1.73205081 2.        ]
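
Ufuncs are not limited to mathematical functions. The sketch below shows a comparison, a logical, and a bitwise ufunc on the same input; the result names are chosen only for this example.

```python
import numpy as np

values = np.array([1, 2, 3, 4])

# Comparison ufunc: which elements are even?
is_even = np.equal(values % 2, 0)                   # [False  True False  True]

# Logical ufunc: which elements lie strictly between 1 and 4?
in_range = np.logical_and(values > 1, values < 4)   # [False  True  True False]

# Bitwise ufunc: lowest bit of each element.
lowest_bit = np.bitwise_and(values, 1)              # [1 0 1 0]

print(is_even, in_range, lowest_bit)
```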

4. Linear Algebra Operations

# Matrix multiplication (matrix_a @ matrix_b is equivalent for 2D arrays)
matrix_a = np.array([[1, 2], [3, 4]])
matrix_b = np.array([[5, 6], [7, 8]])
matrix_product = np.dot(matrix_a, matrix_b)
print("Matrix Product:\n", matrix_product)

# Inverse of a matrix
matrix_inv = np.linalg.inv(matrix_a)
print("Inverse of Matrix A:\n", matrix_inv)

# Eigenvalues and eigenvectors (each column of eigenvectors is one eigenvector)
eigenvalues, eigenvectors = np.linalg.eig(matrix_a)
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)

Output:

Matrix Product:
 [[19 22]
 [43 50]]
Inverse of Matrix A:
 [[-2.   1. ]
 [ 1.5 -0.5]]
Eigenvalues: [-0.37228132  5.37228132]
Eigenvectors:
 [[-0.82456484 -0.41597356]
 [ 0.56576746 -0.90937671]]
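
One way to sanity-check such results is to test the defining identities numerically. The sketch below uses np.allclose, which compares with a small tolerance because floating-point results are rarely exact; the names identity_ok and eigen_ok are illustrative.

```python
import numpy as np

matrix_a = np.array([[1, 2], [3, 4]])
matrix_inv = np.linalg.inv(matrix_a)
eigenvalues, eigenvectors = np.linalg.eig(matrix_a)

# A matrix times its inverse should give the identity matrix.
identity_ok = np.allclose(matrix_a @ matrix_inv, np.eye(2))

# For each eigenpair, A @ v should equal lambda * v.
# Multiplying by eigenvalues scales column j by eigenvalue j.
eigen_ok = np.allclose(matrix_a @ eigenvectors, eigenvectors * eigenvalues)

print(identity_ok, eigen_ok)  # True True
```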

5. Random Number Generation

# Generating random floats, uniformly distributed in [0, 1)
random_array = np.random.rand(5)
print("Random Array:", random_array)

# Generating random integers from 1 to 9 (the upper bound 10 is excluded)
random_integers = np.random.randint(1, 10, size=5)
print("Random Integers:", random_integers)

# Generating numbers from a standard normal distribution (mean 0, variance 1)
normal_array = np.random.randn(5)
print("Normal Distribution Array:", normal_array)

Output: (Note: Output will vary each time due to random generation)

Random Array: [0.85953447 0.73381974 0.37786374 0.84847527 0.64217697]
Random Integers: [4 1 6 9 7]
Normal Distribution Array: [ 0.35743143 -1.32095611 -0.61792992  0.77700679 ...]
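
When results need to be repeatable, a seeded generator can be used. The following sketch relies on NumPy's Generator interface (np.random.default_rng) rather than the functions shown above; the seed value 42 is an arbitrary choice.

```python
import numpy as np

# The seed value 42 is an arbitrary choice for this sketch.
rng = np.random.default_rng(42)

uniform_values = rng.random(5)                # floats in [0, 1)
integer_values = rng.integers(1, 10, size=5)  # integers from 1 to 9
normal_values = rng.standard_normal(5)        # mean 0, standard deviation 1

# A second generator with the same seed reproduces the same numbers.
rng_again = np.random.default_rng(42)
print(np.array_equal(uniform_values, rng_again.random(5)))  # True
```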

Pandas
Pandas is a powerful and widely used Python library for data manipulation
and analysis. It provides two main data structures, Series and DataFrame,
which are designed to make data cleaning, manipulation, and analysis fast
and easy. Let's explore some of the core functionalities of Pandas.

Key Features of Pandas

1. Data Structures:
○ Series: One-dimensional labeled array capable of holding any
data type.
○ DataFrame: Two-dimensional labeled data structure with
columns of potentially different types, similar to a table in a
database or an Excel spreadsheet.
2. Data Cleaning and Preparation:
○ Handling missing data, filtering, and cleaning data.
○ Data transformation and normalization.
3. Data Analysis and Exploration:
○ Aggregation, grouping, merging, and joining data.
○ Descriptive statistics and data summarization.
4. Time Series Analysis:
○ Tools for working with time-indexed data, resampling, and time-
based aggregations.
5. Data Input and Output:
○ Reading from and writing to various file formats such as CSV,
Excel, SQL databases, and more.
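
As a small illustration of the time series feature listed above, the sketch below resamples a daily series into two-day totals. The dates and values are made up for this example.

```python
import pandas as pd

# Hypothetical daily data for six consecutive days.
dates = pd.date_range("2024-01-01", periods=6, freq="D")
daily = pd.Series([1, 2, 3, 4, 5, 6], index=dates)

# Group the days into two-day bins and sum each bin.
two_day_totals = daily.resample("2D").sum()
print(two_day_totals)
```

The three bins cover days 1-2, 3-4, and 5-6, giving totals of 3, 7, and 11.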

Installing Pandas

You can install Pandas using pip:

pip install pandas

Basic Operations with Pandas

1. Creating Series and DataFrames

import pandas as pd

# Creating a Series
data = [1, 2, 3, 4, 5]
series = pd.Series(data)
print("Series:\n", series)

# Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'Country': ['USA', 'UK', 'Canada', 'Australia']}
df = pd.DataFrame(data)
print("\nDataFrame:\n", df)

Output:

Series:
0    1
1    2
2    3
3    4
4    5
dtype: int64

DataFrame:
      Name  Age    Country
0    Alice   25        USA
1      Bob   30         UK
2  Charlie   35     Canada
3    David   40  Australia
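
Both structures are labeled: a Series can carry a custom index, and a DataFrame records a type per column. A minimal sketch, where the index labels 'a', 'b', 'c' are chosen only for illustration:

```python
import pandas as pd

# A Series with string labels instead of the default 0, 1, 2, ...
series = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
print(series['b'])  # 2

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40],
                   'Country': ['USA', 'UK', 'Canada', 'Australia']})

print(df.shape)    # (4, 3): four rows, three columns
print(df.dtypes)   # one data type per column
```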

2. Reading and Writing Data

# Reading from a CSV file
# Assuming 'data.csv' exists with appropriate data
df = pd.read_csv('data.csv')
print("DataFrame from CSV:\n", df)

# Writing to a CSV file (index=False leaves the row index out of the file)
df.to_csv('output.csv', index=False)

Output:

DataFrame from CSV:

(output will depend on the contents of 'data.csv')
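
To try the same calls without a file on disk, an in-memory text buffer can stand in for the CSV file. This is only a sketch: io.StringIO replaces the file path, and the sample data is illustrative.

```python
import io

import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'],
                   'Age': [25, 30]})

# Write the DataFrame as CSV text into an in-memory buffer.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
print(buffer.getvalue())

# Rewind the buffer and read the CSV text back into a new DataFrame.
buffer.seek(0)
df_roundtrip = pd.read_csv(buffer)
print(df_roundtrip)
```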

3. Data Selection and Filtering

# These examples use the Name/Age/Country DataFrame from section 1,
# not the one read from 'data.csv'

# Selecting a single column
ages = df['Age']
print("Ages:\n", ages)

# Selecting multiple columns
subset = df[['Name', 'Country']]
print("Subset of DataFrame:\n", subset)

# Filtering rows based on a condition
filtered = df[df['Age'] > 30]
print("Filtered DataFrame:\n", filtered)

Output:

Ages:
0    25
1    30
2    35
3    40
Name: Age, dtype: int64

Subset of DataFrame:
      Name    Country
0    Alice        USA
1      Bob         UK
2  Charlie     Canada
3    David  Australia

Filtered DataFrame:
      Name  Age    Country
2  Charlie   35     Canada
3    David   40  Australia
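
Conditions can be combined, and loc and iloc select by label and by position. The sketch below rebuilds the sample DataFrame so that it runs on its own; the result names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40],
                   'Country': ['USA', 'UK', 'Canada', 'Australia']})

# Combine conditions with & (and) or | (or); each needs parentheses.
older_non_uk = df[(df['Age'] > 25) & (df['Country'] != 'UK')]
print(older_non_uk)

# loc selects by condition or label, optionally with a column name.
names_30_plus = df.loc[df['Age'] >= 30, 'Name']
print(names_30_plus.tolist())  # ['Bob', 'Charlie', 'David']

# iloc selects by integer position: here, the first row.
first_row = df.iloc[0]
print(first_row['Name'])  # Alice
```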

4. Data Cleaning

# Handling missing values
df = pd.DataFrame({'A': [1, 2, None], 'B': [None, 4, 5]})
print("Original DataFrame:\n", df)

# Filling missing values
df_filled = df.fillna(0)
print("DataFrame with filled values:\n", df_filled)

# Dropping missing values
df_dropped = df.dropna()
print("DataFrame with dropped rows:\n", df_dropped)

Output:

Original DataFrame:
     A    B
0  1.0  NaN
1  2.0  4.0
2  NaN  5.0

DataFrame with filled values:
     A    B
0  1.0  0.0
1  2.0  4.0
2  0.0  5.0

DataFrame with dropped rows:
     A    B
1  2.0  4.0
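
Filling with 0 is not always appropriate. As one alternative, the sketch below counts the missing values, fills each column with its own mean, and drops rows only when a chosen column is missing. The variable names are introduced for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, None], 'B': [None, 4, 5]})

# Count missing values per column.
missing_counts = df.isna().sum()
print(missing_counts)

# Fill each column with that column's mean (A: 1.5, B: 4.5).
df_mean_filled = df.fillna(df.mean())
print(df_mean_filled)

# Drop rows only where column 'A' is missing.
df_a_present = df.dropna(subset=['A'])
print(df_a_present)
```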

5. Data Aggregation and Grouping

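# Section 4 reassigned df, so rebuild the columns needed here
df = pd.DataFrame({'Age': [25, 30, 35, 40], 'Country': ['USA', 'UK', 'Canada', 'Australia']})

# Grouping data by a column and calculating aggregate statistics
grouped = df.groupby('Country').agg({'Age': 'mean'})
print("Grouped DataFrame:\n", grouped)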

Output:

Grouped DataFrame:
            Age
Country
Australia  40.0
Canada     35.0
UK         30.0
USA        25.0
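
Grouping becomes more informative when a group contains several rows. The sketch below uses a hypothetical DataFrame with repeated countries and computes several statistics at once.

```python
import pandas as pd

# Hypothetical data in which each country appears twice.
df = pd.DataFrame({'Country': ['USA', 'UK', 'USA', 'UK'],
                   'Age': [25, 30, 35, 40]})

# Passing a list to agg computes one column per statistic.
summary = df.groupby('Country')['Age'].agg(['mean', 'min', 'max'])
print(summary)
```

Here the USA group (ages 25 and 35) has a mean of 30.0, and the UK group (ages 30 and 40) has a mean of 35.0.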
