Exp3 2

The document outlines various techniques for handling missing data in Python, particularly using the pandas library. It covers methods such as filling NaN values, forward and backward filling, filling with index values, and interpolation of missing values. Code examples are provided to illustrate each technique in practice.

Uploaded by

damisettilohitha

UNIT-3

EXPERIMENTS – 3(B)

2. Apply different Missing Data handling techniques


a) NaN values in mathematical Operations
b) Filling in missing data
c) Forward and Backward filling of missing values
d) Filling with index values
e) Interpolation of missing values

a) NaN values in mathematical Operations

# Import the math library
import math

# Print the value of NaN
print(math.nan)

OUTPUT:
nan
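
Printing `math.nan` only shows the value itself. The sketch below is an illustration that is not part of the original experiment, with variable names chosen for the example. It shows how NaN behaves in arithmetic and comparisons, and how pandas reductions treat it by default.

```python
import math

import numpy as np
import pandas as pd

x = math.nan

# Any arithmetic that involves NaN yields NaN.
total = x + 1
product = x * 0

# NaN is not equal to anything, including itself,
# so math.isnan() is the reliable way to detect it.
is_equal_to_itself = (x == x)
detected = math.isnan(total)

# pandas reductions skip NaN by default (skipna=True).
s = pd.Series([1.0, np.nan, 3.0])
sum_skipping_nan = s.sum()            # NaN is ignored
sum_with_nan = s.sum(skipna=False)    # NaN propagates into the result

print(total, product)
print(is_equal_to_itself, detected)
print(sum_skipping_nan, sum_with_nan)
```

Because NaN propagates silently through calculations, it is usually worth detecting or filling missing values before computing on them.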

b) Filling in missing data


# Importing pandas and numpy
import pandas as pd
import numpy as np

# Sample DataFrame with missing values
data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, 45, 56, np.nan],
        'Third Score': [np.nan, 40, 80, 98]}
df = pd.DataFrame(data)

# Checking for missing values using isnull()
missing_values = df.isnull()
print(missing_values)
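
The code above only detects the missing values. A minimal sketch of actually filling them with `DataFrame.fillna()` could look like the following. The fill value 0 and the use of column means are illustrative choices, not something prescribed by the experiment.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'First Score': [100, 90, np.nan, 95],
    'Second Score': [30, 45, 56, np.nan],
    'Third Score': [np.nan, 40, 80, 98],
})

# Count the missing values in each column.
missing_per_column = df.isnull().sum()

# Replace every NaN with a constant (0 is an arbitrary example value).
filled_with_zero = df.fillna(0)

# Replace each NaN with the mean of its own column.
filled_with_mean = df.fillna(df.mean())

print(missing_per_column)
print(filled_with_zero)
print(filled_with_mean)
```

Both calls return a new DataFrame and leave `df` unchanged, so the result has to be assigned to a variable to be kept.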

c) Forward and Backward filling of missing values


• Forward Fill (ffill): each NaN value is replaced by the last valid
(non-null) value before it.
• Backward Fill (bfill): each NaN value is replaced by the next valid
(non-null) value after it.

import pandas as pd

# Example DataFrame with missing values (NaN)
data = {
    'Date': ['2025-02-20', '2025-02-21', '2025-02-22', '2025-02-23', '2025-02-24'],
    'Value': [10, None, None, 20, None]
}

# Create the DataFrame
df = pd.DataFrame(data)

# Show original data
print("Original DataFrame:")
print(df)

# Forward fill missing values (NaN).
# ffill() replaces fillna(method='ffill'), which is deprecated in recent pandas.
df_forward = df.ffill()

# Backward fill missing values (NaN).
# bfill() replaces fillna(method='bfill'), which is deprecated in recent pandas.
df_backward = df.bfill()

# Display the results
print("\nDataFrame after Forward Fill:")
print(df_forward)
print("\nDataFrame after Backward Fill:")
print(df_backward)

OUTPUT:
Original DataFrame:
         Date  Value
0  2025-02-20   10.0
1  2025-02-21    NaN
2  2025-02-22    NaN
3  2025-02-23   20.0
4  2025-02-24    NaN

DataFrame after Forward Fill:
         Date  Value
0  2025-02-20   10.0
1  2025-02-21   10.0
2  2025-02-22   10.0
3  2025-02-23   20.0
4  2025-02-24   20.0

DataFrame after Backward Fill:
         Date  Value
0  2025-02-20   10.0
1  2025-02-21   20.0
2  2025-02-22   20.0
3  2025-02-23   20.0
4  2025-02-24    NaN
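
In the output above, backward fill leaves the last row as NaN because no later valid value exists. The sketch below uses a hypothetical Series (not from the original experiment) to show two common variations: capping how many consecutive NaN values are filled with `limit`, and chaining `ffill()` and `bfill()` so that gaps at either end are covered.

```python
import pandas as pd

# Hypothetical data with a leading NaN, a two-value gap and a trailing NaN.
values = pd.Series([None, 10, None, None, 20, None], dtype='float64')

# Forward fill at most one consecutive NaN after each valid value.
limited = values.ffill(limit=1)

# Forward fill first, then backward fill to cover the leading NaN
# that forward fill cannot reach.
combined = values.ffill().bfill()

print(limited)
print(combined)
```

With `limit=1` the second NaN in the two-value gap stays missing, which can be useful when old values should not be carried too far.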
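d) Filling with index values

# Importing pandas as pd
import pandas as pd

# Creating the Index; None is stored as a missing value (NaN)
idx = pd.Index([1, 2, 3, 4, 5, None, 7, 8, 9, None])

# Print the Index
print(idx)

# Fill the missing values (0 is only an example fill value)
print(idx.fillna(0))

Use the Index.fillna() function to fill all the missing strings in the Index.

# Importing pandas as pd
import pandas as pd

# Creating the Index
idx = pd.Index(['Labrador', 'Beagle', None, 'Labrador',
                'Lhasa', 'Husky', 'Beagle', None, 'Koala'])

# Print the Index
print(idx)

# Fill the missing strings ('Missing' is only an example fill value)
print(idx.fillna('Missing'))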

e) Interpolation of missing values

The pandas interpolate() method fills NaN values in a DataFrame or Series using
an interpolation technique (linear by default), estimating each missing value
from its neighbours rather than hard-coding a replacement.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]})

# interpolate() returns a new DataFrame, so keep the result instead of discarding it
df_interpolated = df.interpolate()
print(df_interpolated)
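
As a rough illustration of what `interpolate()` computes, the sketch below reuses columns A and B from the example and compares the default linear interpolation with a call that uses `limit=1`. The variable names are chosen for this example only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
})

# Default linear interpolation: missing values are placed evenly
# between the neighbouring valid values.
linear = df.interpolate()

# limit=1 fills at most one consecutive NaN in each gap (working forward).
limited = df.interpolate(limit=1)

print(linear)
print(limited)
```

With linear interpolation, column A becomes 1, 2, 3, 4 and column B becomes 5, 6, 7, 8. With `limit=1`, the second NaN in column B is left unfilled.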
