Data Science & AI
Machine Learning - Practical Implementation
Lecture No. 01 - By Krish Naik Sir
3.0-Model Training.ipynb
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
df=pd.read_csv('Algerian_forest_fires_dataset_UPDATE (8).csv')
df.head()
day month year Temperature RH Ws Rain FFMC DMC DC ISI BUI FWI Classes
0 1 6 2012 29 57 18 0 65.7 3.4 7.6 1.3 3.4 0.5 not fire
1 2 6 2012 29 61 13 1.3 64.4 4.1 7.6 1 3.9 0.4 not fire
2 3 6 2012 26 82 22 13.1 47.1 2.5 7.1 0.3 2.7 0.1 not fire
3 4 6 2012 25 89 13 2.5 28.6 1.3 6.9 0 1.7 0 not fire
4 5 6 2012 27 77 16 0 64.8 3 14.2 1.2 3.9 0.5 not fire
df.columns
Index(['day', 'month', 'year', 'Temperature', ' RH', ' Ws', 'Rain ', 'FFMC',
'DMC', 'DC', 'ISI', 'BUI', 'FWI', 'Classes '],
dtype='object')
## drop month, day and year
df.drop(['day','month','year'],axis=1,inplace=True)
df.head()
Temperature RH Ws Rain FFMC DMC DC ISI BUI FWI Classes
0 29 57 18 0 65.7 3.4 7.6 1.3 3.4 0.5 not fire
1 29 61 13 1.3 64.4 4.1 7.6 1 3.9 0.4 not fire
2 26 82 22 13.1 47.1 2.5 7.1 0.3 2.7 0.1 not fire
3 25 89 13 2.5 28.6 1.3 6.9 0 1.7 0 not fire
4 27 77 16 0 64.8 3 14.2 1.2 3.9 0.5 not fire
## Encoding: convert the class labels to 0 (not fire) / 1 (fire)
df['Classes ']=np.where(df['Classes '].str.contains("not fire"),0,1)
df['Classes '].value_counts()
1 138
0 109
Name: Classes , dtype: int64
## rename the column to drop the trailing space in 'Classes '
df['Classes']=df['Classes ']
df.drop(['Classes '],axis=1,inplace=True)
df['Classes'].value_counts()
1 138
0 109
Name: Classes, dtype: int64
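As a side note, the raw column names and labels carry stray whitespace ('Classes ', ' RH', 'Rain '), which is why the rename step above is needed. A minimal alternative sketch (not part of the notebook) that strips the whitespace immediately after loading:

import pandas as pd
import numpy as np

## sketch: strip stray whitespace from column names and class labels right after
## loading, so no rename step is needed later
df = pd.read_csv('Algerian_forest_fires_dataset_UPDATE (8).csv')
df.columns = df.columns.str.strip()            # 'Classes ' -> 'Classes', ' RH' -> 'RH'
df['Classes'] = np.where(df['Classes'].str.strip().eq('not fire'), 0, 1)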
df.tail()
Temperature RH Ws Rain FFMC DMC DC ISI BUI FWI Classes
242 30 65 14 0 85.4 16 44.5 4.5 16.9 6.5 1
243 28 87 15 4.4 41.1 6.5 8 0.1 6.2 0 0
244 27 87 29 0.5 45.9 3.5 7.9 0.4 3.4 0.2 0
245 24 54 18 0.1 79.7 4.3 15.2 1.7 5.1 0.7 0
246 24 64 15 0.2 67.3 3.8 16.5 1.2 4.8 0.5 0
df['Classes'].value_counts()
1 137
0 106
Name: Classes, dtype: int64
## Independent And dependent features
X=df.drop('FWI',axis=1) #independent
y=df['FWI'] #dependent
X.head()
Temperature RH Ws Rain FFMC DMC DC ISI BUI Classes
0 29 57 18 0 65.7 3.4 7.6 1.3 3.4 0
1 29 61 13 1.3 64.4 4.1 7.6 1 3.9 0
2 26 82 22 13.1 47.1 2.5 7.1 0.3 2.7 0
3 25 89 13 2.5 28.6 1.3 6.9 0 1.7 0
4 27 77 16 0 64.8 3 14.2 1.2 3.9 0
y
0 0.5
1 0.4
2 0.1
3 0
4 0.5
...
242 6.5
243 0
244 0.2
245 0.7
246 0.5
Name: FWI, Length: 247, dtype: object
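The target comes out as dtype object, a sign that the raw CSV still contains non-numeric entries (presumably the duplicated header/region rows). A small sketch, not in the notebook, for locating those entries before modelling:

## sketch: coerce the object-dtype target to numeric and inspect whatever fails
y_numeric = pd.to_numeric(y, errors='coerce')   # non-numeric entries become NaN
print(y[y_numeric.isna()])                      # shows the stray string values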
#Train Test Split
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.25,random_state=42)
X_train.shape,X_test.shape
((185, 10), (62, 10))
X_train.head()
Temperature RH Ws Rain FFMC DMC DC ISI BUI Classes
101 33 73 12 1.8 59.9 2.2 8.9 0.7 2.7 0
197 39 21 17 0.4 93 18.4 41.5 15.5 18.4 1
126 30 73 13 4 55.7 2.7 7.8 0.6 2.9 0
69 35 59 17 0 87.4 14.8 57 6.9 17.9 1
200 35 46 13 0.3 83.9 16.9 54.2 3.5 19 1
## the raw CSV contains stray non-numeric rows; check whether the duplicated
## header row landed in the training split (it did not: the result is empty)
X_train[X_train['Temperature']== 'Temperature']
Temperature RH Ws Rain FFMC DMC DC ISI BUI Classes
## drop the remaining bad row by its index
X_train.drop(index=124,inplace=True)
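The drop above removes the bad row by its hard-coded index (124). A more general alternative, sketched here under the assumption that X_train and y_train still share the original CSV index, drops every row whose Temperature entry is not numeric:

## sketch (alternative to the hard-coded drop): remove any row whose 'Temperature'
## value cannot be parsed as a number, and drop the same rows from the target
bad_idx = X_train[pd.to_numeric(X_train['Temperature'], errors='coerce').isna()].index
X_train = X_train.drop(index=bad_idx)
y_train = y_train.drop(index=bad_idx)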
## only the 'Classes' column appears in the correlation matrix below: every other
## column is still object (string) dtype because of the bad rows, so corr()
## silently drops it. The columns therefore need to be cast to numeric types first.
X_train.corr()
         Classes
Classes      1.0
X_train[X_train.Temperature != 'Temperature'].corr()
         Classes
Classes      1.0
X_train.columns
Index(['Temperature', ' RH', ' Ws', 'Rain ', 'FFMC', 'DMC', 'DC', 'ISI', 'BUI',
'Classes'],
dtype='object')
X_train['Temperature']=X_train['Temperature'].dropna().astype(int)
X_train[' RH']=X_train[' RH'].dropna().astype(int)
X_train[' Ws']=X_train[' Ws'].dropna().astype(int)
X_train['Rain ']=X_train['Rain '].dropna().astype(float)
X_train['FFMC']=X_train['FFMC'].dropna().astype(float)
X_train['DMC']=X_train['DMC'].dropna().astype(float)
#X_train['DC']=X_train['DC'].dropna().astype(float)  ## fails: DC contains the mis-typed entry '14.6 9', fixed below
X_train['ISI']=X_train['ISI'].dropna().astype(float)
X_train['BUI']=X_train['BUI'].dropna().astype(float)
X_train['DC']=X_train['DC'].dropna().replace('14.6 9','14.69').astype(float)
X_train.corr()
Temperature RH Ws Rain FFMC DMC DC ISI BUI Classes
Temperature 1.000000 -0.689393 -0.321891 -0.359438 0.707745 0.490281 0.376328 0.598660 0.463008 0.515195
RH -0.689393 1.000000 0.166559 0.244101 -0.660022 -0.410668 -0.219077 -0.732962 -0.352303 -0.438307
Ws -0.321891 0.166559 1.000000 0.229595 -0.141418 0.015022 0.081155 0.029341 0.039326 -0.030138
Rain -0.359438 0.244101 0.229595 1.000000 -0.557421 -0.286336 -0.294696 -0.337800 -0.295782 -0.365927
FFMC 0.707745 -0.660022 -0.141418 -0.557421 1.000000 0.614965 0.510088 0.740773 0.597772 0.773751
DMC 0.490281 -0.410668 0.015022 -0.286336 0.614965 1.000000 0.871724 0.676476 0.983552 0.599769
DC 0.376328 -0.219077 0.081155 -0.294696 0.510088 0.871724 1.000000 0.475461 0.943763 0.517169
ISI 0.598660 -0.732962 0.029341 -0.337800 0.740773 0.676476 0.475461 1.000000 0.623201 0.703945
BUI 0.463008 -0.352303 0.039326 -0.295782 0.597772 0.983552 0.943763 0.623201 1.000000 0.591169
Classes 0.515195 -0.438307 -0.030138 -0.365927 0.773751 0.599769 0.517169 0.703945 0.591169 1.000000
Feature Selection
## Check for multicollinearity
plt.figure(figsize=(12,10))
corr=X_train.corr()
sns.heatmap(corr,annot=True)
[Output: heatmap of the X_train correlation matrix]
def correlation(dataset, threshold):
    col_corr = set()
    corr_matrix = dataset.corr()
    for i in range(len(corr_matrix.columns)):
        for j in range(i):
            if abs(corr_matrix.iloc[i, j]) > threshold:
                colname = corr_matrix.columns[i]
                col_corr.add(colname)
    return col_corr
## the threshold (0.85 here) is chosen using domain expertise
corr_features=correlation(X_train,0.85)
corr_features
{'BUI', 'DC'}
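The helper above only reports the second member of each highly correlated pair, so {'BUI', 'DC'} says what to drop but not what each feature is correlated with. A small hypothetical variant that returns the pairs themselves (values taken from the correlation matrix above):

def correlated_pairs(dataset, threshold):
    ## return (feature_i, feature_j, correlation) for every pair above the threshold
    corr_matrix = dataset.corr()
    pairs = []
    for i in range(len(corr_matrix.columns)):
        for j in range(i):
            if abs(corr_matrix.iloc[i, j]) > threshold:
                pairs.append((corr_matrix.columns[i], corr_matrix.columns[j],
                              round(corr_matrix.iloc[i, j], 3)))
    return pairs

correlated_pairs(X_train, 0.85)
## e.g. [('DC', 'DMC', 0.872), ('BUI', 'DMC', 0.984), ('BUI', 'DC', 0.944)]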
## drop features when correlation is more than 0.85
X_train.drop(corr_features,axis=1,inplace=True)
X_test.drop(corr_features,axis=1,inplace=True)
X_train.shape,X_test.shape
((184, 8), (62, 8))
Feature Scaling or Standardization
X_train.dropna(axis=0).isnull().sum()
Temperature 0
RH 0
Ws 0
Rain 0
FFMC 0
DMC 0
ISI 0
Classes 0
dtype: int64
from sklearn.preprocessing import StandardScaler
scaler=StandardScaler()
X_train_scaled=scaler.fit_transform(X_train.dropna(axis=0))
X_test_scaled=scaler.transform(X_test)
X_train_scaled
array([[ 0.18091033, 0.71998514, -1.34871966, ..., -0.9988569 ,
-0.95996409, -1.14183951],
[ 1.78704105, -2.7887375 , 0.57500274, ..., 0.27897956,
2.42143943, 0.87577982],
[-0.62215503, 0.71998514, -0.96397518, ..., -0.95941751,
-0.98281141, -1.14183951],
...,
[-1.9605973 , 0.92241145, 0.57500274, ..., -1.06984782,
-1.07420069, -1.14183951],
[ 1.78704105, 0.11270622, -2.5029531 , ..., -0.24950836,
-0.8685748 , -1.14183951],
[-0.62215503, 0.98988688, 2.11398066, ..., -1.02252054,
-0.8685748 , -1.14183951]])
Box Plots to Understand the Effect of StandardScaler
plt.subplots(figsize=(15, 5))
plt.subplot(1, 2, 1)
sns.boxplot(data=X_train)
plt.title('X_train Before Scaling')
plt.subplot(1, 2, 2)
sns.boxplot(data=X_train_scaled)
plt.title('X_train After Scaling')
[Output: side-by-side box plots of X_train before and after scaling]
pd.DataFrame(X_train_scaled).isnull().sum()
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
dtype: int64
pd.DataFrame(X_test_scaled).isnull().sum()
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
dtype: int64
## replace the stray string values left in the target by the malformed raw rows
## ('FWI' from the duplicated header row, 'fire '), then cast to float
y_train=y_train.replace('FWI','0').replace('fire ','0').astype(float)
y_train.dropna(inplace=True)
y_train.shape
(183,)
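In the model cells below, y_train[1:] is passed to fit() to make the target line up with the rows that were dropped from X_train; that only works because the remaining row counts happen to match. A safer sketch (not the notebook's approach) keeps features and target aligned by their shared index:

## sketch: align features and target by index instead of positional slicing
X_train_clean = X_train.dropna(axis=0)
common_idx = X_train_clean.index.intersection(y_train.index)
X_aligned = X_train_clean.loc[common_idx]
y_aligned = y_train.loc[common_idx]
X_aligned_scaled = scaler.fit_transform(X_aligned)   # rows now match y_aligned exactly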
Linear Regression Model
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import r2_score
linreg=LinearRegression()
linreg.fit(X_train_scaled,y_train[1:])
y_pred=linreg.predict(X_test_scaled)
mae=mean_absolute_error(y_test,y_pred)
score=r2_score(y_test,y_pred)
print("Mean absolute error", mae)
print("R2 Score", score)
Mean absolute error 1.1001680700952507
R2 Score 0.9375294317383766
Lasso Regression
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import r2_score
lasso=Lasso()
lasso.fit(X_train_scaled,y_train[1:])
y_pred=lasso.predict(X_test_scaled)
mae=mean_absolute_error(y_test,y_pred)
score=r2_score(y_test,y_pred)
print("Mean absolute error", mae)
print("R2 Score", score)
Mean absolute error 1.6622251402428814
R2 Score 0.891008091956091
Ridge Regression Model
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import r2_score
ridge=Ridge()
ridge.fit(X_train_scaled,y_train[1:])
y_pred=ridge.predict(X_test_scaled)
mae=mean_absolute_error(y_test,y_pred)
score=r2_score(y_test,y_pred)
print("Mean absolute error", mae)
print("R2 Score", score)
Mean absolute error 1.1010123032721502
R2 Score 0.9372669856655736
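Ridge() above uses the default alpha=1.0. A sketch, not in the notebook, of letting cross-validation pick the regularization strength with RidgeCV (the candidate alphas are arbitrary choices):

from sklearn.linear_model import RidgeCV

ridgecv = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0, 100.0], cv=5)
ridgecv.fit(X_train_scaled, y_train[1:])
print("Best alpha:", ridgecv.alpha_)
print("Test R2:", r2_score(y_test, ridgecv.predict(X_test_scaled)))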
ElasticNet Regression
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import r2_score
elastic=ElasticNet()
elastic.fit(X_train_scaled,y_train[1:])
y_pred=elastic.predict(X_test_scaled)
mae=mean_absolute_error(y_test,y_pred)
score=r2_score(y_test,y_pred)
print("Mean absolute error", mae)
print("R2 Score", score)
Mean absolute error 1.9749711385449351
R2 Score 0.8436374971301746
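The four models above are compared on a single 25% hold-out split. A short sketch (not in the notebook) of checking whether the ranking holds up under 5-fold cross-validation on the scaled training data:

from sklearn.model_selection import cross_val_score

models = {'Linear': LinearRegression(), 'Lasso': Lasso(),
          'Ridge': Ridge(), 'ElasticNet': ElasticNet()}
for name, model in models.items():
    scores = cross_val_score(model, X_train_scaled, y_train[1:], cv=5, scoring='r2')
    print(f"{name:10s} mean R2 = {scores.mean():.3f} (std {scores.std():.3f})")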
import pickle
pickle.dump(scaler,open('scaler.pkl','wb'))
pickle.dump(ridge,open('ridge.pkl','wb'))
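The two pickle files are typically reloaded in a separate prediction script. A sketch of how that would look; the feature values below are made up for illustration, and the column names and order must match the eight training features:

import pickle
import pandas as pd

scaler = pickle.load(open('scaler.pkl', 'rb'))
ridge = pickle.load(open('ridge.pkl', 'rb'))

## hypothetical new observation; columns must match X_train exactly
new_obs = pd.DataFrame([[30, 60, 15, 0.0, 80.0, 10.0, 5.0, 1]],
                       columns=['Temperature', ' RH', ' Ws', 'Rain ',
                                'FFMC', 'DMC', 'ISI', 'Classes'])
print("Predicted FWI:", ridge.predict(scaler.transform(new_obs))[0])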
5.0-Logistic Regression.ipynb
Logistic Regression Implementation
from sklearn.datasets import load_iris
dataset=load_iris()
print(dataset.DESCR)
**Data Set Characteristics:**
:Number of Instances: 150 (50 in each of three classes)
:Number of Attributes: 4 numeric, predictive attributes and the class
:Attribute Information:
- sepal length in cm
- sepal width in cm
- petal length in cm
- petal width in cm
- class:
- Iris-Setosa
- Iris-Versicolour
- Iris-Virginica
:Summary Statistics:
============== ==== ==== ======= ===== ====================
Min Max Mean SD Class Correlation
============== ==== ==== ======= ===== ====================
sepal length: 4.3 7.9 5.84 0.83 0.7826
sepal width: 2.0 4.4 3.05 0.43 -0.4194
petal length: 1.0 6.9 3.76 1.76 0.9490 (high!)
petal width: 0.1 2.5 1.20 0.76 0.9565 (high!)
============== ==== ==== ======= ===== ====================
:Missing Attribute Values: None
:Class Distribution: 33.3% for each of 3 classes.
:Creator: R.A. Fisher
:Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
:Date: July, 1988
The famous Iris database, first used by Sir R.A. Fisher. The dataset is taken
from Fisher's paper. Note that it's the same as in R, but not as in the UCI
Machine Learning Repository, which has two wrong data points.
This is perhaps the best known database to be found in the
pattern recognition literature. Fisher's paper is a classic in the field and
is referenced frequently to this day. (See Duda & Hart, for example.) The
data set contains 3 classes of 50 instances each, where each class refers to a
type of iris plant. One class is linearly separable from the other 2; the
latter are NOT linearly separable from each other.
.. topic:: References
- Fisher, R.A. "The use of multiple measurements in taxonomic problems"
Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
Mathematical Statistics" (John Wiley, NY, 1950).
- Duda, R.O., & Hart, P.E. (1973) Pattern Classification and Scene Analysis.
(Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
- Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
Structure and Classification Rule for Recognition in Partially Exposed
Environments". IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. PAMI-2, No. 1, 67-71.
- Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule". IEEE Transactions
on Information Theory, May 1972, 431-433.
- See also: 1988 MLC Proceedings, 54-64. Cheeseman et al"s AUTOCLASS II
conceptual clustering system finds 3 classes in the data.
- Many, many more ...
dataset.keys()
dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])
import pandas as pd
import numpy as np
df=pd.DataFrame(dataset.data,columns=dataset.feature_names)
df.head()
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
df['target']=dataset.target
df.head()
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target
0 5.1 3.5 1.4 0.2 0
1 4.9 3.0 1.4 0.2 0
2 4.7 3.2 1.3 0.2 0
3 4.6 3.1 1.5 0.2 0
4 5.0 3.6 1.4 0.2 0
dataset.target
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
## Binary Classification
df_copy=df[df['target']!=2]
df_copy.head()
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target
0 5.1 3.5 1.4 0.2 0
1 4.9 3.0 1.4 0.2 0
2 4.7 3.2 1.3 0.2 0
3 4.6 3.1 1.5 0.2 0
4 5.0 3.6 1.4 0.2 0
Independent and Dependent Features
X=df_copy.iloc[:,:-1]
y=df_copy.iloc[:,-1]
from sklearn.linear_model import LogisticRegression
#train test split
from sklearn.model_selection import train_test_split
# test_size=20 (an integer) holds out exactly 20 samples; use 0.2 for a 20% split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=20, random_state=42)
classifier=LogisticRegression()
classifier.fit(X_train,y_train)
LogisticRegression()
classifier.predict_proba(X_test)
array([[0.00118085, 0.99881915],
[0.01580857, 0.98419143],
[0.00303433, 0.99696567],
[0.96964813, 0.03035187],
[0.94251523, 0.05748477],
[0.97160984, 0.02839016],
[0.99355615, 0.00644385],
[0.03169836, 0.96830164],
[0.97459743, 0.02540257],
[0.97892756, 0.02107244],
[0.95512297, 0.04487703],
[0.9607199 , 0.0392801 ],
[0.00429472, 0.99570528],
[0.9858324 , 0.0141676 ],
[0.00924893, 0.99075107],
[0.98144334, 0.01855666],
[0.00208036, 0.99791964],
[0.00125422, 0.99874578],
[0.97463766, 0.02536234],
[0.96123726, 0.03876274]])
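The second column of predict_proba is the estimated probability of class 1 (versicolor in this binary subset). For binary logistic regression, predict() is just this probability thresholded at 0.5, which the following sketch confirms:

proba_class1 = classifier.predict_proba(X_test)[:, 1]
manual_pred = (proba_class1 >= 0.5).astype(int)
print((manual_pred == classifier.predict(X_test)).all())   # expected: True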
## Prediction
y_pred=classifier.predict(X_test)
y_pred
array([1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0])
y_test
83 1
53 1
70 1
45 0
44 0
39 0
22 0
80 1
10 0
0 0
18 0
30 0
73 1
33 0
90 1
4 0
76 1
77 1
12 0
31 0
Name: target, dtype: int64
Confusion Matrix, Accuracy Score, Classification Report
from sklearn.metrics import confusion_matrix,accuracy_score,classification_report
# note: the conventional argument order is (y_true, y_pred)
print(confusion_matrix(y_test,y_pred))
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))
[[12 0]
[ 0 8]]
1.0
precision recall f1-score support
0 1.00 1.00 1.00 12
1 1.00 1.00 1.00 8
accuracy 1.00 20
macro avg 1.00 1.00 1.00 20
weighted avg 1.00 1.00 1.00 20
Hyperparameter Tuning
## Gridsearchcv
from sklearn.model_selection import GridSearchCV
import warnings
warnings.filterwarnings('ignore')
parameters={'penalty':('l1','l2','elasticnet',None),'C':[1,10,20]}
clf=GridSearchCV(classifier,param_grid=parameters,cv=5)
## GridSearchCV internally splits the training data into train and validation folds (cv=5)
clf.fit(X_train,y_train)
clf.best_params_
{'C': 1, 'penalty': 'l2'}
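Note that the grid above mixes penalties ('l1', 'elasticnet') that the default lbfgs solver does not support, which is why warnings were silenced earlier; those combinations simply fail during fitting and get a NaN score. A sketch, not from the notebook, of a grid that pairs each penalty with a compatible solver ('saga' supports all of them) and supplies l1_ratio for elasticnet:

parameters = [
    {'penalty': ['l1', 'l2'], 'C': [1, 10, 20], 'solver': ['saga']},
    {'penalty': ['elasticnet'], 'C': [1, 10, 20], 'l1_ratio': [0.5], 'solver': ['saga']},
    {'penalty': [None], 'solver': ['saga']},
]
clf = GridSearchCV(LogisticRegression(max_iter=5000), param_grid=parameters, cv=5)
clf.fit(X_train, y_train)
print(clf.best_params_)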
classifier=LogisticRegression(C=1,penalty='l2')
classifier.fit(X_train,y_train)
LogisticRegression(C=1)
## Prediction
y_pred=classifier.predict(X_test)
print(confusion_matrix(y_test,y_pred))
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))
[[12 0]
[ 0 8]]
1.0
precision recall f1-score support
0 1.00 1.00 1.00 12
1 1.00 1.00 1.00 8
accuracy 1.00 20
macro avg 1.00 1.00 1.00 20
weighted avg 1.00 1.00 1.00 20
## Randomized SearchCV
from sklearn.model_selection import RandomizedSearchCV
random_clf=RandomizedSearchCV(LogisticRegression(),param_distributions=parameters,cv=5)
random_clf.fit(X_train,y_train)
random_clf.best_params_
{'penalty': None, 'C': 10}
## Logistic Regression: create the model with the best parameters and check accuracy
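The code for this final cell is missing from the export. A sketch that follows the same pattern as the GridSearchCV section, using the best parameters reported by RandomizedSearchCV:

classifier = LogisticRegression(penalty=None, C=10)   # C has no effect when penalty=None
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))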
THANK YOU