ex-5-nn-wheat-seed-data
April 16, 2024
[1]: # importing numpy, pandas libraries
import pandas as pd
import numpy as np
# loading wheat seeds data into a dataframe
seeds_data = pd.read_csv('seeds.csv')
# displaying the first 5 rows of wheat seeds data
seeds_data.head()
[1]: Area Perimeter Compactness Kernel.Length Kernel.Width \
0 15.26 14.84 0.8710 5.763 3.312
1 14.88 14.57 0.8811 5.554 3.333
2 14.29 14.09 0.9050 5.291 3.337
3 13.84 13.94 0.8955 5.324 3.379
4 16.14 14.99 0.9034 5.658 3.562
Asymmetry.Coeff Kernel.Groove Type
0 2.221 5.220 1
1 1.018 4.956 1
2 2.699 4.825 1
3 2.259 4.805 1
4 1.355 5.175 1
[2]: # Extracting Independent Variables
X = seeds_data.loc[:, seeds_data.columns != 'Type']
X
[2]: Area Perimeter Compactness Kernel.Length Kernel.Width \
0 15.26 14.84 0.8710 5.763 3.312
1 14.88 14.57 0.8811 5.554 3.333
2 14.29 14.09 0.9050 5.291 3.337
3 13.84 13.94 0.8955 5.324 3.379
4 16.14 14.99 0.9034 5.658 3.562
.. … … … … …
194 12.19 13.20 0.8783 5.137 2.981
195 11.23 12.88 0.8511 5.140 2.795
196 13.20 13.66 0.8883 5.236 3.232
197 11.84 13.21 0.8521 5.175 2.836
198 12.30 13.34 0.8684 5.243 2.974
Asymmetry.Coeff Kernel.Groove
0 2.221 5.220
1 1.018 4.956
2 2.699 4.825
3 2.259 4.805
4 1.355 5.175
.. … …
194 3.631 4.870
195 4.325 5.003
196 8.315 5.056
197 3.598 5.044
198 5.637 5.063
[199 rows x 7 columns]
[3]: # Extracting Target Variable
Y = seeds_data.loc[:, seeds_data.columns == 'Type']
Y
[3]: Type
0 1
1 1
2 1
3 1
4 1
.. …
194 3
195 3
196 3
197 3
198 3
[199 rows x 1 columns]
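The same feature/target separation is often written with `DataFrame.drop` and a column list. The sketch below is only an illustration: `demo` is a tiny hand-built frame holding three columns from the first rows shown above, not the full seeds.csv, and the names `X_demo` and `Y_demo` are hypothetical.

```python
import pandas as pd

# Tiny stand-in for seeds_data: first three rows of three of its columns.
demo = pd.DataFrame({
    "Area": [15.26, 14.88, 14.29],
    "Perimeter": [14.84, 14.57, 14.09],
    "Type": [1, 1, 1],
})

# Features: every column except the target.
X_demo = demo.drop(columns="Type")

# Target: selecting with a list keeps a one-column DataFrame, like Y above;
# demo["Type"] would return a 1-D Series instead.
Y_demo = demo[["Type"]]

print(X_demo.columns.tolist())
print(Y_demo.shape)
```

Because the target stays a one-column DataFrame, it still needs flattening (for example with `np.ravel`) before being passed to an estimator's `fit` method.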
0.1 Split Data for training and testing
[4]: from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X , Y ,
test_size = 0.2,
random_state = 523)
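To see how many rows land in each part of the split, the sketch below repeats the same call on placeholder arrays with the shape of the seeds data (199 rows, 7 features). The arrays are zeros and the names are hypothetical; only the resulting sizes matter.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same shape as X and Y (contents are irrelevant here).
X_demo = np.zeros((199, 7))
y_demo = np.zeros(199)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=523
)

# The test size is 20% of 199 rounded up; the remainder is used for training.
print(X_tr.shape, X_te.shape)
```

With `test_size = 0.2`, 40 rows are held out for testing and 159 are used for training.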
0.2 Training the Perceptron Classifier
[5]: # importing Perceptron Class
from sklearn.linear_model import Perceptron
# Creating an instance of Perceptron Class
perc = Perceptron( random_state = 15)
# Training the perceptron classifier
perc.fit(X_train, np.ravel(y_train))
# importing metrics for evaluating perceptron classifier
from sklearn.metrics import accuracy_score
# Using perceptron classifier to make predictions on test data
pred_test = perc.predict(X_test)
# calculating and displaying accuracy score of Perceptron classifier
accuracy = accuracy_score(y_test, pred_test)
print('% of Accuracy using Linear Perceptron: ', accuracy * 100)
% of Accuracy using Linear Perceptron: 67.5
The correlation between two variables can be positive (they rise together), negative
(one falls as the other rises), or close to zero (no linear relationship).
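The three cases can be illustrated with a few hand-picked values. This is a minimal sketch using the Pearson coefficient that pandas computes by default; the series below are made up for the example and are not taken from the seeds data.

```python
import pandas as pd

x = pd.Series([-2.0, -1.0, 0.0, 1.0, 2.0])

r_pos = x.corr(2 * x + 1)   # y rises with x: coefficient close to +1
r_neg = x.corr(-3 * x)      # y falls as x rises: coefficient close to -1
r_none = x.corr(x ** 2)     # symmetric parabola: no linear trend, close to 0

print(round(r_pos, 2), round(r_neg, 2), round(abs(r_none), 2))
```

The last case shows that a coefficient near zero rules out only a linear relationship; the two variables can still be related in another way.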
[6]: # Importing plotly.express
import plotly.express as px
# Computing pairwise correlations among the independent variables and the target variable
corr = seeds_data.corr()
corr = corr.round(2)
corr
[6]: Area Perimeter Compactness Kernel.Length Kernel.Width \
Area 1.00 0.99 0.61 0.95 0.97
Perimeter 0.99 1.00 0.53 0.97 0.95
Compactness 0.61 0.53 1.00 0.37 0.76
Kernel.Length 0.95 0.97 0.37 1.00 0.86
Kernel.Width 0.97 0.95 0.76 0.86 1.00
Asymmetry.Coeff -0.22 -0.21 -0.33 -0.17 -0.25
Kernel.Groove 0.86 0.89 0.23 0.93 0.75
Type -0.34 -0.32 -0.54 -0.25 -0.42
Asymmetry.Coeff Kernel.Groove Type
Area -0.22 0.86 -0.34
Perimeter -0.21 0.89 -0.32
Compactness -0.33 0.23 -0.54
Kernel.Length -0.17 0.93 -0.25
Kernel.Width -0.25 0.75 -0.42
Asymmetry.Coeff 1.00 -0.00 0.57
Kernel.Groove -0.00 1.00 0.04
Type 0.57 0.04 1.00
[7]: # displaying correlation matrix as a heatmap
fig = px.imshow(corr ,
width = 700,
height = 700 ,
text_auto = True,
color_continuous_scale = 'tealgrn',
)
fig.show()
It can be observed that the attribute “Kernel.Groove” has the weakest correlation with
the target variable (0.04), so it is removed from the independent variables.
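The weakest feature can also be found programmatically instead of being read off the heatmap. The sketch below copies the rounded `Type` column of the correlation matrix above into a Series (the name `corr_with_type` is hypothetical) and ranks the features by absolute value.

```python
import pandas as pd

# Correlation of each feature with Type, copied from the rounded matrix above.
corr_with_type = pd.Series({
    "Area": -0.34,
    "Perimeter": -0.32,
    "Compactness": -0.54,
    "Kernel.Length": -0.25,
    "Kernel.Width": -0.42,
    "Asymmetry.Coeff": 0.57,
    "Kernel.Groove": 0.04,
})

# Rank by absolute value: the sign only gives the direction of the relationship.
weakest = corr_with_type.abs().idxmin()
print(weakest)
```

In the notebook itself, the same result should be obtainable from the existing matrix with `corr['Type'].drop('Type').abs().idxmin()`.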
[8]: # remove Kernel.Groove attribute from X
X = X.loc[:, X.columns != 'Kernel.Groove']
X
[8]: Area Perimeter Compactness Kernel.Length Kernel.Width \
0 15.26 14.84 0.8710 5.763 3.312
1 14.88 14.57 0.8811 5.554 3.333
2 14.29 14.09 0.9050 5.291 3.337
3 13.84 13.94 0.8955 5.324 3.379
4 16.14 14.99 0.9034 5.658 3.562
.. … … … … …
194 12.19 13.20 0.8783 5.137 2.981
195 11.23 12.88 0.8511 5.140 2.795
196 13.20 13.66 0.8883 5.236 3.232
197 11.84 13.21 0.8521 5.175 2.836
198 12.30 13.34 0.8684 5.243 2.974
Asymmetry.Coeff
0 2.221
1 1.018
2 2.699
3 2.259
4 1.355
.. …
194 3.631
195 4.325
196 8.315
197 3.598
198 5.637
[199 rows x 6 columns]
Resplitting Data for training and testing
[9]: X_train, X_test, y_train, y_test = train_test_split(X , Y ,
test_size = 0.2,
random_state = 523)
Retraining the Perceptron Classifier
[10]: # retraining the perceptron classifier
perc.fit(X_train, np.ravel(y_train))
# Using perceptron classifier to make predictions on test data
pred_test = perc.predict(X_test)
# calculating and displaying accuracy score of Perceptron classifier
accuracy = accuracy_score(y_test, pred_test)
print('% of Accuracy using Linear Perceptron: ', accuracy * 100)
% of Accuracy using Linear Perceptron: 75.0
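The accuracy score is the fraction of test rows predicted correctly. Assuming the test split holds 40 rows (20% of 199, rounded up), a score of 75.0 corresponds to 30 correct predictions and the earlier 67.5 to 27. The sketch below checks this arithmetic on hypothetical labels, not on the actual predictions.

```python
import numpy as np
from sklearn.metrics import accuracy_score

n_test = 40  # assumed test-set size: 20% of 199 rows, rounded up

# Hypothetical labels: the first 30 predictions match, the last 10 do not.
y_true = np.ones(n_test, dtype=int)
y_pred = np.concatenate([np.ones(30, dtype=int), np.full(10, 2)])

acc = accuracy_score(y_true, y_pred)
print('% of Accuracy: ', acc * 100)
```

On a test set this small, each additional correct prediction changes the score by 2.5 percentage points, so the gain from 67.5 to 75.0 amounts to three rows.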
0.2.1 Install scikit-neuralnetwork
[1]: # scikit-neuralnetwork works with scikit-learn 0.18 and above
# installing scikit-neuralnetwork if not already installed
!pip install scikit-neuralnetwork
Processing c:\users\gurram\appdata\local\pip\cache\wheels\7d\42\93\b99bd6392fb56
ec7831a695cb7a23dd9c73382b258614b62ed\scikit_neuralnetwork-0.7-py3-none-any.whl
Processing c:\users\gurram\appdata\local\pip\cache\wheels\a3\72\b6\89bbeb6140ee3
756fa2bdd2fb03003dd60d289851314b35fd7\lasagne-0.1-py3-none-any.whl
Processing c:\users\gurram\appdata\local\pip\cache\wheels\26\68\6f\745330367ce78
22fe0cd863712858151f5723a0a5e322cc144\theano-1.0.5-py3-none-any.whl
Requirement already satisfied: colorama in d:\anaconda\lib\site-packages (from
scikit-neuralnetwork) (0.4.3)
Requirement already satisfied: scikit-learn>=0.17 in
c:\users\gurram\appdata\roaming\python\python37\site-packages (from scikit-
neuralnetwork) (1.0.2)
Requirement already satisfied: numpy in d:\anaconda\lib\site-packages (from
Lasagne>=0.1->scikit-neuralnetwork) (1.18.1)
Requirement already satisfied: scipy>=0.14 in
c:\users\gurram\appdata\roaming\python\python37\site-packages (from
Theano>=0.8->scikit-neuralnetwork) (1.7.3)
Requirement already satisfied: six>=1.9.0 in d:\anaconda\lib\site-packages (from
Theano>=0.8->scikit-neuralnetwork) (1.14.0)
Requirement already satisfied: joblib>=0.11 in d:\anaconda\lib\site-packages
(from scikit-learn>=0.17->scikit-neuralnetwork) (0.14.1)
Collecting threadpoolctl>=2.0.0
Downloading threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
Installing collected packages: Lasagne, Theano, scikit-neuralnetwork,
threadpoolctl
Successfully installed Lasagne-0.1 Theano-1.0.5 scikit-neuralnetwork-0.7
threadpoolctl-3.1.0
0.3 Training the Multilayer Perceptron Classifier using Backpropagation algorithm
[12]: # importing required library
import sklearn.neural_network as nn
# Creating an instance of MLPClassifier class
# Taking maximum number of iterations = 1000
# constructing MLP network with 3 hidden layers with
# 100 neurons in hidden layer 1,
# 75 neurons in hidden layer 2,
# 50 neurons in hidden layer 3
mlp = nn.MLPClassifier(random_state = 560,
hidden_layer_sizes = [100, 75, 50],
max_iter = 1000)
[14]: # Training the MLP classifier
mlp.fit(X_train, np.ravel(y_train))
pred_test = mlp.predict(X_test)
mlp_accuracy = accuracy_score(y_test, pred_test)
print('% of Accuracy using MultiLayer Perceptron: ',
      "{0:0.2f}".format(mlp_accuracy*100))
% of Accuracy using MultiLayer Perceptron: 87.50
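To make the architecture concrete, the sketch below fits a network with the same `hidden_layer_sizes` on random placeholder data (6 features, 3 classes, mirroring the reduced seeds data) and prints the shape of each weight matrix. The data, the small `max_iter`, and the names `X_demo`, `y_demo` and `mlp_demo` are assumptions made for the illustration; the accuracy of this toy model is meaningless.

```python
import warnings

import numpy as np
import sklearn.neural_network as nn
from sklearn.exceptions import ConvergenceWarning

# Random placeholder data: 60 rows, 6 features, 3 classes labelled 1, 2, 3.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 6))
y_demo = np.repeat([1, 2, 3], 20)

# Same layer layout as above; few iterations because only the shapes matter.
mlp_demo = nn.MLPClassifier(random_state=560,
                            hidden_layer_sizes=[100, 75, 50],
                            max_iter=20)

with warnings.catch_warnings():
    # The short run is not expected to converge; silence that warning.
    warnings.simplefilter("ignore", ConvergenceWarning)
    mlp_demo.fit(X_demo, y_demo)

# One weight matrix per pair of consecutive layers.
shapes = [w.shape for w in mlp_demo.coefs_]
print(shapes)
```

The four matrices connect the 6 inputs to the 100, 75 and 50 hidden units and finally to 3 output units, one per wheat type.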