Experiment 3: Multi-Layer Perceptron (MLP)
Objectives:
1- Understand the Back-Propagation (BP) Learning algorithm.
2- Implement the MLP with BP using Python.
3- Run real-world applications in Python.
Introduction:
• A multilayer perceptron is a feedforward neural network with one or more
hidden layers.
• Typically, the network consists of an input layer of source neurons, at least
one middle or hidden layer of computational neurons, and an output layer
of computational neurons.
• The input signals are propagated in a forward direction on a layer-by-layer
basis. A multilayer perceptron with two hidden layers is shown in Figure 1.
Figure 1: Multilayer perceptron with two hidden layers
• Typically, a back-propagation network is a multilayer network that has
three or four layers. The layers are fully connected, that is, every neuron in
each layer is connected to every other neuron in the adjacent forward
layer.
• A neuron determines its output in a manner similar to Rosenblatt’s
perceptron. First, it computes the net weighted input as before:
\[ X = \sum_{i=1}^{n} x_i w_i - \theta \]
where n is the number of inputs, and θ is the threshold applied to the neuron.
• Next, this value is passed through the activation function. However,
unlike a perceptron, neurons in the back-propagation network use a
sigmoid activation function:
\[ Y^{\text{sigmoid}} = \frac{1}{1 + e^{-X}} \]
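As a quick numerical illustration of these two steps, the short sketch below computes the output of a single sigmoid neuron; the input values, weights, and threshold are illustrative only and are not taken from the text.

import numpy as np

def neuron_output(x, w, theta):
    # Net weighted input: X = sum_i x_i * w_i - theta
    net = np.dot(x, w) - theta
    # Sigmoid activation: Y = 1 / (1 + e^(-X))
    return 1 / (1 + np.exp(-net))

# Illustrative values
x = np.array([1.0, 0.0])
w = np.array([0.5, 0.9])
print(neuron_output(x, w, theta=0.3))   # net = 0.2, output ≈ 0.55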
Figure 2: Three-layer back-propagation neural network
• To derive the back-propagation learning law, let us consider the three-
layer network shown in Figure 2. The indices i, j, and k here refer to
neurons in the input, hidden, and output layers, respectively.
• Input signals, x1, x2, …, xn, are propagated through the network from left to
right, and error signals, e1, e2, …, el, from right to left. The symbol wij
denotes the weight for the connection between neuron i in the input layer
and neuron j in the hidden layer, and the symbol wjk denotes the weight between
neuron j in the hidden layer and neuron k in the output layer.
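For reference, one common formulation of the delta-rule updates that result from this derivation for a sigmoid network (with α the learning rate, e_k the error at output neuron k, x_i the inputs, and y_j, y_k the hidden and output activations) is:

\[
\delta_k = y_k (1 - y_k)\, e_k, \qquad \Delta w_{jk} = \alpha\, y_j\, \delta_k
\]
\[
\delta_j = y_j (1 - y_j) \sum_{k=1}^{l} \delta_k\, w_{jk}, \qquad \Delta w_{ij} = \alpha\, x_i\, \delta_j
\]

These are the same quantities computed as d_predicted_output and d_hidden_layer in the XOR program in Experiment 1 below.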
The idea of a perceptron is analogous to the operating principle of the basic
processing unit of the brain, the neuron. A neuron comprises many input
signals carried by dendrites, the cell body, and one output signal carried along
the axon.
Experiments
Experiment 1: Learning XOR gate
import numpy as np
#np.random.seed(0)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # x is assumed to already be a sigmoid output, so the derivative is x * (1 - x)
    return x * (1 - x)
#Input datasets
inputs = np.array([[0,0],[0,1],[1,0],[1,1]])
expected_output = np.array([[0],[1],[1],[0]])
epochs = 10000
lr = 0.1
inputLayerNeurons, hiddenLayerNeurons, outputLayerNeurons = 2,2,1
#Random weights and bias initialization
hidden_weights = np.random.uniform(size=(inputLayerNeurons, hiddenLayerNeurons))
hidden_bias = np.random.uniform(size=(1, hiddenLayerNeurons))
output_weights = np.random.uniform(size=(hiddenLayerNeurons, outputLayerNeurons))
output_bias = np.random.uniform(size=(1,outputLayerNeurons))
print("Initial hidden weights: ",end='')
print(*hidden_weights)
print("Initial hidden biases: ",end='')
print(*hidden_bias)
print("Initial output weights: ",end='')
print(*output_weights)
print("Initial output biases: ",end='')
print(*output_bias)
#Training algorithm
for _ in range(epochs):
    #Forward Propagation
    hidden_layer_activation = np.dot(inputs, hidden_weights)
    hidden_layer_activation += hidden_bias
    hidden_layer_output = sigmoid(hidden_layer_activation)

    output_layer_activation = np.dot(hidden_layer_output, output_weights)
    output_layer_activation += output_bias
    predicted_output = sigmoid(output_layer_activation)

    #Backpropagation
    error = expected_output - predicted_output
    d_predicted_output = error * sigmoid_derivative(predicted_output)

    error_hidden_layer = d_predicted_output.dot(output_weights.T)
    d_hidden_layer = error_hidden_layer * sigmoid_derivative(hidden_layer_output)

    #Updating Weights and Biases
    output_weights += hidden_layer_output.T.dot(d_predicted_output) * lr
    output_bias += np.sum(d_predicted_output, axis=0, keepdims=True) * lr
    hidden_weights += inputs.T.dot(d_hidden_layer) * lr
    hidden_bias += np.sum(d_hidden_layer, axis=0, keepdims=True) * lr
print("Final hidden weights: ",end='')
print(*hidden_weights)
print("Final hidden bias: ",end='')
print(*hidden_bias)
print("Final output weights: ",end='')
print(*output_weights)
print("Final output bias: ",end='')
print(*output_bias)
print("\nOutput from neural network after 10,000 epochs: ",end='')
print(*predicted_output)
Run the above program and modify the code such that:
A) The number of neurons in the hidden layer is doubled; discuss the effect of
doubling the neurons.
B) Discuss the effect of changing the learning rate.
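One way to organize tasks (A) and (B) is to wrap the training loop above in a helper function and sweep the hidden-layer size and learning rate. The sketch below is only one possible approach; the helper name train_xor and the parameter grid are illustrative and not part of the original program.

import numpy as np

def train_xor(hidden_neurons, lr, epochs=10000):
    # Repackages the XOR training loop above and returns the final mean absolute error.
    inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    expected_output = np.array([[0], [1], [1], [0]])

    hidden_weights = np.random.uniform(size=(2, hidden_neurons))
    hidden_bias = np.random.uniform(size=(1, hidden_neurons))
    output_weights = np.random.uniform(size=(hidden_neurons, 1))
    output_bias = np.random.uniform(size=(1, 1))

    sigmoid = lambda x: 1 / (1 + np.exp(-x))

    for _ in range(epochs):
        hidden_out = sigmoid(inputs @ hidden_weights + hidden_bias)
        predicted = sigmoid(hidden_out @ output_weights + output_bias)

        error = expected_output - predicted
        d_out = error * predicted * (1 - predicted)
        d_hidden = (d_out @ output_weights.T) * hidden_out * (1 - hidden_out)

        output_weights += hidden_out.T @ d_out * lr
        output_bias += d_out.sum(axis=0, keepdims=True) * lr
        hidden_weights += inputs.T @ d_hidden * lr
        hidden_bias += d_hidden.sum(axis=0, keepdims=True) * lr

    return np.abs(error).mean()

# Illustrative sweep: original vs. doubled hidden layer, several learning rates
for hidden in (2, 4):
    for lr in (0.01, 0.1, 0.5):
        print(f"hidden={hidden}, lr={lr}: final error={train_xor(hidden, lr):.4f}")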
Experiment 2: Classifying the IRIS dataset
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
# Loading dataset
data = load_iris()
X = data.data
y = data.target
# Split dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=20, random_state=4)
# Hyperparameters
learning_rate = 0.1
iterations = 5000
N = y_train.size
input_size = 4
hidden_size = 2
output_size = 3
np.random.seed(10)
W1 = np.random.normal(scale=0.5, size=(input_size, hidden_size))
W2 = np.random.normal(scale=0.5, size=(hidden_size, output_size))
# Helper functions
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def mean_squared_error(y_pred, y_true):
    # One-hot encode y_true (i.e., convert [0, 1, 2] into [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
    y_true_one_hot = np.eye(output_size)[y_true]
    # Reshape y_true_one_hot to match y_pred shape
    y_true_reshaped = y_true_one_hot.reshape(y_pred.shape)
    # Compute the mean squared error between y_pred and y_true_reshaped
    error = ((y_pred - y_true_reshaped) ** 2).sum() / (2 * y_pred.size)
    return error

def accuracy(y_pred, y_true):
    acc = y_pred.argmax(axis=1) == y_true.argmax(axis=1)
    return acc.mean()

results = pd.DataFrame(columns=["mse", "accuracy"])
# Training loop
for itr in range(iterations):
    # Feedforward propagation
    Z1 = np.dot(X_train, W1)
    A1 = sigmoid(Z1)
    Z2 = np.dot(A1, W2)
    A2 = sigmoid(Z2)

    # Calculate error
    mse = mean_squared_error(A2, y_train)
    acc = accuracy(np.eye(output_size)[y_train], A2)
    new_row = pd.DataFrame({"mse": [mse], "accuracy": [acc]})
    results = pd.concat([results, new_row], ignore_index=True)

    # Backpropagation
    E1 = A2 - np.eye(output_size)[y_train]   # output-layer error
    dW1 = E1 * A2 * (1 - A2)                 # output-layer delta
    E2 = np.dot(dW1, W2.T)                   # error propagated back to hidden layer
    dW2 = E2 * A1 * (1 - A1)                 # hidden-layer delta

    # Update weights
    W2_update = np.dot(A1.T, dW1) / N
    W1_update = np.dot(X_train.T, dW2) / N
    W2 = W2 - learning_rate * W2_update
    W1 = W1 - learning_rate * W1_update
# Visualizing the results
results.mse.plot(title="Mean Squared Error")
plt.show()
results.accuracy.plot(title="Accuracy")
plt.show()
# Test the model
Z1 = np.dot(X_test, W1)
A1 = sigmoid(Z1)
Z2 = np.dot(A1, W2)
A2 = sigmoid(Z2)
test_acc = accuracy(np.eye(output_size)[y_test], A2)
print("Test accuracy: {}".format(test_acc))
• Run the above program and modify the code to improve the testing
accuracy.
• Add your code to plot the confusion matrix.
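As a starting point for the confusion-matrix task, a minimal sketch using scikit-learn's confusion_matrix and ConfusionMatrixDisplay (assuming A2 holds the test-set outputs computed above) could look like this:

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

# Predicted class = index of the largest output neuron for each test sample
y_pred = A2.argmax(axis=1)

cm = confusion_matrix(y_test, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=data.target_names)
disp.plot(cmap="Blues")
plt.title("IRIS test-set confusion matrix")
plt.show()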
Experiment 3
In this experiment, you have to do the following:
- Classify the Breast Cancer dataset from UCI.
- Plot the confusion matrix and calculate the measures below.
Measure                                           Formula
Accuracy (correct classification rate)            (TP + TN) / (P + N)
Error rate (false classification rate)            (FP + FN) / (P + N)
Sensitivity (true positive rate, or recall)       TP / P
Specificity (true negative rate)                  TN / N
Precision                                         TP / (TP + FP)
F score (harmonic mean of precision and recall)   (2 × precision × recall) / (precision + recall)
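A minimal sketch for computing these measures from a binary confusion matrix is given below; it assumes scikit-learn is used for the confusion matrix and that y_test and y_pred hold the true and predicted labels for your breast-cancer test set.

import numpy as np
from sklearn.metrics import confusion_matrix

# y_test, y_pred: true and predicted binary labels (0 = negative, 1 = positive)
cm = confusion_matrix(y_test, y_pred)
tn, fp, fn, tp = cm.ravel()   # sklearn orders the 2x2 matrix as [[TN, FP], [FN, TP]]

p = tp + fn                   # number of actual positives (P)
n = tn + fp                   # number of actual negatives (N)

accuracy    = (tp + tn) / (p + n)
error_rate  = (fp + fn) / (p + n)
sensitivity = tp / p                      # recall / true positive rate
specificity = tn / n                      # true negative rate
precision   = tp / (tp + fp)
f_score     = 2 * precision * sensitivity / (precision + sensitivity)

print(f"Accuracy:    {accuracy:.3f}")
print(f"Error rate:  {error_rate:.3f}")
print(f"Sensitivity: {sensitivity:.3f}")
print(f"Specificity: {specificity:.3f}")
print(f"Precision:   {precision:.3f}")
print(f"F score:     {f_score:.3f}")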