NEURAL NETWORK AND DEEP LEARNING​

UNIT 1 - NEURAL NETWORKS

12M:

Learning Rule

● A learning rule is a method or algorithm that helps a neural network improve during training.

Hebbian Learning Rule


● A simple neural learning rule that strengthens the connection between neurons when they are activated together. It follows the principle: "Neurons that fire together, wire together."
● Proposed by Donald O. Hebb; it is one of the earliest learning rules.
●​ Used for pattern classification.
●​ It is a single-layer neural network with one input layer (with n units)
and one output unit.
●​ The rule updates weights between neurons after each training
sample.

Hebbian Learning Rule Algorithm

1. Initialize weights and bias: Set all weights and the bias to zero.
2. For each input vector and target output, repeat steps 3-5.
3. Set input activations: xᵢ = sᵢ (each input unit takes the value of the corresponding training input).
4. Set output: y = t.
5. Update weights and bias using: wᵢ(new) = wᵢ(old) + xᵢ·y and b(new) = b(old) + y.

Implementing AND Gate

● 4 training samples → 4 iterations.
● Activation Function: Bipolar Sigmoid (range: [-1, 1])
● Formula: f(x) = (1 − e⁻ˣ) / (1 + e⁻ˣ)

Step 1: Initialize

●​ Weights = [0, 0, 0]ᵀ, Bias = 0.

Step 2: Input Vectors

● X1 = [-1, -1, 1]ᵀ
● X2 = [-1, 1, 1]ᵀ
● X3 = [1, -1, 1]ᵀ
● X4 = [1, 1, 1]ᵀ

Step 3: Assign Outputs

●​ Set y = t for each input.


So, the target values (t) become:

Input (X₁, X₂) | Target (t)
(-1, -1)       | -1
(-1, 1)        | -1
(1, -1)        | -1
(1, 1)         |  1

Step 4: Update Weights Using Hebbian Rule


W = W + tX

Iteration 1:

Using X1 = [-1, -1, 1]ᵀ and t = −1,

W = [0,0,0] + (−1) × [−1,−1,1] = [1, 1, -1]ᵀ

Iteration 2:

Using X2 = [-1, 1, 1]ᵀ and t = -1,

W = [1,1,−1] + (−1) × [−1,1,1] = [2, 0, -2]ᵀ

Iteration 3:

Using X3 = [1, -1, 1]ᵀ and t=−1,

W = [2,0,−2] + (−1) × [1,−1,1] = [1, 1, -3]ᵀ


Iteration 4:

Using X4 = [1, 1, 1]ᵀ and t=1,

W = [1,1,−3] + (1) × [1,1,1] = [2, 2, -2]ᵀ

Final Weights:

W = [2, 2, -2]ᵀ

Testing the Network

Each input is represented as an augmented vector (including the bias component), as listed in Step 2.

The output Y is computed using the dot product:

Y = W ⋅ X = (w1 × x1) + (w2 × x2) + (w3 × x3)

1. For (-1, -1):

Y = (2 × −1) + (2 × −1) + (−2 × 1) = −2 − 2 − 2 = −6

2. For (-1, 1):

Y = (2 × −1) + (2 × 1) + (−2 × 1) = −2 + 2 − 2 = −2

3. For (1, -1):

Y = (2 × 1) + (2 × −1) + (−2 × 1) = 2 − 2 − 2 = −2

4. For (1, 1):

Y = (2 × 1) + (2 × 1) + (−2 × 1) = 2 + 2 − 2 = 2

The outputs for the four test inputs are therefore:

Y = [−6, −2, −2, 2]

Applying the bipolar activation (negative → −1, positive → +1) gives [−1, −1, −1, 1], which matches the AND-gate targets.
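A minimal NumPy sketch of the same Hebbian procedure (variable names are my own; the notes do not give code, and the final sign-based classification step is an assumption added to show that the raw outputs map onto the AND targets):

```python
import numpy as np

# Bipolar AND-gate training set: each row is [x1, x2, bias], targets are t
X = np.array([[-1, -1, 1],
              [-1,  1, 1],
              [ 1, -1, 1],
              [ 1,  1, 1]])
t = np.array([-1, -1, -1, 1])

# Step 1: initialise weights (the bias is folded in as the third weight)
W = np.zeros(3)

# Steps 2-5: one Hebbian update per training sample, W = W + t * X
for x_i, t_i in zip(X, t):
    W = W + t_i * x_i

print("Final weights:", W)                      # [ 2.  2. -2.]

# Testing: Y = W . X for each augmented input vector
Y = X @ W
print("Raw outputs:", Y)                        # [-6. -2. -2.  2.]
print("Classified:", np.where(Y >= 0, 1, -1))   # [-1 -1 -1  1] = AND targets
```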

Decision Boundary

(Figure not reproduced: with W = [2, 2, −2], the boundary is 2x₁ + 2x₂ − 2 = 0, i.e. x₁ + x₂ = 1, which separates (1, 1) from the other three inputs.)

Perceptron Learning Rule

● Purpose: Used in supervised learning for binary classification tasks (output: +1 or −1).
● Created by: Frank Rosenblatt as a binary classifier.
●​ Components:
1.​ Input Nodes: Accept numerical values for processing.
2.​ Weights & Bias:
■​ Weights determine the strength of connections between
neurons.
■​ Bias acts as an intercept in a linear equation, helping
improve model performance.
3.​ Activation Function: Decides if the neuron will activate or not
(commonly a step function).

Types of Activation Functions:

1. Sign Function
2. Step Function
3. Sigmoid Function (output between 0 and 1)

(Figure not reproduced: range values of the sign and step functions; a code sketch follows below.)
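A short sketch of the three functions listed above, using the usual textbook definitions (the threshold conventions, e.g. step(0) = 1, are assumptions since the notes do not state them):

```python
import numpy as np

def sign_fn(x):
    """Sign function: +1 for x > 0, 0 at x = 0, -1 for x < 0."""
    return np.sign(x)

def step_fn(x):
    """Binary step: 1 if x >= 0, else 0."""
    return np.where(x >= 0, 1, 0)

def sigmoid(x):
    """Sigmoid: smooth output between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))

z = np.array([-2.0, 0.0, 3.0])
print(sign_fn(z))    # [-1.  0.  1.]
print(step_fn(z))    # [0 1 1]
print(sigmoid(z))    # approximately [0.12 0.5  0.95]
```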

How Perceptron Works:

1. Step 1: Calculate the weighted sum of the inputs: z = w₁x₁ + w₂x₂ + … + wₙxₙ + b
2. Step 2: Apply the activation function to the weighted sum to get the output: y = f(z)

The output can be binary or continuous, depending on the activation function used.

Example of Perceptron Learning Rule:

Dataset:

X1 | X2 | Y
0  | 0  | 0
0  | 1  | 0
1  | 0  | 0
1  | 1  | 1
Initial Weights:

●​ w1 = 0.9 , w2 = 0.9
●​ Activation Threshold: 0.5
●​ Learning Rate: 0.5

Step-by-step Calculation:

1. First Instance (X1 = 0, X2 = 0):
○ Weighted sum: 0 × 0.9 + 0 × 0.9 = 0
○ Output = 0 (no error, no weight update).
2. Second Instance (X1 = 0, X2 = 1):
○ Weighted sum: 0 × 0.9 + 1 × 0.9 = 0.9
○ Output = 1, since 0.9 exceeds the 0.5 threshold (but the actual output is 0, so error = −1).
○ Update weights:

w1 = 0.9 + 0.5 × (−1) = 0.4

w2 = 0.9 + 0.5 × (−1) = 0.4

3. Third Instance (X1 = 1, X2 = 0):
○ Weighted sum: 1 × 0.4 + 0 × 0.4 = 0.4
○ Output = 0 (no error, no weight update).
4. Fourth Instance (X1 = 1, X2 = 1):
○ Weighted sum: 1 × 0.4 + 1 × 0.4 = 0.8
○ Output = 1 (correct, no weight update).

Feedforward with Updated Weights:

● After updating the weights, reapply the process to all instances.
● First Instance (X1 = 0, X2 = 0):
○ Weighted sum: 0 × 0.4 + 0 × 0.4 = 0
○ Output = 0 (correct, no weight update).
● Second Instance (X1 = 0, X2 = 1):
○ Weighted sum: 0 × 0.4 + 1 × 0.4 = 0.4
○ Output = 0 (correct, no weight update).
● Third & Fourth Instances: Already classified correctly in the previous round.

This process repeats for all training data until the model consistently classifies
instances correctly.
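A minimal sketch of the training loop above. It follows the example's own convention of shifting every weight by learning_rate × error on a misclassification (a common textbook variant would also scale the update by the corresponding input); the variable names are mine:

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # AND-gate inputs
y = np.array([0, 0, 0, 1])                        # target outputs

w = np.array([0.9, 0.9])    # initial weights from the example
threshold = 0.5
learning_rate = 0.5

for epoch in range(10):
    errors = 0
    for x_i, t_i in zip(X, y):
        weighted_sum = np.dot(w, x_i)
        output = 1 if weighted_sum >= threshold else 0
        error = t_i - output
        if error != 0:
            # Example's convention: every weight is shifted by learning_rate * error
            w = w + learning_rate * error
            errors += 1
    if errors == 0:           # stop once a full pass makes no mistakes
        break

print("Final weights:", w)    # [0.4 0.4], matching the worked example
```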

_____________________________________________________________________________

Activation Functions

● A mathematical function applied to a neuron's output.
● Introduces non-linearity, allowing neural networks to learn complex patterns.
● Without it, networks behave like linear regression models and cannot handle complex tasks.
● Determines neuron activation based on the weighted sum of inputs and bias.

Need for Non-Linearity in Neural Networks

● Neurons rely on weights, biases, and activation functions for learning.
● Backpropagation updates weights and biases based on errors.
● Activation functions provide gradients that help in efficient learning.

Why Activation Functions Are Necessary

Without Non-Linearity

● Neurons passing weighted sums directly keep the network linear.
● Multiple layers then still act like a single-layer perceptron, limiting learning ability.

With Non-Linearity

● Non-linear activation functions (e.g., ReLU) help in learning complex patterns.
● Example (ReLU), sketched in code below:
○ ReLU: f(x) = max(0, x)
○ Hidden Layer: h = ReLU(W₁x + b₁)
○ Output Layer: y = W₂h + b₂
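A short sketch of the point above: two stacked linear layers collapse into one linear map, while inserting ReLU between them does not (the weight matrices here are random placeholders, chosen only for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # hidden-layer weights
W2 = rng.normal(size=(2, 4))   # output-layer weights
x = rng.normal(size=3)

# Without non-linearity: two layers are equivalent to the single matrix W2 @ W1
two_linear_layers = W2 @ (W1 @ x)
one_linear_layer = (W2 @ W1) @ x
print(np.allclose(two_linear_layers, one_linear_layer))   # True

# With ReLU between the layers: no single matrix reproduces the mapping
relu = lambda z: np.maximum(0, z)
with_relu = W2 @ relu(W1 @ x)
print(np.allclose(with_relu, one_linear_layer))            # generally False
```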

Types of Activation Functions

1. Linear Activation Function

● y = x, produces a straight-line output.
● Limitations: Cannot model complex patterns, so it is only used in output layers for regression tasks.

Non-Linear Activation Functions

2. Sigmoid Function

● Formula: σ(x) = 1 / (1 + e⁻ˣ)
● Range: 0 to 1 (useful for binary classification).
● Issues:
○ Vanishing Gradient Problem: for large positive or negative values of x, gradients become very small, slowing learning.

3. Hyperbolic Tangent (Tanh) Function

● Formula: tanh(x) = (eˣ − e⁻ˣ) / (eˣ + e⁻ˣ)
● Range: -1 to 1 (zero-centered, better than Sigmoid).
● Problem: Still suffers from vanishing gradients.

4. Rectified Linear Unit (ReLU)

● Formula: f(x) = max(0, x)
● Range: [0, ∞) (outputs non-negative values).
● Advantages:
○ Simple, fast computation.
○ Helps avoid the vanishing gradient problem.
● Issue: Can cause "dead neurons" (neurons that always output 0 for negative inputs).

5. Leaky ReLU

● A modification of ReLU that allows small negative values instead of 0.
● Formula: f(x) = x for x > 0, and f(x) = αx for x ≤ 0 (α is a small constant, e.g. 0.01).
● Fixes the dead neuron problem, but choosing α is crucial.

6. Parametric ReLU (PReLU)

● Similar to Leaky ReLU, but α is learned during training.
● Formula: f(x) = x for x > 0, and f(x) = αx for x ≤ 0 (α is a trainable parameter).
● Advantage: Optimized performance.
● Issue: Increases model complexity, risking overfitting.

7. Softmax Function

● Used in multi-class classification to convert outputs into probabilities.
● Formula: softmax(xᵢ) = e^(xᵢ) / Σⱼ e^(xⱼ)
● Advantage: Helps handle multiple classes effectively.
● Issue: Computationally expensive when there are many classes.
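The formulas above can be sketched in a few lines of NumPy; this is an illustrative implementation, and the max-shift inside softmax is a common numerical-stability trick rather than something the notes specify:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))         # range (0, 1)

def tanh(x):
    return np.tanh(x)                        # range (-1, 1), zero-centered

def relu(x):
    return np.maximum(0, x)                  # range [0, inf)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)     # small slope for negative x

def softmax(x):
    e = np.exp(x - np.max(x))                # shift for numerical stability
    return e / e.sum()                       # probabilities summing to 1

z = np.array([-2.0, 0.0, 2.0])
print(relu(z))         # [0. 0. 2.]
print(leaky_relu(z))   # [-0.02  0.    2.  ]
print(softmax(z))      # approximately [0.016 0.117 0.867]
```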

Impact of Activation Functions on Model Performance

● Training Speed: ReLU is faster; Sigmoid/Tanh can slow learning.
● Gradient Flow: ReLU lets deeper layers learn better, while Sigmoid/Tanh struggle with vanishing gradients.
● Model Complexity:
○ Softmax: Best for multi-class problems.
○ ReLU/Leaky ReLU: Good for hidden layers.

Choosing the Right Activation Function

Function   | Best For                      | Limitations
Sigmoid    | Binary classification         | Vanishing gradient problem
Tanh       | Hidden layers, zero-centered  | Still suffers from vanishing gradients
ReLU       | Hidden layers, fast training  | Dead neurons for negative values
Leaky ReLU | Fixes dead neurons            | Choosing α is tricky
Softmax    | Multi-class classification    | Computationally expensive

_____________________________________________________________________________

Single Layer Perceptron and MultiLayer Perceptron

Single-Layer Perceptron (SLP)

●​ A basic neural unit that classifies data into two categories (Binary
Classifier).
●​ Works only for linearly separable problems (e.g., AND, OR).
●​ Uses a step function to give binary output (0 or 1).
●​ No hidden layers—just input and output layers.
●​ Uses a simple learning rule (no backpropagation).

Demonstration of Single Layer Perceptron using OR and AND Function

(Figures not reproduced: design of a single neuron implementing the AND function and the OR function; a code sketch follows below.)
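As a stand-in for the missing figures, here is a minimal sketch of a single step-function neuron realising AND and OR. The specific weight and bias values are illustrative choices, not taken from the notes:

```python
import numpy as np

def slp_neuron(x, w, bias):
    """Single-layer perceptron: step activation applied to the weighted sum."""
    return 1 if np.dot(w, x) + bias >= 0 else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

for x in inputs:
    x = np.array(x)
    # Illustrative weights/biases: AND fires only when both inputs are 1,
    # OR fires when at least one input is 1.
    and_out = slp_neuron(x, w=np.array([1, 1]), bias=-1.5)
    or_out = slp_neuron(x, w=np.array([1, 1]), bias=-0.5)
    print(tuple(x), "AND:", and_out, "OR:", or_out)
```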
Multilayer Perceptron

● An MLP is a type of neural network that moves data in the forward direction only.
● It contains input, hidden, and output layers.
● All nodes are fully connected, and each node passes values only forward.
● Uses the Backpropagation algorithm to improve accuracy during training.

Working of MultiLayer Perceptron Neural Network

1. Input Layer:
○ Represents the features of the dataset.
○ Passes the input vector values to the hidden layer.
2. Hidden Layer:
○ Each edge has a weight that is multiplied by the input variable.
○ The weighted values from all nodes are summed together.
○ An activation function determines which nodes should activate.
3. Output Layer:
○ Processes the activated values and generates the final output.
4. Error Calculation:
○ The difference between the predicted output and the actual output is calculated.
5. Backpropagation:
○ The network adjusts weights to reduce the error and improve accuracy.

Designing of Non-Linear Problem using Multilayer Perceptron

(Figure not reproduced: an MLP solving a non-linearly separable problem such as XOR; a code sketch follows below.)
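A minimal sketch of how an MLP handles a non-linearly separable problem such as XOR. The hidden-layer weights below are hand-picked for illustration (one OR-like unit and one NAND-like unit combined by an AND-like output unit), not learned by backpropagation:

```python
import numpy as np

def step(z):
    """Step activation: 1 where z >= 0, else 0."""
    return (z >= 0).astype(int)

# Hand-picked weights: hidden layer computes OR and NAND, output combines them
W_hidden = np.array([[ 1,  1],     # OR-like hidden unit
                     [-1, -1]])    # NAND-like hidden unit
b_hidden = np.array([-0.5, 1.5])
w_out = np.array([1, 1])           # AND-like output unit
b_out = -1.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = step(W_hidden @ np.array(x) + b_hidden)   # hidden activations
    y = step(np.dot(w_out, h) + b_out)            # final output = XOR(x1, x2)
    print(x, "->", int(y))
```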
Difference Between SLP, MLP, and Deep Learning Networks
Aspect              | Single-Layer Perceptron (SLP)                         | Multi-Layer Perceptron (MLP)                               | Deep Learning Networks
Architecture        | One input layer, one output layer (no hidden layers)  | Input, hidden, and output layers                           | Many layers (deep architecture)
Problem Solvability | Solves only linearly separable problems (e.g., AND)   | Solves linear & non-linear problems (e.g., XOR)            | Handles highly complex tasks
Activation Function | Step function (binary output)                         | Non-linear functions (ReLU, Sigmoid, Tanh)                 | Advanced activations (Softmax, Leaky ReLU, etc.)
Learning Algorithm  | Perceptron Learning Rule, no backpropagation          | Backpropagation and Gradient Descent                       | Advanced optimization techniques (e.g., Adam, RMSprop)
Output              | Binary (0 or 1)                                       | Continuous (regression) or multi-class (classification)   | Highly flexible output (text, images, signals, etc.)
Applications        | Basic binary classification                           | Complex tasks (classification, regression, image recognition) | AI-based applications (computer vision, NLP, robotics)
Complexity          | Simple and limited                                    | More complex and powerful                                  | Highly complex and scalable

_____________________________________________________________________________
