Neural Networks Lecture Notes

Neural Networks

Lecture Notes​
Unit 1

Introduction to Perceptron

The Perceptron, introduced by Frank Rosenblatt in 1957, is a foundational algorithm in
artificial neural networks (ANNs) and a simplified model of a biological neuron. It serves as a
building block for more complex neural network architectures.

Key Components and Functionality:

Inputs (x):
The perceptron receives multiple input signals, each representing a feature of the data.
Weights (w):
Each input is associated with a weight, indicating its importance or contribution to the output.
Weights of larger magnitude signify greater influence, and the sign of a weight determines
whether the input pushes the output toward 1 or toward 0.
Bias (b):
A bias term is added to the weighted sum of inputs. It shifts the perceptron's activation
threshold, so the unit can still fire when all inputs are zero.
Weighted Sum:
The inputs are multiplied by their respective weights and summed, along with the bias, to
calculate a net input.

Net Input = (x1 * w1) + (x2 * w2) + ... + (xn * wn) + b

Activation Function: The net input is then passed through an activation function, typically a step
function in a classic perceptron. This function determines the perceptron's output, usually a binary
value (e.g., 0 or 1), representing a classification decision.

Output = 1 if Net Input >= Threshold, else 0

Because the bias b is already part of the net input, the threshold is usually taken to be 0.
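The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function names, the default threshold of 0, and the weights (chosen by hand so that the unit behaves like a logical AND gate) are assumptions, not part of the original notes.

```python
def net_input(x, w, b):
    """Weighted sum of the inputs plus the bias."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b


def perceptron_output(x, w, b, threshold=0.0):
    """Step activation: 1 if the net input reaches the threshold, else 0."""
    return 1 if net_input(x, w, b) >= threshold else 0


# Hypothetical, hand-picked parameters that act like an AND gate.
w = [1.0, 1.0]
b = -1.5
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [perceptron_output(x, w, b) for x in inputs]
print(outputs)
```

Only the input (1, 1) gives a net input above zero (1 + 1 - 1.5 = 0.5), so it is the only case where the unit outputs 1.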

Learning Process:

The perceptron learns by adjusting its weights and biases based on the difference between
its predicted output and the desired output (in supervised learning). This adjustment, often
guided by the Perceptron Learning Rule, aims to reduce classification errors over time.
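A minimal sketch of the Perceptron Learning Rule is given here, assuming the common form w_i = w_i + lr * (target - prediction) * x_i and b = b + lr * (target - prediction). The function names, the learning rate of 1.0, and the AND-gate training set are illustrative choices, not taken from the notes.

```python
def predict(x, w, b):
    """Step activation applied to the net input (threshold fixed at 0)."""
    return 1 if sum(xi * wi for xi, wi in zip(x, w)) + b >= 0 else 0


def train_perceptron(samples, lr=1.0, epochs=50):
    """Adjust weights and bias only when a sample is misclassified."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(x, w, b)  # -1, 0, or +1
            if error != 0:
                w = [wi + lr * error * xi for wi, xi in zip(w, x)]
                b += lr * error
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly: stop early
            break
    return w, b


# Linearly separable toy data: the logical AND function.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_gate)
print(w, b, [predict(x, w, b) for x, _ in and_gate])
```

Because the AND data is linearly separable, the loop stops after a handful of epochs with all four samples classified correctly.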
Limitations:

While crucial for understanding ANNs, single-layer perceptrons are limited to classifying
linearly separable data. They cannot solve problems where a straight line (or hyperplane)
cannot separate the different classes in the dataset. This limitation led to the development of
multi-layer perceptrons (MLPs), which can handle non-linear relationships and form the basis
of modern deep learning.

Multilayer Perceptron/ANN

A Multilayer Perceptron (MLP) is a type of Artificial Neural Network (ANN) characterized by
its feedforward structure and the presence of multiple layers of interconnected neurons. It is
a fundamental building block in deep learning and is widely used for various tasks like
classification, regression, and pattern recognition.

Architecture:

Input Layer:
This layer receives the raw input data. Each neuron in the input layer corresponds to a feature in
the input dataset.
Hidden Layers:
MLPs contain one or more hidden layers situated between the input and output layers. These
layers are responsible for transforming and extracting features from the input data through a
series of weighted connections and activation functions. The neurons in a hidden layer do not
directly interact with each other but process information from the preceding layer.
Output Layer:
This layer produces the final output of the network, which can be a classification prediction, a
regression value, or other desired outputs depending on the task.

Functionality:

Weighted Connections:
Each connection between neurons in adjacent layers has an associated weight, representing the
strength or importance of that connection.
Activation Functions:
Neurons in the hidden and output layers typically apply a non-linear activation function to their
weighted sum of inputs. This non-linearity allows MLPs to learn complex, non-linear relationships
within the data, overcoming the limitations of simpler models like single-layer perceptrons.
Training:
MLPs are trained using algorithms like backpropagation, which adjust the weights and biases of
the network iteratively to minimize the difference between the network's predictions and the actual
target values. This process involves calculating the error, propagating it backward through the
network, and updating the parameters to reduce the error in subsequent iterations.

Advantages:

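Ability to learn complex, non-linear patterns in data.
Universal approximator: With enough hidden neurons and a suitable non-linear activation function,
an MLP with a single hidden layer can approximate any continuous function on a bounded input
region to any desired accuracy.
Versatility in application across various domains.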

Limitations:

Can be computationally intensive to train, especially with large datasets and deep architectures.
Prone to overfitting if not properly regularized or if the model complexity is too high relative to the
data.
Requires significant amounts of data for optimal performance and generalization.

Black Box Techniques
In the context of neural networks, black box techniques refer to methods used to analyze or
test a model without needing to understand its internal structure or parameters. These
techniques treat the neural network as a "black box," focusing on its inputs and outputs
rather than its internal workings. This approach is particularly useful when dealing with
complex models where detailed analysis is difficult or impractical.

Focus on Inputs and Outputs:
Black box techniques analyze a neural network's behavior by observing the relationship between its
inputs and the resulting outputs.

No Internal Knowledge Required:
They don't require knowledge of the network's architecture, weights, or activation functions.

Examples:

Black Box Testing: This involves testing the network's functionality based on its inputs and outputs,
without examining the internal code or structure.

Surrogate Models: These are simpler, interpretable models trained to mimic the behavior of the black
box neural network, providing insights into its decision-making process.

Adversarial Attacks: These techniques involve crafting specific inputs that cause the network to make
incorrect predictions, revealing vulnerabilities and weaknesses.

Sensitivity Analysis: This involves studying how changes in input values affect the output, helping to
understand which input features are most important.
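As one illustration of the techniques listed above, sensitivity analysis can be sketched by nudging each input slightly and measuring how much the output moves. The `black_box` function below is a hypothetical stand-in for an opaque model (its internal formula is an assumption made only so that the example runs); the analysis itself uses nothing but inputs and outputs.

```python
import math


def black_box(x):
    """Hypothetical opaque model: only its input/output behaviour is used below."""
    z = 3.0 * x[0] - 0.5 * x[1] + 0.0 * x[2]
    return 1.0 / (1.0 + math.exp(-z))


def sensitivity(model, x, eps=1e-4):
    """Central finite-difference estimate of how the output changes per input."""
    scores = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        scores.append((model(up) - model(down)) / (2 * eps))
    return scores


scores = sensitivity(black_box, [0.2, 0.4, 0.9])
# Rank features from most to least influential by absolute sensitivity.
ranking = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
print(scores, ranking)
```

In this toy case the first input dominates, the second has a small negative effect, and the third has no effect, which the ranking recovers without looking inside the model.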
Why Use Black Box Techniques?

Explainability:​
Black box methods are crucial for understanding the behavior of complex neural networks, especially
when they are used in high-stakes applications.

Debugging:​
They can help identify errors or biases in the network's predictions by analyzing its behavior under
different inputs.

Model Evaluation:​
Black box testing provides a way to evaluate the overall performance and robustness of a neural
network.

Security:​
Adversarial attacks that rely only on querying the model (black box attacks) can be used to assess
the security and robustness of neural networks against malicious inputs.

Intuition of neural networks

Neural networks are computational systems inspired by the structure and function of the
human brain, designed to recognize patterns and learn from data. They achieve this by using
interconnected nodes, or neurons, organized in layers. These layers process information
through mathematical operations, adjusting the strength of connections (weights) to
progressively refine their understanding of the data and improve their performance on
specific tasks.

1. Inspired by the Brain:

Neural networks are modeled after the biological structure of the brain, with interconnected neurons that
process and transmit information.

Each "artificial neuron" receives inputs, applies weights, sums them, and then passes the result through
an activation function to produce an output.

2. Layers of Processing:

Information flows through the network in layers: input, hidden (one or more), and output.

The input layer receives raw data, and the hidden layers perform computations, progressively extracting
features and patterns.

The output layer produces the final result or prediction.

3. Learning through Weight Adjustments:

Neural networks learn by adjusting the weights associated with the connections between neurons.

During training, the network compares its output to the desired output and adjusts the weights to
minimize the error.

This iterative process allows the network to "learn" the underlying patterns and relationships in the data.

4. Intuition and Pattern Recognition:

Neural networks excel at pattern recognition and making predictions based on learned patterns.

They can identify complex relationships and make predictions even when the underlying data is noisy or
incomplete.

The network essentially develops an "intuition" for the data through repeated exposure and learning,
allowing it to generalize to new, unseen data.

5. Example:

Imagine teaching a child to recognize a dog. You show them pictures of different dogs, and over time,
they learn to associate the visual characteristics (ears, fur, tail, etc.) with the concept of a dog.

A neural network does something similar, but on a much larger scale with vast amounts of data and
complex mathematical operations.

Perceptron

Perceptrons were developed in the 1950s and 1960s by the scientist Frank Rosenblatt,
inspired by earlier work from Warren McCulloch and Walter Pitts. While today we use other
models of artificial neurons, they follow the general principles set by the perceptron. In a
perceptron, signals travel in one direction only, from the inputs to the output. This is called a
feed-forward network. The accompanying figure depicts a neuron that is connected with n other
neurons and thus receives n inputs (x1, x2, ..., xn). This configuration is called a Perceptron.

Calculation of new weights

In neural networks, weights are typically updated by gradient descent, with the required
gradients computed through a process called backpropagation. The weights are adjusted based on
the error between the predicted output and the actual output. The core idea is to calculate how
much each weight contributed to the error and then change the weight in the opposite direction
of that contribution, effectively reducing the error. This process is repeated iteratively with
training data until the network achieves a desired level of accuracy.

1. Forward Propagation:

Input data is fed into the neural network.

The data is processed through each layer, with each neuron performing a weighted sum of its inputs,
adding a bias, and then applying an activation function.

This process continues until the output layer produces a prediction.
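Forward propagation as described above can be sketched as follows. The 2-3-1 layer sizes, the weight values, and the use of a sigmoid in every layer are assumptions chosen only to keep the example small.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases, activation):
    """Each neuron computes activation(weighted sum of inputs + bias)."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Feed the output of each layer into the next until the output layer."""
    activations = x
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases, sigmoid)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
prediction = forward([1.0, 2.0], layers)
print(prediction)
```

Each entry in `layers` holds one row of weights per neuron, so the number of rows is the layer's width and the row length is the width of the layer before it.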

2. Error Calculation:

The predicted output is compared to the actual (target) output.

An error function (e.g., mean squared error) quantifies the difference between the prediction and the
target.

3. Backpropagation:

Calculate error gradients: The error is propagated backward through the network, calculating the
gradient (rate of change) of the error with respect to each weight.

Update weights: Each weight is updated using the following formula:

● new_weight = old_weight - learning_rate * gradient
● learning_rate is a hyperparameter that controls the step size of the update.
● The gradient indicates the direction of steepest increase in the error. By moving in the
opposite direction (subtracting learning_rate * gradient), we aim to decrease the error.

Update biases: Biases are updated similarly to weights, using their respective gradients.
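For a single sigmoid neuron the gradient and update steps can be written out explicitly. The sketch below assumes a squared error of the form E = 0.5 * (y - target)^2 for one training example; the starting weights, input, and learning rate are made-up values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, target):
    """Squared error for one example: E = 0.5 * (y - target)^2."""
    y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return 0.5 * (y - target) ** 2


def gradients(w, b, x, target):
    """Chain rule: dE/dw_i = (y - target) * y * (1 - y) * x_i, and the same without x_i for b."""
    y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    delta = (y - target) * y * (1.0 - y)
    return [delta * xi for xi in x], delta


def update(w, b, x, target, learning_rate):
    """new_weight = old_weight - learning_rate * gradient (likewise for the bias)."""
    grad_w, grad_b = gradients(w, b, x, target)
    new_w = [wi - learning_rate * gi for wi, gi in zip(w, grad_w)]
    new_b = b - learning_rate * grad_b
    return new_w, new_b


w, b = [0.3, -0.1], 0.0
x, target = [1.0, 2.0], 1.0
new_w, new_b = update(w, b, x, target, learning_rate=0.5)
print(loss(w, b, x, target), loss(new_w, new_b, x, target))
```

The second printed loss is smaller than the first, which is the effect of stepping against the gradient. In a multi-layer network the same chain rule is applied layer by layer, from the output back toward the input.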
4. Iteration:

Steps 1-3 are repeated for each data point in the training set.

This process is typically repeated for multiple passes (epochs) through the training data.

The learning rate can be adjusted during training (e.g., decreasing it over time) to fine-tune the learning
process.
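Steps 1 to 4 can be combined into one small training loop. To keep it short, this sketch assumes a single linear neuron (no activation function), a per-example squared error, and a learning rate that is multiplied by a decay factor after every epoch; the data set y = 2x + 1 and all hyperparameter values are invented for illustration.

```python
def train(data, learning_rate=0.1, decay=0.99, epochs=200):
    """Per-example gradient descent for a single linear neuron: y = w * x + b."""
    w, b = 0.0, 0.0
    for _ in range(epochs):                # one epoch = one pass over the data
        for x, target in data:
            prediction = w * x + b         # step 1: forward propagation
            error = prediction - target    # step 2: error for this example
            grad_w = error * x             # step 3: gradient of 0.5 * error^2
            grad_b = error
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
        learning_rate *= decay             # shrink the step size over time
    return w, b


# Toy data generated from y = 2x + 1 (an assumption for this example).
data = [(x, 2.0 * x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w, b = train(data)
print(w, b)
```

After training, w and b end up close to 2 and 1, the values used to generate the data.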

Key Concepts:

Learning Rate:
A crucial hyperparameter that determines the magnitude of weight updates during
training. A small learning rate can lead to slow convergence, while a large learning rate
can cause the training to oscillate and potentially diverge.
Activation Functions:
Non-linear functions applied to the weighted sum of inputs in each neuron. They introduce
non-linearity into the network, enabling it to learn complex patterns. Common examples include
sigmoid, ReLU, and tanh.
Error Functions:
Functions that quantify the difference between the predicted and actual outputs. The choice of
error function depends on the specific task (e.g., regression, classification).
Gradient Descent:
The optimization algorithm used to minimize the error function by iteratively adjusting the weights
in the direction opposite to the gradient computed by backpropagation.
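The effect of the learning rate can be seen on a toy error function. The sketch below assumes E(w) = w^2, whose gradient is 2w and whose minimum is at w = 0; the four learning rates are arbitrary values picked to show slow, fast, oscillating, and diverging behaviour.

```python
def gradient_descent(learning_rate, steps=20, w=1.0):
    """Minimize E(w) = w**2 (gradient 2*w) and return every value w takes."""
    path = [w]
    for _ in range(steps):
        w = w - learning_rate * 2.0 * w
        path.append(w)
    return path


slow = gradient_descent(0.01)         # tiny steps: still far from 0 after 20 steps
good = gradient_descent(0.3)          # converges quickly toward 0
oscillating = gradient_descent(0.9)   # overshoots 0 each step, but shrinks
diverging = gradient_descent(1.1)     # overshoots by more each step and blows up

for name, path in [("slow", slow), ("good", good),
                   ("oscillating", oscillating), ("diverging", diverging)]:
    print(name, path[-1])
```

Each step multiplies w by (1 - 2 * learning_rate), so the sign of w flips once the learning rate exceeds 0.5 and its size grows once the learning rate exceeds 1.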
Unit 2

Non-linear boundaries in MLP

The Limitations of Single Layer Perceptron: Linear Decision Boundaries in Non-Linear Data

The single layer perceptron, a foundational neural network model, has played a significant
role in the development of machine learning. However, it has a critical limitation: it can only
create linear decision boundaries. This constraint restricts its ability to capture or build
non-linear decision boundaries, which are essential for solving complex classification
problems.

In a single layer perceptron, the input data is multiplied by weights, summed together with a
bias, and passed through a step activation function that compares the sum to a threshold to make
predictions. This process creates a decision boundary that separates different classes in the
input space. However, this decision boundary is always linear (a line, plane, or hyperplane),
meaning the perceptron can only correctly classify data that is linearly separable.

Linearly separable data consists of classes that can be accurately divided by a straight line
or plane. However, real-world data often contains intricate non-linear relationships. Imagine a
scenario where the data points of two classes are intertwined in a complex manner, such as
concentric circles or intertwined spirals. A single layer perceptron would fail to find an
appropriate linear decision boundary to separate such classes accurately.
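A smaller example of the same failure is the XOR function, which is a standard case of data that is not linearly separable (it is used here in place of the circles and spirals mentioned above). The sketch assumes the usual perceptron update rule; the function names and epoch count are illustrative.

```python
def predict(x, w, b):
    return 1 if sum(xi * wi for xi, wi in zip(x, w)) + b >= 0 else 0


def train_and_count_errors(samples, epochs=100, lr=1.0):
    """Train a single perceptron, then count the samples it still gets wrong."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(x, w, b)
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return sum(predict(x, w, b) != target for x, target in samples)


or_gate = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]   # linearly separable
xor_gate = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # not linearly separable

or_errors = train_and_count_errors(or_gate)
xor_errors = train_and_count_errors(xor_gate)
print(or_errors, xor_errors)
```

The OR data is learned without errors, while at least one XOR sample is always misclassified no matter how long training runs, because no single straight line separates the two XOR classes.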

How Multilayer Perceptron Captures Non-Linearity in Data


To overcome this limitation, the multilayer perceptron (MLP) was introduced. MLP networks
contain multiple layers, including hidden layers, which introduce non-linear transformations
through activation functions. This allows the network to capture complex non-linear
relationships between input features and output classes. By incorporating multiple layers and
non-linear activation functions, MLP networks can build decision boundaries that can flexibly
adapt to non-linear data patterns.
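To make this concrete, here is a hand-built network with one hidden layer that computes XOR, a function no single perceptron can represent. The weights are set by hand rather than learned, and step activations are used for readability; both choices are assumptions for illustration only.

```python
def step(z):
    return 1 if z >= 0 else 0


def xor_mlp(x1, x2):
    # Hidden layer: two neurons, each drawing its own linear boundary.
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    # Output neuron: "OR but not AND", which is exactly XOR.
    return step(h_or - h_and - 0.5)


results = [xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(results)
```

The hidden layer re-describes each input point by the pair (h_or, h_and), and in that new space a single linear boundary is enough to separate the classes.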
The ability of MLP networks to capture non-linear decision boundaries has greatly expanded
the possibilities of machine learning. It enables us to tackle a wide range of complex tasks,
such as image recognition, natural language processing, and speech recognition. MLP
networks have demonstrated their effectiveness in handling data with intricate non-linear
relationships, offering superior accuracy and predictive power compared to single layer
perceptrons.

Integration function

The integration function typically refers to the step in which a neuron combines its inputs and
weights into a single value. This is a core part of how neural networks compute their predictions.
The integration function, followed by an activation function, determines how the
network processes information.

1. Input and Weights:

➢ Neural networks consist of interconnected nodes (neurons) organized in layers.
➢ Each connection between neurons has an associated weight, representing the strength of
that connection.
➢ When a neuron receives input, it multiplies each input value by its corresponding weight.

2. Summation:

➢ The weighted inputs are then summed together. This summation is a fundamental part of
the integration process.
➢ The sum represents the combined influence of all the inputs on that particular neuron.

3. Activation Function:

➢ The sum is then passed through an activation function.
➢ The activation function introduces non-linearity into the network, allowing it to learn
complex patterns.
➢ Common activation functions include ReLU, sigmoid, and tanh, each with its own
characteristics.

4. Output:

➢​ The output of the activation function is the neuron's output, which then becomes an input
for the next layer of neurons (or the final output of the network).

Example:

Let's say a neuron has two inputs, x1 and x2, with corresponding weights w1 and w2. The
integration process would be:

● Calculate the weighted inputs: (x1 * w1) and (x2 * w2).
● Sum the weighted inputs: (x1 * w1) + (x2 * w2).
● Pass the sum through an activation function.
● The result is the neuron's output.
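The same example can be written as a short Python sketch. The numeric values of x1, x2, w1, and w2 are made up, and the three activation functions are the ones named earlier in this section.

```python
import math


def integrate(inputs, weights):
    """Integration function: the sum of each input times its weight."""
    return sum(x * w for x, w in zip(inputs, weights))


activations = {
    "relu": lambda s: max(0.0, s),
    "sigmoid": lambda s: 1.0 / (1.0 + math.exp(-s)),
    "tanh": math.tanh,
}

# Hypothetical values for the two inputs and their weights.
x1, x2 = 0.5, -1.0
w1, w2 = 0.8, 0.3

s = integrate([x1, x2], [w1, w2])  # (0.5 * 0.8) + (-1.0 * 0.3)
outputs = {name: f(s) for name, f in activations.items()}
print(s, outputs)
```

The integration step is identical in all three cases; only the activation function applied to the sum changes the neuron's output.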

Types of Integration in Neural Networks:

While the summation of weighted inputs is the most common form of integration, other
methods exist, such as:

Convolutional Operations:​
Used in convolutional neural networks (CNNs) to extract features from images by sliding a filter across
the input.

Recurrent Connections:​
Used in recurrent neural networks (RNNs) to process sequential data by feeding the output of a layer
back into itself.
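Both alternatives can be sketched in one dimension. The code below is a simplified illustration, not how any particular library implements them: `conv1d` slides a small filter over a list of numbers without padding or flipping the filter, and `rnn_step` assumes a single recurrent unit with a tanh activation. All names and numeric values are hypothetical.

```python
import math


def conv1d(signal, kernel):
    """Slide the kernel across the signal; at each position take a weighted sum."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def rnn_step(x, h_prev, w_x, w_h, b):
    """Combine the current input with the unit's own previous output."""
    return math.tanh(w_x * x + w_h * h_prev + b)


# A filter that responds to changes between neighbouring values.
edges = conv1d([0, 0, 1, 1, 1, 0], [-1, 1])

# Process a short sequence, feeding each output back in at the next step.
h = 0.0
states = []
for x in [1.0, 0.5, -1.0]:
    h = rnn_step(x, h, w_x=0.5, w_h=0.8, b=0.0)
    states.append(h)

print(edges, states)
```

In the convolution the same two filter weights are reused at every position, while in the recurrent step the same weights are reused at every time step; in both cases the core operation is still a weighted sum.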
