CH 06 Introduction To Neural Networks
Baikuntha Acharya
Senior Lecturer / Deputy Head,
Research Coordinator
Department of Electronics & Computer Engineering,
Sagarmatha Engineering College, Lalitpur
www.linkedin.com/in/baikunth2a
www.baikunthaacharya.com.np
Brain wins: Memory of real-world facts Computer wins: Memory of arbitrary details
© Baikuntha Acharya ([email protected])
Introduction to Neural Networks
✓ What is a neuron?
• A neuron is a nerve cell that transmits information through electrical and
chemical signals. It is the basic unit of the nervous system.
• Consists of a cell body, dendrites (receive signals), and an axon (sends signals).
• “. . . (neural nets) refer to machines that have a structure that, at some level,
reflects what is known of the structure of the brain . . .”
• neural networks
✓ Creation:
• 1890: William James - defined a neuronal process of learning
• 1911: Ramón y Cajal introduced the idea of a neuron (brain cell)
✓ Promising Technology:
• 1943: McCulloch and Pitts - earliest mathematical models
• 1949: Hebbian learning by Donald Hebb - 'Neurons that fire together, wire together'
✓ Disenchantment:
• 1969: Minsky and Papert showed that single-layer perceptrons have severe limitations (they cannot solve the non-linearly separable XOR problem)
✓ Re-emergence:
• 1985: Multi-layer nets that use back-propagation
[Figure: a single neuron. Inputs are multiplied by weights W1, W2, …, Wn-1, Wn, summed (Σ), and passed through an activation function f(n) to produce the outputs.]
Introduction to Neural Networks (Cont..)
✓ Mathematical Formulation: $Y = f(WX + b)$
• where 𝑋 is the input vector, 𝑊 is the weight matrix, 𝑏 is the bias vector, and 𝑓 is the activation
function applied element-wise.
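A minimal NumPy sketch of this formulation for a single layer. The sigmoid activation and all numeric values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def dense_forward(W, X, b, f=sigmoid):
    """Compute Y = f(WX + b) for one layer."""
    return f(W @ X + b)

# Illustrative values (assumed): 3 inputs feeding 2 neurons.
X = np.array([1.0, 2.0, 3.0])            # input vector, shape (3,)
W = np.array([[0.2, -0.1, 0.4],          # weight matrix, shape (2, 3)
              [0.5, 0.3, -0.2]])
b = np.array([0.1, -0.3])                # bias vector, shape (2,)

pre_activation = W @ X + b               # weighted sums plus bias: [1.3, 0.2]
Y = dense_forward(W, X, b)               # activation applied element-wise
print(pre_activation, Y)
```

Each row of W holds the weights of one neuron, so the output has one entry per neuron.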
Introduction to Neural Networks (Cont..)
ANN Matrix Representation
Components of ANN
• One or More Hidden Layers: Layers between the input and output layers that
process the data. These layers contain neurons that apply weighted sums and
activation functions to the inputs they receive.
• Output Layer: Produces the final result (e.g., a classification label or a predicted
value). The number of neurons in this layer depends on the number of possible
outputs.
Components of ANN (Cont..)
4. Bias: A learnable offset added to the weighted sum of each neuron's inputs,
allowing the network to better fit the data.
Activation Function
✓ Activation functions are mathematical functions applied to a
neuron's weighted input sum to introduce non-linearity, enabling neural
networks to learn complex patterns and make decisions.
• Introduce Non-Linearity to ML Models:
• Prevents networks from collapsing into simple linear models.
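A small sketch of why the non-linearity matters: two stacked layers without an activation are equivalent to one linear layer, while inserting ReLU breaks that equivalence. The weights and input below are hand-picked illustrative values (assumed).

```python
import numpy as np

def relu(z):
    """Element-wise ReLU activation."""
    return np.maximum(z, 0.0)

# Illustrative two-layer network (values are assumptions).
x = np.array([1.0, -2.0])
W1, b1 = np.array([[1.0, 0.0], [0.0, 1.0]]), np.zeros(2)
W2, b2 = np.array([[1.0, 1.0]]), np.zeros(1)

# Two layers with no activation in between...
two_linear_layers = W2 @ (W1 @ x + b1) + b2

# ...collapse into one linear layer with merged weights and bias.
W_merged = W2 @ W1
b_merged = W2 @ b1 + b2
one_linear_layer = W_merged @ x + b_merged

# With ReLU between the layers the result differs from the merged layer.
with_relu = W2 @ relu(W1 @ x + b1) + b2

print(two_linear_layers, one_linear_layer, with_relu)
```

Here the linear stack and the merged layer both give -1, while the ReLU network gives 1 because the negative hidden value is clipped to zero.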
Activation Function (Cont..)
Common Activation Functions
Working of ANN
How does ANN work?
• Backpropagation: This is the learning phase where the error from the output is
propagated backward through the network. The goal is to minimize the error by
adjusting the weights using optimization algorithms like Gradient Descent.
• The gradient (the rate of change of the loss) is calculated with respect to each weight,
and the weights are updated to minimize the loss.
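The backward pass can be illustrated on the smallest possible case: one sigmoid neuron with a squared-error loss. This is a sketch under assumed values (the inputs, target, and learning rate are hypothetical); the chain-rule gradient is checked against a finite-difference estimate.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Squared error of a single sigmoid neuron."""
    return (sigmoid(w * x + b) - y) ** 2

def gradients(w, b, x, y):
    """Propagate the output error backward with the chain rule."""
    y_hat = sigmoid(w * x + b)           # forward pass
    dL_dyhat = 2.0 * (y_hat - y)         # d(loss) / d(output)
    dyhat_dz = y_hat * (1.0 - y_hat)     # derivative of the sigmoid
    dL_dz = dL_dyhat * dyhat_dz
    return dL_dz * x, dL_dz              # dL/dw, dL/db

# Hypothetical weight, bias, input, and target.
w, b, x, y = 0.5, -0.2, 1.5, 1.0
dw, db = gradients(w, b, x, y)

# Central finite differences as an independent check of the gradients.
eps = 1e-6
dw_numeric = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
db_numeric = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)

# One gradient-descent step should reduce the loss.
eta = 0.1
w_new, b_new = w - eta * dw, b - eta * db
loss_before, loss_after = loss(w, b, x, y), loss(w_new, b_new, x, y)
print(dw, dw_numeric, loss_before, loss_after)
```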
Training of ANN
Concepts:
• Epoch: One complete cycle through the entire training dataset.
• Learning Rate: determines how much the weights are adjusted during each step of training
$$\theta \leftarrow \theta - \eta \cdot \nabla_{\theta}\,\mathrm{MSE}$$
θ represents the model parameters (weights and biases), η is the learning rate.
• Weight Update: Update parameters using gradient descent (or its variants).
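A minimal sketch of these concepts together: a linear model trained with full-batch gradient descent on MSE, where each loop iteration is one epoch. The toy data, learning rate, and epoch count are assumptions chosen for illustration.

```python
import numpy as np

# Toy data generated from y = 2x + 1 (assumed for illustration).
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * X + 1.0

theta = np.zeros(2)       # theta = [weight, bias]
eta = 0.05                # learning rate
n_epochs = 1000           # each epoch uses the whole dataset once

def mse(theta):
    pred = theta[0] * X + theta[1]
    return np.mean((pred - y) ** 2)

initial_mse = mse(theta)
for epoch in range(n_epochs):
    error = theta[0] * X + theta[1] - y
    grad = np.array([2.0 * np.mean(error * X),    # dMSE / dweight
                     2.0 * np.mean(error)])       # dMSE / dbias
    theta = theta - eta * grad                    # theta <- theta - eta * grad
final_mse = mse(theta)

print(theta, initial_mse, final_mse)
```

A much larger learning rate would make the updates overshoot and diverge on this data; a much smaller one would need more epochs to reach the same fit.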
ANN Architectures
Feedforward Neural Networks (FNN/MLP)
• Each neuron computes a weighted sum of inputs, adds a bias, and applies a non-linear
activation function.
✓ Key Features:
• Fully connected layers.
✓ Applications:
[Figure: feedforward network with inputs, a first hidden layer, a second hidden layer, and an output layer]
ANN Architectures (Cont..)
Convolutional Neural Networks (CNN)
• Fully connected layers combine the learned features for final prediction.
✓ Key Features:
• Hierarchical feature extraction
from simple edges to complex
shapes.
✓ Applications:
• Image and video recognition,
object detection, and
segmentation.
ANN Architectures (Cont..)
Convolutional Neural Networks (CNN)
✓ Example: Handwritten digit prediction
• CNN architecture for digit recognition. The diagram represents a convolutional neural network (CNN)
processing a 28×28 grayscale image of a handwritten digit.
• Conv_1 & Conv_2: Two convolution layers with a 5×5 kernel and valid padding extract features.
• Max-Pooling: Reduces spatial dimensions using 2×2 pooling after each convolution layer.
• Flattening: Converts feature maps into a single vector for fully connected layers.
• Fully Connected Layers: Two dense layers with ReLU activation and dropout for classification.
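The spatial sizes in this architecture can be traced with the standard output-size formula. This sketch assumes stride 1 for the convolutions and stride 2 for the 2×2 pooling; the slide does not state strides or channel counts, so those are assumptions.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                                   # 28x28 grayscale input
trace = [("input", size)]
for name, kernel, stride in [
    ("conv_1 (5x5, valid)", 5, 1),          # valid padding means padding = 0
    ("max_pool_1 (2x2)", 2, 2),
    ("conv_2 (5x5, valid)", 5, 1),
    ("max_pool_2 (2x2)", 2, 2),
]:
    size = conv_output_size(size, kernel, stride)
    trace.append((name, size))

for name, s in trace:
    print(f"{name}: {s}x{s}")
```

Under these assumptions the feature maps shrink 28 → 24 → 12 → 8 → 4, so flattening yields 4 × 4 values per channel of the second convolution layer.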
ANN Architectures (Cont..)
Recurrent Neural Networks (RNN)
✓ RNNs are designed for sequential data, using loops to retain information
from previous inputs for context.
• Processes sequences by maintaining a hidden state updated at each time step.
• Variants like LSTM and GRU incorporate mechanisms to handle long-range dependencies.
✓ Key Features:
• Captures temporal information, and has memory of past information through internal loops.
✓ Applications:
• Natural language processing (NLP), speech recognition, and time-series forecasting.
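A minimal sketch of the hidden-state update in a vanilla RNN, using the common form h_t = tanh(W_x x_t + W_h h_{t-1} + b). The weights and sequences are hand-picked illustrative assumptions; the point is that an input at the first time step still influences the state after later zero inputs.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a vanilla RNN over a sequence and return every hidden state."""
    h = np.zeros(W_h.shape[0])               # initial hidden state
    states = []
    for x_t in inputs:                       # one update per time step
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)

# Illustrative parameters (assumed): 1 input feature, 2 hidden units.
W_x = np.array([[1.0], [0.5]])
W_h = np.array([[0.5, 0.0], [0.0, 0.5]])
b = np.zeros(2)

pulse = np.array([[1.0], [0.0], [0.0]])      # signal only at the first step
silence = np.zeros((3, 1))                   # no signal at all

states_pulse = rnn_forward(pulse, W_x, W_h, b)
states_silence = rnn_forward(silence, W_x, W_h, b)
print(states_pulse)
print(states_silence)
```

The last hidden state of the pulse sequence stays non-zero even though its last two inputs are zero, which is the "memory" carried through the loop. The state also decays at each step, which hints at why plain RNNs struggle with long-range dependencies.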
ANN Architectures (Cont..)
Autoencoders
✓ Applications:
• Data compression, dimensionality reduction and feature extraction, anomaly detection, and
image denoising.
ANN Architectures (Cont..)
Generative Adversarial Networks (GANs)
✓ Applications:
• Image synthesis, style transfer, and data augmentation.
ANN Architectures (Cont..)
Auto-Encoder & GAN – Case Study
https://doi.org/10.3390/pr11020330
ANN Architectures (Cont..)
Transformers
✓ Key Features:
• Efficient handling of long-range dependencies.
✓ Transformer models:
• BERT, GPT, and T5
✓ Applications:
• NLP, translation, vision, bioinformatics.
Basic ANN Implementation – Python Keras
✓ What is Keras?
• Keras is an open-source deep learning library that provides a user-friendly API for
building and training neural networks. It runs on top of TensorFlow (Keras 3 also
supports JAX and PyTorch backends) and simplifies the process of developing AI models.
• Key Features:
• Easy-to-use and modular.
Basic ANN Implementation – Python Keras
Keras Implementation Guide
Basic ANN Implementation – Python Keras
Minimal Code Example
Advantages of ANN
Applications of ANN in Chemical Engineering
Process Optimization
Applications of ANN in Chemical Engineering
Process Optimization - Example
• Data Collection: Gather past reactor data, including yield and operating
conditions.
• Implementation: Apply the optimized conditions, monitor yield, and retrain the
ANN for continuous improvement.
Applications of ANN in Chemical Engineering
Process Optimization – Case Study
https://doi.org/10.3390/pr11020330
Applications of ANN in Chemical Engineering
Predictive Maintenance
2. Time-Series Analysis – Use ANN models (or variants such as RNNs and LSTMs) to detect
anomalies in sensor readings.
4. Decision Making – Generate alerts for preventive actions before failure occurs.
• Trend: A gradual increase in vibration, temperature, and pressure, together with a slight
drop in flow rate, suggests equipment wear or imbalance.
✓ A predictive model (e.g., LSTM) trained on the data can forecast that
these trends will continue, signaling the need for maintenance.
Applications of ANN in Chemical Engineering
Predictive Maintenance – Case Study
https://doi.org/10.1016/j.eswa.2021.114598
Clustering in ML
Unsupervised: K-Means Clustering
• Let $X = \{x_1, x_2, \dots, x_n\}$ be the dataset with n data points. The objective of K-Means is to
minimize the following cost function:
$$J = \sum_{i=1}^{K} \sum_{x \in C_i} \lVert x - \mu_i \rVert^2$$
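A short sketch of this cost function, taking each cluster as a list of points and using the cluster mean as its centroid. The two groupings below are hypothetical data chosen so the values are easy to check by hand.

```python
def kmeans_cost(clusters):
    """Sum of squared Euclidean distances from each point to its cluster mean."""
    total = 0.0
    for points in clusters:
        dim = len(points[0])
        # Centroid mu_i: coordinate-wise mean of the cluster's points.
        mu = [sum(p[j] for p in points) / len(points) for j in range(dim)]
        total += sum(
            sum((p[j] - mu[j]) ** 2 for j in range(dim)) for p in points
        )
    return total

# Same four points grouped two different ways (hypothetical data).
tight = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
loose = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]

print(kmeans_cost(tight))   # 4.0
print(kmeans_cost(loose))   # 100.0
```

The grouping that keeps nearby points together has the lower J, which is the quantity K-Means tries to reduce.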
Unsupervised: K-Means Clustering (Cont..)
Algorithm Steps
1. Initialization:
• Choose k initial centroids randomly (or via methods like k-means++ for better initialization).
2. Distance Calculation:
• For each data point, calculate the distance to each centroid using Euclidean distance, where j indexes the coordinates:
$$d(x, \mu_i) = \sqrt{\sum_{j=1}^{n} (x_j - \mu_{ij})^2}$$
3. Cluster Assignment
• Assign each data point to the closest centroid.
4. Update Step:
• Recalculate the centroids as the mean of all points assigned to each cluster.
$$\mu_i = \frac{1}{|C_i|} \sum_{x \in C_i} x$$
5. Convergence Check:
• Repeat the assignment and update steps until centroids do not change significantly or a
maximum number of iterations is reached.
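The five steps above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: it uses plain random initialization, keeps the old centroid if a cluster becomes empty (an assumed policy), and the function name, tolerance, and sample data are hypothetical.

```python
import numpy as np

def kmeans(X, k, max_iter=100, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Initialization: pick k distinct data points as starting centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # 2. Distance calculation: every point to every centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        # 3. Cluster assignment: index of the closest centroid.
        labels = distances.argmin(axis=1)
        # 4. Update: mean of the points assigned to each cluster.
        new_centroids = np.array([
            X[labels == i].mean(axis=0) if np.any(labels == i) else centroids[i]
            for i in range(k)
        ])
        # 5. Convergence check: stop when centroids barely move.
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels

# Two well-separated groups of points (hypothetical data).
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]], dtype=float)
centroids, labels = kmeans(X, k=2)
print(labels)
print(centroids)
```

Which group receives label 0 depends on the random initialization, so only the grouping itself is meaningful.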
Unsupervised: K-Means Clustering (Cont..)
Numerical Example
✓ Use the k-means algorithm and Euclidean distance to cluster the following
8 examples into 3 clusters:
• A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4), A7=(1,2), A8=(4,9).
• Suppose that the initial seeds (centers of each cluster) are A1, A4 and A7.
Unsupervised: K-Means Clustering (Cont..)
Numerical Example (Cont..)
✓ Epoch 1
Unsupervised: K-Means Clustering (Cont..)
Numerical Example (Cont..)
4. Next Epoch - 2:
• Clusters → C1: {A1, A8}, C2: {A3, A4, A5, A6}, C3: {A2, A7}
• Centroids → C1 (3,9.5), C2 (6.5,5.25), C3 (1.5,3.5)
5. Next Epoch - 3:
• Clusters → C1: {A1, A4, A8}, C2: {A3, A5, A6}, C3: {A2, A7}
• Centroids → C1 (3.67,9), C2 (7,4.33), C3 (1.5,3.5)
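The epochs of this example can be reproduced with a short pure-Python trace, using the eight points and the seeds A1, A4, and A7 from the problem statement. The loop structure and variable names are illustrative assumptions; it stops when the centroids no longer change.

```python
import math

points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}
centroids = [points["A1"], points["A4"], points["A7"]]   # initial seeds

history = []                     # (clusters, centroids) after each epoch
for epoch in range(1, 11):
    # Assignment: each point joins the cluster of its nearest centroid.
    clusters = [[] for _ in centroids]
    for name, p in points.items():
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(name)
    # Update: each centroid becomes the mean of its assigned points.
    new_centroids = [
        tuple(sum(points[n][d] for n in members) / len(members) for d in range(2))
        for members in clusters
    ]
    history.append((clusters, new_centroids))
    rounded = [tuple(round(c, 2) for c in mu) for mu in new_centroids]
    print(f"Epoch {epoch}: clusters={clusters} centroids={rounded}")
    if new_centroids == centroids:   # converged: nothing moved
        break
    centroids = new_centroids
```

Run this way, epochs 2 and 3 match the clusters and centroids listed above, and a fourth epoch leaves the assignment unchanged, so the algorithm has converged.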
Unsupervised: K-Means Clustering (Cont..)
How to choose the value of K?
• Silhouette Score
• Measures how well-separated the clusters are.
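A minimal pure-Python sketch of the silhouette score, using the standard per-point definition s = (b − a) / max(a, b), where a is the mean distance to points in the same cluster and b is the mean distance to the nearest other cluster. It assumes at least two clusters and scores singleton clusters as 0 (a common convention); the sample points and labelings are hypothetical.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (requires >= 2 clusters)."""
    scores = []
    for i, p in enumerate(points):
        same = [math.dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        if not same:                       # singleton cluster: score 0
            scores.append(0.0)
            continue
        a = sum(same) / len(same)          # cohesion within own cluster
        b = min(                           # separation from nearest other cluster
            sum(math.dist(p, q) for j, q in enumerate(points) if labels[j] == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Four points labelled two different ways (hypothetical data).
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])   # nearby points grouped together
bad = silhouette_score(points, [0, 1, 0, 1])    # distant points grouped together
print(good, bad)
```

Scores close to 1 indicate compact, well-separated clusters and negative scores indicate points that sit closer to another cluster. To choose K, one would compute this score for several values of K and prefer the highest.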
K-means Clustering Example
Anomaly Detection - Example