UNIT – V
1]. Neural Networks
A Neural Network is a machine learning model inspired by the structure and
function of the human brain. It consists of layers of connected units called
neurons, which process input data and learn patterns through weighted
connections.
Basic Structure
Component Description
Input Layer Takes in raw features
Hidden Layer(s) Learns representations (1 or more layers)
Output Layer Produces prediction (e.g., classification or regression)
Weights & Bias Learnable parameters
Activation Function Introduces non-linearity
Biological vs Artificial Neuron :
Biological Neuron Artificial Neuron (Perceptron)
Dendrites (inputs) Input features x1,x2,...,xn
Soma (cell body) Weighted sum z=∑wi xi + b
Axon (output signal) Output y=ϕ(z) (activation function)
Feedforward Neural Network (FNN)
• Information flows one way: Input → Hidden → Output
• Trained using backpropagation and gradient descent
Mathematics of a Single Neuron
z = ∑ wi xi + b,  y = ϕ(z)
Where:
• xi : input
• wi : weight
• b : bias
• ϕ : activation function
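A minimal sketch in Python (NumPy) of the formula above; the input, weight, and bias values are made up purely for illustration:

import numpy as np

def sigmoid(z):
    # Activation function ϕ: squashes z into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative values (not from the text)
x = np.array([0.5, 0.2, 0.1])   # inputs x1, x2, x3
w = np.array([0.4, -0.6, 0.9])  # weights w1, w2, w3
b = 0.1                         # bias

z = np.dot(w, x) + b            # weighted sum: z = ∑ wi xi + b
y = sigmoid(z)                  # neuron output: y = ϕ(z)
print(y)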
Activation Functions
Function   Formula                                     Use Case
Sigmoid    σ(z) = 1 / (1 + e^(-z))                     Binary classification
ReLU       ReLU(z) = max(0, z)                         Most common in hidden layers
Tanh       tanh(z) = (e^z - e^(-z)) / (e^z + e^(-z))   Range (-1, 1)
Softmax    softmax(z)i = e^(zi) / ∑j e^(zj)            Multiclass classification
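As a quick illustration, the four functions above can be written directly in NumPy (a sketch, not a reference implementation from any library):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # output in (0, 1)

def relu(z):
    return np.maximum(0, z)           # zero for negative inputs

def tanh(z):
    return np.tanh(z)                 # output in (-1, 1)

def softmax(z):
    e = np.exp(z - np.max(z))         # shift for numerical stability
    return e / e.sum()                # outputs sum to 1 (class probabilities)

print(softmax(np.array([2.0, 1.0, 0.1])))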
Learning Process (Training)
1. Forward Propagation: Compute output from input
2. Loss Function: Measure error (e.g., MSE, cross-entropy)
3. Backpropagation: Compute gradients
4. Update Weights: Using Gradient Descent
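A minimal sketch of one such training iteration for a single sigmoid neuron (NumPy only; the example, label, and learning rate are made-up illustrative values):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 0.2, 0.1])   # one training example (illustrative)
t = 1.0                         # its true label
w = np.zeros(3)                 # weights to learn
b = 0.0                         # bias
lr = 0.1                        # learning rate

# 1. Forward propagation
y = sigmoid(np.dot(w, x) + b)

# 2. Loss (binary cross-entropy for this example)
loss = -(t * np.log(y) + (1 - t) * np.log(1 - y))

# 3. Backpropagation: gradients of the loss w.r.t. w and b
grad_w = (y - t) * x            # dL/dw for sigmoid + cross-entropy
grad_b = (y - t)                # dL/db

# 4. Gradient descent update
w -= lr * grad_w
b -= lr * grad_b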
Loss Functions
Task Type Loss Function
Regression Mean Squared Error (MSE)
Classification Cross Entropy Loss
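Both losses are simple to compute directly; here is a sketch with made-up predictions and targets:

import numpy as np

# Regression: Mean Squared Error
y_true = np.array([3.0, 5.0, 2.5])
y_pred = np.array([2.8, 5.4, 2.0])
mse = np.mean((y_true - y_pred) ** 2)

# Classification: Cross-Entropy (one-hot true class vs. predicted probabilities)
p_true = np.array([0.0, 1.0, 0.0])
p_pred = np.array([0.1, 0.8, 0.1])
cross_entropy = -np.sum(p_true * np.log(p_pred))

print(mse, cross_entropy)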
Types of Neural Networks :
Type Application
Feedforward NN Basic structure, classification
CNN Image processing, vision
RNN / LSTM Sequence data, NLP
Autoencoders Feature learning, compression
GANs Data generation
Advantages
• Can model complex, non-linear functions
• Highly flexible and scalable
• Performs well on large datasets
Disadvantages
• Requires large amount of data
• Computationally expensive
• Prone to overfitting if not regularized
Real-World Applications
• Image and speech recognition
• Natural language processing (NLP)
• Medical diagnosis
• Self-driving cars
• Fraud detection
2]. Biological Motivation for Neural Networks
Why Biological Motivation?
Artificial Neural Networks (ANNs) were originally inspired by how the human
brain works. The goal was to build machines that learn and make decisions like
biological brains by mimicking neurons and synaptic connections.
Biological Neuron vs Artificial Neuron
Biological Neuron                               Artificial Neuron (Perceptron)
Dendrites: Receive signals from other neurons   Inputs: Features like x1, x2, ..., xn
Cell Body (Soma): Processes signals             Weighted Sum: z = ∑ wi xi + b
Axon: Sends signal to the next neuron           Output: y = ϕ(z), an activation function
Synapse: Strength of connection                 Weight (w): Determines importance of an input
How the Brain Learns (Biological View)
• Synapses strengthen/weaken with use (Hebbian learning: “cells that fire
together, wire together”)
• Learning happens through adjusting connection strengths
• Neurons fire if input exceeds a threshold
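A toy sketch of the Hebbian idea: a weight grows when its input and the neuron's output are active at the same time. The update rule Δw = η · x · y used here only illustrates the biological intuition; it is not the rule used to train modern ANNs (which rely on backpropagation):

import numpy as np

x = np.array([1.0, 0.0, 1.0])   # pre-synaptic activity (inputs)
w = np.array([0.2, 0.2, 0.2])   # synaptic strengths (weights)
eta = 0.1                       # learning rate

y = float(np.dot(w, x) > 0.3)   # the neuron "fires" if its input exceeds a threshold

# Hebbian update: strengthen weights whose inputs were active while the neuron fired
w += eta * x * y
print(w)  # weights on the active inputs grow; the inactive one is unchanged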
Mapping to Artificial Neural Networks
Biological Concept ANN Equivalent
Neurons Units or nodes in a layer
Synapse strength Weights (w)
Firing threshold Activation function
Neural activation Output of the neuron
Learning Adjusting weights (via backpropagation)
3]. Limitations of Machine Learning
While Machine Learning (ML) is powerful, it has several limitations that affect
its reliability, scalability, and applicability in real-world scenarios.
1. Requires Large Amounts of Data
• ML models need large, diverse, and high-quality datasets to perform
well.
• With insufficient or biased data, models can underperform or give
incorrect predictions.
2. Data Quality Issues
• ML is highly sensitive to:
o Noisy data
o Missing values
o Imbalanced datasets
• Poor data = poor model, regardless of algorithm complexity.
3. Interpretability / Explainability
• Many ML models (e.g., neural networks, ensemble methods) are black
boxes.
• Hard to explain why a model gave a specific output.
• A challenge in critical applications like healthcare and law.
4. Generalization Problems
• A model trained on a specific dataset may:
o Overfit: Too specific to training data
o Underfit: Too simplistic to capture patterns
• Can struggle to generalize to new/unseen data.
5. High Computational Cost
• Training complex models (e.g., deep learning) requires:
o High processing power
o GPUs/TPUs
o Large memory
• Not feasible on low-resource devices.
6. Dependency on Feature Engineering
• In traditional ML, model performance heavily depends on:
o Selecting the right features
o Manual data preprocessing
• This process is time-consuming and domain-dependent.
7. Vulnerable to Adversarial Attacks
• Small, unnoticeable changes in input can mislead ML models, especially
in image and NLP tasks.
• Security concern in areas like self-driving cars, facial recognition.
8. Ethical and Bias Concerns
• Models can inherit bias from training data (e.g., gender/racial bias).
• Raises ethical concerns and fairness issues in decision-making.
9. Lack of Causal Understanding
• ML learns correlations, not causal relationships.
• Cannot answer “why” something happened — only predicts “what” will
happen.
10. Continuous Maintenance
• ML models may become outdated due to changing data patterns (called
data drift).
• Requires frequent re-training, monitoring, and tuning.
4]. Deep Learning
• Deep Learning is a subset of Machine Learning that uses artificial neural
networks with multiple layers (deep architectures) to model complex
patterns in data.
• Deep Learning automatically learns hierarchical representations from
data — from low-level features to high-level concepts.
Basic Architecture
• Input Layer: Raw data input
• Hidden Layers (multiple): Feature extraction
• Output Layer: Prediction (class, value, etc.)
Each layer is made of neurons (units), with weights and activation functions.
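A minimal NumPy sketch of this layered structure (layer sizes and random weights are purely illustrative):

import numpy as np

def relu(z):
    return np.maximum(0, z)

rng = np.random.default_rng(0)

x = rng.normal(size=4)                          # input layer: 4 raw features

W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)   # hidden layer 1 parameters
W2, b2 = rng.normal(size=(8, 8)), np.zeros(8)   # hidden layer 2 parameters
W3, b3 = rng.normal(size=(1, 8)), np.zeros(1)   # output layer parameters

h1 = relu(W1 @ x + b1)   # hidden layer 1: feature extraction
h2 = relu(W2 @ h1 + b2)  # hidden layer 2: higher-level features
y = W3 @ h2 + b3         # output layer: prediction (e.g., a regression value)
print(y)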
How Deep Learning Works
1. Input: Raw data (e.g., images, text, sound)
2. Forward Pass: Compute weighted sum & activation
3. Loss Calculation: Compare prediction vs. true value
4. Backpropagation: Adjust weights using gradient descent
Popular Deep Learning Architectures :
Model          Description                         Use Case
DNN            Basic deep neural network           Tabular data, general tasks
CNN            Convolutional Neural Network        Image recognition, vision
RNN / LSTM     Recurrent Neural Network (memory)   Sequence data, text, time series
Autoencoders   Learn a compressed representation   Anomaly detection, denoising
GANs           Generative Adversarial Networks     Image generation, deepfakes
Transformers   Attention-based models              NLP (BERT, GPT), translation
Applications of Deep Learning
• Image & speech recognition (e.g., Google Photos, Siri)
• NLP: translation, summarization, chatbots (e.g., ChatGPT)
• Medical image analysis
• Autonomous vehicles
• Recommender systems
Deep Learning vs Machine Learning
Feature               Machine Learning        Deep Learning
Feature engineering   Manual                  Automatic (learned from data)
Data requirement      Works with small data   Needs large datasets
Performance           Good                    State-of-the-art on complex tasks
Interpretability      Easier                  Harder to interpret
Advantages
• Best performance on unstructured data (images, text, audio)
• Learns complex patterns automatically
• Highly scalable with GPUs
Limitations
• Needs huge data and computing power
• Often a black-box
• Requires tuning many hyperparameters
5]. Convolutional Neural Network
• A Convolutional Neural Network (CNN) is a type of artificial neural network that is particularly effective at analyzing visual imagery. It is a deep learning architecture designed to process and understand images by sliding small learned filters (kernels) over the input to detect local patterns such as edges, textures, and shapes.
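To make the idea concrete, here is a small sketch of the core operation a CNN repeats: sliding a filter (kernel) over an image and computing a weighted sum at each position. The 3×3 filter below is a hand-picked vertical-edge detector used only for illustration; in a real CNN the filter values are learned:

import numpy as np

def conv2d(image, kernel):
    # Valid (no padding) 2D convolution over a grayscale image
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)   # dark left half, bright right half

kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)    # responds to vertical edges

print(conv2d(image, kernel))  # large-magnitude values mark the dark/bright boundary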
6]. Recurrent Neural Networks (RNN)
• Recurrent Neural Networks (RNNs) are a class of artificial neural networks
where connections between nodes can create cycles, allowing the output
from previous steps to be used as input for the current step. This structure
enables RNNs to maintain a form of memory of previous inputs, making
them particularly effective for sequential data tasks like time series
prediction, speech recognition, and Natural Language Processing (NLP).
Hidden State Update
ht = activation(Wh · [ht-1, xt] + bh)
• ht: The hidden state at the current time step t. It's like the RNN's memory
of the sequence up to this point.
• xt: The current piece of input data (e.g., a word or character).
• ht-1: The hidden state from the previous time step t-1. This is the memory
from before.
• Wh: A weight matrix that helps combine the current input xt and the
previous hidden state ht-1.
• bh: A bias term that adds a bit of adjustment to the calculation.
• activation: An activation function (like tanh or ReLU) that processes the
combined input.
Output Calculation
yt = Wy · ht + by
• yt: The output at the current time step t.
• ht: The hidden state we just calculated.
• Wy: A weight matrix used to transform the hidden state into the final
output.
• by: A bias term added to the output.
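A minimal NumPy sketch of one RNN time step implementing the two formulas above (sizes and random weights are illustrative):

import numpy as np

rng = np.random.default_rng(0)
hidden_size, input_size, output_size = 4, 3, 2

# Wh combines [h_{t-1}, x_t]; Wy maps the hidden state to the output
Wh = rng.normal(size=(hidden_size, hidden_size + input_size))
bh = np.zeros(hidden_size)
Wy = rng.normal(size=(output_size, hidden_size))
by = np.zeros(output_size)

h_prev = np.zeros(hidden_size)       # h_{t-1}: memory from before
x_t = rng.normal(size=input_size)    # x_t: current input (e.g., one word's vector)

# Hidden state update: h_t = tanh(Wh · [h_{t-1}, x_t] + bh)
h_t = np.tanh(Wh @ np.concatenate([h_prev, x_t]) + bh)

# Output calculation: y_t = Wy · h_t + by
y_t = Wy @ h_t + by
print(h_t, y_t)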
Backward pass:
Gradient Descent
Wnew = W - η · (∂L/∂W)
Where η is the learning rate, ∂L/∂W is the gradient of the loss with respect to the weight W, and Wnew is the updated weight.
7]. Common Use Cases of RNNs
1. Natural Language Processing (NLP)
Task Description
Text Generation Generate next word or sentence (e.g., story
generation)
Sentiment Analysis Predict if text expresses positive/negative feeling
Language Translate sentence from one language to another
Translation
Speech-to-Text Convert spoken language to written text
RNNs capture context and word order — vital for understanding language.
2. Time Series Forecasting :
• Stock price prediction
• Weather forecasting
• Energy consumption prediction
RNNs learn from past time steps to forecast future values.
3. Music & Audio Modeling :
• Music generation
• Voice recognition
• Speech synthesis
Music and sound are inherently sequential; RNNs model time-dependent
patterns.
4. Video & Frame Prediction :
• Human action recognition in videos
• Next-frame prediction in a video sequence
RNNs process video frames as time steps, learning temporal dependencies.
5. Chatbots & Virtual Assistants :
• Power AI assistants like Siri, Alexa, and Google Assistant
• Understand and respond to user conversations
RNNs help maintain context over a conversation (memory of past inputs).