Deep Learning Lab Manual 2024-25
Certificate
Seal of
Institution
INSTRUCTION FOR STUDENTS
Students shall read the points given below to understand the theoretical concepts and practical applications.
1) Listen carefully to the lecture given by the teacher about the importance of the subject, curriculum philosophy, learning structure, skills to be developed, information about equipment, instruments, procedures, the method of continuous assessment, the tentative plan of work in the laboratory and the total amount of work to be done in a semester.
2) Students shall undergo a study visit of the laboratory to know the types of equipment, instruments and software to be used, before performing experiments.
3) Read the write-up of each experiment to be performed a day in advance.
4) Organize the work in the group and make a record of all observations.
5) Understand the purpose of the experiment and its practical implications.
6) Write the answers to the questions allotted by the teacher during practical hours if possible, or immediately afterwards.
7) Students should not hesitate to ask about any difficulty faced during the conduct of the practical/exercise.
8) Students shall study all the questions given in the laboratory manual and practice writing the answers to these questions.
9) Students shall develop maintenance skills as expected by the industries.
10) Students should develop the habit of group discussion related to the experiments/exercises so that exchange of knowledge/skills can take place.
11) Students shall attempt to develop related hands-on skills and gain confidence.
12) Students shall focus on the development of skills rather than theoretical or codified knowledge.
13) Students shall visit nearby workshops, workstations, industries, laboratories, technical exhibitions, trade fairs etc., even if not included in the lab manual. In short, students should gain exposure to their area of work right from their student days.
14) Students shall insist on the completion of the recommended laboratory work, industrial visits, answers to the given questions, etc.
15) Students shall develop the habit of evolving more ideas, innovations, skills etc. beyond those included in the scope of the manual.
16) Students shall refer to technical magazines, proceedings of seminars and websites related to the scope of the subject, and update their knowledge and skills.
17) Students should develop the habit of not depending totally on teachers but of developing self-learning techniques.
18) Students should develop the habit of interacting with the teacher without hesitation with respect to the academics involved.
19) Students should develop the habit of submitting the practicals and exercises continuously and progressively on the scheduled dates, and should get the assessment done.
20) Students should be well prepared while submitting the write-up of the exercise. This will develop continuity of studies and he/she will not be overloaded at the end of the term.
GUIDELINES FOR TEACHERS
Teachers shall discuss the following points with students before the start of practicals for the subject.
1) Learning Overview: To develop better understanding of importance of the
subject. To know related skills to be developed such as Intellectual skills and
Motor skills.
2) Learning Structure: In this, topic and sub topics are organized in systematic way
so that ultimate purpose of learning the subject is achieved. This is arranged in
the form of fact, concept, principle, procedure, application and problem.
3) Know Your Laboratory Work: To understand the layout of the laboratory, the specifications of equipment/instruments/materials, procedures, working in groups, planning time etc., and to know the total amount of work to be done in the laboratory.
4) Teachers shall ensure that the required equipment is in working condition before the start of the experiment, and shall keep the operating instruction manual available.
5) Explain prior concepts to the students before starting each experiment.
6) Involve students actively in the conduct of each experiment.
7) While taking reading/observation each student shall be given a chance to perform
or observe the experiment.
8) If the experimental set up has variations in the specifications of the equipment,
the teachers are advised to make the necessary changes, wherever needed.
9) Teacher shall assess the performance of students continuously as per the norms prescribed by the University of Mumbai and the guidelines provided by the IQAC.
10) Teacher should ensure that the respective skills and competencies are developed in the students after the completion of the practical exercise.
11) Teacher is expected to share with the students the skills and competencies to be developed.
12) Teacher may provide additional knowledge and skills to the students even though
not covered in the manual but are expected from the students by the industries.
13) Teachers shall ensure that industrial visits if recommended in the manual are
covered.
14) Teacher may suggest the students to refer additional related literature of the
Technical papers/Reference books/Seminar proceedings, etc.
15) During assessment, the teacher is expected to ask questions to the students to gauge their achievements regarding related knowledge and skills, so that students can prepare while submitting the record of the practicals. Focus should be given to the development of the enlisted skills rather than theoretical/codified knowledge.
16) Teacher should enlist the skills to be developed in the students that are expected
by the industry.
17) Teacher should organize Group discussions /brain storming sessions / Seminars
to facilitate the exchange of knowledge amongst the students.
18) Teacher should ensure that revised assessment norms are followed
simultaneously and progressively.
19) Teacher should give more focus to hands-on skills and should actually share the same.
20) Teacher shall also refer to the circulars related to practical supervision and assessment for additional guidelines.
Student’s Progress Assessments
Student Name: Roll No.:
Class/Semester: BE CS/SEM-VIII Academic Year: 2024-2025
Course Name: Deep Learning Course Code: CSDO8011
Assessment Parameters for Practicals/Mini Project/Assignments
Exp. No. | Title of Experiment | PE (Out of 3) | KT (Out of 3) | DR (Out of 3) | DN (Out of 3) | PL (Out of 3) | Total (out of 15) | Average (out of 3) | COs Covered
1 | To implement McCulloch-Pitts model for binary logic functions. | | | | | | | | CO1
2 | To implement the Perceptron algorithm to simulate any logic gate. | | | | | | | | CO1
3 | To implement Stochastic Gradient Descent learning algorithms to learn the parameters of the supervised-layer feedforward neural network. | | | | | | | | CO1
4 | To implement a backpropagation algorithm to train a DNN with at least 2 hidden layers. | | | | | | | | CO2
5 | To design the architecture and implement the autoencoder model for Image Compression. | | | | | | | | CO3
6 | To design the architecture and implement a CNN model for digit recognition applications. | | | | | | | | CO4
7 | To design and implement an LSTM for Sentiment Analysis. | | | | | | | | CO5
Mini Project | | | | | | | | | CO6
Criteria for Grading (Practicals) – Preparedness and Efforts (PE), Knowledge of Tools (KT), Debugging and Results (DR), Documentation (DN), Punctuality & Lab Ethics (PL)
Criteria for Grading (Mini Project) – Knowledge regarding design and analysis (KD), Working in a Group (WG), Presentation Skill (PS), Time Management (TM), Lifelong Learning (LL), Ethics (ET)
Assignments | TS (Out of 3) | OM (Out of 3) | NT (Out of 3) | IS (Out of 3) | Total (out of 12) | Average (out of 3) | COs Covered
Assignment No. 1 | | | | | | | CO6
Assignment No. 2 | | | | | | | CO1, CO2, CO3
Assignment No. 3 | | | | | | | CO4, CO5, CO6
Average Marks | | | | | | |
Criteria for Grading – Timely Submission (TS), Originality of the Material (OM), Neatness (NT), Innovative Solution (IS)
Grades – Meet Expectations(3 Marks), Moderate Expectations (2 Marks), Below Expectations (1 Mark)
Student’s Signature Subject In-charge Head of Department
RECORD OF PROGRESSIVE ASSESSMENTS
Student Name: Roll No.: (BE CS SEM-VIII)
Course Name: Deep Learning Course Code: CSDO8011
Assessment of Experiments (A)
Sr. No. | Name of Experiment | Page No. | Date of Performance | Date of Submission | Assessment (out of 15) | Teacher's Signature and Remark | CO Covered
1 | To implement McCulloch-Pitts model for binary logic functions. | | | | | | CO1
2 | To implement the Perceptron algorithm to simulate any logic gate. | | | | | | CO1
3 | To implement Stochastic Gradient Descent learning algorithms to learn the parameters of the supervised-layer feedforward neural network. | | | | | | CO1
4 | To implement a backpropagation algorithm to train a DNN with at least 2 hidden layers. | | | | | | CO2
5 | To design the architecture and implement the autoencoder model for Image Compression. | | | | | | CO3
6 | To design the architecture and implement a CNN model for digit recognition applications. | | | | | | CO4
7 | To design and implement an LSTM for Sentiment Analysis. | | | | | | CO5
8 | Deep Learning Application | | | | | | CO6
Average Marks (Out of 15) |
Converted Marks (Out of 10) (A) |
Assessment of Mini Project (B)
Title of Mini Project | Page No. | Date of Performance | Date of Submission | Assessment (out of 18) | Teacher's Signature and Remark | CO Covered
 | | | | | | CO6
Converted Marks (Out of 5) (B) |
Assessment of Assignments (C)
Sr. No. | Assignment | Page No. | Date of Display | Date of Completion | Assessment (Out of 12) | Teacher's Signature and Remark | CO Covered
1 | Assignment No. 1 | | | | | | CO6
2 | Assignment No. 2 | | | | | | CO1, CO2, CO3
3 | Assignment No. 3 | | | | | | CO4, CO5, CO6
Average Marks (Out of 12) |
Converted Marks (Out of 5) (C) |
Assessments of Attendance (D)
Theory Attendance: TH (out of) | TH Attended | TH %
Practical Attendance: PR (out of) | PR Attended | PR %
Avg. Attendance % (TH+PR) |
Attendance Marks (D) (Out of 5) |
Programme Outcomes (POs)
PO-5 Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools, including prediction and modelling, to complex engineering activities with an understanding of the limitations.
PO-6 The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
PO-7 Environment and sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for, sustainable development.
PO-8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
PO-9 Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.
PO-10 Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
PO-11 Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
PO-12 Life-long learning: Recognize the need for, and have the preparation and ability to engage in, independent and life-long learning in the broadest context of technological change.
Program Specific Outcomes (PSOs) defined by the programme. Baseline: Rational Unified Process (RUP)
PSO-1 System Inception and Elaboration: Conceptualize the software and/or hardware systems, system components and process/procedures through requirement analysis, modelling/design of the system using various architectural/design patterns, standard notations, procedures and algorithms.
PSO-2 System Construction: Implement the systems, procedures and processes using state-of-the-art technologies, standards, tools and programming paradigms.
PSO-3 System Testing and Deployment: Verify and validate the systems, procedures and processes using various testing and verification techniques and tools.
PSO-4 Quality and Maintenance: Manage the quality through various product development strategies under revision, transition and operation through maintainability, flexibility, testability, portability, reusability, interoperability, correctness, reliability, efficiency, integrity and usability to adapt the system to the changing structure and behaviour of the systems/environments.
Student’s Signature
DEPARTMENT OF COMPUTER ENGINEERING
Course Objectives and Outcomes
Academic Year: 2024-2025 Class: BE Course Code: CSDO8011
Program: Computer Engineering Div: Course Name: Deep Learning
Department: Computer Engineering Sem.: VIII Faculty: Prof. Neelam Phadnis
Course Objectives:
Sr. No. Statement
1. To learn the fundamentals of Neural Networks.
2. To gain an in-depth understanding of training Deep Neural Networks.
3. To acquire knowledge of advanced concepts of Convolutional Neural Networks, Autoencoders and Recurrent Neural Networks.
4. To familiarize students with recent trends in Deep Learning.
Course Outcomes:
CO No. | Abbre. | Statement
CSDO8011.1 | CO1 | Gain basic knowledge of Neural Networks.
Course Prerequisite:
Basic mathematics and Statistical concepts, Linear algebra, Machine Learning
Teaching and Examination Scheme:
Teaching Scheme (Hrs): Theory: 4 | Practical: 2 | Tutorial: -
Credits Assigned: Theory: 4 | TW/Practical: 1 | Tutorial: - | Total: 5
Examination Scheme:
  Theory Internal Assessment: Test 1: 20 | Test 2: 20 | Average: 20
  End Semester Exam: 80 | Exam Duration: 3 Hrs
  Term Work: 25 | Oral & Practical: 25 | Oral: --- | Total: 150
Term Work (Total 25 Marks) = Experiments: 15 marks + Assignments: 05 marks + Attendance (TH+PR): 05 marks.
DEPARTMENT OF COMPUTER ENGINEERING
Academic Year: 2024-2025 Class: BE Course Code: CSDO8011
Program: Computer Engineering Div: - Course Name: Deep Learning
Department: Computer Engineering Sem.: VIII Faculty: Prof. Neelam Phadnis
CO-PO/PSO Mapping (strength of contribution):
PO-4 Conduct investigations of complex problems: 2, -, 2, 2, 2
PO-7 Environment and sustainability: -, -, -, -, -, -
PO-8 Ethics: -, -, -, -, 1, 1
PO-9 Individual and team work: -, -, -, -, 2, 2
PO-10 Communication: -, -, -, -, 1, 1
PO-11 Project management and finance: -, -, -, -, -, -
PSO-2 System Construction: -, -, -, -, -, -
PSO-3 System Testing and Deployment: -, -, -, -, -, -
PSO-4 Quality and Maintenance: -, -, -, -, -, -
Judge your ability with regard to the following points by putting a (√), on the scale of 1 (lowest) to
5 (highest), based on the knowledge and skills you attained from this course.
Sr. No. | Your ability to | 1 (Lowest) | 2 | 3 | 4 | 5 (Highest)
EXPERIMENT 1
AIM
To design the McCulloch-Pitts model for AND and OR functions.
THEORY
The McCulloch-Pitts neural model, which was the earliest ANN model, has only two types of inputs: excitatory and inhibitory. The excitatory inputs have weights of positive magnitude and the inhibitory inputs have weights of negative magnitude. The inputs of the McCulloch-Pitts neuron can be either 0 or 1, and it uses a threshold function as its activation function. The output signal y_out is 1 if the weighted input sum y_sum is greater than or equal to a given threshold value, and 0 otherwise.
CODE
Exp1: McCulloch-Pitts Model
import numpy as np

np.random.seed(seed=0)
# generate a random input vector I, sampling from {0, 1}
I = np.random.choice([0, 1], 3)
# generate a random weight vector W, sampling from {-1, 1}
W = np.random.choice([-1, 1], 3)
print(f'Input vector: {I}, Weight vector: {W}')

# matrix of inputs (truth table for two binary inputs)
input_table = np.array([
    [0, 0],  # both no
    [0, 1],  # one no, one yes
    [1, 0],  # one yes, one no
    [1, 1]   # both yes
])
print(f'Input table:\n{input_table}')

# weighted sum of the random input and weight vectors
dot = I @ W
print(f'Dot product: {dot}')

def linear_threshold_gate(dot: int, T: float) -> int:
    """Return 1 if the weighted sum reaches the threshold T, else 0."""
    if dot >= T:
        return 1
    else:
        return 0

T = 1
activation = linear_threshold_gate(dot, T)
print(f'Activation: {activation}')
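The aim asks for the AND and OR functions specifically. The short sketch below extends the listing above: with both input weights fixed at 1, a threshold of 2 realises AND and a threshold of 1 realises OR (it reuses input_table and linear_threshold_gate defined above).

# AND gate: both excitatory inputs must be active, so the threshold is 2
weights = np.array([1, 1])
T_and = 2
print('AND gate')
for row in input_table:
    print(f'{row} -> {linear_threshold_gate(row @ weights, T_and)}')

# OR gate: a single active input is enough, so the threshold is 1
T_or = 1
print('OR gate')
for row in input_table:
    print(f'{row} -> {linear_threshold_gate(row @ weights, T_or)}')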
OUTPUT
CONCLUSION
Hence, we have studied and implemented the McCulloch-Pitts model for binary logic functions.
EXPERIMENT 2
AIM
To implement the Perceptron algorithm to simulate any logic gate.
THEORY
In deep learning, the perceptron is usually used as the building block of neural networks. A neural network is composed of multiple layers of interconnected perceptrons. The input layer takes in the input data, which is then passed through hidden layers to the output layer. Each layer consists of multiple perceptrons, and the weights of the perceptrons are learned through a process called backpropagation. The perceptron algorithm is used in deep learning as follows:
1. Input data is fed into the input layer of a neural network.
2. The input data is then passed through the hidden layers of perceptrons. Each perceptron in a hidden layer takes the input from the previous layer, multiplies it by its corresponding weight, and applies a non-linear activation function to the result.
3. The output of the hidden layer is then passed through further layers of perceptrons until it reaches the output layer.
4. The output layer produces a prediction based on the input data.
5. The weights of the perceptrons are learned through a process called backpropagation. The error between the predicted output and the actual output is calculated and used to update the weights in the network. This process is repeated over multiple iterations until the network produces accurate predictions.
▪ Weight and Bias:
The weight parameter represents the strength of the connection between units and is one of the most important components of a perceptron. The weight is directly proportional to the strength of the associated input neuron in deciding the output. The bias can be considered as the intercept term in a linear equation.
▪ Activation Function:
This is the final and most important component, which helps determine whether the neuron will fire or not. In the basic perceptron, the activation function is a step function.
CODE
Exp2 Perceptron Algorithm
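The code listing for this experiment is not reproduced in this copy of the manual. The following is a minimal sketch of a single perceptron trained with the perceptron learning rule to simulate an AND gate; the learning rate, the number of epochs and the choice of gate are assumptions.

import numpy as np

def step(x):
    # unit step activation: fire (1) if the weighted sum is non-negative
    return 1 if x >= 0 else 0

def train_perceptron(X, y, lr=0.1, epochs=20):
    # weights, with the bias stored as the last element
    w = np.zeros(X.shape[1] + 1)
    for _ in range(epochs):
        for xi, target in zip(X, y):
            output = step(np.dot(xi, w[:-1]) + w[-1])
            error = target - output
            # perceptron learning rule: move the weights towards the target
            w[:-1] += lr * error * xi
            w[-1] += lr * error
    return w

# truth table for the AND gate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = train_perceptron(X, y)
for xi in X:
    print(xi, '->', step(np.dot(xi, w[:-1]) + w[-1]))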
OUTPUT
CONCLUSION
Hence, we have studied and implemented the Perceptron algorithm to simulate logic gates.
EXPERIMENT 3
AIM
To implement the Stochastic Gradient Descent learning algorithm to learn the parameters of the supervised-layer feedforward neural network.
THEORY
SGD is a popular optimization algorithm used to train neural networks. The main idea
behind SGD is to iteratively adjust the model parameters in order to minimize a cost
function (also called a loss function) that measures the difference between the
predicted output of the network and the actual output.
In SGD, the cost function is computed for a small random subset of the training data
(known as a mini-batch), and the model parameters are adjusted based on the
gradient of the cost function with respect to the parameters. This process is repeated
multiple times, with different mini-batches of the data being used each time, until the
cost function is minimized.
One advantage of using SGD over other optimization algorithms is that it is
computationally efficient, especially when dealing with large datasets. By computing
the cost function on small subsets of the data, the gradient can be estimated more
quickly and the model can be updated more frequently. There are also several
variants of SGD, including momentum, Adagrad, RMSprop, and Adam, that are
commonly used in deep learning. These variants help to overcome some of the
limitations of standard SGD, such as slow convergence and difficulty in choosing an
appropriate learning rate.
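For reference, the plain SGD update and the momentum variant mentioned above can be written as in the short sketch below; the parameter vector, the gradient and the coefficient values are purely illustrative.

import numpy as np

lr, mu = 0.01, 0.9                   # learning rate and momentum coefficient
w = np.zeros(3)                      # a parameter vector
v = np.zeros_like(w)                 # velocity used by the momentum variant
grad = np.array([0.5, -1.0, 0.25])   # gradient on the current mini-batch

# vanilla SGD update
w_sgd = w - lr * grad

# SGD with momentum: the velocity accumulates an exponentially decaying
# average of past gradients, smoothing the update direction
v = mu * v - lr * grad
w_momentum = w + v

print(w_sgd, w_momentum)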
CODE
Exp3 Stochastic Gradient Descent
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_blobs

def sigmoid_activation(x):
    # compute the sigmoid activation value for a given input
    return 1 / (1 + np.exp(-x))

def sigmoid_deriv(x):
    # compute the derivative of the sigmoid function ASSUMING
    # that the input "x" has already been passed through the sigmoid
    # activation function
    return x * (1 - x)

def next_batch(X, y, batch_size):
    # yield mini-batches of the data for each SGD update
    for i in range(0, X.shape[0], batch_size):
        yield (X[i:i + batch_size], y[i:i + batch_size])

# hyperparameters (in place of command-line arguments)
args = {"epochs": 100, "alpha": 0.01, "batch_size": 32}

# generate a synthetic 2-class dataset (the original data source is not
# shown in the manual) and append a bias column of ones
(X, y) = make_blobs(n_samples=1000, n_features=2, centers=2,
                    cluster_std=1.5, random_state=1)
y = y.reshape((y.shape[0], 1))
X = np.c_[X, np.ones((X.shape[0]))]

# partition the data into training and testing splits using 50% of
# the data for training and the remaining 50% for testing
(trainX, testX, trainY, testY) = train_test_split(X, y, test_size=0.5,
                                                  random_state=42)

# initialize the weight matrix and the list of per-epoch losses
W = np.random.randn(X.shape[1], 1)
losses = []

for epoch in range(args["epochs"]):
    epochLoss = []
    for (batchX, batchY) in next_batch(trainX, trainY, args["batch_size"]):
        # forward pass, squared-error loss and gradient for this mini-batch
        preds = sigmoid_activation(batchX.dot(W))
        error = preds - batchY
        epochLoss.append(np.sum(error ** 2))
        gradient = batchX.T.dot(error * sigmoid_deriv(preds))
        W += -args["alpha"] * gradient
    # update our loss history by taking the average loss across all batches
    loss = np.average(epochLoss)
    losses.append(loss)
    # check to see if an update should be displayed
    if epoch == 0 or (epoch + 1) % 5 == 0:
        print("[INFO] epoch={}, loss={:.7f}".format(int(epoch + 1), loss))

plt.style.use("ggplot")
fig, ax = plt.subplots()
ax.plot(np.arange(0, args["epochs"]), losses)
ax.set_title("Training Loss")
ax.set_xlabel("Epoch #")
ax.set_ylabel("Loss")
plt.show()
OUTPUT
CONCLUSION
Hence, we have implemented the Stochastic Gradient Descent learning algorithm to learn the parameters of a feedforward neural network.
EXPERIMENT 4
AIM
To implement a backpropagation algorithm to train a DNN with at least 2 hidden
layers.
THEORY
Backpropagation is a fundamental algorithm used in deep learning to train artificial
neural networks. It is an algorithm for computing the gradient of the loss function
with respect to the weights of the neural network, which is used to update the
weights during the training process.
The backpropagation algorithm works by propagating the error or loss backward
through the neural network. During the forward pass, the input is passed through the
network, and the output is computed. The loss function is then calculated based on
the difference between the predicted output and the actual output. The goal of
backpropagation is to adjust the weights of the network to minimize this loss
function.
During the backward pass, the partial derivative of the loss function with respect to
each weight in the network is computed using the chain rule of calculus. This
derivative is then used to update the weight using an optimization algorithm such as
gradient descent. The process is repeated over multiple iterations until the loss
function is minimized, and the network is trained.
CODE
Exp4 Backpropagation Algorithm in Neural Network
# Package imports
import numpy as np
import matplotlib.pyplot as plt

# --- Backward pass (the body of back_propagate) ---
# Unpack the activations cached during the forward pass
A1 = cache['A1']
A2 = cache['A2']
Z1 = cache['Z1']
Z2 = cache['Z2']
m = A1.shape[1]
# Compute gradients (tanh hidden layer, sigmoid output)
dZ2 = A2 - Y
dW2 = 1/m * np.dot(dZ2, A1.T)
db2 = 1/m * np.sum(dZ2, axis=1, keepdims=True)
dZ1 = np.dot(W2.T, dZ2) * (1 - np.power(A1, 2))
dW1 = 1/m * np.dot(dZ1, X.T)
db1 = 1/m * np.sum(dZ1, axis=1, keepdims=True)
# Update parameters by stepping against the gradient
W1 = W1 - learning_rate * dW1
b1 = b1 - learning_rate * db1
W2 = W2 - learning_rate * dW2
b2 = b2 - learning_rate * db2

# --- Training loop ---
# forward_propagation, compute_cost and back_propagate are assumed to be
# defined elsewhere; their definitions are not shown in this listing.
print_cost = True
for i in range(0, num_iterations):
    # Forward propagation. Inputs: "X, parameters". Returns: "A2, cache".
    A2, cache = forward_propagation(X, W1, W2, b1, b2)
    # Cost function. Inputs: "A2, Y". Outputs: "cost".
    cost = compute_cost(A2, Y)
    # Backpropagation. Inputs: "parameters, cache, X, Y". Outputs: "grads".
    grads = back_propagate(W1, b1, W2, b2, cache)
    dW1, db1, dW2, db2 = grads
    # Gradient descent update of the parameters
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1
    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2
    if print_cost and i % 1000 == 0:
        print("Cost after iteration %i: %f" % (i, cost))
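Because the listing above is only a fragment (forward_propagation, compute_cost, back_propagate and the data are not shown), a self-contained sketch of backpropagation for a network with two hidden layers, as the aim requires, is given below. The toy XOR data, the layer sizes, the learning rate and the iteration count are all assumptions.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# toy dataset: XOR (assumed for illustration)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]]).T   # shape (2, 4)
Y = np.array([[0, 1, 1, 0]])                       # shape (1, 4)

np.random.seed(1)
sizes = [2, 4, 4, 1]   # input, hidden layer 1, hidden layer 2, output
W = [np.random.randn(sizes[i + 1], sizes[i]) for i in range(3)]
b = [np.zeros((sizes[i + 1], 1)) for i in range(3)]
lr, m = 0.5, X.shape[1]

for i in range(10000):
    # forward pass through both hidden layers and the output layer
    A = [X]
    for l in range(3):
        A.append(sigmoid(W[l] @ A[-1] + b[l]))
    # binary cross-entropy loss
    cost = -np.mean(Y * np.log(A[-1]) + (1 - Y) * np.log(1 - A[-1]))
    # backward pass: propagate the error layer by layer (chain rule)
    dZ = A[-1] - Y
    for l in reversed(range(3)):
        dW = (dZ @ A[l].T) / m
        db = np.sum(dZ, axis=1, keepdims=True) / m
        if l > 0:
            # error for the layer below, using the sigmoid derivative A*(1-A)
            dZ = (W[l].T @ dZ) * A[l] * (1 - A[l])
        W[l] -= lr * dW
        b[l] -= lr * db
    if i % 2000 == 0:
        print(f"iteration {i}, cost {cost:.4f}")

print("Predictions:", (A[-1] > 0.5).astype(int))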
OUTPUT
CONCLUSION
Hence, we have implemented the backpropagation algorithm to train a deep neural network with hidden layers.
EXPERIMENT 5
AIM
To design the architecture and implement the autoencoder model for Image Compression.
THEORY
The goal of an autoencoder is to learn an efficient encoding of the input data, which can
then be used for tasks such as dimensionality reduction, denoising, and anomaly detection.
The network is trained by minimizing a reconstruction loss, which measures the difference
between the input data and the reconstructed output from the decoder.
Autoencoders are commonly used in deep learning for a variety of applications such as
image and speech recognition, natural language processing, and recommender systems.
Variations of autoencoders include denoising autoencoders, variational autoencoders, and
convolutional autoencoders, each designed for specific applications.
CODE
Exp5: Autoencoder
import numpy
import matplotlib.pyplot
import tensorflow.keras.layers
import tensorflow.keras.models
import tensorflow.keras.optimizers
import tensorflow.keras.datasets

# Load and flatten the MNIST images (this preparation step is assumed;
# it is not shown in the original listing)
(x_train_orig, y_train), (x_test_orig, y_test) = tensorflow.keras.datasets.mnist.load_data()
x_train = x_train_orig.astype("float32") / 255.0
x_test = x_test_orig.astype("float32") / 255.0
x_train = numpy.reshape(x_train, (x_train.shape[0], 784))
x_test = numpy.reshape(x_test, (x_test.shape[0], 784))

# Encoder: 784 -> 300 -> 2
x = tensorflow.keras.layers.Input(shape=(784,), name="encoder_input")
encoder_dense_layer1 = tensorflow.keras.layers.Dense(units=300, name="encoder_dense_1")(x)
encoder_activ_layer1 = tensorflow.keras.layers.LeakyReLU(name="encoder_leakyrelu_1")(encoder_dense_layer1)
encoder_dense_layer2 = tensorflow.keras.layers.Dense(units=2, name="encoder_dense_2")(encoder_activ_layer1)
encoder_output = tensorflow.keras.layers.LeakyReLU(name="encoder_output")(encoder_dense_layer2)
encoder = tensorflow.keras.models.Model(x, encoder_output, name="encoder_model")
encoder.summary()

# Decoder: 2 -> 300 -> 784
decoder_input = tensorflow.keras.layers.Input(shape=(2,), name="decoder_input")
decoder_dense_layer1 = tensorflow.keras.layers.Dense(units=300, name="decoder_dense_1")(decoder_input)
decoder_activ_layer1 = tensorflow.keras.layers.LeakyReLU(name="decoder_leakyrelu_1")(decoder_dense_layer1)
decoder_dense_layer2 = tensorflow.keras.layers.Dense(units=784, name="decoder_dense_2")(decoder_activ_layer1)
decoder_output = tensorflow.keras.layers.LeakyReLU(name="decoder_output")(decoder_dense_layer2)
decoder = tensorflow.keras.models.Model(decoder_input, decoder_output, name="decoder_model")
decoder.summary()

# Autoencoder: encoder followed by decoder
ae_input = tensorflow.keras.layers.Input(shape=(784,), name="AE_input")
ae_encoder_output = encoder(ae_input)
ae_decoder_output = decoder(ae_encoder_output)
ae = tensorflow.keras.models.Model(ae_input, ae_decoder_output, name="AE")
ae.summary()

# Reconstruction error metric from the original listing (it returns the
# mean squared error and is not used during compilation below)
def rmse(y_true, y_predict):
    return tensorflow.keras.backend.mean(
        tensorflow.keras.backend.square(y_true - y_predict))

# AE compilation and training with a mean squared error reconstruction loss
ae.compile(loss="mse", optimizer=tensorflow.keras.optimizers.Adam(learning_rate=0.0005))
ae.fit(x_train, x_train, epochs=20, batch_size=256, shuffle=True,
       validation_data=(x_test, x_test))

# Compress the training images to 2-D codes and reconstruct them
encoded_images = encoder.predict(x_train)
decoded_images = decoder.predict(encoded_images)
decoded_images_orig = numpy.reshape(decoded_images,
                                    newshape=(decoded_images.shape[0], 28, 28))

# Show a few original images next to their reconstructions
num_images_to_show = 5
for im_ind in range(num_images_to_show):
    plot_ind = im_ind * 2 + 1
    rand_ind = numpy.random.randint(low=0, high=x_train.shape[0])
    matplotlib.pyplot.subplot(num_images_to_show, 2, plot_ind)
    matplotlib.pyplot.imshow(x_train_orig[rand_ind, :, :], cmap="gray")
    matplotlib.pyplot.subplot(num_images_to_show, 2, plot_ind + 1)
    matplotlib.pyplot.imshow(decoded_images_orig[rand_ind, :, :], cmap="gray")
matplotlib.pyplot.show()

# Scatter plot of the 2-D latent codes, coloured by digit label
matplotlib.pyplot.figure()
matplotlib.pyplot.scatter(encoded_images[:, 0], encoded_images[:, 1], c=y_train)
matplotlib.pyplot.colorbar()
matplotlib.pyplot.show()
OUTPUT
CONCLUSION
Hence, we have designed and implemented an autoencoder model for image compression.
EXPERIMENT 6
AIM
To design the architecture and implement a CNN model for digit recognition applications.
THEORY
CNN stands for Convolutional Neural Network. It is a type of neural network that is
commonly used in computer vision tasks such as image recognition, object detection, and
segmentation.
A CNN consists of multiple layers, including convolutional layers, pooling layers, and fully
connected layers. The convolutional layers apply a series of filters to the input image to
extract features, such as edges and shapes. The pooling layers down sample the output
of the convolutional layers to reduce the spatial dimensions of the feature maps. The fully
connected layers then classify the features into the desired output classes.
CNNs have revolutionized the field of computer vision and have achieved state-of-the-art
performance on various benchmark datasets. They have also been applied to other
domains such as natural language processing and audio processing.
CODE
Exp6: CNN Model for Digit Recognition Application
import tensorflow as tf
import matplotlib.pyplot as plt

# Load and preprocess the MNIST digits
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255
x_test = x_test.reshape(-1, 28, 28, 1).astype("float32") / 255

# A simple CNN (the model definition is assumed; it is not shown in the
# original listing): convolution + pooling followed by dense layers
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(28, kernel_size=(3, 3), activation="relu",
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1)

# Display predictions for a few test images
plt.figure(figsize=(10, 10))
for plot_pos, image_index in enumerate([2853, 2000, 1500, 1345], start=1):
    plt.subplot(4, 4, plot_pos)
    pred = model.predict(x_test[image_index].reshape(1, 28, 28, 1))
    plt.imshow(x_test[image_index].reshape(28, 28), cmap='Greys')
    plt.title("Predicted Label: " + str(pred.argmax()))
plt.show()
OUTPUT
CONCLUSION
Hence, we have designed and implemented a CNN model for digit recognition applications.
EXPERIMENT 7
AIM
To design and implement an LSTM for Sentiment Analysis.
THEORY
LSTM is a type of recurrent neural network (RNN) architecture in deep learning that is
designed to overcome the vanishing gradient problem in standard RNNs.
LSTMs address the vanishing gradient problem by introducing a memory cell that can
selectively remember or forget information based on the input data. The memory cell is
controlled by gates that regulate the flow of information into and out of the cell. This
allows LSTMs to learn long-term dependencies in sequential data by selectively retaining
or discarding information over time.
Training an LSTM involves defining the architecture of the network, specifying the input
and output layers, and defining the loss function and optimization algorithm. The weights
of the network are then updated through backpropagation during training to minimize the
loss function. Once the network is trained, it can be used to make predictions on new
data by feeding it through the network and extracting the output.
CODE
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
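The listing above stops at the imports. A minimal sketch of an LSTM sentiment classifier is given below, using the Keras IMDB review dataset; the dataset choice and all hyperparameters are assumptions, since the original data source is not shown in this copy.

import tensorflow as tf
from tensorflow.keras.preprocessing.sequence import pad_sequences

vocab_size = 10000   # assumed vocabulary size
max_len = 200        # assumed maximum review length

# load the IMDB movie-review dataset (already encoded as word indices)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=vocab_size)
x_train = pad_sequences(x_train, maxlen=max_len)
x_test = pad_sequences(x_test, maxlen=max_len)

# Embedding -> LSTM -> sigmoid output for binary sentiment
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 32, input_length=max_len),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=128,
          validation_data=(x_test, y_test))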
OUTPUT
CONCLUSION
Hence, we have designed and implemented an LSTM for sentiment analysis.
BY
Amey Bhavsar (62)
Ankita Upadhyay (73)
Deepanshu Yadav (74)
Guide
Prof.Neelam Phadnis
Abstract
Detecting sarcasm is a complex problem in natural language processing due to the challenges
associated with understanding the nuances of language and contextual cues. Sarcasm can be
difficult to detect, even for humans, as it often involves a speaker saying the opposite of what they
mean, or using irony or humor to convey a different meaning. It is especially challenging in
written text, as there are no visual or auditory cues to help identify the speaker's intention.
Despite the challenges, there is a growing need for sarcasm detection in various domains,
including social media monitoring, customer service, and sentiment analysis. Identifying sarcasm
can help businesses and organizations understand customer sentiment better, respond more
effectively to feedback, and improve communication with customers. Therefore, developing
accurate and reliable models for sarcasm detection is crucial.
In this project, we focused on developing a deep learning model for detecting sarcasm in news
headlines. We collected a dataset of news headlines from various sources, including satirical news
websites, and used a supervised learning approach to train our model. We chose a neural network-
based approach as it has shown promising results in other natural language processing tasks.
List of Figures
Fig. 2: Tokenizer
Fig. 7: Accuracy
Introduction
Detecting sarcasm in natural language is a challenging problem in natural language processing
due to its subjective nature and the contextual cues required to identify it accurately. Sarcasm
often involves speakers saying the opposite of what they mean, using irony or humor to convey
a different meaning, or relying on cultural references and shared knowledge. Sarcasm detection
has important applications in various domains, including social media monitoring, customer
service, and sentiment analysis. Identifying sarcasm can help organizations understand
customer sentiment better, respond more effectively to feedback, and improve communication
with customers.
In recent years, deep learning techniques have shown promising results in various natural
language processing tasks, including sarcasm detection. These techniques involve training
neural network models on large datasets to learn patterns and features that enable them to
identify sarcasm accurately. However, sarcasm detection remains a challenging problem, and
there is a need for developing more accurate and reliable models for this task.
In this project, we aimed to develop a deep learning model for detecting sarcasm in news
headlines using a supervised learning approach. We collected a dataset of news headlines from
various sources, including satirical news websites, and trained a neural network to predict the
probability of a sentence being sarcastic or not. Our project involved several steps, including
data preprocessing, feature engineering, and hyperparameter tuning. We evaluated the model's
accuracy and F1-score on a separate test dataset and performed a comprehensive analysis of its
performance.
Objective
1. To develop a deep learning model for detecting sarcasm in news headlines.
2. To collect a dataset of news headlines from various sources to use for training and testing the
model.
3. To preprocess the text data to convert the text into numerical features for the model.
4. To use supervised learning techniques to train the model on the dataset of news headlines.
5. To fine-tune the hyperparameters of the model to optimize its performance.
6. To evaluate the model's accuracy and F1-score on a separate test dataset.
7. To analyze the model's performance using a confusion matrix and examine its most significant
misclassifications.
8. To compare the performance of the developed model with existing sarcasm detection models.
9. To identify the limitations of the developed model and suggest areas for further improvement.
10. To provide insights into the sarcasm detection problem in natural language processing and its
practical applications in sentiment analysis, social media monitoring, and customer service.
Literature Review
Understanding users' opinions on various topics or events on social media requires understanding
both literal and figurative meanings. Detecting sarcastic posts on social media has received a lot of
attention recently, especially because sarcastic tweets frequently include positive words that
represent negative or undesirable characteristics. To comprehend the application of various machine
learning algorithms for sarcasm detection in Twitter, the Preferred Reporting Items for Systematic
Reviews and Meta-Analyses (PRISMA) statement was used. The combination of Convolutional
Neural Network (CNN) and SVM was discovered to provide high prediction accuracy. Furthermore,
our findings show that lexical, pragmatic, frequency, and part-of-speech tagging can improve SVM
performance, whereas both lexical and personal features can improve CNN-SVM performance.
III. Multimodal Sarcasm Detection: A Deep Learning Approach [3], by Santosh Kumar Bharti and Rajeev Kumar Gupta
In this paper, the authors propose a hybrid model for detecting sarcasm in conversational data. To detect
sarcasm, this method combines text and audio features. Three models were developed: the text
model, which only works with textual features, the audio model, which only works with audio
features, and the hybrid model, which works with both text and audio features. The hybrid model's
text and audio features were obtained as latent vectors from the other two models. Based on the
results, it is clear that the hybrid model outperforms the other two models. This is due to text and
audio features compensating for each other's flaws. As a result, these findings support our hypothesis
that combining text and audio increases the likelihood of detecting sarcasm.
Implementation
1. Dataset
Fig. 1
We use a dataset that contains the headline, article_link and is_sarcastic label for various news headlines scraped from the internet.
2. Tokenization
Fig. 2
To convert words to sequences and then sequences to padded ones, we use the Tokenizer class to fit the words on a given vocabulary size. Padding appends extra tokens at the end of each sentence so that all sentences have the same length.
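A hedged sketch of this tokenization step is given below; the vocabulary size, the maximum length and the variable name headlines (the list of headline strings) are assumptions.

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

vocab_size = 10000   # assumed vocabulary size
max_len = 40         # assumed maximum headline length

# fit the tokenizer on the headline texts and convert them to padded sequences
tokenizer = Tokenizer(num_words=vocab_size, oov_token="<OOV>")
tokenizer.fit_on_texts(headlines)               # headlines: list of headline strings (assumed)
sequences = tokenizer.texts_to_sequences(headlines)
padded = pad_sequences(sequences, maxlen=max_len, padding="post")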
3. Model declaration
Fig. 3
Using the Sequential class from tf.keras, we define a neural network with an Embedding layer to receive text sequences as input and end with a single neuron that gives the probability of sarcasm. The sigmoid function returns a value between 0 and 1, and binary cross-entropy is used since only two classes are being compared: [Sarcastic, Not Sarcastic].
4. Fit the model and perform predictions
Fig. 4
Fig. 5
Proposed System
The proposed system for detecting sarcasm in news headlines uses a deep learning model based on
a sequential neural network architecture. The model consists of several layers, including an
embedding layer, a global average pooling layer, and two dense layers.
The embedding layer converts the input text data into a numerical representation that the model can
process. The layer uses an embedding matrix with a fixed vocabulary size to map each word in the
input sentence to a corresponding vector representation. In this model, the embedding layer has 16
output dimensions.
The global average pooling layer computes the average of the embeddings for each word in the input
sentence to obtain a single vector representation of the sentence. This layer reduces the
dimensionality of the data and helps prevent overfitting.
The two dense layers are fully connected layers that perform nonlinear transformations on the input
data. The first dense layer has 24 output dimensions and uses the ReLU activation function, while
the second dense layer has a single output neuron with a sigmoid activation function. The sigmoid
function outputs a probability value between 0 and 1, representing the model's confidence that the
input sentence is sarcastic or not.
The model has a total of 160,433 parameters, all of which are trainable. During training, the model
adjusts its parameters to minimize the loss function between the predicted and actual labels for the
training data. Once trained, the model can be used to predict the sarcasm probability for new input
sentences.
Overall, this proposed system provides a simple and efficient approach to sarcasm detection using a
neural network architecture that has shown promising results in previous natural language processing
tasks.
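A sketch of the described architecture in Keras is given below. The embedding dimension (16), the 24-unit ReLU layer and the sigmoid output follow the description above; the vocabulary size of 10,000 and the maximum sequence length are assumptions (a 10,000-word vocabulary is consistent with the 160,433 trainable parameters quoted).

import tensorflow as tf

vocab_size = 10000    # assumed; consistent with the parameter count quoted above
max_len = 40          # assumed maximum headline length
embedding_dim = 16

# Embedding -> GlobalAveragePooling -> Dense(24, relu) -> Dense(1, sigmoid)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_len),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(24, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()   # reports 160,433 trainable parameters for these sizes

# training on the padded sequences and labels prepared earlier (names assumed):
# model.fit(padded, labels, epochs=30, validation_data=(val_padded, val_labels))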
Fig. 6
Results
Fig. 7
We collected a dataset of news headlines from various sources, including satirical news websites, and
trained our model on this dataset. Our model achieved an accuracy of 97% on training data and 81%
on validation data, demonstrating its effectiveness in identifying sarcasm in news headlines.
Conclusion
In this project, we developed a deep learning model for detecting sarcasm in news headlines using a
supervised learning approach. Our model was based on a sequential neural network architecture and
used an embedding layer, global average pooling layer, and two dense layers to make predictions
about the sarcasm probability of the input sentences.
Through our project, we highlighted the importance of developing sophisticated models to solve the
sarcasm detection problem and the potential of deep learning techniques in this domain. Our approach
provides a simple and efficient way of identifying sarcasm in text, which has important applications
in sentiment analysis, social media monitoring, and customer service.
However, we also identified some limitations of our model, including its reliance on textual data and
the potential for bias in the training dataset. We suggest that future research should explore the use of
multimodal data and more diverse datasets to improve the performance and generalization of sarcasm
detection models. Overall, this project provides valuable insights into the sarcasm detection problem
in natural language processing and highlights the potential for deep learning techniques to improve
the accuracy and reliability of sarcasm detection models.
References
[1] A., K. and D., T. (2020) "Sarcasm Identification and Detection in Conversation Context using BERT", ACL Anthology. Available at: https://aclanthology.org/2020.figlang-1.10/ (Accessed: 6 April 2023).
[2] "Sarcasm detection using machine learning algorithms in ...", SAGE Journals. Available at: https://journals.sagepub.com/doi/full/10.1177/1470785320921779 (Accessed: 6 April 2023).
[3] Bharti, S.K. et al. (2022) "Multimodal Sarcasm Detection: A Deep Learning Approach", Wireless Communications and Mobile Computing, Hindawi. Available at: https://www.hindawi.com/journals/wcmc/2022/1653696/ (Accessed: 6 April 2023).
[4] Sarcasm headlines dataset: https://github.com/ashwaniYDV/sarcasm-detection-tensorflow/blob/main/sarcasm.json