
DEPARTMENT OF COMPUTER ENGINEERING

A Laboratory Manual for


Deep Learning (CSDO8011)

ACADEMIC YEAR: 2024-2025

Course Name: Deep Learning Course Code: CSDO8011


Name:
Semester: VIII (Eighth) Roll No. :
Div.: _ _ Exam. Seat No. :
Email ID: Mobile No.:
DEPARTMENT OF COMPUTER ENGINEERING
VISION AND MISSION
Institution's Vision: To be a world class institute and a front runner in educational and socioeconomic development of the nation by providing high quality technical education to students from all sections of society.

Institution's Mission: To provide superior learning experiences in a caring and conducive environment so as to empower students to be successful in life & contribute positively to society.

Quality Policy: We, at SHREE L. R. TIWARI COLLEGE OF ENGINEERING, shall dedicate and strive hard to continuously achieve academic excellence in the field of Engineering and to produce the most competent Engineers through objective & innovative teaching methods, consistent updating of facilities, welfare & quality improvement of the faculty, and a system of continual process improvement.

Computer Engineering Department's

Vision: To be a department of high repute focused on quality education, training and skill development in the field of computer engineering, to prepare professionals and entrepreneurs of high caliber with human values to serve our nation and the globe.

Mission:
M1: To provide a fertile academic environment for the development of skilled professionals empowered with the knowledge, skills, values and confidence to take a leadership role and to bridge the gap between industry, institute and society in the field of computer engineering.
M2: To promote caring and interactive teaching practices in a rejoicing learning ambience richly supported by modern educational tools and techniques.
M3: To enhance and revitalize the research culture, to provide practical exposure, and to establish synergy between teaching and research as an enabler for speedy progress.
M4: To pursue intensification of soft skills and personality development through interplay with achievers from all segments of our society.
M5: To instil human values in students by promoting lifelong learning ability.
Program Educational Objectives:
PEO-1: To prepare the Learner with a sound foundation in the mathematical, scientific and engineering fundamentals.
PEO-2: To motivate the Learner in the art of self-learning and to use modern tools for solving real-life problems.
PEO-3: To equip the Learner with the broad education necessary to understand the impact of Computer Science and Engineering in a global and social context.
PEO-4: To encourage, motivate and prepare the Learner for lifelong learning.
PEO-5: To inculcate a professional and ethical attitude, good leadership qualities and commitment to social responsibilities in the Learner's thought process.
Student’s Signature
Shree Rahul Education Society’s (Regd.)
SHREE L.R. TIWARI COLLEGE OF ENGINEERING
Kanakia Park, Mira Road (E), Thane-401107, Maharashtra.
(Approved by AICTE & DTE, Maharashtra State & Affiliated to University of Mumbai)
ISO 9001:2008 Certified. DTE Code No. 3423
Phone: (O) +91 22-28120144 / 28120145, E-mail: [email protected], Website: www.slrtce.in
DEPARTMENT OF COMPUTER ENGINEERING

Certificate

This is to certify that Mr. /Ms. _

Class Roll No. Exam Seat No. _

of Eighth Semester of Degree in Computer Engineering has completed the

required number of Practicals / Term Work / Sessional in the subject Deep

Learning from the Department of Computer Engineering during the academic

year of 2024 -2025 as prescribed in the curriculum.

Faculty in-Charge Head of the Department Principal


Date:

Seal of
Institution
INSTRUCTION FOR STUDENTS
Students shall read the points given below for understanding the theoretical concepts
and practical applications.
1) Listen carefully to the lecture given by the teacher about the importance of the subject, curriculum philosophy, learning structure, skills to be developed, information about equipment, instruments and procedures, the method of continuous assessment, the tentative plan of laboratory work and the total amount of work to be done in a semester.
2) Students shall undergo a study visit of the laboratory to become familiar with the types of equipment, instruments and software to be used, before performing experiments.
3) Read the write-up of each experiment to be performed a day in advance.
4) Organize the work in the group and keep a record of all observations.
5) Understand the purpose of each experiment and its practical implications.
6) Write the answers to the questions allotted by the teacher during practical hours if possible, or immediately afterwards.
7) Students should not hesitate to ask about any difficulty faced during the conduct of a practical/exercise.
8) Students shall study all the questions given in the laboratory manual and practice writing the answers to these questions.
9) Students shall develop the maintenance skills expected by industry.
10) Students should develop the habit of peer/group discussion related to the experiments/exercises so that exchange of knowledge and skills can take place.
11) Students shall attempt to develop related hands-on skills and gain confidence.
12) Students shall focus on the development of skills rather than on theoretical or codified knowledge.
13) Students shall visit nearby workshops, workstations, industries, laboratories, technical exhibitions, trade fairs, etc., even if not included in the lab manual. In short, students should gain exposure to the area of work while still studying.
14) Students shall insist on the completion of the recommended laboratory work, industrial visits, answers to the given questions, etc.
15) Students shall develop the habit of evolving ideas, innovations and skills beyond those included in the scope of the manual.
16) Students shall refer to technical magazines, proceedings of seminars and websites related to the scope of the subject, and update their knowledge and skills.
17) Students should develop the habit of not depending totally on teachers but of developing self-learning techniques.
18) Students should develop the habit of interacting with the teacher without hesitation on the academics involved.
19) Students should develop the habit of submitting practicals and exercises continuously and progressively on the scheduled dates, and should get the assessment done.
20) Students should be well prepared while submitting the write-up of each exercise. This develops continuity of study, and the student will not be overloaded at the end of the term.
GUIDELINES FOR TEACHERS
Teachers shall discuss the following points with students before the start of practicals for the subject.
1) Learning Overview: To develop a better understanding of the importance of the subject and to know the related skills to be developed, such as intellectual skills and motor skills.
2) Learning Structure: Topics and sub-topics are organized in a systematic way so that the ultimate purpose of learning the subject is achieved. This is arranged in the form of facts, concepts, principles, procedures, applications and problems.
3) Know your Laboratory Work: To understand the layout of the laboratory, the specifications of equipment/instruments/materials, procedures, working in groups, planning time, etc., and to know the total amount of work to be done in the laboratory.
4) Teachers shall ensure that the required equipment is in working condition before the start of each experiment, and shall keep the operating instruction manual available.
5) Explain prior concepts to the students before starting each experiment.
6) Involve students actively during the conduct of each experiment.
7) While taking readings/observations, each student shall be given a chance to perform or observe the experiment.
8) If the experimental set-up has variations in the specifications of the equipment, teachers are advised to make the necessary changes wherever needed.
9) Teachers shall assess the performance of students continuously as per the norms prescribed by the University of Mumbai and the guidelines provided by the IQAC.
10) Teachers should ensure that the respective skills and competencies are developed in the students after the completion of each practical exercise.
11) Teachers are expected to communicate the skills and competencies to be developed in the students.
12) Teachers may provide additional knowledge and skills to the students, even if not covered in the manual, that are expected of the students by industry.
13) Teachers shall ensure that industrial visits, if recommended in the manual, are covered.
14) Teachers may suggest that students refer to additional related literature such as technical papers, reference books, seminar proceedings, etc.
15) During assessment, teachers are expected to ask students questions to gauge their achievement of the related knowledge and skills, so that students can prepare while submitting the record of practicals. Focus should be on the development of the enlisted skills rather than on theoretical/codified knowledge.
16) Teachers should enlist the skills to be developed in the students that are expected by industry.
17) Teachers should organize group discussions/brainstorming sessions/seminars to facilitate the exchange of knowledge amongst the students.
18) Teachers should ensure that revised assessment norms are followed simultaneously and progressively.
19) Teachers should give more focus to hands-on skills and should actually share the same.
20) Teachers shall also refer to the circulars related to practical supervision and assessment for additional guidelines.
Student’s Progress Assessments
Student Name: Roll No.:
Class/Semester: BE CS/SEM-VIII Academic Year: 2024-2025
Course Name: Deep Learning Course Code: CSDO8011
Assessment Parameters for Practicals/Mini Project/Assignments

Experiments
Criteria for Grading (each out of 3): Preparedness and Efforts (PE), Knowledge of Tools (KT), Debugging and Results (DR), Documentation (DN), Punctuality & Lab Ethics (PL). Total out of 15; Average out of 3.

Exp. No. | Title of Experiment | COs Covered
1 | To implement McCulloch-Pitts model for binary logic functions. | CO1
2 | To implement the Perceptron algorithm to simulate any logic gate. | CO1
3 | To implement Stochastic Gradient Descent learning algorithms to learn the supervised-layer feedforward neural network parameters. | CO1
4 | To implement a backpropagation algorithm to train a DNN with at least 2 hidden layers. | CO2
5 | To design the architecture and implement the autoencoder model for Image Compression. | CO3
6 | To design the architecture and implement a CNN model for digit recognition applications. | CO4
7 | To design and implement an LSTM for Sentiment Analysis. | CO5
8 | Deep Learning Application | CO6
Average Marks:

Mini Project
Criteria for Grading (each out of 3): Knowledge regarding design and analysis (KD), Working in a group (WG), Presentation skill (PS), Time Management (TM), Lifelong learning (LL), Ethics (ET). Total out of 18; Average out of 3. COs Covered: CO6.

Assignments
Criteria for Grading (each out of 3): Timely submission (TS), Originality of the material (OM), Neatness (NT), Innovative solution (IS). Total out of 12; Average out of 3.

Assignment | COs Covered
Assignment No. 1 | CO6
Assignment No. 2 | CO1, CO2, CO3
Assignment No. 3 | CO4, CO5, CO6
Average Marks:

Grades: Meets Expectations (3 marks), Moderate Expectations (2 marks), Below Expectations (1 mark).
Student’s Signature Subject In-charge Head of Department
RECORD OF PROGRESSIVE ASSESSMENTS
Student Name: Roll No.: (BE CS SEM-VIII)
Course Name: Deep Learning Course Code: CSDO8011
Assessment of Experiments (A)
For each experiment, record: Page No., Date of Performance, Date of Submission, Assessment (out of 15), Teacher's Signature and Remark.

Sr. No. | Name of Experiment | CO Covered
1 | To implement McCulloch-Pitts model for binary logic functions. | CO1
2 | To implement the Perceptron algorithm to simulate any logic gate. | CO1
3 | To implement Stochastic Gradient Descent learning algorithms to learn the supervised-layer feedforward neural network parameters. | CO1
4 | To implement a backpropagation algorithm to train a DNN with at least 2 hidden layers. | CO2
5 | To design the architecture and implement the autoencoder model for Image Compression. | CO3
6 | To design the architecture and implement a CNN model for digit recognition applications. | CO4
7 | To design and implement an LSTM for Sentiment Analysis. | CO5
8 | Deep Learning Application | CO6
Average Marks (out of 15):
Converted Marks (out of 10) (A):

Assessment of Mini Project (B)
Record: Title of Mini Project, Page No., Date of Performance, Date of Submission, Assessment (out of 18), Teacher's Signature and Remark. CO Covered: CO6.
Converted Marks (out of 5) (B):

Assessment of Assignments (C)
Record: Page No., Date of Display, Date of Completion, Assessment (out of 12), Teacher's Signature and Remark.
1 | Assignment No. 1 | CO6
2 | Assignment No. 2 | CO1, CO2, CO3
3 | Assignment No. 3 | CO4, CO5, CO6
Average Marks (out of 12):
Converted Marks (out of 5) (C):

Assessment of Attendance (D)
Record: Theory attendance (TH out of, TH attended, TH %), Practical attendance (PR out of, PR attended, PR %), Average attendance % (TH + PR), Attendance Marks (D) (out of 5).

Total Term Work Marks: A + B + C + D = (out of 25)

Student Signature Subject In-charge Head of the Department


DEPARTMENT OF COMPUTER ENGINEERING
Programme Outcome (POs & PSOs)
Programme Outcomes are the skills and knowledge which the students have at the time of graduation; they indicate what a student can do with the subject-wise knowledge acquired during the programme.

PO / Graduate Attribute: Description of the Programme Outcome as defined by the NBA

PO-1 Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.
PO-2 Problem analysis: Identify, formulate, review research literature, and analyse complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.
PO-3 Design/development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for public health and safety, and cultural, societal, and environmental considerations.
PO-4 Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.
PO-5 Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modelling to complex engineering activities with an understanding of the limitations.
PO-6 The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
PO-7 Environment and sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for, sustainable development.
PO-8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
PO-9 Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.
PO-10 Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
PO-11 Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
PO-12 Life-long learning: Recognize the need for, and have the preparation and ability to engage in, independent and life-long learning in the broadest context of technological change.

Program Specific Outcomes (PSOs) defined by the programme. Baseline: Rational Unified Process (RUP).
PSO-1 System Inception and Elaboration: Conceptualize the software and/or hardware systems, system components and processes/procedures through requirement analysis and modelling/design of the system using various architectural/design patterns, standard notations, procedures and algorithms.
PSO-2 System Construction: Implement the systems, procedures and processes using state-of-the-art technologies, standards, tools and programming paradigms.
PSO-3 System Testing and Deployment: Verify and validate the systems, procedures and processes using various testing and verification techniques and tools.
PSO-4 Quality and Maintenance: Manage quality through various product development strategies under revision, transition and operation, through maintainability, flexibility, testability, portability, reusability, interoperability, correctness, reliability, efficiency, integrity and usability, to adapt the system to the changing structure and behaviour of the systems/environments.

_
Student’s Signature
DEPARTMENT OF COMPUTER ENGINEERING
Course Objectives and Outcomes
Academic Year: 2024-2025 Class: BE Course Code: CSDO8011
Program: Computer Engineering Div: Course Name: Deep Learning
Department: Computer Engineering Sem.: VIII Faculty: Prof. Neelam Phadnis

Course Objectives:
Sr. No. | Statement
1. To learn the fundamentals of Neural Networks.
2. To gain an in-depth understanding of training Deep Neural Networks.
3. To acquire knowledge of advanced concepts of Convolution Neural Networks, Autoencoders and Recurrent Neural Networks.
4. To become familiar with the recent trends in Deep Learning.

Course Outcomes:
CO No. | Abbrev. | Statement
CSDLO6013.1 | CO1 | Gain basic knowledge of Neural Networks.
CSDLO6013.2 | CO2 | Acquire in-depth understanding of training Deep Neural Networks.
CSDLO6013.3 | CO3 | Gain an in-depth understanding of Autoencoders.
CSDLO6013.4 | CO4 | Acquire knowledge of advanced concepts of Convolution Neural Networks.
CSDLO6013.5 | CO5 | Be familiar with concepts of Recurrent Neural Networks.
CSDLO6013.6 | CO6 | Gain familiarity with recent trends and applications of Deep Learning.

Course Prerequisite:
Basic mathematics and Statistical concepts, Linear algebra, Machine Learning
Teaching and Examination Scheme:
Teaching Scheme (Hrs): Theory: 4, Practical: 2, Tutorial: --
Credits Assigned: Theory: 4, TW/Practical: 1, Tutorial: --, Total: 5
Examination Scheme:
  Theory Internal Assessment: Test 1: 20, Test 2: 20, Average: 20
  End Semester Exam: 80 (Duration: 3 Hrs)
  Term Work: 25
  Oral: 25
  Oral & Practical: --
  Total: 150
Term Work (Total 25 Marks) = Experiments: 15 marks + Assignments: 05 marks + Attendance (TH + PR): 05 marks.
DEPARTMENT OF COMPUTER ENGINEERING
Academic Year: 2024-2025 Class: BE Course Code: CSDO8011
Program: Computer Engineering Div: - Course Name: Deep Learning
Department: Computer Engineering Sem.: VIII Faculty: Prof. Neelam Phadnis

Course Articulation Matrix
(Course outcomes CSDLO6013.1 to CSDLO6013.6 correspond to CO1 to CO6; '-' denotes no correlation.)

Programme Outcomes:
PO-1 Engineering knowledge: CO1: 1, CO3: 2, CO4: 2 | Total: 5 | Weighted Avg: 1.67
PO-2 Problem analysis: CO1: 2, CO2: 1, CO3: 2, CO4: 2 | Total: 7 | Weighted Avg: 1.75
PO-3 Design/development of solutions: CO1: 2, CO2: 1, CO3: 2, CO4: 2 | Total: 7 | Weighted Avg: 1.75
PO-4 Conduct investigations of complex problems: CO1: 2, CO3: 2, CO4: 2 | Total: 6 | Weighted Avg: 2
PO-5 Modern tool usage: CO1: 3, CO3: 3, CO4: 3 | Total: 9 | Weighted Avg: 3
PO-6 The engineer and society: -
PO-7 Environment and sustainability: -
PO-8 Ethics: Total: 1 | Weighted Avg: 1
PO-9 Individual and team work: Total: 2 | Weighted Avg: 2
PO-10 Communication: Total: 1 | Weighted Avg: 1
PO-11 Project management and finance: -
PO-12 Life-long learning: -

Program Specific Outcomes:
PSO-1 System Inception and Elaboration: CO1: 2, CO2: 1, CO3: 2, CO4: 2 | Total: 7 | Weighted Avg: 1.75
PSO-2 System Construction: -
PSO-3 System Testing and Deployment: -
PSO-4 Quality and Maintenance: -

Signature of the Student


DEPARTMENT OF COMPUTER ENGINEERING

Course Exit Form


Student Name: Roll No.:
Class/Semester: _ BE CS/SEM-VIII Academic Year: 2024-2025
Course Name: Deep Learning Course Code: CSDO8011

Judge your ability with regard to the following points by putting a (√), on the scale of 1 (lowest) to
5 (highest), based on the knowledge and skills you attained from this course.

Sr. No. | Your ability to | Rating: 1 (Lowest) to 5 (Highest)
1 | CSDLO6013.1: Gain basic knowledge of Neural Networks. |
2 | CSDLO6013.2: Acquire in-depth understanding of training Deep Neural Networks. |
3 | CSDLO6013.3: Gain an in-depth understanding of Autoencoders. |
4 | CSDLO6013.4: Acquire knowledge of advanced concepts of Convolution Neural Networks. |
5 | CSDLO6013.5: Be familiar with concepts of Recurrent Neural Networks. |
6 | CSDLO6013.6: Gain familiarity with recent trends and applications of Deep Learning. |

Student’s Signature Date



EXPERIMENT 1

AIM
Design the McCulloch-Pitts model for AND and OR functions.

THEORY

The McCulloch-Pitts neural model, the earliest ANN model, has only two types of inputs: excitatory and inhibitory. Excitatory inputs have weights of positive magnitude and inhibitory inputs have weights of negative magnitude. The inputs of the McCulloch-Pitts neuron can be either 0 or 1, and the neuron uses a threshold function as its activation function: the output y_out is 1 if the weighted input sum y_sum is greater than or equal to a given threshold value T, and 0 otherwise. For example, with excitatory weights w1 = w2 = 1, a threshold of T = 2 realizes the AND function, while T = 1 realizes the OR function. The diagrammatic representation of the model is as follows:

Figure 1: McCulloch-Pitts model

CODE
Exp1: McCulloch-Pitts Model

import numpy as np

# Matrix of all input combinations for a two-input logic function
input_table = np.array([
    [0, 0],  # both no
    [0, 1],  # one no, one yes
    [1, 0],  # one yes, one no
    [1, 1]   # both yes
])
print(f'Input table:\n{input_table}')

# Both inputs are excitatory, so both weights are +1
W = np.array([1, 1])

# Weighted sum (dot product) for every input combination
dot_products = input_table @ W
print(f'Dot products: {dot_products}')

def linear_threshold_gate(dot: int, T: float) -> int:
    """Returns the binary threshold output."""
    if dot >= T:
        return 1
    else:
        return 0

# AND function: the neuron fires only when both inputs are 1, so T = 2
T = 2
activations = [linear_threshold_gate(dot, T) for dot in dot_products]
print(f'AND activations: {activations}')

# OR function: the neuron fires when at least one input is 1, so T = 1
T = 1
activations = [linear_threshold_gate(dot, T) for dot in dot_products]
print(f'OR activations: {activations}')
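
The theory above also mentions inhibitory inputs, whose weights have negative magnitude. As a small additional sketch (not part of the prescribed AND/OR task), a NOT function can be realized with a single inhibitory weight of -1 and a threshold of 0, reusing linear_threshold_gate from the listing above:

# NOT function (illustrative): one inhibitory input with weight -1, threshold T = 0
for x in [0, 1]:
    dot = -1 * x
    print(f'NOT({x}) = {linear_threshold_gate(dot, 0)}')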

OUTPUT

CONCLUSION

Hence, we have studied and implemented the McCulloch-Pitts model for binary logic functions (AND and OR).


EXPERIMENT 2

AIM
To implement the Perceptron algorithm to simulate any logic gate.

THEORY

In deep learning, the perceptron algorithm is usually used as the building block of neural networks. A neural network is composed of multiple layers of interconnected perceptrons. The input layer takes in input data, which is then passed through hidden layers to the output layer. Each layer consists of multiple perceptrons, and the weights of the perceptrons are learned through a process called backpropagation. The perceptron algorithm is used in deep learning as follows:
1. Input data is fed into the input layer of a neural network.
2. The input data is then passed through hidden layers of perceptrons. Each perceptron in a hidden layer takes in input from the previous layer, multiplies it by its corresponding weight, and applies a non-linear activation function to the result.
3. The output of the hidden layer is then passed through further layers of perceptrons until it reaches the output layer.
4. The output layer produces a prediction based on the input data.
5. The weights of the perceptrons are learned through backpropagation. The error between the predicted output and the actual output is calculated and used to update the weights in the network. This process is repeated over multiple iterations until the network produces accurate predictions.

Figure 1: Perceptron Model

Basic Components of Perceptron:


▪ Input Nodes or Input Layer:
This is the primary component of the Perceptron, which accepts the initial data into the system for further processing. Each input node contains a real numerical value.

▪ Weight and Bias:
The weight parameter represents the strength of the connection between units and is another important parameter of the Perceptron. A weight is directly proportional to the strength of the associated input neuron in deciding the output. The bias can be considered as the intercept term in a linear equation.

▪ Activation Function:
This is the final and important component that helps determine whether the neuron will fire or not. The activation function can be considered primarily as a step function.

▪ Types of Activation functions:


1. Sign function
2. Step function, and
3. Sigmoid function

CODE
Exp2 Perceptron Algorithm

#importing Python library


import numpy as np

# define the unit step function
def unitStep(v):
    if v >= 0:
        return 1
    else:
        return 0

# design the Perceptron model
def perceptronModel(x, w, b):
    v = np.dot(w, x) + b
    y = unitStep(v)
    return y

# AND logic function
# w1 = 1, w2 = 1, b = -1.5
def AND_logicFunction(x):
    w = np.array([1, 1])
    b = -1.5
    return perceptronModel(x, w, b)

# testing the perceptron model
test1 = np.array([0, 1])
test2 = np.array([1, 1])
test3 = np.array([0, 0])
test4 = np.array([1, 0])

print("AND({}, {}) = {}".format(0, 1, AND_logicFunction(test1)))
print("AND({}, {}) = {}".format(1, 1, AND_logicFunction(test2)))
print("AND({}, {}) = {}".format(0, 0, AND_logicFunction(test3)))


print("AND({}, {}) = {}".format(1,0,AND_logicFunction(test4)))

OUTPUT

CONCLUSION

Hence, we have studied and implemented the Perceptron algorithm to simulate a logic gate.


EXPERIMENT 3

AIM
To implement Stochastic Gradient Descent learning algorithms to learn the parameters of the supervised-layer feedforward neural network.

THEORY
SGD is a popular optimization algorithm used to train neural networks. The main idea
behind SGD is to iteratively adjust the model parameters in order to minimize a cost
function (also called a loss function) that measures the difference between the
predicted output of the network and the actual output.
In SGD, the cost function is computed for a small random subset of the training data
(known as a mini-batch), and the model parameters are adjusted based on the
gradient of the cost function with respect to the parameters. This process is repeated
multiple times, with different mini-batches of the data being used each time, until the
cost function is minimized.
One advantage of using SGD over other optimization algorithms is that it is
computationally efficient, especially when dealing with large datasets. By computing
the cost function on small subsets of the data, the gradient can be estimated more
quickly and the model can be updated more frequently. There are also several
variants of SGD, including momentum, Adagrad, RMSprop, and Adam, that are
commonly used in deep learning. These variants help to overcome some of the
limitations of standard SGD, such as slow convergence and difficulty in choosing an
appropriate learning rate.
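
The update rule described above can be written in a few lines of NumPy. The following is a minimal, self-contained sketch of mini-batch SGD on a linear least-squares problem (the data, learning rate and batch size are illustrative assumptions, separate from the full listing below):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))                  # 1000 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=1000)    # noisy linear targets

w = np.zeros(3)
alpha, batch_size = 0.1, 32
for epoch in range(20):
    idx = rng.permutation(len(X))               # reshuffle every epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        preds = X[batch] @ w
        grad = X[batch].T @ (preds - y[batch]) / len(batch)  # gradient of the (half) squared error
        w -= alpha * grad                                     # the SGD update step
print(w)   # should end up close to true_w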


Figure 1: Stochastic Gradient Descent

CODE
Exp3 Stochastic Gradient Descent

#import the necessary packages


from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt
import numpy as np
import argparse

import sys
sys.argv=['']
del sys

def sigmoid_activation(x):
    # compute the sigmoid activation value for a given input
    return 1 / (1 + np.exp(-x))

def sigmoid_deriv(x):
    # compute the derivative of the sigmoid function ASSUMING
    # that the input "x" has already been passed through the sigmoid
    # activation function
    return x * (1 - x)

def predict(X, W):
    # take the dot product between our features and weight matrix
    preds = sigmoid_activation(X.dot(W))
    # apply a step function to threshold the outputs to binary class labels
    preds[preds <= 0.5] = 0
    preds[preds > 0] = 1
    # return the predictions
    return preds

def next_batch(X, y, batchSize):
    # loop over our dataset "X" in mini-batches, yielding a tuple of
    # the current batched data and labels
    for i in np.arange(0, X.shape[0], batchSize):
        yield (X[i:i + batchSize], y[i:i + batchSize])

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-e", "--epochs", type=float, default=100, help="# of epochs")
ap.add_argument("-a", "--alpha", type=float, default=0.01, help="learning rate")
ap.add_argument("-b", "--batch-size", type=int, default=32, help="size of SGD mini-batches")
args = vars(ap.parse_args())

# generate a 2-class classification problem with 1,000 data points,
# where each data point is a 2D feature vector
X, y = make_blobs(n_samples=1000, n_features=2, centers=2, cluster_std=1.5, random_state=1)
y = y.reshape((y.shape[0], 1))

# insert a column of 1's as the last entry in the feature matrix --
# this little trick allows us to treat the bias as a trainable parameter
# within the weight matrix
X = np.c_[X, np.ones((X.shape[0]))]

# partition the data into training and testing splits using 50% of
# the data for training and the remaining 50% for testing
(trainX, testX, trainY, testY) = train_test_split(X, y, test_size=0.5, random_state=42)

# initialize our weight matrix and list of losses
print("[INFO] training...")
W = np.random.randn(X.shape[1], 1)
losses = []

# loop over the desired number of epochs
for epoch in np.arange(0, args["epochs"]):
    # initialize the total loss for the epoch
    epochLoss = []

    # loop over our data in batches
    for (batchX, batchY) in next_batch(trainX, trainY, args["batch_size"]):
        # take the dot product between our current batch of features
        # and the weight matrix, then pass this value through our
        # activation function
        preds = sigmoid_activation(batchX.dot(W))

        # now that we have our predictions, we need to determine the
        # "error", which is the difference between our predictions
        # and the true values
        error = preds - batchY
        epochLoss.append(np.sum(error ** 2))

        # the gradient descent update is the dot product between our
        # (1) current batch and (2) the error of the sigmoid
        # derivative of our predictions
        d = error * sigmoid_deriv(preds)
        gradient = batchX.T.dot(d)

        # in the update stage, all we need to do is "nudge" the weight
        # matrix in the negative direction of the gradient (hence the
        # term "gradient descent") by taking a small step towards a set
        # of "more optimal" parameters
        W += -args["alpha"] * gradient

    # update our loss history by taking the average loss across all batches
    loss = np.average(epochLoss)
    losses.append(loss)

    # check to see if an update should be displayed
    if epoch == 0 or (epoch + 1) % 5 == 0:
        print("[INFO] epoch={}, loss={:.7f}".format(int(epoch + 1), loss))

# evaluate our model
print("[INFO] evaluating...")
preds = predict(testX, W)
print(classification_report(testY, preds))

# plot the (testing) classification data
plt.style.use("ggplot")
fig, ax = plt.subplots()
ax.set_title("Data")
ax.scatter(testX[:, 0], testX[:, 1], marker="o", c=testY[:, 0], s=30)

# plot the training loss over time
plt.style.use("ggplot")
fig, ax = plt.subplots()
ax.plot(np.arange(0, 100), losses[:100])
ax.set_title("Training Loss")
ax.set_xlabel("Epoch #")
ax.set_ylabel("Loss")
plt.show()


OUTPUT


CONCLUSION

Hence, we have studied and implemented Stochastic Gradient Descent learning algorithms to learn the parameters of the supervised-layer feedforward neural network.


EXPERIMENT 4

AIM
To implement a backpropagation algorithm to train a DNN with at least 2 hidden
layers.

THEORY
Backpropagation is a fundamental algorithm used in deep learning to train artificial
neural networks. It is an algorithm for computing the gradient of the loss function
with respect to the weights of the neural network, which is used to update the
weights during the training process.
The backpropagation algorithm works by propagating the error or loss backward
through the neural network. During the forward pass, the input is passed through the
network, and the output is computed. The loss function is then calculated based on
the difference between the predicted output and the actual output. The goal of
backpropagation is to adjust the weights of the network to minimize this loss
function.
During the backward pass, the partial derivative of the loss function with respect to
each weight in the network is computed using the chain rule of calculus. This
derivative is then used to update the weight using an optimization algorithm such as
gradient descent. The process is repeated over multiple iterations until the loss
function is minimized, and the network is trained.

Figure 1: Backpropagation Algorithm in Neural Network
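
Before the full listing, the chain-rule computation described above can be illustrated on a single sigmoid unit with a squared-error loss; the values below are arbitrary and the numerical check is only a sketch:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, t, w = 1.5, 1.0, 0.3            # input, target and weight (illustrative values)
y = sigmoid(w * x)                 # forward pass
loss = (y - t) ** 2

# Analytic gradient via the chain rule: dL/dw = dL/dy * dy/dz * dz/dw
grad = 2 * (y - t) * y * (1 - y) * x

# Numerical check with a small central difference
eps = 1e-6
num_grad = ((sigmoid((w + eps) * x) - t) ** 2 - (sigmoid((w - eps) * x) - t) ** 2) / (2 * eps)
print(grad, num_grad)              # the two values agree closely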


CODE
Exp4 Backpropagation Algorithm in Neural Network

# Package imports
import numpy as np
import matplotlib.pyplot as plt

# Import helper functions from planar_utils.py
from planar_utils import plot_decision_boundary, sigmoid, load_planar_dataset, load_extra_datasets

# Loading the Sample data


X, Y = load_planar_dataset()

# Visualize the data:


plt.scatter(X[0, :], X[1, :], c=Y, s=40, cmap=plt.cm.Spectral)

# X -> input dataset of shape (input size, number of examples)


# Y -> labels of shape (output size, number of examples)
W1 = np.random.randn(4, X.shape[0]) * 0.01
b1 = np.zeros(shape=(4, 1))
W2 = np.random.randn(Y.shape[0], 4) * 0.01
b2 = np.zeros(shape=(Y.shape[0], 1))

def forward_propagation(X, W1, W2, b1, b2):
    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)
    # The cache stores the intermediate values of this iteration;
    # it will be used for backpropagation
    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}
    return A2, cache

# Here Y is the actual output
def compute_cost(A2, Y):
    m = Y.shape[1]  # number of examples

    # Implementing the cross-entropy cost function
    logprobs = np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2))
    cost = -np.sum(logprobs) / m

    # Squeezing to avoid unnecessary dimensions
    cost = np.squeeze(cost)
    return cost

def back_propagate(W1, b1, W2, b2, cache):
    # Retrieve cached values
    A1 = cache['A1']
    A2 = cache['A2']
    Z1 = cache['Z1']
    Z2 = cache['Z2']
    m = A1.shape[1]

    # Compute gradients
    dZ2 = A2 - Y
    dW2 = 1/m * np.dot(dZ2, A1.T)
    db2 = 1/m * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = np.dot(W2.T, dZ2) * (1 - np.power(A1, 2))
    dW1 = 1/m * np.dot(dZ1, X.T)
    db1 = 1/m * np.sum(dZ1, axis=1, keepdims=True)

    # Update parameters (learning_rate, X and Y are globals defined elsewhere)
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1
    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2

    # Store gradients in a dictionary
    grads = {'dW1': dW1, 'db1': db1, 'dW2': dW2, 'db2': db2}

    return W1, b1, W2, b2, grads

# Please note that the weights and biases are global
# Here num_iterations is the number of epochs
num_iterations = 10000
learning_rate = 1.2

print_cost = True
for i in range(0, num_iterations):
    # Forward propagation. Inputs: "X, parameters". Returns: "A2, cache".
    A2, cache = forward_propagation(X, W1, W2, b1, b2)
    # Cost function. Inputs: "A2, Y". Outputs: "cost".
    cost = compute_cost(A2, Y)
    # Backpropagation: returns the updated parameters and the gradients.
    W1, b1, W2, b2, grads = back_propagate(W1, b1, W2, b2, cache)

    # Print the cost every 1000 iterations
    if print_cost and i % 1000 == 0:
        print("Cost after iteration %i: %f" % (i, cost))


OUTPUT

CONCLUSION

Hence, we have studied and implemented the backpropagation algorithm for training a deep neural network.


EXPERIMENT 5

AIM
To design the architecture and implement the autoencoder model for Image Compression.

THEORY

An autoencoder is a type of neural network that is trained to learn a compressed representation of input data without any supervised labels; the compressed representation is called the "latent space" or "encoding". The autoencoder architecture consists of an encoder network that maps the input data into the latent space and a decoder network that reconstructs the original input data from the encoding.

The goal of an autoencoder is to learn an efficient encoding of the input data, which can
then be used for tasks such as dimensionality reduction, denoising, and anomaly detection.
The network is trained by minimizing a reconstruction loss, which measures the difference
between the input data and the reconstructed output from the decoder.

Autoencoders are commonly used in deep learning for a variety of applications such as
image and speech recognition, natural language processing, and recommender systems.
Variations of autoencoders include denoising autoencoders, variational autoencoders, and
convolutional autoencoders, each designed for specific applications.

Figure 1: Schematic Structure Of an Autoencoder


CODE
Exp5: Autoencoder

import tensorflow.keras.layers
import tensorflow.keras.models
import tensorflow.keras.optimizers
import tensorflow.keras.datasets
import numpy
import matplotlib.pyplot

# Encoder
x = tensorflow.keras.layers.Input(shape=(784,), name="encoder_input")
encoder_dense_layer1 = tensorflow.keras.layers.Dense(units=300, name="encoder_dense_1")(x)
encoder_activ_layer1 = tensorflow.keras.layers.LeakyReLU(name="encoder_leakyrelu_1")(encoder_dense_layer1)
encoder_dense_layer2 = tensorflow.keras.layers.Dense(units=2, name="encoder_dense_2")(encoder_activ_layer1)
encoder_output = tensorflow.keras.layers.LeakyReLU(name="encoder_output")(encoder_dense_layer2)
encoder = tensorflow.keras.models.Model(x, encoder_output, name="encoder_model")
encoder.summary()

# Decoder
decoder_input = tensorflow.keras.layers.Input(shape=(2,), name="decoder_input")
decoder_dense_layer1 = tensorflow.keras.layers.Dense(units=300, name="decoder_dense_1")(decoder_input)
decoder_activ_layer1 = tensorflow.keras.layers.LeakyReLU(name="decoder_leakyrelu_1")(decoder_dense_layer1)
decoder_dense_layer2 = tensorflow.keras.layers.Dense(units=784, name="decoder_dense_2")(decoder_activ_layer1)
decoder_output = tensorflow.keras.layers.LeakyReLU(name="decoder_output")(decoder_dense_layer2)
decoder = tensorflow.keras.models.Model(decoder_input, decoder_output, name="decoder_model")
decoder.summary()

# Autoencoder: encoder followed by decoder
ae_input = tensorflow.keras.layers.Input(shape=(784,), name="AE_input")
ae_encoder_output = encoder(ae_input)
ae_decoder_output = decoder(ae_encoder_output)
ae = tensorflow.keras.models.Model(ae_input, ae_decoder_output, name="AE")
ae.summary()

# Reconstruction-error helper (mean squared error)
def rmse(y_true, y_predict):
    return tensorflow.keras.backend.mean(tensorflow.keras.backend.square(y_true - y_predict))

# Compile the autoencoder
ae.compile(loss="mse", optimizer=tensorflow.keras.optimizers.Adam(learning_rate=0.0005))

# Preparing the MNIST dataset: flatten 28x28 images into 784-value vectors in [0, 1]
(x_train_orig, y_train), (x_test_orig, y_test) = tensorflow.keras.datasets.mnist.load_data()
x_train_orig = x_train_orig.astype("float32") / 255.0
x_test_orig = x_test_orig.astype("float32") / 255.0
x_train = numpy.reshape(x_train_orig, newshape=(x_train_orig.shape[0], numpy.prod(x_train_orig.shape[1:])))
x_test = numpy.reshape(x_test_orig, newshape=(x_test_orig.shape[0], numpy.prod(x_test_orig.shape[1:])))

# Training the autoencoder to reconstruct its own input
ae.fit(x_train, x_train, epochs=20, batch_size=256, shuffle=True, validation_data=(x_test, x_test))

# Encode and decode the training images, then show a few original/reconstructed pairs
encoded_images = encoder.predict(x_train)
decoded_images = decoder.predict(encoded_images)
decoded_images_orig = numpy.reshape(decoded_images, newshape=(decoded_images.shape[0], 28, 28))

num_images_to_show = 5
for im_ind in range(num_images_to_show):
    plot_ind = im_ind * 2 + 1
    rand_ind = numpy.random.randint(low=0, high=x_train.shape[0])
    matplotlib.pyplot.subplot(num_images_to_show, 2, plot_ind)
    matplotlib.pyplot.imshow(x_train_orig[rand_ind, :, :], cmap="gray")
    matplotlib.pyplot.subplot(num_images_to_show, 2, plot_ind + 1)
    matplotlib.pyplot.imshow(decoded_images_orig[rand_ind, :, :], cmap="gray")

# Scatter plot of the 2-D latent space, coloured by digit label
matplotlib.pyplot.figure()
matplotlib.pyplot.scatter(encoded_images[:, 0], encoded_images[:, 1], c=y_train)
matplotlib.pyplot.colorbar()
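
As a small follow-up sketch (not part of the original listing), the quality of the compression can be quantified by the mean squared reconstruction error on the test set, remembering that each 784-pixel image is squeezed through only 2 latent values:

# Mean squared reconstruction error on the unseen test images
reconstructed_test = ae.predict(x_test)
mse = numpy.mean(numpy.square(x_test - reconstructed_test))
print("Mean squared reconstruction error on the test set:", mse)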


OUTPUT


CONCLUSION

Hence, we have designed and implemented an autoencoder model for image compression.


EXPERIMENT 6

AIM
To design the architecture and implement a CNN model for digit recognition applications.

THEORY

CNN stands for Convolutional Neural Network. It is a type of neural network that is
commonly used in computer vision tasks such as image recognition, object detection, and
segmentation.
A CNN consists of multiple layers, including convolutional layers, pooling layers, and fully
connected layers. The convolutional layers apply a series of filters to the input image to
extract features, such as edges and shapes. The pooling layers downsample the output
of the convolutional layers to reduce the spatial dimensions of the feature maps. The fully
connected layers then classify the features into the desired output classes.
CNNs have revolutionized the field of computer vision and have achieved state-of-the-art
performance on various benchmark datasets. They have also been applied to other
domains such as natural language processing and audio processing.

Figure 1: CNN Model For Digit Recognition

CODE
Exp6: CNN Model for Digit Recognition Application

import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Reshape to (samples, 28, 28, 1) and scale pixel values to [0, 1]
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255


x_test/= 255
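
# NOTE (added sketch): the original listing omits the model definition, the
# training step and the matplotlib import used below, so the layers and
# hyperparameters here are assumed, typical choices for MNIST digit
# recognition rather than the manual's prescribed ones.
import matplotlib.pyplot as plt

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(28, kernel_size=(3, 3), activation='relu',
                           input_shape=input_shape),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)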

plt.figure(figsize=(10,10))
plt.subplot(4,4,1)
image_index = 2853
predict = x_test[image_index].reshape(28,28)
pred = model.predict(x_test[image_index].reshape(1, 28, 28, 1))
plt.imshow(x_test[image_index].reshape(28, 28),cmap='Greys')
plt.title("Predicted Label: "+str(pred.argmax()))

plt.subplot(4,4,2)
image_index = 2000
predict = x_test[image_index].reshape(28,28)
pred = model.predict(x_test[image_index].reshape(1, 28, 28, 1))
plt.imshow(x_test[image_index].reshape(28,28),cmap='Greys')
plt.title("Predicted Label: "+str(pred.argmax()))

plt.subplot(4,4,3)
image_index = 1500
predict = x_test[image_index].reshape(28,28)
pred = model.predict(x_test[image_index].reshape(1, 28, 28, 1))
plt.imshow(x_test[image_index].reshape(28, 28),cmap='Greys')
plt.title("Predicted Label: "+str(pred.argmax()))

plt.subplot(4,4,4)
image_index = 1345
predict = x_test[image_index].reshape(28,28)
pred = model.predict(x_test[image_index].reshape(1, 28, 28, 1))
plt.imshow(x_test[image_index].reshape(28, 28),cmap='Greys')
plt.title("Predicted Label: "+str(pred.argmax()))

OUTPUT


CONCLUSION

Hence, we have designed and implemented a CNN model for digit recognition applications.


EXPERIMENT 7

AIM
To design and implement an LSTM for sentiment analysis.

THEORY

LSTM is a type of recurrent neural network (RNN) architecture in deep learning that is
designed to overcome the vanishing gradient problem in standard RNNs.

LSTMs address the vanishing gradient problem by introducing a memory cell that can
selectively remember or forget information based on the input data. The memory cell is
controlled by gates that regulate the flow of information into and out of the cell. This
allows LSTMs to learn long-term dependencies in sequential data by selectively retaining
or discarding information over time.

Training an LSTM involves defining the architecture of the network, specifying the input
and output layers, and defining the loss function and optimization algorithm. The weights
of the network are then updated through backpropagation during training to minimize the
loss function. Once the network is trained, it can be used to make predictions on new
data by feeding it through the network and extracting the output.

Figure 1: LSTM: Long Short Term Memory

CODE

Exp7: LSTM For Sentiment Analysis

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split


from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from tensorflow.keras.utils import to_categorical

# Load the IMDB reviews dataset and drop the unlabelled rows
df = pd.read_csv('imdb_master.csv', encoding='ISO-8859-1', index_col=0, error_bad_lines=False)
df = df[df['label'] != 'unsup']

# Preprocess the data
le = LabelEncoder()
y = le.fit_transform(df['label'])
X = df['review'].values

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Tokenize the data


tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(X_train)
X_train = tokenizer.texts_to_sequences(X_train)
X_test = tokenizer.texts_to_sequences(X_test)

# Pad the sequences


max_length = 500
X_train = pad_sequences(X_train, maxlen=max_length)
X_test = pad_sequences(X_test, maxlen=max_length)

# Build the LSTM model


embedding_vector_length = 32
model = Sequential()
model.add(Embedding(5000, embedding_vector_length,
input_length=max_length))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
metrics=['accuracy'])

# Train the model


epochs = 5
batch_size = 64
history = model.fit(X_train, to_categorical(y_train), epochs=epochs,
batch_size=batch_size, validation_split=0.1)

# Evaluate the model


loss, accuracy = model.evaluate(X_test, to_categorical(y_test))
print('Accuracy:', accuracy)


import matplotlib.pyplot as plt

# Plot training and validation accuracy over time


plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(['Training', 'Validation'], loc='upper left')
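
As a short usage sketch (not part of the original listing), the trained tokenizer and model can classify a new review; the sample sentence below is an assumption for illustration:

# Classify a new review with the trained tokenizer and model (illustrative)
sample = ["The movie was absolutely wonderful, I loved every minute of it."]
seq = tokenizer.texts_to_sequences(sample)
padded = pad_sequences(seq, maxlen=max_length)
probs = model.predict(padded)          # shape (1, 2): one probability per class
print("Predicted label:", le.inverse_transform([probs.argmax()])[0])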

OUTPUT

CONCLUSION

Hence, we have designed and implemented an LSTM for sentiment analysis.



DEEP LEARNING MINI PROJECT REPORT

Web application for detection of Sarcasm in sentences from web scrapped


news headlines

Submitted in partial fulfillment of the requirements
of the B.E. Final Year Semester VIII for the degree of
Bachelor of Computer Engineering

BY
Amey Bhavsar (62)
Ankita Upadhyay (73)
Deepanshu Yadav (74)

Guide
Prof. Neelam Phadnis

DEPARTMENT OF COMPUTER ENGINEERING


Accredited by NBA for 3 years w.e.f. 1st July 2022

SHREE L. R. TIWARI COLLEGE OF ENGINEERING


THANE -401 105, MAHARASHTRA.
University of Mumbai
(AY 2022-23)
Table of Contents

Abstract
Introduction
Literature Review
Implementation
Proposed System
Conclusion
References
Abstract

Detecting sarcasm is a complex problem in natural language processing due to the challenges
associated with understanding the nuances of language and contextual cues. Sarcasm can be
difficult to detect, even for humans, as it often involves a speaker saying the opposite of what they
mean, or using irony or humor to convey a different meaning. It is especially challenging in
written text, as there are no visual or auditory cues to help identify the speaker's intention.

Despite the challenges, there is a growing need for sarcasm detection in various domains,
including social media monitoring, customer service, and sentiment analysis. Identifying sarcasm
can help businesses and organizations understand customer sentiment better, respond more
effectively to feedback, and improve communication with customers. Therefore, developing
accurate and reliable models for sarcasm detection is crucial.

In this project, we focused on developing a deep learning model for detecting sarcasm in news
headlines. We collected a dataset of news headlines from various sources, including satirical news
websites, and used a supervised learning approach to train our model. We chose a neural network-
based approach as it has shown promising results in other natural language processing tasks.

List of Figures

Fig. 1: Download dataset

Fig. 2: Tokenizer

Fig. 3: Define model

Fig. 4: Fit model

Fig. 5: Predict output on a sentence

Fig. 6: Model summary

Fig. 7: Accuracy

Introduction
Detecting sarcasm in natural language is a challenging problem in natural language processing
due to its subjective nature and the contextual cues required to identify it accurately. Sarcasm
often involves speakers saying the opposite of what they mean, using irony or humor to convey
a different meaning, or relying on cultural references and shared knowledge. Sarcasm detection
has important applications in various domains, including social media monitoring, customer
service, and sentiment analysis. Identifying sarcasm can help organizations understand
customer sentiment better, respond more effectively to feedback, and improve communication
with customers.

In recent years, deep learning techniques have shown promising results in various natural
language processing tasks, including sarcasm detection. These techniques involve training
neural network models on large datasets to learn patterns and features that enable them to
identify sarcasm accurately. However, sarcasm detection remains a challenging problem, and
there is a need for developing more accurate and reliable models for this task.

In this project, we aimed to develop a deep learning model for detecting sarcasm in news
headlines using a supervised learning approach. We collected a dataset of news headlines from
various sources, including satirical news websites, and trained a neural network to predict the
probability of a sentence being sarcastic or not. Our project involved several steps, including
data preprocessing, feature engineering, and hyperparameter tuning. We evaluated the model's
accuracy and F1-score on a separate test dataset and performed a comprehensive analysis of its
performance.

Objective
1. To develop a deep learning model for detecting sarcasm in news headlines.
2. To collect a dataset of news headlines from various sources to use for training and testing the
model.
3. To preprocess the text data to convert the text into numerical features for the model.
4. To use supervised learning techniques to train the model on the dataset of news headlines.
5. To fine-tune the hyperparameters of the model to optimize its performance.
6. To evaluate the model's accuracy and F1-score on a separate test dataset.
7. To analyze the model's performance using a confusion matrix and examine its most significant
misclassifications.
8. To compare the performance of the developed model with existing sarcasm detection models.
9. To identify the limitations of the developed model and suggest areas for further improvement.

10. To provide insights into the sarcasm detection problem in natural language processing and its
practical applications in sentiment analysis, social media monitoring, and customer service.

Literature Review

I. Sarcasm Identification and Detection in Conversion Context using BERT [1] (Kalaivani A, Thenmozhi D)
Sarcasm analysis in user conversion text is the automated detection of any irony, insult, hurtful,
painful, caustic, humour, or vulgarity that degrades a person. It is useful for sentimental analysis
and cyberbullying. With the rapid growth of social media, sarcasm analysis can assist in avoiding
insults, hurts, and humour from affecting someone. We present traditional Machine Learning
approaches, Deep Learning approaches (RNN-LSTM), and BERT (Bidirectional Encoder
Representations from Transformers) for detecting sarcasm in this paper. We used the approaches to
build the model, identify and categorise how much conversion context or response is required for
sarcasm detection, and tested it on two social media forums, Twitter conversation dataset and Reddit
conversion dataset.
II. Sarcasm detection using machine learning algorithms in Twitter: A systematic review [2] (Samer Muthana Sarsam)

Understanding users' opinions on various topics or events on social media requires understanding
both literal and figurative meanings. Detecting sarcastic posts on social media has received a lot of
attention recently, especially because sarcastic tweets frequently include positive words that
represent negative or undesirable characteristics. To comprehend the application of various machine
learning algorithms for sarcasm detection in Twitter, the Preferred Reporting Items for Systematic
Reviews and Meta-Analyses (PRISMA) statement was used. The combination of Convolutional
Neural Network (CNN) and SVM was discovered to provide high prediction accuracy. Furthermore,
our findings show that lexical, pragmatic, frequency, and part-of-speech tagging can improve SVM
performance, whereas both lexical and personal features can improve CNN-SVM performance.
III. Multimodal Sarcasm Detection: A Deep Learning Approach [3] (Santosh Kumar Bharti, Rajeev Kumar Gupta)
In this paper, we proposed a hybrid model for detecting sarcasm in conversational data. To detect
sarcasm, this method combines text and audio features. Three models were developed: the text
model, which only works with textual features, the audio model, which only works with audio
features, and the hybrid model, which works with both text and audio features. The hybrid model's
text and audio features were obtained as latent vectors from the other two models. Based on the
results, it is clear that the hybrid model outperforms the other two models. This is due to text and
audio features compensating for each other's flaws. As a result, these findings support our hypothesis
that combining text and audio increases the likelihood of detecting sarcasm.

Implementation

1. Web-scraped data set of news headlines [4]

Fig. 1
We use a dataset that contains headline, article_link and is_sarcastic label for various news
headlines scraped from the internet.
2. Tokenization

Fig. 2
To convert words to sequences and then sequences to padded ones, we use the Tokenizer class to fit the words to a given vocabulary size. Padding appends extra tokens at the end of each sentence so that all sentences have the same length (see the sketch after this list).
3. Model declaration

Fig. 3
Using the Sequential class from tf.keras, we define a neural network with an Embedding layer to receive text sequences as input, ending with a single neuron that outputs the probability of sarcasm. The sigmoid function returns a value between 0 and 1, and binary cross-entropy is used since only two classes are being compared: [Sarcastic, NOT Sarcastic].
4. Fit the model and perform predictions

Fig. 4

Fig. 5
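
The tokenization and padding steps described above can be sketched as follows; the vocabulary size, sequence length and the train_headlines variable are illustrative assumptions rather than the report's exact settings:

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

vocab_size = 10000          # assumed vocabulary size
max_length = 40             # assumed padded sequence length
train_headlines = ["local man wins lottery", "scientists discover obvious thing"]  # placeholder data

tokenizer = Tokenizer(num_words=vocab_size, oov_token="<OOV>")
tokenizer.fit_on_texts(train_headlines)
sequences = tokenizer.texts_to_sequences(train_headlines)
padded = pad_sequences(sequences, maxlen=max_length, padding='post')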
Proposed System

The proposed system for detecting sarcasm in news headlines uses a deep learning model based on
a sequential neural network architecture. The model consists of several layers, including an
embedding layer, a global average pooling layer, and two dense layers.

The embedding layer converts the input text data into a numerical representation that the model can
process. The layer uses an embedding matrix with a fixed vocabulary size to map each word in the
input sentence to a corresponding vector representation. In this model, the embedding layer has 16
output dimensions.

The global average pooling layer computes the average of the embeddings for each word in the input
sentence to obtain a single vector representation of the sentence. This layer reduces the
dimensionality of the data and helps prevent overfitting.

The two dense layers are fully connected layers that perform nonlinear transformations on the input
data. The first dense layer has 24 output dimensions and uses the ReLU activation function, while
the second dense layer has a single output neuron with a sigmoid activation function. The sigmoid
function outputs a probability value between 0 and 1, representing the model's confidence that the
input sentence is sarcastic or not.

The model has a total of 160,433 parameters, all of which are trainable. During training, the model
adjusts its parameters to minimize the loss function between the predicted and actual labels for the
training data. Once trained, the model can be used to predict the sarcasm probability for new input
sentences.

Overall, this proposed system provides a simple and efficient approach to sarcasm detection using a
neural network architecture that has shown promising results in previous natural language processing
tasks.
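
A sketch of the architecture described above is given below. The 16-dimensional embedding, the 24-unit ReLU layer and the single sigmoid output are taken from the description; the vocabulary size and padded length are assumptions (notably, a 10,000-word vocabulary with 16 embedding dimensions plus the two dense layers reproduces the 160,433 trainable parameters quoted above):

import tensorflow as tf

vocab_size = 10000   # assumed
max_length = 40      # assumed
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 16, input_length=max_length),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(24, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()   # about 160,433 trainable parameters with these settings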

Fig. 6

Results

Fig. 7
We collected a dataset of news headlines from various sources, including satirical news websites, and trained our model on this dataset. The model achieved an accuracy of 97% on training data and 81% on validation data, demonstrating its effectiveness in identifying sarcasm in news headlines.
Conclusion
In this project, we developed a deep learning model for detecting sarcasm in news headlines using a
supervised learning approach. Our model was based on a sequential neural network architecture and

used an embedding layer, global average pooling layer, and two dense layers to make predictions
about the sarcasm probability of the input sentences.

Through our project, we highlighted the importance of developing sophisticated models to solve the
sarcasm detection problem and the potential of deep learning techniques in this domain. Our approach
provides a simple and efficient way of identifying sarcasm in text, which has important applications
in sentiment analysis, social media monitoring, and customer service.

However, we also identified some limitations of our model, including its reliance on textual data and
the potential for bias in the training dataset. We suggest that future research should explore the use of
multimodal data and more diverse datasets to improve the performance and generalization of sarcasm
detection models. Overall, this project provides valuable insights into the sarcasm detection problem
in natural language processing and highlights the potential for deep learning techniques to improve
the accuracy and reliability of sarcasm detection models.
References

[1] Kalaivani, A. and Thenmozhi, D., "Sarcasm Identification and Detection in Conversion Context using BERT," ACL Anthology. Available at: https://aclanthology.org/2020.figlang-1.10/ (Accessed: April 6, 2023).
[2] Sarsam, S. M., "Sarcasm detection using machine learning algorithms in Twitter: A systematic review," SAGE Journals. Available at: https://journals.sagepub.com/doi/full/10.1177/1470785320921779 (Accessed: April 6, 2023).
[3] Bharti, S. K. et al. (2022), "Multimodal Sarcasm Detection: A Deep Learning Approach," Wireless Communications and Mobile Computing, Hindawi. Available at: https://www.hindawi.com/journals/wcmc/2022/1653696/ (Accessed: April 6, 2023).
[4] Sarcasm dataset: https://github.com/ashwaniYDV/sarcasm-detection-tensorflow/blob/main/sarcasm.json

