Speech Emotion Detection System using Python
1) Background/Problem Statement
In a face-to-face conversation, it is often possible to gauge the
other person's emotions through cues such as facial expressions and
body language. Over the telephone, however, these cues are missing,
and it becomes very difficult to get a sense of an individual's
emotional state.
During the COVID-19 pandemic, many of us were confined to our
homes. In such trying times, it is essential to keep tabs on the mental
health and well-being of family, friends, and co-workers. This
Speech Emotion Detection project is designed to help detect a
person's emotions from their voice.
Users can either upload a pre-recorded audio file or record a new file
and analyze it. This project has been developed using the Django
Framework with Python as the programming language.
2) Working of the Project
Following an initial analysis of the shortcomings of the existing
system, computerization of the whole activity is proposed. The web
application is developed using the Django framework with Python as
the programming language. The proposed system is accessed by a
single entity: the user.
Users must first log in with valid login credentials. After
successfully logging in, the user can access all the modules and
perform each task: uploading an audio file, recording audio, and
analyzing the emotions in the audio.
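The text above does not say which acoustic features or model the analysis step relies on. Purely as an illustration of the kind of low-level measurement such an analysis could start from, the sketch below computes two common frame-level descriptors, RMS energy and zero-crossing rate, using only the standard library. The function name `frame_features`, the 16 kHz sample rate, and the 400-sample frame length are all assumptions made for this example, not details of the project.

```python
import math


def frame_features(samples, frame_len=400):
    """Return (rms_energy, zero_crossing_rate) for each full frame.

    samples: sequence of floats in [-1.0, 1.0] (mono audio, already decoded).
    frame_len: samples per frame; 400 is 25 ms at an assumed 16 kHz rate.
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        # Count sign changes between neighbouring samples.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        features.append((rms, zcr))
    return features


# Synthetic input: one frame of silence, then one frame of a loud 1 kHz tone.
# (Hypothetical data; a real system would decode the uploaded or recorded file.)
sample_rate = 16000
silence = [0.0] * 400
tone = [0.8 * math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(400)]
feats = frame_features(silence + tone)
```

Here `feats` holds one `(rms, zcr)` pair per frame: the silent frame yields zero for both values, while the tone frame yields a clearly higher energy and crossing rate. A classifier would need far richer features than these two; the sketch only shows the shape of the step.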
The front end uses HTML, CSS, and JavaScript, and the back end
uses Python. The framework is Django and the database is MySQL.
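As a rough illustration of how a Django project is pointed at a MySQL database, the fragment below shows the `DATABASES` setting that would sit in `settings.py`. Only the `ENGINE` value is fixed by Django; the database name, user, password, host, and port are placeholders assumed for this sketch and are not taken from the project.

```python
# Fragment of a Django settings.py; every value except ENGINE is a placeholder.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # Django's built-in MySQL backend
        "NAME": "speech_emotion_db",  # hypothetical database name
        "USER": "app_user",           # hypothetical MySQL account
        "PASSWORD": "change-me",      # placeholder; load from the environment in practice
        "HOST": "127.0.0.1",          # assumes a database server on the same machine
        "PORT": "3306",               # MySQL's default port
    }
}
```

Django's MySQL backend also needs a database driver such as `mysqlclient` installed in the Python environment.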
3) Advantages
- It is easy to maintain.
- It is user-friendly.
- The system can recognize human emotions from speech, audio files,
or recordings.
- It can help resolve many problems by understanding emotions.
4) System Description
The system comprises one major module with the following
sub-modules:
USER:
Sign Up: The user registers an account with their basic details.
Sign In: The user can log in to the system using their valid login
credentials.
Home
o Upload an Audio File: The user uploads an audio file in MP3
format, and the system displays all the emotions detected.
o Record an Audio: The user speaks for 1 minute, and the system
displays all the emotions detected.
Logout: The user can log out from the system, once they have
completed their task.
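Because the Upload an Audio File step expects MP3 input, a cheap pre-check before analysis can reject obviously wrong uploads. The sketch below is one possible check, not the project's actual validation: it tests the file extension and then looks for either an ID3v2 tag or an MPEG audio frame sync at the start of the file. The function name `looks_like_mp3` and its parameters are hypothetical.

```python
import os


def looks_like_mp3(filename, head):
    """Cheap pre-check before analysis; not a full MP3 parser.

    filename: name supplied by the uploader.
    head: the first few bytes of the uploaded file.
    """
    if os.path.splitext(filename)[1].lower() != ".mp3":
        return False
    if head[:3] == b"ID3":  # file begins with an ID3v2 metadata tag
        return True
    # Otherwise expect an MPEG audio frame sync: 0xFF followed by a byte
    # whose top three bits are set (11 sync bits in total).
    return len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0


# Example calls with hypothetical inputs.
tagged = looks_like_mp3("voice.mp3", b"ID3\x03\x00")
raw_frames = looks_like_mp3("voice.MP3", b"\xff\xfb\x90\x00")
wrong_extension = looks_like_mp3("voice.wav", b"RIFF")
renamed_text = looks_like_mp3("notes.mp3", b"hello")
```

The first two calls pass the check and the last two fail it. A file that passes can still be corrupt further in, so the analysis step should handle decoding errors as well.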
5) Project Life Cycle
The waterfall model is a classical model used in the system
development life cycle to create a system with a linear and sequential
approach. It is termed a waterfall because the model develops
systematically from one phase to another in a downward fashion. The
waterfall approach does not define a process for returning to a
previous phase to handle changes in requirements. It is the earliest
approach used for software development.
6) System Requirements
I. Hardware Requirements
i. Laptop or PC
- Windows 7 or higher
- i3 processor or higher
- 4 GB RAM or higher
- 100 GB of disk storage or higher
II. Software Requirements
i. Laptop or PC
- Python
- Sublime Text Editor
- XAMPP Server
7) Limitations/Disadvantages
- The user must upload a valid audio file; otherwise, the system
will fail to detect emotions.
- Also, when recording, the user must speak for a full 1 minute.
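The 1-minute requirement could be enforced by measuring the recording before it is analyzed. The sketch below assumes the recording is available as WAV data, which the text above does not specify, and uses Python's standard `wave` module to compute its duration. The helper names `wav_duration_seconds`, `is_long_enough`, and `make_silent_wav` are hypothetical; the last one exists only to produce demonstration input.

```python
import io
import wave

REQUIRED_SECONDS = 60  # the 1-minute requirement stated above


def wav_duration_seconds(wav_bytes):
    """Duration of an in-memory WAV recording, in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(wav_bytes, required=REQUIRED_SECONDS):
    """True when the recording lasts at least the required number of seconds."""
    return wav_duration_seconds(wav_bytes) >= required


def make_silent_wav(seconds, rate=8000):
    """Build a silent mono 16-bit WAV purely for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * (rate * seconds))
    return buffer.getvalue()


# Example: a 59-second clip is rejected, a 60-second clip is accepted.
too_short = is_long_enough(make_silent_wav(59))
full_minute = is_long_enough(make_silent_wav(60))
```

Rejecting a short recording with a clear message, rather than letting the analysis fail, would make this limitation easier for the user to work with.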
8) Application
This application recognizes and detects emotions from audio files or
recordings.
9) References
- https://ieeexplore.ieee.org/abstract/document/8805181
- http://ieeexplore.ieee.org/abstract/document/7002390/
- http://ieeexplore.ieee.org/document/9383000
- https://ieeexplore.ieee.org/document/4555476