Efficient Speech Emotion Recognition
Presented By: Samir Kumar Majhi
Presentation Outline
1. Introduction
2. Literature Review
3. Proposed Model
4. Dataset Used
5. Results
6. Comparison Analysis
Introduction
Speech Emotion Recognition (SER) is the task of identifying how a person feels from their voice. It analyzes acoustic cues such as pitch, loudness, and speaking style to classify emotions like happiness, sadness, anger, or fear. SER is applied in customer service, healthcare, and human-computer interaction to make communication more natural. With advances in machine learning, SER systems are becoming increasingly accurate at helping machines understand human emotions.
Literature Review

1. Ala Saleh Alluhaidan, Oumaima Saidani et al., "Speech Emotion Recognition through Hybrid Features and Convolutional Neural Network", Applied Sciences.
• Proposed method: A Convolutional Neural Network (CNN) using hybrid features (MFCC and time-domain features) to improve emotion detection accuracy.
• Merits: Achieves high accuracy (97%, 93%, and 92% on the Emo-DB, SAVEE, and RAVDESS datasets respectively), outperforming traditional methods.
• Limitation: Limited to audio-based datasets; needs further exploration for multimodal or real-world complex emotion detection scenarios.

2. Samuel Kakuba, Alwin Poulose, Dong Seog Han et al., "Deep Learning-Based Speech Emotion Recognition Using Multi-Level Fusion of Concurrent Features", IEEE Access.
• Proposed method: CoSTGA model using multi-level fusion of spatial, temporal, and semantic features.
• Merits: Achieves 75.50% weighted accuracy and 75.82% unweighted accuracy, showing improved robustness.
• Limitation: Limited to speech-based emotion recognition; may need further exploration with additional modalities for real-world applications.

3. Samuel Kakuba, Alwin Poulose, and Dong Seog Han (Senior Member, IEEE), "Deep Learning-Based Speech Emotion Recognition Using Multi-Level Fusion of Concurrent Features", IEEE Access.
• Proposed method: Uses CNNs within a deep learning model called CoSTGA that learns features from the audio and the text of speech at the same time.
• Merits: Deep learning with multi-level fusion offers comprehensive understanding, real-world applicability, enhanced accuracy, and the ability to effectively ignore background noise in speech emotion recognition tasks.
• Limitation: Complex architectures and multi-level fusion techniques have large data requirements and a potential lack of interpretability due to the complexity of the models.

4. Cheng Lu, Wenming Zheng (Senior Member, IEEE), Hailun Lian, Yuan Zong (Member, IEEE), Chuangao Tang, Sunan Li, and Yan Zhao, "Speech Emotion Recognition via an Attentive Time–Frequency Neural Network", IEEE Transactions on Computational Social Systems, Vol. 10, No. 6, December 2023.
• Proposed method: Introduces an Attentive Time-Frequency Neural Network that captures time-frequency patterns in speech signals to identify emotional states.
• Merits: Achieves state-of-the-art performance in speech emotion recognition by leveraging attention mechanisms to focus on relevant time-frequency features.
• Limitation: While achieving high accuracy, the computational complexity of the model is higher than that of simpler models, limiting its real-time application in resource-constrained environments.
Dataset Used: RAVDESS
The Ryerson Audio-Visual Database of Emotional Speech
and Song (RAVDESS) contains 1,440 speech samples from
24 actors expressing 8 emotions (neutral, calm, happy, sad,
angry, fearful, disgust, and surprised). It is widely used for
emotion recognition research due to its high-quality,
labeled emotional expressions, supporting the training of
machine learning models like DNNs and CNNs.
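For illustration, below is a minimal sketch of how emotion labels can be read from RAVDESS files, assuming the standard filename convention in which the third hyphen-separated field encodes the emotion; the directory path and helper name are placeholders.

```python
# Sketch: mapping RAVDESS filenames to emotion labels.
# Assumes the standard RAVDESS naming scheme, where the third
# hyphen-separated field encodes the emotion
# (e.g. "03-01-05-01-02-01-12.wav" -> code "05" = angry).
from pathlib import Path

EMOTIONS = {
    "01": "neutral", "02": "calm",    "03": "happy",   "04": "sad",
    "05": "angry",   "06": "fearful", "07": "disgust", "08": "surprised",
}

def load_ravdess_labels(root: str) -> list[tuple[str, str]]:
    """Return (file_path, emotion_label) pairs for every .wav file under root."""
    pairs = []
    for wav in Path(root).rglob("*.wav"):
        emotion_code = wav.stem.split("-")[2]   # third field of the filename
        pairs.append((str(wav), EMOTIONS[emotion_code]))
    return pairs

# Example usage (the directory name is illustrative):
# samples = load_ravdess_labels("data/RAVDESS")
```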
Our Final Proposed Model
Steps involved (illustrative sketches of the feature-extraction and model stages follow this list):
1. Audio Input – Raw audio is fed into the system.
2. MFCC Extraction – Extracts Mel-Frequency Cepstral Coefficients, which capture the spectral shape of the voice.
3. RMS Extraction – Measures the loudness (energy) of the signal.
4. ZCR Extraction – Counts how often the signal changes sign (zero-crossing rate).
5. Feature Fusion – Combines the MFCC, RMS, and ZCR features into one feature vector.
6. CNN Block – Learns useful patterns from the combined features.
7. Flattening – Converts the learned feature maps into a single vector.
8. Dense Layers – Learn higher-level representations of the emotions.
9. Dropout – Helps prevent the model from overfitting.
10. Softmax Output – Selects the emotion with the highest probability.
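A minimal sketch of steps 1-5 using librosa is shown below; the number of MFCCs and the mean-pooled concatenation of the three feature types are assumptions, since the slides do not fix these choices.

```python
# Sketch of steps 1-5: extract MFCC, RMS, and ZCR with librosa and fuse them.
# Frame parameters, n_mfcc, and mean-pooling fusion are assumptions; the slides
# only state that the three feature types are combined.
import librosa
import numpy as np

def extract_fused_features(wav_path: str, sr: int = 22050, n_mfcc: int = 40) -> np.ndarray:
    y, sr = librosa.load(wav_path, sr=sr)                    # step 1: audio input
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # step 2: MFCCs
    rms = librosa.feature.rms(y=y)                           # step 3: loudness
    zcr = librosa.feature.zero_crossing_rate(y)              # step 4: zero-crossing rate
    # step 5: fuse by averaging each feature over time and concatenating
    return np.concatenate([mfcc.mean(axis=1), rms.mean(axis=1), zcr.mean(axis=1)])
```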
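A corresponding sketch of steps 6-10 as a 1-D CNN in Keras; the layer sizes, kernel widths, and dropout rate are illustrative choices rather than the exact configuration behind the reported results.

```python
# Sketch of steps 6-10: a 1-D CNN over the fused feature vector.
# Layer sizes, kernel widths, and the dropout rate are illustrative assumptions.
from tensorflow import keras
from tensorflow.keras import layers

def build_ser_model(input_dim: int = 42, num_emotions: int = 8) -> keras.Model:
    model = keras.Sequential([
        layers.Input(shape=(input_dim, 1)),                      # fused features as a 1-D sequence
        layers.Conv1D(64, kernel_size=5, activation="relu"),     # step 6: CNN block
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(128, kernel_size=5, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(),                                        # step 7: flattening
        layers.Dense(128, activation="relu"),                    # step 8: dense layers
        layers.Dropout(0.3),                                     # step 9: dropout
        layers.Dense(num_emotions, activation="softmax"),        # step 10: softmax output
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

In this sketch, the fused feature vectors from the previous snippet would be reshaped to (num_samples, input_dim, 1) before being passed to model.fit.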
RESULTS AND DISCUSSION
• Model performance metrics: Accuracy (Acc), Precision, Recall, and F1-score (a sketch of computing them follows this list).
• Comparison across feature combinations:
  • MFCC only: 85.30% accuracy.
  • RMS + ZCR: 88.40% accuracy.
  • MFCC + RMS + ZCR (Fusion Model): 99.31% accuracy.
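A minimal sketch of computing these metrics on a held-out test set with scikit-learn; y_test and y_pred are placeholders for the true and predicted emotion labels.

```python
# Sketch: accuracy, precision, recall, and F1 on a held-out test set.
# y_test and y_pred are placeholders for true and predicted emotion labels.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def report_metrics(y_test, y_pred):
    acc = accuracy_score(y_test, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="weighted", zero_division=0
    )
    print(f"Accuracy:  {acc:.4f}")
    print(f"Precision: {precision:.4f}")
    print(f"Recall:    {recall:.4f}")
    print(f"F1-score:  {f1:.4f}")
```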
Comparison Analysis
Existing Model (MFCC + VGGish):
• Uses MFCC features with a pre-trained model called VGGish.
• VGGish is designed for general audio tasks, not emotion recognition.
• Accuracy is lower because it is not tailored to emotion detection.
Proposed Model (MFCC + RMS + ZCR + CNN):
• Combines MFCC, RMS, and ZCR features.
• Uses a CNN to learn richer patterns from the fused features.
• Achieves very high accuracy (99.31%) for emotion recognition.
• Performs better because it combines complementary types of information and is designed specifically for emotion recognition.