
GLOBAL ACADEMY OF TECHNOLOGY
Rajarajeshwari Nagar, Bengaluru - 560098
Department of Electronics and Communication Engineering

Major Project Presentation (21ECP83)
on
"Real-Time Sign Language Translation using CNN-LSTM Hybrid Architecture for Spatial-Temporal Gesture Synthesis and Grammar Refinement"

PRESENTED BY:
C Saanvi        1GA21EC028
Chidaksh Babu   1GA21EC032
Varsha P V      1GA21EC165

Under the Guidance of
Mrs. Shubha G.N
Assistant Professor
Dept. of ECE, GAT
CONTENTS
1. INTRODUCTION
2. LITERATURE SURVEY
3. PROBLEM STATEMENT
4. OBJECTIVES
5. IMPLEMENTATION
6. RESULTS AND DISCUSSION
7. ADVANTAGES/DISADVANTAGES
8. APPLICATIONS
9. CONCLUSION
10. FUTURE SCOPE
11. REFERENCES
INTRODUCTION
Sign Language (SL) serves as a vital communication method for individuals with hearing or speech impairments. However, a major barrier arises when they communicate with non-signers. To bridge this gap, automated Sign Language Translation (SLT) systems have gained research attention in recent years.
These systems aim to translate hand gestures into text, enabling effective
interaction between signers and non-signers. Deep learning models, especially
CNNs and LSTMs, have shown great potential in recognizing both spatial and
temporal aspects of hand movements.
This project adopts a real-time SLT approach using computer vision and deep
learning, focusing on accurate gesture recognition and grammatically correct output
to enhance inclusivity and communication.
LITERATURE SURVEY
The reviewed works on sign language recognition and translation are listed in the References section ([1]-[10]).
PROBLEM STATEMENT
There is a major communication gap between sign language users
and non-signers. Existing solutions struggle with real-time gesture
recognition, lack accuracy in dynamic hand movements, and don’t
offer grammar correction. This project aims to build a real-time sign
language translator using deep learning (CNN-LSTM), MediaPipe for
hand tracking, and grammar correction to bridge this gap and
improve accessibility.
OBJECTIVES
To design a real-time sign language translation system using deep
learning.
To accurately detect and classify dynamic hand gestures using
MediaPipe and LSTM.
To convert recognized gestures into grammatically correct sentences
using NLP tools.
To enhance communication accessibility for the hearing and speech-
impaired community.
IMPLEMENTATION
Video Frame Capture:
Real-time video is captured using a webcam with OpenCV. Frames are
continuously extracted from the video stream to monitor hand movements as
they happen.
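A minimal sketch of this capture loop, assuming the default webcam (index 0) and the standard OpenCV API; the window name and exit key are illustrative choices, not taken from the original code:

import cv2

cap = cv2.VideoCapture(0)                    # open the default webcam
while cap.isOpened():
    ok, frame = cap.read()                   # grab one BGR frame from the stream
    if not ok:
        break
    cv2.imshow("Sign Language Translator", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):    # press 'q' to stop capturing
        break
cap.release()
cv2.destroyAllWindows()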
Hand Landmark Detection:
MediaPipe Holistic is used to detect 21 key 3D landmarks on the hand in each
frame. These points represent the shape and orientation of the hand, forming
the core input for gesture recognition.
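An illustrative way to pull the 21 (x, y, z) landmarks per hand with MediaPipe Holistic; the function name and the zero-filled fallback for frames without a visible hand are assumptions for this sketch:

import cv2
import mediapipe as mp

holistic = mp.solutions.holistic.Holistic(min_detection_confidence=0.5,
                                          min_tracking_confidence=0.5)

def extract_hand_landmarks(frame_bgr):
    # MediaPipe expects RGB input, while OpenCV frames are BGR
    results = holistic.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    hand = results.right_hand_landmarks       # 21 landmarks when a hand is detected
    if hand is None:
        return [0.0] * 63                     # placeholder frame when no hand is visible
    return [c for lm in hand.landmark for c in (lm.x, lm.y, lm.z)]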
Feature Preprocessing:
The extracted landmarks are normalized to reduce variation due to hand size or
camera angle. Input sequences are padded to a uniform length, and
augmentation techniques like flipping and rotation are used to improve model
robustness.
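A sketch of these preprocessing steps; the sequence length, wrist-relative normalization and mirror-flip augmentation shown here are plausible choices, not necessarily the exact ones used in the project:

import numpy as np

SEQ_LEN, NUM_FEATURES = 30, 63               # assumed frames per gesture and 21 x 3 landmark values

def normalize(seq):
    # shift landmarks so the wrist (landmark 0) is the origin, then scale to unit range
    seq = np.asarray(seq, dtype=np.float32).reshape(-1, 21, 3)
    seq -= seq[:, :1, :]
    return (seq / max(np.abs(seq).max(), 1e-6)).reshape(-1, NUM_FEATURES)

def pad(seq):
    # pad (or truncate) a variable-length sequence to a fixed number of timesteps
    out = np.zeros((SEQ_LEN, NUM_FEATURES), dtype=np.float32)
    out[:min(len(seq), SEQ_LEN)] = seq[:SEQ_LEN]
    return out

def mirror_flip(seq):
    # simple augmentation: flip the x coordinate of every landmark
    seq = np.asarray(seq).reshape(-1, 21, 3).copy()
    seq[..., 0] *= -1.0
    return seq.reshape(-1, NUM_FEATURES)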
Model Training (LSTM):
A deep learning model based on stacked LSTM layers is trained on labeled
gesture sequences. The model captures temporal patterns in the hand
movements and learns to associate them with specific gestures. It is optimized
using categorical cross-entropy loss and the Adam optimizer.
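A minimal Keras sketch of such a stacked-LSTM classifier; the layer widths, dropout rate and number of classes are assumptions, while the loss and optimizer match those named above:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

NUM_CLASSES = 10                                            # assumed number of gesture classes

model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(30, 63)),  # 30 frames x 63 landmark features
    LSTM(128, return_sequences=True),
    LSTM(64),                                               # final LSTM layer returns one vector
    Dropout(0.3),
    Dense(64, activation="relu"),
    Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=200, validation_data=(X_val, y_val))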
Gesture Classification:
In real-time, the preprocessed input is passed through the trained LSTM model. It
classifies the gesture by predicting the most likely class based on the sequence
of hand movements.
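An illustrative inference helper built on the model sketched above; the confidence threshold and label list are placeholders:

import numpy as np

def classify(model, sequence, labels, threshold=0.7):
    # sequence: one padded (SEQ_LEN, NUM_FEATURES) array of landmark features
    probs = model.predict(np.expand_dims(sequence, axis=0), verbose=0)[0]
    best = int(np.argmax(probs))
    return labels[best] if probs[best] >= threshold else None   # None = low-confidence prediction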
Grammar Correction:
The output from gesture classification is refined using language_tool_python. This
module corrects grammar, ensuring that the translated sentence is clear, correct,
and easy to understand.
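A small sketch of this refinement step with language_tool_python; the example sentence is illustrative:

import language_tool_python

tool = language_tool_python.LanguageTool("en-US")

raw_text = "me go school tomorrow"        # example word-by-word gesture output
corrected = tool.correct(raw_text)        # apply LanguageTool's suggested corrections
print(corrected)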
Real-Time Display:
The final, grammatically correct output is displayed on a user-friendly interface,
allowing seamless and accessible communication between sign language users
and non-signers.
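One possible way to overlay the corrected sentence on the live feed with OpenCV; the banner size, font and colours are arbitrary choices for this sketch:

import cv2

def draw_translation(frame, sentence):
    # dark banner across the top of the frame, then the translated sentence in white
    cv2.rectangle(frame, (0, 0), (frame.shape[1], 40), (0, 0, 0), -1)
    cv2.putText(frame, sentence, (10, 28), cv2.FONT_HERSHEY_SIMPLEX,
                0.8, (255, 255, 255), 2, cv2.LINE_AA)
    return frame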
RESULTS AND DISCUSSION
Achieved 76.42% accuracy in real-time gesture recognition.
Precision, Recall, F1-Score: All at 0.76, showing balanced
performance.
Outperforms traditional methods by capturing temporal
patterns using LSTM.
Works smoothly on both CPU and GPU.
Handles single-hand gestures with good reliability.
ADVANTAGES/DISADVANTAGES
Advantages:
Real-time sign language translation
High accuracy with spatial-temporal modeling (LSTM)
Grammar correction for meaningful sentences
Compatible with both CPU and GPU setups
User-friendly interface for easy interaction

Disadvantages:
Accuracy depends on lighting and camera quality
Requires a large, diverse dataset for better generalization
Not yet integrated with speech output
May lag on low-resource devices during training
APPLICATIONS
Communication for the Hearing and Speech Impaired: The system
translates sign language into text or speech, enabling real-time
communication for those with hearing or speech impairments. This helps
bridge the gap between them and the general public.
Integration with Smart Devices: Sign language recognition can be
integrated into smart devices like smartphones or wearables. It allows
users to control these devices through hand gestures.
Educational Tools: The system can be used in apps to teach and help
practice sign language. It provides real-time feedback, making learning
more interactive and accessible.
Healthcare Applications: In medical environments, this system helps
healthcare providers communicate with deaf or hard-of-hearing patients. It
ensures accurate communication for proper care and treatment.
Real-Time Translation for Accessibility: Sign language recognition can be
used at public events to translate sign language into text or speech. This
makes events accessible to people who are deaf or hard of hearing.
CONCLUSION
Sign language recognition systems built on deep learning
architectures such as the CNN-LSTM hybrid have immense potential
to enhance communication and accessibility. These systems
can bridge gaps for the hearing and speech impaired, integrate
with smart technologies, and improve educational tools. They
also hold promise in healthcare settings and public events,
ensuring inclusivity and real-time interaction for those who rely
on sign language. As technology advances, such systems will
continue to make significant contributions to society,
empowering individuals and fostering better communication
across various domains.
FUTURE SCOPE
Support for Multiple Sign Languages: Expanding the system to
recognize various sign languages globally, making it more inclusive.
Integration with Wearable Devices and AR: Utilizing smart wearables
or augmented reality for more intuitive sign language
communication.
Sign Language to Speech Translation: Converting sign language into
speech in real-time, allowing seamless communication with non-
signers.
Real-Time Multilingual Translation: Enabling real-time translation of
sign language into multiple spoken languages for broader
accessibility.
Enhanced Accuracy and Speed: Improving recognition accuracy and
processing speed with advanced machine learning, making systems
more reliable and efficient.
REFERENCES
[1] D. Guo, W. Zhou, H. Li, and M. Wang, "Hierarchical LSTM for sign language translation," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, pp. 1–8, Apr. 2018.
[2] Q. Xiao, X. Chang, X. Zhang, and X. Liu, "Multi-information spatial–temporal LSTM fusion continuous sign language neural machine translation," IEEE Access, vol. 8, pp. 216718–216728, 2020, doi: 10.1109/ACCESS.2020.3039539.
[3] D. A. Kumar, A. S. C. S. Sastry, P. V. V. Kishore, E. K. Kumar, and M. T. K. Kumar, "S3DRGF: Spatial 3-D relational geometric features for 3-D sign language representation and recognition," IEEE Signal Processing Letters, vol. 26, no. 1, pp. 169–173, Jan. 2019, doi: 10.1109/LSP.2018.2883864.
[4] M. Sultana, J. Thomas, S. Thomas, M. SA, and S. L. S, "Design and development of teaching and learning tool using sign language translator to enhance the learning skills for students with hearing and verbal impairment," in Proc. 2nd Int. Conf. Emerging Trends Inf. Technol. Eng. (ICETITE), Vellore, India, 2024, pp. 1–5, doi: 10.1109/ic-ETITE58242.2024.10493342.
[5] M. Ahmed, M. Idrees, Z. ul Abideen, R. Mumtaz, and S. Khalique, "Deaf talk using 3D animated sign language: A sign language interpreter using Microsoft's Kinect v2," in Proc. 2016 SAI Computing Conf. (SAI), London, UK, 2016, pp. 330–335, doi: 10.1109/SAI.2016.7556002.
[6] O. Koller, N. C. Camgoz, H. Ney, and R. Bowden, "Weakly supervised learning with multi-stream CNN-LSTM-HMMs to discover sequential parallelism in sign language videos," IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 9, pp. 2306–2320, Sept. 2020, doi: 10.1109/TPAMI.2019.2911077.
[7] Y. Liao, P. Xiong, W. Min, W. Min, and J. Lu, "Dynamic sign language recognition based on video sequence with BLSTM-3D residual networks," IEEE Access, vol. 7, pp. 38044–38054, 2019, doi: 10.1109/ACCESS.2019.2904749.
[8] I. Papastratis, K. Dimitropoulos, D. Konstantinidis, and P. Daras, "Continuous sign language recognition through cross-modal alignment of video and text embeddings in a joint-latent space," IEEE Access, vol.
[9] Z. Liu et al., "Improving end-to-end sign language translation with adaptive video representation enhanced transformer," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 9, pp. 8327–8342, Sept. 2024, doi: 10.1109/TCSVT.2024.3376404.
[10] Z. Huang, W. Xue, Y. Zhou et al., "Dual-stage temporal perception network for continuous sign language recognition," The Visual Computer, vol. 41, pp. 1971–1986, 2025. [Online]. Available: https://doi.org/10.1007/s00371-024-03516-x
THANK YOU
