
🧠 EduSign AI

Bridging the Communication Gap with Artificial Intelligence


🌍 Overview

EduSign AI is a bi-directional AI communication platform designed to connect deaf and hearing individuals in real time. It translates speech and text into sign language and also translates sign language gestures back into speech and text, powered by deep learning, computer vision, and 3D avatar technology.

Whether in classrooms, workplaces, or online meetings, EduSign AI ensures that everyone, regardless of hearing ability, can understand, participate, and communicate effortlessly.


💡 Problem

Deaf and hard-of-hearing individuals often face barriers in education, meetings, and daily life. Existing captioning tools only cover text, missing the full expression of tone, emotion, and gesture. There is a global need for an accessible solution that enables two-way communication between signers and speakers in any environment.


🚀 Solution

EduSign AI bridges this gap through an intelligent system that:

  • Recognizes speech and text, converting them into expressive sign language via a 3D avatar.
  • Understands sign gestures from users and converts them into spoken or written language.
  • Displays real-time transcripts alongside the avatar to ensure clarity and learning support.

It empowers full participation in classrooms, online meetings (Zoom, Google Meet, Teams), and social environments, creating an inclusive communication experience for all.


πŸ” How It Works

1. Speech → Sign → Transcript

  • Captures live speech using Whisper AI or Google STT (see the sketch after this list).
  • Converts the text into sign language animations via a 3D avatar.
  • Simultaneously displays transcripts as captions for learning support.
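
A minimal sketch of the speech-capture step, assuming the open-source openai-whisper package and a placeholder audio file name:

# Transcribe audio with Whisper (pip install openai-whisper)
import whisper

model = whisper.load_model("base")        # small general-purpose checkpoint
result = model.transcribe("lecture.wav")  # "lecture.wav" is a placeholder
print(result["text"])                     # this text drives captions and signing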

2. Sign → Speech → Transcript

  • Captures sign language gestures through MediaPipe or OpenPose (sketched below).
  • Interprets gestures into text using a fine-tuned WLASL-based model.
  • Converts text to speech and shows a transcript confirmation.
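
Roughly how the gesture-capture and speech steps connect, sketched with MediaPipe Hands and gTTS; the recognizer call is a placeholder for the fine-tuned WLASL-based model described under Model Training:

# Capture hand landmarks, then voice the recognized sign with gTTS
import cv2
import mediapipe as mp
from gtts import gTTS

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2)
frame = cv2.imread("frame.jpg")  # placeholder input frame
results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    # Flatten (x, y, z) landmarks into the features the recognizer consumes
    landmarks = [(lm.x, lm.y, lm.z)
                 for hand in results.multi_hand_landmarks
                 for lm in hand.landmark]
    recognized = "hello"              # placeholder for the model's prediction
    gTTS(recognized).save("out.mp3")  # spoken output for the hearing listener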

3. Multi-Modal Accessibility

  • Users can toggle transcripts, adjust signing speed, or replay translations.
  • Supports multiple spoken and sign languages for global inclusivity.

✨ Key Features

  • πŸŽ™οΈ Speech-to-Sign Translation – Converts voice or text to sign language via an expressive avatar.
  • βœ‹ Sign-to-Speech Recognition – Converts gestures to spoken or written words using AI vision.
  • 🧍 3D Avatar Interpreter – Natural, emotion-aware signing with lifelike gestures.
  • πŸ’¬ Real-Time Transcripts – Displayed alongside avatar for clarity and accessibility.
  • 🌐 Multi-Platform Support – Integrates into Zoom, Google Meet, Teams, and web apps.
  • πŸ“Ά Offline Mode – Works in low-connectivity areas using local AI models.
  • 🧩 Multi-Language + Multi-Sign Support – GSL, ASL, BSL, and more.
  • πŸ”Š Emotion-Aware Communication – Preserves tone and intent across translations.
  • 🧠 Adaptive Learning – Improves translation accuracy with community feedback.
  • ☁️ Supabase Integration – For cloud sync, user preferences, and dataset management.
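
As an illustration of the Supabase integration, a hedged sketch with supabase-py; the table name and columns are assumptions, not the project's actual schema:

# Hypothetical preference sync with supabase-py (pip install supabase)
from supabase import create_client

supabase = create_client("https://YOUR_PROJECT.supabase.co", "YOUR_ANON_KEY")
supabase.table("user_preferences").upsert({
    "user_id": "demo-user",
    "show_transcripts": True,   # toggle captions next to the avatar
    "signing_speed": 0.8,       # fraction of normal signing speed
    "sign_language": "GSL",     # GSL, ASL, BSL, ...
}).execute()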

πŸ› οΈ Tech Stack

Frontend: React.js, Next.js, Tailwind CSS
Backend: FastAPI, Supabase
AI Components: TensorFlow / PyTorch, MediaPipe, Whisper, gTTS
3D Rendering: Three.js or Unity WebGL for avatar gestures
Integrations: Zoom SDK, Google Meet API, Microsoft Teams SDK
Deployment: Docker + Vercel


🚀 Quick Start

Prerequisites

  • Python 3.8+
  • Node.js 18+
  • PyTorch 2.0+
  • MediaPipe 0.10+

Backend Setup

cd backend
pip install -r ../requirements.txt
uvicorn app.main:app --reload
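
Once uvicorn is running, FastAPI serves interactive API documentation at http://localhost:8000/docs by default.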

Frontend Setup

npm install
npm run dev

🤖 Model Training

GSL Sign Recognition Model

Our model is fine-tuned on Ghanaian Sign Language (GSL) with 98.18% validation accuracy and 98.54% overall test accuracy across 1,485 sign classes using the FullI3D architecture.

Training Pipeline:

# Preprocess landmarks (one-time)
python scripts/preprocess_landmarks.py

# Fine-tune with FullI3D architecture (recommended)
python scripts/train_edusign_gsl.py \
    --architecture i3d \
    --pretrained-model backend/app/models/pretrained_wlasl.pth \
    --epochs 50 \
    --batch-size 8 \
    --base-channels 64 \
    --fine-tune-lr 0.0001 \
    --augment --oversample \
    --loss focal --class-weights \
    --scheduler cosine

# Alternative: SimpleI3D (faster training)
python scripts/train_edusign_gsl.py \
    --architecture simple \
    --pretrained-model backend/app/models/pretrained_wlasl.pth \
    --epochs 50 \
    --batch-size 16 \
    --fine-tune-lr 0.0001

# Test model
python scripts/inference_edusign_gsl.py \
    --model backend/app/models/edusign_gsl_finetuned.pth \
    --input <frame.jpg>

# Detailed evaluation
python scripts/evaluate_model_detailed.py \
    --model backend/app/models/edusign_gsl_finetuned.pth \
    --output-dir evaluation_results

Model Performance

FullI3D Architecture (Current Best):

  • Validation Accuracy: 98.18% (best), 97.27% (final)
  • Test Accuracy: 98.54%
  • Top-5 Accuracy: 100.00%
  • Classes: 1,485 GSL signs
  • Architecture: FullI3D (3D Convolutions)
  • Base Model: WLASL pretrained on 2,000 ASL signs
  • Model Size: 46MB
  • Training Accuracy: 99.54%

Previous SimpleI3D (Baseline):

  • Validation Accuracy: 95.45%
  • Architecture: Simplified I3D (LSTM-based)
  • Improvement: +2.73% with FullI3D
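
Fine-tuning swaps the pretrained 2,000-class ASL head for a 1,485-class GSL head and trains with a small learning rate. A minimal sketch of that pattern, using torchvision's r3d_18 as a stand-in for the project's FullI3D network:

# Head-swap fine-tuning sketch; r3d_18 stands in for FullI3D here
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

model = r3d_18(weights=None)  # 3D-conv video backbone
# (in the real pipeline, the WLASL checkpoint would be loaded here)
model.fc = nn.Linear(model.fc.in_features, 1485)  # new GSL classifier head

# Small fine-tuning learning rate and cosine schedule, matching the
# --fine-tune-lr 0.0001 and --scheduler cosine flags above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)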

📊 Dataset

Our GSL dataset includes:

  • 1,525 dictionary entries with sign meanings
  • 8,980 validated frames extracted from YouTube videos
  • MediaPipe landmarks pre-extracted and cached for fast training

Dataset Scripts

# Extract dictionary from PDF
python scripts/extract_gsl_dictionary.py

# Download YouTube videos
python scripts/download_youtube_sign_videos.py <video_url>

# Extract and validate frames
python scripts/extract_frames_from_videos.py
python scripts/validate_video_frames.py
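
For intuition, what the landmark pre-extraction step might look like; the actual logic lives in scripts/preprocess_landmarks.py, and the cache path and format here are assumptions:

# Extract MediaPipe Holistic landmarks once and cache them as NumPy arrays
import cv2
import numpy as np
import mediapipe as mp

holistic = mp.solutions.holistic.Holistic(static_image_mode=True)

def extract_landmarks(image_path):
    """Return pose and hand landmarks for one frame as a flat float32 array."""
    image = cv2.imread(image_path)
    results = holistic.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    coords = []
    for group in (results.pose_landmarks,
                  results.left_hand_landmarks,
                  results.right_hand_landmarks):
        if group:
            coords.extend((lm.x, lm.y, lm.z) for lm in group.landmark)
    return np.asarray(coords, dtype=np.float32)

# Cache to disk so training epochs skip the expensive vision step
np.save("frame_0001_landmarks.npy", extract_landmarks("frame_0001.jpg"))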

📑 API Endpoints

The FastAPI backend provides:

  • Sign Recognition: Recognizes GSL signs from images/videos
  • Sign-to-Speech: Converts recognized signs to speech/text
  • Speech-to-Text: Converts speech to text
  • Text-to-Sign: Converts text to sign language representation
  • Health Checks: System status monitoring

See backend/app/routes/ for endpoint definitions.
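
A hedged example of calling the sign-recognition endpoint with Python's requests library; the route path and upload field name are assumptions, so check the route files for the real definitions:

# Hypothetical request against the sign-recognition endpoint
import requests

with open("frame.jpg", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/api/recognize-sign",  # assumed route path
        files={"file": f},                           # assumed field name
    )
print(resp.json())  # e.g. predicted sign label and confidence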


🎯 Impact

EduSign AI promotes digital inclusion by ensuring equal participation for deaf and hearing individuals. It aligns with UN SDG 4 (Quality Education) and SDG 10 (Reduced Inequalities), making education, meetings, and communication accessible to everyone, everywhere.


🧩 Future Vision

EduSign AI aims to become a universal AI interpreter that supports:

  • Augmented and Virtual Reality classrooms.
  • Emotion-based sign interpretation.
  • A global open-source sign language dataset (EduSign Corpus).
  • Plug-ins for web, mobile, and smart devices.

👥 Team

Cbreve – innovators passionate about AI, accessibility, and social impact.


📜 License

Open for academic and non-commercial use. For research or partnership inquiries, please contact the EduSign AI team.


πŸ“ Note on Large Files

Large raw assets (PDFs, videos, images, models) are intentionally not tracked in Git (see .gitignore). Share via external storage or Git LFS if needed.


📚 Project Structure

EduSign-AI/
├── backend/                    # FastAPI backend (AI & API)
│   ├── app/
│   │   ├── models/             # Trained ML models
│   │   ├── services/           # Business logic (avatar, sign recognition)
│   │   └── routes/             # API endpoints
│   └── data/                   # Datasets and processed data
├── frontend/                   # Next.js frontend
│   ├── public/
│   │   └── avatar_models/      # 3D avatar files (Ready Player Me)
│   │       └── ready-player-me/
│   │           ├── models/     # .glb/.gltf files
│   │           ├── textures/   # Texture maps
│   │           └── animations/ # Sign language animations
│   └── src/                    # React components and pages
├── integrations/               # SDKs for Zoom, Google Meet, Teams
├── scripts/                    # Data processing and training scripts
└── docs/                       # Project documentation
