
Platform that uses artificial intelligence to generate video subtitles. The application offers transcription, translation, and subtitle customization, and can be run locally with or without a GPU using Docker.


Leg2Sub


About · Features · Requirements · Installation · Usage · Troubleshooting · Contributing · License


About

Leg2Sub breaks down language barriers in education. With only about 1% of Brazil's population fluent in English, while much high-quality content remains concentrated in that language, Leg2Sub provides free, powerful subtitle generation, translation, and customization.

Built with Streamlit and powered by Whisper/WhisperX, the platform transcribes, translates, and synchronizes subtitles automatically. Deploy on CPU for accessibility or GPU for high-performance processing.


Features

  • 🎯 Video Subtitling - Automatic transcription with embedded subtitles (soft & hard)
  • 🌐 Translation - Multi-language support via Google Translate
  • 📝 Transcription - Whisper and WhisperX engines
  • 🎨 Subtitle Customization - Color adjustment and formatting
  • 🐳 Multi-Platform - Unified Docker with CPU and GPU variants
  • ⚙️ Advanced Config - Fine-tune transcription, translation, and video parameters
  • 🚀 Production Ready - Error handling and batch processing support

Requirements

Core

  • Python: 3.10 - 3.12 (strongly recommended)
  • Docker & Docker Compose: Any recent version
  • WSL 2 (Windows only): For Docker support
  • GPU (Optional): NVIDIA with CUDA 11.0+ support

System Resources

Resource   Minimum   Recommended
Memory     2 GB      8 GB
Disk       15 GB     30 GB
CPU        2 cores   4+ cores

Installation

Option 1: Docker (Recommended, ~2 minutes)

CPU Version:

git clone https://github.com/Paulogb98/Leg2Sub.git
cd Leg2Sub

docker compose up leg2sub_cpu -d

GPU Version (NVIDIA):

# Prerequisites: Install NVIDIA Container Toolkit
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

docker compose up leg2sub_gpu -d

Monitor Startup:

docker compose logs -f leg2sub_cpu
# or for GPU:
docker compose logs -f leg2sub_gpu

Access Application: Open browser to http://localhost:8501
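
Before opening the browser, you can confirm that the UI is actually listening. A minimal check in Python, assuming the default port mapping from docker-compose.yml (a hypothetical helper, not part of the project):

```python
import socket

def is_listening(host: str = "localhost", port: int = 8501, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    status = "up" if is_listening() else "not reachable yet"
    print(f"Leg2Sub UI is {status} on http://localhost:8501")
```

If the port is not reachable, the container is likely still downloading models on first start; check the logs as shown above.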

Option 2: Local Python

git clone https://github.com/Paulogb98/Leg2Sub.git
cd Leg2Sub

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

pip install -r requirements.txt

streamlit run app/streamlit.py

Setup Guide (Windows + WSL 2 + Docker)

Step 1: Enable WSL 2

Open PowerShell as Administrator:

wsl --install -d Ubuntu

Step 2: Install Docker in WSL

Open Ubuntu terminal:

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

sudo usermod -aG docker $USER
newgrp docker

Step 3: Install NVIDIA Container Toolkit (GPU Only)

# Follow official guide:
# https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

# Verify installation:
dpkg -l | grep nvidia-container-toolkit

# Test GPU support:
docker run --rm --gpus all nvidia/cuda:11.0-runtime nvidia-smi

Step 4: Clone and Run

cd /mnt/c/Users/<username>/Downloads
git clone https://github.com/Paulogb98/Leg2Sub.git
cd Leg2Sub

# CPU version
docker compose up leg2sub_cpu -d

# GPU version
docker compose up leg2sub_gpu -d

Usage

Main Features

Feature              Purpose                          Input           Output
Subtitle Video       Transcribe and embed subtitles   MP4, MOV, AVI   MP4 + SRT
Translate Subtitles  Translate subtitle files         SRT, VTT        SRT (translated)
Transcribe Video     Extract text from audio          MP4, MOV, AVI   Text (displayed)
Color Subtitles      Customize subtitle colors        SRT             ASS (colored)
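
The Color Subtitles feature maps SRT input to ASS output. As a rough illustration of what that conversion involves (these helpers are illustrative sketches, not the app's actual implementation), two format details matter: ASS timestamps use centisecond precision with a single hour digit, and ASS colour overrides store bytes in BGR order:

```python
def srt_time_to_ass(ts: str) -> str:
    """Convert an SRT timestamp (HH:MM:SS,mmm) to ASS format (H:MM:SS.cc)."""
    h, m, s_ms = ts.split(":")
    s, ms = s_ms.split(",")
    centis = int(ms) // 10  # milliseconds -> centiseconds (precision loss)
    return f"{int(h)}:{m}:{s}.{centis:02d}"

def color_tag(rgb_hex: str) -> str:
    """Build an ASS inline colour override; ASS stores colours as &HBBGGRR&."""
    r, g, b = rgb_hex[0:2], rgb_hex[2:4], rgb_hex[4:6]
    return f"{{\\c&H{b}{g}{r}&}}"
```

For example, `srt_time_to_ass("00:01:02,500")` yields `0:01:02.50`, and `color_tag("FF0000")` (pure red) yields `{\c&H0000FF&}` because of the byte reversal.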

Quick Start

  1. Access: http://localhost:8501
  2. Select Feature: Choose from navigation menu
  3. Upload File: Select media or subtitle file
  4. Configure (Optional): Adjust parameters in Advanced Settings
  5. Process: Click button and wait for completion
  6. Download: Results saved to temp_output/ folder

Advanced Parameters

Transcription:

  • Engine: Whisper, WhisperX
  • Model: tiny, small, medium, large, large-v3-turbo
  • Device: auto, cpu, cuda
  • Batch Size: 1-32 (default: 12)
  • Compute Type: float32, float16, int8
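
These parameters can be sanity-checked before a job starts. A hedged sketch of such validation (the class and constants below are illustrative assumptions, not taken from the codebase):

```python
from dataclasses import dataclass

VALID_ENGINES = {"whisper", "whisperx"}
VALID_MODELS = {"tiny", "small", "medium", "large", "large-v3-turbo"}
VALID_DEVICES = {"auto", "cpu", "cuda"}
VALID_COMPUTE = {"float32", "float16", "int8"}

@dataclass
class TranscriptionConfig:
    engine: str = "whisperx"
    model: str = "small"
    device: str = "auto"
    batch_size: int = 12       # 1-32, per the list above
    compute_type: str = "float16"

    def __post_init__(self):
        if self.engine not in VALID_ENGINES:
            raise ValueError(f"unknown engine: {self.engine}")
        if self.model not in VALID_MODELS:
            raise ValueError(f"unknown model: {self.model}")
        if self.device not in VALID_DEVICES:
            raise ValueError(f"unknown device: {self.device}")
        if not 1 <= self.batch_size <= 32:
            raise ValueError("batch_size must be between 1 and 32")
        if self.compute_type not in VALID_COMPUTE:
            raise ValueError(f"unknown compute_type: {self.compute_type}")
```

Note that float16 compute generally requires CUDA; on CPU, float32 or int8 are the usual choices.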

Video Encoding:

  • Video Codec: h264, hevc, mpeg4
  • Audio Codec: aac, libopus, libmp3lame
  • Hardware APIs: nvenc, vaapi, amf, qsv (auto-detected)

Translation:

  • Languages: Portuguese, English, Spanish, French, German, Chinese, Japanese, Korean, and 20+ more
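
Free web translation endpoints typically cap request size (Google Translate is commonly cited at around 5,000 characters per request), so subtitle lines are usually sent in batches rather than one request per line. A generic batching helper to illustrate the idea (not the app's actual code):

```python
def batch_lines(lines, max_chars=4500):
    """Group subtitle lines into batches whose joined length stays under max_chars."""
    batches, current, size = [], [], 0
    for line in lines:
        # +1 accounts for the newline used to join lines in a request
        if current and size + len(line) + 1 > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(line)
        size += len(line) + 1
    if current:
        batches.append(current)
    return batches
```

Each batch can then be translated in a single call and split back on newlines, which is far faster than translating cue by cue.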

Troubleshooting

Container won't start

Check logs:

docker compose logs leg2sub_cpu
# or
docker compose logs leg2sub_gpu

Verify Docker is running:

docker ps

GPU not detected

Verify NVIDIA Container Toolkit:

docker run --rm --gpus all nvidia/cuda:11.0-runtime nvidia-smi

Expected output: GPU information should display

Out of memory error

  • Reduce batch size in Advanced Settings (try 4-8)
  • Process smaller files
  • Close other applications
  • Increase available disk space for temp files

Streamlit connection refused

Port 8501 already in use:

# Kill existing process or change port in docker-compose.yml
# Change: "8501:8501" to "8502:8501"

WSL 2 slow performance

Enable WSL 2 optimizations:

# In PowerShell as Admin:
wsl --set-default-version 2
wsl -l -v  # Verify WSL 2 is default

File permissions in WSL

# Fix permission issues (replace /root_dir with your project path):
sudo chown -R $USER:$USER /root_dir
chmod -R 755 /root_dir

Project Structure

Leg2Sub/
├── app/
│   └── template/
│       ├── homepage.py              # Main landing page
│       ├── sub_video_page.py        # Video subtitling interface
│       ├── translate_srt_page.py    # Subtitle translation interface
│       ├── transcribe_video_page.py # Video transcription interface
│       ├── sub_color_page.py        # Subtitle coloring interface
│       ├── static/
│       │   └── style.css            # Custom styling
│       └── streamlit.py             # App entry point
├── src/
│   ├── main_color_srt.py            # Color conversion logic
│   ├── main_subtitle.py             # Video processing pipeline
│   ├── main_transcriber.py          # Transcription entry point
│   └── main_translate_srt.py        # Translation logic
├── utils/
│   ├── ffmpeg_utils.py              # FFmpeg operations
│   ├── file_utils.py                # File handling
│   ├── subtitle_utils.py            # Subtitle processing
│   ├── translate_utils.py           # Translation utilities
│   ├── utils_func.py                # Helper functions
│   ├── whisper_utils.py             # Whisper integration
│   ├── whisperx_utils.py            # WhisperX integration
│   └── whisperx_transcription_utils.py # WhisperX helpers
├── assets/
│   ├── logo/                        # Logo files (SVG & PNG)
│   ├── content/                     # UI card images
│   └── demo/                        # Demo GIF
├── Dockerfile                       # Unified build with targets
├── docker-compose.yml               # Multi-service orchestration
├── requirements.txt                 # Python dependencies
└── README.md                        # This file

Docker Architecture

Dockerfile Targets

The Dockerfile uses multi-stage builds for efficient CPU and GPU deployments:

Base Stage (base):

  • Python 3.10-buster
  • Common dependencies (FFmpeg, tk, curl, etc.)
  • Pip packages from requirements.txt
  • Streamlit configuration

CPU Stage (cpu):

  • Inherits from base
  • Optimized for multi-core processing
  • Minimal overhead

GPU Stage (gpu):

  • NVIDIA CUDA 12.0.1 base
  • cuDNN 8 support
  • Python 3.10 with GPU optimization
  • All common dependencies

Docker Compose

Both services are defined side by side; start whichever matches your hardware:

leg2sub_cpu:
  build:
    target: cpu      # Builds base → cpu
  ports:
    - "8501:8501"
  # Standard resource limits

leg2sub_gpu:
  build:
    target: gpu      # Builds NVIDIA base → gpu
  runtime: nvidia    # GPU support
  devices:           # GPU allocation
    - driver: nvidia

Performance

Typical Processing Times

Task                       CPU (4-core)   GPU (RTX 3060)
Transcribe (30 min video)  30-45 min      5-8 min
Translate (500 subtitles)  2-3 min        1-2 min
Color (SRT → ASS)          <1 sec         <1 sec
Embed Subtitles            2-5 min        1-2 min
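
The CPU column implies a real-time factor of roughly 1.0-1.5× the video's duration, versus about 0.2-0.3× on GPU. A small helper to project times for other video lengths (factors inferred from the table above; actual times depend heavily on hardware and model size):

```python
def estimate_minutes(video_minutes: float, rtf: float) -> float:
    """Estimated processing time = video duration x real-time factor."""
    return video_minutes * rtf

# Rough real-time factor ranges inferred from the table:
CPU_RTF = (1.0, 1.5)    # 30 min video -> 30-45 min
GPU_RTF = (0.17, 0.27)  # 30 min video -> ~5-8 min
```

For example, a 60-minute video on a 4-core CPU would land somewhere around `estimate_minutes(60, 1.0)` to `estimate_minutes(60, 1.5)` minutes.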

Memory Usage

  • CPU Mode: 2-4 GB system RAM
  • GPU Mode: 2-4 GB system + 4-6 GB VRAM

Configuration Files

requirements.txt

Core dependencies:

  • streamlit==1.42.2 - Web interface
  • whisper - Speech recognition
  • whisperx - Enhanced transcription
  • deep_translator - Translation
  • ffmpeg_progress_yield - FFmpeg monitoring
  • pysrt - Subtitle handling

docker-compose.yml Highlights

# Memory limits (adjust as needed)
deploy.resources.limits.memory: 16G
deploy.resources.reservations.memory: 4G

# GPU device allocation
devices:
  - driver: nvidia
    device_ids: ['0']  # Change to use different GPU
    capabilities: [gpu, compute, utility]

# Restart policy
restart: unless-stopped  # Auto-restart on failure

Contributing

Contributions are welcome!

  1. Fork repository
  2. Create feature branch (git checkout -b feature/YourFeature)
  3. Commit changes (git commit -m 'feat: add YourFeature')
  4. Push to branch (git push origin feature/YourFeature)
  5. Open Pull Request

Areas for Contribution

  • ✅ New language support
  • ✅ UI/UX improvements
  • ✅ Performance optimization
  • ✅ Additional subtitle formats
  • ✅ Documentation and examples
  • ✅ Bug fixes and testing

License

This project is licensed under the GNU General Public License v3.0; see LICENSE for details.

Acknowledgments

  • OpenAI Whisper - Speech recognition
  • WhisperX - Enhanced transcription and alignment
  • LeGen - Subtitle processing reference
  • Python community - All open-source libraries

Contact & Support

📧 Email: [email protected]

🔗 LinkedIn: https://www.linkedin.com/in/paulo-goiss/

💬 GitHub Issues: Open an issue


Built with ❤️ using Python, Streamlit & FFmpeg

🔗 Repository · 📝 Issues · 📦 Releases · 👤 LinkedIn

Leg2Sub v1.0 | ✅ Production Ready | 🌐 Free & Open Source
