A solution for translating videos from English to German with voice cloning and lip-sync capabilities using LatentSync.
This pipeline takes an English video with subtitles and produces a German-translated version that:
- Preserves the original speaker's voice characteristics
- Maintains synchronization between audio and video
- Applies lip-sync to match the new German audio
- SRT Parsing: Handles standard SRT subtitle files with precise timing
- Text Translation: Uses Google Translate via `deep-translator` for English to German translation
- Voice Cloning: Employs Coqui XTTS v2 for multilingual voice cloning
- Timing Preservation: Maintains subtitle timing and synchronization
- Lip-Sync: Uses LatentSync for realistic lip movement synchronization
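To make the flow concrete, here is a minimal sketch of the translate-and-clone step in Python. It is an illustration, not the repository's actual `main.py`: it assumes `pysrt` for subtitle parsing and a pre-extracted reference clip of the original speaker, and uses the `deep-translator` and Coqui TTS APIs named above.

```python
# Minimal sketch of the translate + voice-clone step (illustrative, not the repo's main.py).
# Assumes: pip install pysrt deep-translator TTS
import pysrt
from deep_translator import GoogleTranslator
from TTS.api import TTS

subs = pysrt.open("input_subtitles.srt")                     # parse SRT, keeping original timings
translator = GoogleTranslator(source="en", target="de")      # English -> German
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")   # Coqui XTTS v2

for i, sub in enumerate(subs):
    german_text = translator.translate(sub.text)
    # Clone the original speaker's voice from a reference clip extracted from the video
    tts.tts_to_file(
        text=german_text,
        speaker_wav="reference_speaker.wav",                 # hypothetical reference clip
        language="de",
        file_path=f"segment_{i:04d}.wav",
    )
    # The full pipeline then fits each segment back into its subtitle window
    # (sub.start to sub.end) and hands the assembled track to LatentSync for lip-sync.
```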
```bash
git clone https://github.com/aman-17/video-translation.git
cd video-translation
```

```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install ffmpeg

# macOS
brew install ffmpeg
```

```bash
# creates environment and downloads the dependencies
source setup_env.sh
```

```bash
pip install git+https://github.com/coqui-ai/TTS.git@dbf1a08a0d4e47fdad6172e433eeb34bc6b13b4e
```

You might encounter a NumPy-related error while installing TTS. It can be ignored: all the required sub-packages except NumPy will still be installed.
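Before running the full pipeline, you can optionally confirm the environment with a short check. This snippet is my own illustration (not part of the repository) and only verifies that ffmpeg is on the PATH and that the key Python packages import:

```python
# Optional environment check (illustrative; not included in the repository).
import shutil

# ffmpeg must be on PATH for audio/video processing
assert shutil.which("ffmpeg") is not None, "ffmpeg not found on PATH"

# deep-translator and Coqui TTS should import cleanly despite the NumPy warning
from deep_translator import GoogleTranslator
from TTS.api import TTS  # noqa: F401

print(GoogleTranslator(source="en", target="de").translate("Hello, world!"))
```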
```bash
# Build the image
docker-compose build

# Run with your video
docker-compose run --rm video-translation \
python3 main.py \
--video /app/inputs/input_video.mp4 \
--transcript /app/inputs/input_subtitles.srt \
--output /app/outputs/translated_video.mp4
```

Download the LatentSync checkpoints:

```bash
huggingface-cli download ByteDance/LatentSync-1.6 whisper/tiny.pt --local-dir latentsync/checkpoints
huggingface-cli download ByteDance/LatentSync-1.6 latentsync_unet.pt --local-dir latentsync/checkpoints
```

Basic usage:

```bash
python main.py \
--video path/to/input_video.mp4 \
--transcript path/to/subtitles.srt \
--output outputs/translated_video.mp4
```

Advanced usage with LatentSync options:

```bash
python main.py \
--video input.mp4 \
--transcript input.srt \
--output outputs/translated.mp4 \
--inference-steps 30 \
--guidance-scale 2.5 \
--seed 42 \
--keep-temp
```

Example:

```bash
python main.py \
--video ./Tanzania-2.mp4 \
--transcript ./Tanzania-caption.srt \
--output outputs/translated_video.mp4
```

- `--video`: Input video file path (required)
- `--transcript`: Input SRT subtitle file path (required)
- `--output`: Output video file path (required)
- `--inference-steps`: Number of diffusion steps for LatentSync (default: 25; higher = better quality but slower)
- `--guidance-scale`: Guidance scale for LatentSync (default: 2.0; controls adherence to the audio)
- `--seed`: Random seed for reproducibility (default: 1247)
- `--keep-temp`: Keep temporary audio segment files for debugging
After successful execution, you'll find the following in the `outputs` directory:

- `outputs/translated_video.mp4` - The final translated video with lip-sync
- `outputs/translated_video_audio.wav` - The generated German audio track
- `outputs/translated_video_translated.srt` - Translated German subtitles
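If you want to double-check that the generated German track still lines up with the video, you can compare durations with `ffprobe`. This is an optional check of my own, not something the pipeline runs for you:

```python
# Optional post-run check (illustrative): compare video and generated audio durations.
import subprocess

def media_duration(path: str) -> float:
    """Return the media duration in seconds using ffprobe."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())

video_len = media_duration("outputs/translated_video.mp4")
audio_len = media_duration("outputs/translated_video_audio.wav")
print(f"video: {video_len:.2f}s  audio: {audio_len:.2f}s  drift: {abs(video_len - audio_len):.2f}s")
```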
- Voice cloning is not 100% accurate. I did not find any open-source voice cloning models that are great for this task.