LeGen v0.19.2 — Local, fast AI‑powered subtitle studio

LeGen update (v0.19.8)

Fixed resume bugs
Added the option to select a version in the Colab notebook (limited compatibility)

LeGen update (v0.19.7)

Fixed a bug that occurred when the input is a single file
Fixed a matplotlib bug on Colab and suppressed some warnings
Improved the translation parameters and the Gemini prompt, resulting in fewer errors (which forced retries) and lower token usage
The translation tool (translate_utils.py, or legen-translate on the command line with the PyPI package) can now also be used standalone to translate SRT files only, decoupled from the rest of LeGen
Colab now uses uv, which speeds up environment setup and unifies package maintenance
Created a new Colab notebook, so there are now two:
- legen.ipynb: default notebook with the latest version of the package published on PyPI. LINK (https://colab.research.google.com/github/matheusbach/legen/blob/main/legen.ipynb)
- legen-beta.ipynb: notebook with the current code of the master branch on GitHub. LINK (https://colab.research.google.com/github/matheusbach/legen/blob/main/legen-beta.ipynb)

Donations
The project has not received donations recently, and in fact the total raised over its entire history is low. LeGen was not created, and is not being developed, to make me rich; however, development carries costs, such as working hours, hardware/energy, subscriptions to related services, and API usage. I ask for your support so that I do not have to drastically reduce how often I work on this project.

Pix: https://livepix.gg/legendonate
Monero: 86HjTCsiaELEoNhH96rTf3ezGMXgKmHjqFrNmca2tesCESdCTZvRvQ9QWQXPGDtmaZhKz4ryHCdZXFzdbmtGahVa5VMLJnx

LeGen update (v0.19.6)

LeGen now has official Docker support, better automatic detection of acceleration devices, improved VAD with Silero, and a new recommended installation via uv to make it even easier to use.

Main changes

Official Docker: Ready-to-use container with all dependencies, including CUDA 12.4 for GPU acceleration. Build from source with docker build -t legen . and run with docker run -v /path/to/videos:/input legen -i /input/video.mp4.
Automatic device detection: LeGen now does a better job of detecting and automatically selecting the best available device (CUDA → ROCm → CPU), reporting which one is in use and removing the need for manual configuration. LeGen warns you when there is not enough VRAM for the selected model and compute type. You can still force GPU execution, even when LeGen reports that it is not possible, by passing the parameter --transcription_device cuda
Integrated Silero VAD: A new Voice Activity Detection option that is faster than the default VAD. Use --vad [silero | pyannote] to enable it. Default: Silero.
Installation via uv (recommended): A new, simplified way to install and update. uv tool install legen keeps everything isolated and lets you run legen without extra installation work. Updating is equally simple: run the same command. See the uv installation guide.
Stability improvements: Fixes for edge cases in audio processing, better error handling, and more informative logs.
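
The device preference order described above (CUDA, then ROCm, then CPU) can be illustrated with a small sketch. This is not LeGen's actual detection code; the function name pick_device and the idea of passing in a list of available backends are assumptions made for illustration only.

```python
from typing import Iterable

# Preference order stated in the release notes: CUDA, then ROCm, then CPU.
PREFERENCE = ("cuda", "rocm", "cpu")


def pick_device(available: Iterable[str]) -> str:
    """Return the most preferred backend found in `available` (hypothetical helper).

    Falls back to "cpu" when nothing in the preference list is reported.
    """
    found = {name.lower() for name in available}
    for name in PREFERENCE:
        if name in found:
            return name
    return "cpu"


# A machine that reports ROCm and CPU, but no CUDA, ends up on ROCm.
print(pick_device(["cpu", "rocm"]))
```

Passing --transcription_device explicitly bypasses this kind of automatic choice.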

LeGen v0.19.2 — Local, fast AI‑powered subtitle studio

LeGen is a fast, local-first subtitle studio powered by Whisper/WhisperX. It transcribes audio, translates to your target language, exports .srt/.txt, muxes soft-subs into MP4, and can burn hard-subs — all on your machine. It also integrates with yt-dlp to fetch videos/playlists and optionally embed all remote subtitles before processing.

Highlights

Whisper/WhisperX transcription with GPU auto-detection (cuda > mps > cpu) and manual override.
Accurate alignment and batch processing for speed (configurable batch size on WhisperX).
Translation via Google or Gemini (supply multiple Gemini API keys for quota rotation).
Export to SRT and TXT; embed soft-subs into MP4; optional hard-sub burn-in output.
End-to-end URL workflow using yt-dlp (optional download and embed of all remote subtitles).
Flexible FFmpeg codecs for video/audio with hardware acceleration (NVENC, VAAPI, QSV, AMF).
Robust CLI with sensible defaults, overwrite control, and optional copying of non-video files.
Works locally, in Docker, or on Google Colab.

Installation

From PyPI

pip install legen

The legen console script is added to PATH and mirrors the CLI options below.

From Source

git clone https://github.com/matheusbach/legen.git
cd legen
pip3 install -r requirements.txt --upgrade

Requirements:

Python 3.9–3.12
FFmpeg installed on the system
Optional: yt-dlp CLI available in PATH (installed via requirements)
Optional: CUDA-capable GPU + matching PyTorch for acceleration

Quick Start

Local file:

python3 legen.py -i /path/to/video.mp4

Translate to Portuguese, soft-subs only:

python3 legen.py -i /path/to/video.mp4 --translate pt --disable_hardsubs

Force input language, pick WhisperX large model, and use GPU with NVENC output:

python3 legen.py -i /path/to/video.mp4 --input_lang en -ts:e whisperx -ts:m large-v3 -ts:d cuda -c:v h264_nvenc

URL pipeline with remote subtitles embedded and translation to English:

python3 legen.py -i "https://www.youtube.com/watch?v=XXXX" --download_remote_subs --translate en

Use Gemini translator with API key:

python3 legen.py -i /path/to/video.mp4 --translate_engine gemini --gemini_api_key YOUR_KEY --translate fr

Google Colab

Use Google’s compute with the ready-made notebook:

https://colab.research.google.com/github/matheusbach/legen/blob/main/legen.ipynb

CLI Overview (selection)

Input: -i, --input_path (file/folder/URL)
Transcription: --transcription_engine [whisper|whisperx], --transcription_model [tiny..large-v3(-turbo)], --transcription_device [auto|cpu|cuda], --transcription_compute_type [auto|int8..float32], --transcription_batch N
Translation: --translate <lang>, --translate_engine [google|gemini], --gemini_api_key KEY[,KEY2...]
Language: --input_lang <code|auto>
Outputs: --output_softsubs PATH, --output_hardsubs PATH, --subtitle_formats srt,txt, --disable_srt, --disable_softsubs, --disable_hardsubs, --overwrite, --copy_files
Codecs: --codec_video, --codec_audio
URLs: --download_remote_subs, --output_downloads PATH
Pre-step: --norm (normalize folder times + run vidqa)

Full list:

legen --help
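
When LeGen is driven from a script, the options above can be assembled into an argument list before calling the legen console script. The sketch below only builds the command; the helper name build_legen_cmd and its parameters are hypothetical, and it uses only flags listed in this overview. Several Gemini keys are joined into one comma-separated value, following the KEY[,KEY2...] form shown above.

```python
import shlex
from typing import List, Optional, Sequence


def build_legen_cmd(
    input_path: str,
    translate: Optional[str] = None,
    gemini_keys: Sequence[str] = (),
    device: str = "auto",
    batch: Optional[int] = None,
) -> List[str]:
    """Assemble an argv list for the `legen` console script (hypothetical helper)."""
    cmd = ["legen", "-i", input_path, "--transcription_device", device]
    if batch is not None:
        cmd += ["--transcription_batch", str(batch)]
    if translate:
        cmd += ["--translate", translate]
    if gemini_keys:
        # Multiple keys are passed as a single comma-separated value.
        cmd += ["--translate_engine", "gemini",
                "--gemini_api_key", ",".join(gemini_keys)]
    return cmd


cmd = build_legen_cmd("/path/to/video.mp4", translate="fr",
                      gemini_keys=["KEY1", "KEY2"])
print(shlex.join(cmd))  # printable form; pass `cmd` itself to subprocess.run
```

Keeping the command as a list avoids shell-quoting problems with paths that contain spaces.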

GPU Acceleration

Auto-selects best backend; override with --transcription_device.
Docker image installs CUDA-enabled PyTorch by default; expose GPUs with --gpus all.
Build CPU-only image with --build-arg PYTORCH_INSTALL_CUDA=false.
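
As a rough sketch, a Docker invocation with GPU access can be assembled as follows. It assumes the image was built and tagged legen (as in docker build -t legen .) and that the host folder is mounted at /input; the helper name docker_run_cmd is hypothetical.

```python
import shlex
from pathlib import PurePosixPath
from typing import List


def docker_run_cmd(host_dir: str, filename: str,
                   use_gpu: bool = True, image: str = "legen") -> List[str]:
    """Build a `docker run` argv that mounts host_dir at /input (hypothetical helper)."""
    cmd = ["docker", "run"]
    if use_gpu:
        # Expose all GPUs to the container; omit for a CPU-only image.
        cmd += ["--gpus", "all"]
    container_path = str(PurePosixPath("/input") / filename)
    cmd += ["-v", f"{host_dir}:/input", image, "-i", container_path]
    return cmd


print(shlex.join(docker_run_cmd("/path/to/videos", "video.mp4")))
```

Docker options such as --gpus and -v must come before the image name; everything after the image name is passed to LeGen.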

Dependencies

Core Python packages include: whisper/whisperx (fork), torch, yt-dlp, ffmpeg_progress_yield, pysrt, tqdm, deep_translator, gemini-srt-translator, vidqa. FFmpeg must be installed on the system.

Known Limitations

Whisper model size vs. speed/VRAM trade-offs; large models require more resources.
A batch size that is too high may cause out-of-memory (OOM) errors on long media; lower --transcription_batch if runs are unstable.
GPU support depends on matching PyTorch/CUDA drivers; install the right wheel for your GPU if needed.
When using URLs, remote subtitles are embedded only if --download_remote_subs is provided.
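
One way to act on the batch-size advice is to retry with progressively smaller values. The sketch below is an illustration only: the starting value of 16, the helper names, and the assumption that a failed run can be detected by the caller (for example through a non-zero exit code) are not taken from LeGen itself.

```python
from typing import Callable, Iterator, Optional


def batch_sizes(start: int, minimum: int = 1) -> Iterator[int]:
    """Yield start, then repeatedly halved sizes, stopping at `minimum`."""
    size = start
    while size >= minimum:
        yield size
        if size == minimum:
            break
        size = max(minimum, size // 2)


def run_with_backoff(run: Callable[[int], bool], start: int = 16) -> Optional[int]:
    """Call run(batch) with shrinking batch sizes; return the first size that works.

    `run` is supplied by the caller. In practice it could launch legen with
    --transcription_batch set to the given value and return True on success
    (an assumption about how failure is reported).
    """
    for size in batch_sizes(start):
        if run(size):
            return size
    return None


# Stand-in runner that only "succeeds" when the batch size is 4 or lower.
print(run_with_backoff(lambda batch: batch <= 4))
```

Halving keeps the number of retries small while still reaching a batch size of 1 as the last resort.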

Breaking Changes

Some default directories have changed. Existing ones remain temporarily supported for compatibility.
Some parameters have been renamed. Read the docs and the --help output.

Acknowledgements

Powered by:

OpenAI Whisper and WhisperX (alignment)
PyTorch
FFmpeg
yt-dlp
deep_translator
Gemini
