Whishper Reloaded is an open-source, 100% local audio transcription and subtitling suite with a full-featured web UI. It is based on Pluja's Whishper and adds the following improvements:
- Run the Whishper backend, frontend and whishper-api as separate containers (DockerHub frontend, DockerHub backend). Useful if your powerful transcription machine is not always on and you need the service available at all times!
- Search for specific transcriptions via the search bar!
- Better page loading experience thanks to pagination, which fixes the blocked UI threads that made browsers lag.
- Feedback banners showing the connection status of the available services, giving a more fail-safe approach to the translation and new-transcription services.
- Rename your transcriptions after a successful transcription.
- Upload a whole transcription from a JSON file for external pre-processing before real-time editing! Download the current transcription in JSON format, edit it externally, and reupload it.
- Real-time editing improvements:
  - Audio-only mode for the Whishper editor
  - Use shortcuts to control the playback [F7, F8, F9]
  - Navigate through segments using TAB on your keyboard
  - Go to current segment / Navigate to segment number
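The JSON round trip described above (download, edit externally, reupload) could be scripted along these lines. This is a minimal sketch, not the project's own tooling: it assumes the downloaded file keeps its segments under `result.segments`, each with a `text` field, which this README does not confirm. The function names and file names are hypothetical.

```python
import json


def clean_segments(transcription: dict) -> dict:
    """Return a copy of the transcription with tidied segment text.

    ASSUMPTION: the layout is
    {"result": {"segments": [{"start": ..., "end": ..., "text": ...}]}}.
    Check a real downloaded file and adjust the keys if it differs.
    """
    cleaned = json.loads(json.dumps(transcription))  # simple deep copy
    result = cleaned.get("result")
    if isinstance(result, dict) and isinstance(result.get("segments"), list):
        for segment in result["segments"]:
            # Collapse runs of whitespace and trim both ends.
            segment["text"] = " ".join(str(segment.get("text", "")).split())
        # Drop segments that are empty after cleaning.
        result["segments"] = [s for s in result["segments"] if s["text"]]
    return cleaned


def rewrite_file(src_path: str, dst_path: str) -> None:
    """Read the downloaded JSON, clean it, and write the file to reupload."""
    with open(src_path, encoding="utf-8") as src:
        data = json.load(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(clean_segments(data), dst, ensure_ascii=False, indent=2)
```

Calling `rewrite_file("transcription.json", "transcription.edited.json")` would produce a file to reupload through the UI. Any other fields in the document pass through unchanged.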
The following is an example docker-compose.yml that runs the backend, MongoDB and frontend on the same server, while whishper-api is hosted on a different machine:
services:
  whishper-mongo:
    image: mongo:latest
    container_name: whishper-mongo
    env_file:
      - .backend.env
    restart: unless-stopped
    volumes:
      - ./db_data/db:/data/db
      - ./db_data/logs/:/var/log/mongodb/
    environment:
      MONGO_INITDB_ROOT_USERNAME: ${DB_USER:-whishper}
      MONGO_INITDB_ROOT_PASSWORD: ${DB_PASS:-whishper}
    expose:
      - 27017
    ports:
      - 27017:27017
    command: mongod --logpath /var/log/mongodb/mongod.log
  whishper-frontend:
    image: thespartan94/whishper-frontend:latest
    container_name: whishper-frontend
    ports:
      - "3000:3000"
    expose:
      - 3000
    env_file:
      - .frontend.env
    volumes:
      - ./whishper-frontend/logs:/var/log/whishper
  whishper-backend:
    image: thespartan94/whishper-backend:latest
    container_name: whishper-backend
    ports:
      - 8080:8080
    env_file:
      - .backend.env
    volumes:
      - ./uploads:/uploads
      - ./whishper-backend/logs:/var/log/whishper
The docker-compose.yml references two different env files, one for backend and another for frontend:
.backend.env:
UPLOAD_DIR=/uploads
ASR_ENDPOINT=<external-ip-address:8000> # assuming default port
DB_USER=whishper # if default
DB_PASS=whishper # if default
DB_ENDPOINT=whishper-mongo:27017
TRANSLATION_ENDPOINT=<external-ip-address:5000> # assuming default port
.frontend.env:
PUBLIC_API_HOST=<external-public-api-host>
PUBLIC_TRANSLATION_API_HOST=<external-public-translation-api-host>
PUBLIC_INTERNAL_API_HOST=http://whishper-backend:8080
PUBLIC_WHISHPER_PROFILE=gpu # or cpu
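Before starting the containers, it can help to confirm that both env files define every variable listed above and that no `<...>` placeholder was left in place. The following is a rough sketch under stated assumptions: the parser is a simplified stand-in (not Docker Compose's own env-file parser), the key lists are copied from this README, and the function names are hypothetical.

```python
# Keys taken from the two env files shown in this README.
REQUIRED_KEYS = {
    ".backend.env": [
        "UPLOAD_DIR", "ASR_ENDPOINT", "DB_USER",
        "DB_PASS", "DB_ENDPOINT", "TRANSLATION_ENDPOINT",
    ],
    ".frontend.env": [
        "PUBLIC_API_HOST", "PUBLIC_TRANSLATION_API_HOST",
        "PUBLIC_INTERNAL_API_HOST", "PUBLIC_WHISHPER_PROFILE",
    ],
}


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, ignoring blank lines and comments.

    Simplified on purpose: no quoting or variable expansion is handled.
    """
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as " # if default".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def problems(filename: str, text: str) -> list:
    """List human-readable issues for one env file's contents."""
    values = parse_env(text)
    issues = []
    for key in REQUIRED_KEYS[filename]:
        value = values.get(key, "")
        if not value:
            issues.append(f"{key} is missing or empty")
        elif value.startswith("<") and value.endswith(">"):
            issues.append(f"{key} still holds the placeholder {value}")
    return issues
```

For example, `problems(".backend.env", open(".backend.env").read())` returns an empty list once every value has been filled in.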
- 🗣️ Transcribe any media to text: audio, video, etc.
  - Transcribe from URLs (any source supported by yt-dlp).
  - Upload a file to transcribe.
- 📥 Download transcriptions in many formats: TXT, JSON, VTT, SRT or copy the raw text to your clipboard.
- 🌐 Translate your transcriptions to any language supported by Libretranslate.
- ✍️ Powerful subtitle editor so you don't need to leave the UI!
  - Transcription highlighting based on media position.
  - CPS (Characters per second) warnings.
  - Segment splitting.
  - Segment insertion.
  - Subtitle language selection.
- 🏠 100% Local: transcription, translation and subtitle editing happen 100% on your machine (can even work offline!).
- 🚀 Fast: uses FasterWhisper as the Whisper backend: get much faster transcription times on CPU!
- 👍 Quick and easy setup: use the quick start script, or run through a few steps!
- 🔥 GPU support: use your NVIDIA GPU to get even faster transcription times!
- 🐎 CPU support: no GPU? No problem! Whishper can run on CPU too.
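To make the CPS (characters per second) warning in the feature list concrete, here is a small sketch of how such a check can be computed from a segment's text and timing. It is an illustration only, not the editor's actual code: the segment field names, the choice to count spaces, and the limit of 20 are all assumptions.

```python
def characters_per_second(text: str, start: float, end: float) -> float:
    """Characters shown per second of display time (spaces are counted)."""
    duration = end - start
    if duration <= 0:
        raise ValueError("segment must end after it starts")
    return len(text.strip()) / duration


def cps_warnings(segments, limit: float = 20.0):
    """Return (index, cps) for every segment whose CPS exceeds the limit.

    ASSUMPTION: each segment is a dict with "text", "start" and "end"
    (seconds); the default limit is a hypothetical value.
    """
    flagged = []
    for index, segment in enumerate(segments):
        cps = characters_per_second(
            segment["text"], segment["start"], segment["end"]
        )
        if cps > limit:
            flagged.append((index, cps))
    return flagged
```

A segment that shows 50 characters for one second would be flagged under this limit, while an 11-character line shown for two seconds (5.5 CPS) would not.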
Whishper is a collection of pieces that work together: three main pieces and three third-party services.
- Transcription-API: This is the API that enables running Faster-Whisper. You can find it in the transcription-api folder.
- Whishper-Backend: This is the backend that coordinates frontend calls, database, and tasks. You can find it in the backend folder.
- Whishper-Frontend: This is the frontend (web UI) of the application. You can find it in the frontend folder.
- Translation (3rd party): This is the libretranslate container that is used for translating subtitles.
- MongoDB (3rd party): This is the database that stores all the information about your transcriptions.
- Nginx (3rd party): This is the proxy that allows running everything from a single domain.
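Because the pieces above can run on different machines, a quick TCP probe shows which ones are reachable from a given host. This is a minimal sketch that only tests whether a port accepts connections; it does not call any Whishper API. The ports come from the compose file and env files in this README, while the host names and function names are placeholders.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def default_pieces(local_host: str, remote_host: str) -> dict:
    """Map each piece to (host, port), using the ports from this README.

    ASSUMPTION: the frontend, backend and MongoDB share local_host, and the
    transcription and translation services run on remote_host.
    """
    return {
        "frontend": (local_host, 3000),
        "backend": (local_host, 8080),
        "mongodb": (local_host, 27017),
        "transcription-api": (remote_host, 8000),
        "translation": (remote_host, 5000),
    }


def report(pieces: dict) -> dict:
    """Probe every piece and return a name -> reachable mapping."""
    return {name: is_reachable(host, port) for name, (host, port) in pieces.items()}
```

For instance, `report(default_pieces("localhost", "192.0.2.10"))` would return a dictionary of booleans, where `192.0.2.10` stands in for the external machine's address.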
Contributions are welcome! Feel free to open a PR with your changes, or take a look at the issues to see if there is something you can help with.
Check out the development documentation here.
These screenshots are available on the official website; click any of the following links to view them:
- Faster Whisper
- LibreTranslate
- This project is a fork of Whishper. Please also support the original project and its creator: Whishper