Hacked together openai-whisper and DeepL translation to transcribe and translate speech live during meetings, between any languages that Whisper and DeepL support.
The inspiration came from my struggle to understand some colleagues with a thick German/Swiss German accent during meetings. This tool will hopefully help me follow along!
- Create Android APK package for smartphone
- Troubleshoot the PyInstaller build for macOS (the packaged app is not performing as expected)
- Add app buttons to select the transcription source language and the translation target language
- The App uses the `openai-whisper` and `deepl` APIs to transcribe and translate speech live.
- Currently, the `whisper` model is hosted on a GPU server.
- Speech is recorded through the microphone and sent as WAV over WebSocket to the `/transcription` API endpoint.
- Transcribed text is translated to English with the `deepl` API.
- Translated text is streamed back to the App.
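The flow above can be sketched in a few functions. This is a minimal illustration, not the code in this repository: the function names, the 16 kHz mono 16-bit audio format, and the assumption that the server answers each WAV message with one text message are all hypothetical. It relies on the third-party `websockets` and `deepl` packages, imported lazily so the WAV helper works on its own.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container, entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample (2 = 16-bit)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


async def transcribe_chunk(endpoint_url: str, pcm: bytes) -> str:
    """Send one WAV chunk over WebSocket and wait for a single reply.

    Assumption: the server replies to each binary message with one
    message containing the transcribed text.
    """
    import websockets  # third-party: pip install websockets

    async with websockets.connect(endpoint_url) as ws:
        await ws.send(pcm_to_wav_bytes(pcm))
        reply = await ws.recv()
    return reply if isinstance(reply, str) else reply.decode("utf-8")


def translate_to_english(text: str, api_key: str) -> str:
    """Translate text to English with the DeepL Python client."""
    import deepl  # third-party: pip install deepl

    translator = deepl.Translator(api_key)
    return translator.translate_text(text, target_lang="EN-US").text
```

In a real app, `transcribe_chunk` would be run with `asyncio.run(...)` or inside an existing event loop, and a long-lived connection would likely be kept open rather than reconnecting for every chunk.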
- Clone the repository:
git clone [email protected]:K-Schubert/TransApp.git
cd TransApp
- Set env variables:
nano .env
DEEPL_API_KEY=<DEEPL_API_KEY>
TRANSCRIPTION_ENDPOINT=<TRANSCRIPTION_ENDPOINT_URL>
- Create venv:
# NOTE: Requires Python < 3.13
python3.11 -m venv venv_transapp
source venv_transapp/bin/activate
- Install requirements:
pip install -r requirements.txt
- Run app and server with CLI:
# Run server
python3 whisper_server_stream.py
# Run app
python3 translate_app_stream.py
- OR build executable for macOS:
# Build app
pyinstaller TransApp.spec
# Run app from dist folder
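
For reference, the two settings from the `.env` step could be read and validated along these lines. This is a standard-library sketch under stated assumptions: the helper names are hypothetical, the app itself may load its configuration differently (for example with `python-dotenv`), and the precedence rule (process environment overrides the file) is a choice made here, not something the project documents.

```python
import os

REQUIRED_KEYS = ("DEEPL_API_KEY", "TRANSCRIPTION_ENDPOINT")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env_text: str = "", environ=None) -> dict[str, str]:
    """Merge .env contents with the environment; the environment wins."""
    environ = os.environ if environ is None else environ
    settings = parse_env_text(env_text)
    for key in REQUIRED_KEYS:
        if key in environ:
            settings[key] = environ[key]
    # Fail early with a clear message instead of erroring mid-meeting.
    missing = [k for k in REQUIRED_KEYS if not settings.get(k)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {k: settings[k] for k in REQUIRED_KEYS}
```

Typical use would be `load_settings(open(".env").read())` at startup, so a missing key is reported before any audio is recorded.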