Simple desktop application for speech transcription with global hotkey control. Record, transcribe, and paste - all without switching applications.
- Start Recording - Press the global hotkey (default: Ctrl+Shift+R) from any application
- Stop Recording - Press the same hotkey again when you're done speaking
- Paste Text - The transcription is automatically copied to your clipboard; just paste it wherever you need it
That's it! No need to switch applications during your workflow.
- Record audio directly from your microphone
- Support for 100+ languages with automatic language detection
- Custom vocabulary support to improve transcription accuracy
- System instructions for controlling transcription behavior
- Copy transcription to clipboard
- Real-time recording status and timer
Open Super Whisper supports the following AI transcription models:
- Whisper-1 - OpenAI's original open-source Whisper model
- GPT-4o Transcribe - High-performance transcription model offering superior accuracy
- GPT-4o Mini Transcribe - Lightweight and fast transcription model with a good balance of speed and accuracy
You can download the latest executable file (.exe) for Windows from our GitHub Releases page.
- OpenAI API key
- Windows or macOS operating system
UV is a fast and efficient Python package installer and environment manager. It's faster than traditional pip and venv, and provides better dependency resolution.
- Check if UV is installed:
uv --version
- If not installed, you can install it with:
# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
- Clone or download this repository
- Set up the project using UV's sync command, which will create a virtual environment and install all dependencies:
uv sync
- Activate the virtual environment:
# Windows (PowerShell)
.\.venv\Scripts\activate.ps1
# macOS/Linux
source .venv/bin/activate
Note: If you get an "execution of scripts is disabled on this system" error when using activate.ps1 in PowerShell, try one of these solutions:
- Use Command Prompt (cmd.exe) and run .venv\Scripts\activate.bat instead
- Run the following command in PowerShell to change the execution policy for the current session only:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process
Then run .\.venv\Scripts\activate.ps1
- Run PowerShell as Administrator and change the execution policy for your user account (do this only if you understand the security implications):
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
- Run the application:
python main.py
To create a standalone executable, you can use PyInstaller:
# Windows (PowerShell)
python -m PyInstaller --onefile --windowed --icon assets/icon.ico --name "OpenSuperWhisper" --add-data "assets;assets" main.py
# For macOS
python -m PyInstaller --onefile --windowed --icon assets/icon.icns --name "OpenSuperWhisper" --add-data "assets:assets" main.py
# For Linux
python -m PyInstaller --onefile --windowed --icon assets/linux_pngs/icon_256.png --name "OpenSuperWhisper" --add-data "assets:assets" main.py
The Windows command does the following:
- --onefile: Creates a single executable file
- --windowed: Prevents a console window from appearing
- --icon assets/icon.ico: Sets the application icon
- --name "OpenSuperWhisper": Specifies the output filename
- --add-data "assets;assets": Includes the entire assets directory in the executable
Once the build is complete, you'll find OpenSuperWhisper.exe in the dist folder on Windows, OpenSuperWhisper.app in the dist folder on macOS, or OpenSuperWhisper in the dist folder on Linux.
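The three build commands above differ only in the icon file and in the separator used inside --add-data (";" on Windows, ":" on macOS and Linux). As a rough sketch, a small helper could assemble the right argument list for the current platform. The function name build_pyinstaller_args is hypothetical and is not part of this repository.

```python
import sys


def build_pyinstaller_args(platform: str = sys.platform) -> list[str]:
    """Return the PyInstaller arguments from the commands above for a platform.

    Hypothetical helper for illustration; the icon paths and options are
    copied from the documented build commands.
    """
    if platform.startswith("win"):
        icon, sep = "assets/icon.ico", ";"
    elif platform == "darwin":
        icon, sep = "assets/icon.icns", ":"
    else:
        # Treat everything else as Linux, as in the third command above.
        icon, sep = "assets/linux_pngs/icon_256.png", ":"
    return [
        "--onefile",
        "--windowed",
        "--icon", icon,
        "--name", "OpenSuperWhisper",
        "--add-data", f"assets{sep}assets",
        "main.py",
    ]


# Print the full command for the current platform without running it.
print(" ".join([sys.executable, "-m", "PyInstaller", *build_pyinstaller_args()]))
```

Passing the resulting list to subprocess.run would start the build; the sketch only prints the command so that it can be inspected first.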
- On first launch, you'll be prompted to enter your OpenAI API key
- If you don't have an API key, you can get one from OpenAI's website
- Your API key will be saved for future use
- To change it later, click "API Key Settings" in the toolbar
- Click the "Start Recording" button to begin recording from your microphone
- Click "Stop Recording" when you're done
- The application will automatically transcribe your recording
- You can also use the global hotkey (default: Ctrl+Shift+R) to start/stop recording even when the application is in the background
- The default hotkey is set to "Ctrl+Shift+R"
- Pressing this hotkey will start/stop recording even when the application is in the background
- To change the hotkey, click "Hotkey Settings" in the toolbar
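To illustrate what a hotkey string such as "Ctrl+Shift+R" contains, the following minimal sketch splits it into modifier keys and one main key. It is an assumption about how such a string could be parsed, not the application's actual implementation; parse_hotkey and MODIFIERS are hypothetical names.

```python
# Modifier names this sketch recognizes (an assumption, not the app's list).
MODIFIERS = {"ctrl", "shift", "alt", "cmd"}


def parse_hotkey(hotkey: str) -> tuple[frozenset[str], str]:
    """Split a string like "Ctrl+Shift+R" into (modifiers, key)."""
    parts = [part.strip().lower() for part in hotkey.split("+")]
    if any(not part for part in parts):
        raise ValueError(f"Empty component in hotkey: {hotkey!r}")
    modifiers = frozenset(part for part in parts if part in MODIFIERS)
    keys = [part for part in parts if part not in MODIFIERS]
    if len(keys) != 1:
        raise ValueError(f"Expected exactly one non-modifier key in {hotkey!r}")
    return modifiers, keys[0]


modifiers, key = parse_hotkey("Ctrl+Shift+R")
print(sorted(modifiers), key)
```

Requiring exactly one non-modifier key rejects combinations such as "Ctrl+Shift" early, before they would be registered as a global shortcut.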
- The application stays resident in your system tray (Windows) or menu bar (macOS)
- Closing the window will keep the application running in the background
- Click the system tray/menu bar icon to toggle the application's visibility
- Right-click the system tray icon (Windows) or click the menu bar icon (macOS) to access a context menu with options to:
- Show the application
- Start/stop recording
- Completely exit the application
- Select a language from the dropdown menu before recording or importing audio
- Choose "Auto-detect" to let Whisper identify the language automatically
- Select the Whisper model to use from the dropdown menu
- Different models offer different balances of accuracy and processing speed
- Your selected model will be remembered for future sessions
- Click "Custom Vocabulary" in the toolbar
- Add specific terms, names, or phrases that might appear in your audio
- These terms will help improve transcription accuracy
- Click "System Instructions" in the toolbar
- Add specific instructions to control transcription behavior, such as:
- "Ignore filler words like um, uh, er"
- "Add proper punctuation"
- "Format text into paragraphs"
- These instructions help refine transcription results without manual editing
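One plausible way custom vocabulary and system instructions could reach the model is through the prompt parameter of OpenAI's transcription endpoint. The sketch below assumes that approach. The prompt layout and the helper names build_prompt and transcribe are illustrative assumptions, not the application's actual code; only the client.audio.transcriptions.create call and its model, file, language, and prompt parameters come from the OpenAI Python SDK.

```python
from typing import Optional, Sequence


def build_prompt(vocabulary: Sequence[str], instructions: Sequence[str]) -> str:
    """Combine vocabulary terms and instructions into one prompt string.

    The "Vocabulary:" / "Instructions:" layout is an assumption made for
    this sketch.
    """
    parts = []
    if vocabulary:
        parts.append("Vocabulary: " + ", ".join(vocabulary))
    if instructions:
        parts.append("Instructions: " + " ".join(instructions))
    return "\n".join(parts)


def transcribe(
    path: str,
    model: str = "gpt-4o-transcribe",
    language: Optional[str] = None,
    vocabulary: Sequence[str] = (),
    instructions: Sequence[str] = (),
) -> str:
    """Send an audio file to the transcription endpoint and return the text."""
    from openai import OpenAI  # imported lazily; requires the openai package

    client = OpenAI()  # reads the API key from the OPENAI_API_KEY variable
    kwargs = {"model": model}
    prompt = build_prompt(vocabulary, instructions)
    if prompt:
        kwargs["prompt"] = prompt
    if language:
        # Omitting the language leaves detection to the model ("Auto-detect").
        kwargs["language"] = language
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(file=audio_file, **kwargs)
    return result.text
```

For example, transcribe("meeting.wav", vocabulary=["PyQt6", "UV"], instructions=["Add proper punctuation."]) would send both hints with the audio; the file name here is a placeholder.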
- View the transcription in the main text area
- Edit the text if needed (the text area is editable)
- Use the toolbar buttons to:
- Copy the transcription to clipboard
- "Auto Copy" option: Toggle automatic copying of transcription to clipboard when completed
The application supports the following command line arguments:
python main.py -m
# or
python main.py --minimized
Using the -m or --minimized option will start the application minimized to the system tray only, without showing the window.
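A flag like this is commonly handled with Python's standard argparse module. The following is a minimal sketch of that pattern, not necessarily how main.py implements it; the parse_args wrapper is a hypothetical name.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the -m / --minimized flag described above."""
    parser = argparse.ArgumentParser(description="Open Super Whisper")
    parser.add_argument(
        "-m",
        "--minimized",
        action="store_true",
        help="start minimized to the system tray without showing the window",
    )
    return parser.parse_args(argv)


# With the flag, minimized is True; without it, the default is False.
print(parse_args(["--minimized"]).minimized)
print(parse_args([]).minimized)
```

The application could then skip showing its main window at startup whenever the minimized attribute is True.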
This project is licensed under the MIT License - see the LICENSE file for details.
- This application uses OpenAI's Whisper API for speech recognition
- Built with PyQt6 for the user interface
- Inspired by the Super Whisper desktop application