Voice Typer

A locally run speech-to-text voice typer with hotkey activation and a floating interface for effortless voice typing, powered by the publicly available OpenAI Whisper model.

📸 Screenshot

❓ Why?

I created this because I couldn't find a free solution that uses a locally run Whisper model (or any speech-to-text model) with hotkey activation for easy interaction. Even if such a solution exists, this was a cool project to work on. A hotkey, like holding Alt+X to record and releasing to transcribe, is an efficient way to control a speech-to-text model! It can type into any input: as long as your cursor can reach it, you can use Voice Typer on it. With all the typing we do for LLMs, sometimes we want to explain things in detail, but typing it all out can feel long and tedious. Using voice is faster and more interactive. Now, the next few projects will be built even faster. Let's gooo!

P.S. I've been using it a lot, maybe too much. Hopefully I still remember how to type with a keyboard.

✨ Features

Floating, draggable, minimal mic/close app interface
Speech-to-text transcription using OpenAI's Whisper model
Automatic output of transcribed text to the active window
Global hotkey: hold Alt+X to record, release to stop and transcribe
Press the "X" button on the floating window to quit
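
The hold-to-record behavior described above can be sketched as a small state machine. This is only an illustration, not the project's actual code: the class name `HoldToRecord`, its callbacks, and the key names are hypothetical, and in a real app `on_key_down`/`on_key_up` would be driven by whatever global keyboard hook the application uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class HoldToRecord:
    """Tracks a key combination and fires callbacks on hold and release.

    All names here are hypothetical; connect on_key_down/on_key_up to the
    global keyboard hook of your choice.
    """
    combo: frozenset
    on_start: Callable[[], None]
    on_stop: Callable[[], None]
    pressed: Set[str] = field(default_factory=set)
    recording: bool = False

    def on_key_down(self, key: str) -> None:
        self.pressed.add(key)
        # Start only once, even though the OS repeats key-down while held.
        if not self.recording and self.combo <= self.pressed:
            self.recording = True
            self.on_start()

    def on_key_up(self, key: str) -> None:
        self.pressed.discard(key)
        # Releasing any key of the combo ends the recording.
        if self.recording and key in self.combo:
            self.recording = False
            self.on_stop()


# Simulated key events: Alt down, X down (with one auto-repeat), X up, Alt up.
events: List[str] = []
hotkey = HoldToRecord(
    combo=frozenset({"alt", "x"}),
    on_start=lambda: events.append("start"),
    on_stop=lambda: events.append("stop"),
)
for action, key in [("down", "alt"), ("down", "x"), ("down", "x"),
                    ("up", "x"), ("up", "alt")]:
    if action == "down":
        hotkey.on_key_down(key)
    else:
        hotkey.on_key_up(key)
print(events)  # ['start', 'stop']
```

Guarding on the `recording` flag matters because most systems send repeated key-down events while a key is held, which would otherwise restart the recording over and over.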

💻 Requirements

Microphone
Windows 10 or later
At least 4GB of VRAM (for GPU acceleration)
Note: Transcription speed depends on your hardware.

📥 Installation for Users

I've only tested it on Windows 11. On other versions, you're free to try it and see if it works.
Since the exe file is larger than 2 GB, GitHub's per-file release limit, the archive is split into two parts:
(1) VoiceTyper.7z.001, and (2) VoiceTyper.7z.002

Go to the Releases page and download all files.
Install 7-Zip from www.7-zip.org if you haven't already.
Right-click on VoiceTyper.7z.001 > Select 7-Zip > Extract Here.
Run the VoiceTyper.exe file.
Wait for startup and the model download; notifications will appear.
1. Startup (~5-30s)
2. Model download (first run only) (~1-5 min)
3. Waiting times vary with hardware and internet connection
In any text field, hold Alt+X to record and release to transcribe.
Press the "X" button on the floating window to quit

👨‍💻 For Developers

🛠️ Setup

Clone this repository
Create a virtual environment: python -m venv env
Activate the virtual environment:
- Windows: .\env\Scripts\activate
- Unix/MacOS: source env/bin/activate
Install dependencies: pip install -r requirements.txt

🚀 Running the Application

Use .\run.sh to run the application in development mode

📦 Building the Executable

Ensure you have PyInstaller installed: pip install pyinstaller
Run the build script: .\build.sh
The executable will be created in the dist folder

📁 Project Structure

main.py: Entry point of the application
ui.py: User interface implementation
recorder.py: Audio recording functionality
transcriber.py: Speech-to-text transcription using Whisper
typer.py: Handles text output to the active window
utils.py: Utility functions
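
As a rough picture of how these modules could fit together, here is a minimal sketch of a record, transcribe, type pipeline. The function `make_pipeline` and the three callables are hypothetical stand-ins for recorder.py, transcriber.py, and typer.py; the real modules may expose entirely different interfaces.

```python
from typing import Callable, List


def make_pipeline(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    type_text: Callable[[str], None],
) -> Callable[[], str]:
    """Compose record -> transcribe -> type into a single callable."""

    def run_once() -> str:
        audio = record()
        text = transcribe(audio).strip()
        if text:  # skip typing when nothing was recognized
            type_text(text)
        return text

    return run_once


# Stubs in place of the real recorder, Whisper transcriber, and typer
# (hypothetical behavior, for illustration only).
typed: List[str] = []
run_once = make_pipeline(
    record=lambda: b"\x00\x01",
    transcribe=lambda audio: "  hello world ",
    type_text=typed.append,
)
result = run_once()
print(result, typed)  # hello world ['hello world']
```

Passing the three stages in as plain callables keeps each one replaceable, so the pipeline can be exercised without a microphone or a model.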

⚠️ Issues

The hotkey (Alt+X) is the ideal way to use the app, and it works. However, clicking the record button takes focus away from the input where the cursor was, so the transcribed text is typed into nothing. For now, the button only serves as a recording indicator.
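
One possible direction for the focus problem, untested against this project, is the Win32 extended window style `WS_EX_NOACTIVATE`, which keeps a top-level window from becoming the foreground window when clicked. The sketch below assumes you can get the floating window's native handle (`hwnd`) from the UI toolkit; the helper names are hypothetical, and whether this works depends on how the toolkit creates its window.

```python
import ctypes
import sys

GWL_EXSTYLE = -20              # index of the extended window style
WS_EX_NOACTIVATE = 0x08000000  # window does not become foreground when clicked


def with_no_activate(ex_style: int) -> int:
    """Return the extended style with WS_EX_NOACTIVATE set."""
    return ex_style | WS_EX_NOACTIVATE


def make_window_non_activating(hwnd: int) -> bool:
    """Apply the style to a native window handle. Does nothing outside Windows.

    How to obtain hwnd depends on the UI toolkit and is an assumption here.
    """
    if sys.platform != "win32":
        return False
    user32 = ctypes.windll.user32
    current = user32.GetWindowLongW(hwnd, GWL_EXSTYLE)
    user32.SetWindowLongW(hwnd, GWL_EXSTYLE, with_no_activate(current))
    return True
```

If this approach holds up, the record button could start and stop recording without moving the text cursor away from the target input.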

👨‍💻 Author

Made with ❤️ by @faqihxdev
