A macOS menu-bar voice input app. Hold Fn to record, release to inject the transcribed text into whatever field has focus — Slack, browser, IDE, anything.
- 🎙 Streaming transcription via Apple's Speech Recognition framework
- 🌐 Nine languages out of the box (Simplified/Traditional Chinese, English, Japanese, Korean, Spanish, French, German, Russian)
- 🤖 Optional LLM refinement (any OpenAI-compatible API) to fix homophone and mixed-language recognition errors
- 🎨 Frameless capsule HUD with live waveform driven by mic RMS
- 🪶 Menu-bar only — no Dock icon, no window clutter
- 🧠 CJK input-method aware paste (won't be eaten by Pinyin/Kana IMEs)
Requires macOS 14+. Open Terminal (⌘+Space → "Terminal") and paste:
curl -fsSL https://raw.githubusercontent.com/nickleefly/voice-input/main/install.sh | bash
Then:
- When System Settings opens, toggle VoiceInput ON under Privacy & Security → Accessibility.
- Quit VoiceInput from the menu bar (icon → Quit).
- Relaunch from /Applications/VoiceInput.app.
- Hold Fn to record, release to inject.
The installer downloads the latest release, strips macOS quarantine, re-signs the app locally so Accessibility permissions stick across launches, and clears any stale TCC grants left by earlier installs.
| Action | How |
|---|---|
| Record | Hold Fn |
| Stop & inject | Release Fn |
| Change language | Menu bar icon → Language |
| Toggle LLM refinement | Menu bar icon → LLM Refinement |
| Configure LLM API | Menu bar icon → LLM Refinement → Settings… |
| Quit | Menu bar icon → Quit |
If you enable LLM refinement, VoiceInput sends the raw transcript to
an OpenAI-compatible endpoint to correct obvious recognition errors
(e.g. 配森 → Python, 杰森 → JSON). The system prompt is
conservative — it won't rewrite or polish content that already looks
correct.
Configure in the Settings window:
- API Base URL: e.g. https://api.openai.com/v1, or any compatible gateway (DeepSeek, Together, local Ollama, etc.)
- API Key
- Model: e.g. gpt-4o-mini
After release, the HUD briefly shows "Refining…" before pasting the refined text.
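As a rough illustration of what a refinement call could look like, here is a minimal sketch that builds the request from the three settings above. It assumes the standard OpenAI-style `chat/completions` path and Bearer authentication. `LLMSettings`, `makeRefinementRequest`, and the prompt text are hypothetical names and placeholders for this example, not the contents of LLMRefiner.swift.

```swift
import Foundation

// Hypothetical container for the three Settings fields (names are illustrative).
struct LLMSettings {
    var baseURL: String   // e.g. "https://api.openai.com/v1"
    var apiKey: String
    var model: String     // e.g. "gpt-4o-mini"
}

struct ChatMessage: Codable {
    let role: String
    let content: String
}

struct ChatRequest: Codable {
    let model: String
    let messages: [ChatMessage]
}

enum RefinerError: Error { case invalidBaseURL }

// Placeholder prompt; the app's real system prompt is not reproduced here.
let placeholderSystemPrompt =
    "Fix only obvious speech-recognition errors. Return the text unchanged if it already looks correct."

/// Builds the endpoint URL, headers, and JSON body for one refinement call.
/// Assumes an OpenAI-style POST {baseURL}/chat/completions endpoint.
func makeRefinementRequest(
    transcript: String,
    settings: LLMSettings
) throws -> (url: URL, headers: [String: String], body: Data) {
    guard let base = URL(string: settings.baseURL) else {
        throw RefinerError.invalidBaseURL
    }
    let url = base
        .appendingPathComponent("chat")
        .appendingPathComponent("completions")
    let payload = ChatRequest(
        model: settings.model,
        messages: [
            ChatMessage(role: "system", content: placeholderSystemPrompt),
            ChatMessage(role: "user", content: transcript),
        ]
    )
    let headers = [
        "Authorization": "Bearer \(settings.apiKey)",
        "Content-Type": "application/json",
    ]
    return (url, headers, try JSONEncoder().encode(payload))
}
```

The returned pieces can be copied into a `URLRequest` with `httpMethod = "POST"`. Keeping request construction separate from the network call makes the payload easy to inspect when a gateway rejects it.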
Pressing Fn does nothing.
Open ~/.voiceinput-debug.log:
- `ERROR: CGEvent.tapCreate returned nil` → Accessibility permission isn't actually applied to the running binary. Re-run the installer: it re-signs locally and resets stale TCC entries, which is the fix in ~99% of cases.
- `SUCCESS: Event tap created` but no `Fn DOWN` lines → the key you're pressing isn't being delivered as Fn. On newer Macs, check that System Settings → Keyboard → "Press 🌐 key to" is set to Do Nothing (or Change Input Source). Karabiner-Elements and similar remappers will also intercept Fn before this app sees it.
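To make the second case more concrete, here is a simplified sketch of how a tap callback might decide that Fn went down or up. It uses the two signals the app is described as watching: the secondary-Fn modifier flag on `flagsChanged` events and key code 63 on key down/up. `TapEvent` and `FnDetector` are hypothetical stand-ins for the CGEvent fields, not the app's types, and the debounced release is left out. If a remapper swallows or rewrites the event, neither branch fires and no Fn line is logged.

```swift
import Foundation

// Stand-ins for the fields a CGEvent tap callback would read (hypothetical types).
enum TapEventKind { case flagsChanged, keyDown, keyUp, other }

struct TapEvent {
    let kind: TapEventKind
    let flags: UInt64    // raw CGEventFlags value
    let keyCode: Int64   // keyboard event key code
}

let secondaryFnMask: UInt64 = 0x0080_0000  // raw value of CGEventFlags.maskSecondaryFn
let fnKeyCode: Int64 = 63                  // kVK_Function

enum FnTransition: Equatable { case down, up }

/// Tracks Fn state across events and reports only actual transitions.
struct FnDetector {
    private(set) var isDown = false

    mutating func handle(_ event: TapEvent) -> FnTransition? {
        let nowDown: Bool
        switch event.kind {
        case .flagsChanged:
            // Modifier path: Fn is held while the secondary-Fn bit is set.
            nowDown = (event.flags & secondaryFnMask) != 0
        case .keyDown where event.keyCode == fnKeyCode:
            nowDown = true
        case .keyUp where event.keyCode == fnKeyCode:
            nowDown = false
        default:
            return nil  // Unrelated event.
        }
        // Both paths can report the same press, so ignore repeats.
        guard nowDown != isDown else { return nil }
        isDown = nowDown
        return nowDown ? .down : .up
    }
}
```

Checking both paths and de-duplicating on state means a press is counted once even when the system delivers it as a modifier change and as a key event.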
It worked, then suddenly stopped after a macOS update. macOS sometimes invalidates ad-hoc-signed app entries on update. Re-run the installer.
No microphone input / Speech Recognition fails. Check System Settings → Privacy & Security → Microphone and Speech Recognition — both must be ON for VoiceInput.
Paste doesn't appear in Chinese/Japanese/Korean apps. VoiceInput temporarily switches to ABC keyboard before pasting, then restores your IME. If you've remapped Cmd+V or the IME doesn't expose ABC as a fallback, paste may fail — let me know which app.
Requires Swift 5.9+ and macOS 14+.
make build # Build .app bundle
make run # Build and launch
make install # Install to /Applications
make clean # Clean build artifacts
After install, grant Accessibility permission in System Settings → Privacy & Security → Accessibility.
| File | Role |
|---|---|
| FnKeyMonitor.swift | Global CGEvent tap; detects Fn via both flagsChanged (maskSecondaryFn) and keyDown/Up (keyCode 63), with debounced release |
| SpeechRecognitionManager.swift | Streaming Apple Speech Recognition + audio RMS metering for the waveform |
| FloatingWindowController.swift | Frameless capsule HUD (NSPanel + NSVisualEffectView .hudWindow) |
| WaveformView.swift | 5-bar live waveform driven by mic RMS, with attack/release envelope and jitter |
| TextInjector.swift | Clipboard + simulated ⌘V; auto-switches CJK IMEs to ABC before pasting |
| LLMRefiner.swift | OpenAI-compatible refinement client with conservative correction prompt |
| StatusBarController.swift | Menu bar UI, language selection, LLM toggle, glue |
| SettingsWindowController.swift | API Base URL / Key / Model configuration |
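The waveform row compresses a small signal-processing chain: RMS per audio buffer, an attack/release envelope, then per-bar jitter. The sketch below shows one way that chain could be written. The coefficients, bar weights, and names (`EnvelopeFollower`, `barHeights`) are assumptions chosen for illustration and are not taken from WaveformView.swift.

```swift
import Foundation

/// Root-mean-square level of one buffer of samples in [-1, 1].
func rms(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
    return (sumOfSquares / Float(samples.count)).squareRoot()
}

/// One-pole envelope follower: rises quickly (attack), falls slowly (release).
struct EnvelopeFollower {
    var attack: Float = 0.6    // assumed coefficient in 0...1
    var release: Float = 0.15  // assumed coefficient in 0...1
    private(set) var level: Float = 0

    mutating func update(with target: Float) -> Float {
        let k = target > level ? attack : release
        level += (target - level) * k
        return level
    }
}

/// Heights for five bars: the shared level scaled per bar, plus optional random jitter.
func barHeights(level: Float, jitter: Float = 0.05) -> [Float] {
    let weights: [Float] = [0.5, 0.8, 1.0, 0.8, 0.5]  // assumed shape, tallest in the middle
    return weights.map { weight in
        let noise = jitter > 0 ? Float.random(in: -jitter...jitter) : 0
        return min(max(level * weight + noise, 0), 1)  // clamp to 0...1
    }
}
```

A fast attack with a slow release keeps the bars responsive to speech onsets without flickering between syllables, and the jitter keeps the five bars from moving in lockstep.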
App runs in LSUIElement mode (menu-bar only, no Dock icon).
This app was bootstrapped with a single Claude Code prompt — see
PROMPT.md for the full specification.
MIT