STT_Demos – Local Speech Recognition Showcase

Jenkins Robotics

Project Information

Project Status : INACTIVE
Code Status : STABLE
Development Status : CONCLUDED

General Information

This project is a collection of local, offline speech-to-text (STT) demos used to benchmark and explore open-source speech recognition engines. The demos target robotics and voice-interface applications, and each one provides either a real-time or a batch (file-based) interface for quick testing and integration.

Key Outcome: pywhispercpp is the recommended engine for future integration, because it performed best among the engines explored here and runs natively on macOS with Metal acceleration.

Goals include:

Evaluate transcription speed and accuracy
Compare real-time vs batch models
Support macOS (Apple Silicon) with MPS where applicable
Build a foundation for full-duplex speech interaction
Integrate with TTS_Demos in future agents
Add transcript benchmarking and word error rate (WER) tools
Measure latency, duplication, and streaming fidelity
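
As an illustration of the WER goal above, here is a minimal sketch of a word error rate calculation. It is not a tool from this repo: the function name word_error_rate and the simple lowercase-and-split normalization are assumptions made for the example. WER is computed as the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER of `hypothesis` against `reference` (hypothetical helper)."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein, one row at a time).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One deletion ("on") and one substitution ("lights" -> "light") over 4 words.
print(word_error_rate("turn on the lights", "turn the light"))  # 0.5
```

A real benchmark would likely also strip punctuation and normalize numbers before scoring, since engines differ in how they format their output.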

Support

Like our work? Consider supporting Jenkins Robotics!

Subscribe ➔ https://www.youtube.com/@Jenkins_Robotics

Patreon ➔ https://www.patreon.com/JenkinsRobotics

Venmo ➔ https://venmo.com/u/JenkinsRobotics

STT Engines Included

Engine	Interface	Offline?	Notes
Vosk	Real-time	✅ Yes	Fast, lightweight, low-memory CPU STT
FasterWhisper	Real-time	✅ Yes	CTranslate2-backed Whisper. High accuracy; CPU-only on macOS
Whisper.cpp	CLI + GUI	✅ Yes	Metal/ANE-accelerated C++ engine for macOS
pywhispercpp	Python API	✅ Yes	Metal-accelerated Python bindings for Whisper.cpp
Whisper MLX	File	✅ Yes	GPU-accelerated MLX backend for macOS
RealTimeSTT	Real-time	✅ Yes	Lightweight real-time demo
SpeechRecSTT	Real-time	✅ Yes	Uses the Python SpeechRecognition package with its PocketSphinx backend

Installation Instructions

Clone this repo and install dependencies for each STT demo as needed. For macOS (Apple Silicon recommended):

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

To run a demo:

python whisper_stt.py         # FasterWhisper (CTranslate2), offline
python vosk_stt.py            # Vosk-based
python real_time_stt.py       # Stream + print live transcript
python speechrec_stt.py       # SpeechRecognition (pocketsphinx)

To run the Whisper.cpp CLI-based GUI:

python whisper_gui_app.py     # Runs rolling 10s inference using whisper.cpp
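
The rolling 10-second buffer mentioned above can be pictured as a fixed-length queue of audio samples. The following is a minimal sketch of that idea, not the code in whisper_gui_app.py: the class name RollingAudioBuffer and its methods are hypothetical, and treating "rolling" as a sliding window over the most recent samples is one plausible reading. The 16 kHz default matches the sample rate Whisper models expect.

```python
from collections import deque
from typing import Iterable, List


class RollingAudioBuffer:
    """Keep only the most recent `seconds` of audio samples (hypothetical helper)."""

    def __init__(self, sample_rate: int = 16_000, seconds: float = 10.0) -> None:
        self.sample_rate = sample_rate
        # A deque with maxlen silently drops the oldest samples once it is full.
        self._samples = deque(maxlen=int(sample_rate * seconds))

    def extend(self, chunk: Iterable[float]) -> None:
        """Append a chunk of mono samples, e.g. from a microphone callback."""
        self._samples.extend(chunk)

    def snapshot(self) -> List[float]:
        """Return a copy of the current window, oldest sample first."""
        return list(self._samples)

    def duration(self) -> float:
        """Length of the buffered audio in seconds."""
        return len(self._samples) / self.sample_rate


# Tiny demonstration with a 4 Hz "sample rate" so the numbers stay readable.
buf = RollingAudioBuffer(sample_rate=4, seconds=2)  # holds at most 8 samples
buf.extend(range(10))                               # push 10 samples: 0..9
print(buf.snapshot())  # [2, 3, 4, 5, 6, 7, 8, 9]
print(buf.duration())  # 2.0
```

In an app built this way, each snapshot would be written to a 16 kHz WAV file (or passed as an array) and handed to the engine for transcription.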

CLI + Real-Time App Summaries

whisper_gui_app.py
Calls the Whisper.cpp CLI to transcribe rolling 10-second microphone buffers. Shows the final, cleaned-up transcript and saves it to a .txt file.
whisper_stt.py
Runs FasterWhisper (CTranslate2) on CPU. GUI with volume meter and chunked partial/final transcript view.
vosk_stt.py
Lightweight Kaldi-based transcription. Fast and accurate. CPU only.
pywhispercpp_demo.py
GPU-accelerated via Metal. Uses the pywhispercpp bindings and a simple file-based API.
mlx_whisper_stt.py
Apple MLX version of Whisper. Fast file-based inference with the whisper-medium model.
real_time_stt.py
Basic microphone streaming demo. Updates in real time.
speechrec_stt.py
Fully offline. Uses pocketsphinx via SpeechRecognition for basic commands.
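
For the file-based pywhispercpp workflow summarized above, a minimal sketch might look like the following. It is not a copy of pywhispercpp_demo.py: the helper names join_segments and transcribe_file, the "base.en" model name, and the sample.wav path are assumptions for illustration. It assumes pywhispercpp is installed (pip install pywhispercpp) and relies on its Model class, whose transcribe method returns segments carrying a text attribute.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Concatenate the `.text` of each segment into one transcript string."""
    return " ".join(seg.text.strip() for seg in segments if seg.text.strip())


def transcribe_file(path: str, model_name: str = "base.en") -> str:
    """Transcribe an audio file with pywhispercpp (hypothetical wrapper)."""
    # Imported lazily so join_segments can be used without the package installed.
    from pywhispercpp.model import Model

    model = Model(model_name)          # the model is fetched on first use
    segments = model.transcribe(path)  # list of segments with t0, t1, text
    return join_segments(segments)


# Example call (requires pywhispercpp and a real audio file):
# print(transcribe_file("sample.wav"))  # "sample.wav" is a placeholder path
```

Keeping the segment-joining step separate makes it easy to swap in timestamped output later, since each segment also carries its start and end times.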

Next Steps

This project is no longer in active development. The engines explored and the findings, particularly the selection of pywhispercpp, will inform future agents and integrations.

Links

FOLLOW US ►

Discord ➔ https://discord.gg/sAnE5pRVyT
Patreon ➔ https://www.patreon.com/JenkinsRobotics
Twitter ➔ https://twitter.com/jenkinsrobotics
Instagram ➔ https://www.instagram.com/jenkinsrobotics/
Facebook ➔ https://www.facebook.com/jenkinsrobotics/
GitHub ➔ https://jenkinsrobotics.github.io

Licenses and Credits

All third-party models and libraries retain their original licenses. This repo is intended for R&D, robotics, and AI voice assistant prototyping.
