Velvet

Local speech-to-text that runs entirely on your machine. No API keys, no cloud, no subscriptions. Click the orb, speak, get text.

What It Does

Click the orb to start recording
Speak — the orb reacts to your voice in real-time (WebGL shader + FFT audio analysis)
Click again to stop — transcription happens locally via faster-whisper
Text auto-copies to your clipboard

The window is transparent and always-on-top — it floats over your desktop like a glass HUD.

Stack

Layer	Tech
Window	Electron (frameless, transparent, always-on-top)
UI	HTML/CSS glass panels + Three.js WebGL orb with custom GLSL shaders
Audio	Web Audio API (FFT visualization) + Python sounddevice (capture)
Transcription	faster-whisper medium model (~1.5 GB)
Backend	Python Flask server on localhost
GPU	CUDA float16 (auto-falls back to CPU int8)

Setup

Prerequisites: Python 3.10+, Node.js 18+, a microphone

# Clone
git clone https://github.com/desertcache/velvet.git
cd velvet

# Python dependencies
pip install -r requirements.txt

# Electron
npm install

# Launch
start.bat          # Windows
npx electron .     # Any platform

Electron automatically starts the Python backend — no need to run it separately.

First run: The whisper model (~1.5 GB) downloads automatically. This takes a few minutes depending on your connection. The status will show "Loading model..." until it's ready.

CUDA (optional)

CUDA is used automatically if available — transcription is significantly faster on GPU. If you don't have a CUDA GPU, it falls back to CPU (int8) with no extra configuration.

For CUDA support, you need the NVIDIA cuDNN and cuBLAS libraries installed. See the faster-whisper docs for details.

Platform notes

Windows: Transparent window with acrylic blur works out of the box
macOS/Linux: Electron transparency works but the acrylic blur effect may vary by compositor

How It Works

The UI has three layers:

Glass panels — CSS backdrop-filter: blur(8px) with near-zero opacity backgrounds over a transparent Electron window
Soul Orb — A Three.js sphere with custom vertex/fragment shaders. Simplex noise deforms the mesh, fresnel shading creates edge glow, and voice amplitude drives real-time shape morphing
Waveform visualizer — Canvas-drawn frequency bars from Web Audio API FFT data, masked with a gradient fade

The orb has four visual states (IDLE / LISTENING / PROCESSING / SPEAKING), each defining colors, noise parameters, scale, and shape. All transitions are lerped frame-by-frame for smooth organic motion.

Files

velvet/
├── main.js           # Electron main process (window config + spawns Flask)
├── preload.js        # IPC bridge (minimize/close)
├── index.html        # UI — glass panels, controls, waveform canvas
├── orb.js            # Three.js SoulOrb — WebGL shaders, state machine, audio reactivity
├── server.py         # Flask backend — mic capture, whisper transcription
├── start.bat         # Windows launcher
├── requirements.txt  # Python deps
└── package.json      # Electron dep

API

The Flask backend runs on localhost:5111:

Endpoint	Method	Description
`/status`	GET	Model load state + device (CUDA/CPU)
`/record`	POST	Start mic capture
`/stop`	POST	Stop capture
`/transcribe`	POST	Run whisper on captured audio, return text

License

MIT

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Velvet

What It Does

Stack

Setup

CUDA (optional)

Platform notes

How It Works

Files

API

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
screenshots		screenshots
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
index.html		index.html
main.js		main.js
orb.js		orb.js
package-lock.json		package-lock.json
package.json		package.json
preload.js		preload.js
requirements.txt		requirements.txt
server.py		server.py
start.bat		start.bat

Folders and files

Latest commit

History

Repository files navigation

Velvet

What It Does

Stack

Setup

CUDA (optional)

Platform notes

How It Works

Files

API

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages