ollama-voice

Plug whisper audio transcription into a local ollama server and output TTS audio responses.

This is a simple combination of three tools, all running offline (a minimal sketch of the loop follows the list):

  • Speech recognition: whisper, running local models offline
  • Large language model: ollama, running local models offline
  • Text to speech: pyttsx3, which works offline out of the box
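
As a rough illustration, the whole loop reduces to three calls. The sketch below is an assumption-laden outline, not the repository's code: the model names, the audio file path, and ollama's default /api/generate endpoint on port 11434 are all illustrative.

```python
import pyttsx3
import requests
import whisper

# Illustrative only: paths, model names, and the wiring between the
# three stages are assumptions, not the repository's actual code.
stt = whisper.load_model("large-v3", download_root="whisper")
text = stt.transcribe("query.wav")["text"]            # speech -> text

reply = requests.post(
    "http://localhost:11434/api/generate",            # ollama's default port
    json={"model": "mistral", "prompt": text, "stream": False},
).json()["response"]                                  # text -> answer

tts = pyttsx3.init()
tts.say(reply)                                        # answer -> speech
tts.runAndWait()
```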

Prerequisites

The whisper dependencies are set up to run on the GPU, so install CUDA before running pip install.
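
A quick way to verify the GPU is visible, assuming PyTorch is installed (whisper pulls it in as a dependency):

```python
import torch

print(torch.cuda.is_available())      # True once CUDA is set up correctly
print(torch.cuda.get_device_name(0))  # e.g. your GPU's model name
```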

Running

Install ollama (e.g. curl https://ollama.ai/install.sh | sh) and make sure the server is running locally first (under WSL on Windows).
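
Before going further you can check that the server is reachable. By default ollama listens on port 11434, and its root path answers with a short status string:

```python
import requests

# Prints "Ollama is running" when the local server is up.
print(requests.get("http://localhost:11434/").text)
```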

Download a whisper model and place it in the whisper subfolder (e.g. https://openaipublic.azureedge.net/main/whisper/models/e5b1a55b89c1367dacf97e3e19bfd829a01529dbfdeefa8caeb59b3f1b81dadb/large-v3.pt)

Configure the settings in assistant.yaml. (By default it is set up to work in French with the ollama mistral model.)
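
The exact schema of assistant.yaml is not reproduced here; the snippet below only shows the generic way such a file is loaded, and the keys it mentions are placeholders, not the real ones:

```python
import yaml  # pip install pyyaml

with open("assistant.yaml", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

# The real keys (model name, language, whisper model path, ...) are
# whatever assistant.yaml actually defines; inspect before editing.
print(list(cfg.keys()))
```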

Run assistant.py

Hold the space key down to talk; the AI will interpret the query when you release the key.
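
The repository's actual key handling is not shown here; as a hedged sketch, push-to-talk recording can be built from the keyboard and sounddevice libraries (neither is confirmed to be what assistant.py uses):

```python
import keyboard           # pip install keyboard (may need admin rights)
import numpy as np
import sounddevice as sd  # pip install sounddevice

SAMPLE_RATE = 16000       # whisper expects 16 kHz mono audio
chunks = []

keyboard.wait("space")    # block until the key goes down
with sd.InputStream(samplerate=SAMPLE_RATE, channels=1,
                    callback=lambda data, *_: chunks.append(data.copy())):
    while keyboard.is_pressed("space"):
        sd.sleep(50)      # keep recording while the key is held

audio = np.concatenate(chunks).flatten()  # float32 array, ready for whisper
```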

Todo

  • Rearrange the code base
  • Multi-threading to overlap TTS and speech recognition (ollama already runs remotely in parallel); a minimal sketch follows the list
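
One common way to get that overlap, sketched here with a worker thread and a queue (an assumption about the eventual design, not existing code):

```python
import queue
import threading

import pyttsx3

speech_queue: queue.Queue = queue.Queue()

def tts_worker():
    # One engine per thread avoids cross-thread driver issues.
    engine = pyttsx3.init()
    while (text := speech_queue.get()) is not None:
        engine.say(text)
        engine.runAndWait()

threading.Thread(target=tts_worker, daemon=True).start()
speech_queue.put("Bonjour !")  # the main thread stays free to listen
```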
