ollama-voice

Plug whisper audio transcription into a local ollama server and output TTS audio responses.

This is a simple combination of three tools, all running offline (a minimal sketch of the loop follows the list):

  • Speech recognition: whisper, running local models offline
  • Large language model: ollama, running local models offline
  • Text to speech: pyttsx3, which works offline out of the box
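
As a rough illustration, the whole loop reduces to three calls. The sketch below is an assumption-laden outline, not the repository's code: the model names, the audio file path, and ollama's default /api/generate endpoint on port 11434 are all illustrative.

```python
import pyttsx3
import requests
import whisper

# Illustrative only: paths, model names, and the wiring between the
# three stages are assumptions, not the repository's actual code.
stt = whisper.load_model("large-v3", download_root="whisper")
text = stt.transcribe("query.wav")["text"]            # speech -> text

reply = requests.post(
    "http://localhost:11434/api/generate",            # ollama's default port
    json={"model": "mistral", "prompt": text, "stream": False},
).json()["response"]                                  # text -> answer

tts = pyttsx3.init()
tts.say(reply)                                        # answer -> speech
tts.runAndWait()
```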

Prerequisites

The whisper dependencies are set up to run on the GPU, so install CUDA before running pip install.
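
A quick way to verify the GPU is visible, assuming PyTorch is installed (whisper pulls it in as a dependency):

```python
import torch

print(torch.cuda.is_available())      # True once CUDA is set up correctly
print(torch.cuda.get_device_name(0))  # e.g. your GPU's model name
```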

Running

Install ollama (e.g. curl https://ollama.ai/install.sh | sh) and make sure the server is running locally first (under WSL on Windows).
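
Before going further you can check that the server is reachable. By default ollama listens on port 11434, and its root path answers with a short status string:

```python
import requests

# Prints "Ollama is running" when the local server is up.
print(requests.get("http://localhost:11434/").text)
```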

Download a whisper model and place it in the whisper subfolder (e.g. https://openaipublic.azureedge.net/main/whisper/models/e5b1a55b89c1367dacf97e3e19bfd829a01529dbfdeefa8caeb59b3f1b81dadb/large-v3.pt)

Configure the settings in assistant.yaml. (By default it is set up to work in French with the ollama mistral model.)
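
The exact schema of assistant.yaml is not reproduced here; the snippet below only shows the generic way such a file is loaded, and the keys it mentions are placeholders, not the real ones:

```python
import yaml  # pip install pyyaml

with open("assistant.yaml", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

# The real keys (model name, language, whisper model path, ...) are
# whatever assistant.yaml actually defines; inspect before editing.
print(list(cfg.keys()))
```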

Run assistant.py

Hold the space key down to talk; the AI will interpret the query when you release the key.
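
The repository's actual key handling is not shown here; as a hedged sketch, push-to-talk recording can be built from the keyboard and sounddevice libraries (neither is confirmed to be what assistant.py uses):

```python
import keyboard           # pip install keyboard (may need admin rights)
import numpy as np
import sounddevice as sd  # pip install sounddevice

SAMPLE_RATE = 16000       # whisper expects 16 kHz mono audio
chunks = []

keyboard.wait("space")    # block until the key goes down
with sd.InputStream(samplerate=SAMPLE_RATE, channels=1,
                    callback=lambda data, *_: chunks.append(data.copy())):
    while keyboard.is_pressed("space"):
        sd.sleep(50)      # keep recording while the key is held

audio = np.concatenate(chunks).flatten()  # float32 array, ready for whisper
```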

Todo

  • Rearrange the code base
  • Multi-threading to overlap TTS and speech recognition (ollama already runs remotely in parallel); a minimal sketch follows the list
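
One common way to get that overlap, sketched here with a worker thread and a queue (an assumption about the eventual design, not existing code):

```python
import queue
import threading

import pyttsx3

speech_queue: queue.Queue = queue.Queue()

def tts_worker():
    # One engine per thread avoids cross-thread driver issues.
    engine = pyttsx3.init()
    while (text := speech_queue.get()) is not None:
        engine.say(text)
        engine.runAndWait()

threading.Thread(target=tts_worker, daemon=True).start()
speech_queue.put("Bonjour !")  # the main thread stays free to listen
```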
