go-whisper

Speech-to-text in Go. This is an early development version.

cmd contains an OpenAI-API compatible server
pkg contains the whisper service and client
sys contains the whisper bindings to the whisper.cpp library
third_party is a submodule for the whisper.cpp source

Running

(Note: Docker images are not created yet - this is some forward planning!)

There are Docker images for arm64 and amd64 (Intel). The arm64 image is built specifically for Jetson GPU support, but it will also run on a Raspberry Pi.

To use an NVIDIA GPU, you'll need to install the NVIDIA Container Toolkit first.

Create a Docker volume called "whisper" to store the Whisper language models. You can see which models are available to download locally here. The following command will run the server on port 8080:

# Omit "--runtime nvidia --gpus all" when not using an NVIDIA GPU
docker run \
  --name whisper-server --rm \
  --runtime nvidia --gpus all \
  -v whisper:/models -p 8080:8080 -e WHISPER_DATA=/models \
  ghcr.io/mutablelogic/go-whisper:latest

If you include a -debug flag at the end, you'll get more verbose output. The API is then available at http://localhost:8080/v1 and it generally conforms to the OpenAI API spec.

Sample Usage

In order to download a model, you can use the following command (for example):

curl -X POST -H "Content-Type: application/json" -d '{"Path" : "ggml-tiny.en-q8_0.bin" }' localhost:8080/v1/models

To list the models available, you can use the following command:

curl -X GET localhost:8080/v1/models

To delete a model, you can use the following command:

curl -X DELETE localhost:8080/v1/models/ggml-tiny.en-q8_0

To transcribe a media file into its original language, you can use the following command:

curl -F "model=ggml-tiny.en-q8_0" -F "file=@samples/jfk.wav" localhost:8080/v1/audio/transcriptions

To translate a media file into a different language, you can use the following command:

curl -F "model=ggml-tiny.en-q8_0" -F "file=@samples/de-podcast.wav" -F "language=en" localhost:8080/v1/audio/transcriptions

There's more information on the API here.

Building

If you are building a docker image, you just need Docker installed:

DOCKER_REGISTRY=docker.io/user make docker - builds a Docker image with the server binary, tagged to a specific registry

If you want to build the server yourself for your specific combination of hardware, you can use the Makefile in the root directory. The following dependencies are required:

Go 1.22
C++ compiler
FFmpeg 6.1 libraries (see here for more information)
For CUDA, you'll need the CUDA toolkit including the nvcc compiler

The following Makefile targets can be used:

make server - creates the server binary, and places it in the build directory. Should link to Metal on macOS
GGML_CUDA=1 make server - creates the server binary linked to CUDA, and places it in the build directory. Should work for amd64 and arm64 (Jetson) platforms

See all the other targets in the Makefile for more information.

Status

Still in development. See this issue for remaining tasks to be completed.

Contributing & Distribution

This module is currently in development and subject to change.

Please do file feature requests and bugs here. The license is Apache 2 so feel free to redistribute. Redistributions in either source code or binary form must reproduce the copyright notice, and please link back to this repository for more information:

go-whisper
https://github.com/mutablelogic/go-whisper/
Copyright (c) 2023-2024 David Thorpe, All rights reserved.

whisper.cpp
https://github.com/ggerganov/whisper.cpp
Copyright (c) 2023-2024 The ggml authors

This software links to static libraries of whisper.cpp licensed under the MIT License.
