Run OpenAI Whisper as a Replicate Cog on Fly.io!
This app exposes the Whisper model via a simple HTTP server, thanks to Replicate Cog. Cog is an open-source tool that lets you package machine learning models in a standard, production-ready container. Once you're up and running, you can transcribe audio using the `/predictions` endpoint.
Create and deploy the app in a single command:
```
fly launch --from https://github.com/fly-apps/cog-whisper --no-public-ips
```

Assign a Flycast IP to the app:
```
fly ips allocate-v6 --private
```

That's it! You can now access the app at `http://<APP_NAME>.flycast/predictions`.
Important
By default, the app runs on Fly GPUs — Nvidia L40S, to be exact. This can be customized in the `fly.toml` vm settings. Without a GPU, the app will still run on a standard Fly Machine, but performance will be reduced.
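As a rough sketch, the GPU selection lives in the `[[vm]]` section of `fly.toml`; the exact values below are illustrative assumptions, not necessarily the repository's defaults:

```toml
# Illustrative fly.toml excerpt -- values here are assumptions, check the
# repo's fly.toml for the real defaults.
[[vm]]
  size = "l40s"   # Fly GPU preset; remove or change this to run on a CPU-only Machine
```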
Transcribe an audio file by sending a request to the `/predictions` endpoint:

```
curl -X PUT \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "audio": "https://fly.storage.tigris.dev/cogs/bun_on_fly.mp3"
    }
  }' \
  http://cog-whisper.flycast/predictions/test | jq
```
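The same request can be made from code. Here is a minimal Python sketch using only the standard library; the helper names are my own, and it assumes the app is reachable at its Flycast address from inside your Fly private network:

```python
import json
import urllib.request


def build_payload(audio_url):
    """Build the JSON body expected by Cog's /predictions endpoint."""
    return json.dumps({"input": {"audio": audio_url}}).encode()


def transcribe(endpoint, audio_url):
    """PUT a prediction request to a running cog-whisper app and return the parsed response."""
    req = urllib.request.Request(
        endpoint,
        data=build_payload(audio_url),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example (hypothetical; requires the app to be reachable over Flycast):
# result = transcribe(
#     "http://cog-whisper.flycast/predictions/test",
#     "https://fly.storage.tigris.dev/cogs/bun_on_fly.mp3",
# )
```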
To run the model locally:

- Clone the `cog-whisper` repository from GitHub:

  ```
  git clone git@github.com:fly-apps/cog-whisper.git
  ```

- Navigate into the cloned directory:

  ```
  cd cog-whisper
  ```

- Run locally. First, run `get_weights.sh` from the project root to download the pre-trained weights, then build a container and run predictions:

  ```
  ./scripts/get_weights.sh
  cog predict -i audio="<path/to/your/audio/file>"
  ```

- Build the Docker image using `cog`:

  ```
  cog build -t whisper
  ```
Create an issue or ask a question here: https://community.fly.io/