OctoAI (formerly OctoML) was a Seattle-based AI inference platform serving open-source foundation models (Llama 2/3, Mixtral, SDXL, Stable Diffusion, Whisper) behind OpenAI-compatible REST APIs, plus OctoStack for private VPC and on-premises deployment.
Status: defunct. NVIDIA acquired OctoAI in September 2024 (reported price ~$165M), and OctoAI terminated all hosted services, accounts, and SDK access on 31 October 2024. The octo.ai domain now returns a 301 redirect to nvidia.com. The team and technology were absorbed into NVIDIA's internal AI inference stack and are not separately purchasable.
This repository is a historical record, in api.json/apis.yml form, of the former OctoAI developer surface (text generation, image generation, asset library, compute service, OctoStack) and of the artifacts that remain in the octoml GitHub org.