title: What is DeepInfra
description: AI inference cloud with an OpenAI-compatible API, hundreds of open-source models, private GPU deployments, and GPU rental.
icon: bolt

DeepInfra is an AI inference cloud that makes it simple to run the latest machine learning models at scale — LLMs, vision, embeddings, image generation, video generation, speech, and more.

What you can do

OpenAI-compatible API for 100+ LLMs. Swap your base URL, keep your code.

Multimodal models for visual understanding and document text extraction.

State-of-the-art embedding and reranker models for search and RAG.

Image and video generation with FLUX, Stable Diffusion, text-to-video models, and more.

Speech recognition (Whisper) and text-to-speech models.

Run your own fine-tuned LLM on A100 / H100 / H200 / B200 / B300 GPUs with autoscaling.

Why DeepInfra

Drop-in OpenAI replacement. Point your existing OpenAI SDK at https://api.deepinfra.com/v1/openai and your code keeps working: only the base URL, API key, and model name change. No migration required.

Best price for open-source models. DeepInfra consistently offers the lowest prices for open-source model inference. You pay only per token, with no idle GPU time, no minimums, and no seat fees. DeepInfra is also the provider with the most models on OpenRouter.

Always-fresh model catalog. DeepInfra is typically among the first providers to deploy a newly released model.

Private deployments for compliance and customization. Need to run your own fine-tuned weights, or require data isolation? Deploy a dedicated instance on A100/H100/H200/B200/B300 with autoscaling and a private endpoint — competitive GPU pricing, deployable in just a few clicks.

GPU Clusters for training and full control. Rent a B200 or B300 cluster with SSH access and run whatever you want.

Get started in 60 seconds

Make your first API call. The Python example below uses the official openai package (pip install openai).

Quick example

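import os

from openai import OpenAI

# Read the key from the DEEPINFRA_TOKEN environment variable.
# A literal "$DEEPINFRA_TOKEN" string is not expanded by Python.
client = OpenAI(
    api_key=os.environ["DEEPINFRA_TOKEN"],
    base_url="https://api.deepinfra.com/v1/openai",
)

# Send a single-turn chat request and print the model's reply.
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)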

Get your API key from the Dashboard.
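
If you would rather not install any package, the same call can be sketched with the Python standard library alone. This is a minimal sketch, not an official client. It assumes the endpoint follows the OpenAI convention of a /chat/completions path under the base URL with Bearer-token authentication, and that the key is stored in a DEEPINFRA_TOKEN environment variable. The helper names build_chat_request and send are hypothetical.

```python
import json
import os
import urllib.request

# Base URL taken from the text above.
BASE_URL = "https://api.deepinfra.com/v1/openai"


def build_chat_request(prompt, model="deepseek-ai/DeepSeek-V3", token=None):
    """Build (but do not send) a chat completions request.

    Assumption: the path and payload mirror the OpenAI chat completions API.
    """
    if token is None:
        token = os.environ.get("DEEPINFRA_TOKEN", "")
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage (performs a real network call and needs a valid key):
# print(send(build_chat_request("Hello!")))
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and payload before spending any tokens.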