api_llm

Direct HTTP API bindings for major LLM providers with enterprise reliability features.

🎯 Architecture: Stateless HTTP Clients

All API crates are designed as stateless HTTP clients with zero persistence requirements. They provide:

Direct HTTP calls to respective LLM provider APIs
In-memory operation state only (resets on restart)
No external storage dependencies (databases, files, caches)
No configuration persistence beyond environment variables

This keeps deployments lightweight and container-friendly: there is no storage layer to provision, migrate, or back up.

🏛️ Governing Principle: "Thin Client, Rich API"

Expose all server-side functionality transparently while maintaining zero client-side intelligence or automatic behaviors.

Key principles:

API Transparency: One-to-one mapping with provider APIs without hidden behaviors
Zero Client Intelligence: No automatic decision-making or magic thresholds
Explicit Control: Developer decides when, how, and why operations occur
Information vs Action: Clear separation between data retrieval and state changes
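
As a rough illustration of the last three principles, the sketch below uses a hypothetical `CircuitBreaker` type. It is not taken from any api_llm crate, and every name in it is an assumption. The failure threshold is supplied by the caller rather than defaulted, reading the state takes `&self`, and anything that changes state takes `&mut self` and runs only when the caller invokes it.

```rust
/// Observable state of the breaker (information only).
#[ derive( Debug, Clone, Copy, PartialEq, Eq ) ]
pub enum CircuitState
{
  Closed,
  Open,
}

/// Hypothetical in-memory circuit breaker; not the api_llm implementation.
#[ derive( Debug ) ]
pub struct CircuitBreaker
{
  failure_threshold : u32, // chosen by the caller, no built-in default
  failures : u32,
}

impl CircuitBreaker
{
  /// The caller must pick the threshold explicitly.
  pub fn new( failure_threshold : u32 ) -> Self
  {
    Self { failure_threshold, failures : 0 }
  }

  /// Information: reads the state and never changes it.
  pub fn state( &self ) -> CircuitState
  {
    if self.failures >= self.failure_threshold
    {
      CircuitState::Open
    }
    else
    {
      CircuitState::Closed
    }
  }

  /// Action: records one failed call.
  pub fn record_failure( &mut self )
  {
    self.failures = self.failures.saturating_add( 1 );
  }

  /// Action: closes the breaker again, only when the caller asks for it.
  pub fn reset( &mut self )
  {
    self.failures = 0;
  }
}
```

With a threshold of 2, the breaker reports `Closed` after one recorded failure and `Open` after two. It stays open until the caller invokes `reset`, since the sketch has no timer or automatic recovery.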

Scope

In Scope

Text generation (single and multi-turn conversations)
Streaming responses (SSE and WebSocket where applicable)
Function/tool calling with full schema support
Vision and multimodal inputs
Audio processing (speech-to-text, text-to-speech)
Embedding generation
Model listing and information
Token counting
Batch operations
Enterprise reliability (retry, circuit breaker, rate limiting, failover, health checks)
Synchronous API wrappers

Out of Scope

High-level abstractions or unified interfaces (see llm_contract)
Provider switching or cross-provider fallback logic
Business logic or application features
Persistent state management

API Crates

| Crate | Provider | Tests | Default Model |
|---|---|---|---|
| api_gemini | Google Gemini | 485 | gemini-2.5-flash |
| api_openai | OpenAI | 643 | gpt-5.1-chat-latest |
| api_claude | Anthropic Claude | 435 | claude-sonnet-4-5-20250929 |
| api_ollama | Ollama (Local) | 378 | llama3.2 |
| api_huggingface | HuggingFace | 534 | meta-llama/Llama-3.2-3B-Instruct |
| api_xai | xAI Grok | 127 | grok-2-1212 |

Quick Start

use api_openai::{ OpenAIClient, ChatRequest, ChatMessage, MessageRole };

#[tokio::main]
async fn main() -> Result< (), Box< dyn std::error::Error > >
{
  let client = OpenAIClient::new_from_env()?;

  let request = ChatRequest::new( "gpt-4" )
    .with_message( ChatMessage::new( MessageRole::User, "Hello!" ) );

  let response = client.chat( &request ).await?;
  println!( "{}", response.choices[ 0 ].message.content );
  Ok( () )
}

Features

Core Capabilities:

Text generation with configurable parameters
Real-time streaming responses
Multi-turn conversation handling
Function calling with JSON schema validation
Vision support (image inputs)
Audio processing (where supported)
Embedding generation
Token counting (where supported)

Enterprise Reliability:

Retry logic with exponential backoff
Circuit breaker for fault tolerance
Rate limiting with token bucket algorithm
Multi-endpoint failover
Health checks with endpoint monitoring
In-memory request caching with TTL
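
To make the token bucket item in the list above concrete, here is a minimal sketch of the algorithm using only the standard library. The `TokenBucket` type, its fields, and its method names are assumptions for illustration and are not the crates' actual rate limiter. The current time is passed in explicitly so the behaviour is deterministic and easy to test.

```rust
use std::time::Instant;

/// Hypothetical token bucket; all state lives in memory only.
#[ derive( Debug ) ]
pub struct TokenBucket
{
  capacity : f64,
  tokens : f64,
  refill_per_sec : f64,
  last_refill : Instant,
}

impl TokenBucket
{
  /// Starts full. Capacity and refill rate are chosen by the caller.
  pub fn new( capacity : u32, refill_per_sec : f64, now : Instant ) -> Self
  {
    let capacity = f64::from( capacity );
    Self { capacity, tokens : capacity, refill_per_sec, last_refill : now }
  }

  /// Tries to take one token at time `now`.
  /// Returns `false` when the bucket is empty; it never sleeps or queues.
  pub fn try_acquire( &mut self, now : Instant ) -> bool
  {
    // Refill in proportion to the elapsed time, capped at capacity.
    let elapsed = now.saturating_duration_since( self.last_refill ).as_secs_f64();
    self.tokens = ( self.tokens + elapsed * self.refill_per_sec ).min( self.capacity );
    self.last_refill = self.last_refill.max( now );

    if self.tokens >= 1.0
    {
      self.tokens -= 1.0;
      true
    }
    else
    {
      false
    }
  }
}
```

A bucket with capacity 2 and a refill rate of one token per second allows two immediate calls and rejects the third. It allows one more call a second later. Whether to wait, drop the request, or surface an error is left to the caller.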

API Patterns:

Async API (tokio-based)
Sync API (blocking wrappers)
Builder patterns for configuration
Feature flags for zero-overhead customization
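
The builder pattern listed above might look like the following sketch, which configures the exponential backoff mentioned under Enterprise Reliability. `RetryPolicy` and its methods are hypothetical names chosen for this example, so the crates' real configuration types may differ. The sketch assumes that `max_attempts` counts the first call and that delays double on each retry up to a cap.

```rust
use std::time::Duration;

/// Hypothetical retry settings; the names here are illustrative only.
#[ derive( Debug, Clone, PartialEq, Eq ) ]
pub struct RetryPolicy
{
  max_attempts : u32,
  base_delay : Duration,
  max_delay : Duration,
}

impl RetryPolicy
{
  /// Starts with a single attempt, so no retry happens unless the caller opts in.
  pub fn new() -> Self
  {
    Self { max_attempts : 1, base_delay : Duration::ZERO, max_delay : Duration::MAX }
  }

  /// Total number of attempts, including the first call.
  pub fn with_max_attempts( mut self, max_attempts : u32 ) -> Self
  {
    self.max_attempts = max_attempts;
    self
  }

  /// Delay before the first retry.
  pub fn with_base_delay( mut self, base_delay : Duration ) -> Self
  {
    self.base_delay = base_delay;
    self
  }

  /// Upper bound applied to every computed delay.
  pub fn with_max_delay( mut self, max_delay : Duration ) -> Self
  {
    self.max_delay = max_delay;
    self
  }

  /// Delay before retry number `retry` (1-based): base * 2^(retry - 1), capped at `max_delay`.
  /// Returns `None` once the attempt budget is used up.
  pub fn delay_before_retry( &self, retry : u32 ) -> Option< Duration >
  {
    if retry == 0 || retry >= self.max_attempts
    {
      return None;
    }
    let factor = 2u32.checked_pow( retry - 1 ).unwrap_or( u32::MAX );
    Some( self.base_delay.saturating_mul( factor ).min( self.max_delay ) )
  }
}
```

Calling `RetryPolicy::new().with_max_attempts( 4 ).with_base_delay( Duration::from_millis( 100 ) ).with_max_delay( Duration::from_millis( 300 ) )` gives delays of 100 ms, 200 ms, and 300 ms (the third is capped) before the budget runs out. Production retry logic often adds random jitter, which this sketch omits.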

Secret Management

API keys are supplied through environment variables or workspace secrets:

# Environment (CI/CD)
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GEMINI_API_KEY="AIza..."
export HUGGINGFACE_API_KEY="hf_..."
export XAI_API_KEY="xai-..."

# Workspace secrets (local development)
source secret/-secrets.sh
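
On the Rust side, reading one of these variables can be sketched with the standard library alone. The helper `api_key_from_env` below is hypothetical and only shows the general idea. The crates' own loaders, such as the `new_from_env` constructor used in Quick Start, may validate keys differently.

```rust
use std::env;

/// Reads an API key from the named environment variable.
/// Hypothetical helper, not part of any api_llm crate.
pub fn api_key_from_env( var : &str ) -> Result< String, String >
{
  match env::var( var )
  {
    // Reject blank values early instead of sending an empty key to the provider.
    Ok( key ) if !key.trim().is_empty() => Ok( key ),
    Ok( _ ) => Err( format!( "{var} is set but empty" ) ),
    Err( env::VarError::NotPresent ) => Err( format!( "{var} is not set" ) ),
    Err( env::VarError::NotUnicode( _ ) ) => Err( format!( "{var} is not valid Unicode" ) ),
  }
}
```

For example, `api_key_from_env( "OPENAI_API_KEY" )` returns the key when the variable is exported as shown above. Otherwise it returns an error that names the missing variable. Nothing is written to disk, in line with the stateless design.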

Testing

# Check all crates compile
cargo check --workspace

# Run all tests (requires API keys)
cargo test --workspace

# Run tests for specific crate
cargo test -p api_openai

# Full validation
w3 .test level::3

# Build documentation
cargo doc --workspace --open

Testing Policy: All tests use real API integration. No mocking allowed.

Documentation

API Feature Matrix - Complete feature comparison
api_gemini - Google Gemini API client
api_openai - OpenAI API client
api_claude - Anthropic Claude API client
api_ollama - Ollama local API client
api_huggingface - HuggingFace Inference API client
api_xai - xAI Grok API client

Dependencies

All crates share common dependencies managed at workspace level:

reqwest: HTTP client with async support
tokio: Async runtime
serde/serde_json: Serialization
error_tools: Unified error handling
workspace_tools: Secret management

Contributing

Follow established patterns in existing code
Use 2-space indentation consistently
Add tests for new functionality
Update documentation for public APIs
Ensure zero clippy warnings: cargo clippy -- -D warnings
Follow zero-tolerance mock policy (real API integration only)

License

MIT
