feat: backend/sglang #69

thushan · 2025-09-26T08:09:36Z

This PR implements the SGLang backend.

Summary by CodeRabbit

New Features

- Added native SGLang backend support: OpenAI‑compatible endpoints for models, routing and provider recognition across the app; SGLang now listed in supported backends and response headers.
- Exposes SGLang health, metrics and version endpoints with Prometheus metrics and model-specific resource/timeout hints.

Documentation

- Comprehensive SGLang integration guide and API reference, navigation and README updates with usage examples (chat, generation, batch, vision, speculative decoding) and monitoring.

coderabbitai · 2025-09-26T08:09:42Z

Walkthrough

Adds SGLang as a first-class backend/provider: new profile (sglang.yaml), parser, converter, constants, routes, and documentation. Updates navigation, API reference, README, and profiles index. Tests added/updated for factory and new converter. vLLM profile gains a home URL. Supported backends list expanded to include sglang.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Profiles: SGLang add, vLLM tweak | `config/profiles/sglang.yaml`, `config/profiles/vllm.yaml`, `config/profiles/README.md` | New SGLang profile with endpoints, model/resource/metrics configuration; vLLM adds `home_url`; profiles README updated to include `sglang` and adjust OpenAI description. |
| Docs: Index, Overview, API, Integration, Nav | `docs/content/index.md`, `docs/content/integrations/overview.md`, `docs/content/api-reference/overview.md`, `docs/content/integrations/backend/sglang.md`, `docs/mkdocs.yml`, `docs/content/api-reference/sglang.md`, `docs/content/configuration/reference.md`, `docs/content/development/contributing.md` | Adds SGLang documentation pages and nav entries; inserts SGLang into API reference and site content; small anchor and contributing additions. |
| Converter: SGLang support | `internal/adapter/converter/sglang_converter.go`, `internal/adapter/converter/factory.go`, `internal/adapter/converter/factory_test.go` | Implements `SGLangConverter`, registers it in factory, and updates factory test to expect `sglang` format. |
| Parser & Profile types: SGLang | `internal/adapter/registry/profile/sglang.go`, `internal/adapter/registry/profile/sglang_parser.go`, `internal/adapter/registry/profile/parsers.go` | Adds SGLang OpenAI-compatible model types and a parser; registers parser case for `sglang`. |
| Routing & Provider wiring | `internal/app/handlers/server_routes.go`, `internal/app/handlers/handler_common.go` | Registers SGLang provider routes, model listing handler and marks provider supported. |
| Core constants & versioning | `internal/core/constants/providers.go`, `internal/core/domain/profile.go`, `internal/version/version.go` | Adds `ProviderTypeSGLang`/display constant, `ProfileSGLang` and includes `sglang` in supported backends. |
| Tests: Converter | `internal/adapter/converter/sglang_converter_test.go` | New comprehensive tests covering conversion, metadata handling, owner/alias logic, filtering and performance. |
| Top-level README | `readme.md` | Reworks provider list into a table and adds SGLang badge/entry. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant GW as Olla API Gateway
  participant RH as Routes/Handlers
  participant CV as Model Converter
  participant SG as SGLang Server

  Note over C,GW: List models via OpenAI-compatible endpoint
  C->>GW: GET /v1/models (X-Olla-Backend-Type: sglang)
  GW->>RH: genericProviderModelsHandler(sglang, openai)
  RH->>SG: GET /v1/models (SGLang)
  SG-->>RH: SGLang model list (JSON)
  RH->>CV: Convert SGLang->OpenAI model list
  CV-->>RH: OpenAI-compatible models
  RH-->>C: 200 OK, models (X-Olla-Backend-Type: sglang)

  rect rgba(230,245,255,0.6)
    Note right of CV: New/changed: SGLang converter — owner/parent/vision/radix/spec-decoding fields
  end
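The first flow can be sketched in a few lines of Go. This is only an illustration of the request path in the diagram, not Olla's actual handler: `modelsHandler`, `fetchFunc`, `listOnce`, and the trimmed `model`/`modelList` shapes are hypothetical names, and the backend call is injected as a function so the example runs without a real SGLang server. Only the `/v1/models` path and the `X-Olla-Backend-Type` header are taken from the walkthrough.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// model and modelList are hypothetical, minimal OpenAI-style shapes.
type model struct {
	ID      string `json:"id"`
	Object  string `json:"object"`
	OwnedBy string `json:"owned_by,omitempty"`
}

type modelList struct {
	Object string  `json:"object"`
	Data   []model `json:"data"`
}

// fetchFunc returns the raw body of the backend's GET /v1/models call.
type fetchFunc func(ctx context.Context) ([]byte, error)

// modelsHandler lists models from one backend, normalises them into an
// OpenAI-style list, and tags the response with the backend type header.
func modelsHandler(backendType string, fetch fetchFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		raw, err := fetch(r.Context())
		if err != nil {
			http.Error(w, "backend unavailable", http.StatusBadGateway)
			return
		}
		var upstream modelList
		if err := json.Unmarshal(raw, &upstream); err != nil {
			http.Error(w, "invalid backend response", http.StatusBadGateway)
			return
		}
		out := modelList{Object: "list", Data: []model{}}
		for _, m := range upstream.Data {
			m.Object = "model"
			out.Data = append(out.Data, m)
		}
		w.Header().Set("Content-Type", "application/json")
		w.Header().Set("X-Olla-Backend-Type", backendType)
		_ = json.NewEncoder(w).Encode(out)
	}
}

// listOnce drives the handler in memory and reports the status code,
// the backend header, and the number of models returned.
func listOnce(payload string) (status int, backend string, count int) {
	fetch := func(context.Context) ([]byte, error) { return []byte(payload), nil }
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/v1/models", nil)
	modelsHandler("sglang", fetch)(rec, req)

	var got modelList
	_ = json.Unmarshal(rec.Body.Bytes(), &got)
	return rec.Code, rec.Header().Get("X-Olla-Backend-Type"), len(got.Data)
}

func main() {
	status, backend, count := listOnce(`{"object":"list","data":[{"id":"example-model"}]}`)
	fmt.Println(status, backend, count)
}
```

Injecting the fetch step keeps the handler testable in isolation; a real implementation would presumably route through the gateway's endpoint selection instead.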

sequenceDiagram
  autonumber
  participant D as Discovery Job
  participant SG as SGLang Server
  participant PR as SGLang Parser
  participant REG as Model Registry

  Note over D,REG: Periodic backend discovery
  D->>SG: GET /v1/models
  SG-->>D: SGLangResponse
  D->>PR: Parse SGLangResponse
  PR-->>D: []*ModelInfo (type=sglang, details)
  D->>REG: Upsert models
  REG-->>D: Ack
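The parse step in the discovery flow might look roughly like the sketch below. The struct shapes are assumptions trimmed down for illustration (`sglangResponse`, `sglangModel`, and `modelInfo` are stand-ins, not the types in `sglang.go` or `model.go`), and `parseModels` is a hypothetical name. It shows the general idea only: decode the OpenAI-compatible payload, skip entries without an ID, and tag each model with the `sglang` type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sglangResponse and sglangModel are hypothetical, trimmed-down shapes of
// an OpenAI-compatible /v1/models payload.
type sglangResponse struct {
	Object string        `json:"object"`
	Data   []sglangModel `json:"data"`
}

type sglangModel struct {
	ID          string `json:"id"`
	OwnedBy     string `json:"owned_by"`
	MaxModelLen *int64 `json:"max_model_len,omitempty"`
}

// modelInfo is a stand-in for the registry's model record.
type modelInfo struct {
	Name             string
	Type             string
	MaxContextLength *int64
}

// parseModels decodes a model list and converts each usable entry.
func parseModels(data []byte) ([]*modelInfo, error) {
	if len(data) == 0 {
		return []*modelInfo{}, nil
	}
	var resp sglangResponse
	if err := json.Unmarshal(data, &resp); err != nil {
		return nil, fmt.Errorf("parse sglang response: %w", err)
	}
	models := make([]*modelInfo, 0, len(resp.Data))
	for _, m := range resp.Data {
		if m.ID == "" {
			continue // an entry without an identifier cannot be routed to
		}
		models = append(models, &modelInfo{
			Name:             m.ID,
			Type:             "sglang",
			MaxContextLength: m.MaxModelLen,
		})
	}
	return models, nil
}

func main() {
	payload := []byte(`{"object":"list","data":[{"id":"example-model","max_model_len":32768},{"id":""}]}`)
	models, err := parseModels(payload)
	if err != nil {
		panic(err)
	}
	for _, m := range models {
		ctxLen := int64(0)
		if m.MaxContextLength != nil {
			ctxLen = *m.MaxContextLength
		}
		fmt.Println(m.Name, m.Type, ctxLen)
	}
}
```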

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat: backend/vllm #44 — Adds a new backend/provider integration and similar parser/registration changes; parallels SGLang integration points.
tweaks & bugfixes #35 — Changes converter model-ID/alias handling; relates to SGLang converter ID/alias fallback logic.
refactor: Proxy Configurations #59 — Modifies converters and BaseConverter utilities that the new SGLang converter relies on.

Suggested labels

enhancement, llm-backend

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title “feat: backend/sglang” clearly indicates that this pull request introduces a new feature related to the SGLang backend, which aligns directly with the extensive code and documentation changes for SGLang integration described in the PR. |

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

docs/mkdocs.yml (1)

153-178: MkDocs nav still misses the new SGLang docs.

`mkdocs --strict` is failing because the navigation entries point to `integrations/backend/sglang.md` and `api-reference/sglang.md`, but those files aren’t present in the repo. Please add the missing documents (or fix the paths) so the nav links resolve.

Based on the pipeline logs

🧹 Nitpick comments (6)

config/profiles/README.md (1)

22-23: Minor wording nit.

If you update openai.yaml’s blurb again later, consider mirroring the other descriptions (“OpenAI-compatible API (generic profile)”) to keep the list uniform.

internal/adapter/converter/factory_test.go (1)

55-68: Tighten the assertions for SGLang support

Since the factory now emits a sixth converter, let’s assert on it directly to avoid silent regressions. Please extend the “GetConverter returns correct converters” table (and optionally the unsupported-format error assertions) to cover "sglang" explicitly. That gives us deterministic coverage that this new backend stays wired in.
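One way to picture the explicit check is the table-driven sketch below. The factory here is a hypothetical stand-in (`factory`, `newFactory`, `converter`, and `FormatName` are illustrative names, and the six format names passed in `main` are assumed), so it shows the shape of the assertion rather than the real `factory_test.go`.

```go
package main

import (
	"fmt"
	"sort"
)

// converter is a hypothetical stand-in for the model converter interface.
type converter interface {
	FormatName() string
}

type namedConverter string

func (n namedConverter) FormatName() string { return string(n) }

// factory maps a format name to its converter.
type factory struct {
	converters map[string]converter
}

func newFactory(formats ...string) *factory {
	f := &factory{converters: make(map[string]converter, len(formats))}
	for _, name := range formats {
		f.converters[name] = namedConverter(name)
	}
	return f
}

// supported returns the registered format names in sorted order.
func (f *factory) supported() []string {
	names := make([]string, 0, len(f.converters))
	for name := range f.converters {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// GetConverter returns the converter for a format or a descriptive error.
func (f *factory) GetConverter(format string) (converter, error) {
	c, ok := f.converters[format]
	if !ok {
		return nil, fmt.Errorf("unsupported format %q (supported: %v)", format, f.supported())
	}
	return c, nil
}

// checkFormats is the table-driven check: every expected format must
// resolve to a converter that reports the same name.
func checkFormats(f *factory, want []string) error {
	for _, name := range want {
		c, err := f.GetConverter(name)
		if err != nil {
			return err
		}
		if c.FormatName() != name {
			return fmt.Errorf("format %q resolved to %q", name, c.FormatName())
		}
	}
	return nil
}

func main() {
	// The six names below are assumed for illustration.
	f := newFactory("unified", "openai", "ollama", "lmstudio", "vllm", "sglang")
	fmt.Println(checkFormats(f, []string{"vllm", "sglang"}))
	_, err := f.GetConverter("unknown")
	fmt.Println(err)
}
```

Asserting on the name per format, rather than only on the converter count, is what catches a backend that silently drops out of the registration list.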

internal/adapter/converter/sglang_converter.go (4)

52-69: Make "created" deterministic and source from model when possible.

Using time.Now() at conversion time makes outputs non-deterministic and can drift from the model’s actual creation/ingest time. Prefer LastSeen or a metadata-provided timestamp, falling back to now as a last resort.

Apply this diff:
-	now := time.Now().Unix()
+	var created int64
+	if !model.LastSeen.IsZero() {
+		// The zero time.Time does not map to Unix 0, so test IsZero instead.
+		created = model.LastSeen.Unix()
+	} else if v := c.extractMetadataInt64(model.Metadata, "created"); v > 0 {
+		created = v
+	} else {
+		created = time.Now().Unix()
+	}
@@
 	sglangModel := &profile.SGLangModel{
 		ID:      modelID,
 		Object:  "model",
-		Created: now,
+		Created: created,
 		OwnedBy: c.determineOwner(modelID),
 		Root:    modelID, // SGLang typically sets root to the model ID
 	}
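The fallback order can be exercised in isolation with a small sketch. `resolveCreated`, `fixedClock`, `unix`, and `neverSeen` are hypothetical helpers, and the clock is injected so the result is reproducible. One detail worth keeping in mind: the zero `time.Time` has a large negative Unix value rather than 0, so the sketch checks `IsZero()`. For brevity it only accepts an `int64` under the `created` metadata key.

```go
package main

import (
	"fmt"
	"time"
)

// neverSeen is the zero time: the model has not been observed yet.
var neverSeen time.Time

// unix builds a time from whole seconds since the epoch.
func unix(sec int64) time.Time { return time.Unix(sec, 0) }

// fixedClock returns a clock that always reports the same instant.
func fixedClock(sec int64) func() time.Time {
	return func() time.Time { return unix(sec) }
}

// resolveCreated picks a stable "created" timestamp: the last-seen time if
// set, then a metadata-provided value, then the current time as a last resort.
func resolveCreated(lastSeen time.Time, metadata map[string]interface{}, now func() time.Time) int64 {
	if !lastSeen.IsZero() {
		return lastSeen.Unix()
	}
	// Reading from a nil map is safe and yields no value.
	if v, ok := metadata["created"].(int64); ok && v > 0 {
		return v
	}
	return now().Unix()
}

func main() {
	clock := fixedClock(1700000000)
	meta := map[string]interface{}{"created": int64(1650000000)}

	fmt.Println(resolveCreated(unix(1758873600), meta, clock)) // last-seen wins
	fmt.Println(resolveCreated(neverSeen, meta, clock))        // metadata fallback
	fmt.Println(resolveCreated(neverSeen, nil, clock))         // clock fallback
}
```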

71-74: Fallback max context length from metadata if domain field is empty.

Some sources may only provide max_model_len in metadata; add a fallback.
 	// Set max context length if available
 	if model.MaxContextLength != nil && *model.MaxContextLength > 0 {
 		sglangModel.MaxModelLen = model.MaxContextLength
 	}
+	if sglangModel.MaxModelLen == nil {
+		if maxLen := c.extractMetadataInt64(model.Metadata, "max_model_len"); maxLen > 0 {
+			sglangModel.MaxModelLen = &maxLen
+		}
+	}

106-114: Support common "parent" key as well as "parent_model".

Improves compatibility with varied metadata sources.
 func (c *SGLangConverter) getParentModel(model *domain.UnifiedModel) string {
 	// Check metadata first
 	if parentModel := c.ExtractMetadataString(model.Metadata, "parent_model"); parentModel != "" {
 		return parentModel
 	}
+	if parentModel := c.ExtractMetadataString(model.Metadata, "parent"); parentModel != "" {
+		return parentModel
+	}
 
 	return ""
 }
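If more alias keys turn up later, the same lookup generalises to an ordered key list. A minimal sketch, assuming string-valued metadata; `firstMetadataString` is a hypothetical helper, not part of `BaseConverter`:

```go
package main

import "fmt"

// firstMetadataString returns the first non-empty string found under the
// given keys, checked in order, or "" when none match.
func firstMetadataString(metadata map[string]interface{}, keys ...string) string {
	for _, key := range keys {
		if s, ok := metadata[key].(string); ok && s != "" {
			return s
		}
	}
	return ""
}

func main() {
	meta := map[string]interface{}{"parent": "base-model"}
	// "parent_model" is preferred; "parent" is the fallback key.
	fmt.Println(firstMetadataString(meta, "parent_model", "parent"))
}
```

Key order encodes precedence, so `parent_model` still wins when both keys are present.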

140-152: Harden metadata int parsing (strings).

Metadata often carries numbers as strings; parse them too.
 func (c *SGLangConverter) extractMetadataInt64(metadata map[string]interface{}, key string) int64 {
 	if val, ok := metadata[key].(int64); ok {
 		return val
 	}
 	if val, ok := metadata[key].(int); ok {
 		return int64(val)
 	}
 	if val, ok := metadata[key].(float64); ok {
 		return int64(val)
 	}
+	if val, ok := metadata[key].(string); ok {
+		if i, err := strconv.ParseInt(val, 10, 64); err == nil {
+			return i
+		}
+		if f, err := strconv.ParseFloat(val, 64); err == nil {
+			return int64(f)
+		}
+	}
 	return 0
 }
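The same parsing rules can be tried out standalone. This sketch uses a type switch rather than chained assertions and adds two small guards that are assumptions on my part, not part of the suggested diff: surrounding whitespace is trimmed, and `NaN`/`Inf` strings (which `strconv.ParseFloat` accepts) are rejected. `metadataInt64` is a hypothetical free function.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// metadataInt64 reads an integer from loosely typed metadata. It accepts
// int64, int, float64, and numeric strings, and returns 0 otherwise.
// Very large float values are not range-checked in this sketch.
func metadataInt64(metadata map[string]interface{}, key string) int64 {
	switch v := metadata[key].(type) {
	case int64:
		return v
	case int:
		return int64(v)
	case float64:
		return int64(v)
	case string:
		s := strings.TrimSpace(v)
		if i, err := strconv.ParseInt(s, 10, 64); err == nil {
			return i
		}
		// Fall back to float syntax such as "8192.0"; the fraction is truncated.
		if f, err := strconv.ParseFloat(s, 64); err == nil && !math.IsNaN(f) && !math.IsInf(f, 0) {
			return int64(f)
		}
	}
	return 0
}

func main() {
	meta := map[string]interface{}{
		"max_model_len": "32768",
		"created":       float64(1700000000),
		"bogus":         "not-a-number",
	}
	fmt.Println(metadataInt64(meta, "max_model_len"))
	fmt.Println(metadataInt64(meta, "created"))
	fmt.Println(metadataInt64(meta, "bogus"))
}
```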

The string-parsing change above also needs the strconv import:
 import (
+	"strconv"
 	"strings"
 	"time"

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d2bc4af and 7aeb09f.

📒 Files selected for processing (21)

config/profiles/README.md (1 hunks)
config/profiles/sglang.yaml (1 hunks)
config/profiles/vllm.yaml (1 hunks)
docs/content/api-reference/overview.md (2 hunks)
docs/content/index.md (3 hunks)
docs/content/integrations/backend/sglang.md (1 hunks)
docs/content/integrations/overview.md (1 hunks)
docs/mkdocs.yml (2 hunks)
internal/adapter/converter/factory.go (1 hunks)
internal/adapter/converter/factory_test.go (2 hunks)
internal/adapter/converter/sglang_converter.go (1 hunks)
internal/adapter/converter/sglang_converter_test.go (1 hunks)
internal/adapter/registry/profile/parsers.go (1 hunks)
internal/adapter/registry/profile/sglang.go (1 hunks)
internal/adapter/registry/profile/sglang_parser.go (1 hunks)
internal/app/handlers/handler_common.go (1 hunks)
internal/app/handlers/server_routes.go (1 hunks)
internal/core/constants/providers.go (1 hunks)
internal/core/domain/profile.go (1 hunks)
internal/version/version.go (1 hunks)
readme.md (2 hunks)

🧰 Additional context used

📓 Path-based instructions (3)

config/profiles/{ollama,lmstudio,litellm,openai,vllm}.yaml

📄 CodeRabbit inference engine (CLAUDE.md)

Provider-specific profiles must reside under config/profiles/ with the specified filenames

Files:

config/profiles/vllm.yaml

internal/app/handlers/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Set response headers on proxy responses: X-Olla-Endpoint, X-Olla-Model, X-Olla-Backend-Type, X-Olla-Request-ID, X-Olla-Response-Time

Files:

internal/app/handlers/server_routes.go
internal/app/handlers/handler_common.go

{internal,pkg}/**/*_test.go

📄 CodeRabbit inference engine (CLAUDE.md)

Include Go benchmarks (Benchmark* functions) for critical paths, proxy engine comparisons, pooling efficiency, and circuit breaker behaviour

Files:

internal/adapter/converter/factory_test.go
internal/adapter/converter/sglang_converter_test.go

🧠 Learnings (3)

📚 Learning: 2025-09-23T08:30:20.348Z

Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.348Z
Learning: Applies to internal/app/handlers/*.go : Set response headers on proxy responses: `X-Olla-Endpoint`, `X-Olla-Model`, `X-Olla-Backend-Type`, `X-Olla-Request-ID`, `X-Olla-Response-Time`

Applied to files:

docs/content/api-reference/overview.md
docs/content/index.md

📚 Learning: 2025-09-23T08:30:20.348Z

Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.348Z
Learning: Applies to internal/adapter/proxy/**/service.go : Proxy services must include response headers: `X-Olla-Endpoint`, `X-Olla-Model`, `X-Olla-Backend-Type`, `X-Olla-Request-ID`, `X-Olla-Response-Time`

Applied to files:

docs/content/api-reference/overview.md
docs/content/index.md

📚 Learning: 2025-09-23T08:30:20.348Z

Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-23T08:30:20.348Z
Learning: Applies to config/profiles/{ollama,lmstudio,litellm,openai,vllm}.yaml : Provider-specific profiles must reside under `config/profiles/` with the specified filenames

Applied to files:

config/profiles/README.md

🧬 Code graph analysis (7)

internal/app/handlers/server_routes.go (1)

internal/core/constants/providers.go (2)

ProviderTypeSGLang (8-8)

ProviderTypeOpenAI (6-6)

internal/adapter/registry/profile/sglang_parser.go (2)

internal/core/domain/model.go (2)

ModelInfo (26-33)

ModelDetails (11-24)

internal/adapter/registry/profile/sglang.go (1)

SGLangResponse (4-7)

internal/adapter/converter/sglang_converter_test.go (6)

internal/adapter/converter/sglang_converter.go (2)

NewSGLangConverter (23-27)

SGLangConverter (18-20)

internal/core/constants/providers.go (1)

ProviderTypeSGLang (8-8)

internal/core/domain/unified_model.go (3)

UnifiedModel (15-31)

AliasEntry (9-12)

SourceEndpoint (34-44)

internal/core/ports/model_converter.go (1)

ModelFilters (18-23)

internal/adapter/registry/profile/sglang.go (1)

SGLangResponse (4-7)

internal/adapter/converter/base_converter.go (2)

BaseConverter (8-10)

NewBaseConverter (13-17)

internal/app/handlers/handler_common.go (1)

internal/core/constants/providers.go (1)

ProviderTypeSGLang (8-8)

internal/adapter/converter/factory.go (1)

internal/adapter/converter/sglang_converter.go (1)

NewSGLangConverter (23-27)

internal/adapter/converter/sglang_converter.go (5)

internal/adapter/registry/profile/sglang.go (2)

SGLangResponse (4-7)

SGLangModel (10-26)

internal/adapter/converter/base_converter.go (2)

BaseConverter (8-10)

NewBaseConverter (13-17)

internal/core/ports/model_converter.go (2)

ModelResponseConverter (10-15)

ModelFilters (18-23)

internal/core/constants/providers.go (1)

ProviderTypeSGLang (8-8)

internal/core/domain/unified_model.go (1)

UnifiedModel (15-31)

internal/adapter/registry/profile/parsers.go (1)

internal/core/constants/providers.go (1)

ProviderTypeSGLang (8-8)

🪛 GitHub Actions: Deploy Documentation

docs/content/api-reference/overview.md