feat: API Redesign #33
…es invalid)
ah, normalise lmstudio, lm-studio, lm_studio in configuration
add tests for model_builder
update tests for model builder
fix invalid routes for ollama and broken for vllm
adds a test to check the unified models too
adds two new scripts to check formats and compatibility (Claude created)
consolidates naming for load balancers
update scripts
fix tests
support lm_studio variations in API (not finalised yet).
update tests
update more tests
Walkthrough
This update introduces a provider-specific routing system, enabling explicit namespacing of proxy endpoints by backend provider type (e.g.,
Changes
Sequence Diagram(s)
sequenceDiagram
participant Client
participant Olla
participant ProfileFactory
participant BackendProvider
Client->>Olla: HTTP request to /olla/{provider}/...
Olla->>ProfileFactory: Resolve provider profile by prefix/alias
ProfileFactory-->>Olla: Profile config (with routing prefixes)
Olla->>BackendProvider: Proxy or model discovery request (provider-specific)
BackendProvider-->>Olla: Response (models, inference, etc.)
Olla-->>Client: Response (format depends on endpoint/provider)
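The prefix-resolution step in the diagram can be sketched roughly as below. This is an illustration under stated assumptions, not the project's code: `resolveProvider`, the `basePath` constant and the sample lookup table are hypothetical names, and the alias spellings are taken from the commit notes above.

```go
package main

import (
	"fmt"
	"strings"
)

// basePath is assumed from the "/olla/{provider}/..." route in the diagram.
const basePath = "/olla/"

// resolveProvider splits "/olla/{prefix}/rest" into the profile registered for
// {prefix} and the remaining backend path. The bool is false when the path is
// outside basePath or the prefix is unknown.
func resolveProvider(lookup map[string]string, path string) (string, string, bool) {
	if !strings.HasPrefix(path, basePath) {
		return "", "", false
	}
	prefix, rest, _ := strings.Cut(strings.TrimPrefix(path, basePath), "/")
	profile, ok := lookup[prefix]
	if !ok {
		return "", "", false
	}
	return profile, "/" + rest, true
}

func main() {
	// Hypothetical table: every alias points at one canonical profile name.
	lookup := map[string]string{
		"ollama":    "ollama",
		"lmstudio":  "lmstudio",
		"lm-studio": "lmstudio",
		"lm_studio": "lmstudio",
	}
	profile, rest, ok := resolveProvider(lookup, "/olla/lm-studio/v1/models")
	fmt.Println(profile, rest, ok)
}
```

Keeping aliases in a single map means a request only needs one lookup to reach its profile, whichever spelling the client used.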
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~40–60 minutes
Actionable comments posted: 5
♻️ Duplicate comments (2)
default.yaml (1)
34-35: Same key rename as above – keep parsers in sync
Duplicated concern: the default config now uses `"least-connections"`. Make sure both JSON/YAML unmarshal logic and any CLI validation recognise the updated spelling, or startup will fail in deployments that copy the default.
internal/adapter/proxy/proxy_olla.go (1)
518-522: Mirror the conditional logging here as well
Same recommendation as for the Sherpa proxy to suppress empty `"model"` entries in production logs (see diff in previous comment).
🧹 Nitpick comments (25)
internal/core/constants/context.go (1)
4-8: Prefer typed context keys to plain strings
Using raw strings as `context.Context` keys risks clashes across packages. A lightweight, unexported type avoids this:

    -package constants
    +package constants
    +
    +// ctxKey is an unexported type used to prevent context-key collisions.
    +type ctxKey string
    @@
    -    OriginalPathKey = "original_path" // original path before any modifications, useful for logging/debugging
    +    OriginalPathKey ctxKey = "original_path" // original path before any modifications, useful for logging/debugging

Migrating now keeps the public surface stable before more call-sites appear.
test/scripts/logic/requirements.txt (1)
1-1: Pin the dependency version for repeatable test runs
Unpinned dependencies can break the test harness when `requests` releases a new major version.

    -requests
    +requests>=2.32,<3

Locking to a tested range makes CI failures easier to diagnose.
internal/adapter/proxy/proxy_sherpa.go (1)
236-239: New structured-log field is valuable – avoid zero-value noise
Including `"model"` in the dispatch log improves traceability. Consider guarding the field when `stats.Model == ""`:

    -    rlog.Info("Request dispatching to endpoint", "endpoint", endpoint.Name, "target", stats.TargetUrl, "model", stats.Model)
    +    if stats.Model != "" {
    +        rlog.Info("Request dispatching to endpoint", "endpoint", endpoint.Name, "target", stats.TargetUrl, "model", stats.Model)
    +    } else {
    +        rlog.Info("Request dispatching to endpoint", "endpoint", endpoint.Name, "target", stats.TargetUrl)
    +    }

This keeps logs tidy when model info is unavailable.
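An alternative to duplicating the log call is to build the key/value list once and append the optional field. The sketch below assumes the logger takes variadic key/value pairs, as the `rlog.Info` call in the diff suggests; `dispatchFields` is a hypothetical helper name.

```go
package main

import "fmt"

// dispatchFields builds the key/value pairs for the dispatch log line and
// appends "model" only when a model name is actually known.
func dispatchFields(endpoint, target, model string) []any {
	fields := []any{"endpoint", endpoint, "target", target}
	if model != "" {
		fields = append(fields, "model", model)
	}
	return fields
}

func main() {
	// With a model: six entries. Without: four, so no empty "model" value is logged.
	fmt.Println(dispatchFields("local", "http://localhost:11434", "llama3")...)
	fmt.Println(dispatchFields("local", "http://localhost:11434", "")...)
}
```

A call-site would then pass the slice through, for example `rlog.Info("Request dispatching to endpoint", dispatchFields(...)...)`, which lets both proxy engines share the same guard.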
internal/core/constants/endpoint.go (1)
5-6: Well-defined constants for standardised routing.
The new constants provide centralised configuration for proxy path prefixes, supporting the provider-specific routing architecture. The naming is clear and follows Go conventions.
Note on struct alignment: Whilst these are constants rather than struct fields, ensure any related structs in this package follow proper field alignment for optimal memory layout as per the coding guidelines.
internal/core/domain/profile_config.go (1)
34-43: Routing configuration properly added for provider-specific routing.
The addition of the `Routing` field with `Prefixes` slice correctly enables the new provider-specific routing architecture. The YAML tags are properly configured for external configuration loading.
However, consider optimising struct field alignment for better memory layout:

     type ProfileConfig struct {
         Name        string `yaml:"name"`
         Version     string `yaml:"version"`
         DisplayName string `yaml:"display_name"`
         Description string `yaml:"description"`
         Detection struct {
             Headers           []string `yaml:"headers"`
             UserAgentPatterns []string `yaml:"user_agent_patterns"`
             ResponsePatterns  []string `yaml:"response_patterns"`
             PathIndicators    []string `yaml:"path_indicators"`
             DefaultPorts      []int    `yaml:"default_ports"`
         } `yaml:"detection"`
         Request struct {
             ModelFieldPaths []string `yaml:"model_field_paths"`
             ResponseFormat  string   `yaml:"response_format"`
             ParsingRules struct {
                 ChatCompletionsPath string `yaml:"chat_completions_path"`
                 CompletionsPath     string `yaml:"completions_path"`
                 GeneratePath        string `yaml:"generate_path"`
                 ModelFieldName      string `yaml:"model_field_name"`
                 SupportsStreaming   bool   `yaml:"supports_streaming"`
             } `yaml:"parsing_rules"`
         } `yaml:"request"`
    +    // Group smaller fields together for better memory alignment
    +    PathIndices struct {
    +        Health          int `yaml:"health"`
    +        Models          int `yaml:"models"`
    +        Completions     int `yaml:"completions"`
    +        ChatCompletions int `yaml:"chat_completions"`
    +        Embeddings      int `yaml:"embeddings"`
    +    } `yaml:"path_indices"`
    +
    +    Characteristics struct {
    +        Timeout               time.Duration `yaml:"timeout"`
    +        MaxConcurrentRequests int           `yaml:"max_concurrent_requests"`
    +        DefaultPriority       int           `yaml:"default_priority"`
    +        StreamingSupport      bool          `yaml:"streaming_support"`
    +    } `yaml:"characteristics"`
         Models struct {
             CapabilityPatterns map[string][]string `yaml:"capability_patterns"`
             NameFormat         string              `yaml:"name_format"`
             ContextPatterns    []ContextPattern    `yaml:"context_patterns"`
         } `yaml:"models"`
         Routing struct {
             Prefixes []string `yaml:"prefixes"`
         } `yaml:"routing"`
         API struct {
             ModelDiscoveryPath string   `yaml:"model_discovery_path"`
             HealthCheckPath    string   `yaml:"health_check_path"`
             Paths              []string `yaml:"paths"`
             OpenAICompatible   bool     `yaml:"openai_compatible"`
         } `yaml:"api"`
         Resources struct {
             Quantization struct {
                 Multipliers map[string]float64 `yaml:"multipliers"`
             } `yaml:"quantization"`
             ModelSizes        []ModelSizePattern        `yaml:"model_sizes"`
             ConcurrencyLimits []ConcurrencyLimitPattern `yaml:"concurrency_limits"`
             Defaults          ResourceRequirements      `yaml:"defaults"`
             TimeoutScaling    TimeoutScaling            `yaml:"timeout_scaling"`
         } `yaml:"resources"`
    -    PathIndices struct {
    -        Health          int `yaml:"health"`
    -        Models          int `yaml:"models"`
    -        Completions     int `yaml:"completions"`
    -        ChatCompletions int `yaml:"chat_completions"`
    -        Embeddings      int `yaml:"embeddings"`
    -    } `yaml:"path_indices"`
    -
    -    Characteristics struct {
    -        Timeout               time.Duration `yaml:"timeout"`
    -        MaxConcurrentRequests int           `yaml:"max_concurrent_requests"`
    -        DefaultPriority       int           `yaml:"default_priority"`
    -        StreamingSupport      bool          `yaml:"streaming_support"`
    -    } `yaml:"characteristics"`
     }

internal/app/handlers/handler_provider_lmstudio.go (1)
56-57: Consider removing or expanding this comment.
The comment about LM Studio's focus on local model serving seems incomplete and doesn't add significant value to the code. Consider either expanding it with more context or removing it.

    -// lm studio focuses on local model serving without centralised management.
    -// this simplifies deployment but limits remote administration capabilities

internal/app/handlers/handler_provider_generic.go (1)
44-46: Consider using a more specific struct type.
While the anonymous struct works, consider defining a named struct type for better code maintainability and reusability across handlers.

    type ModelRequest struct {
        Name string `json:"name"`
    }

    var req ModelRequest

test/integration/profile_routing_test.go (1)
11-54: Consider enhancing to test full request flow through the proxy
Whilst this test validates the profile factory's provider validation logic, per the retrieved learnings, integration tests should test the full request flow through the proxy. This appears to be more of a unit test for the `ProfileFactory.ValidateProfileType` method rather than an end-to-end integration test.
Consider adding tests that:
- Send actual HTTP requests to provider-specific endpoints
- Validate the complete routing chain from request to backend
- Test the interaction between profile loading, route registration, and request handling
readme.md (1)
366-375: Address minor grammar issues
The static analysis tool identified missing determiners in the load balancer section.

    -### 📊 Least Connections (`least-connections`) - **Recommended**
    -Routes to the endpoint with least active requests. Ideal for:
    +### 📊 Least Connections (`least-connections`) - **Recommended**
    +Routes to the endpoint with the least active requests. Ideal for:

docs/adding-providers.md (1)
112-112: Consider reducing exclamation marks for professional tone
The static analysis tool flagged excessive exclamation marks. Consider a more measured tone for technical documentation.

    -3. Run Olla - done!
    +3. Run Olla - done!

Or alternatively:

    -3. Run Olla - done!
    +3. Run Olla - that's it!

internal/app/handlers/handler_provider_models_test.go (1)
-3. Run Olla - done! +3. Run Olla - that's it!internal/app/handlers/handler_provider_models_test.go (1)
64-96: Consider replacing hardcoded sleep with deterministic synchronisation
The 200ms sleep on line 85 could make tests flaky and slower than necessary. Consider using a more deterministic approach to wait for async unification.

    -    // Wait for async unification to complete
    -    time.Sleep(200 * time.Millisecond)
    +    // Wait for async unification to complete with timeout
    +    ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
    +    defer cancel()
    +    for {
    +        select {
    +        case <-ctx.Done():
    +            require.Fail(t, "unification did not complete within timeout")
    +        default:
    +            // Check if unification is complete by verifying model count
    +            models, err := unifiedRegistry.GetAllUnifiedModels(ctx)
    +            if err == nil && len(models) >= 4 { // Expected total models
    +                goto unified
    +            }
    +            time.Sleep(10 * time.Millisecond)
    +        }
    +    }
    +    unified:

docs/api/provider-routing.md (2)
163-163: Minor grammar improvement
Consider adding the missing determiner for better readability:

    -Within each provider type, the configured load balancing strategy (round-robin, least connections, priority) is applied. This means:
    +Within each provider type, the configured load balancing strategy (round-robin, the least connections, priority) is applied. This means:

24-27: Add language specifications to fenced code blocks
The fenced code blocks should specify a language for proper syntax highlighting and consistency:

    -```
    +```http
     GET /olla/ollama/api/tags      # Ollama native format
     GET /olla/ollama/v1/models     # OpenAI-compatible format

Apply similar changes to the other endpoint listings:
- Lines 69-72: Add `http` language
- Lines 127-129: Add `http` language
- Lines 138-140: Add `http` language
- Lines 148-150: Add `http` language
- Lines 155-157: Add `http` language

Also applies to: 69-72, 127-129, 138-140, 148-150, 155-157
test/scripts/logic/test-provider-models.sh (1)
89-89: Address shellcheck warnings for better script reliability
Several shellcheck warnings should be addressed:

    -    local response_body=$(echo "$response" | sed '$d')
    +    local response_body
    +    response_body=$(echo "$response" | sed '$d')

    -    if [ $(echo "$response_body" | wc -l) -gt 10 ]; then
    +    if [ "$(echo "$response_body" | wc -l)" -gt 10 ]; then

    -    local response_body=$(echo "$response" | sed '$d')
    +    local response_body
    +    response_body=$(echo "$response" | sed '$d')

These changes prevent masking return values and avoid word splitting issues.
Also applies to: 139-139, 191-191
test/scripts/logic/test-provider-routing.sh (1)
12-12: Remove unused colour variable
The `BLUE` variable is defined but never used in the script.

    -BLUE='\033[0;34m'

test/scripts/logic/README.md (1)
125-125: Minor style and formatting improvements
A few minor improvements for consistency and readability:

    -- **Flexible Testing** - Test specific providers or all providers
    +- **Flexible Testing** - Test-specific providers or all providers

    -- Endpoint usage statistics with success/failure breakdown
    +- Endpoint usage statistics with success and failure breakdown

    -```
    +```text
     Available endpoints:

Also applies to: 168-168, 172-172
internal/adapter/registry/profile/factory.go (1)
92-107: Consider atomic update pattern for prefix lookup rebuild.
While the current implementation works, consider building the new prefix lookup in a temporary variable first, then atomically replacing it. This ensures consistency if `buildPrefixLookup` encounters any issues.

     func (f *Factory) ReloadProfiles() error {
         f.mu.Lock()
         defer f.mu.Unlock()

         if err := f.loader.LoadProfiles(); err != nil {
             return err
         }

    -    // Invalidate and rebuild the prefix cache
    -    f.prefixLookup = make(map[string]string)
    -    f.buildPrefixLookup()
    +    // Build new prefix lookup atomically
    +    newPrefixLookup := make(map[string]string)
    +    f.buildPrefixLookupInto(newPrefixLookup)
    +    f.prefixLookup = newPrefixLookup

         return nil
     }

You'd need to refactor `buildPrefixLookup` to accept the map as a parameter:

    func (f *Factory) buildPrefixLookupInto(lookup map[string]string) {
        profiles := f.loader.GetAllProfiles()
        for profileName, profile := range profiles {
            config := profile.GetConfig()
            if config == nil {
                continue
            }

            // Each prefix in the YAML becomes a valid route
            for _, prefix := range config.Routing.Prefixes {
                lookup[prefix] = profileName
            }

            // Profile names are implicit prefixes for convenience
            lookup[profileName] = profileName
        }
    }

    func (f *Factory) buildPrefixLookup() {
        f.buildPrefixLookupInto(f.prefixLookup)
    }

internal/app/handlers/handler_provider_ollama.go (2)
8-28: Consider more descriptive error messages.
While the implementation is correct, the error responses could be more informative for debugging.

     func (a *Application) ollamaModelsHandler(w http.ResponseWriter, r *http.Request) {
         ctx := r.Context()

         models, err := a.getProviderModels(ctx, "ollama")
         if err != nil {
    -        http.Error(w, err.Error(), http.StatusInternalServerError)
    +        http.Error(w, fmt.Sprintf("Failed to fetch Ollama models: %v", err), http.StatusInternalServerError)
             return
         }

         response, err := a.convertModelsToProviderFormat(models, "ollama")
         if err != nil {
    -        http.Error(w, err.Error(), http.StatusInternalServerError)
    +        http.Error(w, fmt.Sprintf("Failed to convert models to Ollama format: %v", err), http.StatusInternalServerError)
             return
         }

         w.Header().Set(ContentTypeHeader, ContentTypeJSON)
         w.WriteHeader(http.StatusOK)
         json.NewEncoder(w).Encode(response)
     }

52-73: Consider consistent error message improvements.
This handler follows the same pattern as `ollamaModelsHandler`. Consider applying similar error message improvements here for consistency.
test/scripts/logic/test-model-routing-provider.py (1)
1-14: Remove unused import and fix loop variable.
The static analysis correctly identifies an unused import and loop variable that should be addressed.

     import sys
    -import json
     import time
     import argparse
     import requests
    -from typing import Dict, List, Tuple
    +from typing import List, Tuple
     from collections import defaultdict

Also update line 150:

    -for i, model in enumerate(self.provider_models[provider][:5]):
    +for model in self.provider_models[provider][:5]:

internal/app/handlers/handler_provider_common.go (1)
132-178: Comprehensive model filtering with provider compatibility checks.
The implementation correctly checks both source endpoints and aliases for provider compatibility. Consider performance optimisation for large model sets.
For better performance with large model sets, consider pre-computing a provider compatibility map:

    // Pre-compute which endpoints are compatible with the provider
    compatibleEndpoints := make(map[string]bool)
    for _, ep := range endpoints {
        normalisedType := NormaliseProviderType(ep.Type)
        if providerProfile.IsCompatibleWith(normalisedType) {
            compatibleEndpoints[ep.URLString] = true
        }
    }

    // Then use simple map lookups in the model loop
    for _, source := range model.SourceEndpoints {
        if compatibleEndpoints[source.EndpointURL] {
            hasProvider = true
            break
        }
    }

internal/app/handlers/server_routes.go (4)
12-19: Align struct fields for better memory layout.
According to the coding guidelines for `**/*.go` files, struct fields should be aligned for better memory layout. The current field ordering in `staticRoute` is not optimal.

     type staticRoute struct {
    -    path        string
    -    handler     http.HandlerFunc
    -    description string
    -    method      string
    -    isProxy     bool
    +    handler     http.HandlerFunc // 8 bytes
    +    path        string           // 16 bytes
    +    description string           // 16 bytes
    +    method      string           // 16 bytes
    +    isProxy     bool             // 1 byte
     }

63-64: Extract hardcoded string to improve maintainability.
The "openai-compatible" string is hardcoded here and referenced again at line 72. Consider extracting it to a constant for better maintainability.
Add a constant in the `constants` package:

    // In internal/core/constants/providers.go
    ProfileNameOpenAICompatible = "openai-compatible"

Then update the code:

    -    profiles = append(profiles, "openai-compatible")
    +    profiles = append(profiles, constants.ProfileNameOpenAICompatible)

And at line 72:

    -    if profileName != "openai-compatible" {
    +    if profileName != constants.ProfileNameOpenAICompatible {

92-175: Consider refactoring to reduce complexity.
This function is quite long (83 lines) with multiple nested conditionals. Consider extracting the OpenAI compatibility registration (lines 133-171) into a separate method for better readability and testability.
Extract the OpenAI compatibility section:

    func (a *Application) registerOpenAICompatibilityRoutes(basePath, prefix, profileName string) {
        openAIPath := basePath + "v1/models"

        switch profileName {
        case constants.ProviderTypeOllama:
            // ... existing logic
        case constants.ProviderTypeLMStudio:
            // ... existing logic
        case constants.ProviderTypeOpenAICompat:
            // ... existing logic
        default:
            // ... existing logic
        }
    }

Then call it from the main function:

     // OpenAI compatibility enables cross-provider client support
     if config.API.OpenAICompatible {
    -    openAIPath := basePath + "v1/models"
    -    // ... all the switch logic ...
    +    a.registerOpenAICompatibilityRoutes(basePath, prefix, profileName)
     }

177-221: Static provider configuration looks comprehensive.
The static provider definitions appropriately mirror the YAML configurations for test isolation. The comment clearly explains why this duplication exists. Note that any changes to the YAML profiles will need to be manually synchronised here.
Would you like me to generate a test that validates the static providers match their corresponding YAML configurations to catch drift?
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (52)
- config/config.yaml (1 hunks)
- config/profiles/lmstudio.yaml (1 hunks)
- config/profiles/ollama.yaml (1 hunks)
- config/profiles/openai.yaml (1 hunks)
- config/profiles/vllm.yaml (1 hunks)
- default.yaml (1 hunks)
- docs/adding-providers.md (1 hunks)
- docs/api/README.md (1 hunks)
- docs/api/provider-routing.md (1 hunks)
- docs/user-guide.md (4 hunks)
- internal/adapter/proxy/proxy_olla.go (1 hunks)
- internal/adapter/proxy/proxy_sherpa.go (1 hunks)
- internal/adapter/registry/profile/factory.go (6 hunks)
- internal/adapter/registry/profile/factory_test.go (1 hunks)
- internal/adapter/registry/profile/loader.go (3 hunks)
- internal/adapter/unifier/default_unifier.go (2 hunks)
- internal/adapter/unifier/model_builder.go (2 hunks)
- internal/adapter/unifier/model_builder_test.go (1 hunks)
- internal/app/handlers/application.go (2 hunks)
- internal/app/handlers/handler_common.go (1 hunks)
- internal/app/handlers/handler_common_test.go (1 hunks)
- internal/app/handlers/handler_provider_common.go (1 hunks)
- internal/app/handlers/handler_provider_compatibility_test.go (1 hunks)
- internal/app/handlers/handler_provider_generic.go (1 hunks)
- internal/app/handlers/handler_provider_lmstudio.go (1 hunks)
- internal/app/handlers/handler_provider_models_test.go (1 hunks)
- internal/app/handlers/handler_provider_ollama.go (1 hunks)
- internal/app/handlers/handler_provider_openai.go (1 hunks)
- internal/app/handlers/handler_provider_test.go (1 hunks)
- internal/app/handlers/handler_proxy.go (2 hunks)
- internal/app/handlers/handler_unified_models_test.go (1 hunks)
- internal/app/handlers/server.go (0 hunks)
- internal/app/handlers/server_routes.go (1 hunks)
- internal/core/constants/context.go (1 hunks)
- internal/core/constants/endpoint.go (1 hunks)
- internal/core/constants/providers.go (1 hunks)
- internal/core/domain/profile_config.go (1 hunks)
- internal/core/ports/proxy.go (1 hunks)
- internal/util/request.go (1 hunks)
- readme.md (3 hunks)
- test/integration/profile_routing_test.go (1 hunks)
- test/scripts/load/test-load-chaos.sh (1 hunks)
- test/scripts/load/test-load-limits.sh (2 hunks)
- test/scripts/logic/.gitignore (1 hunks)
- test/scripts/logic/README.md (2 hunks)
- test/scripts/logic/requirements.txt (1 hunks)
- test/scripts/logic/test-model-routing-provider.py (1 hunks)
- test/scripts/logic/test-model-routing.sh (3 hunks)
- test/scripts/logic/test-provider-models.sh (1 hunks)
- test/scripts/logic/test-provider-routing.sh (1 hunks)
- test/scripts/security/test-request-rate-limits.sh (1 hunks)
- test/scripts/security/test-request-size-limits.sh (1 hunks)
💤 Files with no reviewable changes (1)
- internal/app/handlers/server.go
🧰 Additional context used
📓 Path-based instructions (5)
**/*.go
📄 CodeRabbit Inference Engine (CLAUDE.md)
Align struct fields for better memory layout
Files:
- internal/app/handlers/handler_proxy.go
- internal/app/handlers/handler_unified_models_test.go
- internal/core/ports/proxy.go
- internal/core/constants/context.go
- internal/adapter/proxy/proxy_sherpa.go
- internal/adapter/proxy/proxy_olla.go
- internal/app/handlers/application.go
- internal/adapter/unifier/default_unifier.go
- internal/core/constants/endpoint.go
- internal/core/domain/profile_config.go
- internal/app/handlers/handler_provider_compatibility_test.go
- internal/adapter/unifier/model_builder.go
- internal/adapter/registry/profile/factory_test.go
- internal/util/request.go
- internal/core/constants/providers.go
- internal/adapter/unifier/model_builder_test.go
- internal/app/handlers/handler_provider_openai.go
- internal/app/handlers/handler_provider_lmstudio.go
- internal/app/handlers/handler_common_test.go
- test/integration/profile_routing_test.go
- internal/app/handlers/handler_provider_test.go
- internal/app/handlers/handler_common.go
- internal/adapter/registry/profile/loader.go
- internal/app/handlers/handler_provider_generic.go
- internal/app/handlers/handler_provider_ollama.go
- internal/app/handlers/handler_provider_models_test.go
- internal/adapter/registry/profile/factory.go
- internal/app/handlers/handler_provider_common.go
- internal/app/handlers/server_routes.go
internal/**/*.go
📄 CodeRabbit Inference Engine (CLAUDE.md)
All internal packages must be placed under `/internal/` and are not importable by external projects
Files:
- internal/app/handlers/handler_proxy.go
- internal/app/handlers/handler_unified_models_test.go
- internal/core/ports/proxy.go
- internal/core/constants/context.go
- internal/adapter/proxy/proxy_sherpa.go
- internal/adapter/proxy/proxy_olla.go
- internal/app/handlers/application.go
- internal/adapter/unifier/default_unifier.go
- internal/core/constants/endpoint.go
- internal/core/domain/profile_config.go
- internal/app/handlers/handler_provider_compatibility_test.go
- internal/adapter/unifier/model_builder.go
- internal/adapter/registry/profile/factory_test.go
- internal/util/request.go
- internal/core/constants/providers.go
- internal/adapter/unifier/model_builder_test.go
- internal/app/handlers/handler_provider_openai.go
- internal/app/handlers/handler_provider_lmstudio.go
- internal/app/handlers/handler_common_test.go
- internal/app/handlers/handler_provider_test.go
- internal/app/handlers/handler_common.go
- internal/adapter/registry/profile/loader.go
- internal/app/handlers/handler_provider_generic.go
- internal/app/handlers/handler_provider_ollama.go
- internal/app/handlers/handler_provider_models_test.go
- internal/adapter/registry/profile/factory.go
- internal/app/handlers/handler_provider_common.go
- internal/app/handlers/server_routes.go
**/*_test.go
📄 CodeRabbit Inference Engine (CLAUDE.md)
**/*_test.go: Unit tests should test individual components in isolation
Integration tests should test full request flow through the proxy
Benchmark tests should measure performance of critical paths, proxy engine comparisons, connection pooling efficiency, and circuit breaker behavior
Files:
- internal/app/handlers/handler_unified_models_test.go
- internal/app/handlers/handler_provider_compatibility_test.go
- internal/adapter/registry/profile/factory_test.go
- internal/adapter/unifier/model_builder_test.go
- internal/app/handlers/handler_common_test.go
- test/integration/profile_routing_test.go
- internal/app/handlers/handler_provider_test.go
- internal/app/handlers/handler_provider_models_test.go
internal/adapter/proxy/**/*.go
📄 CodeRabbit Inference Engine (CLAUDE.md)
internal/adapter/proxy/**/*.go: Olla engine should use per-endpoint connection pooling, circuit breakers, and object pooling
Sherpa engine should be a simple, maintainable proxy for moderate traffic
Olla engine should use 64KB default buffer size, while Sherpa uses 8KB
Olla engine should use object pooling to reduce GC pressure and larger buffers for streaming
Olla engine should prevent cascade failures using circuit breakers
Files:
- internal/adapter/proxy/proxy_sherpa.go
- internal/adapter/proxy/proxy_olla.go
test/scripts/security/**
📄 CodeRabbit Inference Engine (CLAUDE.md)
Security tests should validate rate limiting and size restrictions (see `/test/scripts/security/`)
Files:
- test/scripts/security/test-request-rate-limits.sh
- test/scripts/security/test-request-size-limits.sh
🧠 Learnings (29)
📓 Common learnings
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Proxy engine and load balancer strategy are configured in the `proxy` section of `config.yaml`
internal/app/handlers/handler_proxy.go (4)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Sherpa engine should be a simple, maintainable proxy for moderate traffic
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/stats/**/*.go : Automatic cleanup of stale endpoint data in statistics collection
internal/app/handlers/handler_unified_models_test.go (1)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Unit tests should test individual components in isolation
default.yaml (3)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Proxy engine and load balancer strategy are configured in the proxy section of config.yaml
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Endpoint definitions and priorities are configured in the discovery section of config.yaml
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Use the priority balancer as the recommended load balancing strategy
config/profiles/openai.yaml (2)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Primary configuration is in config.yaml
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Endpoint definitions and priorities are configured in the discovery section of config.yaml
config/config.yaml (4)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Proxy engine and load balancer strategy are configured in the proxy section of config.yaml
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Primary configuration is in config.yaml
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Endpoint definitions and priorities are configured in the discovery section of config.yaml
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Use the priority balancer as the recommended load balancing strategy
internal/core/constants/context.go (4)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Benchmark tests should measure performance of critical paths, proxy engine comparisons, connection pooling efficiency, and circuit breaker behavior
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Sherpa engine should be a simple, maintainable proxy for moderate traffic
internal/adapter/proxy/proxy_sherpa.go (5)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Sherpa engine should be a simple, maintainable proxy for moderate traffic
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use 64KB default buffer size, while Sherpa uses 8KB
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/stats/**/*.go : Automatic cleanup of stale endpoint data in statistics collection
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
internal/adapter/proxy/proxy_olla.go (8)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use per-endpoint connection pooling, circuit breakers, and object pooling
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Sherpa engine should be a simple, maintainable proxy for moderate traffic
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use object pooling to reduce GC pressure and larger buffers for streaming
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/stats/**/*.go : Automatic cleanup of stale endpoint data in statistics collection
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use 64KB default buffer size, while Sherpa uses 8KB
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should prevent cascade failures using circuit breakers
internal/app/handlers/application.go (1)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Sherpa engine should be a simple, maintainable proxy for moderate traffic
test/scripts/load/test-load-chaos.sh (2)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
internal/core/constants/endpoint.go (2)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use per-endpoint connection pooling, circuit breakers, and object pooling
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/health/**/*.go : Use circuit breaker pattern for failing endpoints in health checking
internal/app/handlers/handler_provider_compatibility_test.go (4)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Unit tests should test individual components in isolation
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Benchmark tests should measure performance of critical paths, proxy engine comparisons, connection pooling efficiency, and circuit breaker behavior
internal/adapter/registry/profile/factory_test.go (4)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Unit tests should test individual components in isolation
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Benchmark tests should measure performance of critical paths, proxy engine comparisons, connection pooling efficiency, and circuit breaker behavior
test/scripts/logic/README.md (1)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to test/scripts/security/** : Security tests should validate rate limiting and size restrictions (see /test/scripts/security/)
test/scripts/security/test-request-rate-limits.sh (3)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to test/scripts/security/** : Security tests should validate rate limiting and size restrictions (see /test/scripts/security/)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
internal/adapter/unifier/model_builder_test.go (2)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Unit tests should test individual components in isolation
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
test/scripts/security/test-request-size-limits.sh (2)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to test/scripts/security/** : Security tests should validate rate limiting and size restrictions (see /test/scripts/security/)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
internal/app/handlers/handler_common_test.go (4)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Unit tests should test individual components in isolation
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Benchmark tests should measure performance of critical paths, proxy engine comparisons, connection pooling efficiency, and circuit breaker behavior
test/scripts/load/test-load-limits.sh (1)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Proxy engine and load balancer strategy are configured in the proxy section of config.yaml
docs/api/provider-routing.md (2)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use per-endpoint connection pooling, circuit breakers, and object pooling
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Proxy engine and load balancer strategy are configured in the proxy section of config.yaml
test/integration/profile_routing_test.go (5)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Unit tests should test individual components in isolation
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Benchmark tests should measure performance of critical paths, proxy engine comparisons, connection pooling efficiency, and circuit breaker behavior
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use per-endpoint connection pooling, circuit breakers, and object pooling
internal/app/handlers/handler_provider_test.go (5)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Unit tests should test individual components in isolation
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Benchmark tests should measure performance of critical paths, proxy engine comparisons, connection pooling efficiency, and circuit breaker behavior
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use per-endpoint connection pooling, circuit breakers, and object pooling
internal/adapter/registry/profile/loader.go (2)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use per-endpoint connection pooling, circuit breakers, and object pooling
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use object pooling to reduce GC pressure and larger buffers for streaming
internal/app/handlers/handler_provider_ollama.go (1)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use per-endpoint connection pooling, circuit breakers, and object pooling
readme.md (6)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to config.yaml : Proxy engine and load balancer strategy are configured in the proxy section of config.yaml
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use 64KB default buffer size, while Sherpa uses 8KB
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Sherpa engine should be a simple, maintainable proxy for moderate traffic
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Use the priority balancer as the recommended load balancing strategy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use per-endpoint connection pooling, circuit breakers, and object pooling
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use object pooling to reduce GC pressure and larger buffers for streaming
internal/app/handlers/handler_provider_models_test.go (4)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Unit tests should test individual components in isolation
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Integration tests should test full request flow through the proxy
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to **/*_test.go : Benchmark tests should measure performance of critical paths, proxy engine comparisons, connection pooling efficiency, and circuit breaker behavior
internal/app/handlers/handler_provider_common.go (2)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Sherpa engine should be a simple, maintainable proxy for moderate traffic
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Olla engine should use per-endpoint connection pooling, circuit breakers, and object pooling
internal/app/handlers/server_routes.go (1)
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-26T12:37:55.605Z
Learning: Applies to internal/adapter/proxy/**/*.go : Sherpa engine should be a simple, maintainable proxy for moderate traffic
🧬 Code Graph Analysis (15)
internal/app/handlers/handler_proxy.go (1)
internal/app/handlers/handler_common.go (1)
NormaliseProviderType(14-25)
internal/core/ports/proxy.go (1)
internal/adapter/unifier/default_unifier.go (1)
Model(12-21)
internal/adapter/proxy/proxy_sherpa.go (1)
internal/adapter/unifier/default_unifier.go (1)
Model(12-21)
internal/app/handlers/application.go (1)
internal/adapter/registry/profile/factory.go (1)
ProfileFactory(11-17)
internal/app/handlers/handler_provider_compatibility_test.go (2)
internal/app/handlers/application.go (1)
Application(61-77)
internal/app/handlers/handler_common.go (1)
NormaliseProviderType(14-25)
internal/adapter/unifier/model_builder_test.go (1)
internal/adapter/unifier/model_builder.go (1)
ModelExtractor(144-144)
internal/app/handlers/handler_provider_openai.go (2)
internal/app/handlers/application.go (1)
Application(61-77)
internal/app/handlers/server.go (2)
ContentTypeHeader(14-14)
ContentTypeJSON(12-12)
internal/app/handlers/handler_provider_lmstudio.go (2)
internal/app/handlers/application.go (1)
Application(61-77)
internal/app/handlers/server.go (2)
ContentTypeHeader(14-14)
ContentTypeJSON(12-12)
test/integration/profile_routing_test.go (1)
internal/adapter/registry/profile/factory.go (1)
NewFactoryWithDefaults(45-47)
internal/app/handlers/handler_provider_test.go (1)
internal/core/constants/endpoint.go (1)
DefaultOllaProxyPathPrefix(5-5)
internal/app/handlers/handler_common.go (4)
internal/core/constants/providers.go (6)
ProviderPrefixLMStudio1(17-17)
ProviderPrefixLMStudio2(18-18)
ProviderTypeLMStudio(5-5)
ProviderTypeOllama(4-4)
ProviderTypeOpenAI(6-6)
ProviderTypeVLLM(8-8)
internal/core/constants/endpoint.go (2)
DefaultOllaProxyPathPrefix(5-5)
DefaultPathPrefix(6-6)
internal/app/handlers/application.go (1)
Application(61-77)
internal/core/constants/context.go (1)
OriginalPathKey(7-7)
internal/app/handlers/handler_provider_ollama.go (2)
internal/app/handlers/application.go (1)
Application(61-77)
internal/app/handlers/server.go (2)
ContentTypeHeader(14-14)
ContentTypeJSON(12-12)
test/scripts/logic/test-provider-routing.sh (2)
test/scripts/logic/test-model-routing.sh (3)
banner(65-70)
show_summary(388-451)
main(455-492)
test/scripts/logic/test-provider-models.sh (4)
banner(32-52)
test_endpoint(67-168)
show_summary(289-310)
main(313-343)
internal/app/handlers/handler_provider_common.go (10)
internal/app/handlers/application.go (1)
Application(61-77)
internal/core/domain/routing.go (2)
RequestProfile(20-34)
NewRequestProfile(36-42)
internal/app/handlers/handler_common.go (1)
NormaliseProviderType(14-25)
internal/core/constants/providers.go (3)
ProviderTypeOpenAI(6-6)
ProviderTypeOpenAICompat(7-7)
ProviderTypeVLLM(8-8)
internal/core/domain/profile.go (1)
ProfileOpenAICompatible(6-6)
internal/core/constants/context.go (1)
ProxyPathPrefix(4-4)
internal/core/domain/endpoint.go (1)
Endpoint(21-40)
internal/core/domain/unified_model.go (1)
UnifiedModel(15-31)
internal/adapter/registry/unified_memory_registry.go (1)
UnifiedMemoryModelRegistry(18-25)
internal/core/ports/model_converter.go (1)
ModelFilters(18-23)
internal/app/handlers/server_routes.go (4)
internal/app/handlers/application.go (1)
Application(61-77)
internal/core/constants/endpoint.go (3)
DefaultHealthCheckEndpoint(4-4)
DefaultOllaProxyPathPrefix(5-5)
DefaultPathPrefix(6-6)
internal/core/domain/profile_config.go (1)
ProfileConfig(8-77)
internal/core/constants/providers.go (10)
ProviderTypeOllama(4-4)
ProviderTypeLMStudio(5-5)
ProviderTypeOpenAICompat(7-7)
ProviderTypeOpenAI(6-6)
ProviderPrefixLMStudio2(18-18)
ProviderPrefixLMStudio1(17-17)
ProviderPrefixLMStudio3(19-19)
ProviderTypeVLLM(8-8)
ProviderDisplayLMStudio(12-12)
ProviderDisplayOllama(11-11)
🪛 LanguageTool
test/scripts/logic/README.md
[uncategorized] ~125-~125: When ‘Test-specific’ is used as a modifier, it is usually spelled with a hyphen.
Context: ...coded results
- Flexible Testing - Test specific providers or all providers
- **Configu...
(SPECIFIC_HYPHEN)
[style] ~168-~168: ‘with success’ might be wordy. Consider a shorter alternative.
Context: ...onse times
- Endpoint usage statistics with success/failure breakdown
- Overall success ra...
(EN_WORDINESS_PREMIUM_WITH_SUCCESS)
docs/api/README.md
[grammar] ~87-~87: A verb may be missing.
Context: ... - 404 - Not found (model or endpoint not available)
429- Too many requests...
(NN_NOT_JJ)
docs/api/provider-routing.md
[grammar] ~163-~163: A determiner may be missing.
Context: ...d load balancing strategy (round-robin, least connections, priority) is applied. This...
(THE_SUPERLATIVE)
docs/adding-providers.md
[uncategorized] ~59-~59: The preposition ‘to’ seems more likely in this position.
Context: ... will automatically:
- Register routes for
/olla/vllm/ - Create model discovery...
(AI_HYDRA_LEO_REPLACE_FOR_TO)
[uncategorized] ~60-~60: The preposition ‘to’ seems more likely in this position.
Context: ...m/`
- Create model discovery endpoints at
/olla/vllm/v1/models - Set up the p...
(AI_HYDRA_LEO_REPLACE_AT_TO)
[style] ~112-~112: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2019 characters long)
Context: ... priority: 90
3. Run Olla - done!
## Benefits
- **No Code Changes**...
(EN_EXCESSIVE_EXCLAMATION)
</details>
<details>
<summary>readme.md</summary>
[grammar] ~366-~366: A determiner may be missing.
Context: ...ight strategy for your setup:
### 📊 Least Connections (`least-connections`) - **R...
(THE_SUPERLATIVE)
---
[grammar] ~367-~367: A determiner may be missing.
Context: ...ommended**
Routes to the endpoint with least active requests. Ideal for:
- **Mixed ...
(THE_SUPERLATIVE)
</details>
</details>
<details>
<summary>🪛 markdownlint-cli2 (0.17.2)</summary>
<details>
<summary>test/scripts/logic/README.md</summary>
172-172: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
<details>
<summary>docs/api/provider-routing.md</summary>
24-24: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
---
69-69: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
---
127-127: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
---
138-138: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
---
148-148: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
---
155-155: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
<details>
<summary>🪛 YAMLlint (1.37.1)</summary>
<details>
<summary>config/profiles/vllm.yaml</summary>
[error] 1-1: wrong new line character: expected \n
(new-lines)
---
[error] 63-63: no new line character at the end of file
(new-line-at-end-of-file)
</details>
</details>
<details>
<summary>🪛 Shellcheck (0.10.0)</summary>
<details>
<summary>test/scripts/logic/test-provider-models.sh</summary>
[warning] 89-89: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 103-103: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 113-113: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 123-123: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 139-139: Quote this to prevent word splitting.
(SC2046)
---
[warning] 191-191: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 204-204: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 211-211: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 218-218: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 225-225: Declare and assign separately to avoid masking return values.
(SC2155)
</details>
<details>
<summary>test/scripts/logic/test-provider-routing.sh</summary>
[warning] 12-12: BLUE appears unused. Verify use (or export if used externally).
(SC2034)
---
[warning] 58-58: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 188-188: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 189-189: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 193-193: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 194-194: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 198-198: Declare and assign separately to avoid masking return values.
(SC2155)
---
[warning] 199-199: Declare and assign separately to avoid masking return values.
(SC2155)
</details>
</details>
<details>
<summary>🪛 Ruff (0.12.2)</summary>
<details>
<summary>test/scripts/logic/test-model-routing-provider.py</summary>
8-8: `json` imported but unused
Remove unused import: `json`
(F401)
---
12-12: `typing.Dict` imported but unused
Remove unused import: `typing.Dict`
(F401)
---
150-150: Loop control variable `i` not used within loop body
Rename unused `i` to `_i`
(B007)
</details>
</details>
</details>
<details>
<summary>🔇 Additional comments (85)</summary><blockquote>
<details>
<summary>test/scripts/logic/.gitignore (1)</summary>
`1-2`: **Good inclusion of local artefact ignores**
Ignoring the virtual-env and temp directories will keep the repo clean and CI runs deterministic.
</details>
<details>
<summary>config/profiles/ollama.yaml (1)</summary>
`7-10`: **Routing prefix entry looks correct**
The `routing.prefixes` section aligns with the new provider-specific router and uses a lower-case slug, matching the normalisation logic in the profile factory.
</details>
<details>
<summary>internal/core/ports/proxy.go (1)</summary>
`46-53`: **Addition of `Model` field looks good – verify all producers/consumers set it**
The extra `Model string` field is placed sensibly after `RequestID`, keeping the two most-frequent string values adjacent and avoiding any regressive padding issues.
Just ensure every path that instantiates `RequestStats` (handlers, tests, etc.) populates the new field – otherwise log output will show an empty `model` attribute, reducing the value of the change.
</details>
<details>
<summary>config/config.yaml (1)</summary>
`35-35`: **Config value rename requires code-side support**
The `load_balancer` value has changed from `"least_conn"` to `"least-connections"`. Please confirm the configuration loader accepts the new hyphenated string; otherwise the proxy will silently fall back to its default strategy.
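One way to avoid the silent fallback is to validate the strategy name when the configuration is loaded. The following is a minimal sketch, not the project's loader: the function name `parseLoadBalancer` is hypothetical, and accepting the old `least_conn` spelling as a legacy alias is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLoadBalancer is a hypothetical helper (not part of the PR). It maps a
// configured strategy name to a canonical one and returns an error for
// anything it does not recognise, instead of silently picking a default.
func parseLoadBalancer(value string) (string, error) {
	switch v := strings.ToLower(strings.TrimSpace(value)); v {
	case "least-connections", "least_conn": // "least_conn" kept as an assumed legacy alias
		return "least-connections", nil
	case "round-robin", "priority":
		return v, nil
	default:
		return "", fmt.Errorf("unknown load_balancer %q", value)
	}
}

func main() {
	for _, name := range []string{"least-connections", "least_conn", "fastest"} {
		strategy, err := parseLoadBalancer(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s resolves to %s\n", name, strategy)
	}
}
```

Failing at load time makes a mistyped or outdated value visible immediately rather than as unexpected balancing behaviour later.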
</details>
<details>
<summary>internal/adapter/registry/profile/loader.go (3)</summary>
`116-116`: **LGTM! Clean routing prefix assignment for Ollama profile.**
The explicit assignment of routing prefixes aligns perfectly with the new provider-specific routing architecture. This enables requests to be routed to `/olla/ollama/*` endpoints.
---
`214-214`: **Excellent support for multiple alias variations.**
The LM Studio profile correctly supports multiple routing prefix variations (`lmstudio`, `lm-studio`, `lm_studio`), which provides flexibility for users and maintains compatibility with different naming conventions.
---
`264-264`: **Appropriate dual prefix support for OpenAI compatibility.**
The OpenAI-compatible profile includes both `openai` and `openai-compatible` prefixes, which correctly reflects the dual nature of this provider type.
</details>
<details>
<summary>config/profiles/lmstudio.yaml (1)</summary>
`7-12`: **Well-structured routing configuration with comprehensive prefix coverage.**
The routing section properly defines all LM Studio naming variations, ensuring users can access the provider through multiple intuitive URL patterns (`/olla/lmstudio/*`, `/olla/lm-studio/*`, `/olla/lm_studio/*`).
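To illustrate how several prefixes can resolve to one provider, here is a minimal sketch of alias lookup for `/olla/{provider}/...` paths. The map contents, the canonical profile names and the function `resolveProfile` are assumptions for illustration; the actual lookup in the PR goes through the profile factory.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixToProfile maps routing prefixes to a profile name. The canonical
// names on the right-hand side are assumed, not taken from the PR.
var prefixToProfile = map[string]string{
	"ollama":            "ollama",
	"lmstudio":          "lm-studio",
	"lm-studio":         "lm-studio",
	"lm_studio":         "lm-studio",
	"openai":            "openai-compatible",
	"openai-compatible": "openai-compatible",
}

// resolveProfile (hypothetical) extracts the provider segment from a path of
// the form /olla/{provider}/... and returns the matching profile name plus
// the remaining path to forward to the backend.
func resolveProfile(path string) (profile, rest string, ok bool) {
	const base = "/olla/"
	if !strings.HasPrefix(path, base) {
		return "", "", false
	}
	segment, remainder, _ := strings.Cut(strings.TrimPrefix(path, base), "/")
	profile, ok = prefixToProfile[strings.ToLower(segment)]
	if !ok {
		return "", "", false
	}
	return profile, "/" + remainder, true
}

func main() {
	for _, p := range []string{"/olla/lmstudio/v1/models", "/olla/lm_studio/v1/models", "/olla/unknown/x"} {
		profile, rest, ok := resolveProfile(p)
		fmt.Printf("%s: profile=%q rest=%q ok=%v\n", p, profile, rest, ok)
	}
}
```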
</details>
<details>
<summary>config/profiles/openai.yaml (1)</summary>
`7-11`: **Clear and logical routing prefix configuration.**
The dual prefix approach (`openai` and `openai-compatible`) provides both concise and descriptive routing options, enhancing user experience whilst maintaining clarity about the provider's compatibility layer.
</details>
<details>
<summary>internal/adapter/unifier/default_unifier.go (2)</summary>
`148-148`: **Enhanced platform detection with endpoint type context.**
The addition of `endpoint.Type` as a third parameter to `DetectPlatform` provides better context for platform detection logic, improving accuracy in provider identification.
---
`221-221`: **Consistent parameter addition for platform detection.**
The method call correctly includes the endpoint type parameter, maintaining consistency with the enhanced `DetectPlatform` signature throughout the codebase.
</details>
<details>
<summary>internal/app/handlers/handler_proxy.go (2)</summary>
`96-96`: **Approve model tracking enhancement in request stats.**
This change properly sets the `Model` field in request stats when a profile contains a model name, which will improve observability and logging throughout the proxy request lifecycle. The field assignment is appropriately guarded by the existing nil checks.
---
`210-212`: **Provider type normalisation ensures consistent compatibility checks.**
The addition of provider type normalisation before profile compatibility checking is a solid improvement. This handles provider name variations (e.g., "lmstudio" → "lm-studio") and ensures consistent matching with profile configurations. The implementation correctly uses the `NormaliseProviderType` function from `handler_common.go` as shown in the relevant code snippets.
</details>
<details>
<summary>internal/app/handlers/application.go (2)</summary>
`73-73`: **Proper integration of profile factory into application struct.**
The addition of the `profileFactory` field correctly follows the existing struct field alignment and uses the appropriate interface type as defined in the relevant code snippets.
---
`141-141`: **Profile factory properly initialised in constructor.**
The profile factory is correctly assigned during application initialisation, maintaining consistency with the existing dependency injection pattern used throughout the constructor.
</details>
<details>
<summary>test/scripts/security/test-request-rate-limits.sh (1)</summary>
`17-18`: **Dynamic provider routing properly implemented for security testing.**
The addition of the `PROVIDER` environment variable with a sensible default and the dynamic `PROXY_ENDPOINT` correctly adapts the security test to the new provider-specific routing architecture. This maintains backward compatibility whilst enabling testing across different provider endpoints (ollama, lmstudio, openai, vllm).
</details>
<details>
<summary>test/scripts/security/test-request-size-limits.sh (1)</summary>
`20-21`: **Consistent provider routing implementation in size limit testing.**
The implementation correctly follows the same pattern as the rate limit test script, using the `PROVIDER` environment variable with an appropriate default and updating the endpoint to match the new provider-specific routing scheme. This ensures consistent testing across all security validation scripts.
</details>
<details>
<summary>test/scripts/load/test-load-chaos.sh (1)</summary>
`50-52`: **LGTM! Provider-specific routing implementation looks good.**
The introduction of the `PROVIDER` environment variable and dynamic proxy path construction aligns perfectly with the PR's provider-specific routing architecture. This change enables testing different backend providers whilst maintaining backward compatibility.
</details>
<details>
<summary>internal/adapter/unifier/model_builder.go (2)</summary>
`219-220`: **Enhanced platform detection with explicit endpoint type support.**
The method signature change adds valuable functionality by allowing explicit platform specification via the `endpointType` parameter, which supports the provider-specific routing enhancements in this PR.
---
`234-241`: **Well-implemented endpoint type normalisation.**
The normalisation logic properly handles common provider naming variations (e.g., "lm-studio" → "lmstudio") by removing hyphens and underscores whilst converting to lowercase. This ensures consistent platform identification across different naming conventions.
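The normalisation described above (lowercase, then drop hyphens and underscores) can be sketched as follows. This is an illustration only: `normaliseEndpointType` is a hypothetical name, and the real code in `model_builder.go` may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// separatorStripper removes the two separators mentioned in the review.
var separatorStripper = strings.NewReplacer("-", "", "_", "")

// normaliseEndpointType (hypothetical) lowercases the endpoint type and
// removes hyphens and underscores so that naming variants compare equal.
func normaliseEndpointType(endpointType string) string {
	return separatorStripper.Replace(strings.ToLower(endpointType))
}

func main() {
	for _, in := range []string{"lm-studio", "LM_Studio", "lmstudio"} {
		fmt.Printf("%q -> %q\n", in, normaliseEndpointType(in))
	}
}
```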
</details>
<details>
<summary>internal/util/request.go (2)</summary>
`67-72`: **Good refactoring to improve code modularity.**
The delegation to the new `StripPrefix` helper function maintains the original behaviour whilst improving code organisation and reusability.
---
`74-83`: **Well-implemented prefix stripping logic.**
The extracted helper function correctly handles edge cases:
- Checks if path starts with the prefix
- Ensures the resulting path starts with "/" if necessary
- Returns the original path if no prefix match
This modular approach supports the provider-specific routing changes throughout the PR.
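The three behaviours listed above can be sketched as below. This is a minimal illustration under the assumption that the helper takes a path and a prefix; `stripPrefixSketch` is a hypothetical name and is not the `StripPrefix` function from `internal/util/request.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// stripPrefixSketch (hypothetical) removes prefix from path when present,
// makes sure the result starts with "/", and returns path unchanged when
// the prefix does not match.
func stripPrefixSketch(path, prefix string) string {
	if !strings.HasPrefix(path, prefix) {
		return path // no match: leave the original path alone
	}
	rest := strings.TrimPrefix(path, prefix)
	if !strings.HasPrefix(rest, "/") {
		rest = "/" + rest // covers an empty remainder and prefixes ending in "/"
	}
	return rest
}

func main() {
	fmt.Println(stripPrefixSketch("/olla/ollama/api/tags", "/olla/ollama"))
	fmt.Println(stripPrefixSketch("/olla/ollama", "/olla/ollama"))
	fmt.Println(stripPrefixSketch("/health", "/olla/ollama"))
}
```

Note that this sketch matches on raw string prefixes and does not check path-segment boundaries, so callers would need to pass complete segment prefixes.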
</details>
<details>
<summary>test/scripts/logic/test-model-routing.sh (2)</summary>
`33-35`: **Provider-specific routing correctly implemented.**
The introduction of the `PROVIDER` variable with a sensible default and dynamic proxy endpoint construction aligns perfectly with the PR's routing architecture changes.
---
`77-98`: **Excellent documentation updates.**
The usage instructions are comprehensive and clearly document:
- The new PROVIDER variable and its possible values
- Practical examples for different providers
- Clear explanations of the functionality
This will help users understand and test the new provider-specific routing features.
</details>
<details>
<summary>test/scripts/load/test-load-limits.sh (2)</summary>
`36-37`: **Consistent provider-specific routing implementation.**
The `PROVIDER` variable introduction and dynamic proxy endpoint construction matches the pattern established in other test scripts and supports the PR's provider-specific routing architecture.
---
`88-90`: **Clear documentation for new functionality.**
The environment variables section properly documents the new `PROVIDER` variable with its default value and possible options, helping users understand how to test different providers.
</details>
<details>
<summary>internal/adapter/registry/profile/factory_test.go (1)</summary>
`77-117`: **Comprehensive test coverage for routing prefix validation.**
The test thoroughly covers the new routing prefix functionality with good test case organisation. The table-driven approach tests direct profile names, routing prefixes (including LM Studio variations), the auto profile, and edge cases like unknown providers.
</details>
<details>
<summary>internal/core/constants/providers.go (1)</summary>
`1-21`: **Well-organised provider constants with clear naming conventions.**
The constants are properly grouped and follow Go naming conventions. The multiple LM Studio prefix variations support the alias resolution system effectively.
</details>
<details>
<summary>internal/app/handlers/handler_provider_compatibility_test.go (1)</summary>
`14-121`: **Comprehensive compatibility test coverage.**
The test cases thoroughly cover the provider compatibility matrix, including edge cases for unknown providers and various provider/endpoint type combinations.
</details>
<details>
<summary>internal/adapter/unifier/model_builder_test.go (1)</summary>
`7-90`: **Thorough test coverage for platform detection logic.**
The test comprehensively covers platform detection scenarios including metadata hints, version keys, endpoint type normalisation, and fallback behaviour. The table-driven approach ensures good test organisation and coverage.
</details>
<details>
<summary>internal/app/handlers/handler_provider_openai.go (2)</summary>
`8-30`: **Clean implementation following established handler patterns.**
The handler properly implements the OpenAI-compatible model listing endpoint with appropriate error handling, content type headers, and response encoding. The documentation clearly explains the purpose for local inference servers.
---
`32-34`: **Good explanatory comment about OpenAI API standardisation.**
This comment provides valuable context about why OpenAI API compatibility is important in the local inference ecosystem.
</details>
<details>
<summary>docs/user-guide.md (4)</summary>
`157-167`: **LGTM! Provider-specific routing examples are clear and consistent.**
The updated Python examples correctly demonstrate the new provider-specific base URLs, providing clear examples for both Ollama and LM Studio backends.
---
`189-199`: **LGTM! JavaScript examples align with the new routing architecture.**
The JavaScript examples properly demonstrate the provider-specific routing for both Ollama and OpenAI-compatible backends.
---
`214-235`: **LGTM! cURL examples demonstrate provider-specific endpoints effectively.**
The cURL examples showcase the new provider-specific routing for both Ollama and LM Studio, with appropriate model filtering examples.
---
`240-253`: **LGTM! Ollama Native API section correctly reflects the new routing.**
The section rename from "Ollama Compatibility" to "Ollama Native API" is appropriate and the examples correctly use the provider-specific Ollama endpoints.
</details>
<details>
<summary>internal/app/handlers/handler_provider_lmstudio.go (2)</summary>
`8-29`: **LGTM! Clean and consistent handler implementation.**
The `lmstudioOpenAIModelsHandler` follows good patterns with proper error handling, content type setting, and JSON encoding. The use of shared helper methods promotes code reuse.
---
`31-54`: **LGTM! Enhanced models handler is well-implemented.**
The `lmstudioEnhancedModelsHandler` correctly uses the "lmstudio" format converter for enhanced metadata and follows the same consistent error handling pattern.
</details>
<details>
<summary>config/profiles/vllm.yaml (2)</summary>
`8-11`: **LGTM! Routing configuration is correctly defined.**
The routing configuration with the "vllm" prefix aligns well with the provider-specific routing architecture.
---
`13-22`: **LGTM! API compatibility settings are comprehensive.**
The API configuration correctly specifies OpenAI compatibility and includes all necessary endpoints for vLLM functionality.
</details>
<details>
<summary>internal/app/handlers/handler_provider_generic.go (2)</summary>
`8-34`: **LGTM! Generic models handler is well-implemented.**
The `genericProviderModelsHandler` follows good patterns with proper provider normalisation, error handling, and consistent response formatting. The use of a closure to return an `http.HandlerFunc` is appropriate for this use case.
---
`36-71`: **LGTM! Model show handler has proper validation and error handling.**
The `genericModelShowHandler` correctly validates the HTTP method, parses JSON input safely, validates required fields, and provides appropriate error responses with correct status codes.
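The validation order praised here (method, then JSON body, then required fields) might look something like the sketch below. Everything specific is an assumption: the `name` field, the error messages, the request path and the success payload are illustrative, and `statusFor` is a hypothetical helper that only exists to exercise the handler in memory.

```go
package main

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"strings"
)

// showRequest is an assumed request shape; the field name is illustrative.
type showRequest struct {
	Name string `json:"name"`
}

// modelShowHandler is a hypothetical sketch of the validation order only.
func modelShowHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	var req showRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid JSON body", http.StatusBadRequest)
		return
	}
	if req.Name == "" {
		http.Error(w, "model name is required", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"model": req.Name})
}

// statusFor runs the handler against an in-memory request and returns the status code.
func statusFor(method, body string) int {
	req := httptest.NewRequest(method, "/olla/ollama/api/show", strings.NewReader(body))
	rec := httptest.NewRecorder()
	modelShowHandler(rec, req)
	return rec.Code
}
```

Under these assumptions a `GET` is rejected with 405, while malformed JSON and an empty `name` both produce 400.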
</details>
<details>
<summary>docs/api/README.md (3)</summary>
`1-96`: **LGTM! Comprehensive and well-structured API documentation.**
The API reference provides excellent coverage of all endpoints, features, and functionality. The organisation is logical and the content accurately reflects the new provider-specific routing architecture.
---
`11-28`: **LGTM! Provider-specific routing documentation is clear and complete.**
The provider-specific routing section clearly explains the new namespace-based approach and lists all relevant endpoints for each provider type.
---
`84-91`: **Grammar is correct - static analysis false positive.**
The sentence structure "Not found (model or endpoint not available)" is grammatically correct in the context of HTTP status code descriptions. The static analysis tool has flagged a false positive.
</details>
<details>
<summary>test/integration/profile_routing_test.go (1)</summary>
`56-90`: **Test structure follows good patterns**
The test comprehensively covers valid and invalid provider scenarios, including routing prefix variations. The test data structure and assertions are clear and maintainable.
</details>
<details>
<summary>internal/app/handlers/handler_provider_test.go (1)</summary>
`96-149`: **Path stripping tests are well-structured**
The path stripping logic tests cover important edge cases including root paths and trailing slashes. The test implementation correctly mirrors the expected path manipulation logic.
</details>
<details>
<summary>readme.md (2)</summary>
`186-186`: **Load balancer naming standardisation looks good**
The change from underscore-separated to hyphen-separated load balancer strategy names (e.g., "round_robin" → "round-robin") improves naming consistency across the configuration.
---
`398-456`: **Provider-specific routing documentation is comprehensive**
The new provider-specific routing examples clearly demonstrate the namespace-based approach (`/olla/ollama/*`, `/olla/lmstudio/*`, etc.) and provide practical code examples for both Python OpenAI clients and curl commands.
</details>
<details>
<summary>docs/adding-providers.md (1)</summary>
`1-63`: **Excellent documentation for dynamic provider system**
The guide clearly explains how the new dynamic route registration system works and provides complete, practical examples. The step-by-step process makes it easy for users to add new providers without code changes.
</details>
<details>
<summary>internal/app/handlers/handler_common.go (5)</summary>
`14-25`: **Provider normalisation logic is well-designed**
The normalisation function correctly handles LM Studio's multiple naming variants and uses constants for consistency. The special case handling ensures all variations map to the canonical form.
---
`29-51`: **Path extraction logic handles edge cases well**
The function correctly handles various URL formats including paths without trailing segments. The use of constants and normalisation ensures consistency with the rest of the system.
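To make the edge cases concrete, here is a minimal sketch of extracting the provider segment from an `/olla/{provider}/...` path. The function name `extractProvider`, its return shape, and the choice of `lmstudio` as the canonical form are assumptions for illustration; the PR's own function may differ.

```go
package main

import "strings"

const routePrefix = "/olla/" // prefix taken from the routes quoted in this PR

// extractProvider is a hypothetical sketch: it returns the provider segment
// and the remaining path, or ok=false when no provider segment is present.
func extractProvider(path string) (provider, rest string, ok bool) {
	if !strings.HasPrefix(path, routePrefix) {
		return "", "", false
	}
	trimmed := strings.TrimPrefix(path, routePrefix)
	provider, rest, _ = strings.Cut(trimmed, "/")
	if provider == "" {
		return "", "", false
	}
	provider = strings.ToLower(provider)
	switch provider {
	case "lm-studio", "lm_studio":
		provider = "lmstudio" // assumed canonical form for the LM Studio aliases
	}
	// A path without trailing segments ("/olla/ollama") maps to "/".
	return provider, "/" + rest, true
}
```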
---
`55-66`: **Good practice preserving original path in context**
Storing the original path in the request context before modification is excellent for debugging and logging purposes. The handling of both `Path` and `RawPath` is thorough.
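One way this pattern can be written is sketched below, under assumptions: the context key `originalPathKey`, the helper names, and the use of `Request.Clone` (rather than mutating the request in place) are illustrative choices, not necessarily what the PR does.

```go
package main

import (
	"context"
	"net/http"
	"net/http/httptest"
	"strings"
)

type ctxKey string

const originalPathKey ctxKey = "original-path" // hypothetical key name

// ensureSlash guarantees the rewritten path still starts with "/".
func ensureSlash(p string) string {
	if !strings.HasPrefix(p, "/") {
		return "/" + p
	}
	return p
}

// stripProviderPrefix records the original path in the context, then rewrites
// both Path and RawPath on a clone so the incoming request is left untouched.
func stripProviderPrefix(r *http.Request, prefix string) *http.Request {
	ctx := context.WithValue(r.Context(), originalPathKey, r.URL.Path)
	out := r.Clone(ctx) // Clone deep-copies the URL
	out.URL.Path = ensureSlash(strings.TrimPrefix(r.URL.Path, prefix))
	if r.URL.RawPath != "" {
		out.URL.RawPath = ensureSlash(strings.TrimPrefix(r.URL.RawPath, prefix))
	}
	return out
}

// demo returns the rewritten path and the original path recovered from the context.
func demo(target, prefix string) (string, string) {
	r := stripProviderPrefix(httptest.NewRequest(http.MethodGet, target, nil), prefix)
	orig, _ := r.Context().Value(originalPathKey).(string)
	return r.URL.Path, orig
}
```

Downstream logging can then read the value back with `r.Context().Value(originalPathKey)` even though the proxied path no longer carries the provider prefix.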
---
`69-88`: **Provider validation with appropriate fallback**
The validation logic correctly uses the profile factory when available and falls back to static validation for tests. The fallback list is consistent with the constants, ensuring reliability.
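The factory-first, static-fallback shape could be sketched as follows. The interface `providerValidator` and its method `IsKnownProvider` are hypothetical stand-ins for whatever the profile factory actually exposes; the static list uses the four providers named in this PR.

```go
package main

// providerValidator is an assumed, narrowed-down view of the profile factory.
type providerValidator interface {
	IsKnownProvider(name string) bool
}

// staticProviders is the fallback used when no factory is wired in (e.g. tests).
var staticProviders = map[string]bool{
	"ollama":   true,
	"lmstudio": true,
	"openai":   true,
	"vllm":     true,
}

// isProviderSupported prefers the factory and only falls back to the static list.
func isProviderSupported(factory providerValidator, provider string) bool {
	if factory != nil {
		return factory.IsKnownProvider(provider)
	}
	return staticProviders[provider]
}

// fakeFactory is a tiny test double keyed by provider name.
type fakeFactory map[string]bool

func (f fakeFactory) IsKnownProvider(name string) bool { return f[name] }
```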
---
`91-95`: **Simple and effective prefix generation**
The function correctly constructs provider prefixes whilst preserving the original provider name format for compatibility.
</details>
<details>
<summary>internal/app/handlers/handler_provider_models_test.go (3)</summary>
`1-18`: **LGTM: Clean import structure**
The imports are well-organised and follow proper Go conventions for testing. All dependencies are necessary and correctly scoped to internal packages.
---
`98-216`: **Excellent comprehensive test coverage**
The table-driven test approach effectively covers all provider-specific model endpoints with proper format validation. The test cases clearly verify:
- Ollama native and OpenAI-compatible formats
- LM Studio OpenAI and enhanced formats
- OpenAI provider aggregation across all endpoints
- vLLM OpenAI-compatible format
Each test properly validates JSON structure and model counts.
---
`218-436`: **Thorough provider filtering and format validation tests**
Both `TestProviderModelFiltering` and `TestUnifiedModelsFormatFiltering` provide essential coverage for:
- Provider isolation ensuring endpoints only return models from their specific provider type
- Format parameter filtering on unified endpoints
- Correct JSON structure validation for each format type
The tests properly validate that provider-specific endpoints maintain isolation whilst the unified endpoint supports multiple output formats with appropriate filtering.
</details>
<details>
<summary>docs/api/provider-routing.md (1)</summary>
`1-274`: **Comprehensive and well-structured documentation**
This documentation effectively explains the new provider-specific routing architecture with:
- Clear architectural overview with explicit URL patterns
- Detailed endpoint specifications with response format examples
- Practical configuration examples
- Comprehensive use cases demonstrating real-world applications
- Proper distinction between intercepted and proxied endpoints
The content aligns well with the implementation and provides excellent guidance for users migrating to the new routing system.
</details>
<details>
<summary>test/scripts/logic/test-provider-models.sh (2)</summary>
`20-96`: **Well-structured and comprehensive test script**
The script demonstrates excellent design with:
- Clear separation of concerns with dedicated functions for different test types
- Comprehensive provider coverage (Ollama, LM Studio, OpenAI, vLLM)
- Proper JSON format validation for each provider's expected response structure
- Good error handling with descriptive messages and appropriate HTTP status code interpretation
- Professional colour-coded output with progress tracking
The main execution flow properly validates prerequisites and provides helpful usage information.
Also applies to: 247-344
---
`67-168`: **Robust test validation logic**
Both `test_endpoint` and `test_unified_endpoint` functions implement solid validation:
- Appropriate format-specific JSON structure validation
- Correct model counting using grep pattern matching
- Proper handling of different response formats (Ollama, OpenAI, LM Studio)
- Good error reporting with format-specific error messages
- Support for verbose output showing response samples
The unified endpoint testing properly validates format parameter filtering behaviour.
Also applies to: 170-245
</details>
<details>
<summary>test/scripts/logic/test-provider-routing.sh (2)</summary>
`108-180`: **Excellent proxy routing validation logic**
The `test_proxy_routing` function implements comprehensive routing validation:
- Proper POST request testing with realistic JSON payload
- Header validation for routing confirmation (`X-Olla-Endpoint`, `X-Olla-Backend-Type`)
- Smart provider name normalisation handling (e.g., `lm-studio` vs `lmstudio`)
- Appropriate HTTP status code interpretation (404 for no providers, 2xx/4xx/5xx for processed requests)
- Clean temporary file handling for response headers
This thoroughly validates that the provider-specific routing works correctly.
---
`235-293`: **Well-structured script with proper execution flow**
The main function demonstrates good bash scripting practices:
- Comprehensive help text with usage examples
- Proper prerequisite validation (curl availability, Olla health check)
- Logical test execution sequence (model endpoints, then proxy routing, then comparison)
- Clean summary reporting with coloured output
The script provides a thorough validation of the provider-specific routing functionality.
</details>
<details>
<summary>test/scripts/logic/README.md (1)</summary>
`1-259`: **Comprehensive and well-organised test documentation**
This README provides excellent coverage of the logic test scripts with:
- Clear overview table summarising each script's purpose and key features
- Detailed sections for each script with practical usage examples
- Comprehensive endpoint coverage documentation showing what each script tests
- Proper documentation of environment variables and requirements
- Helpful guidance for running all tests sequentially including Python dependencies
- Standard exit code conventions for CI/CD integration
The documentation accurately reflects the implemented test scripts and provides valuable guidance for users and developers.
</details>
<details>
<summary>internal/adapter/registry/profile/factory.go (5)</summary>
`11-23`: **LGTM! Well-structured interface extension and thread-safe implementation.**
The addition of `NormalizeProviderName` to the interface and the `prefixLookup` map with mutex protection demonstrates good design for concurrent access patterns.
---
`25-43`: **Good optimisation with pre-computed prefix mappings.**
Pre-computing the prefix lookup table at factory creation enables O(1) provider resolution, which is an excellent performance optimisation.
---
`109-126`: **Well-designed validation with prefix lookup fallback.**
The two-stage validation approach (prefix lookup followed by exact match) provides good flexibility for handling provider name variations while maintaining backward compatibility.
---
`128-141`: **Clean implementation of provider name normalisation.**
The method correctly uses the prefix lookup for alias resolution and sensibly returns unknown names unchanged, allowing for future extensibility.
---
`143-162`: **Efficient routing table construction from profiles.**
Good implementation that extracts routing prefixes from YAML configs and includes profile names as implicit prefixes for convenience.
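Taken together, the points above (pre-computed lookup, mutex protection, pass-through for unknown names, profile names as implicit prefixes) suggest a structure along these lines. This is a simplified sketch: the `profile` and `factory` types and the example profile name `lm-studio` are assumptions, and only `NormalizeProviderName` and `prefixLookup` are names taken from the review.

```go
package main

import "sync"

// profile is a pared-down stand-in for a YAML-backed provider profile.
type profile struct {
	Name     string
	Prefixes []string
}

// factory holds a routing-prefix table built once at construction time.
type factory struct {
	mu           sync.RWMutex
	prefixLookup map[string]string // routing prefix -> profile name
}

func newFactory(profiles []profile) *factory {
	f := &factory{prefixLookup: make(map[string]string)}
	for _, p := range profiles {
		f.prefixLookup[p.Name] = p.Name // profile name acts as an implicit prefix
		for _, prefix := range p.Prefixes {
			f.prefixLookup[prefix] = p.Name
		}
	}
	return f
}

// NormalizeProviderName resolves an alias to its profile name in O(1);
// unknown names are returned unchanged.
func (f *factory) NormalizeProviderName(name string) string {
	f.mu.RLock()
	defer f.mu.RUnlock()
	if canonical, ok := f.prefixLookup[name]; ok {
		return canonical
	}
	return name
}
```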
</details>
<details>
<summary>internal/app/handlers/handler_common_test.go (4)</summary>
`11-49`: **Well-structured mock implementation for testing.**
The mock ProfileFactory provides appropriate test doubles for the interface methods, with simplified but representative normalisation logic.
---
`51-79`: **Comprehensive test coverage for provider normalisation.**
Excellent table-driven test with thorough coverage of edge cases including various lmstudio variants, case sensitivity, and other provider types.
---
`81-108`: **Thorough testing of provider path extraction.**
Well-designed test cases covering successful extraction, normalisation, edge cases, and error conditions.
---
`110-164`: **Excellent dual-scenario testing for provider support.**
The test effectively covers both fallback behaviour (without factory) and factory-based validation, ensuring the method works correctly in different deployment scenarios.
</details>
<details>
<summary>internal/app/handlers/handler_provider_ollama.go (3)</summary>
`30-39`: **Appropriate handling of unsupported operation with clear documentation.**
Good decision to return 501 with clear comments explaining the challenges of aggregating model details across multiple instances.
---
`41-50`: **Consistent handling of unsupported running models endpoint.**
The implementation correctly returns 501 with clear explanation of the state synchronisation challenges.
---
`75-80`: **Good reusable handler for unsupported operations.**
Clean implementation that clearly communicates the limitation of model management operations in a distributed proxy environment.
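A reusable 501 handler of the kind described can be as small as the sketch below. The names `unsupportedHandler` and `statusOf` and the message text are hypothetical; only the 501 status comes from the review comments.

```go
package main

import (
	"net/http"
	"net/http/httptest"
)

// unsupportedHandler returns a handler that always answers 501 Not Implemented,
// naming the operation that cannot be served through a multi-backend proxy.
func unsupportedHandler(operation string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, operation+" is not supported via the proxy", http.StatusNotImplemented)
	}
}

// statusOf exercises a handler in memory and returns the response status code.
func statusOf(h http.Handler, method, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}
```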
</details>
<details>
<summary>test/scripts/logic/test-model-routing-provider.py (2)</summary>
`30-371`: **Excellent test implementation with comprehensive coverage.**
The `ProviderTester` class is well-designed with thorough error handling, detailed statistics tracking, and user-friendly coloured output. The test coverage for provider-specific endpoints is comprehensive.

---
`372-421`: **Well-structured test orchestration.**
The main function provides clean argument parsing, sensible defaults, and logical test workflow orchestration.
</details>
<details>
<summary>internal/app/handlers/handler_provider_common.go (5)</summary>
`14-52`: **Well-designed provider profile creation with OpenAI compatibility.**
The implementation correctly handles OpenAI's inclusive routing model and provides appropriate fallbacks for test scenarios. The dynamic inclusion of OpenAI-compatible providers is a thoughtful design choice.
---
`54-105`: **Solid implementation of provider-scoped proxy handler.**
Good use of context for passing provider information through the request lifecycle. Appropriate error handling for invalid paths and unsupported providers.
---
`107-130`: **Efficient endpoint filtering with smart reuse of existing logic.**
Good design that leverages the existing RequestProfile filtering mechanism for provider-specific endpoint selection, with appropriate two-stage filtering.
---
`180-205`: **Well-structured model fetching pipeline.**
Clean implementation with logical flow: unified models → health filtering → provider filtering. Good error handling with descriptive messages.
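The three-stage flow (unified models, then health filtering, then provider filtering) can be illustrated with a deliberately simplified sketch. The `model` type and its per-model `Healthy` flag are assumptions made for brevity; in the PR, health is likely tracked per endpoint rather than per model.

```go
package main

// model is a simplified stand-in for a unified model entry.
type model struct {
	ID       string
	Provider string
	Healthy  bool
}

// filter keeps only the models for which keep returns true.
func filter(models []model, keep func(model) bool) []model {
	var out []model
	for _, m := range models {
		if keep(m) {
			out = append(out, m)
		}
	}
	return out
}

// modelsForProvider applies the two filtering stages in order:
// first drop unhealthy entries, then keep only the requested provider.
func modelsForProvider(unified []model, provider string) []model {
	healthy := filter(unified, func(m model) bool { return m.Healthy })
	return filter(healthy, func(m model) bool { return m.Provider == provider })
}
```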
---
`207-222`: **Clean format conversion using factory pattern.**
Good use of the converter factory for format-specific transformations. The implementation is clean and follows single responsibility principle.
</details>
<details>
<summary>internal/app/handlers/server_routes.go (2)</summary>
`27-49`: **Well-structured route registration with clear separation of concerns.**
The route registration follows a logical hierarchy: internal endpoints → unified models → legacy compatibility → provider-specific routes. The comments effectively document the purpose of each route group.
---
`233-242`: **Clean handling of provider display names.**
The display name formatting provides clear, human-readable route descriptions that will be helpful in logs and debugging. The special handling for LM Studio's multiple prefixes is well implemented.
</details>
</blockquote></details>
</details>
This PR redoes the API with explicit provider-specific routing, profile-driven route registration, and cleaner separation of logic for different LLM backends. It fixes backend ambiguity, simplifies debugging, and still plays nice with legacy clients.
tl;dr
Provider-Specific Routing
- `/olla/proxy` for clearly scoped routes: `/olla/ollama/`, `/olla/lmstudio/`, `/olla/openai/`, `/olla/vllm/`
Dynamic Route Registration
- (`config/profiles/*.yaml`).
Enhanced Model Discovery
- `/api/tags`, `/api/v0/models`
Improved Code Organisation
- `server_routes.go`, `handler_common.go`, `handler_provider_common.go`, `handler_provider_ollama.go`, `handler_provider_lmstudio.go`, `handler_provider_openai.go`
Testing
- `test-provider-routing.sh`, `test-model-routing-provider.py`
Documentation
- `docs/api/provider-routing.md`, `docs/adding-providers.md`
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
Chores