tweaks & bugfixes #35
…rrectly. The model ID was being used incorrectly when the same model name has different digests. We'd get models like this: hf.co/unsloth/Qwen3-32B-GGUF:Q4_K_XL-75fb8ad3 when it needed to be: hf.co/unsloth/Qwen3-32B-GGUF:Q4_K_XL. Interestingly, the digest clash doesn't occur for LMStudio.
Walkthrough
This update introduces enhanced logging middleware for HTTP requests, context-aware logging in proxy and discovery services, and user-friendly error messages for discovery errors. It changes model ID selection logic to prioritise aliases for routing compatibility, updates configuration loading to support a command-line flag, and adds a port availability check before starting the HTTP service. Associated tests are included for new utilities and behaviour.
Sequence Diagram(s)
Enhanced HTTP Request Logging Middleware
sequenceDiagram
participant Client
participant Middleware
participant Handler
participant Logger
Client->>Middleware: HTTP Request
Middleware->>Logger: Log request start (method, path, etc.)
Middleware->>Handler: ServeHTTP (with context)
Handler-->>Middleware: Response
Middleware->>Logger: Log request completion (status, duration, bytes)
Middleware-->>Client: HTTP Response
Model Conversion with Alias ID
sequenceDiagram
participant Converter
participant Model
Converter->>Model: Get ID, Get Aliases
alt Aliases exist
Converter->>Converter: Use first alias as model ID
else No aliases
Converter->>Converter: Use original model ID
end
Converter-->>Caller: Return model data with selected ID
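The alias selection shown in the diagram above amounts to a small fallback rule. The sketch below is illustrative only: `Model` and `routableID` are hypothetical names, and aliases are assumed to be plain strings, which may not match the project's actual types.

```go
package main

import "fmt"

// Model is a hypothetical stand-in for the unified model type.
// Aliases are assumed to be plain strings for this sketch.
type Model struct {
	ID      string
	Aliases []string
}

// routableID returns the first alias when one exists, otherwise the original ID.
func routableID(m Model) string {
	if len(m.Aliases) > 0 {
		return m.Aliases[0]
	}
	return m.ID
}

func main() {
	withAlias := Model{
		ID:      "hf.co/unsloth/Qwen3-32B-GGUF:Q4_K_XL-75fb8ad3",
		Aliases: []string{"hf.co/unsloth/Qwen3-32B-GGUF:Q4_K_XL"},
	}
	noAlias := Model{ID: "qwen3:32b"}

	fmt.Println(routableID(withAlias))
	fmt.Println(routableID(noAlias))
}
```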
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~18 minutes
Actionable comments posted: 2
🧹 Nitpick comments (5)
internal/adapter/converter/openai_converter.go (1)
51-57: Fix typo in comment.
There's a typo in line 52: "alas" should be "alias".
- // we need to use first alas as ID for routing compatibility
+ // we need to use first alias as ID for routing compatibility
internal/adapter/converter/unified_converter.go (1)
78-84: Fix typo in comment.
There's a typo in line 79: "alas" should be "alias".
- // we need to use first alas as ID for routing compatibility
+ // we need to use first alias as ID for routing compatibility
internal/util/network.go (1)
51-59: Potential race condition in port availability check.
The function correctly checks port availability, but there's a race condition between checking and the actual server start. Another process could claim the port after this check but before the HTTP server binds to it.
Consider these alternatives:
- Move this check closer to the actual server start
- Let the HTTP server handle port binding errors directly
- Use a more robust approach that reserves the port until the server starts
-// IsPortAvailable checks if a port is available by attempting to bind to it
-func IsPortAvailable(host string, port int) bool {
-	listener, err := net.Listen("tcp", fmt.Sprintf("%s:%d", host, port))
-	if err != nil {
-		return false
-	}
-	defer listener.Close()
-	return true
-}
+// IsPortAvailable checks if a port is available by attempting to bind to it
+// Note: There's a race condition between checking and actual usage
+func IsPortAvailable(host string, port int) bool {
+	listener, err := net.Listen("tcp", fmt.Sprintf("%s:%d", host, port))
+	if err != nil {
+		return false
+	}
+	defer listener.Close()
+	return true
+}
internal/app/middleware/logging.go (1)
172-184: Add input validation to formatBytes function.
The function handles most cases well, but should validate input to prevent potential issues with negative values.
-func formatBytes(bytes int64) string {
-	const unit = 1024
-	if bytes < unit {
-		return fmt.Sprintf("%dB", bytes)
-	}
+func formatBytes(bytes int64) string {
+	const unit = 1024
+	if bytes < 0 {
+		return "0B"
+	}
+	if bytes < unit {
+		return fmt.Sprintf("%dB", bytes)
+	}
main.go (1)
48-49: Well-implemented command-line flags for config file specification.
Both short (-c) and long (--config) flag variants provide good user experience. The implementation correctly points both flags to the same variable.
Consider adding a brief comment to clarify that both flags serve the same purpose:
+ // Config file flags (both -c and -config point to the same variable)
  flag.StringVar(&configFile, "c", "", "Config file path")
  flag.StringVar(&configFile, "config", "", "Config file path")
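To illustrate why registering both names against one variable works, here is a small self-contained sketch. `parseConfigFlag` and the `"olla"` flag-set name are hypothetical; a dedicated `flag.FlagSet` is used only so the example can parse several argument lists without touching the global flag state.

```go
package main

import (
	"flag"
	"fmt"
)

// parseConfigFlag registers -c and -config against one variable and parses args.
// Go's flag package accepts both -config and --config spellings.
func parseConfigFlag(args []string) (string, error) {
	fs := flag.NewFlagSet("olla", flag.ContinueOnError)
	var configFile string
	fs.StringVar(&configFile, "c", "", "Config file path")
	fs.StringVar(&configFile, "config", "", "Config file path")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return configFile, nil
}

func main() {
	for _, args := range [][]string{
		{"-c", "a.yaml"},
		{"--config", "b.yaml"},
		{},
	} {
		path, err := parseConfigFlag(args)
		fmt.Printf("%q %v\n", path, err)
	}
}
```

If both flags are passed, the one that appears last on the command line wins, since each simply writes to the shared variable.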
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
internal/adapter/converter/openai_converter.go (1 hunks)
internal/adapter/converter/openai_converter_test.go (1 hunks)
internal/adapter/converter/unified_converter.go (1 hunks)
internal/adapter/converter/unified_converter_test.go (3 hunks)
internal/adapter/discovery/errors.go (2 hunks)
internal/adapter/discovery/errors_test.go (1 hunks)
internal/adapter/discovery/service.go (2 hunks)
internal/adapter/proxy/olla/service.go (5 hunks)
internal/adapter/proxy/sherpa/service.go (4 hunks)
internal/app/handlers/application.go (4 hunks)
internal/app/middleware/logging.go (1 hunks)
internal/app/middleware/logging_test.go (1 hunks)
internal/app/services/http.go (2 hunks)
internal/config/config.go (2 hunks)
internal/util/network.go (1 hunks)
main.go (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*_test.go
📄 CodeRabbit Inference Engine (CLAUDE.md)
Unit tests should test individual components in isolation.
Files:
internal/adapter/converter/openai_converter_test.go
internal/adapter/discovery/errors_test.go
internal/app/middleware/logging_test.go
internal/adapter/converter/unified_converter_test.go
internal/{app,adapter}/**/*.go
📄 CodeRabbit Inference Engine (CLAUDE.md)
Endpoints should be exposed at `/internal/health` and `/internal/status`.
Files:
internal/adapter/converter/openai_converter_test.go
internal/adapter/discovery/errors_test.go
internal/adapter/converter/openai_converter.go
internal/adapter/converter/unified_converter.go
internal/adapter/discovery/service.go
internal/adapter/discovery/errors.go
internal/adapter/proxy/sherpa/service.go
internal/app/middleware/logging_test.go
internal/app/middleware/logging.go
internal/app/services/http.go
internal/app/handlers/application.go
internal/adapter/proxy/olla/service.go
internal/adapter/converter/unified_converter_test.go
🧠 Learnings (9)
📚 Learning: applies to config.yaml : the main configuration should be defined in `config.yaml`....
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-27T12:59:29.787Z
Learning: Applies to config.yaml : The main configuration should be defined in `config.yaml`.
Applied to files:
main.go
📚 Learning: applies to internal/adapter/proxy/*_test.go : shared proxy tests should ensure compatibility between...
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-27T12:59:29.787Z
Learning: Applies to internal/adapter/proxy/*_test.go : Shared proxy tests should ensure compatibility between both proxy engines.
Applied to files:
internal/adapter/converter/openai_converter_test.go
internal/adapter/proxy/sherpa/service.go
internal/app/middleware/logging_test.go
internal/adapter/proxy/olla/service.go
internal/adapter/converter/unified_converter_test.go
📚 Learning: applies to **/*_test.go : unit tests should test individual components in isolation....
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-27T12:59:29.787Z
Learning: Applies to **/*_test.go : Unit tests should test individual components in isolation.
Applied to files:
internal/adapter/discovery/errors_test.go
internal/app/middleware/logging_test.go
internal/adapter/converter/unified_converter_test.go
📚 Learning: applies to {proxy_sherpa.go,proxy_olla.go} : proxy implementations should be in `proxy_sherpa.go` an...
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-27T12:59:29.787Z
Learning: Applies to {proxy_sherpa.go,proxy_olla.go} : Proxy implementations should be in `proxy_sherpa.go` and `proxy_olla.go`.
Applied to files:
internal/adapter/proxy/sherpa/service.go
internal/app/services/http.go
internal/app/handlers/application.go
internal/adapter/proxy/olla/service.go
📚 Learning: applies to internal/adapter/proxy/*_test.go : integration tests should test the full request flow th...
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-27T12:59:29.787Z
Learning: Applies to internal/adapter/proxy/*_test.go : Integration tests should test the full request flow through the proxy.
Applied to files:
internal/adapter/proxy/sherpa/service.go
internal/app/middleware/logging_test.go
internal/adapter/proxy/olla/service.go
internal/adapter/converter/unified_converter_test.go
📚 Learning: applies to handler_proxy.go : request routing logic should be implemented in `handler_proxy.go`....
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-27T12:59:29.787Z
Learning: Applies to handler_proxy.go : Request routing logic should be implemented in `handler_proxy.go`.
Applied to files:
internal/adapter/proxy/sherpa/service.go
internal/app/handlers/application.go
internal/adapter/proxy/olla/service.go
📚 Learning: applies to internal/adapter/proxy/*_test.go : benchmark tests should measure performance of critical...
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-27T12:59:29.787Z
Learning: Applies to internal/adapter/proxy/*_test.go : Benchmark tests should measure performance of critical paths, proxy engine comparisons, connection pooling efficiency, and circuit breaker behavior.
Applied to files:
internal/adapter/proxy/sherpa/service.go
internal/app/middleware/logging_test.go
internal/adapter/proxy/olla/service.go
📚 Learning: applies to internal/adapter/proxy/*.go : expose the following response headers: `x-olla-endpoint`, `...
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-27T12:59:29.788Z
Learning: Applies to internal/adapter/proxy/*.go : Expose the following response headers: `X-Olla-Endpoint`, `X-Olla-Model`, `X-Olla-Backend-Type`, `X-Olla-Request-ID`, `X-Olla-Response-Time`.
Applied to files:
internal/adapter/proxy/sherpa/service.go
internal/app/middleware/logging.go
internal/app/services/http.go
internal/app/handlers/application.go
internal/adapter/proxy/olla/service.go
internal/adapter/converter/unified_converter_test.go
📚 Learning: applies to internal/{app,adapter}/**/*.go : endpoints should be exposed at `/internal/health` and `/...
Learnt from: CR
PR: thushan/olla#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-27T12:59:29.788Z
Learning: Applies to internal/{app,adapter}/**/*.go : Endpoints should be exposed at `/internal/health` and `/internal/status`.
Applied to files:
internal/adapter/proxy/sherpa/service.go
internal/app/services/http.go
internal/app/handlers/application.go
internal/adapter/proxy/olla/service.go
🧬 Code Graph Analysis (8)
main.go (1)
internal/config/config.go (1)
Load(112-153)
internal/adapter/discovery/errors_test.go (1)
internal/adapter/discovery/errors.go (4)
NewDiscoveryError(33-42)
NetworkError(69-72)
ParseError(54-58)
GetUserFriendlyMessage(83-124)
internal/config/config.go (2)
internal/config/types.go (1)
Config(10-18)
internal/logger/logger.go (1)
Config(17-26)
internal/app/middleware/logging_test.go (2)
internal/app/middleware/logging.go (5)
GetLogger(49-54)
GetRequestID(57-62)
EnhancedLoggingMiddleware(65-120)
AccessLoggingMiddleware(123-170)
FormatBytes(187-189)
internal/logger/styled.go (2)
StyledLogger(12-35)
LogContext(71-74)
internal/app/middleware/logging.go (3)
theme/default.go (1)
Default(45-81)
internal/logger/styled.go (1)
StyledLogger(12-35)
internal/logger/logger.go (1)
DefaultDetailedCookie(30-30)
internal/app/services/http.go (1)
internal/util/network.go (1)
IsPortAvailable(52-59)
internal/app/handlers/application.go (2)
internal/logger/styled.go (1)
StyledLogger(12-35)
internal/app/middleware/logging.go (2)
EnhancedLoggingMiddleware(65-120)
AccessLoggingMiddleware(123-170)
internal/adapter/proxy/olla/service.go (2)
internal/app/middleware/logging.go (3)
GetLogger(49-54)
GetRequestID(57-62)
FormatBytes(187-189)
internal/adapter/proxy/common/errors.go (1)
ErrNoHealthyEndpoints(16-16)
🔇 Additional comments (38)
internal/adapter/converter/openai_converter_test.go (1)
92-92: Test correctly updated to match new alias-based model ID behaviour.
The test assertion has been properly updated to expect the first alias name ("phi4:latest") instead of the original model ID, which aligns with the implementation changes for routing compatibility.
internal/adapter/converter/openai_converter.go (1)
54-57: LGTM: Alias prioritisation logic is correct.
The implementation correctly prioritises the first alias for routing compatibility whilst maintaining backwards compatibility by falling back to the original model ID when no aliases exist.
internal/adapter/converter/unified_converter_test.go (2)
87-88: Test comment and assertion correctly updated.
The comment accurately explains the new behaviour of using the first alias for routing compatibility, and the assertion correctly expects the alias name instead of the original model ID.
115-115: Consistent test assertions across all filter scenarios.
The test assertions have been consistently updated across all filtering scenarios to expect the alias-based model IDs, ensuring comprehensive coverage of the new behaviour.
Also applies to: 143-143
internal/adapter/converter/unified_converter.go (1)
81-84: LGTM: Consistent alias prioritisation implementation.
The implementation correctly mirrors the OpenAI converter's approach, ensuring consistent behaviour across different response formats. The logic properly prioritises the first alias whilst maintaining backwards compatibility.
Also applies to: 87-87
internal/app/middleware/logging.go (2)
64-120: Excellent middleware implementation with comprehensive logging.
The middleware provides excellent request tracing capabilities with proper context propagation and metrics capture. The request ID generation and response size tracking are well implemented.
122-170: Well-structured access logging with detailed context.
The access logging middleware complements the enhanced logging nicely, providing detailed information suitable for file-based analysis. The use of context flags for detailed logging is a good pattern.
internal/app/services/http.go (1)
9-9: Good addition of util import for network utilities.
Clean import addition to support the new port availability check functionality.
main.go (2)
42-42: Good addition of configFile variable for command-line configuration.
The global variable properly supports the new configuration file flag functionality.
105-105: Correct integration with updated config.Load signature.
The config loading properly passes the configFile argument, aligning with the updated Load function signature that accepts optional config file paths.
internal/adapter/discovery/errors_test.go (1)
9-98: Excellent comprehensive test coverage for error message formatting.
The test suite thoroughly covers all error scenarios handled by `GetUserFriendlyMessage`, including:
- Network connectivity issues (connectex, timeout)
- HTTP status error conditions (404, 500)
- Parse errors and generic network errors
- Unknown error fallback cases
The table-driven approach and clear test case names make this maintainable and easy to understand. This aligns perfectly with the coding guideline to test components in isolation.
internal/adapter/discovery/errors.go (1)
82-124: LGTM! Well-designed user-friendly error messaging.
The `GetUserFriendlyMessage` function provides a clear, structured approach to converting technical discovery errors into user-friendly messages. The implementation correctly uses `errors.As` for type assertion, handles HTTP status code categorisation appropriately, and includes comprehensive fallback handling for various error scenarios.
The string matching patterns for network errors (lines 101-109) cover common connection issues effectively.
internal/adapter/discovery/service.go (3)
99-99: Enhance error logging with context.
Good integration of the user-friendly error messaging. The call to `LogErrorsWithContext` provides better error visibility during regular discovery operations.
104-111: Well-structured logging helper method.
The `LogErrorsWithContext` method effectively combines user-friendly messaging with detailed technical information through structured logging. The separation of user-facing and technical details is appropriate for operational visibility.
202-210: Improved error handling with structured context.
The enhanced error logging in `handleDiscoveryError` provides better operational insight by combining user-friendly messages with detailed error context. This aligns well with the broader logging improvements across the system.
internal/config/config.go (3)
112-112: Good design choice for backwards compatibility.
The variadic parameter approach maintains backwards compatibility whilst enabling command-line flag support. This is a clean solution for extending the function's capability.
118-123: Clear priority hierarchy implementation.
The priority logic (flag > environment variable > default paths) is well-implemented and clearly documented. The conditional logic correctly handles the precedence order.
141-147: Enhanced error handling for different config sources.
The improved error handling provides specific messages based on the configuration source, which aids in troubleshooting. The logic correctly handles both flag-specified and environment variable scenarios.
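The precedence described above (flag > environment variable > default paths) can be sketched as a small pure function. Everything here is hypothetical: `resolveConfigPath`, the `source` labels and the injected `exists` callback are illustration choices, not the signature of `config.Load`.

```go
package main

import "fmt"

// resolveConfigPath picks a config path by precedence: flag, then environment
// variable, then the first default path that exists. The exists callback is
// injected so the sketch does not depend on the real filesystem.
func resolveConfigPath(flagPath, envPath string, defaults []string, exists func(string) bool) (path, source string) {
	if flagPath != "" {
		return flagPath, "flag"
	}
	if envPath != "" {
		return envPath, "env"
	}
	for _, p := range defaults {
		if exists(p) {
			return p, "default"
		}
	}
	return "", "none"
}

func main() {
	defaults := []string{"config.yaml", "config/config.yaml"}
	onDisk := map[string]bool{"config/config.yaml": true}
	exists := func(p string) bool { return onDisk[p] }

	fmt.Println(resolveConfigPath("custom.yaml", "env.yaml", defaults, exists))
	fmt.Println(resolveConfigPath("", "env.yaml", defaults, exists))
	fmt.Println(resolveConfigPath("", "", defaults, exists))
}
```

Returning the source alongside the path is one way to support source-specific error messages when the file cannot be read.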
internal/adapter/proxy/sherpa/service.go (4)
44-44: Appropriate middleware import for enhanced logging.
The addition of the middleware import enables context-aware logging capabilities, supporting the broader logging infrastructure improvements.
215-241: Well-implemented context-aware logging pattern.
The conditional logging approach effectively leverages context loggers when available whilst maintaining backwards compatibility. The enhanced debug information (endpoint count) provides better operational visibility.
255-263: Enhanced request dispatch logging.
Good addition of request ID to the dispatch logging when using context logger. This provides better request traceability across the system.
365-391: Comprehensive completion metrics logging.
Excellent enhancement to the completion logging with formatted byte counts and request ID. The detailed metrics (latency breakdown, byte formatting) provide valuable operational insights whilst maintaining the existing debug-level logging as fallback.
internal/adapter/proxy/olla/service.go (4)
43-43: Consistent middleware integration.
Good addition of middleware import to enable context-aware logging, maintaining consistency with the sherpa proxy implementation.
398-423: Consistent context-aware logging implementation.
The conditional logging pattern matches the sherpa service implementation, providing consistency across proxy engines. The enhanced debug information improves operational visibility.
444-452: Enhanced dispatch logging with request tracking.
Good integration of request ID into dispatch logging when context logger is available, providing better request traceability.
597-598: Comprehensive completion metrics with context awareness.
Excellent enhancement providing detailed completion metrics with formatted byte counts and request ID when context logger is available. The implementation maintains consistency with the sherpa service whilst preserving fallback behaviour.
Also applies to: 650-676
internal/app/handlers/application.go (6)
12-12: LGTM!
The middleware package import is correctly added to support the new logging middleware functionality.
23-23: LGTM!
The logger field addition to SecurityAdapters struct enables proper dependency injection for logging middleware.
26-32: Excellent logging middleware integration!
The middleware chaining is well-structured with clear ordering: logging → access logging → security → handler. The implementation correctly wraps the next handler with both logging middleware layers.
53-53: LGTM!
The handler correctly serves the wrapped middleware with access logging applied.
58-65: Good consistency in middleware application.
Both chain and rate limit middleware now consistently apply the same logging middleware layers, ensuring uniform logging behaviour across all request types.
125-125: LGTM!
The SecurityAdapters is correctly initialised with the logger dependency for middleware usage.
internal/app/middleware/logging_test.go (6)
15-72: Comprehensive test coverage for EnhancedLoggingMiddleware.
The test effectively validates:
- Context logger injection and retrieval
- Request ID propagation and header setting
- Handler execution flow
- Response verification
The test follows good isolation principles by using a mock logger and testing the middleware in isolation.
74-109: Good test coverage for AccessLoggingMiddleware.
The test validates the middleware behaviour with proper request setup including headers, content length, and query parameters. The response verification is thorough.
111-131: Excellent test coverage for FormatBytes utility.
The test cases cover all the important byte size ranges from bytes to terabytes, ensuring the formatting function works correctly across different scales.
133-141: Good edge case testing for GetLogger.
Testing the default behaviour when no logger is present in context ensures robustness.
143-151: Good edge case testing for GetRequestID.
Testing the default behaviour when no request ID is present in context ensures robustness.
153-178: Complete mock implementation of StyledLogger interface.
The mock implementation correctly implements all methods from the StyledLogger interface, enabling proper isolation testing of the middleware components. The implementation is minimal but sufficient for testing purposes.
This PR does some maintenance work:
- Config flag (`-c` or `--config`, so you can pass in a config file)
- Logging improvements (`requestid` etc)
One important bugfix was to resolve model unification when the same model differs across servers (by digest). This usually happens when a model is updated on one server but not the other; we would get things like:
hf.co/unsloth/Qwen3-32B-GGUF:Q4_K_XL-75fb8ad3
qwen3:32b-8ade7840
Automation tools (mostly our suite) would fail to query those because they don't exist and so aren't routable. They now use the ID properly:
hf.co/unsloth/Qwen3-32B-GGUF:Q4_K_XL
qwen3:32b
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Tests