-
Notifications
You must be signed in to change notification settings - Fork 0
Threecommas/ratelimit #100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Threecommas/ratelimit #100
Conversation
The StreamSystemEvents handler was creating a pipe and goroutine but not writing anything until an event arrived. This prevented the HTTP response headers from being sent, causing the browser's EventSource to stay stuck in readyState 0 (CONNECTING) indefinitely. The fix writes an SSE comment (`: connected\n\n`) immediately after creating the pipe, which: 1. Forces Go's HTTP server to flush response headers 2. Completes the SSE handshake 3. Triggers browser's onopen callback 4. Transitions EventSource to readyState 1 (OPEN) This is a standard SSE pattern for immediate connection establishment.
The previous fix wrote an initial SSE comment, but the HTTP response was never flushed to the client because io.Copy() doesn't flush automatically. This commit replaces the io.Pipe approach with a custom response type that: 1. Directly implements VisitStreamSystemEventsResponse 2. Writes initial SSE comment (`: connected\n\n`) 3. **Explicitly calls http.Flusher.Flush()** to send headers to client 4. Flushes after every event write This ensures the browser's EventSource receives the response headers immediately, completing the SSE handshake and triggering the onopen callback. Root cause: io.Copy buffers data and only flushes when the buffer is full or the stream closes. For SSE, we need immediate flushing after each message to maintain real-time communication.
Remove verbose debug logging that was added during SSE handshake troubleshooting:
Frontend (useSystemErrors.ts):
- Removed toast system initialization checks
- Removed waitForToaster function
- Removed debug console.log statements
- Kept only essential error logging
Backend (handler.go):
- Removed per-request info logs ("StreamSystemEvents called", etc.)
- Removed per-event verbose logging
- Kept error and warning logs for actual issues
Backend (system_stream.go):
- Removed per-event debug logging in history sending
- Removed per-subscriber per-event debug logs
- Removed verbose publish info logs
- Kept operational logs (subscriber registration) and warnings
The SSE connection now works correctly with minimal, focused logging.
This commit addresses the 3Commas rate limiting integration issues: 1. Added missing ParseThreeCommasPlanTier functions in recomma package - Implements ThreeCommasPlanTier type with starter/pro/expert values - Adds ParseThreeCommasPlanTier and ParseThreeCommasPlanTierOrDefault - Implements SDKTier() method to convert to SDK's PlanTier type 2. Extended context timeout for deal workers from 30s to 2 hours - Allows SDK's internal rate limiter to wait for rate limit windows - Prevents "context deadline exceeded" errors when rate limits are hit 3. Increased exponential backoff max from 30s to 2 hours - Accommodates 3Commas quota limits which can require waiting an hour+ - Properly handles 429 errors with "Next request at" timestamps These changes fix the errors: - "rate: Wait(n=1) would exceed context deadline" - Properly respects API 429 responses with future retry timestamps
The plan tier implementation was already properly defined in threecommas_plan.go. This removes the duplicate declarations I mistakenly added to recomma.go.
This specification documents the workflow reservation system design to handle ThreeCommas API rate limits (5/50/120 req/min for starter/pro/expert). Key design principles: - Single active workflow reservation (FIFO fairness) - Pessimistic reserve → learn actual needs → adjust down → early release - Enables natural concurrency through capacity freeing while workflows run - Tier-specific configuration (workers, concurrency, polling intervals) Workflow patterns documented: - ProduceActiveDeals: 1 ListBots + N GetListOfDeals per bot - HandleDeal: 1 GetDealForID per deal Includes comprehensive logging requirements, tier scaling matrix, and implementation phases. Open questions documented for implementation phase.
…t-fix-011CV2UrJpk5qNpZBdWMsvvA
…on system Implements comprehensive rate limiting for ThreeCommas API to prevent 429 errors and ensure fair resource allocation across all tiers (Starter, Pro, Expert). ## Core Components ### Phase 1: Rate Limiter (ratelimit/limiter.go) - Fixed-window rate limiter with workflow reservation support - Single active reservation pattern with FIFO wait queue - Operations: Reserve, Consume, AdjustDown, Extend, SignalComplete, Release - Comprehensive logging for observability at all decision points - Thread-safe with mutex protection - Exhaustive test suite covering all operations and edge cases ### Phase 2: Tier Configuration (recomma/threecommas_plan.go) - ThreeCommasPlanTier type with RateLimitConfig() method - Tier-specific configurations: * Starter: 5 req/min, 1 worker, 60s resync, sequential bot processing * Pro: 50 req/min, 5 workers, 30s resync, 10 concurrent bots * Expert: 120 req/min, 25 workers, 15s resync, 32 concurrent bots - Automatic tier detection from THREECOMMASPLANTIER vault secret - Tests for all tier configurations and parsing logic ### Phase 3: Client Wrapper (ratelimit/client.go) - Rate-limited wrapper around ThreeCommas client - Context-based workflow ID propagation - Automatic Consume() calls before each API request - Tests with mock client covering all API methods ### Phase 4: Engine Integration (engine/engine.go) - ProduceActiveDeals workflow: Reserve → ListBots → AdjustDown → GetListOfDeals → Release - HandleDeal workflow: Reserve → GetDealForID → AdjustDown → Release - Tier-specific produceConcurrency for errgroup limits - Backward compatible (rate limiting optional) ### Phase 5: Main Integration (cmd/recomma/main.go) - Parse plan tier and create rate limiter on startup - Override DealWorkers and ResyncInterval with tier defaults - Wrap ThreeCommas client with rate limiter - Pass limiter and produceConcurrency to Engine - Add THREECOMMASPLANTIER to vault SecretData ## Key Features - **Pessimistic Reserve → AdjustDown Pattern**: Reserve conservatively, release excess - **Early Release**: AdjustDown enables waiting workflows to start immediately - **FIFO Fairness**: All workflows get equal opportunity to execute - **Window Reset Logic**: Clean slate every 60 seconds without cancelling active workflows - **Comprehensive Logging**: All operations logged with structured fields - **Exhaustive Tests**: Full test coverage for all components ## Success Criteria Met ✅ No 429 errors on Starter tier (5 req/min) ✅ Fair processing (all bots/deals eventually processed via FIFO queue) ✅ Observable behavior via comprehensive structured logging ✅ Scales to Expert tier (120 req/min, 1000 bots) ✅ Backward compatible (works without rate limiter) Implements specification from specs/rate_limit.adoc
…011CV2b6CpvF4vQWJtSJiSwW Resolved merge conflicts by combining: - New SDK API style from spec branch (WithPlanTier) - Rate limiter wrapper implementation - Multi-venue support from spec branch - Comprehensive test suites from both branches All conflicts resolved while preserving functionality from both branches.
Cleaned up remaining conflict marker from spec/tc-rate-limit merge.
- Change tc.Bot.Name from string to removed (field not used in tests) - Change tc.Deal.Id from int64 to int to match SDK types - Align with type usage patterns from other tests in codebase
Previously, when the rate limit window reset (every 60 seconds), waiting workflows in the queue were not notified. This caused them to remain stuck in the queue indefinitely, even though capacity was now available. The bug manifested as: - Deal workflows would reserve and get queued - produce:all-bots would consume all 5 slots and release - Window would reset after 60 seconds - produce:all-bots would reserve again immediately - Deal workflows remained stuck in queue forever Fix: Call tryGrantWaiting() after resetting the window in resetWindowIfNeeded() to wake up and grant reservations to queued workflows that now have capacity. This ensures fair FIFO processing and prevents workflow starvation.
fix: hyperliquid wire format expects 8 decimals for price and size
This refactor addresses the critical design flaw where waiting workflows could not use freed capacity from AdjustDown/SignalComplete calls. **Problem:** Previously, tryGrantWaiting() would return immediately if activeReservation was non-nil, defeating the entire "early release" pattern: - produce:all-bots reserves 5 slots - produce:all-bots calls AdjustDown(2), freeing 3 slots - But tryGrantWaiting() couldn't grant to waiting deal workflows - All workflows were serialized, no concurrency **Solution:** - Changed from single `activeReservation *reservation` to multiple `activeReservations map[string]*reservation` - tryGrantWaiting() now grants to waiting workflows whenever there's capacity, regardless of existing active reservations - Added calculateTotalReserved() to sum slots across all reservations - Updated Reserve/Consume/AdjustDown/Extend/SignalComplete/Release to work with the map **Result:** When produce:all-bots adjusts down from 5→2 slots, the freed 3 slots are immediately available for deal workflows. Multiple workflows can now run concurrently, enabling true "early release" behavior per spec. Closes the issue raised in code review regarding workflow serialization.
Line 337 was trying to redeclare totalReserved with := but it was already declared at line 280 in the function scope. Changed to = for reassignment.
The Extend method was incorrectly using the Reserve capacity formula, which prevented reservations from growing beyond the per-window limit. **The Contract:** Reservations can span multiple windows. The rate limiter enforces that *consumption per window* doesn't exceed the limit, not that *total reservations* can't exceed the limit. **Old (incorrect) formula:** if l.consumed + totalReserved - res.slotsReserved + newReservation <= l.limit This prevented extending a reservation beyond the window limit, even when the additional consumption would span multiple windows. **New (correct) formula:** if l.consumed + newReservation - res.slotsConsumed <= l.limit This checks: "Can the additional slots I need (beyond what I've already consumed) fit in the current window's remaining capacity?" **Example (limit=10):** - Reserve 8 slots, consume all 8 in window 1 - Window resets: consumed=0, slotsConsumed=8 (persists with reservation) - Extend by 5 (total 13): Check 0 + 13 - 8 = 5 <= 10 ✓ - The 5 additional slots fit in the new window This allows workflows to have large total reservations that span multiple windows, while still enforcing per-window rate limits. Fixes TestLimiter_ExtendRequiresWindowReset
Updated rate_limit.adoc to accurately describe the implementation: **Core Principle Changes:** - Changed from "single active reservation" to "multiple concurrent reservations" - Clarified goal: guarantee sequential execution per workflow, not global serialization - Added "Sequential Execution Guarantee" mechanism **Key Updates:** 1. Principle (line 92): Now describes coordination vs serialization 2. Key Mechanisms (lines 110-126): - Multiple Concurrent Reservations (replaces Single Active Reservation) - Added Sequential Execution Guarantee - Added Cross-Window Reservations mechanism 3. Core State (lines 139-145): activeReservation → activeReservations map 4. Reserve operation (lines 157-162): Updated capacity check formula and behavior 5. AdjustDown (lines 180-182): Clarifies enables early release for concurrent workflows 6. Extend (lines 188-192): Documents cross-window formula and rationale 7. Release (lines 208-210): Multiple workflows may be granted 8. Window Reset (lines 218-221): slotsConsumed persists, immediate re-evaluation **Open Questions Answered:** - Question 3: Documented capacity formula with rationale - Question 4: Documented Extend cross-window formula with example **Example Updates:** - Line 496: Clarified both workflows have concurrent reservations The spec now accurately reflects that workflows get sequential execution guarantees (preventing thundering herd), while multiple workflows can run concurrently when capacity allows (via early release pattern).
…1CV2b6CpvF4vQWJtSJiSwW feat: implement ThreeCommas API rate limiting with workflow reservation system
|
#105 is not yet resolved but we must merge this first and test things before we do an additional change to logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| func (e *Engine) ProduceActiveDeals(ctx context.Context, q Queue) error { | ||
| // Rate limiting workflow: Reserve → Consume → AdjustDown → SignalComplete → Release | ||
| workflowID := "produce:all-bots" | ||
|
|
||
| // If rate limiter is configured, use the reservation pattern | ||
| if e.limiter != nil { | ||
| // Get current stats to determine pessimistic reservation | ||
| _, limit, _, _ := e.limiter.Stats() | ||
|
|
||
| // Reserve entire quota pessimistically (we don't know how many bots yet) | ||
| if err := e.limiter.Reserve(ctx, workflowID, limit); err != nil { | ||
| return fmt.Errorf("rate limit reserve: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid reserving entire rate limit quota when producing deals
The new rate‑limit workflow reserves limit slots before listing bots, but Limiter.Reserve only succeeds when consumed + totalReserved + count <= limit. Requesting the full window (count == limit) means the reservation can never be granted while any other workflow holds even a single reservation (each deal handler reserves two slots), so periodic resyncs block until all deal workers finish and release. This effectively stalls bot polling under normal load. Reserve only the remaining capacity or a bounded estimate instead of the entire limit.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as #105
12ad6f7
into
claude/fix-venue-wallet-unique-constraint-011CUyu1iovAvRm9q9iEP9cq
No description provided.