README.md: 35 additions & 33 deletions
@@ -324,12 +324,13 @@ AI providers can become unstable, return 5xx errors, or hit temporary rate limit
 
 **How OmniRoute solves it:**
 
-- **Settings-Driven Lock Hierarchy** — Provider profiles control default account/model lockouts, global model quarantine, and provider circuit breakers from one control surface, while explicit upstream `Retry-After` windows still take priority
-- **Exponential Backoff** — Progressive retry delays for both account/model lockouts and higher-level quarantine
+- **Request Queue & Pacing** — Per-connection request buckets smooth bursts before they hit upstream rate caps
+- **Connection Cooldown** — A single connection cools down after retryable failures with optional upstream `Retry-After` hints and exponential backoff
+- **Provider Circuit Breaker** — The provider only trips after fallback is exhausted and the provider request still fails with provider-wide transient errors; connection-scoped `429` rate limits stay in Connection Cooldown
+- **Wait For Cooldown** — The server can wait for the earliest connection cooldown to expire and retry the same client request automatically
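
Reviewer note: the Connection Cooldown line combines an optional upstream `Retry-After` hint with exponential backoff. Below is a minimal TypeScript sketch of one way such a window could be computed. The names `CooldownOptions`, `parseRetryAfterMs`, and `cooldownMs` are hypothetical and do not come from the OmniRoute codebase. The rule that a valid hint takes priority over backoff is an assumption carried over from the removed Settings-Driven Lock Hierarchy line.

```typescript
// Hypothetical sketch; none of these names are taken from OmniRoute.
interface CooldownOptions {
  baseMs: number; // cooldown after the first retryable failure
  maxMs: number;  // upper bound for the exponential backoff
}

// Retry-After is either delta-seconds ("7") or an HTTP-date.
// Returns the hinted wait in milliseconds, or null when absent or unparseable.
function parseRetryAfterMs(header: string | null, nowMs: number): number | null {
  if (header === null) return null;
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs);
}

// Assumption: a valid upstream hint wins; otherwise the window doubles
// with each consecutive failure, capped at maxMs.
function cooldownMs(
  failureCount: number,
  retryAfter: string | null,
  nowMs: number,
  opts: CooldownOptions
): number {
  const hinted = parseRetryAfterMs(retryAfter, nowMs);
  if (hinted !== null) return hinted;
  const exponent = Math.max(0, failureCount - 1);
  return Math.min(opts.maxMs, opts.baseMs * 2 ** exponent);
}
```

With `baseMs: 1000` and `maxMs: 60000`, three consecutive failures without a hint give a 4-second window, while a `Retry-After: 7` header gives 7 seconds regardless of the failure count.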
 - **Request Idempotency** — 5s deduplication window for identical requests
 - **Rate Limit Detection** — Per-provider RPM, min gap, and max concurrent tracking
-- **Editable Rate Limits** — Configurable defaults in Settings → Resilience with persistence
+- **Request Queue & Pacing** — Configurable queue, pacing, and concurrency defaults in Settings → Resilience
 - **API Key Validation Cache** — 3-tier cache for production performance
 - **Health Dashboard with Telemetry** — p50/p95/p99 latency, cache stats, uptime
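
Reviewer note: the diff does not show how pacing is implemented. The sketch below is only an illustration of how an RPM cap and a minimum gap could be combined into one "when may the next request start" decision. The `Pacer` class and its `reserve` method are hypothetical names, and the max-concurrent limit mentioned in the list is not modeled here.

```typescript
// Hypothetical sketch of RPM + min-gap pacing; not OmniRoute's queue.
class Pacer {
  // Scheduled start times (ms), ascending; at most `rpm` entries are kept.
  private starts: number[] = [];

  constructor(private readonly rpm: number, private readonly minGapMs: number) {
    if (rpm < 1) throw new Error("rpm must be at least 1");
  }

  // Reserves the next start slot and returns how long the caller should wait.
  reserve(nowMs: number): number {
    const last = this.starts.length > 0 ? this.starts[this.starts.length - 1] : -Infinity;
    // Respect the minimum gap since the previously scheduled start.
    let start = Math.max(nowMs, last + this.minGapMs);
    if (this.starts.length >= this.rpm) {
      // The rpm-th most recent start must be at least 60 s before this one.
      const anchor = this.starts[this.starts.length - this.rpm];
      start = Math.max(start, anchor + 60000);
    }
    this.starts.push(start);
    if (this.starts.length > this.rpm) this.starts.shift();
    return start - nowMs;
  }
}
```

With `rpm = 2` and `minGapMs = 100`, three back-to-back `reserve(0)` calls return waits of 0, 100, and 60000 ms: the first two are spaced by the gap, the third is pushed out by the per-minute cap.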
@@ -567,8 +568,8 @@ Teams need quick runtime changes during incidents or cost events.
 **How OmniRoute solves it:**
 
 - Switch combo activation directly from MCP dashboard
-- Apply resilience profiles from pre-defined policy packs
-- Reset circuit breaker state from the same operations panel
+- Tune queue, cooldown, breaker, and wait settings from the dedicated Resilience page
+- Review live provider breaker state from the Health dashboard
 
 </details>
 
@@ -1352,7 +1353,7 @@ OmniRoute v3.6 is built as an operational platform, not just a relay proxy.
 | 🔢 **Hybrid Token Counting** | Uses provider-side `/messages/count_tokens` when available; falls back to estimation — accurate usage tracking without guessing |
 | 🌱 **Model Alias Auto-Seed** | 30+ cross-proxy dialect aliases normalised at startup — no more routing mismatches |
 | 🛡️ **Safe Outbound Fetch** | All provider validation and model discovery go through a guarded fetch layer blocking private/local URLs with retry, timeout, and SSRF protection |
-| 🔄 **Cooldown-Aware Retries** | Chat requests auto-retry on model-scoped cooldowns with configurable `requestRetry` and `maxRetryIntervalSec` |
+| ⏳ **Wait For Cooldown** | Server-side chat retries when every candidate connection is cooling down; configurable `enabled`, `maxRetries`, and `maxRetryWaitSec` |
 | 🔍 **Runtime Env Validation** | Startup validates all env vars with Zod schemas — clear errors for missing secrets, invalid URLs, or wrong types |
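
Reviewer note: the Wait For Cooldown row names three settings (`enabled`, `maxRetries`, `maxRetryWaitSec`) but the diff does not spell out their exact semantics. The sketch below shows one plausible reading: wait for the earliest cooldown to expire, but give up when retries are exhausted or the wait would exceed the limit. The function `nextWaitMs`, the `cooldownUntilMs` array, and the interpretation of `maxRetryWaitSec` as a per-wait limit are all assumptions.

```typescript
// Field names come from the README row; everything else is hypothetical.
interface WaitForCooldownConfig {
  enabled: boolean;
  maxRetries: number;
  maxRetryWaitSec: number;
}

// Called when every candidate connection is cooling down.
// `cooldownUntilMs` holds one expiry timestamp per candidate connection.
// Returns the time to wait before retrying, or null if the request should fail now.
function nextWaitMs(
  cfg: WaitForCooldownConfig,
  retriesSoFar: number,
  cooldownUntilMs: number[],
  nowMs: number
): number | null {
  if (!cfg.enabled) return null;
  if (retriesSoFar >= cfg.maxRetries) return null;
  if (cooldownUntilMs.length === 0) return null;
  // Wait only for the earliest expiry, as the README describes.
  const earliest = Math.min(...cooldownUntilMs);
  const waitMs = Math.max(0, earliest - nowMs);
  // Assumption: a wait longer than maxRetryWaitSec is not worth holding the client for.
  return waitMs <= cfg.maxRetryWaitSec * 1000 ? waitMs : null;
}
```

Returning `null` instead of clamping the wait keeps the caller's choice explicit: it can surface the upstream error to the client rather than retry while all connections are still cooling down.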
@@ -794,7 +792,7 @@ legacy compatibility. The current runtime contract uses:
 
 ## 1) Account/Provider Availability
 
-- provider account cooldown on transient/rate/auth errors
+- connection cooldown on retryable upstream failures
 - account fallback before failing request
 - combo model fallback when current model/provider path is exhausted
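
Reviewer note: the three bullets in this hunk describe an ordering (skip connections that are cooling down, try other accounts for the same model, then move to the next combo model). The sketch below shows that ordering in its simplest form. The types `Connection` and `ComboStep` and the function `pickRoute` are hypothetical and are not taken from `combo.ts` or `chat.ts`.

```typescript
// Hypothetical data shapes for illustration only.
interface Connection {
  id: string;
  cooldownUntilMs: number; // 0 when the connection is not cooling down
}

interface ComboStep {
  model: string;
  connections: Connection[]; // accounts able to serve this model
}

// Walks combo steps in order. Within a step, the first connection whose
// cooldown has expired wins (account fallback); when a step has none,
// the next step is tried (combo model fallback).
function pickRoute(
  steps: ComboStep[],
  nowMs: number
): { model: string; connectionId: string } | null {
  for (let i = 0; i < steps.length; i++) {
    const step = steps[i];
    for (let j = 0; j < step.connections.length; j++) {
      const conn = step.connections[j];
      if (conn.cooldownUntilMs <= nowMs) {
        return { model: step.model, connectionId: conn.id };
      }
    }
  }
  return null; // every path is cooling down
}
```

A `null` result is the point where a wait-for-cooldown policy, if enabled, would decide whether to hold the request or fail it.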
@@ -876,7 +874,7 @@ Environment variables actively used by code:
 5. The `open-sse/` directory is published as the `@omniroute/open-sse` **npm workspace package**. Source code imports it via `@omniroute/open-sse/...` (resolved by Next.js `transpilePackages`). File paths in this document still use the directory name `open-sse/` for consistency.
 6. Charts in the dashboard use **Recharts** (SVG-based) for accessible, interactive analytics visualizations (model usage bar charts, provider breakdown tables with success rates).
 7. E2E tests use **Playwright** (`tests/e2e/`), run via `npm run test:e2e`. Unit tests use **Node.js test runner** (`tests/unit/`), run via `npm run test:unit`. Source code under `src/` is **TypeScript** (`.ts`/`.tsx`); the `open-sse/` workspace remains JavaScript (`.js`).
-8. Settings page is organized into 5 tabs: Security, Routing (6 global strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized), Resilience (editable rate limits, circuit breaker, policies, **Context Relay** handoff config), AI (thinking budget, system prompt, prompt cache), Advanced (proxy).
+8. Settings page is organized into 7 tabs: General, Appearance, AI, Security, Routing, Resilience, Advanced. The Resilience page only configures request queue, connection cooldown, provider breaker, and wait-for-cooldown behavior; live breaker runtime state is shown on the Health page.
 9. **Context Relay** strategy (`context-relay`) is split across two layers: `combo.ts` decides if a handoff should be generated, `chat.ts` injects the handoff after account resolution. Handoff data lives in `context_handoffs` SQLite table. This split is intentional because only `chat.ts` knows whether the actual account changed.
 10. **Proxy enforcement** is now comprehensive: `tokenHealthCheck.ts` resolves proxy per connection, `/api/providers/validate` uses `runWithProxyContext`, and `proxyFetch.ts` uses `undici.fetch()` to maintain dispatcher compatibility on Node 22.
 11. **Node.js runtime policy detection**: `/api/settings/require-login` returns `nodeVersion` and `nodeCompatible` fields. The login page renders a warning banner when the runtime falls outside the supported secure Node.js lines.