README.md (+30 -1): 30 additions & 1 deletion
@@ -180,7 +180,7 @@ Every successful request returns JSON with the **full resolved configuration**
 }
 ```

-Optional fields appear only when non-zero: `burst_size` (token_bucket), `window_seconds` (sliding_window), `queue_timeout`, `dynamic`.
+Optional fields appear only when non-zero: `burst_size` (token_bucket), `window_seconds` (sliding_window), `queue_timeout`, `latency_compensation`, `network_latency_ms`, `dynamic`.

 | Field | Description |
 |-----------------|-------------|
@@ -190,6 +190,8 @@ Optional fields appear only when non-zero: `burst_size` (token_bucket), `window_
 | `max_queue_size`| Maximum queue capacity |
 | `overflow` | What happens when queue is full (`reject` or `block`) |
 | `dynamic` | `true` if this endpoint was auto-created from an unconfigured path |
+| `latency_compensation` | Configured latency compensation in ms |
+| `network_latency_ms` | One-way network latency computed from `X-Sent-At` header (present only when header is sent) |

 When the queue is full (`overflow: reject`) or the estimated wait exceeds `queue_timeout`, rls returns HTTP 429:
@@ -244,6 +246,33 @@ Set `queue_timeout` (seconds) to reject requests upfront when the predicted wait
 Clients can override per-request with the `?timeout=N` query parameter (e.g. `?timeout=999` to effectively disable). A value of `0` (default) disables the check entirely. The timeout prediction is skipped for `lifo` and `random` schedulers where wait time is unpredictable.
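
As a rough illustration of the check described above, the sketch below models the upfront rejection decision. The function names, the `fifo` scheduler label used in the usage lines, and the way the per-request override is passed in are assumptions made for illustration; this is not rls's actual implementation.

```python
from typing import Optional

# Schedulers for which the text says the wait-time prediction is skipped.
UNPREDICTABLE_SCHEDULERS = {"lifo", "random"}


def effective_timeout(configured_s: float, override_s: Optional[float] = None) -> float:
    """Return the timeout in seconds, preferring a per-request ?timeout=N override."""
    return configured_s if override_s is None else override_s


def rejects_upfront(predicted_wait_s: float, timeout_s: float, scheduler: str) -> bool:
    """Decide whether a request would be rejected before queueing (hypothetical model)."""
    if timeout_s == 0:
        # A value of 0 disables the check entirely.
        return False
    if scheduler in UNPREDICTABLE_SCHEDULERS:
        # Wait time cannot be predicted, so the check is skipped.
        return False
    return predicted_wait_s > timeout_s


# Usage: a 5 s predicted wait against a 2 s timeout, then with ?timeout=999.
print(rejects_upfront(5.0, effective_timeout(2.0), "fifo"))
print(rejects_upfront(5.0, effective_timeout(2.0, 999), "fifo"))
```

Under these assumptions, a very large override such as `999` makes the comparison effectively never trigger, which matches the "effectively disable" wording above.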

+### Latency compensation
+
+When a client calls rls and then the target API, the total delay includes the network round-trip to rls. Set `latency_compensation` (ms) to release tickets early, so the actual API call hits the target closer to the ideal rate interval:
+
+```yaml
+defaults:
+  latency_compensation: 20  # compensate for 20ms one-way network latency
+
+endpoints:
+  - path: "/api"
+    rate: 10
+    latency_compensation: 15  # per-endpoint override
+```
+
+Formula: `effective_interval = max(1ms, 1/rate - compensation_ms/1000)`. At 10 RPS (100ms interval) with 20ms compensation, the effective interval becomes 80ms (12.5 effective RPS). Defaults to `0` (no compensation, identical behavior to before).
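
The formula above can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only; the function name `effective_interval_s` is hypothetical and not part of rls.

```python
def effective_interval_s(rate: float, compensation_ms: float = 0.0) -> float:
    """Interval between released tickets, in seconds.

    Mirrors the stated formula: max(1ms, 1/rate - compensation_ms/1000).
    """
    return max(0.001, 1.0 / rate - compensation_ms / 1000.0)


# Worked example from the text: 10 RPS with 20 ms compensation.
interval = effective_interval_s(10, 20)
print(f"interval: {interval * 1000:.1f} ms, effective rate: {1 / interval:.1f} RPS")

# With compensation larger than the base interval, the 1 ms floor applies.
print(effective_interval_s(10, 150))
```

The 1 ms floor keeps the interval positive when the configured compensation is as large as or larger than `1/rate`.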
+
+### `X-Sent-At` header
+
+Clients can send `X-Sent-At: <unix_milliseconds>` to measure one-way network latency. The server computes `network_latency_ms = now - sent_at` and includes it in the response for observability:
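
A minimal sketch of the two sides of this measurement, assuming the client stamps the header with its own wall clock. The helper names `sent_at_header` and `network_latency_ms` are hypothetical, and the sketch models only the subtraction described above, not the response format.

```python
import time
from typing import Dict, Optional


def sent_at_header(now_ms: Optional[int] = None) -> Dict[str, str]:
    """Build the X-Sent-At header from the client's clock (Unix milliseconds)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return {"X-Sent-At": str(now_ms)}


def network_latency_ms(server_now_ms: int, sent_at_ms: int) -> int:
    """Server-side computation as described: now - sent_at."""
    return server_now_ms - sent_at_ms


# Usage with fixed timestamps so the arithmetic is easy to follow.
headers = sent_at_header(1_700_000_000_000)
latency = network_latency_ms(1_700_000_000_020, int(headers["X-Sent-At"]))
print(headers, latency)
```

Because the two timestamps come from different machines, the result is only as accurate as the clock synchronization between client and server.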