Description
What would you like to be added:
Consider changing the default behavior of the forward plugin so that a single DNS request has a bounded number of upstream connection attempts, by assigning a small non‑zero default to the max_connect_attempts option, instead of the current default of 0 (“no per‑request cap”).
Concretely, once #7722 is merged, change the default of max_connect_attempts from 0 to a small non‑zero value. This moves the plugin from unbounded attempts until the deadline to a sane bounded default.
Why is this needed:
- With the current default behaviour and a TCP query (or force_tcp enabled), a single query to a fast‑failing upstream (e.g. one returning immediate TCP RST) can trigger thousands of Proxy.Connect calls before the per‑query deadline is hit. pprof shows a large fraction of CPU time spent in syscall thrashing.
- Although the new max_connect_attempts knob allows users to opt out of this behavior, the default remains surprising and can lead to sustained high CPU usage and latency spikes in real deployments (especially in constrained environments like Kubernetes pods with low CPU limits).
- Bounding per‑request retries to a small default would still preserve the intent of trying multiple upstreams, while preventing a single failing upstream from hoarding CPU time. This improves robustness out of the box.
See #7704 (review) for background info.