stringrouter
stringrouter binary
$ cargo run --bin stringrouter
Environment variables
- STROUTER_CONTROL_LISTEN_ADDRESS: ip:port on which the control socket will listen.
  The control socket accepts newline-separated JWTs which control the forwarding information base (FIB), that is, the table which determines where connections are routed to.
  If not set, there is no way to control the router from the outside.
- STROUTER_CONTROL_KEY (required if STROUTER_CONTROL_LISTEN_ADDRESS is set): HMAC key used to verify JWTs.
- STROUTER_CONTROL_EXPOSE_METRICS: if set to true, 1 or yes, the control port will also be an HTTP endpoint, which exposes OpenMetrics-compatible metrics on GET /metrics.
- STROUTER_METRICS_LISTEN_ADDRESS: ip:port on which a dedicated HTTP server will listen.
  That HTTP server will only expose the GET /metrics endpoint, serving OpenMetrics-compatible metrics.
  This can be enabled in addition to STROUTER_CONTROL_EXPOSE_METRICS.
  If unset, no dedicated metrics HTTP server will be started.
- STROUTER_FIB_SWEEP_INTERVAL_MS (default: 60000): interval, in milliseconds, at which expired entries are removed from the FIB.
  Expired entries aren't used, but still consume memory until the sweep finds them.
- STROUTER_FIB_SWEEP_BATCH_SIZE (default: 100): number of server name keys in the FIB which are swept at once.
  Large FIBs may take too long to sweep if they're processed in a single round. As sweeping needs a write lock on the FIB, it blocks accepting new connections.
  Hence, sweeps are split into batches of STROUTER_FIB_SWEEP_BATCH_SIZE server name entries. One batch is processed each STROUTER_FIB_SWEEP_MINOR_INTERVAL_MS during a sweep.
- STROUTER_FIB_SWEEP_MINOR_INTERVAL_MS (default: a 100th of STROUTER_FIB_SWEEP_INTERVAL_MS): step interval for sweeping.
  See STROUTER_FIB_SWEEP_BATCH_SIZE.
- STROUTER_LISTEN_ADDRESS (required): the ip:port where the router will listen for incoming connections.
- STROUTER_CONNECTION_LIMIT: maximum number of connections which the router will serve in parallel. If this limit is reached, no new connections are accepted.
  If unset, the limit defaults to the NOFILE rlimit, minus 24, divided by two (as each connection needs two FDs and we need some headroom for connections to the control and metrics ports).
- STROUTER_PEEK_DEFAULT: if set to true, yes, 1, or on, connections where peeking into the TLS Client Hello fails will be treated as connections without SNI and without ALPN, instead of being dropped.
  This is mainly meant for debugging and benchmarking purposes (as iperf3 does not send a Client Hello, of course).
- STROUTER_PEEK_TIMEOUT_MS (default: 1000): maximum time spent waiting for the client to produce a valid Client Hello.
  If a client does not produce a valid and complete Client Hello in this time, the connection will be dropped (even if STROUTER_PEEK_DEFAULT is set) and the stringrouter_connection_result_total{status="peek-timeout"} counter is increased.
- STROUTER_PEEK_READ_TIMEOUT_MS: if set, this is the maximum time spent on reads while peeking into the Client Hello.
  This can be thought of as an upper bound on the round-trip time of the client while at the same time being an early defense against things trying to hit our connection limit.
- STROUTER_CONNECT_TIMEOUT_MS (default: 1000): maximum time spent connecting to a backend.
  If the backend connection does not succeed within the given time, the connection is dropped and the stringrouter_connection_result_total{status="connect-timeout"} counter is increased.
- STROUTER_PERSIST_FIB: if set, the FIB is loaded from disk on boot and persisted to disk on shutdown.
  Must be the path to the FIB file. If the FIB file does not exist, it will be created on shutdown.
  If the FIB file exists but cannot be read, or if the location the path points to is inaccessible, stringrouter fails to start.
- STROUTER_PERSIST_FIB_INTERVAL_SECONDS: if set, the FIB is written to disk at the given interval.
  Has no effect if STROUTER_PERSIST_FIB is not set.
  Errors while writing the FIB are logged, but do not block operation of stringrouter. Writing happens in a dedicated thread so as not to block accepting connections, but a slow disk may block changes to the FIB (as changes are not possible while the FIB is being saved).
  Errors increase the stringrouter_persistence_write_failures_total metric.
  When the FIB gets persisted successfully, the stringrouter_persistence_last_write_timestamp metric is set to the current UNIX timestamp.
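Some of the defaults in the list of environment variables involve a little arithmetic. The following Rust sketch restates them for illustration only; it is not taken from the stringrouter source. All function names are hypothetical, rounding by integer division is an assumption, and the sweep duration is only a rough estimate that assumes exactly one batch is processed per minor interval.

```rust
/// File descriptors kept as headroom for the control and metrics ports
/// (the value 24 is taken from the STROUTER_CONNECTION_LIMIT description).
const FD_HEADROOM: u64 = 24;

/// Default connection limit: (NOFILE rlimit - 24) / 2, because every
/// connection needs two file descriptors. Integer rounding is an assumption.
fn default_connection_limit(nofile_limit: u64) -> u64 {
    nofile_limit.saturating_sub(FD_HEADROOM) / 2
}

/// Default minor interval: a 100th of the sweep interval.
fn default_minor_interval_ms(sweep_interval_ms: u64) -> u64 {
    sweep_interval_ms / 100
}

/// Number of batches needed to visit `names` server names once
/// (ceiling division; a batch size of 0 is treated as 1 to avoid dividing by zero).
fn sweep_batches(names: u64, batch_size: u64) -> u64 {
    let batch_size = batch_size.max(1);
    (names + batch_size - 1) / batch_size
}

/// Rough estimate of how long one full sweep takes if one batch is
/// processed per minor interval.
fn estimated_sweep_duration_ms(names: u64, batch_size: u64, minor_interval_ms: u64) -> u64 {
    sweep_batches(names, batch_size) * minor_interval_ms
}
```

Under these assumptions, a NOFILE limit of 1024 gives a default connection limit of 500. With the documented defaults (sweep interval 60000 ms, batch size 100), the minor interval is 600 ms, so a FIB with 10000 server names would need 100 batches, or roughly 60 seconds, for one full sweep.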
Metrics
Metrics are available via the control socket (if enabled and if
STROUTER_CONTROL_EXPOSE_METRICS is enabled) or via the metrics socket (if
enabled) by sending an HTTP request for the /metrics resource.
Metrics are emitted in an OpenMetrics-compatible format.
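As an illustration of such a request, here is a minimal Rust sketch that fetches /metrics using only the standard library. The function names are hypothetical and not part of stringrouter. The sketch assumes that the server closes the connection after responding (it sends Connection: close) and that the response body is not chunked; a real deployment would more likely use curl or a Prometheus-style scraper.

```rust
use std::io::{self, Read, Write};
use std::net::TcpStream;

/// Builds a minimal HTTP/1.1 request for the /metrics resource.
fn metrics_request(host: &str) -> String {
    format!(
        "GET /metrics HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n",
        host
    )
}

/// Extracts the body from a raw HTTP response, requiring a 200 status.
fn metrics_body(response: &str) -> Result<&str, String> {
    let (head, body) = response
        .split_once("\r\n\r\n")
        .ok_or_else(|| "incomplete HTTP response".to_string())?;
    // The status line looks like `HTTP/1.1 200 OK`; the second token is the code.
    let status_line = head.lines().next().unwrap_or("");
    if status_line.split_whitespace().nth(1) != Some("200") {
        return Err(format!("unexpected status line: {}", status_line));
    }
    Ok(body)
}

/// Connects to `addr` (an ip:port string), requests /metrics and returns the body.
/// Assumes the server closes the connection once the response is sent.
fn fetch_metrics(addr: &str) -> io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(metrics_request(addr).as_bytes())?;
    let mut raw = String::new();
    stream.read_to_string(&mut raw)?;
    metrics_body(&raw)
        .map(str::to_owned)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

Calling fetch_metrics with the value of STROUTER_METRICS_LISTEN_ADDRESS (or the control address, if STROUTER_CONTROL_EXPOSE_METRICS is enabled) should then return the metrics text.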
The following metrics exist:
- stringrouter_sent_bytes_total{direction="frontend"} (counter): Total number of bytes ever sent by this stringrouter instance in the direction of clients.
  This is equivalent to the number of bytes received from backends, hence that metric is not exposed separately.
- stringrouter_sent_bytes_total{direction="backend"} (counter): Total number of bytes ever sent by this stringrouter instance in the direction of backends.
  This is equivalent to the number of bytes received from clients, hence that metric is not exposed separately.
- stringrouter_backend_connect_time_seconds (histogram): Histogram of the time it takes to connect to any backend.
  This effectively is the TLS handshake latency observed by clients after the Client Hello and may point you at the existence of slow or unhealthy backends.
- stringrouter_peek_time_seconds (histogram): Histogram of the time it takes to peek into the client's Client Hello message.
  May be useful to tune peek timeouts and observe handshake parsing performance.
- stringrouter_connections_accepted_total (counter): Total number of (data) connections this stringrouter instance has ever accepted.
  This should normally equal sum(stringrouter_connections_closed_total) + sum(stringrouter_connections{state!="available"}), barring race conditions in exposing the counters.
  A sustained or growing difference between the two indicates software bugs.
- stringrouter_connections_closed_total{reason=..} (counters): Every closed client connection increases one of the metrics of this family.
  The following values for reason are defined:
  - peek-failure: The Client Hello failed to decode or was sent incompletely (but did not time out). Note that this is never increased if STROUTER_PEEK_DEFAULT is set, as peek failures are masked by that flag.
  - peek-timeout: The client did not provide the Client Hello within the configured timeouts (see STROUTER_PEEK_TIMEOUT_MS and STROUTER_PEEK_READ_TIMEOUT_MS).
  - no-fib-entry: No FIB entry matched the incoming connection's information.
  - connect-error: The connect() call to the backend failed (but not with a timeout).
  - connect-timeout: The connect() call to the backend did not complete within STROUTER_CONNECT_TIMEOUT_MS or hit an OS-defined timeout.
  - forward-error: While forwarding data between the backend and the client, a read or write error occurred and the connection was dropped.
  - ok: The connection was closed cleanly by both sides.
- stringrouter_connections{state=..} (gauges): Represents the status of the pool of connections available to the stringrouter.
  The sum of all of the metrics in this family should always be equal to the defined connection limit (see STROUTER_CONNECTION_LIMIT), but due to race conditions, it may not add up when under high load. This is only an issue with data collection, however. The stringrouter never goes beyond the connection limit.
  The following states are defined:
  - available: Ready to accept a new client connection.
  - peeking: Waiting for and parsing the Client Hello.
  - connecting-backend: Connecting to the backend.
  - sending-header: Sending the Client Hello to the backend.
  - forwarding: Bi-directional forwarding between backend and client.
- stringrouter_connection_limit_stalls_total (counter): Counts the number of times this stringrouter has reached the connection limit while preparing to accept a connection.
  When this counter increases, it is very likely that clients are experiencing increased connection setup latency.
- stringrouter_fib_names_total: Total number of server names in the FIB.
- stringrouter_fib_entries_total: Total number of backends in the FIB.
  This may be less than stringrouter_fib_names_total during sweeps, as unused server names are only cleared out at the end of the sweep.
- stringrouter_persistence_write_failures_total: Total number of failures to write the FIB to disk.
  This is only increased if STROUTER_PERSIST_FIB_INTERVAL_SECONDS and STROUTER_PERSIST_FIB are both set.
- stringrouter_persistence_last_write_timestamp: UNIX timestamp of the last successful write of the FIB.
  This metric is only exposed after the first successful write and thus only exists if both STROUTER_PERSIST_FIB_INTERVAL_SECONDS and STROUTER_PERSIST_FIB are set.
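The consistency relation given for stringrouter_connections_accepted_total can be checked against a scraped metrics page. The Rust sketch below is one hypothetical way to do that and is not part of stringrouter. It assumes a simplified exposition format in which every sample line is `name{labels} value` with no trailing timestamp; the helper names are made up for this example.

```rust
/// Sums the values of all samples whose metric name is exactly `family`,
/// skipping samples whose label set contains `exclude`
/// (for example `state="available"`).
/// Assumes sample lines have no trailing timestamp.
fn sum_family(exposition: &str, family: &str, exclude: Option<&str>) -> f64 {
    exposition
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| {
            // Split `<name>[{labels}] <value>` at the last space.
            let (series, value) = line.rsplit_once(' ')?;
            let (name, labels) = match series.split_once('{') {
                Some((name, rest)) => (name, rest.trim_end_matches('}')),
                None => (series, ""),
            };
            if name != family {
                return None;
            }
            if exclude.map_or(false, |needle| labels.contains(needle)) {
                return None;
            }
            value.parse::<f64>().ok()
        })
        .sum()
}

/// accepted - (closed + in-flight). A value that stays at or near zero is
/// expected; a sustained or growing difference hints at a bug.
fn accepted_drift(exposition: &str) -> f64 {
    let accepted = sum_family(exposition, "stringrouter_connections_accepted_total", None);
    let closed = sum_family(exposition, "stringrouter_connections_closed_total", None);
    let in_flight = sum_family(
        exposition,
        "stringrouter_connections",
        Some("state=\"available\""),
    );
    accepted - (closed + in_flight)
}
```

Because the counters are exposed with possible race conditions, a single non-zero reading is not meaningful on its own; comparing the drift across several scrapes is more informative.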
strouterctl binary
$ cargo run --bin strouterctl -- --help
Environment variables
The strouterctl binary uses the STROUTER_CONTROL_LISTEN_ADDRESS
environment variable as an optional educated guess as to where the
stringrouter may be listening.
The STROUTER_CONTROL_KEY environment variable is required.
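For readers writing their own client, the following Rust sketch shows how these two variables might be resolved and how tokens could be framed for the control socket, which accepts newline-separated JWTs. It is an assumption-laden illustration and not the strouterctl implementation: the type and function names are hypothetical, and creating and signing the JWTs (including their claims) is out of scope because the token contents are not described here.

```rust
use std::io::{self, Write};

/// Control settings as a client might resolve them from the environment.
#[derive(Debug, PartialEq)]
struct ControlSettings {
    /// Optional hint for where the stringrouter control socket listens.
    address: Option<String>,
    /// HMAC key; required.
    key: String,
}

/// Resolves the settings through `lookup`, which maps a variable name to
/// its value. Taking a closure keeps the function easy to test.
fn resolve_settings<F>(lookup: F) -> Result<ControlSettings, String>
where
    F: Fn(&str) -> Option<String>,
{
    let key = lookup("STROUTER_CONTROL_KEY")
        .ok_or_else(|| "STROUTER_CONTROL_KEY is required".to_string())?;
    Ok(ControlSettings {
        address: lookup("STROUTER_CONTROL_LISTEN_ADDRESS"),
        key,
    })
}

/// Writes already-signed tokens to `out`, one per line.
fn send_tokens<W: Write>(out: &mut W, tokens: &[&str]) -> io::Result<()> {
    for token in tokens {
        out.write_all(token.as_bytes())?;
        out.write_all(b"\n")?;
    }
    out.flush()
}
```

In a real program, the lookup closure could be `|name| std::env::var(name).ok()` and `out` could be a `std::net::TcpStream` connected to the control address.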