Description
No duplicates 🥲.
- I have searched for a similar issue in our bug tracker and didn't find any solutions.
What happened?
RoadRunner can fail to start when the Kafka driver is configured with a TLS timeout: the connection to the Kafka cluster fails with an i/o timeout.
The problem arises when the kafka driver is used in a jobs pipeline with a TLS timeout duration specified. The docs specify, and the server accepts, a duration such as "10s"; however, kafkajobs/config.go treats tls.timeout as an integer number of seconds rather than a duration:
if c.TLS.Timeout != 0 {
	netDialer.Timeout = c.TLS.Timeout * time.Second
}
Because the value is already a duration, multiplying it by time.Second can overflow the underlying int64 and yield a negative timeout, which causes instant i/o timeouts when connecting. Removing the timeout from the config was sufficient in my case, since it only specified the default value of "10s". Patching this allows the Kafka driver to start for the jobs plugin even with a timeout specified.
Version (rr --version)
rr --version
rr version 2024.3.5 (build time: 2025-02-27T17:24:29+0000, go1.24.0), OS: linux, arch: amd64
How to reproduce the issue?
version: '3'
server:
  relay: pipes
  command: 'php <redacted>'
  env:
    APP_ENV: local
    APP_BASE_PATH: <redacted>
    LARAVEL_OCTANE: '1'
logs:
  mode: development
  level: debug
  encoding: console
  channels:
    http:
      mode: development
      level: debug
      encoding: console
      output: ["stdout"]
      err_output: ["stderr"]
    server:
      mode: development
      level: debug
      encoding: console
      output: ["stdout"]
      err_output: ["stdout"]
    rpc:
      mode: development
      level: debug
      encoding: console
      output: ["stderr"]
      err_output: ["stdout"]
http:
  address: 0.0.0.0:8080
  access_logs: true
  max_request_size: 64
  # Middlewares for the http plugin, order is important. Allowed values is: "headers", "gzip", "static", "sendfile", [SINCE 2.6] -> "new_relic", [SINCE 2.6] -> "http_metrics", [SINCE 2.7] -> "cache"
  middleware: [ "static", "gzip" ]
  static:
    dir: "<redacted>"
  pool:
    debug: false
    num_workers: 5
    max_jobs: 50
    max_queue_size: 100
    allocate_timeout: 2s
    reset_timeout: 60s
    stream_timeout: 60s
    destroy_timeout: 30s
    dynamic_allocator:
      max_workers: 50
      spawn_rate: 10
      idle_timeout: 60s
    supervisor:
      watch_tick: 1s
      ttl: 0s
      idle_ttl: 1h
      max_worker_memory: 128
      exec_ttl: 60s
endure:
  # How long to wait for stopping.
  grace_period: 30s
  # Logging level. Possible values: "debug", "info", "warn", "error", "panic", "fatal".
  log_level: error
rpc:
  listen: tcp://127.0.0.1:6001
kafka:
  brokers: [ "<redacted>:9092" ]
  ping:
    timeout: "10s"
  tls:
    timeout: "10s" # Buggy field
    root_ca: "/etc/ssl/certs/ca-certificates.crt"
    client_auth_type: require_any_client_cert
  sasl:
    # <redacted>
jobs:
  num_pollers: 1
  pipeline_size: 100000
  pool:
    num_workers: 6
    max_jobs: 0
    allocate_timeout: 2s
    destroy_timeout: 30s
  pipelines:
    kafka-pipeline:
      driver: kafka
      config:
        producer_options:
          required_acks: AllISRAck
          request_timeout: 5s
          delivery_timeout: 100s
        group_options:
          group_id: some-group-id
Serve RoadRunner as usual (in this case it was hosting an Octane project) and connect to a TLS-capable Kafka cluster:
rr -e -c=/etc/rr/.rr.yml serve
Relevant log output
2025-04-28T23:29:28+0000 DEBUG rpc plugin was started {"address": "tcp://127.0.0.1:6001", "list of the plugins with RPC methods:": ["status", "resetter", "app", "jobs", "informer", "lock"]}
2025-04-28T23:29:28+0000 DEBUG jobs initializing driver {"pipeline": "kafka-pipeline", "driver": "kafka"}
2025-04-28T23:29:28+0000 DEBUG kafka.kgo opening connection to broker {"kgo_driver": "addr", "kgo_driver": "<redacted>:9092", "kgo_driver": "broker", "kgo_driver": "seed_0"}
2025-04-28T23:29:28+0000 WARN kafka.kgo unable to open connection to broker {"kgo_driver": "addr", "kgo_driver": "<redacted>:9092", "kgo_driver": "broker", "kgo_driver": "seed_0", "kgo_driver": "err", "kgo_driver": "dial tcp: lookup <redacted>: i/o timeout"}
2025-04-28T23:29:28+0000 DEBUG kafka.kgo opening connection to broker {"kgo_driver": "addr", "kgo_driver": "<redacted>:9092", "kgo_driver": "broker", "kgo_driver": "seed_0"}
2025-04-28T23:29:28+0000 WARN kafka.kgo unable to open connection to broker {"kgo_driver": "addr", "kgo_driver": "<redacted>:9092", "kgo_driver": "broker", "kgo_driver": "seed_0", "kgo_driver": "err", "kgo_driver": "dial tcp: lookup <redacted>: i/o timeout"}
2025-04-28T23:29:28+0000 ERROR jobs failed to initialize driver {"pipeline": "kafka-pipeline", "driver": "kafka", "error": "kafka_ping: unable to dial: dial tcp: lookup <redacted>: i/o timeout"}
handle_serve_command: Function call error:
serve error from the plugin *jobs.Plugin stopping execution, error: jobs_plugin_serve: kafka_ping: unable to dial: dial tcp: lookup <redacted>: i/o timeout
exited with code 1