
rubenfiszel · 2026-05-11T15:57:22Z

Summary

Client::query_typed_raw (and the streaming-prepare code path) deadlocks when:

1. The result schema contains a column whose OID is not a built-in
   tokio_postgres::types::Type (citext, custom enums / domains, postgis
   geometry, etc.), AND
2. The typeinfo cache is cold for that OID, AND
3. The result set is large enough to span more than one BackendMessages
   frame on the wire (≈100+ small rows on localhost in repro).

Head-of-line blocking: query::query_typed calls get_type(client, oid).await
synchronously while it still holds the original query's Responses stream
(tokio-postgres/src/query.rs):

Message::RowDescription(row_description) => {
    let mut columns: Vec<Column> = vec![];
    let mut it = row_description.fields();
    while let Some(field) = it.next().map_err(Error::parse)? {
        let type_ = get_type(client, field.type_oid()).await?;  // <-- blocks here
        ...
    }
    return Ok(RowStream { ... });
}

The original query's DataRows back up in the per-request Responses
mpsc::channel(1). Once full, Connection::poll_read parks the next
BackendMessages frame in pending_responses and returns Ok(None) (the
Poll::Pending branch around connection.rs:159). The wire stops being
drained, but the typeinfo sub-query's response is queued on the same socket
behind those DataRows — it never arrives, and the outer get_type await
never completes.
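
The stall described above can be reproduced in a toy model that uses only the standard library. This is a minimal sketch, not tokio-postgres code: `Frame`, `typeinfo_reply_reached`, and the `VecDeque` standing in for the socket are all hypothetical names, and `std::sync::mpsc::sync_channel(1)` stands in for the per-request `mpsc::channel(1)`. It assumes the consumer never reads the outer channel, because it is still awaiting `get_type`.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{sync_channel, TrySendError};

/// One frame on the simulated wire, tagged by the request it answers.
#[derive(Clone, Copy)]
enum Frame {
    /// A frame of DataRows for the outer query.
    OuterRows,
    /// The reply to the nested typeinfo lookup.
    TypeInfoReply,
}

/// Returns true if the typeinfo reply is ever read off the wire.
/// `outer_frames` is how many frames the outer result set spans.
fn typeinfo_reply_reached(outer_frames: usize) -> bool {
    // The server answers requests in order, so the typeinfo reply
    // sits behind every frame of the outer result set.
    let mut wire = VecDeque::from(vec![Frame::OuterRows; outer_frames]);
    wire.push_back(Frame::TypeInfoReply);

    // Capacity-1 channel for the outer query. Nothing reads `_rx`:
    // the consumer is modelled as still waiting on the typeinfo lookup.
    let (tx, _rx) = sync_channel::<Frame>(1);

    while let Some(frame) = wire.pop_front() {
        match frame {
            Frame::TypeInfoReply => return true,
            Frame::OuterRows => {
                if let Err(TrySendError::Full(_parked)) = tx.try_send(frame) {
                    // Channel full: the frame is parked and the read
                    // loop stops, so the wire is no longer drained.
                    return false;
                }
            }
        }
    }
    false
}
```

In this model `typeinfo_reply_reached(1)` returns `true` and `typeinfo_reply_reached(2)` returns `false`, which mirrors the condition that the result must span more than one frame before the deadlock appears.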

Two fix variants

This PR is the minimal variant. There's a more involved alternative in
#1349 that preserves the bounded-channel API. The two are mutually
exclusive — pick whichever you prefer. Same root cause, same memory profile
in the deadlock case, different surface.

| | this PR (#1348) | #1349 |
|---|---|---|
| Per-`Responses` channel | `unbounded` | `mpsc::channel(1)` (unchanged) |
| Where overflow buffers | inside the mpsc itself | new `parked: VecDeque<…>` field on each `Response` |
| New routing state | none | `completion_seen` flag + `target_response_idx()` |
| Lines changed | ~25 | ~120 |
| Memory in deadlock case | one result set's worth | one result set's worth (identical) |
| Public-ish API surface | `Responses` receiver type changes (pub-crate-private) | unchanged |

This PR (minimal)

Switch the per-request response channel from mpsc::channel(1) to
mpsc::unbounded() so Connection::poll_read can keep draining the wire
regardless of how slowly the per-request consumer is iterating. The request
channel (Client → Connection) is already unbounded; this makes the
per-request response side match.

Trade-off: there is no longer any per-Responses backpressure between the
Connection task and its consumer, so buffering on that channel is
unbounded. In practice the kernel socket buffer was what bounded buffering
anyway: channel(1) didn't slow the server down, it just shifted where the
bytes pile up.
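
To make the trade-off concrete, here is a small standard-library sketch of the same situation with an unbounded channel. It is a toy model under stated assumptions, not the PR's diff: `Frame` and `drain_with_unbounded` are hypothetical names, `std::sync::mpsc::channel` stands in for `mpsc::unbounded()`, and the consumer is assumed to read nothing until the typeinfo reply has arrived.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::channel;

/// One frame on the simulated wire, tagged by the request it answers.
#[derive(Clone, Copy)]
enum Frame {
    /// A frame of DataRows for the outer query.
    OuterRows,
    /// The reply to the nested typeinfo lookup.
    TypeInfoReply,
}

/// Drains the whole simulated wire into an unbounded channel.
/// Returns (typeinfo reply reached, outer frames buffered in the channel).
fn drain_with_unbounded(outer_frames: usize) -> (bool, usize) {
    let mut wire = VecDeque::from(vec![Frame::OuterRows; outer_frames]);
    wire.push_back(Frame::TypeInfoReply);

    // Unbounded channel: sending never reports "full", so the read
    // loop never has a reason to stop draining the wire.
    let (tx, rx) = channel::<Frame>();
    let mut reached = false;

    while let Some(frame) = wire.pop_front() {
        match frame {
            Frame::TypeInfoReply => reached = true,
            Frame::OuterRows => tx.send(frame).expect("receiver is alive"),
        }
    }

    // Everything the stalled consumer has not read yet sits in memory.
    let buffered = rx.try_iter().count();
    (reached, buffered)
}
```

In this model the typeinfo reply is always reached, and the number of buffered frames equals the size of the outer result set, which is the "one result set's worth" of memory listed in the comparison.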

#1349 (surgical alternative)

Keeps mpsc::channel(1) and instead changes Connection::poll_read to
keep draining the wire when a sender backs up, parking the unsent frame on
a per-response queue (parked: VecDeque<(BackendMessages, bool)>) inside
Response itself. Tracks a completion_seen flag so wire frames after a
parked-but-not-yet-delivered ReadyForQuery get routed to the next
response.

Per-response (rather than a single global parked queue) is the critical
detail: a global queue still deadlocks because once response[0]'s
sender is full, frames for response[1] would pile up behind it with no
way to deliver them. Per-response queues let us poll each sender
independently.
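
The per-response parking idea can be sketched with plain data structures. This is a simplified model, not the code in #1349: the `Msg` enum, the `slot` field (standing in for the capacity-1 channel), `drain_wire`, and `typeinfo_delivered` are hypothetical, and the signature of `target_response_idx` here is an assumption. Only the names `parked`, `completion_seen`, and `target_response_idx` come from the description above.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Msg {
    DataRow,
    ReadyForQuery,
}

/// Simplified stand-in for the connection's per-request state.
#[derive(Default)]
struct Response {
    /// Models the capacity-1 channel: at most one undelivered message.
    slot: Option<Msg>,
    /// Overflow that could not be sent yet, kept per response.
    parked: VecDeque<Msg>,
    /// Set once this response's ReadyForQuery has come off the wire,
    /// even if it is still parked and not yet delivered.
    completion_seen: bool,
}

/// Wire frames belong to the first response that has not completed.
fn target_response_idx(responses: &[Response]) -> Option<usize> {
    responses.iter().position(|r| !r.completion_seen)
}

/// Drains the whole wire, parking instead of stopping when a slot is full.
fn drain_wire(wire: &[Msg], responses: &mut [Response]) {
    for &msg in wire {
        let idx = match target_response_idx(responses) {
            Some(idx) => idx,
            None => break,
        };
        let response = &mut responses[idx];
        if msg == Msg::ReadyForQuery {
            response.completion_seen = true;
        }
        // Preserve ordering: deliver directly only if nothing is parked.
        if response.slot.is_none() && response.parked.is_empty() {
            response.slot = Some(msg);
        } else {
            response.parked.push_back(msg);
        }
    }
}

/// Outer query (response 0) followed by a typeinfo lookup (response 1).
fn typeinfo_delivered(outer_rows: usize) -> bool {
    let mut wire = vec![Msg::DataRow; outer_rows];
    wire.push(Msg::ReadyForQuery);
    wire.extend([Msg::DataRow, Msg::ReadyForQuery]);

    let mut responses = [Response::default(), Response::default()];
    drain_wire(&wire, &mut responses);
    responses[1].slot == Some(Msg::DataRow)
}
```

In this model `typeinfo_delivered(500)` returns `true`: the outer response's overflow waits in its own `parked` queue, and once its `ReadyForQuery` has been seen, later frames are routed to the typeinfo response even though the outer consumer has read nothing.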

Why two PRs

I have a slight preference for this one because it's a one-line semantic
change and a one-type-swap diff. But I don't know if there's a reason the
original mpsc::channel(1) was specifically 1 rather than some other
small bound, and #1349 is there in case you'd rather preserve that. Happy
to fold the loser into the winner if you'd like — let me know.

Reproduction

Minimal standalone repro: https://github.com/rubenfiszel/tokio-postgres-deadlock-repro

git clone https://github.com/rubenfiszel/tokio-postgres-deadlock-repro
cd tokio-postgres-deadlock-repro
./setup.sh
cargo run --release

Buggy output on master:

[limit   1]  ok (1 rows) in 827µs
...
[limit 100]  ok (100 rows) in 663µs
[limit 200]  TIMEOUT after 10s
[limit 500]  TIMEOUT after 10s

After warming the typeinfo cache via client.prepare(...) on the same client:
[warm cache, limit 500]  ok (500 rows) in 604µs

With this PR applied (via [patch.crates-io] to this branch):

[limit 200]  ok (200 rows) in 669µs
[limit 500]  ok (500 rows) in 676µs

The repro's README has a walkthrough of the deadlock with line-by-line
references to query.rs and connection.rs.

Test plan

cargo build -p tokio-postgres: builds cleanly
cargo test -p tokio-postgres --lib: passes (the in-process unit tests
that don't need a live Postgres)
Standalone repro: deadlocks on master, passes with this PR
End-to-end: verified against a real workload (the same query_typed_raw
call against a partitioned table with a citext column on Neon);
hangs indefinitely on master, completes in ~420ms with the patch

…a has unknown OIDs

When the result schema of a streaming query (`Client::query_typed_raw`, `Client::query_typed`, or `Client::prepare` when followed by a Bind+Execute on the same connection) contains a column whose OID isn't a built-in `tokio_postgres::types::Type`, and the typeinfo cache is cold, `query::query_typed` calls `get_type(client, oid).await` synchronously while it still holds the original query's `Responses` stream:

Message::RowDescription(row_description) => {
    let mut columns: Vec<Column> = vec![];
    let mut it = row_description.fields();
    while let Some(field) = it.next().map_err(Error::parse)? {
        let type_ = get_type(client, field.type_oid()).await?; // <-- blocks here
        ...
    }
    return Ok(RowStream { ... });
}

If the result is large enough that the server keeps sending `DataRow` messages for the original query before that `get_type` completes, those `DataRow`s back up in the per-request `Responses` `mpsc::channel(1)`. As soon as the channel is full `Connection::poll_read` returns `Poll::Pending` and parks the next `BackendMessages` frame in `pending_responses`, then stops draining the wire. The typeinfo sub-query's response is queued on the same socket behind those `DataRow`s, so it never arrives, and the outer `get_type` await never completes. Classic head-of-line blocking.

This trips most often with the `citext` extension, custom enums, custom domains, and postgis geometry: anything with an OID that `Type::from_oid` doesn't recognise.

The deadlock requires the result to span more than one `BackendMessages` frame on the wire, so it manifests at moderate row counts (≈100+ small rows on localhost in repro).

Fix: switch the per-request response channel from `mpsc::channel(1)` to `mpsc::unbounded()` so `Connection::poll_read` can keep draining the wire regardless of how slowly the per-request consumer is iterating. The `Request` channel (`Client` → `Connection`) is already unbounded; this makes the per-request response side match.

Trade-off: per-`Responses` backpressure between the `Connection` task and its consumer is now unbounded. In practice the kernel socket buffer is what bounded buffering anyway; `channel(1)` didn't slow the server down, it just shifted where the bytes pile up.

Minimal reproduction: https://github.com/rubenfiszel/tokio-postgres-deadlock-repro

This was referenced May 11, 2026

tokio-postgres: fix deadlock in query_typed/prepare on unknown OIDs (per-response parking variant — alternative to #1348) #1349

Open

deps: pin tokio-postgres to forked branch with query_typed_raw deadlock fix windmill-labs/windmill#9106

Merged


tokio-postgres: fix deadlock in query_typed/prepare when result schema has unknown OIDs #1348

rubenfiszel wants to merge 1 commit into rust-postgres:master from rubenfiszel:fix-query-typed-raw-deadlock
