CQs: fix a read_from_q_tail/4 crash when tune_read/2 rounds past end_seq_id #15595

Merged
michaelklishin merged 4 commits into main from mk-cq-qt-1-crash on Mar 3, 2026
Conversation

@michaelklishin
Collaborator

tune_read/2 may round up the read range to a segment boundary, advancing start_seq_id past end_seq_id. Guard against this in both branches of read_from_q_tail/4.

This scenario was detected by a sporadic failure in classic_queue_prop_SUITE which now passes 200 times in a row.
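
A minimal sketch of the failure mode, written in Python as a model rather than taken from the RabbitMQ source. It assumes a fixed segment size and a read that rounds the next start position up to a segment boundary; `SEGMENT_SIZE`, `round_up_to_segment`, `next_start_unguarded`, and `next_start_guarded` are hypothetical names, and clamping with `min` is only one possible form of the guard.

```python
# Illustrative model only; this is not RabbitMQ code.
SEGMENT_SIZE = 2048  # assumed segment size, chosen for the example


def round_up_to_segment(seq_id: int) -> int:
    """Round seq_id up to the nearest multiple of SEGMENT_SIZE."""
    return -(-seq_id // SEGMENT_SIZE) * SEGMENT_SIZE


def next_start_unguarded(start_seq_id: int, wanted: int) -> int:
    """Next start after reading `wanted` entries, rounded up to a segment boundary."""
    return round_up_to_segment(start_seq_id + wanted)


def next_start_guarded(start_seq_id: int, wanted: int, end_seq_id: int) -> int:
    """Same rounding, but never advance past end_seq_id."""
    return min(next_start_unguarded(start_seq_id, wanted), end_seq_id)


print(next_start_unguarded(2000, 10))      # 2048, which is past an end of 2020
print(next_start_guarded(2000, 10, 2020))  # 2020
```

With a start of 2000 and an end of 2020, the unguarded version lands on 2048, so the start ends up above the end; the guarded version stops at 2020.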

@michaelklishin michaelklishin added this to the 4.3.0 milestone Feb 28, 2026
@michaelklishin michaelklishin changed the title Fix q_tail crash when tune_read rounds past end_seq_id Fix a read_rom_q_tail/4 crash when tune_read/2 rounds past end_seq_id` Feb 28, 2026
@michaelklishin michaelklishin changed the title Fix a read_rom_q_tail/4 crash when tune_read/2 rounds past end_seq_id` CQs: fix a read_rom_q_tail/4 crash when tune_read/2 rounds past end_seq_id` Feb 28, 2026
@lhoguin
Contributor

lhoguin commented Mar 2, 2026

Checking.

@lhoguin
Contributor

lhoguin commented Mar 2, 2026

Could you provide me with the crash log for this? Both to confirm the fix and to write a regression test case.

@michaelklishin
Collaborator Author

@lhoguin I will dig in; the logs were from last Friday.

@michaelklishin
Collaborator Author

Below is a standard CT .zip file straight from Actions. It should include node data directories.

The failure can be seen in [email protected]_08.34.49/deps.rabbit.classic_queue_prop_SUITE.logs/run.2026-02-28_08.36.08/classic_queue_prop_suite.classic_queue_v2.html.

CT logs (rabbit parallel-ct-set-1 OTP-28 ).zip

@michaelklishin michaelklishin changed the title CQs: fix a read_rom_q_tail/4 crash when tune_read/2 rounds past end_seq_id` CQs: fix a read_rom_q_tail/4 crash when tune_read/2 rounds past end_seq_id Mar 2, 2026
@lhoguin
Contributor

lhoguin commented Mar 3, 2026

Thank you. I can reproduce.

@lhoguin lhoguin force-pushed the mk-cq-qt-1-crash branch from a2a3d6e to e04f6cb Compare March 3, 2026 11:43
During recovery when all transient messages are dropped
the code could crash because the #q_tail{} resulting
from reading messages into memory had a start seqid
higher than the end seqid, due to tune_read rounding
up the read range past segment boundaries and the code
not noticing it went above the end seqid.
@lhoguin lhoguin force-pushed the mk-cq-qt-1-crash branch from e04f6cb to ccb1e48 Compare March 3, 2026 11:44
@lhoguin
Contributor

lhoguin commented Mar 3, 2026

I have amended and pushed the change.

I added a regression test based on the CT logs and tweaked the code. Notably, I removed the check in the second clause because that situation is already covered by case QTailCount - QHead1Len: when that case applies, we know nothing remains in q_tail, and QTailSeqId1 is not used.
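
The reasoning above can be sketched with a small Python model; it is not the RabbitMQ Erlang code. It assumes the tail is described by a start sequence ID, a message count, and an end sequence ID used as an upper bound. `QTail`, `EMPTY_TAIL`, and `advance_tail` are hypothetical names, and the subtraction loosely mirrors QTailCount - QHead1Len.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QTail:
    """Hypothetical stand-in for the tail record; not the real #q_tail{}."""
    start_seq_id: int
    count: int
    end_seq_id: int


EMPTY_TAIL = QTail(start_seq_id=0, count=0, end_seq_id=0)


def advance_tail(tail: QTail, next_start: int, read_count: int) -> QTail:
    """Return the tail left over after read_count messages were read."""
    remaining = tail.count - read_count
    if remaining < 0:
        raise ValueError("read more messages than the tail holds")
    if remaining == 0:
        # Nothing remains, so next_start is never consulted here and
        # an overshoot past end_seq_id cannot leak into the result.
        return EMPTY_TAIL
    # Something remains: keep the start from passing the end (assumed guard).
    return QTail(min(next_start, tail.end_seq_id), remaining, tail.end_seq_id)


tail = QTail(start_seq_id=2000, count=3, end_seq_id=2020)
print(advance_tail(tail, next_start=2048, read_count=3))  # the empty tail
print(advance_tail(tail, next_start=2048, read_count=2))  # start clamped to 2020
```

In this model only the branch that keeps a non-empty tail needs a guard, which is the shape of the argument for dropping the second check.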

@michaelklishin michaelklishin merged commit 9e2e12c into main Mar 3, 2026
184 checks passed
@michaelklishin michaelklishin deleted the mk-cq-qt-1-crash branch March 3, 2026 16:37
mergify Bot pushed a commit that referenced this pull request Mar 3, 2026
(cherry picked from commit f43a079)
mergify Bot pushed a commit that referenced this pull request Mar 3, 2026
michaelklishin added a commit that referenced this pull request Mar 3, 2026
Note that on v4.2.x, the general problem is still
present but under different conditions, namely
DeltaSeqId1 =:= DeltaSeqIdEnd.

Unlike in `main`,
DeltaSeqIdEnd cannot read past DeltaSeqId.
michaelklishin added a commit that referenced this pull request Mar 3, 2026
CQs: fix a `read_rom_q_tail/4` crash when `tune_read/2` rounds past `end_seq_id` (backport #15595)