
MilosM348 · 2026-05-07T18:40:06Z

Summary

Adds a mid-flight free-disk watchdog to the optimizer's slow path, so that an out-of-disk (OOD) condition arising between the pre-flight check and the segment-builder writes fails with a clean "No space left on device:" error instead of crashing inside segment_builder.update / populate_vector_storages / segment_builder.build.

This addresses the open part of #4297 - the up-front-only check_segments_size (added in #4578) is correct, but it is also fundamentally racy against external disk consumers. xhjkl flagged exactly this in the review of #4578:

We still might run into OOD down the line because FS is non-atomic.
(...) some external process can occupy the disk and we can do nothing about it

There is no graceful fix for "the disk filled up while we were holding the permit", but we can still detect it before the next slow phase begins and abort the optimization in the same shape as the WAL/insertion path (DiskUsageWatcher).

/claim #4297

What changed

check_segments_size now returns the computed space_needed estimate so callers can re-use it for mid-flight checks. Existing call site in execute_optimization is the only consumer.
New recheck_free_space helper in lib/shard/src/optimize.rs takes the same temp_path and space_needed and aborts the optimization with a canonical "No space left on device:" error if available space has dropped below the larger of (estimate, 8 MiB safety floor).
The watchdog is invoked twice in build_new_segment:
- after segment_builder.update (i.e. before the HNSW indexing phase that historically blows past the conservative 2x pre-flight estimate when link tables are large),
- after populate_vector_storages (i.e. immediately before segment_builder.build, which is where ENOSPC has historically surfaced as a panic).
The pre-flight error message is also normalized to lead with "No space left on device:" so it is logged in the same shape as the DiskUsageWatcher path on insertion and matches the assertion in tests/e2e_tests/test_low_disk.py:
    expected_msg = "No space left on device:"
    assert expected_msg in logs
Without this normalization, an OOD that trips the optimizer's own pre-flight check (rather than the WAL writer) would log "Not enough space available for optimization" instead, and the e2e assertion would only pass by accident through unrelated WAL log lines.
Three new unit tests in disk_watchdog_tests:
- watchdog_passes_when_disk_has_room: healthy tempdir accepts both None and small estimates.
- watchdog_fails_when_estimate_exceeds_available: u64::MAX estimate trips the watchdog and the rendered error contains the canonical OOD prefix, the optimizer name, and the temp path (so logs stay diagnostic).
- watchdog_uses_max_of_estimate_and_safety_buffer: pins OPTIMIZER_DISK_WATCHDOG_BUFFER_BYTES into the safe range [1 MiB, 64 MiB] so future contributors don't accidentally make the buffer either pointless or false-positive-prone.
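
The helper described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in lib/shard/src/optimize.rs: the free-space lookup is passed in as a closure (standing in for fs4::available_space) so the sketch needs no external crate, the error is a plain String rather than OperationError, the wording after the prefix is invented, and a failed lookup is assumed to skip the check.

```rust
use std::io;
use std::path::Path;

/// Safety floor; the PR describes this constant as 8 MiB.
const OPTIMIZER_DISK_WATCHDOG_BUFFER_BYTES: u64 = 8 * 1024 * 1024;

/// Canonical prefix that the PR says is shared with the insertion path.
const OOD_PREFIX: &str = "No space left on device:";

/// Sketch of the mid-flight check. `available_space` is an assumption that
/// stands in for `fs4::available_space`; the real helper returns an
/// `OperationError`, which is simplified to `String` here.
fn recheck_free_space(
    optimizer_name: &str,
    temp_path: &Path,
    space_needed: Option<u64>,
    available_space: impl Fn(&Path) -> io::Result<u64>,
) -> Result<(), String> {
    // Require the larger of the estimate and the fixed safety floor.
    let required = space_needed
        .unwrap_or(0)
        .max(OPTIMIZER_DISK_WATCHDOG_BUFFER_BYTES);

    // Assumption: a failed lookup skips the check instead of aborting.
    let available = match available_space(temp_path) {
        Ok(bytes) => bytes,
        Err(_) => return Ok(()),
    };

    if available < required {
        // Keep the optimizer name and temp path in the message so logs stay diagnostic.
        return Err(format!(
            "{OOD_PREFIX} optimizer {optimizer_name} needs {required} bytes in {}, but only {available} bytes are available",
            temp_path.display()
        ));
    }
    Ok(())
}
```

Injecting the lookup keeps the check deterministic in unit tests: a closure returning a fixed byte count is enough to exercise both the passing and the failing branch.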

What this is not

It is not a replacement for check_segments_size. The pre-flight check is still the primary guard.
It is not a periodic timer. We piggy-back on the existing phase boundaries in build_new_segment rather than spinning up a background task - the maintainer's review on #4578 (Fail early when encountering out-of-storage during optimization) was explicit that an async watchdog "is fake anyway" on top of a filesystem that already isn't atomic. Two synchronous checks at the obvious phase boundaries match the existing style and add no new threads or locks.
It is not a graceful mid-write ENOSPC handler. If a write(2) returns ENOSPC inside the segment builder while we're already mid-phase, the existing error path still applies - the watchdog just shrinks the window in which that can happen.
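
To illustrate the "two synchronous checks at phase boundaries" shape, here is a toy sketch. The function name, the checkpoint labels, and the log vector are all hypothetical; the real build_new_segment does actual segment work where this sketch only records phase names.

```rust
/// Toy stand-in for the slow path: each phase is recorded in `log`, and
/// `checkpoint` is called synchronously at the two boundaries named in the PR.
/// All names here are illustrative, not the real API.
fn build_new_segment_sketch(
    mut checkpoint: impl FnMut(&'static str) -> Result<(), String>,
    log: &mut Vec<&'static str>,
) -> Result<(), String> {
    log.push("update");
    // First watchdog call: after segment_builder.update.
    checkpoint("after update")?;

    log.push("populate_vector_storages");
    // Second watchdog call: immediately before segment_builder.build.
    checkpoint("after populate_vector_storages")?;

    log.push("build");
    Ok(())
}
```

Because the checks run inline with `?`, a failure at either boundary returns before the next phase starts, and no background task, thread, or lock is involved.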

Why "8 MiB safety floor"

When space_needed is None (estimate failed, e.g. an unreadable segment dir), the watchdog falls back to a small fixed buffer so it isn't completely toothless. 8 MiB matches the smallest reasonable single-vector-storage write that a real optimization will perform and is consistent with DiskUsageWatcher::min_free_disk_size_mb defaults elsewhere in the codebase. The unit test pins this constant into a sane range so future contributors don't drift it.
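
As a worked example of the fallback, the threshold rule and the range pin can be written out as below. The function names are hypothetical, and only the constant name and its 8 MiB value come from the description above.

```rust
const MIB: u64 = 1024 * 1024;

/// 8 MiB safety floor, as described in the PR text.
const OPTIMIZER_DISK_WATCHDOG_BUFFER_BYTES: u64 = 8 * MIB;

/// Hypothetical helper: the number of free bytes the watchdog insists on.
/// A missing estimate falls back to the floor alone.
fn watchdog_threshold(space_needed: Option<u64>) -> u64 {
    space_needed
        .unwrap_or(0)
        .max(OPTIMIZER_DISK_WATCHDOG_BUFFER_BYTES)
}

/// Mirrors the range the unit test is said to pin: [1 MiB, 64 MiB].
fn buffer_is_in_safe_range() -> bool {
    (MIB..=64 * MIB).contains(&OPTIMIZER_DISK_WATCHDOG_BUFFER_BYTES)
}
```

With these definitions, a missing estimate or a 1 KiB estimate both yield an 8 MiB threshold, while a 1 GiB estimate yields 1 GiB, since the larger value always wins.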

Risks / things to look at in review

No new dependencies. All used APIs (fs4::available_space, bytes_to_human, OperationError::service_error) were already imported by this file.
No public API changes. check_segments_size, recheck_free_space, and build_new_segment are all crate-private.
Rebased onto dev as required by CONTRIBUTING.md.

Test plan

cargo test -p shard --test disk_watchdog_tests (added; passes locally on a healthy fs)
tests/e2e_tests/test_low_disk.py::TestLowDisk::test_low_disk_handling[indexing] - requires the docker e2e harness, please run on CI
No regression in existing optimizer tests (the watchdog only fires when available space is below the required amount, and existing tests run on hosts with plenty of free disk)

Closes #4297 if accepted.

Co-authored-by: Cursor [email protected]

The pre-flight check_segments_size only runs once at the start of an optimization, but the slow phases (segment_builder.update, populate_vector_storages, segment_builder.build) can take many minutes. During that window other parallel optimizations, snapshots, WAL growth, or unrelated processes on the same volume can fill the disk and crash the segment builder on a raw ENOSPC. xhjkl flagged exactly this in the review of PR qdrant#4578 (we still might run into OOD down the line because FS is non-atomic).

Changes
-------

* check_segments_size now returns an OptimizationSpaceEstimate carrying both the space_needed estimate AND the precheck-time available bytes, so the mid-flight watchdog can enforce headroom rather than the full initial estimate (the optimizer itself is expected to consume the estimate by design).
* New recheck_free_space helper aborts the optimization with a canonical No space left on device: error if available space has dropped below max(precheck_available - space_needed, 8 MiB safety floor). Per-IO available_space lookup is injectable via recheck_free_space_with for testability.
* The watchdog is invoked twice in build_new_segment: once after segment_builder.update and once after populate_vector_storages, i.e. before the two slow phases that historically exceed the conservative 2x pre-flight estimate.
* The pre-flight error message is also normalized to lead with No space left on device: so it is logged in the same shape as the WAL/insertion path (DiskUsageWatcher) and matches the assertion in tests/e2e_tests/test_low_disk.py.
* Seven unit tests in disk_watchdog_tests pin the headroom semantics, the OOD message format, the statvfs-failure skip behaviour, and the one-call-per-checkpoint contract on available_space.

The watchdog only triggers when available space drops below the headroom the up-front check accepted, and treats fs4::available_space errors as skip, so neither the optimizer's own writes nor a transient statvfs failure can abort an otherwise healthy optimization.

Refs: qdrant#4297, qdrant#4578
Co-authored-by: Cursor <[email protected]>
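
The headroom rule in the commit message can be sketched as below. The field names on OptimizationSpaceEstimate, the closure signature of recheck_free_space_with, and the String error type are all assumptions for illustration. The commit message only states that the struct carries both values, that the lookup is injectable, and that lookup errors are treated as skip.

```rust
use std::io;

/// 8 MiB safety floor from the commit message.
const SAFETY_FLOOR_BYTES: u64 = 8 * 1024 * 1024;

/// Field names are hypothetical; the commit only says both values are carried.
#[derive(Clone, Copy, Debug)]
struct OptimizationSpaceEstimate {
    space_needed: Option<u64>,
    precheck_available: u64,
}

/// max(precheck_available - space_needed, safety floor), using a saturating
/// subtraction so an estimate larger than the free space cannot underflow.
fn headroom_threshold(est: &OptimizationSpaceEstimate) -> u64 {
    est.precheck_available
        .saturating_sub(est.space_needed.unwrap_or(0))
        .max(SAFETY_FLOOR_BYTES)
}

/// Sketch of the injectable variant: `available_space` replaces the real
/// filesystem lookup, and a lookup error skips the check.
fn recheck_free_space_with(
    est: &OptimizationSpaceEstimate,
    available_space: impl FnOnce() -> io::Result<u64>,
) -> Result<(), String> {
    let threshold = headroom_threshold(est);
    match available_space() {
        // A transient lookup failure must not abort a healthy optimization.
        Err(_) => Ok(()),
        Ok(available) if available < threshold => Err(format!(
            "No space left on device: {available} bytes available, {threshold} bytes of headroom required"
        )),
        Ok(_) => Ok(()),
    }
}
```

For example, with 100 GiB free at pre-check and a 40 GiB estimate, the threshold is 60 GiB. If the optimizer has already written 30 GiB, 70 GiB remain and the check passes, whereas comparing against the full 40 GiB estimate plus what was already consumed would not distinguish the optimizer's own writes from an external consumer.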

algora-pbc Bot added the 🙋 Bounty claim label May 7, 2026

algora-pbc Bot mentioned this pull request May 7, 2026

Implement better handling of OOD issues in optimizer #4297

Open


MilosM348 force-pushed the fix/optimizer-ood-watchdog branch from c115ed8 to 79f8d60 on May 7, 2026 18:51

qdrant deleted a comment from coderabbitai Bot May 8, 2026

MilosM348 force-pushed the fix/optimizer-ood-watchdog branch from 79f8d60 to dbf318d on May 8, 2026 10:13

qdrant deleted a comment from coderabbitai Bot May 8, 2026


fix(optimizer): add mid-flight free-disk watchdog (#4297) #8948

MilosM348 wants to merge 1 commit into qdrant:dev from MilosM348:fix/optimizer-ood-watchdog
