Open
Description
We've already kicked off some work on a "node level" simulation test which brings in multiple components (namely, the compactor and the garbage collector) and runs their processes concurrently to check for any unexpected effects. We currently have:
- A compactor mock driver that supports multiple systems.
- A garbage collector mock driver that supports multiple systems.
- A node simulation test that sets up a system with the mock compactor, mock garbage collector, and a shared in-memory bufferpool, trie catalog, block catalog and allocator.
- A number of existing tests against running these together - sequentially, concurrently, across single and multiple systems.
Expected properties of GC
We are primarily testing the behaviour of garbage collection, and we expect the following to always hold:
- L0 files should NEVER get GCed in ANY circumstance.
- Any `live` L1+ tries shouldn't get GCed - including by other nodes.
- This should hold for ALL levels of files, though for all intents and purposes we can probably check up to L3/L4.
- The same as above goes for any `nascent` tries - including by other nodes.
- Garbage removal should respect the set garbage `asOf` value and configured `garbageLifetime`.
- All nodes should converge to the same trie catalog state following compaction + gc.
- All tries that get removed from a node's trie catalog should also (eventually) be removed from shared storage.
- There shouldn't be anything live/nascent on the trie catalog that isn't present on storage.
- This is bad - the node will break as it is missing a file it considers that it needs.
- Note: garbage tries in catalog without corresponding storage is OK - another node may have already deleted.
- There shouldn't be anything present on storage that isn't present on the trie catalog.
- This is a bug in the garbage collector, though the impact is wasted storage space.
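The two catalog/storage properties above could be checked with a small helper along these lines. This is a minimal sketch, not the project's actual API: `TrieState`, `ConsistencyReport` and `checkConsistency` are hypothetical names, and tries are modelled as plain string keys.

```kotlin
// Hypothetical trie states, mirroring the live/nascent/garbage terms used in this issue.
enum class TrieState { LIVE, NASCENT, GARBAGE }

// Hypothetical result type: both sets should be empty for a consistent node.
data class ConsistencyReport(
    val missingFromStorage: Set<String>, // live/nascent in the catalog, absent from storage
    val orphanedOnStorage: Set<String>,  // on storage, but unknown to the catalog
)

fun checkConsistency(catalog: Map<String, TrieState>, storage: Set<String>): ConsistencyReport {
    // Garbage entries are excluded: another node may already have deleted their files.
    val needed = catalog.filterValues { it != TrieState.GARBAGE }.keys
    return ConsistencyReport(
        missingFromStorage = needed - storage,
        orphanedOnStorage = storage - catalog.keys,
    )
}
```

A non-empty `missingFromStorage` corresponds to the node-breaking case, while a non-empty `orphanedOnStorage` corresponds to the wasted-space case.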
Analysis of current coverage (2025-01-06)
Good coverage:
- L0 preservation (a number of explicit tests and implicit tests based on trie counts)
- `live` L1+ tries not GCed (explicit tests at L1, L2 and L3 levels)
- Multi-node convergence (All multi system tests feature explicit convergence checks at the end)
- Catalog/storage consistency at final state
Areas with gaps:
- `asOf`/`garbageLifetime`: All tests use `asOf` far in the future, bypassing time-based filtering entirely. No verification that the time boundary is respected.
- Nascent tries: Tests run compaction to completion, so `nascent` state is hard to observe by GC. We have no exact verification that `nascent` tries are protected.
- Mid-operation consistency: Only final state is checked; transient violations during concurrent ops would be missed.
In general, we believe the current tests are good at catching L0 deletion, convergence failures, and the removal of live L1+ files, but they could miss time-based or nascent-related bugs.
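A test for the time boundary needs a clock the test can control. The sketch below shows one way that might look. `GarbageTrie`, `SimClock` and `collectable` are hypothetical names, and the eligibility rule (a trie is collectable once it has been garbage for at least `garbageLifetime`) is an assumption about the intended semantics, not a description of the real collector.

```kotlin
import java.time.Duration
import java.time.Instant

// Hypothetical model of a garbage trie; not the real catalog entry type.
data class GarbageTrie(val key: String, val garbageAsOf: Instant)

// Hypothetical controllable clock that a simulation test could advance explicitly.
class SimClock(var now: Instant) {
    fun advance(by: Duration) {
        now = now.plus(by)
    }
}

// Assumed rule: collectable once garbageAsOf + garbageLifetime is at or before the clock's "now".
fun collectable(
    tries: List<GarbageTrie>,
    clock: SimClock,
    garbageLifetime: Duration,
): List<String> =
    tries
        .filter { !it.garbageAsOf.plus(garbageLifetime).isAfter(clock.now) }
        .map { it.key }
```

With a clock like this, a test can assert that nothing is collected just before the boundary and that the expected tries are collected at or after it, rather than setting `asOf` far in the future.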
TODO
- Adding tests for l3+ compaction and GC given observed bugs with l2 GC (`delete-tries` doesn't remove garbage l02 files from the trie catalog #5101)
- Handle time within simulator mocks
- Currently not handling time - GC collects anything marked as garbage regardless of asOf/garbageLifetime
- This bypasses the main mechanism preventing premature GC and can cause race conditions in the test framework itself
- Need clock simulation or controlled timestamps relative to garbage-as-of
- Mid-operation trie catalog -> storage consistency checks
- Add helper to list live/nascent tries from a trie catalog.
- Add the ability to track files deleted by the GC component.
- Want to verify that in the middle of concurrent gc operations:
- All live and nascent tries in a trie catalog are present on storage.
- If we keep track of deletions: for all trie catalogs, no live/nascent files should be within the list of deleted files.
- NOTE: We have tried checking this, though we run into a problem where one system will have compacted and then later GCed a file before another system has even written it for the first time! This may be another pointer for us to sort out time in the simulator mocks.
- Add a test with a much higher amount of compaction work from L0 and a number of GC runs per system - see if it flushes out any issues the smaller runs do not
- NOTE: I have actually written one of these, and it bumps into a concurrency problem with how the drivers work. This would be prevented by handling time properly! See https://github.com/danmason/xtdb/blob/bigger-node-sim-compaction-test/core/src/test/kotlin/xtdb/NodeSimulationTest.kt#L739
- Historical tries (L1H/L2H) (lower priority)
- Current tests only use "current" (rc) tries; historical path is similar but untested
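The deletion-tracking idea from the mid-operation checks could be sketched as follows. `DeletionTracker` and `prematurelyDeleted` are hypothetical names, and the per-system sets of live/nascent keys are assumed to come from the catalog-listing helper proposed above. The sketch does not address the ordering problem described in the NOTE, where one system GCs a file before another has written it.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical recorder that a mock GC driver would call for every file it deletes.
class DeletionTracker {
    // Concurrent set, since several GC drivers may record deletions at once.
    private val deleted: MutableSet<String> = ConcurrentHashMap.newKeySet<String>()

    fun recordDelete(key: String) {
        deleted.add(key)
    }

    // Immutable copy, so a check sees a stable view while GC keeps running.
    fun snapshot(): Set<String> = deleted.toSet()
}

// For each system, returns the live/nascent trie keys that GC has already deleted.
// An empty result means no violation was observed at the time of the snapshot.
fun prematurelyDeleted(
    neededBySystem: Map<String, Set<String>>,
    tracker: DeletionTracker,
): Map<String, Set<String>> {
    val deleted = tracker.snapshot()
    return neededBySystem
        .mapValues { (_, needed) -> needed intersect deleted }
        .filterValues { it.isNotEmpty() }
}
```

Calling `prematurelyDeleted` between GC runs, not only at the end of a test, is what would surface transient violations.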