Dov/antithesis poc upstream#36512
Draft
DAlperin wants to merge 33 commits into
Thank you for your submission! We really appreciate it. Like many source-available projects, we require that you sign our Contributor License Agreement (CLA) before we can accept your contribution.

I have read the Contributor License Agreement (CLA) and I hereby sign the CLA.

2 out of 3 committers have signed the CLA.
Force-pushed from 754deec to d4373eb
…older .env

mzbuild's _build_locked runs `git clean -ffdX <image_path>` before each build, which wipes any gitignored file in the build context, including the .env we generate. Two fixes:

1. publish: false on antithesis-config, so the standard ci.test.build flow skips it entirely on regular nightly builds (where .env never exists). Only build-antithesis.sh / push-antithesis.py build this image, and they write .env first.

2. Commit a placeholder .env so the file is tracked (and therefore survives git clean) and participates in mzbuild's fingerprint computation. build-antithesis.sh overwrites it with real registry refs before the build runs; the fingerprint reflects the overwritten content per build.
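The tracked-placeholder fix relies on a git behavior worth spelling out: `git clean -ffdX` removes only *ignored, untracked* files, and tracked files are never touched by git clean even if .gitignore lists them. A minimal sketch demonstrating both halves in a throwaway repo (all paths and contents here are illustrative, not the actual mzbuild layout):

```python
import os
import subprocess
import tempfile

def run(args, cwd):
    subprocess.run(args, cwd=cwd, check=True, capture_output=True)

repo = tempfile.mkdtemp()
run(["git", "init", "-q"], repo)
run(["git", "config", "user.email", "ci@example.com"], repo)
run(["git", "config", "user.name", "ci"], repo)

# Case 1: .env is gitignored and untracked, so `git clean -ffdX`
# (which removes only ignored files) wipes it.
with open(os.path.join(repo, ".gitignore"), "w") as f:
    f.write(".env\n")
with open(os.path.join(repo, ".env"), "w") as f:
    f.write("REGISTRY=generated-at-build-time\n")
run(["git", "add", ".gitignore"], repo)
run(["git", "commit", "-qm", "init"], repo)
run(["git", "clean", "-ffdX"], repo)
assert not os.path.exists(os.path.join(repo, ".env"))  # wiped

# Case 2: commit a placeholder .env (forced past .gitignore with -f).
# Tracked files are never removed by git clean, so it survives.
with open(os.path.join(repo, ".env"), "w") as f:
    f.write("REGISTRY=placeholder\n")
run(["git", "add", "-f", ".env"], repo)
run(["git", "commit", "-qm", "placeholder .env"], repo)
run(["git", "clean", "-ffdX"], repo)
assert os.path.exists(os.path.join(repo, ".env"))  # survives the clean
```

The build script can then overwrite the tracked placeholder in place; git clean still leaves it alone, and mzbuild fingerprints whatever content is present at build time.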
Add 16 Antithesis properties for Kafka source ingestion (NONE + UPSERT
envelopes) to the scratchbook, plus the workload-side implementation of
upsert-key-reflects-latest-value.
Scratchbook additions:
- sut-analysis Appendix A: kafka source pipeline detail
- existing-assertions: enumerated SUT-side panic/assert sites that are
candidates for Antithesis SDK instrumentation
- property-catalog Category 7: 16 new Kafka/UPSERT properties
- property-relationships clusters 7-10 plus cross-cluster connections
- 16 per-property evidence files
- evaluation/synthesis.md: four-lens review
Workload:
- parallel_driver_upsert_latest_value.py: produces upserts+tombstones
with deterministic randomness, requests a quiet period, polls
mz_source_statistics for catchup, and asserts per-key value match
(two always() assertions + one sometimes() liveness anchor).
- helper_pg / helper_kafka / helper_quiet / helper_random /
helper_source_stats / helper_upsert_source: shared utilities for
subsequent Kafka source properties.
…o-data-duplication
… state-rehydrates-correctly
…config} + transitive deps
…atalog properties
…zable-reads workload driver
… catalog-recovery-consistency workload driver
…imeouts; remove dead upsert.rs (classic) antithesis asserts
…r multi-replica fault coverage
…are RocksDB lock

When I added clusterd2 in 4366c9e, both clusterds inherited the DEFAULT_MZ_VOLUMES list, which uses a single named volume scratch:/scratch. Docker named volumes are shared across containers by name, so the two clusterds mounted the same /scratch and contended for RocksDB locks at /scratch/storage/upsert/<id>/<worker>/LOCK.

This wedged clusterd1: it could never open its upsert RocksDB ("Resource temporarily unavailable" on the LOCK file), entered Stalled health with "Failed to rehydrate state", broadcast suspend-and-restart, and looped retry-fail-suspend-restart for the entire run. The continuous restart loop drove the upsert feedback-driven snapshot replay path in ways that produced visibly wrong durable state for the source, exactly what the upsert-state-rehydrates-correctly assertions caught in the 2026-05-12 05:39 UTC Antithesis report.

Fix: give each clusterd its own per-instance named volume for /scratch. The other volumes stay shared because they don't take exclusive locks. Also patch export-compose.py to auto-declare any service-referenced named volume at the top level; Composition only auto-declares DEFAULT_MZ_VOLUMES, so without this the custom names broke `docker compose config`.
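The auto-declaration step exists because Compose requires every named volume referenced by a service to also appear under the top-level `volumes:` key. A hypothetical sketch of that pass over a compose model, assuming short-syntax `source:target` mounts; declare_named_volumes and the service names are illustrative, not the actual export-compose.py code:

```python
def declare_named_volumes(compose: dict) -> dict:
    """Ensure every named volume referenced by a service is declared at
    the top level, so `docker compose config` accepts the file."""
    declared = compose.setdefault("volumes", {})
    for service in compose.get("services", {}).values():
        for mount in service.get("volumes", []):
            source = mount.split(":", 1)[0]
            # Bind mounts start with '/', './', or '~'; anything else is
            # a named volume and needs a top-level declaration.
            if not source.startswith(("/", ".", "~")):
                declared.setdefault(source, {})
    return compose

# Per-instance scratch volumes: each clusterd gets its own /scratch,
# so they no longer contend for the same RocksDB LOCK file.
compose = {
    "services": {
        "clusterd1": {"volumes": ["scratch-clusterd1:/scratch"]},
        "clusterd2": {"volumes": ["scratch-clusterd2:/scratch"]},
    },
}
declare_named_volumes(compose)
# compose["volumes"] now declares both scratch-clusterd1 and scratch-clusterd2.
```

Because Docker named volumes are keyed by name, giving the two containers distinct volume names is what actually breaks the sharing; the declaration pass just keeps the generated file valid.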
…ker thread pausing
…ed in MySQL 8.4 (WRITESET is the default)
Remove these sections if your commit already has a good description!
Motivation
Why does this change exist? Link to a GitHub issue, design doc, Slack
thread, or explain the problem in a sentence or two. A reviewer who has
no context should understand why after reading this section.
If this implements or addresses an existing issue, it's enough to link to that:
Closes
Fixes
etc.
Description
What does this PR actually do? Focus on the approach and any non-obvious
decisions. The diff shows the code --- use this space to explain what the
diff can't tell a reviewer.
Verification
How do you know this change is correct? Describe new or existing automated
tests, or manual steps you took.