Add opt in initial check run #14087

Open
wants to merge 1 commit into main

SimonUnge
Contributor

@SimonUnge SimonUnge commented Jun 16, 2025

Proposed Changes

This PR introduces a new optional feature to help detect potential data loss scenarios during RabbitMQ node startup. The feature adds a verify_initial_run configuration option that defaults to false. When enabled, nodes will create a marker file called node_initialized.marker on their first startup and use this to verify data consistency on subsequent restarts.

The implementation works by adding a new boot step called initial_run_check that runs after recovery but before the existing empty_db_check step.

If the marker file exists but the database tables are empty, this indicates a potential data loss scenario such as corruption, an accidental database reset, or a split-brain recovery issue. In these cases, the node will fail to start with a specific error, cluster_already_initialized_but_tables_empty, giving operators a clear indication that manual intervention may be required, rather than silently starting with empty data.
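
To make the decision logic above easier to review, here is a minimal Python sketch of it. This is only a model of the behavior described in this PR, not the Erlang code in the change: the function name, its parameters, and the returned status strings are hypothetical, and treating any missing marker as a first start is an assumption. Only the marker file name, the option name, and the error name come from the description above.

```python
import os
import tempfile

# File name taken from the PR description.
MARKER_NAME = "node_initialized.marker"


class ClusterAlreadyInitializedButTablesEmpty(Exception):
    """Stands in for the cluster_already_initialized_but_tables_empty error."""


def initial_run_check(data_dir, tables_empty, verify_initial_run=False):
    """Hypothetical model of the initial_run_check boot step.

    data_dir: directory that would hold the marker file (assumed location).
    tables_empty: whether the database tables are empty after recovery.
    verify_initial_run: the opt-in option, false by default.
    """
    if not verify_initial_run:
        # Feature disabled: no marker is written and nothing is verified.
        return "skipped"

    marker = os.path.join(data_dir, MARKER_NAME)
    if os.path.exists(marker):
        if tables_empty:
            # The node started before, yet its tables are empty now.
            raise ClusterAlreadyInitializedButTablesEmpty(marker)
        return "ok"

    # Assumption: a missing marker is treated as a first start.
    with open(marker, "w"):
        pass
    return "marker_created"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        # First start with the option enabled: the marker is created.
        print(initial_run_check(data_dir, True, verify_initial_run=True))
        # Normal restart with data present: the check passes.
        print(initial_run_check(data_dir, False, verify_initial_run=True))
        # Restart with empty tables: the node refuses to start.
        try:
            initial_run_check(data_dir, True, verify_initial_run=True)
        except ClusterAlreadyInitializedButTablesEmpty:
            print("refused to start")
```

The sketch leaves out the ordering relative to recovery and the empty_db_check step, which is what makes the empty-tables observation meaningful in the real boot sequence.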

Types of Changes

What types of changes does your code introduce to this project?
Put an x in the boxes that apply

  • Bug fix (non-breaking change which fixes issue #NNNN)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause an observable behavior change in existing systems)
  • Documentation improvements (corrections, new content, etc)
  • Cosmetic change (whitespace, formatting, etc)
  • Build system and/or CI

Checklist

Put an x in the boxes that apply.
You can also fill these out after creating the PR.
If you're unsure about any of them, don't hesitate to ask on the mailing list.
We're here to help!
This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING.md document
  • I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
  • I have added tests that prove my fix is effective or that my feature works
  • All tests pass locally with my changes
  • If relevant, I have added necessary documentation to https://github.com/rabbitmq/rabbitmq-website
  • If relevant, I have added this change to the first version(s) in release-notes that I expect to introduce it

Further Comments

In the testing section, I fairly aggressively delete schema files to force rabbit_table:needs_default_data() to return true. There might be a more elegant solution?

@SimonUnge
Contributor Author

@michaelklishin it seems I can no longer open PRs directly against rabbitmq/rabbitmq-server, so I'm unsure of the best way to open a PR so that the tests run?

@mergify mergify bot added the make label Jun 16, 2025
@SimonUnge SimonUnge force-pushed the su_aws/initial_run_check branch from f25f581 to 2d2c70c Compare June 17, 2025 20:33
@SimonUnge SimonUnge marked this pull request as ready for review June 20, 2025 18:21