fix: prevent SIGTERM/reload deadlock #7562
Merged
1. Why is this pull request needed and what does it do?
CoreDNS could deadlock when SIGTERM arrived during an in-flight reload. Caddy’s shutdown path held the instance mutex while the reload plugin’s final-shutdown callback attempted to send on an unbuffered channel. At the same time, the reload goroutine was calling Restart, which needs the same mutex, so it never received from the channel and neither side could make progress.
Changes to core:
Changes to plugin/reload:
Adds an integration test to validate this works properly.
2. Which issues (if any) are related?
Fixes #7314. Validated locally that the hang no longer occurs. Depending on when the signal arrives, you might see this instead:
After that, CoreDNS shuts down.
3. Which documentation changes (if any) need to be made?
None.
4. Does this introduce a backward incompatible change or deprecation?
No.