Fix TestAggregatedAPIServer flake from cleanup ordering #137336
Jefftree wants to merge 1 commit into kubernetes:master
During test teardown, defer os.RemoveAll(wardleCertDir) ran before t.Cleanup(cancel), deleting the wardle server's certificate files while the server was still running. The fsnotify file watcher would then fail trying to re-add watches on the deleted files. Fix the ordering by using defer cancel() (runs in the defer phase) and t.Cleanup for os.RemoveAll (runs after all defers). This ensures the context is cancelled first, signaling the wardle server to shut down, before its certificate files are removed.
5441f95 to 56febb3
/assign @BenTheElder
What type of PR is this?
/kind flake
What this PR does / why we need it:
Fixes a test flake caused by incorrect cleanup ordering. During teardown, defer os.RemoveAll(wardleCertDir) ran before t.Cleanup(cancel) (Go runs all defers before any t.Cleanup functions), so the wardle server's cert files were deleted while it was still running. The fsnotify file watcher would then fail trying to re-watch the deleted files.
The fix uses defer cancel() and t.Cleanup(os.RemoveAll) so the context is cancelled first (stopping the server) before files are removed.
from the log below:
Not 100% sure this will fix the flake, but I think it improves our edge case handling.
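The ordering described above can be sketched outside of `go test` as follows. This is an illustration only, not the code from the PR: fakeT, cleanupRegistrar, oldOrdering, newOrdering and record are hypothetical names, and fakeT merely mimics the documented testing.T behavior (registered cleanups run after the test function returns, last registered first). The real cancel() and os.RemoveAll calls are replaced by recorded event strings.

```go
package main

import "fmt"

// cleanupRegistrar is the single method of *testing.T this sketch relies on.
// A real *testing.T satisfies it.
type cleanupRegistrar interface {
	Cleanup(func())
}

// fakeT is a hypothetical stand-in for *testing.T so the sketch can run as a
// plain program. It assumes the documented behavior: cleanups run after the
// test function returns, last registered first.
type fakeT struct {
	cleanups []func()
}

func (f *fakeT) Cleanup(fn func()) { f.cleanups = append(f.cleanups, fn) }

// run calls the test body, then the registered cleanups in LIFO order.
func (f *fakeT) run(body func(cleanupRegistrar)) {
	body(f) // every defer inside body has finished once this call returns
	for i := len(f.cleanups) - 1; i >= 0; i-- {
		f.cleanups[i]()
	}
}

// oldOrdering mirrors the teardown described as flaky: the directory removal
// is deferred, the context cancellation is a registered cleanup.
func oldOrdering(t cleanupRegistrar, events *[]string) {
	t.Cleanup(func() { *events = append(*events, "cancel context") })
	defer func() { *events = append(*events, "remove cert dir") }()
	*events = append(*events, "test body")
}

// newOrdering mirrors the fix: cancellation is deferred, removal is a
// registered cleanup, so the files outlive the shutdown signal.
func newOrdering(t cleanupRegistrar, events *[]string) {
	defer func() { *events = append(*events, "cancel context") }()
	t.Cleanup(func() { *events = append(*events, "remove cert dir") })
	*events = append(*events, "test body")
}

// record runs one ordering under fakeT and returns the events in order.
func record(body func(cleanupRegistrar, *[]string)) []string {
	var events []string
	t := &fakeT{}
	t.run(func(r cleanupRegistrar) { body(r, &events) })
	return events
}

func main() {
	fmt.Println("old:", record(oldOrdering))
	fmt.Println("new:", record(newOrdering))
}
```

In the old ordering the cert directory is removed before the context is cancelled; in the new ordering the cancellation comes first. Note that cancelling the context only signals shutdown and does not wait for the server to exit, which may be why the fix is described as an improvement rather than a guaranteed cure.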
Which issue(s) this PR is related to:
Fixes #137207
Special notes for your reviewer:
N/A
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: