[16.0] Harden NI Reconciler to prevent and recover from dnsmasq host file conflicts #5384
Merged
rene merged 2 commits on Nov 12, 2025
…uring removal

Previously, when two applications accidentally shared the same DNS host file (for example, due to a duplicate DisplayName), the removal of the second application would fail because the shared file had already been deleted during the removal of the first app. This failure caused the dnsmasq configuration item to enter a broken state, preventing any new applications from being deployed into the same network instance. The only workaround in such cases was to redeploy the entire network instance and all applications connected to it.

This change hardens the Dnsmasq configurator by handling missing DNS and DHCP host files gracefully. If a host file is not found during removal, a warning is logged, but the configuration item is not marked as failed.

Although the original issue only affected DNS host files, it is also prudent to handle missing DHCP host files in the same way to ensure that their absence does not leave the network instance in a permanently broken state.

Signed-off-by: Milan Lenco <[email protected]>
(cherry picked from commit 3536a1f)
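The graceful handling described in this commit can be illustrated with a minimal Go sketch. It is not the actual configurator code from the patch: `removeHostFile` and the file layout are hypothetical, and the only assumption carried over from the commit message is that a missing file produces a warning instead of a failure.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"log"
	"os"
	"path/filepath"
)

// removeHostFile is a hypothetical helper (not the real EVE function).
// It deletes a dnsmasq host file and treats "file not found" as a warning
// rather than an error. It reports whether a file was actually removed.
func removeHostFile(path string) (removed bool, err error) {
	err = os.Remove(path)
	switch {
	case err == nil:
		return true, nil
	case errors.Is(err, fs.ErrNotExist):
		// The file is already gone, e.g. deleted while removing another app
		// that shared it. Log a warning and report success.
		log.Printf("warning: host file %s not found, nothing to remove", path)
		return false, nil
	default:
		// Any other error (permissions, I/O) is still a real failure.
		return false, fmt.Errorf("failed to remove host file %s: %w", path, err)
	}
}

func main() {
	dir, err := os.MkdirTemp("", "hosts")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Hypothetical host file shared by two apps with the same DisplayName.
	shared := filepath.Join(dir, "my-app")
	if err := os.WriteFile(shared, []byte("10.0.0.2 my-app\n"), 0o644); err != nil {
		panic(err)
	}

	// The first removal deletes the file; the second finds it missing
	// but still completes without an error.
	for i := 1; i <= 2; i++ {
		removed, err := removeHostFile(shared)
		fmt.Printf("removal %d: removed=%v err=%v\n", i, removed, err)
	}
}
```

Only the "not exist" case is downgraded to a warning in this sketch; other removal errors are still returned so that genuine failures remain visible.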
Normally, deploying two applications with the same DisplayName is not allowed. However, it is possible for a user to replace an existing app definition inside EdgeDevConfig with a new one that has a different UUID but reuses the same DisplayName. Because shutting down the original app takes some time, zedmanager may start bringing up the new app while the old one is still connected. In this window, zedrouter may attempt to connect both apps to the same network instance.

Since dnsmasq DNS host files are named after the app DisplayName, both apps would end up using the same file. Once the obsolete app is disconnected, it removes the shared host file, breaking name-to-IP resolution for the new app inside the NI.

To prevent this, zedrouter now disallows multiple apps with the same DisplayName from being connected to a network instance simultaneously. In such cases, the new app will be marked with an error until the old app is fully removed. Once cleanup completes, the retry timer will reconnect the new app and clear the error state.

Signed-off-by: Milan Lenco <[email protected]>
(cherry picked from commit 347b860)
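The DisplayName check described in this commit could look roughly like the Go sketch below. The `appConn` type and `checkDisplayNameConflict` function are hypothetical names chosen for illustration. The real zedrouter data structures and the retry timer are not shown.

```go
package main

import "fmt"

// appConn is a hypothetical, simplified view of an app connected to a
// network instance. It is not the actual zedrouter type.
type appConn struct {
	UUID        string
	DisplayName string
}

// checkDisplayNameConflict returns an error if a different app (different
// UUID) with the same DisplayName is still connected to the network instance.
func checkDisplayNameConflict(connected []appConn, newApp appConn) error {
	for _, app := range connected {
		if app.UUID != newApp.UUID && app.DisplayName == newApp.DisplayName {
			return fmt.Errorf("app %s with DisplayName %q is still connected",
				app.UUID, newApp.DisplayName)
		}
	}
	return nil
}

func main() {
	connected := []appConn{{UUID: "old-uuid", DisplayName: "my-app"}}
	newApp := appConn{UUID: "new-uuid", DisplayName: "my-app"}

	// While the old app is still connected, the new app is rejected.
	fmt.Println("old app connected:", checkDisplayNameConflict(connected, newApp))

	// After the old app is fully removed, a retry succeeds.
	connected = nil
	fmt.Println("after cleanup:", checkDisplayNameConflict(connected, newApp))
}
```

In this sketch the caller would keep the new app in an error state while the function returns a non-nil error and call it again on each retry until the conflict disappears.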
Description
Backport of #5361
How to test and validate this PR
This issue was originally detected during internal automated testing. Re-running the same test suite should confirm whether the fix is effective.
To reproduce and verify manually:
Changelog notes
Checklist