plugin/reload: potential race condition with imported files #6243

**What happened**:

If the `Corefile` imports other files, a race condition can occur during reload: the restarted instance may not run with the most recent configuration. An attempt at an explanation:

Currently, the `InstanceStartupEvent` event hook in https://github.com/coredns/coredns/blob/master/plugin/reload/reload.go#L76 reads and remembers an initial hash value of the current parsed `Corefile`. Here, 'current' means that it (correctly, IMHO) uses the `Corefile` content that was loaded by the current instance's `Start()` or `Restart()` method. However, any imported files are read only later, when the reload's event `hook()` function runs and calls the `parse()` function.

This can create a race condition. Imagine the following sequence:
1. A reload is triggered for some reason, e.g. by a change to the `Corefile` or to one of the imported files.
2. The currently running reload hook does its job and triggers a `Restart()` of the instance (https://github.com/coredns/coredns/blob/master/plugin/reload/reload.go#L108).
3. Now imagine that another change happens in one of the imported files during or after executing the `Corefile` (https://github.com/coredns/caddy/blob/master/caddy.go#L246), but before the `InstanceStartupEvent` is emitted (https://github.com/coredns/caddy/blob/master/caddy.go#L264). Then some or all of the plugins might use the previous, outdated content of the imported file, while the new reload handler started in the `hook()` function remembers the SHA of the current content. As a consequence, no further reload happens (unless another change occurs), and plugins might keep running with an outdated configuration.
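
The sequence above can be sketched as a small, deterministic simulation. This is not CoreDNS or caddy code: `instance`, `fileStore`, `restart` and `needsReload` are hypothetical names made up for illustration, an in-memory map stands in for the file system, and SHA-256 is an arbitrary choice of checksum.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// fileStore is a hypothetical in-memory stand-in for files on disk.
type fileStore map[string]string

// instance is a hypothetical stand-in for a running server instance.
type instance struct {
	activeConfig  string            // imported content the plugins were set up with
	rememberedSum [sha256.Size]byte // checksum remembered by the startup hook
}

// restart mimics steps 2 and 3: the plugins read the imported file first,
// and the startup hook reads it again afterwards to remember a checksum.
// changeBeforeHook simulates a file change landing between the two reads.
func restart(files fileStore, name string, changeBeforeHook func()) *instance {
	inst := &instance{}
	inst.activeConfig = files[name] // plugins are configured from this content
	if changeBeforeHook != nil {
		changeBeforeHook() // the racing change to the imported file
	}
	inst.rememberedSum = sha256.Sum256([]byte(files[name])) // hook re-reads the file
	return inst
}

// needsReload mimics the periodic check: compare the current file content
// with the checksum remembered at startup.
func needsReload(inst *instance, files fileStore, name string) bool {
	return sha256.Sum256([]byte(files[name])) != inst.rememberedSum
}

func main() {
	const name = "masquerading.override"
	files := fileStore{name: "rewrite v1"}

	inst := restart(files, name, func() { files[name] = "rewrite v2" })

	fmt.Println("plugins run with: ", inst.activeConfig)
	fmt.Println("file now contains:", files[name])
	fmt.Println("reload detected:  ", needsReload(inst, files, name))
}
```

In this sketch the instance keeps the old content while the periodic check sees no difference, which is the stale state described in step 3.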

IMHO the actual problem is that the `caddy.Instance` stores the content of the current `Corefile`, but not the content of any imported files. Changing this would probably be a bigger effort.
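
To illustrate what storing the imported content could look like, here is a minimal sketch. It assumes a hypothetical `snapshot` type that records the `Corefile` together with every imported file as it was read at load time. None of these names exist in caddy or CoreDNS, and this is not necessarily what the pull request mentioned below does.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// snapshot is a hypothetical record of everything that was read at load time.
type snapshot struct {
	corefile string
	imports  map[string]string // imported file name -> content as loaded
}

// takeSnapshot copies the Corefile and all imported files, so later changes
// to the source map do not affect the recorded content.
func takeSnapshot(corefile string, files map[string]string) snapshot {
	s := snapshot{corefile: corefile, imports: make(map[string]string, len(files))}
	for name, body := range files {
		s.imports[name] = body
	}
	return s
}

// sum hashes the Corefile and the imported files in a stable (sorted) order.
// Names and lengths are mixed in so that content cannot shift between files
// without changing the checksum.
func (s snapshot) sum() [sha256.Size]byte {
	names := make([]string, 0, len(s.imports))
	for name := range s.imports {
		names = append(names, name)
	}
	sort.Strings(names)

	h := sha256.New()
	fmt.Fprintf(h, "%d\x00%s", len(s.corefile), s.corefile)
	for _, name := range names {
		body := s.imports[name]
		fmt.Fprintf(h, "\x00%s\x00%d\x00%s", name, len(body), body)
	}
	var out [sha256.Size]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	corefile := "import *.override"
	files := map[string]string{"masquerading.override": "rewrite v1"}

	loaded := takeSnapshot(corefile, files) // what the plugins actually used

	files["masquerading.override"] = "rewrite v2" // change during startup

	// Comparing against the loaded snapshot, not a fresh read taken at hook
	// time, still reveals the change.
	changed := loaded.sum() != takeSnapshot(corefile, files).sum()
	fmt.Println("reload needed:", changed)
}
```

The point of the design is that the remembered checksum is derived from the content the plugins were configured with, so a change that lands during startup still differs from it on the next check.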

As an alternative solution, I created a pull request (https://github.com/coredns/coredns/pull/6244).

**What you expected to happen**:

Such a race condition should not happen.

**How to reproduce it (as minimally and precisely as possible)**:

Import a frequently changing file, let `reload` run at a high frequency, and wait ...

**Anything else we need to know?**: no

**Environment**:

- the version of CoreDNS: 1.10.1, master
- Corefile: 
    ```
    .:62529 {
        bind 127.0.0.1
        errors
        log .
        prometheus
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            kubeconfig /var/folders/qq/3flcldfx1x3gvszg3xfv_13m0000gn/T/2860259645/kubeconfig
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 1
        }
        forward . 8.8.8.8 {
            max_concurrent 1000
        }
        loop
        reload 2s 1s
        loadbalance
        import *.override
    }
    ```
    masquerading.override (frequently changing):
    ```
    rewrite name exact mhzavbbgti.gvdmt kubernetes.default.svc.cluster.local
    rewrite name exact jbveovgrmf.bwyke kubernetes.default.svc.cluster.local
    ```
- logs, if applicable: n/a
- OS (e.g: `cat /etc/os-release`): Darwin XXX 22.5.0 Darwin Kernel Version 22.5.0: Mon Apr 24 20:52:24 PDT 2023; root:xnu-8796.121.2~5/RELEASE_ARM64_T6000 arm6 (but also happens on Linux)
- Others: n/a

