AXFR Transferer interface usage in transfer.ServeDNS causes goroutine leak. #6548

@ironsdan

What happened:
Broken connections during AXFR requests cause Transferer plugins to block forever on a channel send, leaking goroutines and the memory they hold. This has repeatedly exhausted memory on the machine running CoreDNS and caused a denial of service.

What you expected to happen:
Broken connections during AXFR requests should cancel Transferer plugins through a context or a separate channel.

How to reproduce it (as minimally and precisely as possible):

  1. Create a dummy zone file for the CoreDNS file plugin with enough entries that an AXFR request can be interrupted before it finishes.
  2. Start CoreDNS.
  3. dig @localhost -t axfr example.com
  4. Interrupt the dig process before step 3 finishes.
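
For step 1, a zone file large enough to interrupt can be generated with a small program. This is only a sketch: the SOA and NS values, the host naming scheme, the 10.0.0.0/8 addresses, and the record count of 200000 are arbitrary placeholders, not values from the original report. The output file name matches the Corefile shown under Environment.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// generateZone builds a zone file body for origin (a fully qualified name
// with a trailing dot) containing n A records. All record values are
// placeholders chosen only to make the zone large.
func generateZone(origin string, n int) string {
	var b strings.Builder
	fmt.Fprintf(&b, "$ORIGIN %s\n", origin)
	fmt.Fprintf(&b, "@ 3600 IN SOA ns1.%s hostmaster.%s 1 7200 3600 1209600 3600\n", origin, origin)
	fmt.Fprintf(&b, "@ 3600 IN NS ns1.%s\n", origin)
	fmt.Fprintf(&b, "ns1 3600 IN A 127.0.0.1\n")
	for i := 0; i < n; i++ {
		// Spread addresses over 10.0.0.0/8 so every record is distinct.
		fmt.Fprintf(&b, "host%d 3600 IN A 10.%d.%d.%d\n", i, (i>>16)&0xff, (i>>8)&0xff, i&0xff)
	}
	return b.String()
}

func main() {
	// 200000 records is an arbitrary size; raise it if the transfer
	// completes too quickly to interrupt by hand.
	zone := generateZone("example.com.", 200000)
	if err := os.WriteFile("db.example.com", []byte(zone), 0o644); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```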

Anything else we need to know?:
A potential fix is to change the Transferer interface to:

Transfer(ctx context.Context, zone string, serial uint32) (<-chan []dns.RR, error)

and create a cancellable context in ServeDNS in plugin/transfer/transfer.go:

ctx, cancel := context.WithCancel(context.Background())
defer cancel()
// Get a receiving channel from the first Transferer plugin that returns one.
var pchan <-chan []dns.RR
var err error
for _, p := range t.Transferers {
  pchan, err = p.Transfer(ctx, state.QName(), serial)
  if err == ErrNotAuthoritative {
    // plugin was not authoritative for the zone, try next plugin
    continue
  }
  if err != nil {
    return dns.RcodeServerFailure, err
  }
  break
}

and then, in the plugin, every channel send would also select on the context:

select {
case <-ctx.Done():
  return
case ch <- []dns.RR{...}:
}
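
The effect of this select can be checked in isolation with a standard-library-only sketch. It is not CoreDNS code: transfer and producerExitsAfterAbort are hypothetical names, and []string stands in for []dns.RR so that no external module is needed. The consumer reads one batch and then cancels, imitating a connection that breaks mid-transfer.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// transfer is a stand-in for a Transferer plugin's producer goroutine.
// The returned done channel is closed when that goroutine exits.
func transfer(ctx context.Context, records []string) (<-chan []string, <-chan struct{}) {
	ch := make(chan []string)
	done := make(chan struct{})
	go func() {
		defer close(done)
		defer close(ch)
		for _, r := range records {
			select {
			case <-ctx.Done():
				// The consumer went away; stop instead of blocking on the send.
				return
			case ch <- []string{r}:
			}
		}
	}()
	return ch, done
}

// producerExitsAfterAbort receives one batch, cancels the context, and
// reports whether the producer goroutine exits within one second.
func producerExitsAfterAbort() bool {
	ctx, cancel := context.WithCancel(context.Background())
	ch, done := transfer(ctx, []string{"a", "b", "c"})
	<-ch     // first batch arrives normally
	cancel() // simulate the write error that makes ServeDNS return
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("producer exited:", producerExitsAfterAbort()) // prints "producer exited: true"
}
```

Without the ctx.Done() case, the producer would stay blocked on the second send once the consumer stops reading, which is the leak described above.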

Environment:
N/A

  • the version of CoreDNS: 1.11.1
  • Corefile:
example.com:53 {
  bind 127.0.0.1
  log
  errors
  pprof 0.0.0.0:8080
  file db.example.com
  transfer {
    to *
  }
}
  • logs, if applicable:
[INFO] 127.0.0.1:35559 - 29525 "AXFR IN example.com. tcp 55 false 65535" NOERROR qr,aa 989928 0.293825438s
[ERROR] plugin/errors: 2 example.com. AXFR: write tcp 127.0.0.1:53->127.0.0.1:35559: write: connection reset by peer
  • OS (e.g: cat /etc/os-release):
NAME="Red Hat Enterprise Linux"
VERSION="8.6 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.6"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.6 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos"
HOME_URL=https://www.redhat.com/
DOCUMENTATION_URL=https://access.redhat.com/documentation/red_hat_enterprise_linux/8/
BUG_REPORT_URL=https://bugzilla.redhat.com/

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.6"
  • Others:
    N/A
