daemon/logger: fix goroutine leak in ringLogger when Log() blocks #52043
Open
4RH1T3CT0R7 wants to merge 1 commit into moby:master
When the underlying logger's Log() method blocks (e.g. due to an unresponsive fluentd/syslog backend), the run() goroutine gets stuck and Close() hangs forever on wg.Wait(), leaking the goroutine.

Fix this by running Log() in a sub-goroutine within run(), with a select on a new closeCh channel. When Close() is called, closeCh is closed, allowing run() to exit promptly. The in-flight Log() goroutine is tracked via orphanedLog so that Close() waits for it (up to 5s) before accessing the underlying logger, preventing concurrent Log()/Close() calls that can panic in some drivers.

Fixes moby#51301

Signed-off-by: 4RH1T3CT0R7 <[email protected]>
## Summary

- Fix a goroutine leak in `ringLogger.run()` when the underlying logger's `Log()` method blocks indefinitely (e.g. unresponsive fluentd/syslog backend), which also causes `Close()` to hang forever on `wg.Wait()`.
- Run `Log()` in a sub-goroutine within `run()`, using a `closeCh` channel to allow prompt exit when `Close()` is called.
- Track the in-flight `Log()` goroutine via `orphanedLog` so that `Close()` waits for it (up to 5 seconds) before accessing the underlying logger, preventing concurrent `Log()`/`Close()` calls that can panic in some drivers (e.g. fluentd, as fixed in cf259eb).

## Background
Issue #51301 reports that when reading a log message times out, the `run()` goroutine leaks because it is stuck in a blocking `r.l.Log(msg)` call. Closing the ring buffer unblocks `Dequeue()` but cannot unblock `Log()`, so `Close()` never returns.

## Root cause
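The hang can be reproduced in miniature. The sketch below is not the moby code: `blockingSink`, `syncRunner`, and `closeReturned` are hypothetical names, and a plain channel stands in for the ring buffer. It only shows that closing the queue does not help while a synchronous `Log()` call is blocked.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// blockingSink is a hypothetical driver whose Log() blocks until release is closed.
type blockingSink struct{ release chan struct{} }

func (s *blockingSink) Log(string) { <-s.release }

// syncRunner mimics the pre-fix shape: run() calls Log() synchronously.
type syncRunner struct {
	sink  *blockingSink
	queue chan string // stands in for the ring buffer
	wg    sync.WaitGroup
}

func (r *syncRunner) run() {
	defer r.wg.Done()
	for msg := range r.queue {
		r.sink.Log(msg) // stuck here while the sink blocks
	}
}

// closeReturned reports whether a Close()-style shutdown finished within d.
func closeReturned(d time.Duration) bool {
	r := &syncRunner{sink: &blockingSink{release: make(chan struct{})}, queue: make(chan string, 1)}
	r.wg.Add(1)
	go r.run()
	r.queue <- "hello"

	close(r.queue) // like buffer.Close(): unblocks only the dequeue side
	done := make(chan struct{})
	go func() { r.wg.Wait(); close(done) }()

	select {
	case <-done:
		return true
	case <-time.After(d):
		close(r.sink.release) // demo only: free the stuck goroutine
		return false
	}
}

func main() {
	fmt.Println("shutdown finished in time:", closeReturned(100*time.Millisecond))
}
```

Running it prints `shutdown finished in time: false`, because `wg.Wait()` cannot return until the blocked `Log()` call does.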
In `ringLogger.run()`, the call to `r.l.Log(msg)` is synchronous. If the underlying logger blocks (network logger with unresponsive backend, full disk, etc.), the goroutine is permanently stuck. `Close()` calls `r.buffer.Close()` (which only unblocks `Dequeue`) and then `r.wg.Wait()`, which blocks forever.

## Fix
- `closeCh chan struct{}`: A new channel closed by `Close()` to signal `run()` to stop.
- `Log()` in a sub-goroutine: In `run()`, each `r.l.Log(msg)` call is wrapped in a goroutine with a `select` on `logDone` (Log completed) and `closeCh` (Close called). This allows `run()` to exit even if `Log()` is blocked.
- `orphanedLog chan struct{}`: When `run()` exits via `closeCh`, it saves the pending `logDone` channel. `Close()` waits on it (with a 5-second timeout) before proceeding to drain and close the underlying logger. This prevents the concurrent `Log()`/`Close()` race that previously caused panics in the fluentd driver.

## Safety guarantees
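To make the guarantees in this section easier to check, the fix can be sketched in a self-contained form. This is an approximation under stated assumptions, not the actual diff: the `sink`, `stuckSink`, `ring`, `newRing`, and `demo` names are hypothetical, a channel replaces the ring buffer, draining is omitted, and the 5-second wait is made configurable so the demo finishes quickly. Only `closeCh`, `logDone`, and `orphanedLog` follow the description above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// sink is a hypothetical stand-in for the underlying log driver.
type sink interface {
	Log(msg string)
	Close()
}

// stuckSink blocks in Log() until release is closed.
type stuckSink struct {
	entered chan struct{} // signalled once Log() has started
	release chan struct{}
}

func (s *stuckSink) Log(string) { s.entered <- struct{}{}; <-s.release }
func (s *stuckSink) Close()     {}

// ring is a simplified stand-in for ringLogger; a channel replaces the buffer.
type ring struct {
	l           sink
	queue       chan string
	closeCh     chan struct{} // closed by Close() to tell run() to stop
	orphanedLog chan struct{} // logDone of a Log() still in flight at exit
	wg          sync.WaitGroup
	wait        time.Duration // 5s in the PR; shortened in the demo
}

func newRing(l sink, wait time.Duration) *ring {
	r := &ring{l: l, queue: make(chan string, 16), closeCh: make(chan struct{}), wait: wait}
	r.wg.Add(1)
	go r.run()
	return r
}

func (r *ring) run() {
	defer r.wg.Done()
	for {
		var msg string
		select {
		case msg = <-r.queue:
		case <-r.closeCh:
			return
		}
		logDone := make(chan struct{})
		go func() {
			defer close(logDone)
			r.l.Log(msg)
		}()
		select {
		case <-logDone: // normal path: Log() returned, keep looping
		case <-r.closeCh:
			r.orphanedLog = logDone // written before wg.Done() runs
			return
		}
	}
}

// Close reports whether any in-flight Log() finished before the sink was closed.
func (r *ring) Close() bool {
	close(r.closeCh)
	r.wg.Wait() // run() has exited, so reading orphanedLog is safe
	quiesced := true
	if r.orphanedLog != nil {
		select {
		case <-r.orphanedLog:
		case <-time.After(r.wait):
			quiesced = false
		}
	}
	r.l.Close()
	return quiesced
}

// demo reports whether Close() returned at all and whether Log() had finished.
func demo() (returned, quiesced bool) {
	s := &stuckSink{entered: make(chan struct{}, 1), release: make(chan struct{})}
	r := newRing(s, 50*time.Millisecond)
	r.queue <- "hello"
	<-s.entered // make sure Log() is in flight before closing
	result := make(chan bool, 1)
	go func() { result <- r.Close() }()
	select {
	case quiesced = <-result:
		returned = true
	case <-time.After(2 * time.Second):
	}
	close(s.release) // unblock the stuck Log() so the demo leaks nothing
	return returned, quiesced
}

func main() { fmt.Println(demo()) }
```

With a sink that never returns from `Log()`, the demo prints `true false`: `Close()` returns after the bounded wait, and it reports that the in-flight `Log()` had not finished. In this sketch the sink is closed after the timeout regardless, which is a simplification.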
- `orphanedLog` is written by `run()` before `wg.Done()` and read by `Close()` after `wg.Wait()`; the WaitGroup provides the happens-before relationship.
- Only one `Log()` goroutine is in flight at a time: the `select` blocks until the current one completes or `closeCh` fires.
- When `Log()` returns normally, the `<-logDone` path is taken and `run()` loops as before.

## Test plan
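A test that checks `Close()` does not hang generally has to bound the call with a deadline, otherwise a regression would stall the whole test run. The helper below is a minimal sketch of that idea; `returnsWithin` is a hypothetical name and this is not necessarily how the tests in this PR are written.

```go
package main

import (
	"fmt"
	"time"
)

// returnsWithin runs fn in its own goroutine and reports whether it finished within d.
func returnsWithin(d time.Duration, fn func()) bool {
	done := make(chan struct{})
	go func() {
		defer close(done)
		fn()
	}()
	select {
	case <-done:
		return true
	case <-time.After(d):
		return false
	}
}

func main() {
	block := make(chan struct{})
	fmt.Println(returnsWithin(50*time.Millisecond, func() {}))          // fast call
	fmt.Println(returnsWithin(50*time.Millisecond, func() { <-block })) // blocked call
	close(block) // let the blocked goroutine exit
}
```

In a real test, `fn` would wrap the `Close()` call and a `false` result would fail the test.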
- `TestRingLoggerCloseWithBlockingLog` verifies `Close()` does not hang when `Log()` is blocked
- Existing tests pass (`TestRingLogger`, `TestRingCap`, `TestRingClose`, `TestRingDrain`)
- `go vet ./daemon/logger/` passes
- `go build ./daemon/logger/` succeeds

Fixes #51301