Description
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
go1.7
What operating system and processor architecture are you using (go env)?
linux
What did you do?
We have a simple Stubby-alike RPC protocol ("Sake"), which spins up input and output goroutines. We don't currently force a handshake, so input and output race to handshake. (I'll be fixing that after I file this bug.)
- "Read" and "Write" goroutines race to handshake, racing for `in` at the point where the handshake routine tries to give up `handshakeMutex` and grab `in`.
- "Read" gets `in` and completes handshake.
- "Read" gets `in`, and blocks forever, waiting for input. Because this is a simple client-server protocol, with client-initiated communication, after the connection is established and the handshake is finished, nothing will ever be read except in response to a client-initiated request.
- "Write" gets stuck forever trying to take `in` inside of `Handshake()`.
The comment here https://github.com/golang/go/blob/5a589904a3/src/crypto/tls/conn.go#L1204 exactly describes the problem.
It was introduced in this commit: af125a5#diff-ef0187d9cfe69a02cab179f844c8e712R1167
What did you expect to see?
Successful connections and communications.
What did you see instead?
The output goroutine stays stuck indefinitely, until a redeploy of the other end tears down all the connections.
Notes
In our staging environment, I have a test client running on two hosts, each making 300 connections to a test server. It took me six or seven restarts to see this problem happen on one connection, so let's guess one attempt in two thousand. Ish.
An apology: it should be possible to reproduce in a test case, but I haven't the time to write it right now: set up a simple server, using TLS, and repeatedly try to connect pairs of input/output goroutines. A watchdog goroutine checks for hung connections.