fix: Deadlock and race in peer, test improvements #3086
if c.isClosed() {
	return
}
select {
case <-c.closed:
Why do we need this check? Is it a problem if sendMore gets a value in the buffered channel? I don't think it matters which one is prioritized. Either one will exit the select statement and "return". If sendMore is sent, a write will happen. But if you close the connection, a write will also happen because the <-c.sendMore gets unblocked.
if c.isClosed() {
return
}
Idk, just seems redundant.
Grabbing the closed lock doesn't seem necessary because of the same argument.
Both (check and mutex) are needed because c.sendMore is closed in c.closeWithError, so we want to guard against potentially sending on a closed channel (which would panic).
By holding the mutex, we ensure that closure doesn't happen between the isClosed() check and the send on c.sendMore.
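To illustrate the point above, here is a minimal sketch of the check-then-send pattern under a mutex. It is not the actual peer code: the conn type, closeMutex, newConn, notifySendMore and close are hypothetical names, and the non-blocking default case is an assumption made to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for the channel type discussed in this thread.
type conn struct {
	closeMutex sync.Mutex
	closed     chan struct{}
	sendMore   chan struct{}
}

func newConn() *conn {
	return &conn{
		closed:   make(chan struct{}),
		sendMore: make(chan struct{}, 1),
	}
}

// isClosed reports whether close has already run. The answer only stays
// valid while the caller holds closeMutex.
func (c *conn) isClosed() bool {
	select {
	case <-c.closed:
		return true
	default:
		return false
	}
}

// notifySendMore queues a signal unless the conn is closed. Holding the
// mutex across the check and the send means close cannot run in between,
// so the send never targets a closed channel.
func (c *conn) notifySendMore() bool {
	c.closeMutex.Lock()
	defer c.closeMutex.Unlock()
	if c.isClosed() {
		return false
	}
	select {
	case c.sendMore <- struct{}{}:
	default: // a signal is already pending in the buffer
	}
	return true
}

// close closes both channels exactly once, under the same mutex.
func (c *conn) close() {
	c.closeMutex.Lock()
	defer c.closeMutex.Unlock()
	if c.isClosed() {
		return
	}
	close(c.closed)
	close(c.sendMore)
}

func main() {
	c := newConn()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.notifySendMore()
		}()
	}
	c.close() // races with the senders, but never causes a panic
	wg.Wait()
	fmt.Println("closed:", c.isClosed())
}
```

Dropping either the isClosed() check or the lock in notifySendMore reopens the window in which close(c.sendMore) can run just before the send.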
Oh btw, at first I was also under the impression that the check might not be needed, but without it the pion/sctp library may try to unlock an unlocked mutex:
fatal error: sync: Unlock of unlocked RWMutex
goroutine 447 [running]:
runtime.throw({0xe70b28?, 0x0?})
/usr/local/go/src/runtime/panic.go:992 +0x71 fp=0xc0001d26c8 sp=0xc0001d2698 pc=0x469d71
sync.throw({0xe70b28?, 0x418f30?})
/usr/local/go/src/runtime/panic.go:978 +0x1e fp=0xc0001d26e8 sp=0xc0001d26c8 pc=0x499abe
sync.(*RWMutex).Unlock(0xc000158550)
/usr/local/go/src/sync/rwmutex.go:201 +0x7e fp=0xc0001d2728 sp=0xc0001d26e8 pc=0x4afbfe
github.com/pion/sctp.(*Association).handleChunk.func1()
/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:2244 +0x3a fp=0xc0001d2748 sp=0xc0001d2728 pc=0x77e4da
panic({0xdb16e0, 0xfd2780})
/usr/local/go/src/runtime/panic.go:844 +0x258 fp=0xc0001d2808 sp=0xc0001d2748 pc=0x4697d8
runtime.selectgo(0xc0001d29c8, 0xc0001d2998, 0x47ad94?, 0x1, 0x7fc6abe90008?, 0x0)
/usr/local/go/src/runtime/select.go:516 +0xf3c fp=0xc0001d2968 sp=0xc0001d2808 pc=0x47dfbc
github.com/coder/coder/peer.(*Channel).init.func1()
/home/maf/src/coder/peer/channel.go:109 +0xd0 fp=0xc0001d29f8 sp=0xc0001d2968 pc=0xb13710
github.com/pion/sctp.(*Stream).onBufferReleased(0xc00020e500, 0x16a0)
/home/maf/go/pkg/mod/github.com/pion/[email protected]/stream.go:357 +0x4af fp=0xc0001d2a80 sp=0xc0001d29f8 pc=0x79faef
github.com/pion/sctp.(*Association).handleSack(0xc000158540, 0xc000c32000)
/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:1623 +0x9bb fp=0xc0001d2c28 sp=0xc0001d2a80 pc=0x776a1b
github.com/pion/sctp.(*Association).handleChunk(0xc000158540, 0xc000e02d60?, {0xfd88e8?, 0xc000c32000?})
/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:2288 +0x30d fp=0xc0001d2df0 sp=0xc0001d2c28 pc=0x77d88d
github.com/pion/sctp.(*Association).handleInbound(0xc000158540, {0xc000e02d60, 0x1c, 0x1c})
/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:603 +0x505 fp=0xc0001d2ec0 sp=0xc0001d2df0 pc=0x769585
github.com/pion/sctp.(*Association).readLoop(0xc000158540)
/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:521 +0x29c fp=0xc0001d2fc0 sp=0xc0001d2ec0 pc=0x76783c
github.com/pion/sctp.(*Association).init.func2()
/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:339 +0x3a fp=0xc0001d2fe0 sp=0xc0001d2fc0 pc=0x765a1a
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0001d2fe8 sp=0xc0001d2fe0 pc=0x49edc1
created by github.com/pion/sctp.(*Association).init
/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:339 +0x12a
Will a select send something on a closed channel? If so TIL.
👍
Yup, if you try to run this playground a couple of times, you will notice that sometimes it exits 0, and sometimes panics: https://go.dev/play/p/c35kE0948kl
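For readers who cannot open the playground, the behaviour can be sketched roughly as follows. This is not the linked snippet; trySendOnClosed and countPanics are hypothetical helpers written for illustration. When both select cases are ready, the runtime picks one pseudo-randomly, and picking the send case on a closed channel panics.

```go
package main

import "fmt"

// trySendOnClosed runs one select with a receive on a closed channel and a
// send on another closed channel, and reports whether the select panicked.
func trySendOnClosed() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true // "send on closed channel"
		}
	}()
	closed := make(chan struct{})
	sendMore := make(chan struct{}, 1)
	close(closed)
	close(sendMore)
	select {
	case <-closed: // ready: receiving from a closed channel never blocks
	case sendMore <- struct{}{}: // also ready, but panics if chosen
	}
	return false
}

// countPanics repeats the experiment n times and counts the panics.
func countPanics(n int) int {
	count := 0
	for i := 0; i < n; i++ {
		if trySendOnClosed() {
			count++
		}
	}
	return count
}

func main() {
	const runs = 1000
	fmt.Printf("%d of %d selects panicked\n", countPanics(runs), runs)
}
```

Over many runs both outcomes show up, which is why a single run sometimes exits 0 and sometimes panics.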
Co-authored-by: Steven Masley <[email protected]>
Wow is this thorough 😍! Awesome work.
This PR fixes a few edge cases that I noticed whilst combing over the peer package. Some of the tests were also updated in an attempt to increase robustness and improve cleanup (e.g. closing channels and connections).
After a while, I noticed what I was really debugging was the resurfacing of #1644 (I believe). These changes do not fix that issue.