fix: Deadlock and race in peer, test improvements #3086

Merged on Jul 21, 2022 (8 commits)

Conversation

mafredri
Member

This PR fixes a few edge cases that I noticed whilst combing over the peer package. Some of the tests were also updated in an attempt to increase robustness and improve cleanup (e.g. closing channels and connections).

  • fix: Potential deadlock in peer.Channel dc.OnOpen
  • fix: Potential send on closed channel
  • fix: Improve robustness of waitOpened during close
  • chore: Simplify statements
  • fix: Improve teardown and timeout of peer tests
  • fix: Improve robustness of TestConn/Buffering test

After a while, I noticed what I was really debugging was the resurfacing of #1644 (I believe). These changes do not fix that issue.

@mafredri mafredri self-assigned this Jul 21, 2022
@mafredri mafredri requested review from kylecarbs and a team July 21, 2022 13:05
Comment on lines 113 to 117
if c.isClosed() {
	return
}
select {
case <-c.closed:
Member

Why do we need this check? Is it a problem if sendMore gets a value in the buffered channel? I don't think it matters which case is prioritized: either one exits the select statement and returns. If a value is sent on sendMore, a write will happen. But if you close the connection, a write will also happen, because <-c.sendMore gets unblocked.

if c.isClosed() {
	return
}

Idk, just seems redundant.

Grabbing the closed lock doesn't seem necessary because of the same argument.

Member Author

Both (check and mutex) are needed because c.sendMore is closed in c.closeWithError, so we want to guard against potentially sending on a closed channel (which would panic).

By holding the mutex, we ensure that closure doesn't happen between the isClosed() check and send on c.sendMore.
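The guard described above can be sketched roughly as follows. This is a minimal illustration, not the code from the peer package: the `notifier` type, its fields, and the `signal` and `close` methods are hypothetical names, and the `default` case is an assumption added so the sketch never blocks. The point is only that the closed check and the send happen under the same mutex that `close` takes before closing the channel.

```go
package main

import (
	"fmt"
	"sync"
)

// notifier is a hypothetical stand-in for the channel type discussed above.
type notifier struct {
	mu       sync.Mutex
	closed   bool
	sendMore chan struct{}
}

func newNotifier() *notifier {
	return &notifier{sendMore: make(chan struct{}, 1)}
}

// signal queues a wake-up and reports whether it was queued.
// The closed check and the send share one critical section, so close
// cannot run between them.
func (n *notifier) signal() bool {
	n.mu.Lock()
	defer n.mu.Unlock()
	if n.closed {
		return false
	}
	select {
	case n.sendMore <- struct{}{}:
		return true
	default:
		// A wake-up is already pending (assumption: dropping is fine here).
		return false
	}
}

// close marks the notifier closed and closes sendMore exactly once.
func (n *notifier) close() {
	n.mu.Lock()
	defer n.mu.Unlock()
	if n.closed {
		return
	}
	n.closed = true
	close(n.sendMore)
}

func main() {
	n := newNotifier()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				n.signal() // races with close below, but never panics
			}
		}()
	}
	n.close()
	wg.Wait()
	fmt.Println("signal after close:", n.signal())
}
```

Dropping either half reopens the window: without the check, the send can hit a closed channel; without the mutex, the close can land between the check and the send.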

Member Author

Oh btw, at first I was also under the impression that the check might not be needed, but without it the pion/sctp library may try to unlock an unlocked mutex:

fatal error: sync: Unlock of unlocked RWMutex

goroutine 447 [running]:
runtime.throw({0xe70b28?, 0x0?})
	/usr/local/go/src/runtime/panic.go:992 +0x71 fp=0xc0001d26c8 sp=0xc0001d2698 pc=0x469d71
sync.throw({0xe70b28?, 0x418f30?})
	/usr/local/go/src/runtime/panic.go:978 +0x1e fp=0xc0001d26e8 sp=0xc0001d26c8 pc=0x499abe
sync.(*RWMutex).Unlock(0xc000158550)
	/usr/local/go/src/sync/rwmutex.go:201 +0x7e fp=0xc0001d2728 sp=0xc0001d26e8 pc=0x4afbfe
github.com/pion/sctp.(*Association).handleChunk.func1()
	/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:2244 +0x3a fp=0xc0001d2748 sp=0xc0001d2728 pc=0x77e4da
panic({0xdb16e0, 0xfd2780})
	/usr/local/go/src/runtime/panic.go:844 +0x258 fp=0xc0001d2808 sp=0xc0001d2748 pc=0x4697d8
runtime.selectgo(0xc0001d29c8, 0xc0001d2998, 0x47ad94?, 0x1, 0x7fc6abe90008?, 0x0)
	/usr/local/go/src/runtime/select.go:516 +0xf3c fp=0xc0001d2968 sp=0xc0001d2808 pc=0x47dfbc
github.com/coder/coder/peer.(*Channel).init.func1()
	/home/maf/src/coder/peer/channel.go:109 +0xd0 fp=0xc0001d29f8 sp=0xc0001d2968 pc=0xb13710
github.com/pion/sctp.(*Stream).onBufferReleased(0xc00020e500, 0x16a0)
	/home/maf/go/pkg/mod/github.com/pion/[email protected]/stream.go:357 +0x4af fp=0xc0001d2a80 sp=0xc0001d29f8 pc=0x79faef
github.com/pion/sctp.(*Association).handleSack(0xc000158540, 0xc000c32000)
	/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:1623 +0x9bb fp=0xc0001d2c28 sp=0xc0001d2a80 pc=0x776a1b
github.com/pion/sctp.(*Association).handleChunk(0xc000158540, 0xc000e02d60?, {0xfd88e8?, 0xc000c32000?})
	/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:2288 +0x30d fp=0xc0001d2df0 sp=0xc0001d2c28 pc=0x77d88d
github.com/pion/sctp.(*Association).handleInbound(0xc000158540, {0xc000e02d60, 0x1c, 0x1c})
	/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:603 +0x505 fp=0xc0001d2ec0 sp=0xc0001d2df0 pc=0x769585
github.com/pion/sctp.(*Association).readLoop(0xc000158540)
	/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:521 +0x29c fp=0xc0001d2fc0 sp=0xc0001d2ec0 pc=0x76783c
github.com/pion/sctp.(*Association).init.func2()
	/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:339 +0x3a fp=0xc0001d2fe0 sp=0xc0001d2fc0 pc=0x765a1a
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc0001d2fe8 sp=0xc0001d2fe0 pc=0x49edc1
created by github.com/pion/sctp.(*Association).init
	/home/maf/go/pkg/mod/github.com/pion/[email protected]/association.go:339 +0x12a

Member

Will a select send something on a closed channel? If so TIL.

👍

Member Author

Yup, if you try to run this playground a couple of times, you will notice that sometimes it exits 0, and sometimes panics: https://go.dev/play/p/c35kE0948kl
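For readers who can't open the playground, here is a small self-contained sketch of the same behavior. It is not the linked program; `sendPanics`, `raceOnce`, and `countPanics` are hypothetical helpers written for this illustration. A send case on a closed channel counts as ready in a select, and choosing it panics. When another case is ready too, Go picks between them pseudo-randomly, which is why the outcome varies from run to run.

```go
package main

import "fmt"

// sendPanics reports whether a non-blocking select send on ch panics.
func sendPanics(ch chan struct{}) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	select {
	case ch <- struct{}{}:
	default:
	}
	return false
}

// raceOnce runs a select in which both cases are ready: a receive from a
// closed done channel and a send on a closed ch. It reports whether the
// send case was chosen, which shows up as a panic.
func raceOnce(done, ch chan struct{}) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	select {
	case <-done:
	case ch <- struct{}{}:
	}
	return false
}

// countPanics repeats raceOnce and counts how often the send case won.
func countPanics(trials int) int {
	n := 0
	for i := 0; i < trials; i++ {
		done := make(chan struct{})
		close(done)
		ch := make(chan struct{}, 1)
		close(ch)
		if raceOnce(done, ch) {
			n++
		}
	}
	return n
}

func main() {
	closed := make(chan struct{}, 1)
	close(closed)
	fmt.Println("send on closed channel panics:", sendPanics(closed))
	fmt.Println("panics in 1000 racy selects:", countPanics(1000))
}
```

The second number changes between runs, which matches the playground sometimes exiting 0 and sometimes panicking.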

Member

@kylecarbs kylecarbs left a comment

Wow is this thorough 😍! Awesome work.

@mafredri mafredri merged commit e33a749 into main Jul 21, 2022
@mafredri mafredri deleted the mafredri/peer-fixes branch July 21, 2022 15:47