Agent / PTY data race reading and resizing #3236
Comments
Looking into this more, I believe the race condition is in creack/pty: when calling Setsize(), it includes a "bare" call to get the file descriptor, which it then uses to issue an ioctl. Grabbing the file descriptor this way doesn't go through the fdMutex on the file. The "modern" Go solution to this, since Go 1.12, is to use f.SyscallConn() to wrap the ioctl. I'm testing this out locally and will see if upstream wants a PR. Unfortunately it seems to break their riscv compiler, which is still on go 1.6 😱, so they may not want it. We could consider forking...
@kylecarbs @dwahler @mafredri do you think this is worth forking over right away, or should we wait and see whether upstream will accept a PR?
Nice find, whilst refactoring Would guarding resize and close by the same mutex sufficiently protect against this case? If so, we could consider doing that instead of forking. Otherwise, I'd say go forth and fork.
I haven’t seen this for a few CI runs now with #3270. If it doesn’t resurface, could we consider this fixed?
Sorry I missed #3270. I don’t think that actually fixes the race: as you’ll notice from the stack traces, the race ends up between Resize and Read. We have a goroutine that copies from the SSH session to the TTY file. I think what’s happening is that when the file is Closed, this doesn’t trigger the finalizer because we are still copying. It appears they are using a panic on the file to capture the EOF, which then is able to actually finalize the file. The net result is that, to work around the issue, we would probably need to include Read and Write in the mutex’d operations. That feels pretty annoying given that Go already has a mutex that handles this stuff.
@spikecurtis The fact that My suspicion was then that we're calling Lines 456 to 463 in 74c8766
The race would still be there if the I thought about guarding
I agree that we are almost certainly the ones calling Close. But, there are three interacting goroutines (Close, Resize, Read). Yes, the race only occurs when Close is called, but fundamentally there is a problem between Resize and Read. Making Close and Resize mutually exclusive might narrow the window, but the race is still there. Basically it goes like this:
1 and 2 can’t be concurrent because of the mutex, but 2 and 3 can, and we’ll have a race in that case.
I'm not sure I fully follow because that sounds like the case that is fixed by #3270. I.e. 2 and 3 are racy iff This is not accounting for other ways the
The mutex doesn’t prevent Close from being called! It just prevents it from being concurrent with Resize. The race is possible any time Close is called, not just when it happens concurrently with Resize.
I don't think there's any race between So calls to
@mafredri and I talked, and I had misunderstood the fix he made. It prevents the call to Setsize after the Close, which should prevent a race with Read.
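To make the idea concrete, here is a minimal sketch of guarding Resize and Close with one mutex plus a closed flag, so that no resize is attempted once Close has run. This is an illustration of the approach as described in this thread, not the code from #3270: `guardedPTY`, `resizeFn`, `closeFn`, and `errClosed` are hypothetical names, and the two function fields stand in for the real Setsize and file Close calls.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errClosed is a hypothetical sentinel returned once the PTY is closed.
var errClosed = errors.New("pty closed")

// guardedPTY is a hypothetical wrapper; resizeFn and closeFn stand in
// for the real Setsize and file Close calls.
type guardedPTY struct {
	mu       sync.Mutex
	closed   bool
	resizeFn func(rows, cols uint16) error
	closeFn  func() error
}

// Resize refuses to touch the underlying file once Close has run.
func (p *guardedPTY) Resize(rows, cols uint16) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.closed {
		return errClosed
	}
	return p.resizeFn(rows, cols)
}

// Close marks the PTY closed under the same mutex, so a Resize either
// completes before the close or is rejected after it.
func (p *guardedPTY) Close() error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.closed {
		return errClosed
	}
	p.closed = true
	return p.closeFn()
}

func main() {
	p := &guardedPTY{
		resizeFn: func(rows, cols uint16) error { return nil },
		closeFn:  func() error { return nil },
	}
	fmt.Println("resize before close:", p.Resize(24, 80))
	fmt.Println("close:", p.Close())
	fmt.Println("resize after close:", p.Resize(30, 100))
}
```

Read and Write are deliberately left outside the mutex in this sketch; the guard only ensures that Setsize is never issued against a file that has already been closed.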
Spotted during CI
Note that this trace is from a branch, and so line numbers might not be accurate to what is in main.
It's interesting that destroy is in the top stack, which might mean the race occurs while the TTY is being closed and file descriptors are being cleaned up.