fix(cli/ssh): Avoid connection hang when workspace is stopped #7201
Two issues are addressed here:

1. We were not detecting disconnects due to waiting for Stdin to close (disconnect would only propagate after entering input and failing to write to the connection).
2. In other scenarios, where the connection drop is not detected, we now also watch workspace status and drop the connection when a workspace reaches the stopped state.

Fixes: https://github.com/coder/jetbrains-coder/issues/199
Refs: #6180, #6175
Fantastic tests too!
I tried it with Gateway and confirmed it fixes the workspace restart issue! 🎉
This is probably a dumb question, but does this mean there could be some problem with a disconnect not getting detected or propagated? If this is going over Tailscale, for example, I would imagine there is some kind of ping/pong that would normally close the connection after a timeout, but in Gateway the proxy command hangs indefinitely, which seems weird. Maybe that is by design though, to keep retrying in the background or something.
Awesome, thanks for testing!
Not a dumb question at all, and indeed, that can happen. Since we want to tolerate network issues, it’s not clear when to decide to cut off a connection. For instance, we recently made a change in #7196 which keeps the connection alive for 72h. I’m not sure whether it applies to both client and server, but it would make sense (to me) for it to be server only (then again, depending on client machine sleep behavior, it might make sense there too). Another example of where the connection would remain active is if the agent is running in a VM and you do |
Ahhhhhh that makes sense! 72 hours seems like an excessively long time but maybe that is to keep it alive over the weekend or something? I tried adding |