fix(cli/ssh): Avoid connection hang when workspace is stopped #7201

Merged: mafredri merged 7 commits into main from mafredri/ssh-disconnect on Apr 19, 2023

Conversation

mafredri
Member

Two issues are addressed here:

  1. We were not detecting disconnects due to waiting for Stdin to close
    (disconnect would only propagate after entering input and failing to
    write to the connection).
  2. In other scenarios, where the connection drop is not detected, we now
    also watch workspace status and drop the connection when a workspace
    reaches the stopped state.

Fixes: https://github.com/coder/jetbrains-coder/issues/199

Refs: #6180, #6175
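
The second point can be illustrated with a minimal Go sketch. This is not the code from this PR: the `WorkspaceStatus` type, the status constants, and the `closeOnStopped` and `simulate` helpers are hypothetical names chosen for illustration, and a `net.Pipe` stands in for the real SSH connection. It only shows the general idea of watching a stream of status updates and closing the connection once the workspace reports that it has stopped, which in turn unblocks a reader that would otherwise hang.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"time"
)

// WorkspaceStatus is a hypothetical stand-in for the real workspace status type.
type WorkspaceStatus string

const (
	StatusRunning  WorkspaceStatus = "running"
	StatusStopping WorkspaceStatus = "stopping"
	StatusStopped  WorkspaceStatus = "stopped"
)

// closeOnStopped closes conn as soon as a "stopped" status arrives.
// It returns without closing if ctx is cancelled or the channel is closed first.
func closeOnStopped(ctx context.Context, conn io.Closer, statuses <-chan WorkspaceStatus) {
	for {
		select {
		case <-ctx.Done():
			return
		case s, ok := <-statuses:
			if !ok {
				return
			}
			if s == StatusStopped {
				_ = conn.Close()
				return
			}
		}
	}
}

// simulate feeds the given statuses to the watcher and reports whether a
// Read that is blocked on the connection gets unblocked within 200ms.
func simulate(statuses []WorkspaceStatus) bool {
	client, server := net.Pipe() // in-memory stand-in for the SSH connection
	defer server.Close()
	defer client.Close()

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	ch := make(chan WorkspaceStatus, len(statuses))
	for _, s := range statuses {
		ch <- s
	}
	go closeOnStopped(ctx, client, ch)

	// Nothing is ever written to the pipe, so this Read only returns
	// once the connection is closed.
	readDone := make(chan error, 1)
	go func() {
		_, err := client.Read(make([]byte, 1))
		readDone <- err
	}()

	select {
	case <-readDone:
		return true
	case <-time.After(200 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println(simulate([]WorkspaceStatus{StatusRunning, StatusStopping, StatusStopped}))
	fmt.Println(simulate([]WorkspaceStatus{StatusRunning}))
}
```

In this sketch the first call reports that the blocked read was released, because a stopped status was seen, while the second call times out because the workspace never leaves the running state. The same pattern relates to the first point as well: the session ends because the connection is closed, not because Stdin happened to be closed.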

@mafredri mafredri force-pushed the mafredri/ssh-disconnect branch from cd9d4ad to 0f3af93 Compare April 19, 2023 12:11
@mafredri mafredri force-pushed the mafredri/ssh-disconnect branch from 03db373 to b44db5a Compare April 19, 2023 12:42
@mafredri mafredri force-pushed the mafredri/ssh-disconnect branch from bcf9550 to 918cf07 Compare April 19, 2023 13:04
@mafredri mafredri force-pushed the mafredri/ssh-disconnect branch from fb11493 to f876e84 Compare April 19, 2023 13:29
@mafredri mafredri force-pushed the mafredri/ssh-disconnect branch from 8b539a1 to 767eb3b Compare April 19, 2023 16:18
@mafredri mafredri marked this pull request as ready for review April 19, 2023 16:31
Member

@kylecarbs kylecarbs left a comment

Fantastic tests too!

Member

@code-asher code-asher left a comment

I tried it with Gateway and confirmed it fixes the workspace restart issue! 🎉

This is probably a dumb question but does this mean there could be some problem with a disconnect not getting detected or propagated? Like if this is going over Tailscale for example I would imagine there is some kind of ping/pong that would normally close the connection after a timeout but in Gateway the proxy command hangs indefinitely which seems weird. Maybe that is by design though to keep retrying in the background or something.

@mafredri
Member Author

mafredri commented Apr 19, 2023

> I tried it with Gateway and confirmed it fixes the workspace restart issue! 🎉

Awesome, thanks for testing!

> This is probably a dumb question but does this mean there could be some problem with a disconnect not getting detected or propagated? Like if this is going over Tailscale for example I would imagine there is some kind of ping/pong that would normally close the connection after a timeout but in Gateway the proxy command hangs indefinitely which seems weird. Maybe that is by design though to keep retrying in the background or something.

Not a dumb question at all, and indeed, that can happen. Since we want to tolerate network issues, it’s not clear when to decide to cut off a connection. For instance, we recently made a change in #7196 which keeps the connection alive for 72h. I’m not sure whether it applies to both client and server, but it would make sense (to me) for it to be server only (then again, depending on client machine sleep behavior, it might make sense there too). Another example of where the connection would remain active is if the agent is running in a VM and you do ifdown on the network interface. Not sure how long the connection would remain open in this case (15-25min?), but ultimately it would not help to disconnect either.

@mafredri mafredri merged commit c2871e1 into main Apr 19, 2023
@mafredri mafredri deleted the mafredri/ssh-disconnect branch April 19, 2023 18:32
@github-actions github-actions bot locked and limited conversation to collaborators Apr 19, 2023
@code-asher
Member

Ahhhhhh that makes sense! 72 hours seems like an excessively long time but maybe that is to keep it alive over the weekend or something? I tried adding `ServerAliveInterval` on the client side and that works, so I might add that to the plugin.
