Fix websocket/yamux/drpc goleak failure on Windows #317
Comments
Pretty sure it's this: It's using a |
Just confirmed with nhooyr that's it! Working on a fix now.
Closes #317. We depended on the context canceling the yamux connection, but this isn't a sync operation. Explicitly calling close ensures the handler waits for yamux to complete before exit.
Turns out that is a leak but isn't the cause of our problem. Not synchronously closing yamux caused this leak.
Closes #317. The httptest server cancels the context after the connection is closed, but if a connection takes a long time to close, the request would never end. This applies a context to the entire listener that cancels on test cleanup. After discussion with @bryphe-coder, reducing the parallel limit on Windows is likely to reduce failures as well.
* fix: Leaking yamux session after HTTP handler is closed
  Closes #317. The httptest server cancels the context after the connection is closed, but if a connection takes a long time to close, the request would never end. This applies a context to the entire listener that cancels on test cleanup. After discussion with @bryphe-coder, reducing the parallel limit on Windows is likely to reduce failures as well.
* Switch to windows-2022 to improve decompression
* Invalidate cache on matrix OS
* feat: Add workspace agent for SSH
  This adds the initial agent that supports TTY and execution over SSH. It functions across macOS, Windows, and Linux. This does not handle the coderd interaction yet, but does set up a simple path forward.
* Fix pty tests on Windows
* Fix log race
* Lock around dial error to fix log output
* Fix context return early
* fix: Leaking yamux session after HTTP handler is closed
  Closes #317. We depended on the context canceling the yamux connection, but this isn't a sync operation. Explicitly calling close ensures the handler waits for yamux to complete before exit.
* Lock around close return
* Force failure with log
* Fix failed handler
* Upgrade dep
* Fix defer inside loops
* Fix context cancel for HTTP requests
* Fix resize
Fixing the yamux portion helped, but nhooyr/websocket still has its leak. https://github.com/coder/coder/runs/5265928205?check_suite_focus=true#step:7:44
There is a failure that occurs somewhat regularly in CI: the tests pass, but goleak reports a failure.
Link to example manifestation: https://github.com/coder/coder/runs/5239756778?check_suite_focus=true#step:7:44
I suspect we're failing to close a connection or server - it seems like this is stemming from
and an issue where a connection is staying open between the provisioner <-> coderd host.
Interestingly, I've only observed this particular failure on Windows runs.