mcp: fix goroutine leaks in unit tests #496
// get a signal when the server process exits
onExit := make(chan struct{})
go func() {
	cmd.Process.Wait()
	close(onExit)
}()
This is just simplified to cmd.Process.Wait() further down below.
But the previous logic was serving a purpose: the test would wait at most 5s.
Generally speaking, it's bad form to depend on timing in tests, but pragmatically it can be useful, and in any case we probably should separate this simplification into a separate CL, if we really care about it.
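For reference, the bounded wait under discussion can be sketched roughly as below. This is a minimal illustration, not the code from this PR: waitWithTimeout is a hypothetical helper, and the goroutine that sleeps before closing onExit stands in for the one that calls cmd.Process.Wait().

```go
package main

import (
	"fmt"
	"time"
)

// waitWithTimeout reports whether done was closed before d elapsed.
// (Hypothetical helper, named here only for illustration.)
func waitWithTimeout(done <-chan struct{}, d time.Duration) bool {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case <-done:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	// Stand-in for the goroutine that calls cmd.Process.Wait()
	// and then closes onExit.
	onExit := make(chan struct{})
	go func() {
		time.Sleep(10 * time.Millisecond) // pretend the process takes a moment to exit
		close(onExit)
	}()

	if waitWithTimeout(onExit, 5*time.Second) {
		fmt.Println("server exited")
	} else {
		fmt.Println("timed out waiting for server to exit")
	}
}
```

The trade-off is the one described in this thread: the select keeps a hung server from stalling the test for longer than the chosen bound, at the cost of an extra goroutine and a timing dependency in the test.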
I'm fine with removing this if you prefer the more complex setup.
Just sharing my motivation: In my experience it's typically acceptable to let the runtime / CI enforce a timeout in these situations (even if the default timeout of 10m can feel long). For the happy path (no test failures) it makes no difference, but you get slightly less complexity in your test. When the test starts to hang, I would notice during local development, where I would expect developers to run it in isolation and make use of the -timeout flag of go test.
Ok, this is fine.
mcp/server.go
Outdated
if err := ss.mcpConn.Close(); err != nil {
	connErr = fmt.Errorf("failed to close mcp connection: %w", err)
}
I checked the code, and it seemed like an oversight to me that the mcpConn was never closed, but please double-check if I missed something here.
I think this should just be ss.Close().
The mcpConn is closed automatically when the jsonrpc2 conn is closed (this happens in the overly complex connect function).
Fixed via a43fb6f
realServer := httptest.NewServer(NewStreamableHTTPHandler(func(*http.Request) *Server { return server }, nil))
defer realServer.Close()
t.Cleanup(func() {
	t.Log("Closing real HTTP server")
Given that this function currently does not pass the leak check because of #499, I decided to leave in the logs that I added during debugging.
Thanks, makes sense. How about also adding a comment referencing #499? Something like:
// Until we have a way to clean up abandoned sessions, this test will leak goroutines (see #499)
Done via f41256c
@findleyr let me know what you think when you have time :)
Thanks for your patience, and for these improvements. I got busy and had to put down this review temporarily.
mcp/transport.go
Outdated
rcErr := r.rc.Close()

var wcErr error
if r.wc != nil {
Let's add a comment that we only allow a nil writer for tests.
Done via 415ff56
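To illustrate the pattern discussed in this thread, here is a minimal sketch of a Close method that tolerates a nil writer and carries the requested comment. The rwc type, stubCloser, and errStub are hypothetical stand-ins, and combining the two errors with errors.Join is an assumption rather than what mcp/transport.go necessarily does.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errStub is a fixed error used by the stub below (illustration only).
var errStub = errors.New("stub close failure")

// stubCloser is a test double whose Close returns a fixed error.
type stubCloser struct{ err error }

func (s stubCloser) Read(p []byte) (int, error)  { return 0, io.EOF }
func (s stubCloser) Write(p []byte) (int, error) { return len(p), nil }
func (s stubCloser) Close() error                { return s.err }

// rwc is a hypothetical stand-in for the transport's reader/writer pair.
type rwc struct {
	rc io.ReadCloser
	wc io.WriteCloser // may be nil, but only in tests
}

// Close closes the reader and, if present, the writer,
// and reports any errors from both.
func (r *rwc) Close() error {
	rcErr := r.rc.Close()

	var wcErr error
	if r.wc != nil { // a nil writer is only allowed for tests
		wcErr = r.wc.Close()
	}
	// errors.Join returns nil when both errors are nil.
	return errors.Join(rcErr, wcErr)
}

func main() {
	readOnly := &rwc{rc: stubCloser{}}
	fmt.Println("nil writer:", readOnly.Close())

	failing := &rwc{rc: stubCloser{}, wc: stubCloser{err: errStub}}
	fmt.Println("failing writer:", failing.Close())
}
```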
I have addressed all comments and merged the most recent
Thanks, just superficial comments at this point. Really appreciate your time and diligence in tracking this down.
My largest comment was that I don't want to add a dependency on goleak, and would prefer to do this analysis in an ad-hoc manner. (I actually thought I'd already left that feedback, but alas my review was still pending--sorry).
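As one possible shape for such an ad-hoc check, the sketch below compares runtime.NumGoroutine before and after running a function, using only the standard library. The leaked helper is hypothetical and much cruder than a dedicated leak detector: it only counts goroutines and can be confused by unrelated goroutines starting or stopping concurrently.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// leaked runs f and reports how many more goroutines exist afterwards
// than before, polling for up to wait so that goroutines which are
// still shutting down have a chance to exit.
// (Hypothetical helper, for illustration only.)
func leaked(f func(), wait time.Duration) int {
	before := runtime.NumGoroutine()
	f()
	deadline := time.Now().Add(wait)
	for {
		n := runtime.NumGoroutine() - before
		if n <= 0 {
			return 0
		}
		if time.Now().After(deadline) {
			return n
		}
		time.Sleep(time.Millisecond)
	}
}

func main() {
	stop := make(chan struct{})
	n := leaked(func() {
		go func() { <-stop }() // blocks until stop is closed, so it outlives f
	}, 50*time.Millisecond)
	fmt.Println("leaked goroutines:", n)
	close(stop)
}
```

When a count like this is non-zero, dumping stacks (for example with runtime.Stack) is one way to see which goroutines are still running.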
  // Connect the server and client...
  t1, t2 := mcp.NewInMemoryTransports()
- if _, err := s.Connect(ctx, t1, nil); err != nil {
+ sess1, err := s.Connect(ctx, t1, nil)
s/sess1/serverSession (or ss)
s/sess2/clientSession (or cs)
sess1 and sess2 obscure the fact that these variables have different types.
  handler := mcp.NewStreamableHTTPHandler(func(r *http.Request) *mcp.Server {
  	return server
- }, &mcp.StreamableHTTPOptions{JSONResponse: true})
+ }, &mcp.StreamableHTTPOptions{JSONResponse: true, Stateless: true})
Why make this Stateless?
This example is demonstrating how to use the API, so using 'Stateless' here may be distracting. Would prefer to leave a comment that this test may leak goroutines, as you've done below.
github.com/google/go-cmp v0.7.0
github.com/google/jsonschema-go v0.3.0
github.com/yosida95/uritemplate/v3 v3.0.2
go.uber.org/goleak v1.3.0
Hmm, I don't think we should add an additional dependency just for this purpose. It seems like handling these as a one-off, every once in a while, is sufficient for now.
Hi Friedrich, we really appreciate this contribution and would like to land it. Let me know if you'd like me to take it over (mea culpa for the review latency--it has been a busy few weeks!).
This PR fixes goroutine leaks in all unit tests.
To find the leaks I integrated https://github.com/uber-go/goleak, and I suggest we keep using it to catch any future regressions.
I used the following script to more easily find individual tests which were leaking goroutines: