feat: add immortal streams manager #19225
// Always dial localhost; internal listeners are handled by the dialer.
addr := fmt.Sprintf("localhost:%d", port)
conn, err := m.dialer.DialContext(ctx, addr)
Since we are hard-coding the host portion of the address, it doesn't convey any information. Why not make the dialer just take the port?
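A minimal sketch of what a port-only dialer could look like. The names `PortDialer`, `loopbackDialer`, and `loopbackAddr` are hypothetical and not taken from the PR; only `net.Dialer.DialContext` and `net.JoinHostPort` are real standard-library calls.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strconv"
)

// PortDialer is a hypothetical interface: callers pass only the port,
// and the implementation decides which host to dial.
type PortDialer interface {
	DialPort(ctx context.Context, port uint16) (net.Conn, error)
}

// loopbackDialer (hypothetical) always targets the local host.
type loopbackDialer struct{}

// loopbackAddr builds the only address this dialer ever uses for a port.
func loopbackAddr(port uint16) string {
	return net.JoinHostPort("localhost", strconv.Itoa(int(port)))
}

func (loopbackDialer) DialPort(ctx context.Context, port uint16) (net.Conn, error) {
	var d net.Dialer
	return d.DialContext(ctx, "tcp", loopbackAddr(port))
}

func main() {
	// Start a throwaway listener so there is something to dial.
	ln, err := net.Listen("tcp", "localhost:0")
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	defer ln.Close()
	port := uint16(ln.Addr().(*net.TCPAddr).Port)

	var d PortDialer = loopbackDialer{}
	conn, err := d.DialPort(context.Background(), port)
	if err != nil {
		fmt.Println("dial:", err)
		return
	}
	defer conn.Close()
	fmt.Println("connected to", conn.RemoteAddr())
}
```

With this shape the caller never formats an address string, so the host cannot drift out of sync with what the dialer actually does.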
// Always dial localhost; internal listeners are handled by the dialer.
addr := fmt.Sprintf("localhost:%d", port)
conn, err := m.dialer.DialContext(ctx, addr)
Holding the lock while dialing will prevent creation of any new streams, reconnection to old streams, etc.
disconnectedAt := stream.LastDisconnectionAt()
// Prioritize streams that have actually been disconnected over never-connected streams
Doesn't this mean that if a client creates a stream and then fails to connect to it, that stream will never get evicted?
RFC says:
- When a new stream is created that would put us over the limit, we transparently kill the oldest, disconnected Stream.
- If all 32 Streams are currently connected, we refuse to evict and return an error on the create API call.
To me, this reads as prioritizing currently connected streams only. Nothing about never-connected vs connected-then-disconnected.
I see there is some ambiguity about what we mean by "oldest." What makes sense to me is to take the newest datetime among created_at, disconnected_at, and use that for a disconnected stream's age.
closed := stream.closed
handshaking := stream.handshakePending
streamDisconnected := !stream.connected
pipeDisconnected := stream.pipe != nil && !stream.pipe.Connected()
This would be a lot simpler if the pipe is never nil'd out. It's not clear to me why Stream.Close() needs to set the pipe to nil.
// Closing the channel wakes all waiters exactly once
select {
case <-s.shutdownChan:
// already closed
Unhittable. You already check whether we are closed on line 319. Just close the channel without this select.
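For illustration, a sketch of the simplified close path. It assumes the earlier closed check and the channel close happen under the same mutex; the `stream` type here is a cut-down hypothetical, not the PR's struct.

```go
package main

import (
	"fmt"
	"sync"
)

// stream is a hypothetical, cut-down version holding only what Close needs.
type stream struct {
	mu           sync.Mutex
	closed       bool
	shutdownChan chan struct{}
}

// Close is idempotent: the closed flag guarantees the channel is closed once,
// so no select is needed to guard against a double close.
func (s *stream) Close() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.closed {
		return
	}
	s.closed = true
	close(s.shutdownChan) // wakes every waiter
}

func main() {
	s := &stream{shutdownChan: make(chan struct{})}
	s.Close()
	s.Close() // second call returns early instead of panicking
	<-s.shutdownChan
	fmt.Println("shutdown observed")
}
```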
default:
// already requested; coalesced
}
}
It seems unnecessarily complicated to have the stream signal the pipe to reconnect, which itself calls back into the stream to wait for a connection. It involves multiple channels and a sync.Cond.
Instead, on the Agent side, the Reconnect callback can be a no-op. There is nothing to do on the agent side when the unreliable connection goes down.
Then, when a new connection comes in like this, directly call a method on the backed pipe to do the reconnection & replay.
// UpdateTailnetConn updates the tailnet connection and agent address.
// This allows the LocalDialer to start using tailscale network routing.
func (d *LocalDialer) UpdateTailnetConn(tailnetConn *tailnet.Conn) {
This smells funny. The tailnet Conn should be part of initializing immortal streams in the first place, not added after the fact.
Immortal Streams are part of the Agent's HTTP API, which cannot be accessed until the tailnet network is initialized, so there should be no problem with strict dependencies at initialization.
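A sketch of constructor injection along those lines. The `internalDialer` interface, `newLocalDialer`, and `fakeTailnet` are hypothetical names used only for illustration; the real tailnet type and its methods are not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// internalDialer is a hypothetical, narrowed view of the tailnet connection.
type internalDialer interface {
	DialInternalTCP(port uint16) (net.Conn, error)
}

// localDialer (hypothetical) receives its dependency once, at construction.
type localDialer struct {
	internal internalDialer // never replaced after newLocalDialer returns
}

// newLocalDialer refuses to build a dialer without its dependency, so there
// is no window in which the dialer exists but cannot route internally.
func newLocalDialer(internal internalDialer) (*localDialer, error) {
	if internal == nil {
		return nil, errors.New("local dialer requires a tailnet connection")
	}
	return &localDialer{internal: internal}, nil
}

// fakeTailnet is a test double standing in for the real connection.
type fakeTailnet struct{}

func (fakeTailnet) DialInternalTCP(uint16) (net.Conn, error) {
	client, _ := net.Pipe()
	return client, nil
}

func main() {
	if _, err := newLocalDialer(nil); err != nil {
		fmt.Println("rejected:", err)
	}
	d, err := newLocalDialer(fakeTailnet{})
	fmt.Println(d != nil, err)
}
```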
// connection was delivered to it. If no internal listener exists for the port,
// it returns (nil, false, nil). If an unexpected error occurs while attempting
// to wire up the connection, an error is returned.
func (c *Conn) DialInternalTCP(_ context.Context, dstPort uint16) net.Conn {
Comment lies about return values. This should return (net.Conn, error) and report a proper error instead of nil, partly for logging and partly for consistency of style.
server, client := net.Pipe()
// Deliver the server end to the listener asynchronously.
go handler(server)
Seems strange to me that we don't wait for the handler; this is the bit that actually connects to the listener. Waiting for the dial to complete is the usual behavior of a dialer.
logger slog.Logger

// localDialer handles traditional local network connections
localDialer *net.Dialer
Confusing to have a field called localDialer on a struct called LocalDialer.
Do we even need this? It's always set to a zero-valued *net.Dialer, so you could just instantiate one directly when needed.
d := net.Dialer{}
return d.DialContext(ctx, "tcp", address)
Go automatically handles the pointer receiver.
Added an "Immortal Streams" feature to the agent that maintains persistent TCP connections to local services, allowing clients to reconnect without losing data.
What changed?
immortalstreams package in the agent that provides persistent TCP connections to local services