[API Proposal]: Add APIs to WebSocket which allow it to be read as a Stream #111217
Comments
Tagging subscribers to this area: @dotnet/ncl |
It's an interesting idea, though its use seems limited to cases where you know that only binary content is being transmitted (e.g. only the audio, no control data, no extra framing). I can see it being useful in cases where you just need an opaque Stream. Sample code if someone needed such a Stream:
public sealed class WebSocketStream : Stream
{
    private readonly WebSocket _webSocket;

    public WebSocketStream(WebSocket webSocket) => _webSocket = webSocket;

    public override bool CanRead => _webSocket.State is WebSocketState.Open or WebSocketState.CloseSent;
    public override bool CanWrite => _webSocket.State is WebSocketState.Open or WebSocketState.CloseReceived;
    public override bool CanSeek => false;

    public override void Flush() { }
    public override Task FlushAsync(CancellationToken cancellationToken) => Task.CompletedTask;

    public override Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken) =>
        ReadAsync(buffer.AsMemory(offset, count), cancellationToken).AsTask();

    public override Task WriteAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken) =>
        WriteAsync(buffer.AsMemory(offset, count), cancellationToken).AsTask();

    public override async ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken cancellationToken = default)
    {
        ValueWebSocketReceiveResult result = await _webSocket.ReceiveAsync(buffer, cancellationToken);
        if (result.MessageType != WebSocketMessageType.Binary)
        {
            if (result.MessageType == WebSocketMessageType.Close)
            {
                await _webSocket.SendAsync(ReadOnlyMemory<byte>.Empty, WebSocketMessageType.Close, endOfMessage: true, cancellationToken);
                return 0;
            }
            throw new Exception("Expected binary messages");
        }
        return result.Count;
    }

    public override ValueTask WriteAsync(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken = default) =>
        _webSocket.SendAsync(buffer, WebSocketMessageType.Binary, endOfMessage: true, cancellationToken);

    public override ValueTask DisposeAsync()
    {
        Dispose(true);
        return default;
    }

    protected override void Dispose(bool disposing) => _webSocket.Dispose();

    public override int Read(byte[] buffer, int offset, int count) => ReadAsync(buffer, offset, count).GetAwaiter().GetResult();
    public override void Write(byte[] buffer, int offset, int count) => WriteAsync(buffer, offset, count).GetAwaiter().GetResult();

    public override long Length => throw new NotSupportedException();
    public override long Position { get => throw new NotSupportedException(); set => throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
} |
Triage: We see the value and it should be a relatively low amount of work to implement. Even a simple GH search shows many users implementing similar wrappers themselves, so we're moving this to 10.0. We'll have to figure out a default for how we handle things like close messages, but it should otherwise be straightforward. |
I think we can also take inspiration from NetworkStream, which does a similar thing for a Socket |
@antonfirsov, I see you assigned this to yourself in January. Are you working on it? If not, I'd like to push it forward. |
namespace System.Net.WebSockets;

public class WebSocketStream : Stream
{
    internal WebSocketStream();

    public WebSocket WebSocket { get; }

    public static WebSocketStream Create(WebSocket webSocket, bool ownsWebSocket = false);
    public static WebSocketStream CreateWritableMessageStream(WebSocket webSocket);
    public static WebSocketStream CreateReadableMessageStream(WebSocket webSocket);

    ... // relevant Stream overrides
} |
Would you consider an optional parameter to throw instead? For server scenarios this is risky. I'd rather know we coded a perf bug right away. |
I don't think we have anything like that anywhere else in .NET. You could of course wrap the resulting stream in one that just delegates to the wrapped one for anything other than the synchronous methods and has the synchronous methods all throw. |
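The wrapping approach suggested above could look roughly like the following. This is only a sketch: `AsyncOnlyStream` is a hypothetical name, not part of the proposal, and it assumes that failing fast with `InvalidOperationException` is the desired behavior for `Read`, `Write`, and `Flush`.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper (not part of the proposed API): async I/O is forwarded
// to the inner stream, synchronous I/O throws instead of blocking.
public sealed class AsyncOnlyStream : Stream
{
    private readonly Stream _inner;

    public AsyncOnlyStream(Stream inner) =>
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));

    public override bool CanRead => _inner.CanRead;
    public override bool CanWrite => _inner.CanWrite;
    public override bool CanSeek => false;

    // Asynchronous members delegate to the wrapped stream.
    public override ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken cancellationToken = default) =>
        _inner.ReadAsync(buffer, cancellationToken);
    public override Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken) =>
        _inner.ReadAsync(buffer, offset, count, cancellationToken);
    public override ValueTask WriteAsync(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken = default) =>
        _inner.WriteAsync(buffer, cancellationToken);
    public override Task WriteAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken) =>
        _inner.WriteAsync(buffer, offset, count, cancellationToken);
    public override Task FlushAsync(CancellationToken cancellationToken) =>
        _inner.FlushAsync(cancellationToken);
    public override ValueTask DisposeAsync() => _inner.DisposeAsync();

    // Synchronous I/O fails fast so a sync-over-async call is noticed immediately.
    public override int Read(byte[] buffer, int offset, int count) => throw SyncNotAllowed();
    public override void Write(byte[] buffer, int offset, int count) => throw SyncNotAllowed();
    public override void Flush() => throw SyncNotAllowed();

    public override long Length => throw new NotSupportedException();
    public override long Position { get => throw new NotSupportedException(); set => throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();

    // Synchronous Dispose is still forwarded; whether it should also throw is a judgment call.
    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }

    private static InvalidOperationException SyncNotAllowed() =>
        new InvalidOperationException("Synchronous operations are disallowed on this stream.");
}
```

Because the span-based `Read` and `Write` overloads and `CopyTo` on the base `Stream` route through the array-based methods, they should throw as well without extra overrides.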
Kestrel does have an opt-in |
@stephentoub Thanks for keeping this moving. Unfortunately, we didn’t get a chance to share our feedback on the proposal in time. The message below might now be slightly outdated, as I still need to catch up on the API review recording. I’ll post it anyway and plan to follow up tomorrow with any updates or further thoughts. @antonfirsov did some research of the existing community implementations. The results were:
It's also worth taking a closer look at the WCF WebSocketStream implementation that has an interesting approach:
Some raw notes under cut
I think it makes sense to expose the options covering the highlighted differences as a ctor/factory method parameter, for example:
namespace System.Net.WebSockets;

public class WebSocketStreamOptions
{
    public WebSocketMessageType OutgoingMessageType { get; set; } = WebSocketMessageType.Binary;
    public bool WriteSingleMessagePerStream { get; set; } = false; // EOS is outgoing EOM
    public bool ReadSingleMessagePerStream { get; set; } = false; // incoming EOM is EOS
    public bool OwnsWebSocket { get; set; } = false;
    public TimeSpan DisposeTimeout { get; set; } = TimeSpan.FromSeconds(1); // reflecting the hardcoded timeout from https://github.com/dotnet/runtime/blob/2bd17019c1c01a6bf17a2de244ff92591fc3c334/src/libraries/System.Net.WebSockets/src/System/Net/WebSockets/ManagedWebSocket.cs#L1153
}

public class WebSocketStream : Stream
{
    internal WebSocketStream();

    public WebSocket WebSocket { get; }

    public static WebSocketStream Create(WebSocket webSocket, bool ownsWebSocket = false);
    public static WebSocketStream Create(WebSocket webSocket, WebSocketStreamOptions options);

    ... // relevant Stream overrides
} |
@CarnaViire, with the exception of the dispose draining / timeout, isn't most of that covered by the approved API? It is not hard to implement this on top of WebSocket, so this doesn't need to be everything to everyone. It should address the 90% case simply and well, and I believe the approved proposal does that, no? For something super rare, like wanting to write out text instead of binary, a developer can still use WebSocket directly or layer on their own stream.
Lumping all those options together leads to strange combinations. If WriteSingleMessagePerStream is true, for example, then I would assume the stream doesn't / shouldn't own the websocket, because disposal is then about ending a message, not about closing the underlying websocket. Yet a developer is still presented with the option of setting OwnsWebSocket to true, and is still presented with a disposal timeout that'd never be used. Similarly, having a duplex stream that's somehow responsible for both reading a single message and writing a single message is a confusing mix of concepts. If you instead separate those out into separate methods, you end up with the approved proposal. |
I don't think these combinations are that weird though..?
The Options class approach is also used quite a lot for websockets. And it is handy when the options set is expanded. If we were to add something applicable to all the three factory methods, we'd need to add more and more overloads...
that's exactly why I feel it is important to expose configurability. this is a convenience class, so it should be as convenient as it can IMO...
I believe the specifics of the dispose behavior are important, given that the community implementations all do different things there... I'm not against the factory methods in the proposal, but it feels like if we ever need to expand, we'll end up adding an overload with some kind of options class anyway. |
Yes, and those aren't per "message". I agree bidirectional is the main use case, that's what the Create method is for. Frankly I'd be happy if we just did the Create(WebSocket, bool) overload. Maybe that's the answer. If in the future there's real demand for configuration, another configurable overload can be exposed.
Why must that be a single message? Keeping in mind, as Stephen called out yesterday, that JS in the browser doesn't read partial messages.
For single message? How do you communicate you're done writing the outbound message while still being able to read the inbound message? All that exists in that case to end the message is Dispose, and if you call that, it'd also block until the full read was drained or timed out, and you wouldn't be able to read the rest. Using the same instance for both reading one message and writing one message does not make sense to me. |
Touché 😄
I believe the unidirectional single message case is important enough. It allows for a seamless integration with e.g. JsonSerializer.SerializeAsync(Stream, ...) and JsonSerializer.DeserializeAsync(Stream, ...) (example found in the azure-web-pubsub-bridge sample). Exploring Azure WebSockets further led me to another sample, "Streaming logs using json.webpubsub.azure.v1 subprotocol and native websocket libraries", which shows that the Azure's subprotocol [...]. This makes me wonder whether we should include the message opcode parameter, at least for the writable message stream:
var ws = new ClientWebSocket();
ws.Options.AddSubProtocol("json.webpubsub.azure.v1");
await ws.ConnectAsync(url, default);
while (ReadNextMessage() is {} message)
{
    using var wsStream = WebSocketStream.CreateWritableMessageStream(ws, WebSocketMessageType.Text);
    await JsonSerializer.SerializeAsync(wsStream, message);
} |
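For the reading side of the single-message case, "incoming EOM is EOS" can be sketched directly on top of `WebSocket.ReceiveAsync`. The class name `SingleMessageReadStream` is hypothetical and this is not the proposed implementation. It assumes a close frame should also be reported as end of stream and it does no draining on dispose.

```csharp
using System;
using System.IO;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: a read-only stream that ends when the current message ends.
public sealed class SingleMessageReadStream : Stream
{
    private readonly WebSocket _webSocket;
    private bool _endOfMessage;

    public SingleMessageReadStream(WebSocket webSocket) =>
        _webSocket = webSocket ?? throw new ArgumentNullException(nameof(webSocket));

    public override bool CanRead => true;
    public override bool CanWrite => false;
    public override bool CanSeek => false;

    public override async ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken cancellationToken = default)
    {
        if (_endOfMessage || buffer.IsEmpty) return 0;
        while (true)
        {
            ValueWebSocketReceiveResult result = await _webSocket.ReceiveAsync(buffer, cancellationToken);
            // Assumption: a close frame is treated the same as the end of the message.
            if (result.MessageType == WebSocketMessageType.Close) { _endOfMessage = true; return 0; }
            if (result.EndOfMessage) _endOfMessage = true;
            // Never return 0 before the message has ended: skip empty non-final frames.
            if (result.Count > 0 || _endOfMessage) return result.Count;
        }
    }

    public override Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken) =>
        ReadAsync(buffer.AsMemory(offset, count), cancellationToken).AsTask();

    // Sync-over-async, as discussed in the Risks section of the issue.
    public override int Read(byte[] buffer, int offset, int count) =>
        ReadAsync(buffer.AsMemory(offset, count)).AsTask().GetAwaiter().GetResult();

    public override void Flush() { }
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Length => throw new NotSupportedException();
    public override long Position { get => throw new NotSupportedException(); set => throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

A stream like this could be handed to `JsonSerializer.DeserializeAsync(Stream, ...)`, which reads until the stream reports 0 bytes, so one WebSocket message maps to one deserialized value.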
Seems ok. What is it you'd like the API shape to be then? |
We had an offline discussion to align on next steps for the WebSocketStream API. TL;DR: We agreed on another iteration to refine the API, focusing on key scenarios and essential parameters, clarifying default disposal behavior, and determining necessary timeouts. Further investigation and scenario analysis are planned before finalizing the proposal.
Discussion Summary
Team Alignment:
API Goals:
Parameters and Configurability:
Disposal Behavior & Default Values:
Timeout Considerations:
Next Steps:
More details under cut
Discussion Summary (Extended)
Team Discussion: We had an offline discussion as a team and aligned on the need for another iteration of work and review for the API.
API Iteration: Our goal is to cover 90% of the use cases, but we don't fully understand what this 90% entails yet. During the discussion, I brought up an important use case that we had failed to consider earlier, highlighting the need for more thorough investigation. We had a brief investigation before, but it needs to go deeper to ensure we cover all necessary aspects.
Parameter Agreement: We agreed that we don't want too many parameters for the API. This is a convenience API, so we need to find a balance that provides just enough configurability without covering every single thing. The current state of the API lacks at least one important parameter, WebSocketMessageType. We need to assess the possibility of additional parameters being added in the future. If we need to add one more parameter in the next release, we would have to add it as an overload. Generally, we would like to avoid growing overloads for every new parameter. In such cases, we usually add a property bag (options class) that is passed to the method, so only the property bag grows and the signature of the method remains the same. However, my initial proposal was rejected as too extensive, because some of the options don't work together. That's why the current iteration with three methods is preferable. For example, the "ownsWebSocket" parameter only makes sense for the bidirectional case.
Parameter Investigation: We agreed that we need an investigation to determine which parameters should be included right now, which could be added in the future based on demand, and which are too niche and should be implemented in the user's own stream. This is a convenience API, so it should be driven by scenarios. We need an investigation of the WebSocket usage scenarios, with a focus on integrations with other APIs and tools, as well as important use cases like Azure PubSub. Based on that, we can determine which scenarios represent the 90% and which we consider too niche. We should also consider cases where end users didn't actually implement a stream, but where having one would be extremely beneficial. For example, integration with JSON serialization, which makes the code much more compact and clearer.
Default Values and Behaviors: We will have to make decisions on the default values and behaviors for the things we don't expose as parameters. The most controversial is the dispose behavior. For the duplex stream, we have the ownsWebSocket parameter, which determines whether or not the WebSocket close sequence is triggered on disposal. For a single-message read or write stream, we agree that the WebSocket is not owned by that stream. However, if we didn't read the message until the end, what should happen?
Options:
Timeout Considerations: The timeout is a question in itself because we can decide to expose it. Even though disposal behavior is an implementation detail, it can actually affect the API shape. The timeout is applicable both to the close handshake (note there is already a hardcoded "drain on close" timeout) and to the draining on a single-message-read stream, if we decide to implement it. An open question is whether this draining is actually needed at all. It was present in the WCF WebSocketStream implementation, but we need to check the usages and understand what it was needed for. We know that someone has a copy of the WCF WebSocketStream source in their code. It would be good to know the reason and whether this draining is used at all. If draining is not needed, and if we verify that there is already a timeout enforced over CloseAsync, this will make things much easier because it will save us the dispose-related parameters. No one seems to have complained about close in all this time, so the hardcoded timeout might work well enough.
To-Do Items:
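To make the draining question concrete, here is a sketch of what "drain the rest of the message on dispose, bounded by a timeout" could mean for a single-message read stream. `WebSocketDrain.TryDrainMessageAsync` is a hypothetical helper and the 4096-byte scratch buffer is an arbitrary choice. It illustrates the idea only and is not the WCF or the proposed behavior.

```csharp
using System;
using System.Buffers;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

public static class WebSocketDrain
{
    // Hypothetical helper: reads and discards data until the current message ends,
    // a close frame arrives, or the timeout elapses.
    // Returns true if a message boundary (or close) was reached in time.
    public static async Task<bool> TryDrainMessageAsync(WebSocket webSocket, TimeSpan timeout)
    {
        byte[] scratch = ArrayPool<byte>.Shared.Rent(4096);
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            while (true)
            {
                ValueWebSocketReceiveResult result =
                    await webSocket.ReceiveAsync(scratch.AsMemory(), cts.Token);
                if (result.EndOfMessage || result.MessageType == WebSocketMessageType.Close)
                    return true;
            }
        }
        catch (Exception) when (cts.IsCancellationRequested)
        {
            // Timed out. The exact exception type may depend on the WebSocket implementation.
            return false;
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(scratch);
        }
    }
}
```

One caveat worth checking before relying on a pattern like this: with the built-in managed implementation, cancelling a pending receive is understood to abort the connection, so a drain timeout would likely leave the WebSocket unusable rather than just mid-message.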
@stephentoub, @MihaZupan, @antonfirsov pls let me know if I failed to capture something, or if you had some additional thoughts since our discussion. Thanks! |
EDITED by @stephentoub 4/7/2025 to add API for review:
Open questions:
Background and motivation
Utilizing WebSockets is a convenient approach to writing real-time audio processing code for ASP.NET applications. One such scenario is implementing a real-time conversation with OpenAI.
OpenAI's real-time API SendInputAudioAsync accepts a Stream as input, which leaves it up to the developer to write a custom Stream implementation that reads from an underlying WebSocket. It would be a nice enhancement to the WebSocket APIs if one could wrap read operations in a Stream.
API Proposal
API Usage
Alternative Designs
No response
Risks
WebSocket doesn’t provide synchronous methods for wire-based operations, so all of the Stream sync APIs (including Dispose, which presumably would need to not just Dispose the WebSocket but also CloseAsync it) would be sync-over-async.