Use Utf8.IsValid to optimize ManagedWebSocket.TryValidateUtf8 #104865
stephentoub commented Jul 14, 2024
| Method | Mean |
|---|---|
| Existing | 1,279.55 ns |
| Utf8IsValid | 65.23 ns |
Tagging subscribers to this area: @dotnet/ncl
Does such a big difference imply that the currently used algorithm is suboptimal? (for non-complete data)
There's no vectorization or ASCII fast path in the existing code; instead it performs multiple operations per byte. I don't know how common it is, though, to split text messages into multiple fragments. If it's important, we could easily accelerate through the ASCII portions. For anything more complicated, I'd prefer we expose an OperationStatus-based (or similar) validation API on Utf8 that this could just use. I added the fast path I did because I expect it's the most important case, and Utf8 makes it trivial to handle better.
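To make the two ideas in this comment concrete, here is a minimal sketch, assuming .NET 8 or later. The class and method names are hypothetical and this is not the code from the PR. It only shows how a complete message can be handed to `Utf8.IsValid`, and how a leading ASCII run could be skipped in a fragment when no multi-byte sequence is pending from an earlier fragment.

```csharp
using System;
using System.Text.Unicode;

// Hypothetical helper names for illustration only; not the actual ManagedWebSocket code.
public static class Utf8FastPathSketch
{
    // A complete, unfragmented text payload can be validated in a single call.
    // Utf8.IsValid is the real System.Text.Unicode API (available since .NET 8).
    public static bool IsCompleteMessageValid(ReadOnlySpan<byte> payload) =>
        Utf8.IsValid(payload);

    // Counts the leading bytes in the range 0x00..0x7F.
    // Assumption: the caller has no partially decoded sequence carried over from
    // a previous fragment; otherwise an ASCII byte here would be an error, not
    // something to skip. The remainder of the fragment would still need a
    // stateful per-byte validator.
    public static int CountLeadingAscii(ReadOnlySpan<byte> fragment)
    {
        int firstNonAscii = fragment.IndexOfAnyExceptInRange((byte)0x00, (byte)0x7F);
        return firstNonAscii < 0 ? fragment.Length : firstNonAscii;
    }
}
```

For the UTF-8 bytes of "héllo", `IsCompleteMessageValid` returns true and `CountLeadingAscii` returns 1; for a lone `0xC3` lead byte, `IsCompleteMessageValid` returns false because the sequence is truncated.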
LGTM, thanks!
But it seems like we (historically) don't have any tests for this method (because they were left in https://github.com/aspnet/WebSockets/blob/aa63e27fce2e9202698053620679a9a1059b501e/test/Microsoft.AspNetCore.WebSockets.Protocol.Test/Utf8ValidationTests.cs)
Can you please bring them in? And see if they also cover the changed (ASCII skip) part? Thanks!
Done. Thanks.