stephentoub
Member

| Method      | Mean        |
|-------------|-------------|
| Existing    | 1,279.55 ns |
| Utf8IsValid | 65.23 ns    |

private Utf8MessageState _state = new();
private byte[] _utf8Data = """
    Shall I compare thee to a summer’s day?
    Thou art more lovely and more temperate:
    Rough winds do shake the darling buds of May,
    And summer’s lease hath all too short a date:
    Sometime too hot the eye of heaven shines,
    And often is his gold complexion dimm’d;
    And every fair from fair sometime declines,
    By chance or nature’s changing course untrimm’d;
    But thy eternal summer shall not fade
    Nor lose possession of that fair thou owest;
    Nor shall Death brag thou wander’st in his shade,
    When in eternal lines to time thou growest:
    So long as men can breathe or eyes can see,
    So long lives this and this gives life to thee.
    """u8.ToArray();

[Benchmark]
public bool Existing() => TryValidateUtf8(_utf8Data, true, _state);

[Benchmark]
public bool Utf8IsValid() => Utf8.IsValid(_utf8Data);

@EgorBo
Member

EgorBo commented Jul 14, 2024

Does such a big difference imply that the currently used algorithm is suboptimal? (for non-complete data)

@stephentoub
Member Author

stephentoub commented Jul 14, 2024

> Does such a big difference imply that the currently used algorithm is suboptimal? (for non-complete data)

The existing code has no vectorization or ASCII fast path; it performs multiple operations per byte.

I don't know how common it is, though, to split text messages into multiple fragments. If it's important, we could easily accelerate through ASCII portions. For anything more complicated, I'd prefer we expose an OperationStatus-based (or similar) validation API on Utf8 that this could just use.
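
To illustrate what accelerating through ASCII portions could look like, here is a minimal sketch. It assumes .NET 8 or later for `MemoryExtensions.IndexOfAnyInRange`; the class and method names (`AsciiSkipSketch`, `CountLeadingAscii`) are hypothetical and are not taken from the PR.

```csharp
using System;

public static class AsciiSkipSketch
{
    // Hypothetical helper: returns how many leading bytes of 'data' are ASCII (0x00-0x7F).
    // A validator could skip that many bytes in one step and only fall back to
    // byte-at-a-time checks from the first non-ASCII byte onward.
    public static int CountLeadingAscii(ReadOnlySpan<byte> data)
    {
        // Any byte in 0x80-0xFF belongs to a multi-byte sequence (or is invalid).
        int firstNonAscii = data.IndexOfAnyInRange((byte)0x80, (byte)0xFF);

        // -1 means no such byte was found, so the whole span is ASCII.
        return firstNonAscii < 0 ? data.Length : firstNonAscii;
    }
}
```

A caller would advance its index by the returned count whenever no multi-byte sequence is in progress, then resume the existing per-byte state machine at the first non-ASCII byte.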

I added the fast path I did because I expect it's the most important case, and Utf8 makes it trivial to handle better.
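
A rough sketch of such a fast path follows, under stated assumptions: the shape of the state object and the slow-path delegate are hypothetical stand-ins (`FragmentStateSketch`, `SlowValidator`), not the PR's actual types. Only `Utf8.IsValid` (available in .NET 8 and later) is a real API here. The idea is that a complete message with no carried-over partial sequence can be validated in a single call.

```csharp
using System;
using System.Text.Unicode;

// Hypothetical stand-in for the state carried between message fragments.
public sealed class FragmentStateSketch
{
    // True when a previous fragment ended partway through a multi-byte sequence.
    public bool SequenceInProgress;
}

public static class Utf8FastPathSketch
{
    // Hypothetical delegate representing the existing byte-at-a-time validator.
    public delegate bool SlowValidator(ReadOnlySpan<byte> data, bool endOfMessage, FragmentStateSketch state);

    public static bool TryValidate(
        ReadOnlySpan<byte> data, bool endOfMessage, FragmentStateSketch state, SlowValidator slowPath)
    {
        // Fast path: this fragment completes the message and nothing spilled over
        // from an earlier fragment, so it can be checked as a whole.
        if (endOfMessage && !state.SequenceInProgress)
        {
            return Utf8.IsValid(data);
        }

        // Otherwise defer to the stateful validator, which can track
        // sequences that straddle fragment boundaries.
        return slowPath(data, endOfMessage, state);
    }
}
```

The benchmark at the top of the thread exercises exactly this case: one complete message validated in a single call.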

Member

@CarnaViire CarnaViire left a comment

LGTM, thanks!

But it seems like we (historically) don't have any tests for this method (because they were left in https://github.com/aspnet/WebSockets/blob/aa63e27fce2e9202698053620679a9a1059b501e/test/Microsoft.AspNetCore.WebSockets.Protocol.Test/Utf8ValidationTests.cs)

Can you please bring them in? And see if they also cover the changed (ascii skip) part? Thanks!

@stephentoub
Member Author

> Can you please bring them in? And see if they also cover the changed (ascii skip) part? Thanks!

Done. Thanks.

@stephentoub stephentoub requested a review from CarnaViire July 15, 2024 18:03
Member

@CarnaViire CarnaViire left a comment

:shipit:

@CarnaViire CarnaViire merged commit 3212e3f into dotnet:main Jul 15, 2024
@stephentoub stephentoub deleted the wsutf8 branch July 22, 2024 16:12
@github-actions github-actions bot locked and limited conversation to collaborators Aug 22, 2024
@karelz karelz added this to the 9.0.0 milestone Sep 3, 2024