Vectorized HttpUserAgentParser.TryExtractVersion #79
Open
gfoidl wants to merge 2 commits into main from version-vectorization
@@ -0,0 +1,78 @@
// Copyright © https://myCSharp.de - all rights reserved

using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

namespace MyCSharp.HttpUserAgentParser;

internal static class VectorExtensions
{
    extension(ref char c)
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public Vector128<byte> ReadVector128AsBytes(int offset)
        {
            ref short ptr = ref Unsafe.As<char, short>(ref c);

#if NET10_0_OR_GREATER
            return Vector128.NarrowWithSaturation(
                Vector128.LoadUnsafe(ref ptr, (uint)offset),
                Vector128.LoadUnsafe(ref ptr, (uint)(offset + Vector128<short>.Count))
            ).AsByte();
#else
            if (Sse2.IsSupported)
            {
                return Sse2.PackUnsignedSaturate(
                    Vector128.LoadUnsafe(ref ptr, (uint)offset),
                    Vector128.LoadUnsafe(ref ptr, (uint)(offset + Vector128<short>.Count)));
            }
            else if (AdvSimd.Arm64.IsSupported)
            {
                return AdvSimd.Arm64.UnzipEven(
                    Vector128.LoadUnsafe(ref ptr, (uint)offset).AsByte(),
                    Vector128.LoadUnsafe(ref ptr, (uint)(offset + Vector128<short>.Count)).AsByte());
            }
            else
            {
                return Vector128.Narrow(
                    Vector128.LoadUnsafe(ref ptr, (uint)offset),
                    Vector128.LoadUnsafe(ref ptr, (uint)(offset + Vector128<short>.Count))
                ).AsByte();
            }
#endif
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public Vector256<byte> ReadVector256AsBytes(int offset)
        {
            ref short ptr = ref Unsafe.As<char, short>(ref c);

#if NET10_0_OR_GREATER
            return Vector256.NarrowWithSaturation(
                Vector256.LoadUnsafe(ref ptr, (uint)offset),
                Vector256.LoadUnsafe(ref ptr, (uint)offset + (uint)Vector256<short>.Count)
            ).AsByte();
#else
            if (Avx2.IsSupported)
            {
                Vector256<byte> tmp = Avx2.PackUnsignedSaturate(
                    Vector256.LoadUnsafe(ref ptr, (uint)offset),
                    Vector256.LoadUnsafe(ref ptr, (uint)offset + (uint)Vector256<short>.Count));

                Vector256<long> tmp1 = Avx2.Permute4x64(tmp.AsInt64(), 0b_11_01_10_00);

                return tmp1.AsByte();
            }
            else
            {
                return Vector256.Narrow(
                    Vector256.LoadUnsafe(ref ptr, (uint)offset),
                    Vector256.LoadUnsafe(ref ptr, (uint)offset + (uint)Vector256<short>.Count)
                ).AsByte();
            }
#endif
        }
    }
}
I'm pretty sure the implementation is correct, but should we add some kind of toggle (e.g. an environment variable) with which vectorization can be disabled in case there's a bug in the code (e.g. triggered by some strange user agent that we don't know about yet), so that a user could still use the library without having to wait for a hotfix release?
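To make the question concrete, here is a minimal sketch of what such a toggle could look like. The class name `VectorizationSwitch` and the environment variable name are hypothetical; nothing like this exists in the library or in this PR.

```csharp
using System;

// Hypothetical kill switch for the vectorized code path (not part of the library).
public static class VectorizationSwitch
{
    // Hypothetical environment variable name, chosen only for illustration.
    private const string EnvName = "MYCSHARP_HTTPUSERAGENTPARSER_DISABLE_VECTORIZATION";

    // Read once at type initialization; the value does not change afterwards.
    public static readonly bool IsEnabled =
        !IsDisabled(Environment.GetEnvironmentVariable(EnvName));

    // Treats "1" and "true" (any casing) as "disable vectorization".
    public static bool IsDisabled(string? value)
        => value is "1" || string.Equals(value, "true", StringComparison.OrdinalIgnoreCase);
}
```

A caller would then guard the vectorized branch with something like `if (VectorizationSwitch.IsEnabled && Vector128.IsHardwareAccelerated)` and fall back to the scalar path otherwise.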
I don't think an environment variable would be visible to users. It's a kind of hidden black magic, and users will remove the library without checking for it. To me, the IndexOf approach feels like the most stable and still very fast solution for now, no?
IndexOf is quite good, but for the "11.0) like Gecko" benchmark it's 75% slower, and such short versions are quite common.
The specialized vectorized code is nice, but it's a lot of hard-to-maintain code, so I actually don't like this approach very much.
I'd like to keep this PR open for a moment, so I'll be able to explore other approaches for speed-up too.
The easiest one is to shorten the "window" in HttpUserAgentParser/src/HttpUserAgentParser/HttpUserAgentParser.cs (lines 210 to 214 in 4a82130).
Furthermore, the biggest speed-up may come from replacing the linear scan through all possible patterns with a non-linear one.
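For reference, an IndexOf-style extraction of the kind compared against the vectorized code could look roughly like the sketch below. It is an assumption for illustration only: `VersionScanSketch` and `TryFindVersionRange` are hypothetical names, and the library's actual `TryExtractVersion` may work differently.

```csharp
using System;
using System.Buffers;

// Hypothetical sketch of a span-search based version scan (not the library's code).
public static class VersionScanSketch
{
    // Characters assumed to make up a version string: digits and dots.
    private static readonly SearchValues<char> s_versionChars =
        SearchValues.Create("0123456789.");

    public static bool TryFindVersionRange(ReadOnlySpan<char> haystack, out Range range)
    {
        // Find the first digit; the span helpers are vectorized by the runtime.
        int start = haystack.IndexOfAnyInRange('0', '9');
        if (start < 0)
        {
            range = default;
            return false;
        }

        // The version ends at the first character that is neither a digit nor a dot.
        ReadOnlySpan<char> rest = haystack.Slice(start);
        int length = rest.IndexOfAnyExcept(s_versionChars);
        if (length < 0)
        {
            length = rest.Length;
        }

        range = start..(start + length);
        return true;
    }
}
```

For very short inputs such as `11.0) like Gecko`, the fixed cost of two general-purpose search calls is a plausible reason why a specialized loop can win, although the benchmark setup itself is not shown here.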
I already have such an approach worked out in my head, but I need to finish a project at work first; then I'll prototype it and see how it goes.
Feel free, I have complete confidence in you; everything looks impressively good!
The first try with that idea is nice, as the user agent only needs to be scanned once ($O(n)$) and not once for every possibility.
But for some browsers, where we depend on the order in the arrays, it yields the wrong result, so I have to research a bit more.
Some things may also move around (without breaking API changes), so I'd like to leave this PR open so that everything can be done together, or only parts of it.
I hope to continue on this in the coming weeks.
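The ordering problem described in this comment can be illustrated with a small sketch. It assumes a hypothetical pattern table in which a lower index means higher priority; the table, the class `SinglePassSketch`, and the token splitting are illustrative only and are not taken from the library or the prototype.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical single-pass detection that still honours pattern priority.
public static class SinglePassSketch
{
    // Assumed priority order (index 0 wins); not the library's real data.
    private static readonly string[] s_patterns = ["Edg", "Chrome", "Safari"];

    private static readonly Dictionary<string, int> s_priority = BuildPriority();

    private static Dictionary<string, int> BuildPriority()
    {
        var map = new Dictionary<string, int>(StringComparer.Ordinal);
        for (int i = 0; i < s_patterns.Length; i++)
        {
            map[s_patterns[i]] = i;
        }
        return map;
    }

    public static string? Detect(string userAgent)
    {
        int best = int.MaxValue;

        // One pass over the tokens. Returning on the first hit would report
        // "Chrome" for an Edge user agent, because "Chrome/..." appears earlier
        // in the text than "Edg/...". Tracking the best priority avoids that.
        foreach (string token in userAgent.Split(' ', StringSplitOptions.RemoveEmptyEntries))
        {
            int slash = token.IndexOf('/');
            string name = slash >= 0 ? token[..slash] : token;

            if (s_priority.TryGetValue(name, out int priority) && priority < best)
            {
                best = priority;
            }
        }

        return best == int.MaxValue ? null : s_patterns[best];
    }
}
```

The sketch allocates substrings for clarity; a real implementation would more likely work on spans and avoid allocations.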