This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@jamesqo jamesqo commented May 10, 2016

Right now many methods in Char are implemented like this (for example):

public static bool IsDigit(char c) => c >= '0' && c <= '9';

This isn't the most efficient way to implement it though; since these methods are often used in a loop, I've taken advantage of the fact that casting to uint causes the value to wrap, which avoids an additional branch. Here is the above method rewritten using this tactic:

// Similar to what @jkotas suggested in dotnet/corefx#7546
public static bool IsDigit(char c) => (uint)(c - '0') <= (uint)('9' - '0');
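
The rewrite relies on `c - '0'` being computed as an `int`: when `c` is below `'0'` the difference is negative, and the unchecked cast to `uint` wraps it to a very large value that fails the single `<=` test. Below is a minimal sketch that checks the two forms agree for every `char` value. `RangeCheckDemo` and its method names are hypothetical and not part of this PR, and whether the single-comparison form is actually faster depends on the JIT.

```csharp
using System;

// Hypothetical demo class; the names here are not from the PR.
public static class RangeCheckDemo
{
    // Two comparisons, as in the existing implementation.
    public static bool IsDigitTwoCompares(char c) => c >= '0' && c <= '9';

    // One comparison: c - '0' is an int. If c < '0' the result is negative,
    // and the unchecked cast to uint wraps it to a value far above 9.
    public static bool IsDigitOneCompare(char c) =>
        unchecked((uint)(c - '0')) <= (uint)('9' - '0');

    // Returns true if both forms agree on every possible char value.
    public static bool FormsAgree()
    {
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            if (IsDigitTwoCompares(c) != IsDigitOneCompare(c)) return false;
        }
        return true;
    }
}
```

The explicit `unchecked` only matters if the code is compiled with overflow checking enabled; in the default context the cast already wraps.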

I've changed a bunch of static methods in Char to take advantage of this fact and avoid an additional branch, along with a couple of other changes:

  • Used stackalloc, instead of creating a new char array on the heap, for ConvertFromUtf32.
  • Many overloads that accept a string and an index can simply forward to the char-based overload.
  • Added HIGH_SURROGATE_END and LOW_SURROGATE_START consts, and removed references to CharUnicodeInfo for getting those values.

Note: Some of the corefx tests are failing with my changes, but unfortunately I'm not sure why/for what chars as the error messages are not very helpful.

cc @JonHanna @jkotas @mikedn @hughbe

edit: Looks like the string-and-index overloads actually can't just forward to the char-based ones, that's likely why the tests are failing. Will fix in a moment.

@jkotas
Member

jkotas commented May 10, 2016

cc @AlexGhiondea @ellismg




/*================================= ConvertFromUtf32 ============================
** Convert an UTF32 value into a surrogate pair.
==============================================================================*/

-public static String ConvertFromUtf32(int utf32)
+public unsafe static String ConvertFromUtf32(int utf32)

This is a public method. Does adding unsafe affect the public signature?

It does not.

I would, however, like the unsafe modifier scoped to the part of the method that needs it (specifically around the stackalloc). We should strive to minimize the amount of unsafe regions and how much code is in an unsafe context.

Author

@jamesqo jamesqo May 12, 2016

@ellismg I'm going to move the stackalloc part of this PR into a new PR, since (I'm hoping) it won't introduce any merge conflicts and it doesn't really fit in with the rest of this PR. Will address your feedback there.

@hughbe

hughbe commented May 10, 2016

Good stuff! Assuming you've fixed the corefx test failures, this looks good, especially the ConvertFromUtf32 allocation stuff.

@ellismg

ellismg commented May 10, 2016

Could you tease out the stack-allocation 8cba55f into its own commit? It would also be interesting to understand the impact of each of these changes on the relevant microbenchmarks.

@ellismg

ellismg commented May 10, 2016

Added HIGH_SURROGATE_END and LOW_SURROGATE_START consts, and removed references to CharUnicodeInfo for getting those values.

If we are going to clean up here, I would rather just always get the values from CharUnicodeInfo so the constants are defined in one place. They are const in CharUnicodeInfo, so there should not be a codegen impact either way.

char* surrogate = stackalloc char[2];
surrogate[0] = (char)((utf32 / 0x400) + HIGH_SURROGATE_START);
surrogate[1] = (char)((utf32 % 0x400) + LOW_SURROGATE_START);
return new string(surrogate, 0, 2);

@hughbe hughbe May 10, 2016

I'm not sure what may be more performant, stackallocing a char* and constructing a string, or the following that uses an internal method I've seen in System.Globalization, StringBuilder and String itself:

string result = string.FastAllocateString(2);
fixed(char* pResult = result)
{
     pResult[0] = (char)((utf32 / 0x400) + HIGH_SURROGATE_START);
     pResult[1] = (char)((utf32 % 0x400) + LOW_SURROGATE_START);
}
return result;

@ellismg let me know what you think

Author

@hughbe Maybe, but 1) that's more of an implementation detail (it wouldn't work if you took the code and copy/pasted it outside of mscorlib), and 2) it requires pinning/unpinning the string while we write characters to it, so the benefit from that would be questionable. For now, I'm sticking to stackalloc.
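
For reference, the surrogate-pair arithmetic under discussion can be sketched without an `unsafe` context by stack-allocating into a `Span<char>`. This swaps in a different technique from the `char*` version in the diff: it assumes C# 7.2 and a runtime with `Span<T>` (for example .NET Core 2.1 or later), which postdates this PR. `Utf32Demo`, `ConvertSupplementary`, and the constant names are hypothetical, and the sketch only covers code points above U+FFFF, unlike the real `ConvertFromUtf32`.

```csharp
using System;

// Hypothetical stand-in, not the mscorlib implementation.
public static class Utf32Demo
{
    private const char HighSurrogateStart = '\uD800';
    private const char LowSurrogateStart = '\uDC00';

    // Converts a supplementary-plane code point (U+10000..U+10FFFF)
    // into its two-char UTF-16 surrogate pair.
    public static string ConvertSupplementary(int utf32)
    {
        if (utf32 < 0x10000 || utf32 > 0x10FFFF)
            throw new ArgumentOutOfRangeException(nameof(utf32));

        utf32 -= 0x10000; // 20-bit offset into the supplementary planes

        // Stack-allocated buffer; no heap char[] and no unsafe block needed.
        Span<char> surrogate = stackalloc char[2];
        surrogate[0] = (char)((utf32 / 0x400) + HighSurrogateStart); // top 10 bits
        surrogate[1] = (char)((utf32 % 0x400) + LowSurrogateStart);  // low 10 bits
        return new string(surrogate);
    }
}
```

For example, `Utf32Demo.ConvertSupplementary(0x1F600)` yields the pair `\uD83D\uDE00`, matching `char.ConvertFromUtf32(0x1F600)`. Whether this beats a heap-allocated array would still need to be measured.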

@jamesqo
Author

jamesqo commented May 10, 2016

@ellismg Regarding the constant values, maybe we should alias the consts in this file to the ones in CharUnicodeInfo, or add a using static CharUnicodeInfo at the top? I'd like to avoid code duplication as well, but personally I find CharUnicodeInfo.HIGH_SURROGATE_START (for example) a little bit verbose.

@@ -203,7 +203,7 @@ public bool Equals(Char obj)
 [Pure]
 public static bool IsDigit(char c) {
     if (IsLatin1(c)) {
-        return (c >= '0' && c <= '9');
+        return (uint)(c - '0') <= (uint)('9' - '0');
Author

Self-note: I'm considering moving all of these boolean returns into a private helper method like the following:

private bool IsBetweenInclusive(char lowerBound, char upperBound)
{
    return (uint)(m_value - lowerBound) <= (uint)(upperBound - lowerBound);
}

// Usage:
c.IsBetweenInclusive('0', '9');

This way we avoid repeating the value for the lower bound, and the code is less verbose and less error-prone.

@JonHanna

Sorry, I was tagged to take a look at this, but I'm a bit busy and not going to be able to pay much attention to .NET Core things for the next couple of days.
I'll just add two thoughts: The first is that the reduction in branching of the first change described could have very different effects with different degrees of inlining, so if it seems to just break even it might be worth doing anyway. It should still be tested though, and an honest test would try several variations of whether the same branch was taken all the time, alternating, random, etc. and then look at them all.
The second is that I think the stackalloc should definitely be examined separately from the rest. Stackalloc optimisations can sometimes be very disappointing pessimisations, so one should be on guard for that.

@jamesqo
Author

jamesqo commented May 14, 2016

Alright, so I finally got around to making perf tests for this PR. Here are the results:

Since I used Parallel.ForEach to calculate the results (I did 1 billion iterations), the results for each char in the old/new files may not be in the same order.

Notes:

  • IsUpper and IsLower seem to be consistently faster for ASCII chars
  • IsDigit and IsNumber have regressions of ~300% and ~25%, respectively, across the board
  • The numbers of IsSurrogate, IsHighSurrogate and IsLowSurrogate have mostly stayed the same

Would be much appreciated if someone could validate these numbers for me. 😄

@jamesqo jamesqo changed the title [mscorlib] Improve perf for many Char methods [wip] [mscorlib] Improve perf for many Char methods May 14, 2016
@mikedn

mikedn commented May 14, 2016

Would be much appreciated if someone could validate these numbers for me

The benchmark code should be adjusted so that the result of char.IsX calls is used. As is now the JIT may eliminate code and then you get skewed results.

Self-note: I'm considering moving all of these boolean returns into a private helper method like the following: private bool IsBetweenInclusive(char lowerBound, char upperBound)

IsBetween generates worse code even if it is inlined.

@jamesqo
Author

jamesqo commented May 14, 2016

@mikedn

The benchmark code should be adjusted so that the result of char.IsX calls is used. As is now the JIT may eliminate code and then you get skewed results.

Good point, I had just realized that. I'm going to alter my test scheme to do something like the following:

byte unused = 3;

for (int i = 0; i < Outer; i++)
{
    var watch = Stopwatch.StartNew();
    for (int j = 0; j < Inner; j++)
    {
        if (char.IsX(c)) unused++;
    }
    watch.Stop();
    Console.WriteLine(watch.Elapsed);
}

// At the end...
GC.KeepAlive(unused);

This way, if I understand correctly, the JIT will be forced to generate code for the method as it's being used in a branch.
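
A self-contained version of that scheme might look like the sketch below, with `char.IsDigit` standing in for the `IsX` placeholder. `CharBenchSketch` and `Run` are hypothetical names. The counter is returned to the caller so the result of every call is observably used; this is a rough sketch, and a JIT could still hoist a loop-invariant call, so varying the input inside the loop may be needed for trustworthy numbers.

```csharp
using System;
using System.Diagnostics;

// Hypothetical benchmark harness; not the code used for the posted results.
public static class CharBenchSketch
{
    // Times `inner` calls to char.IsDigit, repeated `outer` times, and
    // returns how many calls returned true so the result is consumed.
    public static long Run(char c, int outer, int inner)
    {
        long hits = 0;
        for (int i = 0; i < outer; i++)
        {
            var watch = Stopwatch.StartNew();
            for (int j = 0; j < inner; j++)
            {
                // The call feeds a branch whose effect escapes the loop.
                if (char.IsDigit(c)) hits++;
            }
            watch.Stop();
            Console.WriteLine(watch.Elapsed);
        }
        return hits;
    }
}
```

A caller would print or otherwise use the returned count, for example `Console.WriteLine(CharBenchSketch.Run('5', 10, 100_000_000));`.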

IsBetween generates worse code even if it is inlined.

Wait, really? Why so? (Also I've refactored it into a new static method named IsIntBetween that takes 3 ints, not sure if this helps or not.)

@mikedn

mikedn commented May 14, 2016

Good point, I had just realized that. I'm going to alter my test scheme to do something like the following:

Yeah, that should work.

Wait, really? Why so? (Also I've refactored it into a new static method named IsIntBetween that takes 3 ints, not sure if this helps or not.)

The code that the JIT generates for this kind of method isn't very good, see #914

@jamesqo jamesqo changed the title [wip] [mscorlib] Improve perf for many Char methods [mscorlib] Improve perf for many Char methods May 14, 2016
@ellismg

ellismg commented May 15, 2016

maybe we should alias the consts in this file to the ones in CharUnicodeInfo, or add a using static CharUnicodeInfo at the top? I'd like to avoid code duplication as well, but personally I find CharUnicodeInfo.HIGH_SURROGATE_START (for example) a little bit verbose.

My preference would be to just alias them instead of using static.
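
As a rough illustration of the aliasing option, the sketch below re-declares short local constants whose values come from a single defining class. `UnicodeInfoStandIn` is a hypothetical stand-in for CharUnicodeInfo (whose constants are internal to mscorlib and cannot be referenced from outside it), and `CharAliasDemo` is likewise made up for the example.

```csharp
using System;

// Hypothetical stand-in for CharUnicodeInfo: the single place where
// the surrogate range constants are defined.
internal static class UnicodeInfoStandIn
{
    internal const char HIGH_SURROGATE_START = '\uD800';
    internal const char HIGH_SURROGATE_END = '\uDBFF';
    internal const char LOW_SURROGATE_START = '\uDC00';
    internal const char LOW_SURROGATE_END = '\uDFFF';
}

public static class CharAliasDemo
{
    // Local aliases: short names at the use site, values defined once elsewhere.
    // Both sides are const, so the compiler folds them to the same literals.
    private const char HIGH_SURROGATE_START = UnicodeInfoStandIn.HIGH_SURROGATE_START;
    private const char HIGH_SURROGATE_END = UnicodeInfoStandIn.HIGH_SURROGATE_END;

    public static bool IsHighSurrogate(char c) =>
        c >= HIGH_SURROGATE_START && c <= HIGH_SURROGATE_END;
}
```

Compared with `using static`, the aliases keep it visible in the file which constants are borrowed and from where.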

@jamesqo
Author

jamesqo commented May 15, 2016

Ok guys, so I finally got around to making another round of perf tests for this change. Here is the source code, the old results, and the new results.

Since the files are quite large / tedious to go through manually, I wrote a script to analyze the test results (you can view the output here).

Notes:

  • Most of the times have improved from this change (433 vs 127)
  • Most of the regressions seem to come from IsDigit (34)
  • There were zero IsWhiteSpace, IsSymbol or IsPunctuation cases that regressed
    • Only two IsControl cases have regressed
  • All of the IsLetter benchmarks that regressed come from ASCII characters
  • IsUpper only regresses for non-ASCII characters, IsLower for the most part as well (although it has a few regressions for ASCII)
  • There were only 5 regressions for IsNumber, and they all came from ASCII

edit: Ok, I've removed all of the ASCII-related changes, and only kept the ones affecting switch statements like IsSymbol, IsWhiteSpace, or IsPunctuation. Here are my final results: https://gist.github.com/anonymous/c5550fbe75bf36a13b94b7859ea127fa

I think this is finally ready to be merged. 😄

@jamesqo jamesqo changed the title [mscorlib] Improve perf for many Char methods [wip] [mscorlib] Improve perf for many Char methods May 15, 2016
@jamesqo jamesqo changed the title [wip] [mscorlib] Improve perf for many Char methods [mscorlib] Improve perf for many Char methods May 15, 2016
@jamesqo
Author

jamesqo commented Aug 20, 2016

Closing this for now, I have a better one coming up in the future...

7 participants