Improve encoding performance #10124
@@ -516,6 +534,7 @@ internal override unsafe int GetCharCount(byte* bytes, int count, DecoderNLS bas

// For fallback we may need a fallback buffer
DecoderFallbackBuffer fallbackBuffer = null;
char* charsForFallback;
Is this used?
In the log I only see: src\System\Text\UTF32Encoding.cs(537,19): warning CS0168: The variable 'charsForFallback' is declared but never used [D:\j\workspace\x64_release_w---0575cb46\src\mscorlib\System.Private.CoreLib.csproj]
Fixed.
@@ -369,7 +371,9 @@ internal override unsafe int GetByteCount(char* chars, int count, EncoderNLS bas

// Do our fallback. Actually we already know its a mixed up surrogate,
// so the ref pSrc isn't gonna do anything.
fallbackBuffer.InternalFallback(unchecked((char)ch), ref pSrc);
pSrcForFalback = pSrc; // Avoid passing pSrc by reference to allow it to be enregistered
typo 'falback'
and elsewhere
{
chars = charsForFallback; |
It might be clearer, and help avoid a missed path, to set chars in only one place, i.e.:
charsForFallback = chars; // Avoid passing chars by reference to allow it to be enregistered
bool result = fallbackBuffer.InternalFallback(byteBuffer, bytes, ref charsForFallback);
chars = charsForFallback;
if (result)
{
LGTM, although there is a possibly related test failure. Is there an issue to fix the JIT so it can do this by itself without this ugly workaround?
The analysis needed to prove that a taken address does not escape gets complex fast. The JIT does a little of it today, and the C/C++ optimizing backends do somewhat more; but you never want to take the address of your perf-critical loop control locals in any case, and you need to structure your code accordingly. The encoding fallback implementation is pretty poorly structured. I hope we will get a chance to refactor it as part of aligning the implementation with the corefxlab one. The fallback support was not in the code when I originally fine-tuned the encoding. It was added later, together with the ref on the perf-critical locals, by somebody who did not know what he was doing.
@dotnet-bot test Windows_NT x86 Checked Build and Test please
This is a low-risk stopgap measure to improve encoding performance, because we may not have enough time to improve the test coverage (https://github.com/dotnet/corefx/issues/16334) and refactor the code to use the corefxlab high-performance encoding routines for 2.0.
The change avoids passing the key loop control variables by ref to the invalid character fallback routines. Taking the address of a variable prevents RyuJIT from enregistering it.
Results: UTF8 decoding of 1k ASCII characters is 1.65x faster; other affected code paths show similar gains.
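For readers unfamiliar with the pattern, here is a minimal sketch of the workaround in safe C#. It is not the actual encoding code: `EnregisterSketch`, `HandleInvalid`, and both `CountAscii*` methods are hypothetical names invented for illustration, and an `int` index stands in for the `char*`/`byte*` loop pointers used in the real routines. The sketch only shows the shape of the change (copy the loop variable into a temporary, pass the temporary by ref, copy it back); it makes no claim about the speedup any particular JIT version would give.

```csharp
using System;

public static class EnregisterSketch
{
    // Hypothetical stand-in for a fallback routine: it may move the position
    // forward, which is why it takes the position by ref.
    public static bool HandleInvalid(byte[] data, ref int pos)
    {
        pos++; // skip the invalid byte
        return true;
    }

    // Before: the loop control variable 'i' is passed by ref directly, so its
    // address is taken for the whole method.
    public static int CountAsciiRefDirect(byte[] data)
    {
        int count = 0;
        int i = 0;
        while (i < data.Length)
        {
            if (data[i] < 0x80) { count++; i++; }
            else HandleInvalid(data, ref i);
        }
        return count;
    }

    // After: only a short-lived temporary has its address taken; 'i' itself
    // is never passed by ref, leaving it eligible for a register.
    public static int CountAsciiRefViaCopy(byte[] data)
    {
        int count = 0;
        int i = 0;
        while (i < data.Length)
        {
            if (data[i] < 0x80) { count++; i++; }
            else
            {
                int iForFallback = i; // avoid passing 'i' by reference
                HandleInvalid(data, ref iForFallback);
                i = iForFallback;     // write the result back in one place
            }
        }
        return count;
    }

    public static void Main()
    {
        byte[] input = { 0x41, 0x42, 0xFF, 0x43 };
        // Both variants must agree; only the rare fallback path differs in shape.
        Console.WriteLine(CountAsciiRefDirect(input) == CountAsciiRefViaCopy(input));
    }
}
```

The extra copies are confined to the rare invalid-input branch, so the hot ASCII path pays nothing for them.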