Fix to issue #2257 - Trivial change to make Queue<T>'s Enqueue / Dequeue twice faster #2515
newcapacity = _array.Length + MinimumGrow;
}
SetCapacity(newcapacity);
Grow();
Why was this refactored into a separate method? Does it help with inlining?
No, it has nothing to do with inlining. It was just instinct :)
Ok. In that case please revert it; if it's beneficial to separate it out, that can be done separately.
Ok, this is what I've found.

lea eax,[rdx+1]
mov dword ptr [rsi+1Ch],eax
cmp r8d,eax
jne 00007FF8F4E708E8
xor eax,eax
mov dword ptr [rsi+1Ch],eax

to

lea rax,[rsi+1Ch]
mov edx,dword ptr [rax]
inc edx
mov dword ptr [rax],edx
mov rcx,qword ptr [rsi+8]
cmp dword ptr [rcx+8],edx
jne 00007FF8F4E80920
xor edx,edx
mov dword ptr [rax],edx

That's probably because JIT is not yet perfect at inlining methods with ref params pointing to fields. But with @sharwell's suggestion, looking like:

int tail = _tail;
_array[tail] = item;
Increment(ref tail);
_tail = tail;

JIT does a perfect job of inlining the call to Increment:

inc eax
cmp ecx,eax
jne 00007FF8F4E7079D
xor eax,eax
mov dword ptr [rsi+1Ch],eax

With the changes enqueuing a million ints takes about 5.5 ms. The results for Dequeue: 9.3 ms vs 3 ms.
Thanks, @omariom. Sounds like adopting both Nick's and Sam's suggestions is the way to go.
@omariom: Keep in mind the same thing (local variable to ensure single-update to the field) can be applied to the location where
I've made the discussed changes: https://github.com/omariom/corefx/commit/8c22635ac1e38433244b8e7351c7d653906a4db6 What to do now with Grow method? Should I create a separate issue or just a PR will be enough?
I have to say I'm disappointed that the code has gotten repetitive. I'd really like to capture everything in the increment helper if at all possible. cc @CarolEidt to see if there's something else we can use to get around the sub-optimal inline...
I can turn it back to usage of the fields (without copies). It was good enough already.
Thanks, @omariom. I'd suggest not worrying about the increased number of instructions for now; as you say, that's something that'll just get better as the backend improves. We'd have the helper like:

private void MoveNext(ref int value)
{
    int tmp = value + 1;
    value = (tmp == _array.Length) ? 0 : tmp;
}

which would be used at call sites like:

_array[_tail] = item;
MoveNext(ref _tail); // instead of _tail = (_tail + 1) % _array.Length;
_size++;
_version++;

and that should provide the bulk of the wins while keeping the call sites simple.
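To see the suggested helper in context, here is a minimal sketch of a fixed-capacity ring buffer built around it. The `RingQueue<T>` class, its constructor, and its exceptions are hypothetical and only for illustration; this is not the actual `Queue<T>` source, which also grows the array and tracks `_version`.

```csharp
using System;

// Hypothetical fixed-capacity ring buffer; NOT the real Queue<T> implementation.
public sealed class RingQueue<T>
{
    private readonly T[] _array;
    private int _head; // index of the next element to dequeue
    private int _tail; // index of the next free slot
    private int _size; // number of stored elements

    public RingQueue(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _array = new T[capacity];
    }

    public int Count => _size;

    public void Enqueue(T item)
    {
        if (_size == _array.Length) throw new InvalidOperationException("Queue is full.");
        _array[_tail] = item;
        MoveNext(ref _tail); // instead of _tail = (_tail + 1) % _array.Length;
        _size++;
    }

    public T Dequeue()
    {
        if (_size == 0) throw new InvalidOperationException("Queue is empty.");
        T item = _array[_head];
        _array[_head] = default(T); // drop the reference held by the slot
        MoveNext(ref _head);
        _size--;
        return item;
    }

    // Increments the index, wrapping to 0 when it reaches the end of the array.
    private void MoveNext(ref int index)
    {
        int tmp = index + 1;
        index = (tmp == _array.Length) ? 0 : tmp;
    }
}
```

Filling a `RingQueue<int>` of capacity 3, dequeuing one element, and enqueuing another exercises the wrap-around path: `_tail` moves from index 2 back to 0 without any `%` operation.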
@stephentoub
Thanks, but what about the local temp in the helper? Can you also please squash this down to a single commit?
Fixed and squashed 👌
// Increments the index wrapping it if necessary.
private void MoveNext(ref int index)
{
    // It is tempting to use the reminder operator here but it is actually much slower
Nit: remainder, not reminder
Thanks! Perf still looks good?
Dequeue is about 3.5 ms, Enqueue 4.5-5 ms. Even better than twice.
LGTM. Thanks!
    int tmp = index + 1;
    index = (tmp == _array.Length) ? 0 : tmp;
}
Wouldn't this be better?
private void MoveNext(ref int index)
{
index++;
if (index == _array.Length)
{
index = 0;
}
}
Wouldn't this be better?
Have you tried it? I would expect "no", since every read/write on index needs to go through that ref (hence the use of the tmp here), but if you find otherwise and have data to back it up, PRs are welcome. 😄 (It's possible subsequent JIT improvements in the last year have helped, too.)
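The two variants under discussion can be put side by side as a minimal sketch. The `WrapDemo` class and the explicit `length` parameter are hypothetical (the merged helper reads `_array.Length` instead), and nothing here measures performance; it only shows that both variants produce the same index while touching the ref a different number of times.

```csharp
using System;

// Hypothetical side-by-side comparison; `length` stands in for _array.Length.
public static class WrapDemo
{
    // Proposed variant: reads and writes through the ref on every statement.
    public static void MoveNextInPlace(ref int index, int length)
    {
        index++;
        if (index == length)
        {
            index = 0;
        }
    }

    // Merged variant: one read through the ref, work on a local, one write back.
    public static void MoveNextWithTemp(ref int index, int length)
    {
        int tmp = index + 1;
        index = (tmp == length) ? 0 : tmp;
    }
}
```

One behavioral detail separates them: in the in-place variant the referenced variable briefly holds the out-of-range value `length` before being reset, whereas the temp variant only ever stores a valid index.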
I was just wondering. But the ref access does make a difference.
Would the C# compiler automatically inline the MoveNext method in the release build? If so, then ref won't be slower than a local variable. Just wondering...
Fix to issue dotnet/corefx#2257 - Trivial change to make Queue<T>'s Enqueue / Dequeue twice faster Commit migrated from dotnet/corefx@51f757c
The issue: https://github.com/dotnet/corefx/issues/2257
This PR replaces usage of the remainder operator (which is fairly slow) in Queue's Enqueue / Dequeue / Contains methods with a simple boundary check.
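For anyone who wants to reproduce the comparison, a rough timing harness might look like the sketch below. The `WrapTiming` class and its method names are hypothetical, and a `Stopwatch` loop is only a coarse measurement: results depend on the JIT, the hardware, and the build configuration, so no particular numbers should be expected from it.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing harness; not the benchmark used in this PR.
public static class WrapTiming
{
    // Advances an index `iterations` times using the remainder operator.
    public static (int Index, TimeSpan Elapsed) RunRemainder(int length, int iterations)
    {
        var sw = Stopwatch.StartNew();
        int index = 0;
        for (int i = 0; i < iterations; i++)
        {
            index = (index + 1) % length;
        }
        sw.Stop();
        return (index, sw.Elapsed);
    }

    // Advances an index `iterations` times using the boundary check.
    public static (int Index, TimeSpan Elapsed) RunBoundaryCheck(int length, int iterations)
    {
        var sw = Stopwatch.StartNew();
        int index = 0;
        for (int i = 0; i < iterations; i++)
        {
            int tmp = index + 1;
            index = (tmp == length) ? 0 : tmp;
        }
        sw.Stop();
        return (index, sw.Elapsed);
    }
}
```

Both loops return the final index so the work cannot be discarded as dead code, and so the two strategies can be checked against each other before comparing the elapsed times in a release build.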