This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

jamesqo
Contributor

@jamesqo jamesqo commented Aug 24, 2016

Currently, Queue.Enumerator.MoveNext uses the modulo operator, which is slow (idiv has much higher latency than most other integer instructions). Additionally, GetElement, the method it calls that performs the modulo, is not being inlined.

This PR removes that use of the % operator and replaces it with a simple branch and subtraction. I also moved the rarely executed part of the method into a new MoveNextRare function, which significantly reduces the code size of the main code path.
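As a rough illustration of the idea described above (not the merged corefx code), the following minimal sketch shows a wrap-around index computed with a branch and a subtraction instead of `%`, plus an enumerator whose uncommon path lives in a separate method. `RingIndex`, `RingEnumerator`, and their members are hypothetical names, and the sketch assumes `0 <= head < capacity`, `0 <= offset < capacity`, and a capacity small enough that `head + offset` cannot overflow.

```csharp
using System;

// Hypothetical helper; illustrative only, not the corefx source.
public static class RingIndex
{
    // Modulo form: correct for any non-negative sum, but needs a division.
    public static int WrapWithModulo(int head, int offset, int capacity)
    {
        return (head + offset) % capacity;
    }

    // Branch/subtraction form. Assumes 0 <= head < capacity and
    // 0 <= offset < capacity, so head + offset < 2 * capacity and one
    // subtraction is enough to wrap. Also assumes the sum does not overflow.
    public static int WrapWithBranch(int head, int offset, int capacity)
    {
        int index = head + offset;
        if (index >= capacity)
        {
            index -= capacity;
        }
        return index;
    }
}

// Hypothetical enumerator over a circular int buffer.
public struct RingEnumerator
{
    private readonly int[] _array;
    private readonly int _head;
    private readonly int _size;
    private int _index;
    private int _current;

    public RingEnumerator(int[] array, int head, int size)
    {
        _array = array;
        _head = head;
        _size = size;
        _index = -1;
        _current = 0;
    }

    public int Current => _current;

    // Common path: kept small so it is a better inlining candidate.
    public bool MoveNext()
    {
        int next = _index + 1;
        if (next >= _size)
        {
            return MoveNextRare();
        }
        _index = next;
        _current = _array[RingIndex.WrapWithBranch(_head, next, _array.Length)];
        return true;
    }

    // Uncommon path: end of enumeration, split out of the hot method.
    private bool MoveNextRare()
    {
        _index = _size;
        _current = 0;
        return false;
    }
}
```

A single subtraction is sufficient only because both operands are already below the capacity; for an unbounded value, such as a raw hash code, the branch form would not be equivalent to the modulo.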

Disassembly before and after

I did some benchmarking and this change brings the average running time for a foreach loop from ~2.9s to ~1.7s. (microbenchmark)

Fixes #10854

cc @ianhays, @GSPP

edit: forgot to cc @omariom who wrote #2515

@omariom
Contributor

omariom commented Aug 25, 2016

👍
There is one more perf-sensitive place where % is used: Dictionary.FindEntry.
Any idea how to get rid of it?
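For context on why this case is harder, here is a hedged sketch rather than the actual Dictionary source. The queue fix works because the index is always less than twice the capacity, which is not true of an arbitrary hash code, so a single subtraction cannot replace the remainder. One commonly discussed alternative, which is not necessarily the suggestion referenced in this thread, is to mask instead of dividing, but that only works if the bucket count is a power of two. `BucketIndex` and its methods are hypothetical names.

```csharp
using System;

// Hypothetical helper comparing two ways to map a hash code to a bucket.
public static class BucketIndex
{
    // Remainder form: works for any positive bucket count (for example a prime).
    public static int ByRemainder(int hashCode, int bucketCount)
    {
        // Clear the sign bit so the remainder is never negative.
        return (hashCode & 0x7FFFFFFF) % bucketCount;
    }

    // Mask form: ASSUMES bucketCount is a power of two; otherwise the
    // result is not a valid replacement for the remainder.
    public static int ByMask(int hashCode, int bucketCount)
    {
        return (hashCode & 0x7FFFFFFF) & (bucketCount - 1);
    }
}
```

The trade-off is that a power-of-two table uses only the low bits of the hash code, so it is more sensitive to poorly distributed hash functions than a prime-sized table.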

Member

This seems like unnecessary complexity. I'd rather leave it as it was and then see the JIT improve so that this and all other such cases benefit. This new code is, IMO, much harder to follow than the previous version.

Contributor Author

👍

@jamesqo
Contributor Author

jamesqo commented Aug 25, 2016

@omariom: @GSPP mentioned something here; however, I'm not sure how to implement it.

@jamesqo
Contributor Author

jamesqo commented Aug 26, 2016

@stephentoub PR feedback is done; thanks for reviewing.

@jamesqo
Contributor Author

jamesqo commented Aug 26, 2016

Test Innerloop OSX Debug Build and Test
Test Innerloop Ubuntu14.04 Debug Build and Test

Member

As previously commented, is this really saving enough / providing enough value to be worth churning the code, making it harder to read, etc.? What measurement are you using?

Member

Let's keep this PR to exactly what it was meant to be about: removing the modulo. If it's then justifiable to do further surgery, that can be done separately.

Contributor Author

@stephentoub OK, sure. I've done testing and omitting the result variable doesn't seem to change much, so the changes in this PR are now as minimal as possible.

@stephentoub
Member

LGTM. Thanks for the improvement, @jamesqo.

@stephentoub stephentoub merged commit 3e0888e into dotnet:master Aug 27, 2016
@jamesqo jamesqo deleted the queue-nomod branch August 27, 2016 13:05
@karelz karelz modified the milestone: 1.1.0 Dec 3, 2016
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022