Improve speed in projections/geo.py #22677
base: main
9602112 to af48923
+/-0 on this:
The question is: how much performance gain do we get? E.g., if I save 1 µs in a function that takes 100 µs (numbers made up), special-casing some functions to math is IMHO not worth it.
Note that math.sqrt is no longer used. The math.pi part was not originally part of the PR, but was suggested by @anntzer. New benchmark (different computer):
Not sure how much math.pi improves the speed (if at all). I tried to read up on whether the compiler can precompute the result with math.pi, but I have not gained any insight. I'm also unsure whether pi / 2, pi / 2.0, or 0.5 * pi should be used. Benchmarking gives a slight advantage for the float constants, but there may be other aspects as well.
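For anyone who wants to repeat the comparison, here is a minimal sketch. It is not the benchmark that was run for this PR; `half_pi` and `time_variants` are names made up for illustration, and the timings will vary by machine and Python version. On CPython, `pi` is a name resolved at run time, so none of the three spellings can be precomputed by the compiler; all three produce the same float.

```python
import timeit
from math import pi

# The three spellings discussed above; they all evaluate to the same float.
half_pi = {
    "pi / 2": pi / 2,
    "pi / 2.0": pi / 2.0,
    "0.5 * pi": 0.5 * pi,
}


def time_variants(number=100_000):
    """Best-of-5 time per evaluation of each spelling, in nanoseconds."""
    results = {}
    for expr in half_pi:
        best = min(timeit.repeat(expr, setup="from math import pi",
                                 number=number, repeat=5))
        results[expr] = best / number * 1e9
    return results


if __name__ == "__main__":
    for expr, ns in time_variants().items():
        print(f"{expr:>9}: {ns:6.1f} ns per evaluation")
```

Since the values are identical, the choice between the spellings only affects speed and readability, not the result.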
For reference, in current main:
The question is not how much speed is gained by replacing
Correct. But much easier to just try out the operation...
Btw, it seems like not all the code is actually tested. I messed up in a trig rewrite in two locations, but only got an error for one of them...
codecov states that we have 81% coverage. Unfortunately, the missing 19% is not only edge cases and trivial code. In earlier times in particular, testing was more optional, and there's quite a bit of code that nobody has written tests for yet. On the optimizations: I have the impression you are falling for the micro-optimization trap
In relative numbers, this is a magnificent-sounding 100x speed improvement. However, assume we need 100 of those calculations to create a plot. That's still only 70 µs, and negligible compared to
Unless such nanosecond optimizations are in a really hot place, they don't give any measurable benefit and are thus not worth the effort. Even more: if the performance benefit is negligible, other aspects like readability and maintainability become the deciding factors in how the code is best written.
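To put numbers on this argument locally, a rough sketch along the following lines may help. The helper names `per_call_ns` and `projected_saving_us` are hypothetical, and the statements being timed (`math.sqrt(x)` versus the literal `2 ** 0.5`) are stand-ins chosen for illustration, not the expressions measured in this thread.

```python
import timeit


def per_call_ns(stmt, setup="pass", number=200_000):
    """Best-of-5 time per execution of ``stmt``, in nanoseconds."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=5))
    return best / number * 1e9


def projected_saving_us(slow_ns, fast_ns, n_calls):
    """Total time saved, in microseconds, over ``n_calls`` evaluations."""
    return (slow_ns - fast_ns) * n_calls / 1000


if __name__ == "__main__":
    slow = per_call_ns("math.sqrt(x)", setup="import math; x = 2.0")
    fast = per_call_ns("2 ** 0.5")
    print(f"function call: {slow:.1f} ns, literal: {fast:.1f} ns")
    # Assumed workload: 100 such evaluations per plot.
    saved = projected_saving_us(slow, fast, 100)
    print(f"saved per plot: {saved:.3f} us")
```

With made-up inputs of 700 ns and 7 ns per call, `projected_saving_us(700, 7, 100)` comes to about 69 µs per plot, which shows how a large relative speedup can still be a small absolute one.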
We are older than both pytest and nose ;)
If somebody still knows nose 👴.
alpha = np.sqrt(1.0 + cos_latitude * np.cos(half_long))
x = (2.0 * sqrt2) * (cos_latitude * np.sin(half_long)) / alpha
x = (2 * sqrt2) * (cos_latitude * np.sin(half_long)) / alpha
I'd just write 2**(3/2) here and 2**(1/2) below and drop the sqrt(2) variable; as noticed elsewhere, this will get inlined anyways.
I follow the argument in #22678 (comment) that 2**0.5 reads better than 2**(1/2).
That's fine with me too.
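The inlining mentioned above can be checked directly. This is a small sketch assuming CPython, whose compiler folds arithmetic on literal constants; `folded_constants` is a helper name invented here, and other Python implementations may behave differently.

```python
def folded_constants(expr):
    """Return the constants CPython stored when compiling ``expr``."""
    return compile(expr, "<expr>", "eval").co_consts


if __name__ == "__main__":
    for expr in ("2 ** (1 / 2)", "2 ** 0.5", "2 ** (3 / 2)", "math.pi / 2"):
        print(f"{expr:>13} -> {folded_constants(expr)}")
```

The literal-only expressions end up with the precomputed float among their constants, so a separate `sqrt2` variable saves nothing at run time. `math.pi / 2` is not folded, because `math.pi` is an attribute lookup that only happens when the code runs.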
@@ -351,18 +352,18 @@ def transform_non_affine(self, ll):
# docstring inherited
def d(theta):
delta = (-(theta + np.sin(theta) - pi_sin_l)
/ (1 + np.cos(theta)))
/ (1.0 + np.cos(theta)))
Let's stick with 1 (and likewise 2 for 2.0 below) unless it matters significantly (I doubt so...); it reads better IMO (e.g. it matches the math formula).
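As a quick sanity check that the two spellings are numerically interchangeable, here is a scalar sketch of the update step from the hunk above, written with `math` instead of NumPy. The function names are hypothetical and this is not the code in geo.py.

```python
import math


def newton_delta_int(theta, pi_sin_l):
    """Update step with the integer literal 1, as in the math formula."""
    return -(theta + math.sin(theta) - pi_sin_l) / (1 + math.cos(theta))


def newton_delta_float(theta, pi_sin_l):
    """Same step with the float literal 1.0."""
    return -(theta + math.sin(theta) - pi_sin_l) / (1.0 + math.cos(theta))


if __name__ == "__main__":
    pi_sin_l = math.pi * math.sin(math.radians(40.0))
    for theta in (0.0, 0.5, 1.0, 2.0):
        a = newton_delta_int(theta, pi_sin_l)
        b = newton_delta_float(theta, pi_sin_l)
        print(theta, a, a == b)
```

The integer 1 is converted to 1.0 before the addition, so both versions return the same float; the choice is purely about readability and any (likely tiny) conversion cost.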
latitude = np.arcsin((2 * theta + np.sin(2 * theta)) / np.pi)
sqrt2 = 2 ** (1 / 2)
theta = np.arcsin(y / sqrt2)
longitude = (math.pi / (2.0 * sqrt2)) * x / np.cos(theta)
Again, sqrt2 doesn't warrant being a separate variable; the compiler will inline 2**(1/2) in theta and 2**(3/2) in longitude. And again, 2.0 -> 2.
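For illustration, the suggestion could look roughly like this in a scalar version of the inverse transform. This is only a sketch using `math` rather than NumPy, with the hypothetical name `mollweide_inverse`; it is not the array code in geo.py.

```python
import math


def mollweide_inverse(x, y):
    """Scalar inverse Mollweide: map (x, y) to (longitude, latitude)."""
    # 2 ** 0.5 and 2 ** 1.5 are literal expressions, so no sqrt2 variable
    # is needed.
    theta = math.asin(y / 2 ** 0.5)
    longitude = (math.pi / 2 ** 1.5) * x / math.cos(theta)
    latitude = math.asin((2 * theta + math.sin(2 * theta)) / math.pi)
    return longitude, latitude


if __name__ == "__main__":
    print(mollweide_inverse(0.0, 0.0))
    print(mollweide_inverse(2 ** 1.5, 0.0))  # right-hand end of the equator
```

Note that the poles (y equal to plus or minus 2 ** 0.5) make `cos(theta)` vanish, so a real implementation has to treat them separately; this sketch does not.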
@@ -52,8 +53,8 @@ def cla(self):
self.grid(rcParams['axes.grid'])
Axes.set_xlim(self, -np.pi, np.pi)
Axes.set_ylim(self, -np.pi / 2.0, np.pi / 2.0)
Axes.set_xlim(self, -math.pi, math.pi)
You can just use np.pi in most of these places, because set_xlim/etc. will directly convert everything to numpy scalars anyways, obviating any speedup. (Using python scalars is only useful if you do some computations with them, and even then, probably the gain of writing math.pi*2 below is obviated by the additional builtin->numpy conversion.)
@oscargus did you still want this to move forward? I'll move it to draft, but feel free to move it back.
PR Summary
Stumbled upon some optimization opportunities while browsing the code.
Removed redundant calls to sin/cos/sqrt. Replaced np.sqrt with power computation for constant scalars. Used np.cbrt.
PR Checklist
Tests and Styling
pytest passes.
flake8-docstrings: run flake8 --docstring-convention=all.
Documentation
doc/users/next_whats_new/ (follow instructions in README.rst there).
doc/api/next_api_changes/ (follow instructions in README.rst there).