PRF: Don't use MaskedArray in Aitoff transform. #9862


Merged
merged 1 commit from QuLogic:Axes-micro-opt into matplotlib:master on Nov 28, 2017

Conversation

@QuLogic (Member) commented Nov 26, 2017

PR Summary

If you benchmark it, the Aitoff projection is slightly slower than the other projections. This is because it uses np.ma.MaskedArray, which is a bit slow. Plain boolean indexing produces the same results and is fast enough to make Aitoff run at about the same speed as the others. Benchmarking the transform function directly shows indexing to be faster regardless of whether the input length is 1 or 1,000,000 and whether the proportion of zeros is 0% or 100%.

(Actually, if you look at the history, polar plots also got a bit slower at some point, though not as slow as Aitoff. This is due to some extra resets from the new tick handling. I have a change in the works to get rid of this soon.)
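
For reference, a minimal, standalone sketch of the indexing approach described in the summary. The function name, the (N, 2) input convention, and the zero-initialization are assumptions for illustration; the real code lives inside matplotlib's geo projection transform and differs in detail:

    import numpy as np

    def aitoff_indexed(ll):
        # ll is assumed to be an (N, 2) float array of (longitude, latitude) in radians.
        longitude, latitude = ll[:, 0], ll[:, 1]
        half_long = longitude / 2.0
        cos_latitude = np.cos(latitude)

        alpha = np.arccos(cos_latitude * np.cos(half_long))
        mask = alpha != 0.0                     # alpha == 0 only at (lon, lat) == (0, 0)
        sinc_alpha = np.sin(alpha[mask]) / alpha[mask]

        xy = np.zeros_like(ll, float)           # rows where alpha == 0 stay at (0, 0)
        xy[mask, 0] = (cos_latitude[mask] * np.sin(half_long[mask])) / sinc_alpha
        xy[mask, 1] = np.sin(latitude[mask]) / sinc_alpha
        return xy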

PR Checklist

  • Has Pytest style unit tests
  • Code is PEP 8 compliant
  • [N/A] New features are documented, with examples if plot related
  • [N/A] Documentation is sphinx and numpydoc compliant
  • [N/A] Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • [N/A] Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@efiring (Member) left a comment

Speeding things up by handling the special cases explicitly instead of letting masked arrays do it is a good idea, but I have questions about the implementation.

    # The numerators also need to be masked so that masked
    # division will be invoked.
    mask = alpha != 0.0
    alpha = alpha[mask]

@efiring (Member)

A simpler alternative would be to use the general method that numpy uses in its sinc:

    alpha[alpha == 0] = 1e-20
    sinc_alpha = sin(alpha)/alpha

But it looks like there is more to it than just getting sinc(0) = 1. See below.

@efiring (Member)

The singular point is just the center point, (x, y) = (lon, lat) = (0, 0). So it looks like the modification suggested above would work fine, eliminating all of the subsequent indexing, simplifying and speeding up the code.

@QuLogic (Member, Author)

Working through the same thing below: alpha = 0 from arccos requires an input of 1 from cos_latitude * np.cos(half_long). Within the same limits, that can only happen if longitude = latitude = 0. To ensure the result is 0 when forcing sinc(0) = 1, the numerator of both calculations needs to be 0 there. Fortunately, both numerators contain sin(longitude) or sin(latitude), which are also 0 at that point.
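
A quick numeric check of that argument (illustrative only, not part of the PR):

    import numpy as np

    # At the only singular point, (longitude, latitude) = (0, 0), both
    # numerators vanish, so defining sinc(0) = 1 still yields x = y = 0.
    longitude = latitude = 0.0
    half_long = longitude / 2.0
    print(np.cos(latitude) * np.cos(half_long))  # 1.0 -> arccos gives alpha = 0
    print(np.cos(latitude) * np.sin(half_long))  # 0.0 -> numerator of x
    print(np.sin(latitude))                      # 0.0 -> numerator of y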

I think this will work out just as well.

    xy = np.empty_like(ll, float)
    xy[mask, 0] = ((cos_latitude[mask] * np.sin(half_long[mask])) /
                   sinc_alpha)
    xy[mask, 1] = np.sin(latitude[mask]) / sinc_alpha

@efiring (Member)

Can sinc_alpha be zero? If so, the original masked array version was handling that case and the present version does not appear to be doing so.

@QuLogic (Member, Author)

sinc_alpha comes from sin(alpha), which can only be zero if alpha == nπ. alpha comes from arccos, whose output is [0, π], and since 0 is masked we only need to worry about π, which corresponds to an input of -1 from cos_latitude * np.cos(half_long). The limits are -π/2 ≤ latitude ≤ π/2 and -π ≤ longitude ≤ π (and cannot be changed), so -π/2 ≤ half_long ≤ π/2, meaning both cos terms (and thus the entire arccos input) lie in [0, 1]. Thus it's impossible for sinc_alpha to be zero.
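
A small grid check of the same bounds (illustrative only; the grid resolution is arbitrary):

    import numpy as np

    # Over the valid domain the arccos input stays in [0, 1], so alpha stays
    # in [0, pi/2] and sin(alpha) can only vanish at alpha == 0.
    latitude = np.linspace(-np.pi / 2, np.pi / 2, 501)
    longitude = np.linspace(-np.pi, np.pi, 501)
    lat2d, lon2d = np.meshgrid(latitude, longitude)
    inner = np.cos(lat2d) * np.cos(lon2d / 2.0)
    print(inner.min() >= 0.0, inner.max() <= 1.0)        # True True
    print(np.arccos(inner).max() <= np.pi / 2 + 1e-12)   # True: pi is never reached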

@efiring (Member)

Thank you. (Sorry, I was too lazy to figure that out for myself.)

It's just a slight bit slower than plain indexing.
@QuLogic (Member, Author) commented Nov 27, 2017

Unfortunately, it seems like we have no tests for these alternate projections. But I did run the demo and confirmed the output is the same before and after.

@tacaswell added this to the v2.2 milestone Nov 27, 2017
    # The numerators also need to be masked so that masked
    # division will be invoked.
    # Avoid divide-by-zero errors using same method as NumPy.
    alpha[alpha == 0.0] = 1e-20

@anntzer (Contributor) commented Nov 27, 2017

np.maximum(alpha, 1e-20, out=alpha) should work better if you somehow have tiny alphas (between 0 and 1e-20) and also saves an extra allocation, I think.
(or alpha = np.maximum(alpha, 1e-20) if you don't want to obfuscate it :-))

@efiring (Member)

It's fast either way, but np.maximum takes more than twice as long, at least in my test with 10,000 pts. I think the 1e-20 is a completely arbitrary small number, and its only purpose is to prevent division by exactly zero. A tiny number still works:

In [21]: np.sin(1e-300) / 1e-300
Out[21]: 1.0
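
For anyone who wants to reproduce the comparison, a sketch of that kind of micro-benchmark (the array size and zero density are arbitrary assumptions, and the numbers will differ from those quoted above):

    import timeit
    import numpy as np

    # Both variants only keep alpha away from exactly zero before dividing.
    alpha = np.random.rand(10000)
    alpha[::100] = 0.0   # sprinkle in some exact zeros

    def by_assignment(a):
        a = a.copy()
        a[a == 0.0] = 1e-20          # what the PR ended up using
        return np.sin(a) / a

    def by_maximum(a):
        a = np.maximum(a, 1e-20)     # the suggested alternative
        return np.sin(a) / a

    print(timeit.timeit(lambda: by_assignment(alpha), number=1000))
    print(timeit.timeit(lambda: by_maximum(alpha), number=1000))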

    x = (cos_latitude * ma.sin(half_long)) / sinc_alpha
    y = (ma.sin(latitude) / sinc_alpha)
    return np.concatenate((x.filled(0), y.filled(0)), 1)
    xy = np.empty_like(ll, float)

@anntzer (Contributor)

I like column_stack, which I think has a very descriptive name... (but it's just personal preference)

@efiring (Member)

Pre-allocating and then assigning to slices is faster, and still very readable.

@anntzer (Contributor)

I think it's essentially the same speed (not that it really matters):

In [15]: %%timeit x = np.random.rand(10000); y = np.random.rand(10000)
    ...: z = np.column_stack([x, y])
    ...: 
37.8 µs ± 216 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [16]: %%timeit x = np.random.rand(10000); y = np.random.rand(10000)
    ...: z = np.empty((10000, 2))
    ...: z[:, 0] = x; z[:, 1] = y
    ...: 
36.7 µs ± 1.39 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

@anntzer (Contributor) commented Nov 28, 2017

thanks

@anntzer merged commit fa44c87 into matplotlib:master Nov 28, 2017
@QuLogic deleted the Axes-micro-opt branch November 28, 2017 04:30
@dopplershift (Contributor)

Probably should have taken the time to add the simplest of tests here...

@QuLogic modified the milestones: needs sorting, v2.2.0 Feb 12, 2018