
Improve speed in some modules #22678


Draft · wants to merge 1 commit into main from speedimprovements

Conversation

@oscargus (Member) commented Mar 21, 2022

PR Summary

Enable pre-computation during compilation and remove redundant computations.

Before:

In [7]: %timeit Path.circle()
25 µs ± 78.9 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

After:

In [28]: %timeit Path.circle()
21.6 µs ± 72.1 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
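
A minimal sketch of the kind of rewrite involved (hypothetical function names, not the actual diff): a power of a literal is folded into a constant when the function is compiled, whereas np.sqrt on the same literal is a call executed on every invocation.

import numpy as np

# Before: np.sqrt(2) is evaluated on every call.
def half_diagonal_before(n):
    return n * np.sqrt(2) / 2

# After: 2 ** 0.5 is folded to a constant at compile time,
# so nothing is recomputed at runtime.
def half_diagonal_after(n):
    return n * 2 ** 0.5 / 2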

PR Checklist

Tests and Styling

  • Has pytest style unit tests (and pytest passes).
  • Is flake8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

  • New features are documented, with examples if plot related.
  • New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).

@oscargus oscargus force-pushed the speedimprovements branch from 295bef0 to 7b8fd5e Compare March 21, 2022 10:41
@tacaswell tacaswell added this to the v3.6.0 milestone Mar 22, 2022
@tacaswell (Member)

I'm actually quite surprised (though I remember being surprised by this before) that raising to a fractional power is pretty fast:

$ ipython
Python 3.11.0a6+ (heads/main:29624e769c, Mar 18 2022, 18:36:12) [GCC 11.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.2.0.dev -- An enhanced Interactive Python. Type '?' for help.

In [1]: import numpy as np

In [2]: %timeit np.sqrt(52)
476 ns ± 1.14 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit 52 ** (1/2)
3.12 ns ± 0.018 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [4]: import math

In [5]: %timeit math.sqrt(52)
25.4 ns ± 0.125 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [6]: s = math.sqrt

In [7]: %timeit s(52)
25.4 ns ± 0.0733 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [8]: %timeit s(52.0)
14.9 ns ± 0.0463 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)

In [9]: %timeit 52.0 ** (1/2)
3.15 ns ± 0.00981 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)


@timhoffm (Member)

  1. Powers of constants are pre-computed.
  2. Powers of variables use BINARY_POWER.
  3. I anticipate that the function call overhead for math.sqrt() is relevant on the ns timescale.
In [10]: dis.dis('52 ** 0.5')
  1           0 LOAD_CONST               0 (7.211102550927978)
              2 RETURN_VALUE

In [11]: dis.dis('a ** 0.5')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_CONST               0 (0.5)
              4 BINARY_POWER
              6 RETURN_VALUE

In [12]: dis.dis('math.sqrt(52)')
  1           0 LOAD_NAME                0 (math)
              2 LOAD_METHOD              1 (sqrt)
              4 LOAD_CONST               0 (52)
              6 CALL_METHOD              1
              8 RETURN_VALUE

In [13]: dis.dis('math.sqrt(a)')
  1           0 LOAD_NAME                0 (math)
              2 LOAD_METHOD              1 (sqrt)
              4 LOAD_NAME                2 (a)
              6 CALL_METHOD              1
              8 RETURN_VALUE

@QuLogic (Member) commented Mar 22, 2022

In [11]: dis.dis('a ** 0.5')

Speaking of, can we write them as 0.5 instead of (1/2), which seems extra long?
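
For what it's worth, CPython folds the constant division as well, so a ** (1/2) compiles to exactly the same bytecode as a ** 0.5 and the choice is purely one of readability; a quick check (not from the thread):

import dis

# Both disassemble to LOAD_NAME (a), LOAD_CONST (0.5), BINARY_POWER;
# the optimizer folds 1/2 into 0.5 at compile time.
dis.dis('a ** (1/2)')
dis.dis('a ** 0.5')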

@@ -1150,7 +1150,7 @@ def boxplot_stats(X, whis=1.5, bootstrap=None, labels=None,
     def _bootstrap_median(data, N=5000):
         # determine 95% confidence intervals of the median
         M = len(data)
-        percentiles = [2.5, 97.5]
+        percentiles = (2.5, 97.5)
Contributor:

I don't think this matters wrt speed (does it?); also, I prefer a list here as it more closely matches the return type of np.percentile below (an ndarray is closer to a list than to a tuple).
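
As for the speed question: a tuple of constants is folded into a single pre-built LOAD_CONST, while a list literal is rebuilt on every execution, so there is a real if tiny difference; a quick check (not from the thread):

import dis

dis.dis('f(x, (2.5, 97.5))')  # tuple: one pre-built LOAD_CONST
dis.dis('f(x, [2.5, 97.5])')  # list: two LOAD_CONSTs plus BUILD_LIST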

@@ -1221,7 +1222,8 @@ def _compute_conf_interval(data, med, iqr, bootstrap):
         stats['mean'] = np.mean(x)

         # medians and quartiles
-        q1, med, q3 = np.percentile(x, [25, 50, 75])
+        percentiles = (25, 50, 75)
+        q1, med, q3 = np.percentile(x, percentiles)
Contributor:

Ditto, leave as before.

@@ -1169,8 +1169,9 @@ def _compute_conf_interval(data, med, iqr, bootstrap):
        else:

            N = len(data)
-            notch_min = med - 1.57 * iqr / np.sqrt(N)
-            notch_max = med + 1.57 * iqr / np.sqrt(N)
+            half_height = 1.57 * iqr / (N ** (1 / 2))
Contributor:

I guess you could inline len(data) into the computation here.
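
A sketch of that suggestion (hypothetical, and using the ** 0.5 spelling discussed above):

half_height = 1.57 * iqr / len(data) ** 0.5
notch_min = med - half_height
notch_max = med + half_height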


vertices = np.array([[0.0, -1.0],
@anntzer (Contributor) commented Apr 24, 2022

I guess I would just write this as vertices = np.array([...]) without even bothering with dtype=float, which is unneeded; this also avoids a temporary variable (mostly, it avoids having to keep track of it mentally). Ditto below.
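
Indeed, NumPy already infers float64 from the float literals, so the explicit dtype changes nothing; a quick check (not from the thread):

import numpy as np

np.array([[0.0, -1.0]]).dtype  # dtype('float64'), with no dtype argument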

@@ -677,7 +677,7 @@ def _set_pentagon(self):
            self._path = polypath
        else:
            verts = polypath.vertices
-            y = (1 + np.sqrt(5)) / 4.
+            y = (1 + 5 ** (1 / 2)) / 4.
@timhoffm (Member)

I'm not convinced of all these sqrt micro-optimizations. IMHO it makes the code less readable, and the performance gain is minimal. For example, here the sqrt is less than 2% of the time of the immediately surrounding code in the function:

In [21]: %%timeit
    ...: Affine2D().scale(0.5)
    ...: polypath = Path.unit_regular_polygon(5)
    ...: verts = polypath.vertices
    ...: top = Path(verts[[0, 1, 4, 0]])
    ...: bottom = Path(verts[[1, 2, 3, 4, 1]])
    ...: left = Path([verts[0], verts[1], verts[2], [0, -y], verts[0]])
    ...: right = Path([verts[0], verts[4], verts[3], [0, -y], verts[0]])
    ...: 
61.1 µs ± 228 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [22]: %timeit np.sqrt(5)
2.36 µs ± 35 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

And _set_pentagon() is only used when creating a new marker object, which in itself is quite rare. So I bet the performance gain is not measurable in any real example.

@QuLogic QuLogic modified the milestones: v3.6.0, v3.7.0 Aug 24, 2022
@timhoffm (Member) commented Dec 15, 2022

@oscargus Thanks for trying to speed the code up. However, as said before:

I'm not convinced of all these sqrt micro-optimizations. IMHO it makes the code less readable, and the performance gain is minimal.

Unless we can show there is a measurable performance gain in real situations, I propose to close this.
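
One way to check for a measurable gain in a real situation (a sketch, not from the thread; the plot being profiled is arbitrary):

import cProfile
import io
import pstats

import matplotlib
matplotlib.use('Agg')  # headless backend, so only computation is profiled
import matplotlib.pyplot as plt
import numpy as np

def render():
    # A notched boxplot goes through the confidence-interval code
    # touched by this PR.
    fig, ax = plt.subplots()
    ax.boxplot(np.random.default_rng(0).normal(size=(100, 5)), notch=True)
    fig.savefig(io.BytesIO(), format='png')
    plt.close(fig)

cProfile.run('render()', 'prof.out')
# If sqrt does not show up near the top of the cumulative profile,
# the micro-optimization is unlikely to be measurable here.
pstats.Stats('prof.out').sort_stats('cumulative').print_stats('sqrt')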

@tacaswell tacaswell modified the milestones: v3.7.0, v3.8.0 Jan 4, 2023
@jklymak jklymak marked this pull request as draft January 30, 2023 23:01
@jklymak (Member) commented Jan 30, 2023

@oscargus is it OK to close this? I agree that some of the optimizations are marginal at best... In either case, moving to draft ;-)
