ENH: use size-zero dtype for broadcast-shapes #26599

mhvk · 2024-06-02T18:14:00Z

In #26160, the performance of broadcast_shapes was improved by replacing dtype=[] with dtype=bool. This worked because the dtype conversion is faster (i.e., np.dtype(bool) is faster than np.dtype([])), but it has the side effect that the empty arrays are now created with real sizes, and hence broadcast_shapes is now considerably slower than it was for large shapes. This PR instead defines the dtype once, outside of the function, which gives the best of both worlds:

# Pre-26160
In [1]: %timeit np.broadcast_shapes((6, 7), (5, 6, 1), (7,), (5, 1, 7))
2.28 µs ± 2.64 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [2]: %timeit np.broadcast_shapes((6, 7000000), (5, 6, 1), (7000000,), (5, 1, 7000000))
2.29 µs ± 3.89 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

# With 26160
In [1]: %timeit np.broadcast_shapes((6, 7), (5, 6, 1), (7,), (5, 1, 7))
2 µs ± 7.77 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [2]: %timeit np.broadcast_shapes((6, 7000000), (5, 6, 1), (7000000,), (5, 1, 7000000))
10.8 µs ± 9.43 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

# With this PR
In [1]: %timeit np.broadcast_shapes((6, 7), (5, 6, 1), (7,), (5, 1, 7))
2.02 µs ± 2.54 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [2]: %timeit np.broadcast_shapes((6, 7000000), (5, 6, 1), (7000000,), (5, 1, 7000000))
2.04 µs ± 3.48 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

This makes the speed independent of the actual shapes (as it used to be before gh-26160), but still fast.

seberg · 2024-06-03T12:09:00Z

Thanks for the follow up!

mhvk added the component: numpy.lib and 03 - Maintenance labels Jun 2, 2024

mhvk requested a review from seiko2plus June 2, 2024 18:14

mhvk mentioned this pull request Jun 2, 2024

ENH: Improve performance of np.broadcast_arrays and np.broadcast_shapes #26160

Merged

ENH: use size-zero dtype for broadcast-shapes

7a647ea

This makes the speed independent of the actual shapes (as it used to be before gh-26160), but still fast.

mhvk force-pushed the broadcast-size-0-array branch from ce8775e to 7a647ea June 2, 2024 18:15

mhvk changed the title ~~MAINT: use size-zero dtype for broadcast-shapes~~ ENH: use size-zero dtype for broadcast-shapes Jun 2, 2024

mhvk added 01 - Enhancement and removed 03 - Maintenance labels Jun 2, 2024

seberg merged commit a2d1972 into numpy:main Jun 3, 2024
67 of 68 checks passed

mhvk deleted the broadcast-size-0-array branch June 3, 2024 12:43
