ENH: einsum_sumprod.c.src to cpp in numpy.multiarray see #29528 #30210

athurdekoos · 2025-11-14T04:02:47Z

This PR is a continuation of #29528.

Description

This change ports einsum_sumprod from the generated einsum_sumprod.c.src file to a C++ template source file (einsum_sumprod.cpp). The goal is to improve readability and maintainability while preserving the existing behavior and performance characteristics.

There are no intended changes to the public API.

Summary of changes:

Replacement of einsum_sumprod.c.src with einsum_sumprod.cpp.
Original function structure was preserved as closely as possible.
- Function patterns remain the same.
- Function names have been preserved where possible.
- Variable names have been preserved where possible.
Logic was maintained as close to the original as possible.
Some larger functions have been refactored into smaller discrete implementations.
Helper functions added.
Where possible, macro-based optimizations have been preserved.
Introduced new macro implementations primarily to handle NPY_SIMD_F32 and NPY_SIMD_F64.
Optimizations that were only present in some code paths in the original implementation have been ported to the corresponding functions in the new structure when possible.
Expanded and clarified some comments.
Some templated parameters from the original code were removed where they were not implemented or did not materially improve performance or clarity.
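
As a rough illustration of the kind of port described above (not the code in this PR): a `.c.src` file generates one C function per element type from a repeat block, while a C++ template states the loop once and lets the compiler instantiate it per type. The function name `sum_of_products_contig_two` and its signature are hypothetical, and `std::ptrdiff_t` stands in for `npy_intp` so the sketch compiles on its own.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch: one template replaces the per-type functions that a
// .c.src repeat block would generate.
// Computes out[i] += a[i] * b[i] over `count` contiguous elements.
template <typename T>
static inline void
sum_of_products_contig_two(const T *a, const T *b, T *out,
                           std::ptrdiff_t count)
{
    assert(count >= 0);
    for (std::ptrdiff_t i = 0; i < count; ++i) {
        out[i] += a[i] * b[i];
    }
}
```

Calling `sum_of_products_contig_two(a, b, out, n)` with `double` arrays instantiates the `double` version; the same call with `float` arrays instantiates a separate `float` version, with no preprocessing step in between.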

Notes:

This port prioritizes readability over aggressive abstraction.
Where possible, templating is resolved at compile time.
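
A minimal sketch of what compile-time resolution can look like, assuming C++17 `if constexpr`; this is not taken from the PR. The helper name `sum_contig` is hypothetical, and the unrolled branch is only a stand-in for a SIMD path. For non-floating-point types the unrolled branch is discarded at compile time, so only the scalar loop is instantiated.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: sums `count` contiguous values.
template <typename T>
static inline T
sum_contig(const T *data, std::ptrdiff_t count)
{
    assert(count >= 0);
    T total = T(0);
    std::ptrdiff_t i = 0;
    if constexpr (std::is_floating_point_v<T>) {
        // Stand-in for a SIMD path: 4-way unrolled with independent
        // partial sums. Discarded entirely for integer types.
        T s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        for (; i + 4 <= count; i += 4) {
            s0 += data[i];
            s1 += data[i + 1];
            s2 += data[i + 2];
            s3 += data[i + 3];
        }
        total = (s0 + s1) + (s2 + s3);
    }
    // Scalar loop: handles the tail, or the whole array for integer types.
    for (; i < count; ++i) {
        total += data[i];
    }
    return total;
}
```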

Considerations not implemented:

Considered introducing C++ namespaces but did not do so to keep in line with the overall C-API style.
- As a result, some function names are long but descriptive.
Considered using templates as a replacement for some macros, but opted against this where it hurt clarity.

Benchmarks

Across multiple machines, no major performance changes were observed.
However, on my personal machine, benchmarks appear to improve dramatically in several cases, with the exception of one small, inconsistent regression.

| Change | Before [`fabf184`] | After [`fad2105`] <einsum_sumprod_to_cpp> | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| + | 194±7μs | 220±10μs | 1.13 | bench_linalg.Eindot.time_inner_trans_a_a |
| - | 25.5±1μs | 23.5±0.3μs | 0.92 | bench_linalg.Eindot.time_dot_d_dot_b_c |
| - | 1.10±0.1ms | 987±30μs | 0.89 | bench_linalg.Eindot.time_dot_trans_at_a |
| - | 106±10μs | 70.8±7μs | 0.66 | bench_linalg.Eindot.time_dot_trans_a_atc |
| - | 1.44±0.02ms | 899±30μs | 0.62 | bench_linalg.Eindot.time_einsum_i_ij_j |
| - | 110±5ms | 33.1±3ms | 0.30 | bench_linalg.Eindot.time_einsum_ijk_jil_kl |
| - | 114±6ms | 6.11±0.08ms | 0.05 | bench_linalg.Eindot.time_einsum_ij_jk_a_b |

Please let me know if you'd like me to adjust naming or structure.

athurdekoos · 2025-11-15T03:21:47Z

@inakleinbottle quick fyi ping to keep you apprised of my current status

inakleinbottle · 2025-11-17T09:13:19Z

numpy/_core/src/multiarray/einsum_sumprod.cpp

+ * and SIMDF64 */
+template <>
+struct SumSIMD<npy_float, npy_float> {
+    using SimdType = NpySIMDF32;


I think this, and a great deal of the other supporting structure, will be useful elsewhere and should probably be moved to a header file that can be included in all of these replacement files as they are added. It might also be worth having a discussion with the optimization team about the SIMD parts, to avoid replicating work.
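
One way such a shared header could be laid out, as a sketch only: a traits template that each replacement file queries for the vector type of an element type. Everything here is hypothetical, including the name `SimdTraits`, the lane counts, and the `FakeF32x4`/`FakeF64x2` structs, which are placeholders for whatever vector types the real SIMD layer provides.

```cpp
// Hypothetical contents of a shared header, e.g. a simd_traits sketch.
#include <cstddef>

// Placeholder vector types; a real header would use the project's own
// SIMD vector types instead.
struct FakeF32x4 { float lane[4]; };
struct FakeF64x2 { double lane[2]; };

// Primary template: no SIMD support for this element type.
template <typename T>
struct SimdTraits {
    static constexpr bool available = false;
    static constexpr std::size_t lanes = 1;
};

// Specializations opt individual element types in.
template <>
struct SimdTraits<float> {
    using SimdType = FakeF32x4;
    static constexpr bool available = true;
    static constexpr std::size_t lanes = 4;
};

template <>
struct SimdTraits<double> {
    using SimdType = FakeF64x2;
    static constexpr bool available = true;
    static constexpr std::size_t lanes = 2;
};

// Example consumer: number of elements processed per vector-width step.
template <typename T>
constexpr std::size_t vector_step() { return SimdTraits<T>::lanes; }
```

Keeping the trait in one header means a kernel can ask `SimdTraits<T>::available` instead of repeating per-file macro checks, at the cost of one more indirection for readers to follow.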

inakleinbottle · 2025-11-17T09:25:25Z

numpy/_core/src/multiarray/einsum_sumprod.cpp

+/* Template where (npy_double, npy_double) will allow the SIMD
+ * capable version.*/
+template <typename T, typename Temptype,
+          typename std::enable_if<std::is_same<T, Temptype>::value &&


The enable_if construct is a little odd here. First, since we have access to C++17, you might want to use the _t and _v variants (std::enable_if_t, std::is_same_v), which make this a bit more readable. However, here I think simple overloading would serve the same purpose. Overload resolution prefers a non-template function over a template when both are an equally good match, and enable_if constructions have a large instantiation cost whereas overloading has a much smaller one. I think, though, that what is needed here is actually a partially specializable struct template that can be used as a customization point inside the sum_of_arr function. For instance:

    template <typename T, typename TempType, typename SFINAE = void>
    struct SumOfArr {
        static TempType eval(T* data, npy_intp size) noexcept(?) {}
    };

    template <typename T>
    static inline TempTypeOf<T> sum_of_arr(T* data, npy_intp count) noexcept(?)
    {
        using Helper = SumOfArr<T, TempTypeOf<T>>;
        return Helper::eval(data, count);
    }

This is just an idea, but it would give you relatively low-overhead control over the implementation based on the type and TempType (which I've shorthanded here as a trait, but could be left as a template argument if multiple choices are necessary).
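
A self-contained sketch of the customization point suggested above, filled in enough to compile. Several parts are assumptions rather than anything stated in the thread: `TempTypeOf` is defined as an identity trait, `std::ptrdiff_t` replaces `npy_intp`, both functions are marked plain `noexcept`, and the "vectorized" specialization is just a two-way unrolled loop standing in for a SIMD implementation. The `vectorized` flag exists only to make the selected path observable.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical trait mapping an element type to its accumulator type.
template <typename T> struct TempTypeOfImpl { using type = T; };
template <typename T> using TempTypeOf = typename TempTypeOfImpl<T>::type;

// Primary template: generic scalar loop. The third parameter exists only
// so that partial specializations can be enabled with std::enable_if_t.
template <typename T, typename TempType, typename SFINAE = void>
struct SumOfArr {
    static constexpr bool vectorized = false;
    static TempType eval(const T *data, std::ptrdiff_t count) noexcept {
        TempType accum = TempType(0);
        for (std::ptrdiff_t i = 0; i < count; ++i) {
            accum += static_cast<TempType>(data[i]);
        }
        return accum;
    }
};

// Partial specialization selected when T == TempType and T is floating
// point. Stand-in for the SIMD-capable version.
template <typename T>
struct SumOfArr<T, T, std::enable_if_t<std::is_floating_point_v<T>>> {
    static constexpr bool vectorized = true;
    static T eval(const T *data, std::ptrdiff_t count) noexcept {
        T even = T(0), odd = T(0);
        std::ptrdiff_t i = 0;
        for (; i + 2 <= count; i += 2) {
            even += data[i];
            odd += data[i + 1];
        }
        if (i < count) {
            even += data[i];  // leftover element for odd counts
        }
        return even + odd;
    }
};

// Entry point: callers never name the helper struct directly.
template <typename T>
static inline TempTypeOf<T>
sum_of_arr(const T *data, std::ptrdiff_t count) noexcept
{
    using Helper = SumOfArr<T, TempTypeOf<T>>;
    return Helper::eval(data, count);
}
```

With this layout, adding a new fast path means adding one more partial specialization of `SumOfArr`; `sum_of_arr` and its callers stay unchanged.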

inakleinbottle · 2025-11-17T09:29:30Z

Thanks @athurdekoos for tagging me in. I had a quick look over the code and made a couple of what I hope are helpful comments, or at least things to think about. To be clear, I don't think these are necessary changes, so this isn't a review of any kind. Both comments are "looking ahead" to the other c.src modules too, so establishing some standard patterns and reusable components might be a good idea. Very happy to discuss or explain further.

WIP: einsum_sumprod to cpp in Numpy.multiarray see numpy#29528

bb76bb3

github-actions bot added the 01 - Enhancement label Nov 14, 2025

ENH: einsum_sumprod.c.src to cpp in numpy.multiarray see numpy#29528

bec2370

athurdekoos force-pushed the einsum_sumprod_to_cpp branch from 523e2df to bec2370 Compare November 15, 2025 00:04

ENH: einsum_sumprod.c.src to cpp in numpy.multiarray see numpy#29528

b55b1b3

inakleinbottle reviewed Nov 17, 2025
