ENH: einsum_sumprod.c.src to cpp in numpy.multiarray see #29528 #30210
@inakleinbottle quick FYI ping to keep you apprised of my current status.
 * and SIMDF64 */
template <>
struct SumSIMD<npy_float, npy_float> {
    using SimdType = NpySIMDF32;
I think this, and a great deal of the other supporting structure, will be useful elsewhere and should probably be moved to a header file that can be included in all these replacement files as they are added. It might also be worth having a discussion with the optimization team about the SIMD parts to avoid duplicating work.
/* Template where (npy_double, npy_double) will allow the SIMD
 * capable version.*/
template <typename T, typename Temptype,
          typename std::enable_if<std::is_same<T, Temptype>::value &&
The enable_if construct is a little odd here. First, since we have access to C++17, you might want to use the _t and _v variants (std::enable_if_t, std::is_same_v), which make this a bit more readable. However, I think simple overloading would serve the same purpose here. C++ always prefers a non-template overload over a template when both match equally well, and enable_if constructions have a large instantiation cost, whereas overloading has a much smaller one. I think, though, that what is actually needed here is a partially specializable struct template that can be used as a customization point inside the sum_of_arr function. For instance:
template <typename T, typename TempType, typename SFINAE = void>
struct SumOfArr {
    static TempType eval(T* data, npy_intp size) noexcept(?) {}
};

template <typename T>
static inline TempTypeOf<T> sum_of_arr(T* data, npy_intp count) noexcept(?) {
    using Helper = SumOfArr<T, TempTypeOf<T>>;
    return Helper::eval(data, count);
}

This is just an idea, but it would give you relatively low-overhead control over the implementation based on the type and TempType (which I've shorthanded here as a trait, but could be left as a template argument if multiple choices are necessary).
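To make the customization-point idea above concrete, here is a minimal compilable sketch. It is an illustration under stated assumptions, not the PR's code: std::ptrdiff_t stands in for npy_intp, the TempTypeOf trait and its short-to-int mapping are hypothetical, and the "fast path" specialization uses four plain accumulators as a placeholder where a real SIMD implementation would go.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical trait: which accumulator type to use for element type T.
// By default the accumulator has the same type as the elements.
template <typename T>
struct TempTypeOfImpl { using type = T; };

// Illustrative assumption only: accumulate short in int.
template <>
struct TempTypeOfImpl<short> { using type = int; };

template <typename T>
using TempTypeOf = typename TempTypeOfImpl<T>::type;

// Primary template: generic scalar loop. The third parameter exists only
// so that partial specializations can attach a condition to it.
template <typename T, typename TempType, typename SFINAE = void>
struct SumOfArr {
    static constexpr bool fast_path = false;
    static TempType eval(const T* data, std::ptrdiff_t count) noexcept {
        TempType acc = 0;
        for (std::ptrdiff_t i = 0; i < count; ++i) {
            acc += static_cast<TempType>(data[i]);
        }
        return acc;
    }
};

// Partial specialization selected when T == TempType and T is a floating
// point type. The unrolled loop is a stand-in for a SIMD kernel.
template <typename T>
struct SumOfArr<T, T, std::enable_if_t<std::is_floating_point_v<T>>> {
    static constexpr bool fast_path = true;
    static T eval(const T* data, std::ptrdiff_t count) noexcept {
        T a0 = 0, a1 = 0, a2 = 0, a3 = 0;
        std::ptrdiff_t i = 0;
        for (; i + 4 <= count; i += 4) {
            a0 += data[i];
            a1 += data[i + 1];
            a2 += data[i + 2];
            a3 += data[i + 3];
        }
        for (; i < count; ++i) {  // remaining tail elements
            a0 += data[i];
        }
        return (a0 + a1) + (a2 + a3);
    }
};

// Single entry point; the helper struct decides which implementation runs.
template <typename T>
inline TempTypeOf<T> sum_of_arr(const T* data, std::ptrdiff_t count) noexcept {
    using Helper = SumOfArr<T, TempTypeOf<T>>;
    return Helper::eval(data, count);
}
```

Callers only ever see sum_of_arr. Adding a new implementation for some type then means adding one specialization of SumOfArr rather than another enable_if-guarded overload.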
Thanks @athurdekoos for tagging me in. I had a quick look over the code and made a couple of what I hope are helpful comments, or at least things to think about. To be clear, I don't think these are necessary changes, so this isn't a review of any kind. Both comments are "looking ahead" to the other c.src modules too, so establishing some standard patterns and reusable components might be a good idea. Very happy to discuss or explain further.
This PR is a continuation of #29528.
Description
This change ports einsum_sumprod from the generated einsum_sumprod.c.src file to a C++ template source file (einsum_sumprod.cpp). The goal is to improve readability and maintainability while preserving the existing behavior and performance characteristics.
There are no intended changes to the public API.
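As a rough illustration of the kind of port described above, the sketch below shows how one loop that a .c.src file would generate once per type can be written as a single C++ function template. The function name, signature, and the set of instantiated types are simplified assumptions for illustration; the real einsum loops operate on raw data pointers and strides.

```cpp
#include <cstddef>

// Hypothetical, simplified inner loop: out[i] += a[i] * b[i] for
// contiguous operands. One template replaces the per-type copies that
// the .c.src preprocessor would otherwise generate.
template <typename T>
void sum_of_products_contig_two(const T* a, const T* b, T* out,
                                std::ptrdiff_t count) {
    for (std::ptrdiff_t i = 0; i < count; ++i) {
        out[i] += a[i] * b[i];
    }
}

// Explicit instantiations play the role of the generated type list.
template void sum_of_products_contig_two<float>(const float*, const float*,
                                                float*, std::ptrdiff_t);
template void sum_of_products_contig_two<double>(const double*, const double*,
                                                 double*, std::ptrdiff_t);
```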
Summary of changes:
Replace einsum_sumprod.c.src with einsum_sumprod.cpp.
Notes:
Considerations not implemented:
Benchmarks
Please let me know if you'd like me to adjust naming or structure.