

ENH: half precision complex #14753



Open
danielhrisca opened this issue Oct 22, 2019 · 18 comments

Comments

@danielhrisca
Contributor

Some binary file formats use half-precision complex numbers (float16 for the real and imaginary parts).

Since NumPy already supports half-precision floats, wouldn't it be possible to support the complex counterpart?

| Float size (bytes) | Float | Complex |
| ------------------ | ----- | ------- |
| 2                  | f2    | ???     |
| 4                  | f4    | c8      |
| 8                  | f8    | c16     |
@charris
Member

charris commented Oct 22, 2019

It is currently a major project to add new types to core NumPy, so this sort of enhancement will wait on a redo of the type system.

@danielhrisca
Contributor Author

@charris
This sounds good. Can you add this issue to the item list of that project?

@leofang
Contributor

leofang commented Jan 3, 2021

Hi @mattip @charris I was told by @rgommers that the new dtype system will likely land in the upcoming 1.20. Do you think someone from the NumPy team will have the bandwidth to work on adding numpy.complex32 in the near future? This would save us in CuPy a lot of headache (cupy/cupy#4454), because we really just want to focus on the GPU part (all of our CPU-side handling is done by NumPy via aliasing NumPy's types, so for example cupy.complex32 would simply be numpy.complex32). A few CUDA libraries provide complex32 support, so this would help us expose that support to our users, though we have other things to resolve too (cupy/cupy#3370). Thanks!

@mattip
Member

mattip commented Jan 3, 2021

It does seem like something worthwhile to add. @seberg thoughts?

@charris
Member

charris commented Jan 3, 2021

> It does seem like something worthwhile to add.

It has been in the back of my mind since float16 was introduced. If there is now a use for complex32 I'm in favor of adding support. @leofang NumPy does its float16 computations in float32, converting back and forth, so it should not be too difficult to add. I assume the GPU versions are more direct. We could probably add an 'E' type even without the new machinery, it just completes the old types.
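As a quick illustration (not a proposal), the existing one-character type codes show the gap that an 'E' would fill; the exact TypeError message may differ between versions:

```python
import numpy as np

np.dtype('e')                  # float16 -- no complex counterpart yet
np.dtype('f'), np.dtype('F')   # float32 / complex64
np.dtype('d'), np.dtype('D')   # float64 / complex128
np.dtype('g'), np.dtype('G')   # longdouble / clongdouble

np.dtype('E')                  # currently raises TypeError; this is the slot
                               # a complex32 type could occupy
```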

@seberg
Member

seberg commented Jan 3, 2021

The new dtypes "landed" but it is still very limited. Most importantly ufuncs still need a complete revamp and clean up, which I think will happen in the next months (NEP 43).

Chuck is right, you can add a new 'E' type to ensure that the ufunc part works. But I am not sure I want to dive into it (happy to give pointers though): Inserting an 'E' in the loop order might occasionally change the output dtype at the moment (although only for ufuncs that always take a complex output/input, so it likely doesn't matter).

Other than ufuncs, I have to clean out promotion. It slowed me down a bit at the end of last year, since NumPy promotion is a bit broken and I want to find the least annoying way to keep supporting it without special cases all over the place (most likely, that isn't possible...).

To be clear, there may be trickier things to deal with also since the scalar code is pretty tricky in itself and you probably would want to make the complex32 scalar use the same implementation as the other ones.


In any case, if I were asked today whether we can add float16, I would pause and suggest considering writing it externally first, since I doubt it is used all that much (or at least only in specialized code, for storage or GPU-related work). But we already have float16, so I am not opposed.
But I have to say that others may point out that bfloat16 is used more than complex numbers, so the reason for adding complex32 would be symmetry (including all complex types) rather than usefulness to users (i.e. it is unexpected if float16 is builtin but complex32 is external, yet it isn't super unexpected if you have to import bfloat16 from an external package).

@leofang
Contributor

leofang commented Jan 4, 2021

@mattip @charris @seberg Thank you all for the quick response!

> NumPy does its float16 computations in float32, converting back and forth, so it should not be too difficult to add. I assume the GPU versions are more direct.

With a preliminary test I think our GPU ufuncs can already do the same thing (converting to/from complex64), which is enough to get things moving. Once NVIDIA provides better coverage in their math libraries, this conversion can be eliminated.

> We could probably add an 'E' type even without the new machinery, it just completes the old types.

Right, having an 'E' type is enough for us; I made such an attempt in cupy/cupy#4454, but it would be much better if we didn't have to do this ourselves.

> Inserting an 'E' in the loop order might occasionally change the output dtype at the moment (although only for ufuncs that always take a complex output/input, so it likely doesn't matter).

> To be clear, there may be trickier things to deal with also since the scalar code is pretty tricky in itself and you probably would want to make the complex32 scalar use the same implementation as the other ones.

I am not sure I understand these parts, @seberg. Do you have numpy.result_type() in mind? We mostly follow NumPy's behavior, with limited exceptions like FFT, where the output dtype is the same as the input dtype if possible (since cuFFT allows this).

In CuPy we don't have scalars; all scalars in NumPy are 0-d arrays in CuPy. I hope this could eliminate some complexities? (Plus, the actual type casting is done on GPU, so really we just need an 'E' dtype as an identifier to select a code path, specialize a C++ template, launch the correct library API, etc.)

> But I have to say that others may point out that bfloat16 is used more than complex numbers, so the reason for adding complex32 would be symmetry (including all complex types) rather than usefulness to users

Well I thought you'd need special CPU instructions to support bfloat16 but not float16/complex32, so IMHO the amount of work is not comparable; the latter should be easier?

In fact, for CuPy's purposes I suspect it's enough if there is a "storage-only" dtype for complex32 etc., since the computation is delegated to the GPU and we just need the ability to check the correctness of the result (on CPU or GPU, possibly involving casting); it's OK if the storage-only dtypes do not support CPU computing (ufuncs, reductions, etc.).
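For illustration, a rough sketch of what such storage-only usage could look like today, with a structured dtype standing in for the hypothetical complex32 (field names and layout are made up, not a proposal):

```python
import numpy as np

# Stand-in for complex32: interleaved float16 real/imag pairs, 4 bytes per element.
complex32_like = np.dtype([('re', np.float16), ('im', np.float16)])
buf = np.zeros(8, dtype=complex32_like)

# Upcast to complex64 whenever a CPU-side computation or correctness check is needed...
c64 = buf['re'].astype(np.float32) + 1j * buf['im'].astype(np.float32)
c64 *= 2  # ...do the math in complex64...

# ...and downcast back into the storage-only layout.
buf['re'] = c64.real.astype(np.float16)
buf['im'] = c64.imag.astype(np.float16)
```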

@seberg
Member

seberg commented Jan 4, 2021

@leofang result_type is one problem (the one that is now up for fixing), so if you have any ideas what an API could look like that lets you insert complex32 into the NumPy type hierarchy, I am all ears.

For the ufuncs, if your ufunc currently supports complex64 and complex128 (but not integers/floats), the current way ufuncs work is a linear search. So an int8 input uses complex64, but if you add complex32, that is what you will get (if you define that int8 can cast safely to complex32, or better, that int8 and complex32 promote to complex32). I want to change that lookup (NEP 43).
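For a concrete illustration of that linear search with the types that exist today (the complex32 part is hypothetical):

```python
import numpy as np

# sqrt has no integer loops, so the linear search picks the first loop that
# int8 can be safely cast into -- the float16 ('e') loop:
np.sqrt(np.ones(3, dtype=np.int8)).dtype   # dtype('float16')

# By the same mechanism, a complex-only ufunc that currently resolves
# int8 -> complex64 would start resolving to complex32 once an 'E' loop
# is listed ahead of 'F' (assuming int8 -> complex32 is declared safe).
```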

Yes no scalars might eliminate some complexities, but at least to me it would seem a bit strange if complex32 was the odd one out that behaved differently?

Using E and complex32 to mean those things seems uncontroversial. I am unsure if we should do storage only in NumPy. The ufunc scheme will upcast to complex64 and do the calculation anyway (if safe casts/promotion is defined).

@leofang
Contributor

leofang commented Jan 14, 2021

Sorry @seberg that I dropped the ball...

> result_type is one problem (the one that is now up for fixing), so if you have any ideas what an API could look like that lets you insert complex32 into the NumPy type hierarchy, I am all ears.

> For the ufuncs, if your ufunc currently supports complex64 and complex128 (but not integers/floats), the current way ufuncs work is a linear search. So an int8 input uses complex64, but if you add complex32, that is what you will get (if you define that int8 can cast safely to complex32, or better, that int8 and complex32 promote to complex32). I want to change that lookup (NEP 43).

Am I understanding it correctly that these are the two facets of the same problem, so if we fix one we fix them all? What if we make complex32 a bit more restrictive than other dtypes by, say, making it only safe to cast to/from complex64 but nothing further? Would it help to maintain the current type hierarchy as is? I don't know if it's even possible though, but this would more or less achieve the essence of a storage dtype (+ some very limited computing capability).

> Yes no scalars might eliminate some complexities, but at least to me it would seem a bit strange if complex32 was the odd one out that behaved differently?

Not sure if I correctly get the different behavior that you referred to, but given that the actual computation performed on a complex32 array would necessarily involve an upcast before (and a downcast after), I think it's worth granting it a special status, by which I mean it meets the minimal requirement of serving as a functional dtype (i.e. storage + minimal compute). Am I making sense?

@seberg
Member

seberg commented Jan 14, 2021

> Am I understanding it correctly that these are the two facets of the same problem?

Yes, you could hack around the ufunc behaviour by not allowing the safe casts that would normally be there (as well as promotions). NumPy currently uses "safe casting" semantics for promotion (mostly in ufuncs), which in my current opinion mixes two concepts that do not really quite fit.

My hesitation is that float16 is not very limited or special, even though it is also de facto in the same category. So this would introduce a slightly "odd one out" dtype. On the other hand, one problem I just realized is that if you want to do this outside of NumPy, getting arr.real and arr.imag right may require some additional method/protocol.

Also, you could argue that complex32 will be just as odd as float16 with respect to all of these things. So maybe it just doesn't matter, as it is not like we can fix float16 soon (so at least we have two dtypes with the same "odd one out" behaviour).

@leofang
Contributor

leofang commented Jan 16, 2021

Thanks for the evaluation, Sebastian. To me it's fine that both float16 and complex32 fall on the "odd one out" side, as you put it: a fully functional dtype implies it can carry out every operation it could reasonably be permitted to, and if in most cases they need to be upcast first, that alone is enough to disqualify them. (So I actually like that you said "at least we have two (such) dtypes" -- float16 is no longer alone 🙂)

As for .real/.imag for complex32, it's unclear to me what the problem would be. Wouldn't they simply be non-contiguous (stride-4) float16 arrays that we can already handle? Could you elaborate on your concern?
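For reference, the existing complex dtypes already expose .real/.imag as strided views onto the component floats, and a structured-dtype stand-in for complex32 behaves analogously (sketch only, field names made up):

```python
import numpy as np

z = np.zeros(4, dtype=np.complex64)
z.real.dtype, z.real.strides          # (dtype('float32'), (8,)) -- a strided view

# A complex32 backed by interleaved float16 pairs would presumably give
# stride-4 float16 views in the same way:
c32 = np.zeros(4, dtype=[('re', np.float16), ('im', np.float16)])
c32['re'].dtype, c32['re'].strides    # (dtype('float16'), (4,))
```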

Finally, it seems our discussion is converging. What does it take to proceed from here? Should I bomb the mailing list? Does it require a NEP? I might not be able to contribute to the actual code, but I'd be happy to do some logistics (if any).

Thanks.

@HarshVardhanKumar

Most processors (except some of the really recent ones) don't support half-precision floating point for computation. They allow it for storage, but if a computation is needed, the half float is first converted to a full float, the computation is done, and the result is stored back as half precision.

I was curious how NumPy handles this restriction. Is the numpy.half data type really a native half type (and if so, how is the processor and language incompatibility handled), or does it simply use the approach above?

Thanks

@charris
Member

charris commented Jul 9, 2021

> I was curious how NumPy handles this restriction.

NumPy does float16 computations using float32. There is no hardware/compiler support used.
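A minimal check from Python (the float32 round trip happens inside the C loops and is not visible in the result dtype):

```python
import numpy as np

a = np.ones(3, dtype=np.float16)
(a + a).dtype   # dtype('float16') -- each element is widened to float32,
                # added, and narrowed back to float16 inside the ufunc loop
```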

@leofang
Contributor

leofang commented Dec 1, 2021

Hi @seberg I'd like to follow up and see how much closer we are / how much work is still needed to get numpy.complex32 supported, thanks!

@seberg
Member

seberg commented Dec 1, 2021

@leofang No closer, and it is not on anyone's roadmap.

@guberti

guberti commented Dec 20, 2024

Seconding this issue! Having native support for complex32 would be massively helpful for my use case.

@FedeMPouzols

The last discussions in this issue date back to about four years ago. I'd guess that the many changes made in the data type system since then, including release 2.x, might have changed the landscape? Any comments on the feasibility of this?

@jbms

jbms commented Mar 13, 2025

One possibility would be to attempt to implement this in ml-dtypes in a generic way (supporting all of the additional floating-point types defined by ml-dtypes as well). Some of the existing code in that package for defining custom dtypes is likely to be helpful, though I imagine there will be new challenges for complex numbers that have to be resolved. The promotion behavior likely won't be perfect but I don't think that will affect most use cases.
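For context, this is roughly how ml-dtypes exposes its extra floating-point types as NumPy dtypes today; a complex32 implemented there would presumably plug in the same way (sketch, assumes the ml_dtypes package is installed):

```python
import numpy as np
from ml_dtypes import bfloat16   # external package providing custom NumPy dtypes

a = np.array([1.0, 2.5, 3.0], dtype=bfloat16)
print(a.dtype)                   # bfloat16
print(a.astype(np.float32))      # casting to/from the custom dtype works
```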
