[ENH]: Which array libraries should Matplotlib support (and test support for)? #22645
Comments
I think natively anything that supports .to_numpy(), and I agree it might be worth adding xarray to the tests. Broadly this intersects with my dissertation work and the NASA RSE work, but basically, as far as I can tell, the idea is that if we can revamp the API enough to decouple the data bits from the artist/drawing bits, then external packages can write adapters against a data interface API. This allows external libraries to define what support means on their terms.
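As a rough illustration of that decoupling idea, here is a minimal sketch of a conversion shim built around to_numpy(); the function name and the fallback order are assumptions for illustration, not Matplotlib's actual API:

```python
import numpy as np

def coerce_to_numpy(data):
    """Hypothetical adapter: prefer an explicit to_numpy(), fall back to np.asarray()."""
    # Libraries such as pandas, xarray and polars expose a to_numpy() method.
    to_numpy = getattr(data, "to_numpy", None)
    if callable(to_numpy):
        return to_numpy()
    # Anything else that is already an ndarray or implements __array__
    # goes through NumPy's own coercion.
    return np.asarray(data)
```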
Is there not a NumPy standard compatibility NEP we can say we follow?
It looks like a lot of the relevant stuff is still in open NEPs?
For completeness, we should add h5py datasets to the above list. Their way of converting to numpy is via slicing.
With h5py dataset slicing we have a bunch of tests which check "do h5py and numpy slice exactly the same?", so it is very, very close to numpy.
Good to know. OTOH I don't think we're slicing in Matplotlib, so I assume we're relying on other numpy-like aspects.
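For reference, a small sketch of the two h5py conversion routes mentioned above (full-dataset slicing and np.asarray), using a throwaway file:

```python
import h5py
import numpy as np

with h5py.File("demo.h5", "w") as f:  # throwaway file just for the example
    dset = f.create_dataset("x", data=np.arange(10.0))
    # Full-dataset slicing is the canonical h5py way to get an ndarray ...
    arr_slice = dset[()]
    # ... and np.asarray() works too, since Dataset implements __array__.
    arr_asarray = np.asarray(dset)
    assert np.array_equal(arr_slice, arr_asarray)
```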
Maybe pyarrow (instead of polars specifically)? Then it would make matplotlib easier to use for anyone in the Arrow ecosystem, even from other languages that are doing interop with Python.
I didn't really get the relation between pyarrow and polars (just did a quick search), but it seems like both support to_numpy. It can make sense to add a test for both though, and I added them to the list (as well as h5py).
I guess I view this the other way around: we are not guaranteeing compatibility with anything except numpy. However, as a courtesy, we will take the results of your to_numpy() conversion.
Good question - Apache Arrow is a memory model, so it just tells you what the data should look like in memory, but it doesn't provide any functions to manipulate the data. Polars uses Apache Arrow as its memory model and provides methods to transform data. One of the motivations for using Arrow is passing data around in a zero-copy way between libraries, or even between languages, for example when calling between Python and another Arrow-native implementation. Hope this helps, and thank you for considering supporting these libs.
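A short sketch of what the to_numpy() route looks like for both libraries; whether the conversion is actually zero-copy depends on the dtype and on the presence of nulls:

```python
import numpy as np
import polars as pl
import pyarrow as pa

# Both libraries expose to_numpy(); for simple numeric data without nulls
# pyarrow can hand the buffer over without copying.
pa_arr = pa.array([1.0, 2.0, 3.0])
np_from_arrow = pa_arr.to_numpy()          # zero_copy_only=True by default

pl_series = pl.Series("y", [1.0, 2.0, 3.0])
np_from_polars = pl_series.to_numpy()

assert isinstance(np_from_arrow, np.ndarray)
assert isinstance(np_from_polars, np.ndarray)
```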
The conclusion we came to was to add tests for "all" libraries, but only install them in the weekly job that tests the newest version of numpy etc. (And not claim compatibility beyond "if it has a to_numpy you are quite likely to be able to plot something; if not, run the right function yourself".) See matplotlib/.github/workflows/tests.yml, lines 215 to 227 at 8c94a22.
In this way, we can get a heads-up in case something breaks.
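A hypothetical shape such a test could take, using pytest.importorskip so it only runs where the optional library happens to be installed (e.g. in the weekly job); whether it passes depends on the library and Matplotlib versions involved:

```python
import pytest

import matplotlib.pyplot as plt

def test_plot_polars_series():
    # Skipped automatically wherever polars is not installed, so the regular
    # CI matrix is unaffected and only the weekly job exercises it.
    pl = pytest.importorskip("polars")
    fig, ax = plt.subplots()
    ax.plot(pl.Series("x", [1, 2, 3]), pl.Series("y", [4, 5, 6]))
    plt.close(fig)
```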
We probably should put some floor on usage (polars has like 4k downloads from PyPI a day, pandas has 2-3M, xarray has 20-40k, and we have 500k-1M) before we worry about testing it.
Compatibility in the ecosystem has evolved. IMHO we should try to accept everything that follows the array API standard.
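For context, the array API standard lets a library advertise its namespace via __array_namespace__; a minimal sketch of how a consumer could branch on that (the helper name and the fallback are assumptions):

```python
import numpy as np

def get_namespace(x):
    """Hypothetical helper: use the Array API namespace if advertised, else NumPy."""
    if hasattr(x, "__array_namespace__"):
        # Conforming libraries hand back their own namespace of array
        # functions (NumPy does this from 2.0 on).
        return x.__array_namespace__()
    # Fall back to plain NumPy coercion for everything else.
    return np

xp = get_namespace(np.arange(3))
print(xp.asarray([1, 2, 3]))
```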
Problem
Related to e.g. #16402, #21036, #22560
We currently explicitly support numpy and pandas. What other array libraries are of interest to support natively (as in, users can just feed an array and Matplotlib handles the conversion)? Should we test for them? (There seems to be a decision that only pandas will be tested for: #19574 (comment))
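For illustration, this is the kind of usage in question: supported objects go straight into plotting calls and the conversion happens internally (pandas here; the question is which of the libraries listed below should get the same treatment):

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "t": np.linspace(0, 1, 50),
    "v": np.random.default_rng(0).random(50),
})

fig, ax = plt.subplots()
# No explicit .to_numpy() needed: Matplotlib converts the pandas columns itself.
ax.plot(df["t"], df["v"])
plt.show()
```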
Some alternatives:
- xarray: to_numpy, so right now it works because of pandas using the same approach, but it is not explicitly tested (see Improve pandas/xarray/... conversion #22560).
- polars: to_numpy exists.
- a tonumpy method (that is quite costly)
- a numpy() method and numpy emulation
- pyarrow: to_numpy, see https://arrow.apache.org/docs/python/numpy.html

Proposed solution
I think it can make sense to have a test run that exercises "all" the dependencies. It may not have to be executed on all platforms and all Python versions, but it will at least give some idea of whether things still work and when they break.
That is not necessarily the same as guaranteeing that these will always work.
cupy relies on GPUs, so it is not clear whether it is possible to test it.
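For reference, a hedged sketch of the cupy round trip; it needs a CUDA device to actually execute, which is exactly what makes it hard to cover in CI:

```python
import numpy as np

try:
    import cupy as cp
except ImportError:
    cp = None

if cp is not None:
    # cupy arrays live on the GPU; cupy.asnumpy() (or .get()) copies them
    # back to host memory as a plain ndarray.
    gpu = cp.arange(5.0)
    host = cp.asnumpy(gpu)
    assert isinstance(host, np.ndarray)
```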
We should also probably add something to the documentation about which libraries are supported (and which are not). (Maybe there already is; I primarily looked at the code...)