Improve pandas/xarray/... conversion #22560
lib/matplotlib/axes/_axes.py
Outdated
- if hasattr(X, 'values'):  # support pandas.Series
-     X = X.values
+ if hasattr(X, 'to_numpy'):  # support pandas.Series
+     X = X.to_numpy()
Are there any libraries we need to worry about that implement `X.values`, but not `X.to_numpy()`? Maybe we also want to leave an `elif` with the `X.values` block as a fallback just in case...?
Perhaps this should be coordinated with #22141?
That's where I got the idea. As seen, I have not touched Related to the question from @greglucas, I do not know. But I saw this code in matplotlib/lib/matplotlib/cbook/__init__.py Lines 1369 to 1377 in 3a994d2
This is a bit more complex: it uses a try-except approach (not sure how much that affects things, though), falls back on `values`, and checks that `values` actually returns a NumPy array. One can of course use a similar approach here (and in #22141), possibly slightly improved: I do not know whether there are cases where evaluating `values` is non-trivial, so there is no need to run it twice. Edit: I have now touched this and use a similar approach, but with
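For readers following the thread, the fallback idea discussed above could be sketched roughly as follows. This is an illustration only, not the code from this PR or from cbook: the name `unpack_to_numpy_sketch` and the two stand-in classes are hypothetical, and the sketch uses `hasattr` checks rather than the try-except approach mentioned above. It prefers `to_numpy()`, reads `values` only once, and accepts `values` only when it really is a NumPy array.

```python
import numpy as np


def unpack_to_numpy_sketch(x):
    """Return a NumPy array extracted from x, or x itself if nothing applies.

    Hypothetical helper for illustration; not the implementation in this PR.
    """
    if isinstance(x, np.ndarray):
        # Already an array: nothing to unpack.
        return x
    if hasattr(x, "to_numpy"):
        # Preferred accessor, e.g. pandas.Series or xarray.DataArray.
        return x.to_numpy()
    if hasattr(x, "values"):
        # Fallback: read `values` once and accept it only if it is an array.
        # A dict, for example, has a `values` *method*, which must be ignored.
        values = x.values
        if isinstance(values, np.ndarray):
            return values
    return x


class WithToNumpy:
    """Stand-in for an object that provides `.to_numpy()`."""

    def to_numpy(self):
        return np.array([1, 2, 3])


class OnlyValues:
    """Stand-in for an object that exposes `.values` but no `.to_numpy()`."""

    values = np.array([4, 5])
```

The `isinstance` check on `values` is what keeps plain dicts (and anything else with an unrelated `values` attribute) from being unpacked by accident; such inputs are returned unchanged.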
I guess you could write a |
Yes, I was also thinking about a helper function (but didn't know where to place it, so great info that cbook is the place). I do not have any strong opinions as such, more that I read the link and it seemed like the right thing to do. |
a0488df
to
64a301e
Compare
Seems like the easiest way is to wait for #22141, add those conversions here as well, and then discuss the correct name for the function.
I'm thinking that maybe |
64a301e
to
a1757d1
Compare
This is now updated:
|
a1757d1
to
873dff2
Compare
873dff2
to
1e27b8a
Compare
Please update the lines touched by #22141 |
19e86ca
to
f17f3af
Compare
There is also this function where it doesn't work to simply replace matplotlib/lib/matplotlib/cbook/__init__.py Lines 1607 to 1639 in 0359832
|
    np.testing.assert_array_equal(Idx, IdxRef)


def test_index_of_xarray(xr):
Does xarray get us more coverage here? They have a `to_numpy()` method the same as pandas I believe.
https://xarray.pydata.org/en/stable/generated/xarray.DataArray.to_numpy.html
So, it seems like a pretty heavy dependency to add for just this one test...
Not so much for coverage as for actually testing with data in specified formats. Given the discussion about which formats we support, it makes sense to test them as well. Right now some of these are tested in the plots, but it may make sense to simply test them here, as these are the core functions used to get data that can be plotted.
If we claim (which we actually don't, maybe we should?) that we can plot xarray, we should probably test it as well, along with other types that we may want to claim to support. Or maybe fork off a specific dependency test that is not executed on all platforms/versions, including pandas (which is 11.7 MB; xarray is 870 kB).
(There is another xarray-test above, so two.)
I can of course remove them, but I think we should discuss if we want to support more formats than pandas and numpy (and Python list/tuple), and, if so, have explicit tests for them.
I agree, we should probably discuss what we want to support/test. To me, this doesn't seem to add a whole lot of value for adding a new dependency.
There was also a discussion around removing Scipy as a dependency in the docs: #22120
I removed the dependencies but kept the tests. Hence, they will run if xarray is available.
I also opened #22645 for discussions (probably should be discussed at a dev-call as well).
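As a rough illustration of tests that run only when an optional dependency is available, here is a minimal sketch using only the standard library. The helper `optional_module` and the test class name are hypothetical and are not taken from the matplotlib test suite; in a pytest-based suite, `pytest.importorskip` serves a similar purpose.

```python
import importlib
import importlib.util
import unittest


def optional_module(name):
    """Import and return module `name`, or return None if it is not installed."""
    if importlib.util.find_spec(name) is None:
        return None
    return importlib.import_module(name)


xr = optional_module("xarray")  # None when xarray is unavailable


@unittest.skipIf(xr is None, "xarray is not installed")
class TestXarrayToNumpy(unittest.TestCase):
    def test_to_numpy_matches_input(self):
        import numpy as np

        data = np.arange(6).reshape(2, 3)
        # DataArray.to_numpy() is the accessor mentioned earlier in this review.
        np.testing.assert_array_equal(xr.DataArray(data).to_numpy(), data)
```

With this pattern the test is reported as skipped, rather than failing, on environments where xarray is missing, so the package does not have to be a hard test dependency.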
05e8501
to
f17f3af
Compare
f17f3af
to
7b51044
Compare
I made an executive decision to install xarray on CI. We already have all of its dependencies installed and it is a pure-python package. |
def _unpack_to_numpy(x):
    """Internal helper to extract data from e.g. pandas and xarray objects."""
Please document what we intend to support, i.e. everything with `.to_numpy()` or `.values`, and what types we expect to catch with it, e.g. `values` -> older pandas DataFrames(?).
We discussed this on the call; I will write up an issue shortly about documenting the intention etc.
meeseeksdev backport to v3.5.x |
…560-on-v3.5.x Backport PR #22560 on branch v3.5.x (Improve pandas/xarray/... conversion)
See matplotlib/matplotlib#22973, matplotlib/matplotlib#22879, and matplotlib/matplotlib#22560. It is not clear to me which is the standard interface for unit handling (e.g., hist still doesn't handle units by default).
PR Summary
See https://stackoverflow.com/questions/13187778/convert-pandas-dataframe-to-numpy-array/54508052#54508052 for motivation.
Related to #16402