Fix check 1d #22141
lib/matplotlib/tests/test_axes.py
Outdated
for x in [pd.Series([1, 2], dtype="float64"),
          pd.Series([1, 2], dtype="Float32")]:
Is this specifically for "Float32" or would the test also do with "Float64"? If it's only the custom type and precision does not matter, I'd go with "Float64" to communicate that we're primarily testing the custom pandas type.
Side-note: It seems the capital "Float" types are not yet documented. There's only a v1.2.0 change note and the GH issues linked therein.
As far as I can tell it's undocumented. Maybe we should hold off on support, or maybe this PR can go in, but without the test?
I actually disagree with Pandas having a new type here - it seems people want this for pedantic reasons, but I think 99.999% of the world doesn't care if NaN means a failed computation or missing data. However, if they feel strongly, they should get numpy on board, and then everyone will have this flag rather than a new data type.
I think any of their capital F floats will work.
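For readers unfamiliar with the capital-"F" dtypes discussed in this thread, here is a minimal sketch contrasting the NumPy-backed `float64` dtype with the pandas nullable `Float64` extension dtype. It assumes pandas 1.2 or later (the release whose change note is mentioned above); the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# NumPy-backed float dtype: a missing value is stored as NaN.
s_numpy = pd.Series([1.0, None], dtype="float64")

# pandas nullable extension dtype (capital "F"): a missing value is pd.NA.
s_nullable = pd.Series([1.0, None], dtype="Float64")

print(s_numpy.dtype, s_nullable.dtype)
print(np.isnan(s_numpy.iloc[1]), s_nullable.iloc[1] is pd.NA)
```

The distinction between NaN and `pd.NA` is the "flag" the comment above refers to.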
@@ -1649,7 +1650,7 @@ def index_of(y):
The x and y values to plot.
"""
try:
return y.index.values, y.values
Do we need both this change and the additional exception handling above?
I don't know, but I think we should be in the habit of using to_numpy() when possible?
to_numpy came in via pandas 0.24 on Jan 25, 2019, so we can rely on it being there, and https://stackoverflow.com/a/54508052/380231 makes an argument in favor of to_numpy.
I think this change may be unrelated (or fix this by chance), but I do not think it will avoid needing the other one, and it does no harm.
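As a rough sketch of why `to_numpy()` is the more explicit choice: it lets the caller state the target dtype and the missing-value marker, whereas `.values` returns whatever container backs the Series. This assumes pandas 1.2 or later and is only an illustration, not the code changed in this PR.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, None], dtype="Float64")

# .values hands back the backing extension array, not a numpy.ndarray.
backing = s.values

# to_numpy() lets the caller choose the dtype and the missing-value marker.
arr = s.to_numpy(dtype="float64", na_value=np.nan)

print(type(backing).__name__, type(arr).__name__, arr.dtype)
```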
@tacaswell, thanks for your review, but dismissing as this approach is completely different, and attempts to remove the pandas-ness of the data right at the beginning.
Approach now is completely different, so requires a re-review
I have some concerns (mostly because any time we change complicated code we find that the complexity was there to handle some weird corner case that failed to get a test), but overall am 👍 on this as I think this will also fix other data containers. I think there is a risk that something subtle will break, but we are already dealing with something subtle breaking and I am optimistic that the cure will not be worse than the malady.
If you think of a DF as a 2D array with columns, then this is consistent with how we handle 2D arrays being passed into plot.
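To illustrate the 2D-array behavior referred to here: `plot` draws one line per column of a 2D array. The sketch below uses the non-interactive Agg backend so that it runs without a display; the data values are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

# Three rows, two columns: each column becomes its own line.
y = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

fig, ax = plt.subplots()
lines = ax.plot(y)
n_lines = len(lines)
print(n_lines)
plt.close(fig)
```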
This gets used in matplotlib/lib/matplotlib/axes/_base.py Lines 486 to 495 in f0593a6
and it gets called before we check units. So I guess what I am proposing here would strip an object that has a
Meh, it has done this conversion for a long time.
@tacaswell, this simplified a bit more since your approval - if you wanted to double check, that would be appreciated.
In #22560 I added I guess it can make sense to have a coordinated merge of this and that PR. If this is merged first, I'll update my PR. If my PR is merged first, it can make sense to use that function here.
Well, first, it's not only pandas that has to_numpy, so that is a bit of a misnomer. But also, why have a separate method at all?
It was suggested to use a separate function. Right now, slightly different approaches are used at different locations in the code. Sometimes a fallback to Regarding naming, I considered that, but I do not know which other libraries support that function. However, I do not see the name as either written in stone or something that should prohibit using a single function; there will be a name that is correct enough. (And as you can see, there are explicit comments mentioning pandas at all the other locations where it was used.)
I then assume that we merge this first and find a good name for the function. If you want to object strongly to using the function here, you can do that in #22560.
I meant why have a separate method from check_1d? Our problem is inconsistent duck typing, so if we can have it all in one spot that would be very helpful. If check_1d does more than duck type pandas then sure, it could call the duck-type converter. I do actually wonder if all of this should just be part of the unit conversion machinery rather than cbook calls.
I do not have enough of an overview of the code base to see if one should/could have used check_1d (or check_2d?) instead. But if possible, that is of course even better.
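The idea of keeping the duck typing in one spot, with a check_1d-style function calling a shared converter, could be sketched as follows. This is not the matplotlib implementation: `unpack_to_array`, `check_1d_sketch`, and `FakeSeries` are hypothetical names invented for this illustration, and only NumPy is assumed.

```python
import numpy as np


def unpack_to_array(x):
    """Hypothetical converter: unwrap duck-typed containers to an ndarray."""
    if hasattr(x, "to_numpy"):
        # pandas-like objects, and anything else exposing to_numpy().
        return x.to_numpy()
    if hasattr(x, "values"):
        values = x.values
        # Accept .values only when it is a real ndarray; a dict, for
        # example, has a .values method that must not be used here.
        if isinstance(values, np.ndarray):
            return values
    return x


def check_1d_sketch(x):
    """Hypothetical check_1d-style function built on the converter."""
    x = unpack_to_array(x)
    if not hasattr(x, "shape") or len(x.shape) < 1:
        # Scalars and plain sequences become arrays with at least 1 dim.
        return np.atleast_1d(x)
    return x


class FakeSeries:
    """Minimal duck-typed container used only for this illustration."""

    def __init__(self, data):
        self._data = list(data)

    def to_numpy(self):
        return np.asarray(self._data)


print(check_1d_sketch(FakeSeries([1, 2])))
print(check_1d_sketch(3.0))
```

With this layout, any container that exposes `to_numpy()` is handled by the same code path, which is the consistency the comments above ask for.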
…141-on-v3.5.x Backport PR #22141 on branch v3.5.x (Fix check 1d)
PR Summary
Closes #22125 and #22330
This was an issue with handling
in the logic of our plot argument parsing.
x.values returns a FloatingArray. But it doesn't behave like a numpy array, in that x.values[:, None] raises a ValueError: values must be a 1D array. I actually think this is a Pandas bug, but I'll leave that to someone with more pandas knowledge to inform them. However, we can get around it by using x.to_numpy(), which seems to work fine.
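The behavior described in the PR summary can be probed with a short sketch. Whether `x.values[:, None]` raises depends on the installed pandas version, so the example catches the error instead of asserting it; the `to_numpy()` path yields an ordinary NumPy array that accepts the same indexing. The variable names `outcome` and `as_column` are illustrative.

```python
import pandas as pd

x = pd.Series([1, 2], dtype="Float32")

# Indexing the backing extension array may fail on some pandas versions.
try:
    expanded = x.values[:, None]
    outcome = f"indexing worked, result type {type(expanded).__name__}"
except Exception as err:  # broad on purpose: the error type is version dependent
    outcome = f"{type(err).__name__}: {err}"
print(outcome)

# Converting first gives a plain ndarray, so adding an axis is safe.
as_column = x.to_numpy()[:, None]
print(as_column.shape)
```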