Fix Axes.hist crash for numpy timedelta64 inputs #31203
timhoffm merged 10 commits into matplotlib:main
Thank you for opening your first PR into Matplotlib! If you have not heard from us in a week or so, please leave a new comment below and that should bring it to our attention. Most of our reviewers are volunteers and sometimes things fall through the cracks. You can also join us on gitter for real-time discussion. For details on testing, writing docs, and our review process, please see the developer guide. We strive to be a welcoming and open project. Please follow our Code of Conduct.
new_x = []
for xi in x:
    arr = np.asarray(xi)
This is unnecessary. cbook._reshape_2D() already returns a list of arrays.
So at least

    x = [arr / np.timedelta64(1, 'D') if np.issubdtype(arr.dtype, np.timedelta64) else arr
         for arr in x]

is possible.
Another question to be investigated: Is arr / np.timedelta64(1, 'D') equivalent to arr.astype(float) for timedelta arrays? If so, would a general x = [arr.astype(float) for arr in x] be reasonable?
> Is arr / np.timedelta64(1, 'D') equivalent to arr.astype(float) for timedelta arrays?
No, which is the whole problem here. If it was possible to cast timedelta64 to float, the comparison would work.
astype(float) is valid, and I think is the better solution. Dividing by one day is going to give you tiny numbers if you started with seconds, and is going to error if you started with months or years.
astype(float) is unsafe so if you choose to cast in this function you should be explicit in the documentation
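To make the difference between the two conversions discussed in this thread concrete, here is a minimal sketch. The sample arrays are made up for illustration and this is not the code from the PR. It shows that dividing by one day rescales the values, while astype(float) keeps the raw counts in the array's own unit.

```python
import numpy as np

# Illustrative data only: two durations stored with a unit of seconds.
seconds = np.array([1, 2], dtype="timedelta64[s]")

# Dividing by a one-day timedelta rescales to fractional days,
# which gives tiny numbers for second-resolution data.
as_days = seconds / np.timedelta64(1, "D")

# astype(float) keeps the stored integer counts in the array's own unit,
# so the unit information is silently dropped.
as_counts = seconds.astype(float)

print(as_days)
print(as_counts)

# Month-based timedeltas have no fixed length in days, so this division
# is expected to be rejected, as noted in the review comment.
months = np.array([1, 2], dtype="timedelta64[M]")
try:
    months / np.timedelta64(1, "D")
    month_division_ok = True
except TypeError:
    month_division_ok = False
print(month_division_ok)
```

Because astype(float) discards the unit, any code that takes this route would need to document that the resulting numbers are counts of whatever unit the input carried.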
nx = len(x) # number of datasets
nx = len(x)
for arr in x:
Overall, I'm not in favor of checking for all kinds of erroneous inputs. This is not good, but it's possibly the least bad option (other than making timedelta work properly) because timedelta is an accepted input in other cases.
The windows Tk backend test (
The timeouts are a known CI issue and unrelated: #30851
Please correct the timedelata spelling typo, and indent the closing parens to align with the raise blocks.
I'm a little wary of an O(n) type check for object arrays. I think checking just the first element is fine, as mixed object arrays fail anyways.
E.g., this fails:
import matplotlib.pyplot as plt
import numpy as np
data = np.array([1, 'hello', 3.5], dtype=object)
fig, ax = plt.subplots()
ax.hist(data) # Fails
print(np.array([1, 'hello', 3.5]))
# Note that without an explicit dtype, these all get cast to strings
# ['1' 'hello' '3.5']
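The suggestion to inspect only the first element of an object array could be sketched as follows. The helper name first_element_is_timedelta is hypothetical and does not come from the PR or from Matplotlib. It only illustrates a constant-time check, under the assumption that mixed object arrays fail later anyway.

```python
import datetime

import numpy as np


def first_element_is_timedelta(arr):
    """Hypothetical helper: cheap check for timedelta-like data."""
    arr = np.asarray(arr)
    # Native timedelta64 arrays are identified by dtype alone.
    if np.issubdtype(arr.dtype, np.timedelta64):
        return True
    # For object arrays, look at the first element only instead of
    # scanning every element.
    if arr.dtype == object and arr.size > 0:
        first = arr.flat[0]
        return isinstance(first, (datetime.timedelta, np.timedelta64))
    return False


print(first_element_is_timedelta(np.array([1, 2], dtype="timedelta64[D]")))
print(first_element_is_timedelta(np.array([1, "hello", 3.5], dtype=object)))
```

The trade-off is that an object array whose first element is not a timedelta but whose later elements are would pass this check and fail further down, which the comment above treats as acceptable.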
Thanks @jayaprajapatii!
Thank you :) for the review and merge! Looking forward to contributing more to matplotlib.
Closes #31182
Summary
Axes.hist fails when passed arrays of numpy.timedelta64. This happens because the histogram implementation performs numeric operations (range estimation, binning) that are not directly compatible with timedelta64 dtypes.
What this PR does:
- Handles numpy.timedelta64 inputs in Axes.hist.
- Converts timedelta64 inputs to a numeric representation prior to range estimation and binning.
- Ensures timedelta64 inputs no longer crash.
Example:
import numpy as np
import matplotlib.pyplot as plt
arr = np.array([1,2,5,7], dtype="timedelta64[D]")
plt.hist(arr)
plt.show()
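As a rough illustration of the "numeric representation before binning" idea in the summary, the sketch below converts the same data to float days and bins it with np.histogram. Using np.histogram as a stand-in for the internals of Axes.hist and choosing days as the unit are both assumptions made here, and this is not the code merged in the PR.

```python
import numpy as np

arr = np.array([1, 2, 5, 7], dtype="timedelta64[D]")

# Assumption: express the durations as float days; other data may
# call for a different unit.
days = arr / np.timedelta64(1, "D")

# With plain floats, range estimation and binning work as usual.
counts, edges = np.histogram(days, bins=10)

print(counts.sum())
print(edges[0], edges[-1])
```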