BUG: boxplot fails when one column is all NaNs #8240
@@ -764,3 +764,4 @@ Bug Fixes
needed interpolating (:issue:`7173`).
- Bug where ``col_space`` was ignored in ``DataFrame.to_string()`` when ``header=False``
(:issue:`8230`).
- Bug where ``Dataframe.boxplot()`` failed when entire column was empty.
pls add issue number
minor comments, then squash. @TomAugspurger passing this to you, merge at your discretion
Add test case for empty column
Replace empty arrays with [np.nan]
Add fix to the release notes
Add issue numbers
bbe54c3 to d645f23
@TomAugspurger this one's had the issue numbers added and then squashed, OK to merge or do I need to come up with a different approach?
I think this is fine. I'll merge tomorrow when I get connected to wifi again.
Sorry, I let this get out of date. Merged via: fb977d7
This is more properly addressed by matplotlib/matplotlib#3571
Empty distributions are plottable in mpl < 1.4.0. In 1.4.0, a ValueError is raised. This has been fixed in mpl 1.4.0-dev (see matplotlib/matplotlib#3571). In order for skbio.draw.boxplots to support empty distributions across mpl versions, empty distributions are replaced with [np.nan]. See pandas-dev/pandas#8382 and pandas-dev/pandas#8240 for details.
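The replacement described here can be sketched roughly as follows. This is a minimal illustration, not the code from either project: the helper name `replace_empty_with_nan` is hypothetical, and it assumes NaNs are dropped from each column before plotting, so an all-NaN column ends up empty.

```python
import numpy as np


def replace_empty_with_nan(columns):
    """Hypothetical helper: swap each empty array for a one-element [np.nan]."""
    cleaned = []
    for values in columns:
        values = np.asarray(values, dtype=float)
        # Assumption: NaNs are dropped before the data reaches the plotting
        # call, so an all-NaN column becomes an empty array at this point.
        values = values[~np.isnan(values)]
        if values.size == 0:
            # Quantiles cannot be computed on an empty array; a single NaN
            # keeps the column's slot while giving the plot nothing to draw.
            values = np.array([np.nan])
        cleaned.append(values)
    return cleaned


columns = [[1.0, 2.0, 3.0], [np.nan, np.nan], []]
cleaned = replace_empty_with_nan(columns)
print([c.tolist() for c in cleaned])
```

The cleaned list could then be handed to a boxplot call. Whether the NaN-only column renders as a blank position depends on the matplotlib version, as noted above.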
Fixes #8181. Currently the boxplot fails when trying to compute quantiles on an empty array, which numpy can't deal with. Works okay if we use an array with a single `np.nan` instead, the empty column is just left out of the plot as expected.

I guess that might be a bit of a hack, relying on how numpy deals with a single `np.nan` in an array versus an empty array? If people think it's too risky to depend on that behaviour I'll try to rethink.

New output: