Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Add appropriate error on color size mismatch in scatter #7363

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 9, 2016

Conversation

bcongdon
Copy link
Contributor

Addresses #7314.

I imagine that the call to ravel is a necessary evil, but if not, it might be more prudent to compare shape of x and c_array

@NelleV
Copy link
Member

NelleV commented Oct 30, 2016

Can you please add a test?

@bcongdon
Copy link
Contributor Author

Yup! Added one.

@NelleV NelleV changed the title Add appropriate error on color size mismatch in scatter [MRG+1] Add appropriate error on color size mismatch in scatter Oct 30, 2016
@NelleV
Copy link
Member

NelleV commented Oct 30, 2016

Thanks!
LGTM

Copy link
Member

@efiring efiring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scatter argument handling is tricky...

c = np.ma.ravel(c_array)
elif c_array.size not in (3, 4):
# Wrong size. Not a rgb/rgba and not same size as x
raise ValueError("x and c must be the same size")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right place or way to do the checking. It is not taking into account the docstring description of c. See below.

@cleanup
def test_color_length_mismatch():
N = 500
x, y = np.random.rand(N), np.random.rand(N)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interest of keeping tests minimal and direct, you could reduce N to 5, and use arange instead of rand.
You could also add a test for the handling of c when it is an rgba or sequence of them.

c = np.ma.ravel(c_array)
elif c_array.size not in (3, 4):
# Wrong size. Not a rgb/rgba and not same size as x
raise ValueError("x and c must be the same size")

if c_array is None:
colors = c # must be acceptable as PathCollection facecolors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the comment. I think what you need to do is put in a check here to ensure that c is "acceptable as PathCollection facecolors". You can look at the Collections code to see what that entails.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or push this checking in to collections and make sure a sensible exception is raised there?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, no we can't due to the masked point logic a few lines down.

@@ -4014,14 +4014,15 @@ def scatter(self, x, y, s=None, c=None, marker=None, cmap=None, norm=None,
else:
try:
c_array = np.asanyarray(c, dtype=float)
if c_array.size == x.size:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking back at @anntzer's comment in #7314, I think this line should be checking that c_array.shape matches the original shape of x and y. That implies that x and y should be checked earlier for matching shapes, and that their original common shape should be saved so it can be used in the c shape-check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstrings say x/y must be (n, ) shaped, but it does not look like we check that anywhere. I suspect we have the ravel to in fact support (n, 1) and (1, n) shaped arrays arrays.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about a preprocess step where x and y get squeezed? that'd convert both (n, 1) and (1, n) to (n, ) and leave anything else alone.

@tacaswell
Copy link
Member

I basically agree with @efiring on this review.

@bcongdon
Copy link
Contributor Author

Thanks for the comments. I switched around the logic to test c_array against the original shape of x and y. I allowed it to match either one, as I wasn't sure how else to differentiate the (1,n) vs (n,1) case that @tacaswell described.

It looks like PathCollections eventually uses mcolors.to_rgba_array to check the validity of the facecolors, so I used that to assert the validity of c as a color sequence.

mcolors.to_rgba_array(colors)
except ValueError:
# c not acceptable as PathCollection facecolor
raise ValueError("c not acceptable as color sequence")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message should probably contain the x shape, y shape, and c shape. At this point we have tried all of the things the user could have meant, have not been able to work it out so we should give feedback on everything we tried.

@bcongdon
Copy link
Contributor Author

Added the shape of c, x, and y to the ValueError message.

@story645
Copy link
Member

story645 commented Nov 2, 2016

@bcongdon I think the point is that arrays of shape (n, 1) and (1, n) should be treated as if they have equal shape, but checking on size is a bad idea 'cause an array of shape (2,2) has equal size with an array of (4,1) or (1,4).

@tacaswell
Copy link
Member

@bcongdon This seems to have conflicts in the test file, can you rebase?

@efiring can you please re-review (you have a request changes which I am pretty sure are addressed).

I agree with @story645 that using squeeze instead of ravel would make more sense, but worry that would break undocumented (but probably relied on) behavior (because users). If we do that it should be a follow up PR and not block this one as-is.

@tacaswell tacaswell added this to the 2.0.1 (next bug fix release) milestone Nov 8, 2016
@tacaswell tacaswell changed the title [MRG+1] Add appropriate error on color size mismatch in scatter [MRG+2] Add appropriate error on color size mismatch in scatter Nov 8, 2016
@bcongdon
Copy link
Contributor Author

bcongdon commented Nov 8, 2016

Please excuse my novice git skills. Not sure if I rebased correctly... Is there a better way to rebase the current master changes without having all the subsequent commits showing up in the PR?

@tacaswell
Copy link
Member

The last step needs to be a force-push. your last commit merges the just-rebased commits back into your local branch so they will show up twice. See http://matplotlib.org/devdocs/devel/gitwash/development_workflow.html?highlight=rebase#rebasing-a-pull-request-pr

@bcongdon bcongdon force-pushed the scatter-error-msg-fix branch from c07ccf1 to 12a5fdd Compare November 8, 2016 16:15
@bcongdon
Copy link
Contributor Author

bcongdon commented Nov 8, 2016

Thanks! 👍

@QuLogic
Copy link
Member

QuLogic commented Nov 8, 2016

There are still 8 commits when there should be 4; when you rebase, be sure to drop the extra copy. Use git rebase -i for an interactive rebase and drop one of the copies.

@bcongdon bcongdon force-pushed the scatter-error-msg-fix branch from 12a5fdd to 65e6a64 Compare November 8, 2016 21:14
@bcongdon
Copy link
Contributor Author

bcongdon commented Nov 8, 2016

Whoops, didn't realize that. Thanks

# c not acceptable as PathCollection facecolor
msg = ("c of shape {0} not acceptable as a color sequence "
"for x with shape {1}, y with shape {2}")
raise ValueError(msg.format(c.shape, x.shape, y.shape))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see possibilities for improvement in this block: first, the error message is based on the raveled x and y shapes, not the original shapes, so this could be confusing. Maybe just say 'for x with size...' etc. and use x.size etc. After all, x and y are documented as 1-D and treated as 1-D, so it is their size that matters, not their shapes. Second, I think that you could make it both more compact and more efficient by keeping the work done by mcolors.to_rgba_array. You can delete the colors = c, and replace the line inside the 'try' with 'colors = mcolors.to_rgba_array(c)'. It will be subject to another to_rgba_array call inside PathCollection, which will be able to recognize that it already has an rgba array, instead of using a list comprehension with a function call for each color.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I made those changes.

@efiring efiring changed the title [MRG+2] Add appropriate error on color size mismatch in scatter Add appropriate error on color size mismatch in scatter Nov 9, 2016
@efiring efiring merged commit 5901b38 into matplotlib:master Nov 9, 2016
@efiring
Copy link
Member

efiring commented Nov 9, 2016

When I tried to backport this, I realized that it contains an error: instead of adding a test, it is replacing an unrelated test. I should have noticed this before merging. I will fix it right now.

@efiring
Copy link
Member

efiring commented Nov 9, 2016

I don't want to spend any more time thrashing around with the tests, which are incompatible between v2.x and master, so I am not going to backport this. Therefore I am moving the milestone to 2.1. That should be soon enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants