[MRG + 1] Fix element-wise comparison for numpy #8011

Merged
lesteve merged 2 commits into scikit-learn:master from dev-fix-numpy-broken
Dec 12, 2016

Conversation

aashil
Contributor

@aashil aashil commented Dec 8, 2016

Reference Issue

Working on #7994

What does this implement/fix? Explain your changes.

Any other comments?

Fix the element-wise comparison issue with numpy.
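
For context, here is a minimal sketch of how this kind of failure can arise. It assumes a NumPy version in which comparing an array with None is element-wise, and the weights value is made up for illustration. A membership test against a tuple compares the array with each item and then asks for the truth value of the result:

```python
import numpy as np

multioutput_options = (None, 'raw_values', 'uniform_average',
                       'variance_weighted')
# Hypothetical custom weights, as a caller might pass for multioutput.
weights = np.array([0.3, 0.7])

try:
    # `in` evaluates `weights == item` for each item and takes bool() of it.
    weights in multioutput_options
    raised = False
    message = ""
except ValueError as exc:
    # With element-wise comparison the result is an array, and the truth
    # value of a multi-element array is ambiguous.
    raised = True
    message = str(exc)

print(raised, message)
```

Guarding the membership test with a string check first, as discussed later in this conversation, avoids ever comparing an array against the option tuple.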

@aashil
Contributor Author

aashil commented Dec 8, 2016

@lesteve: I followed your instructions, which gave me the error below on this particular line:
multioutput = check_array(multioutput, ensure_2d=False)

Error:

TypeError: Singleton array array('variance_weighted', 
          dtype='|S17') cannot be considered a valid collection.

On a side note, what do you mean when you say install numpy from master ?

@aashil aashil changed the title from "[WIP]" to "[WIP] Fix element-wise comparison for numpy" on Dec 8, 2016
@lesteve
Member

lesteve commented Dec 8, 2016

Error:

TypeError: Singleton array array('variance_weighted',
dtype='|S17') cannot be considered a valid collection.

Hmmm that means that somewhere the string 'variance_weighted' is turned into an array ... so the string_types clause is skipped. You will need to either:

  1. understand where in our code we do this conversion 'variance_weighted' -> np.array('variance_weighted') and see how easy it is to get rid of it
  2. treat the case of singleton string arrays in the string_types clause

My preference is for 1.

On a side note, what do you mean when you say install numpy from master ?

That means installing the numpy development version, because that is what the original issue was about. You can still work on this issue without it, but the only way to make sure the original issue is fixed is through Travis. This can be a bit cumbersome and frustrating because the feedback loop is a lot longer than testing things locally.

@@ -90,7 +90,10 @@ def _check_reg_targets(y_true, y_pred, multioutput):
     n_outputs = y_true.shape[1]
     multioutput_options = (None, 'raw_values', 'uniform_average',
                            'variance_weighted')
-    if multioutput not in multioutput_options:
+    if isinstance(multioutput, string_types) and multioutput not in multioutput_options:
Contributor

@dalmia dalmia Dec 9, 2016

When any value among raw_values, uniform_average or variance_weighted is passed, this condition is not met since it checks for invalid multioutput_option. We don't have any check for multioutput being a string type and a valid multioutput_option. That check needs to be added.

Contributor Author

I will refine the if else to make sure we account for the correct multioutput check there.

@@ -90,7 +90,10 @@ def _check_reg_targets(y_true, y_pred, multioutput):
     n_outputs = y_true.shape[1]
     multioutput_options = (None, 'raw_values', 'uniform_average',
                            'variance_weighted')
-    if multioutput not in multioutput_options:
+    if isinstance(multioutput, string_types) and multioutput not in multioutput_options:
+        raise ValueError("Invalid multioutput value")
Contributor

I feel it's better to use Invalid multioutput option here.

Contributor Author

Ok. If that's what you prefer.

@dalmia
Contributor

dalmia commented Dec 9, 2016

However, it does seem natural that check_array should have a default behavior if the array passed is a string, which it currently doesn't seem to have. Your views @lesteve?

@aashil
Contributor Author

aashil commented Dec 9, 2016

@lesteve I think the string_types clause is not skipped. Rather, in the branch that calls check_array, the string variance_weighted is converted to an array using np.asarray("variance_weighted"). Later we check whether that array is a singleton and raise the above error if it is. @dalmia Would you like to help me out here?
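
The conversion described above can be checked in isolation. This is a small sketch using only NumPy, not scikit-learn's check_array itself: it shows that wrapping a plain string produces a 0-dimensional array, which has no length and so cannot be treated as a collection of samples.

```python
import numpy as np

# Converting a plain string yields a 0-dimensional ("singleton") array.
arr = np.asarray("variance_weighted")
print(arr.ndim)   # 0
print(arr.shape)  # ()

# A 0-d array has no length, so collection-style validation cannot
# count its elements.
try:
    len(arr)
    has_length = True
except TypeError:
    has_length = False

print(has_length)  # False
```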

@dalmia
Contributor

dalmia commented Dec 9, 2016

Sure @aashil. From what I see, you need to add another check if multioutput is among the valid multioutput_options and do the corresponding functionality there. So just change:

elif multioutput is not None:

To:

elif multioutput is not None and multioutput not in multioutput_options:

That should be all.

@aashil
Contributor Author

aashil commented Dec 9, 2016

@dalmia I believe you mean elif multioutput is not None and multioutput in multioutput_options: However, the real problem is the check_array() call inside the elif clause, which is not happy with the array being a singleton. Take a look at my latest commit.

@dalmia
Contributor

dalmia commented Dec 9, 2016

@aashil Sorry, I made a small mistake. Also, I don't think I properly conveyed what I meant, so let me elaborate. _check_reg_targets is meant to return a proper value of multioutput. So, if a valid multioutput string is passed as a parameter, it won't be modified. Only if it's not a string does it need to go through check_array.
The whole patch then becomes:

    multioutput_options = (None, 'raw_values', 'uniform_average',
                           'variance_weighted')
    # If it is a string, but not a valid option, raise an error
    if isinstance(multioutput, string_types) and multioutput not in multioutput_options:
        raise ValueError("Invalid multioutput option")
    # If it is not a string then check for the validity of the array
    elif multioutput is not None and not isinstance(multioutput, string_types):
        multioutput = check_array(multioutput, ensure_2d=False)
        if n_outputs == 1:
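
The branching in the patch above can be tried outside scikit-learn with a self-contained sketch. The function name validate_multioutput is hypothetical, np.asarray stands in for check_array(multioutput, ensure_2d=False), and str stands in for string_types (a Python 3 assumption); the handling of n_outputs is left out because the snippet above is truncated at that point.

```python
import numpy as np

MULTIOUTPUT_OPTIONS = ('raw_values', 'uniform_average', 'variance_weighted')


def validate_multioutput(multioutput):
    """Hypothetical stand-in for the multioutput check in _check_reg_targets."""
    # Strings are compared against the valid options and returned unchanged,
    # so they never reach array validation.
    if isinstance(multioutput, str):
        if multioutput not in MULTIOUTPUT_OPTIONS:
            raise ValueError("Invalid multioutput option")
        return multioutput
    # None is passed through untouched.
    if multioutput is None:
        return None
    # Anything else is treated as custom weights. np.asarray is an
    # assumption here, used in place of scikit-learn's check_array.
    return np.asarray(multioutput, dtype=float)


print(validate_multioutput('variance_weighted'))
print(validate_multioutput([0.3, 0.7]))
```

Because the string case returns early, the array is never compared against the tuple of options, which is the comparison that became element-wise.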

@aashil
Contributor Author

aashil commented Dec 9, 2016

@dalmia Ahh, that makes it so clear. Thank you.

@aashil aashil force-pushed the dev-fix-numpy-broken branch from 1e8d7c5 to 019f892 Compare December 9, 2016 06:33
* Refactored the if clause and add proper check for valid strings.
* Fix PEP8 errors.
@aashil aashil force-pushed the dev-fix-numpy-broken branch from 019f892 to 7ef3323 Compare December 9, 2016 06:56
@dalmia
Contributor

dalmia commented Dec 9, 2016

Sure, happy to help :)

@aashil aashil changed the title from "[WIP] Fix element-wise comparison for numpy" to "[MRG] Fix element-wise comparison for numpy" on Dec 9, 2016
@lesteve
Member

lesteve commented Dec 9, 2016

@aashil I pushed a cosmetic change (I feel the if clause logic is more readable this way) and a test for the error message, have a look at a6efd19.

@amueller
Member

amueller commented Dec 9, 2016

LGTM

@amueller amueller changed the title from "[MRG] Fix element-wise comparison for numpy" to "[MRG + 1] Fix element-wise comparison for numpy" on Dec 9, 2016
@aashil
Contributor Author

aashil commented Dec 9, 2016

Tested locally. LGTM

@lesteve
Member

lesteve commented Dec 12, 2016

OK, merging then, thanks a lot @aashil!

@lesteve lesteve merged commit 6a42ea2 into scikit-learn:master Dec 12, 2016
sergeyf pushed a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017
Was causing "ValueError: The truth value of an array with more than one element is ambiguous"
@Przemo10 Przemo10 mentioned this pull request Mar 17, 2017
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
Was causing "ValueError: The truth value of an array with more than one element is ambiguous"
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
Was causing "ValueError: The truth value of an array with more than one element is ambiguous"
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
Was causing "ValueError: The truth value of an array with more than one element is ambiguous"
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
Was causing "ValueError: The truth value of an array with more than one element is ambiguous"