[WIP] add matrix checking function for quiver input #7461
I don't understand why |
@@ -1734,7 +1735,7 @@ def recursive_remove(path):
 os.removedirs(fname)
 else:
 os.remove(fname)
-#os.removedirs(path)
+# os.removedirs(path)
If you really want to touch this function, I'd just alias recursive_remove to shutil.rmtree (with onerror set to remove files) :-) Otherwise this is kind of pointless.
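For readers unfamiliar with the suggestion, here is a minimal sketch of what delegating to shutil.rmtree could look like. It is not the code from the PR: instead of an onerror handler it uses an explicit os.path.isdir check to cover the single-file case, which is an assumption about the intended behaviour.

```python
import os
import shutil


def recursive_remove(path):
    """Remove *path*, whether it is a directory tree or a single file."""
    if os.path.isdir(path):
        # shutil.rmtree deletes the directory and everything below it.
        shutil.rmtree(path)
    else:
        # rmtree only handles directories, so remove a plain file directly.
        os.remove(path)
```

The explicit branch keeps the file case visible rather than hiding it in an error callback; an onerror-based version, as suggested in the comment, would be an equally valid choice.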
I just changed the comment to follow PEP 8 style. Nothing is changed within the function.
Yes, I don't think that's really in the scope of this PR.
The purpose of this function is to make the error clearer and to force users to pass inputs other than a matrix. Otherwise the error that is raised is unclear and points to other places.
I don't understand how the function makes the error clearer. How is raising from a different function clearer than raising from the method that the user actually called?
If you don't raise this error or don't use this helper function, try this
|
Yes, but I'm not asking why the new exception is necessary, but why the extra function is necessary. Also, why the cast with |
@@ -2701,3 +2702,15 @@ def __exit__(self, exc_type, exc_value, traceback):
 os.rmdir(path)
 except OSError:
 pass
+
+
+def is_matrix(obj):
In my view, any is_* function must return a boolean value.
I'd indeed rename this check_array
+def is_matrix(obj):
+    '''
+    This is a test for whether the input is a matrix.
Can you please use triple double quotes """: this is the convention for docstrings, not triple single quotes.
The docstring is also very unclear on what this function does.
+    cast_result = np.asanyarray(obj)
+    if isinstance(cast_result, np.matrix):
+        raise ValueError("The input cannot be matrix")
+    return obj
You'd want to return the cast object. I.e., if X is a list, you want the returned object to be an array.
Then should I raise the error in this function, or just return False and raise the error inside the function that calls check_array?
Once this is properly done for quiver, this helper function can be used in other places as well.
It looks to me like this needs some more thought and discussion at the design stage. For example, here is an (untested) alternative:

    def _fail_with_matrix(arglist):
        for name, arr in arglist:
            if isinstance(arr, np.matrix):
                raise ValueError("Input argument %s is a numpy matrix subclass instance, which is not supported." % name)

    # example of usage:
    _fail_with_matrix([('X', X), ('Y', Y), ('Z', Z)])

Advantages: a single line handles multiple arguments, and the error message names the offending argument.
In the quiver case, this checking probably should occur inside _parse_args.
I am obviously biased from my experience on sklearn, but I do think that validation and transformation into ndarray could (and should) be done in the same function. On sklearn, we have a check_array function that does some input validation and converts to an ndarray of float (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/validation.py). This particular code is used everywhere in sklearn and is very useful. It enforces a very consistent API across the project. I also think that as long as the function we add is private, it will be very easy to replace and extend when needed.
Modifying my example to apply |
It would be nice to standardize our input data handling across our numerous functions, but that's out of scope here.
@NelleV, side comment: yes, we can do more to consolidate and standardize argument handling, and we already do quite a bit of it now. But we have much more need for flexibility than sklearn, and blanket conversion to float ndarrays is not appropriate for us.
Hi @efiring |
I put the
|
@NelleV, maybe we can discuss our disagreement some time. Part of my point is that if you really want to make the API user-friendly, it is good to tell the user which argument is failing a test.
if len(args) == 2 or len(args) == 4:
    V = np.atleast_1d(args.pop(-1))
    U = np.atleast_1d(args.pop(-1))
elif len(args) == 3 or len(args) == 5:
You are increasing the number of lines of code here, and I don't see any benefit to it. If you left it the way it was, all you would have to do is insert your _check_array into each of the lines extracting U, V, and C, like this:
U = np.atleast_1d(cbook._check_array(args.pop(-1)))
Better yet, you could modify your checking function to use np.atleast_1d instead of np.asanyarray, because the former calls the latter internally. Then in the original you would just replace np.atleast_1d with cbook._check_array in each of the 3 lines. Much nicer.
I agree that doing the _check_array on U, V, C here (where they are defined) is more suitable than leaving them at the end. But I don't think stuffing the _check_array and atleast_1d together is a good idea; at least, it seems less readable to me.
@trpham, I disagree. We need to encapsulate logical chunks of argument validation that typically go together so as to minimize code duplication. The consolidation I am suggesting here is really minimal--it is ensuring that a given argument is some sort of ndarray but not a matrix. That's consistent with the name _check_array (or it could be _ensure_array, etc.) With regard to matrix checking: all of our array-like argument validation should be explicitly blocking (or converting) matrix types because they are known to have odd behavior. This needs to be embedded in the more general validation and "argument scrubbing" functions. We have never claimed to support the matrix subclass, and have always known we didn't want to try to do so--but in the evolution of the library, basic functionality has come first, and ever more stringent validation and scrubbing is gradually being added.
Superseded by #13089. Thanks for the PR!
Refers to the problem mentioned in #1558. Adds a check in quiver.py to ensure that any input except a matrix is accepted.