ENH Support CSR matrix in type_of_target #14862
|
@jnothman is there any action expected on my side to merge this PR? |
|
No, there is just a lot of competing demand on reviewers' time. Thanks for pinging |
|
Hi @leonardbinet, thanks for your patience! Do you mind fixing conflicts? Hopefully, this will bring some attention again. Thanks! |
|
@cmarmo here it is :) |
|
Thanks @alk-lbinet. It seems to me that the failing check is unrelated to this PR. Perhaps @rth will find some time to review? Thanks! |
glemaitre
left a comment
We need to update the docstring:
Parameters
----------
y : {array-like, sparse matrix}
    The target array. If a sparse matrix, `y` is expected to be a
    CSR matrix.
We would also need an entry in what's new to announce that we support CSR matrices. |
|
Bump. We currently accept sparse labels in autosklearn, but we have to de-sparsify them before use. @leonardbinet, will you be continuing this PR? If not, I am happy to try to finish it and push this feature through next week. |
|
@leonardbinet can I get access to push this one, please? |
|
Hi, @ilivans I gave you access, feel free to update my branch 👍 |
|
I think @GuoqiangOu's questions come down to the next one:
Another edge case that I found is
|
|
In the last commits I addressed the comments regarding documentation, csc_matrix and explicit zeros. It seems to be done to me. Please take a look, somebody 🙏 cc @glemaitre @rth |
It has two columns.
Is there any sense in rejecting a full "sparse" matrix with no zeros and two nonzero values? -1 has been a longstanding label for "negative", thanks to support vector machines at least. |
|
thanks @jnothman 🙌
It does, however the documentation (of the function) says "'continuous-multioutput': If it was a (2,1) matrix (a column) it would be treated as
It's a good question. I tried to give my opinion on this, I think it makes sense to reject such cases because of the underlying assumption of 0 being the "missing" value (there is a method It's easy to change the logic tho, so just lmk if you believe it's necessary to make it consistent with the dense case. |
|
btw I can't change the PR description, but it also fixes #18611 now |
I have to add, this logic hasn't been introduced by the PR, the PR just fixes exceptions, the logic was introduced in 'multilabel-indicator': [
...
csr_matrix(np.array([[0, 1]])),
# Only valid when data is dense
np.array([[-1, 1], [1, -1]]),
np.array([[-3, 3], [3, -3]]),
], |
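The dense-versus-sparse distinction discussed above can be sketched roughly as follows. This is only an illustration of the rule as described in this thread, not scikit-learn's actual code: `looks_like_indicator` is a hypothetical helper, and it assumes that for sparse input only the explicitly stored values are inspected, with 0 treated as the "missing" value.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def looks_like_indicator(y):
    """Hypothetical helper sketching the rule discussed in the thread.

    Not scikit-learn's implementation.
    """
    if issparse(y):
        # Only explicitly stored entries are inspected; every other
        # position is an implicit 0.
        stored = np.unique(y.tocsr().data)
        # Accept nothing stored, a single stored value, or two stored
        # values where one of them is an explicit 0.
        return bool(stored.size <= 1 or (stored.size == 2 and 0 in stored))
    # Dense input: any two distinct values are accepted, e.g. -1 and 1.
    return bool(np.unique(np.asarray(y)).size <= 2)


dense = np.array([[-1, 1], [1, -1]])

dense_ok = looks_like_indicator(dense)                 # two values, dense
sparse_ok = looks_like_indicator(csr_matrix(dense))    # two nonzero values, no 0
sparse_01_ok = looks_like_indicator(csr_matrix(np.array([[0, 1]])))
```

Under these assumptions the same `-1`/`1` data passes when dense but not when wrapped in a CSR matrix, which is the asymmetry the "Only valid when data is dense" comment refers to.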
jeremiedbb
left a comment
LGTM. I synced with main and added a what's new entry.
In addition to the exhaustive added tests, I checked that it fixes the 2 reported issues, and does not break any existing behavior. Let's merge.
Thanks @leonardbinet and everybody else for the help and feedback.
Reference Issues/PRs
Fixes #14860
Fixes #18611
Closes #23569
What does this implement/fix? Explain your changes.
This fixes the "sklearn.utils.multiclass.type_of_target" function for sparse matrices.
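
A minimal usage sketch, assuming a scikit-learn release that includes this change plus NumPy and SciPy. The two inputs are taken from the `'multilabel-indicator'` test cases quoted in the conversation; the variable names are illustrative only.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.utils.multiclass import type_of_target

# Sparse indicator matrix, listed under 'multilabel-indicator' in the tests.
sparse_kind = type_of_target(csr_matrix(np.array([[0, 1]])))

# Dense -1/1 matrix, listed as valid only when the data is dense.
dense_kind = type_of_target(np.array([[-1, 1], [1, -1]]))

print(sparse_kind, dense_kind)
```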