Ensure predictions sparse before sp.hstack in ClassifierChain #27905

Closed
lucyleeow opened this issue Dec 6, 2023 · 3 comments
Labels
Needs Triage Issue requires triage

Comments

@lucyleeow
Member

We use sp.hstack in several places in ClassifierChain where we may be stacking a sparse array with a dense one, e.g.:

X_aug = sp.hstack((X, previous_predictions))

and

X_aug = sp.hstack((X, Y_pred_chain), format="lil")

AFAICT, stacking a sparse array with a dense one via sp.hstack gives you a sparse result (even though sp.hstack is not documented to support dense input):

In [34]: import numpy as np
    ...: from scipy.sparse import coo_matrix, hstack
    ...: 
    ...: A = coo_matrix([[1, 2], [3, 4]])

In [35]: B = np.zeros((2,2))

In [36]: hstack([A,B])
Out[36]: 
<2x4 sparse matrix of type '<class 'numpy.float64'>'
        with 4 stored elements in COOrdinate format>

Maybe due to: https://github.com/scipy/scipy/blob/f990b1d2471748c79bc4260baf8923db0a5248af/scipy/sparse/_construct.py#L654 ?

Should we ensure that the predictions are sparse before using sp.hstack?
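For illustration, the explicit conversion being asked about could look roughly like the sketch below. The helper name `hstack_with_sparse_predictions` and the toy data are hypothetical and are not taken from the scikit-learn code base; the sketch only assumes that `scipy.sparse.issparse`, `csr_matrix`, and `hstack` behave as documented.

```python
import numpy as np
import scipy.sparse as sp


def hstack_with_sparse_predictions(X, predictions):
    """Hypothetical helper: convert dense predictions before stacking."""
    if not sp.issparse(predictions):
        # Explicit conversion, so sp.hstack only ever sees sparse blocks.
        predictions = sp.csr_matrix(predictions)
    return sp.hstack((X, predictions))


# Toy data: sparse features plus one dense column of previous predictions.
X = sp.csr_matrix(np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]))
previous_predictions = np.array([[1.0], [0.0], [1.0]])

X_aug = hstack_with_sparse_predictions(X, previous_predictions)
print(X_aug.shape)
```

With this approach the stacking call stays within the documented sparse-only usage of sp.hstack, at the cost of one extra conversion per call.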

I had a quick look at our code and could not find any other cases where we might be stacking dense + sparse. I think ClassifierChain is unique in that we do not usually combine X with y.

Discussed here: #27700 (comment)

cc @glemaitre

@lucyleeow lucyleeow changed the title Convert predictions to sparse before sp.hstack in ClassifierChain Ensure predictions sparse before sp.hstack in ClassifierChain Dec 6, 2023
@github-actions github-actions bot added the Needs Triage Issue requires triage label Dec 6, 2023
@lucyleeow lucyleeow added Bug, Needs Triage and removed Needs Triage, Bug labels Feb 24, 2024
@glemaitre
Member

So I looked closer at the issue. This is indeed supported. If the behaviour changes, I would expect our test suite to fail, so this is not a big deal.

I also did not spot any improvement in performance (speed or memory), so I think we can leave this code as is.
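A comparison of this kind could be set up along the lines of the sketch below. It is not the benchmark that was actually run for this issue: the array sizes, density, and function names are assumptions chosen for illustration, and it only measures wall-clock time, not memory.

```python
import timeit

import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)

# Assumed toy sizes: 2000 samples, 200 sparse features, 5 dense prediction columns.
dense = rng.random((2000, 200))
dense[dense > 0.01] = 0.0  # keep roughly 1% of the entries
X = sp.csr_matrix(dense)
preds = rng.integers(0, 2, size=(2000, 5)).astype(np.float64)


def stack_implicit():
    # Rely on sp.hstack converting the dense block internally.
    return sp.hstack((X, preds))


def stack_explicit():
    # Convert the dense block to sparse before stacking.
    return sp.hstack((X, sp.csr_matrix(preds)))


t_implicit = min(timeit.repeat(stack_implicit, number=20, repeat=3))
t_explicit = min(timeit.repeat(stack_explicit, number=20, repeat=3))
print(f"implicit: {t_implicit:.4f}s, explicit: {t_explicit:.4f}s")
```

Timings will vary with the machine, the scipy version, and the data shape, so any conclusion drawn from a sketch like this should be checked on realistic inputs.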

@lucyleeow
Member Author

Interesting! Is there documentation about this or just in the code? Thanks

@glemaitre
Member

It is only in the code: the scipy documentation states that the arrays should be sparse. So there is an implicit conversion happening here.
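One way to observe the implicit conversion is to compare the mixed sparse + dense call against a call where the dense block is converted by hand. The sketch below reuses the A and B arrays from the issue description; since the behaviour is undocumented, it should be read as a check against the installed scipy version rather than a guarantee.

```python
import numpy as np
from scipy.sparse import coo_matrix, hstack, issparse

A = coo_matrix([[1, 2], [3, 4]])
B = np.zeros((2, 2))

# Mixed input: the dense block B is converted inside hstack.
mixed = hstack([A, B])

# Same stacking with the conversion done explicitly by the caller.
explicit = hstack([A, coo_matrix(B)])

print(issparse(mixed))
print(np.array_equal(mixed.toarray(), explicit.toarray()))
```

If a future scipy release stopped accepting dense blocks, the mixed call would be expected to raise or return something different, which is the kind of change the test suite would surface.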
