
[MRG+1] Sparse multilabel target support in metrics #3395


Merged: 3 commits merged into scikit-learn:master on Aug 12, 2014

Conversation

@jnothman
Member

This introduces a series of helper classes that abstract away aggregation over multilabel structures. This enables efficient calculation in sparse and dense binary indicator matrices, while maintaining support for the deprecated sequences format.
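
To give a sense of the pattern, here is a minimal sketch of such a helper (the class and method names here are hypothetical, not the ones in this PR):

import numpy as np
import scipy.sparse as sp

class _SparseHelper(object):
    """Aggregates over a multilabel indicator matrix held as CSR."""

    def __init__(self, y):
        # Convert once up front, so every method can assume CSR format.
        self.y = sp.csr_matrix(y)

    def count_per_label(self):
        # Positive labels per class: sum down the columns.
        return np.asarray(self.y.sum(axis=0)).ravel()

    def count_per_sample(self):
        # Positive labels per sample: sum along the rows.
        return np.asarray(self.y.sum(axis=1)).ravel()

    def intersect(self, other):
        # For 0/1 indicators, elementwise multiply is a logical AND.
        return _SparseHelper(self.y.multiply(other.y))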

@arjoly
Member

arjoly commented Jul 16, 2014

Great! I will review this PR :-) when I find some time.

@jnothman jnothman changed the title [MRG] Sparse multilabel target support [MRG] Sparse multilabel target support in metrics Jul 16, 2014
def _weight(self, X):
    print(X)
    if self.sample_weight is not None:
        print(X * self.sample_weight[:, None])
Member

They change the Travis output in an interesting way.

Member Author

;)

Member Author

Fixed and hidden from history!

@vene
Member

vene commented Jul 16, 2014

Travis tests are not really failing here; it's just that the output gets truncated. At the very least, they work on my box.

if labels is None:
    labels = unique_labels(y_true, y_pred)
if binarize:
    binarizer = MultiLabelBinarizer([labels])
Member

This can (and I guess should) use the sparse_output=True param, which has been merged to master in the meantime.

Member Author

Hmm... currently the binarize option isn't used (or tested, indeed), but you're right that it would be better for it to produce sparse output. I can just remove it. Or, as @arjoly suggests, get rid of _SequencesMultilabelHelper and use it always. Perhaps that deserves a benchmark.

@coveralls

Coverage decreased (-0.1%) when pulling f6bb3d9 on jnothman:sparse_multi_metrics into 2e7ef3c on scikit-learn:master.

@jnothman
Member Author

A few comments:

  • There may be no benefit in handling the dense case at all. Maybe we should just convert everything to sparse matrices, but even then it is useful to have a helper to give names to these operations.
  • Perhaps this is worth benchmarking; or perhaps metric calculation is so cheap compared to other costs that we don't care. Is memory an issue here?
  • The sparse matrix code is very easy to adapt for either CSC or CSR (but not a mixture): just swap the axis argument from 1 to 0 and vice versa, as sketched below. LabelBinarizer and MultiLabelBinarizer output CSR, while OvR will probably output CSC, so one of the arguments will usually need to be converted.
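
To illustrate the axis-swap point in the last bullet, a minimal sketch (not code from this PR):

import numpy as np
import scipy.sparse as sp

y = sp.csr_matrix(np.array([[1, 0, 1],
                            [0, 1, 1]]))

# On CSR, axis=1 aggregates per sample; axis=0 aggregates per label.
per_sample = np.asarray(y.sum(axis=1)).ravel()  # array([2, 2])
per_label = np.asarray(y.sum(axis=0)).ravel()   # array([1, 1, 2])

# Mixing formats is the awkward case: a CSC argument would typically be
# converted to CSR first.
y_csr = sp.csc_matrix(y).tocsr()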

@jnothman
Member Author

FWIW, this is the benchmark without turning everything into a sparse matrix:

$ benchmarks/bench_multilabel_metrics.py --samples 10000 --classes 15 --density .2
Metric             csc       csr     dense sequences
accuracy         0.018     0.011     0.027     0.087
f1               0.023     0.011     0.051     0.137
f1-by-sample     0.032     0.012     0.056     0.105
hamming          0.027     0.013     0.047     0.082
jaccard          0.018     0.012     0.027     0.109

And this is with everything turned into a sparse matrix:

$ benchmarks/bench_multilabel_metrics.py --samples 10000 --classes 15 --density .2
Metric             csc       csr     dense sequences
accuracy         0.018     0.011     0.072     0.293
f1               0.023     0.011     0.086     0.298
f1-by-sample     0.021     0.012     0.092     0.295
hamming          0.023     0.016     0.089     0.292
jaccard          0.022     0.013     0.067     0.310

We're dealing with small numbers apart from the sequences case, which is being deprecated but is substantially faster without binarizing; seeing as the implementation is here and tested, we might as well keep it fast until deprecation is complete. Making dense data sparse adds ~0.04s, but we're currently using density settings that may favour sparse. We could experiment with varying those parameters, but I don't see the point.

@arjoly
Member

arjoly commented Jul 17, 2014

Thanks for the benchmark!

@vene
Member

vene commented Jul 17, 2014

Thanks Joel. Looking at the benchmark, my thoughts re: converting to sparse are:

  • Until complete deprecation, we should keep the sequences helper instead of converting.
  • If this is merged at the same time (rather, in the same release) as [MRG+1] Sparse One vs. Rest #3276, which makes LabelBinarizer output sparse by default, we can do away with the dense helper and convert to sparse, since it doesn't seem to make a difference.

@jnothman
Member Author

If I understand the state of play, #3276 only makes LabelBinarizer sparse by default for OvR, which is irrelevant here, because by the time targets get here the binarization is already inverted. Still, I'm happy to remove direct handling of dense matrices here if there's another +1 (and maybe with another benchmark at a positive-indicator density where dense is appropriate).


@arjoly
Member

arjoly commented Jul 19, 2014

Still, I'm happy to remove direct handling of dense matrices here if there's another +1 (and maybe with another benchmark at a positive-indicator density where dense is appropriate).

+1 for a benchmark!

@GaelVaroquaux
Member

Beautiful implementation of a generic pattern. It certainly makes the code in metrics much more readable.

However, I am a bit worried that such patterns require some learning before one can read the codebase, and will make it harder for people without a lot of expertise to maintain it.

My gut feeling, and it's only a gut feeling, is that we could try to have a set of functions that implement the methods you created, but as functions, not as methods. This means that the routing of the genericity would be done inside the function. I find it hard to know beforehand whether it would actually result in more readable code. Would you bear with me and try to implement this approach? Maybe in a separate PR to compare (I don't know if a separate PR is a good or a bad idea).

@jnothman
Member Author

As an intuition: as long as we are handling formats that require non-trivial type detection (as performed by type_of_target), we either add what may be substantial overhead in doing that detection in each function, or we pass it to each function call, which I think will be much less pretty. If you still want me to implement it, I'll try to find time to do so. The current proposal is nice in ensuring that type checking and conversion to the correct format (i.e. CSR) are done once.

The other option that avoids a helper class is to put all the metric implementations into a polymorphic class hierarchy, but I don't think that is really worth considering.


@jnothman
Member Author

I think 3 classes with 2 out of 3 filled may make for an appropriate dense vs convert-to-sparse benchmark:

$ benchmarks/bench_multilabel_metrics.py --samples 10000 --classes 3 --density .7
Metric           dense     as csr
accuracy         0.006     0.030
f1               0.012     0.040
f1-by-sample     0.014     0.029
hamming          0.012     0.028
jaccard          0.008     0.028

This says nothing about memory; time-wise the conversion has a cost, but the runtime is still very small, so I could consider removing dense support.

@GaelVaroquaux, I could convert everything to sparse matrices and simplify the code a lot (unless we wanted similar polymorphism for the multiclass case), particularly by using functions rather than methods of a helper class. This would mean slower sequence-of-sequences support, as shown above, but perhaps that is an incentive for people to heed the DeprecationWarning.
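
As an illustration of that function-based, convert-to-CSR approach, a sketch (the helper and metric names here are hypothetical, not code from this PR):

import numpy as np
import scipy.sparse as sp

def _as_csr(y):
    # One-time normalization: dense arrays and other sparse formats all
    # become CSR, so the metric body handles a single format.
    return sp.csr_matrix(y)

def subset_accuracy(y_true, y_pred):
    y_true, y_pred = _as_csr(y_true), _as_csr(y_pred)
    # For 0/1 matrices, a + b - 2*a*b marks the entries that differ.
    differing = y_true + y_pred - 2 * y_true.multiply(y_pred).tocsr()
    per_sample = np.asarray(differing.sum(axis=1)).ravel()
    # A sample is correct only if no label differs (exact-match accuracy).
    return float(np.mean(per_sample == 0))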

@jnothman
Member Author

@arjoly, damn your cruel refactoring...

@jnothman
Member Author

Hopefully I correctly edited that rebase.

@coveralls

Coverage increased (+0.01%) when pulling 5a7c600 on jnothman:sparse_multi_metrics into 41d02e0 on scikit-learn:master.

@arjoly
Member

arjoly commented Jul 20, 2014

sorry :-(

@arjoly
Member

arjoly commented Jul 20, 2014

This would mean slower sequence-of-sequences support, as shown above, but perhaps that is an incentive for people to heed the DeprecationWarning.

I am fine with slower sequence-of-sequences support.

@jnothman
Member Author

@GaelVaroquaux, I think you'll much prefer this version, where everything is calculated over CSR matrices... I am only concerned that the +, - and .multiply operations are a bit opaque: they really stand in for the np.logical_* functions, which were not supported by scipy.sparse until recently.
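
Concretely, for 0/1 indicator matrices the arithmetic stands in for the logical operations roughly as follows (a sketch; the variable names are illustrative):

import numpy as np
import scipy.sparse as sp

y_true = sp.csr_matrix(np.array([[1, 0, 1],
                                 [0, 1, 1]]))
y_pred = sp.csr_matrix(np.array([[1, 1, 0],
                                 [0, 1, 1]]))

# logical_and: elementwise product of 0/1 entries
both = y_true.multiply(y_pred).tocsr()

# logical_or: a + b - a*b
either = y_true + y_pred - both

# logical_xor: a + b - 2*a*b (entries where exactly one is set)
differ = y_true + y_pred - 2 * both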

@coveralls

Coverage decreased (-0.0%) when pulling cdcea67 on jnothman:sparse_multi_metrics into 8dab222 on scikit-learn:master.

@jnothman
Member Author

Rebased.

@arjoly
Member

arjoly commented Jul 21, 2014

# Master
(sklearn) ± python benchmarks/bench_multilabel_metrics.py --classes 4000 --samples 20000 --density 0.01
Metric           dense sequences
accuracy        20.851     0.814
f1              45.332    15.495
f1-by-sample    45.069    15.296
hamming         39.844     1.144
jaccard         21.153     0.913


# PR
(sklearn) ± python benchmarks/bench_multilabel_metrics.py --classes 4000 --samples 20000 --density 0.01
Metric             csc       csr     dense sequences
accuracy         0.261     0.084    26.684     4.477
f1               0.335     0.162    25.658     4.705
f1-by-sample     0.292     0.111    25.846     4.171
hamming          0.292     0.116    26.862     4.189
jaccard          0.332     0.146    26.152     4.111

Performance looks great! Though I see a high discrepancy for some metrics between master and this PR.

@arjoly
Member

arjoly commented Jul 21, 2014

Awesome PR!!! Could you update the docstrings and the narrative doc to highlight your work?

@arjoly
Member

arjoly commented Jul 21, 2014

The extra call for unique labels is fine by me.

@jnothman
Member Author

Could you update the docstrings and the narrative doc to highlight your work?

Fair point. I'd better go looking for things to change...

@jnothman
Member Author

Do you think each docstring needs to specify "or sparse/dense label indicator matrix"?

@arjoly
Member

arjoly commented Jul 21, 2014

I would do something in the spirit of what we have done for sparse one versus rest.

@jnothman
Member Author

I see that there you use {array-like, sparse matrix}.

Currently, though, we say "array-like or label indicator matrix". What we want to say is something like: "1d array-like, or label indicator array / sparse matrix", i.e. "{array-like 1d, array of 1s and 0s, sparse matrix of 1s}".
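
A hypothetical example of how such a Parameters entry might read in a metric docstring (illustrative phrasing and function name, not necessarily what was merged):

def some_metric(y_true, y_pred):
    """Illustrative stub showing the proposed parameter description.

    Parameters
    ----------
    y_true : 1d array-like, or label indicator array / sparse matrix
        Ground truth (correct) target values.

    y_pred : 1d array-like, or label indicator array / sparse matrix
        Estimated targets as returned by a classifier.
    """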

@coveralls

Coverage increased (+0.0%) when pulling 744a7ad on jnothman:sparse_multi_metrics into 0d57c23 on scikit-learn:master.

@arjoly
Member

arjoly commented Jul 21, 2014

The code is awesome!

Currently, though, we say "array-like or label indicator matrix". What we want to say is something like: "1d array-like, or label indicator array / sparse matrix", i.e. "{array-like 1d, array of 1s and 0s, sparse matrix of 1s}".

If it fits within 80 characters and includes the shape, I am +1 for the improved docstring.

@jnothman
Member Author

Something else I now realise is missing here is sparse support in LRAP.

@arjoly
Member

arjoly commented Jul 25, 2014

Something else I now realise is missing here is sparse support in LRAP.

While you are at it, I would be happy to see sparse input support for LRAP.

@jnothman
Member Author

Pushed changes to support sparse matrices in LRAP, and to improve documentation.

@jnothman
Member Author

Assuming Travis is happy, I think this is where we want it to be. Votes for merge? @arjoly? @GaelVaroquaux?

@coveralls

Coverage increased (+0.0%) when pulling 15be75e on jnothman:sparse_multi_metrics into 1b2833a on scikit-learn:master.

@arjoly
Member

arjoly commented Jul 30, 2014

Thanks for the LRAP metric!!!

@arjoly
Member

arjoly commented Jul 30, 2014

You get my +1

@jnothman jnothman changed the title [MRG] Sparse multilabel target support in metrics [MRG+1] Sparse multilabel target support in metrics Jul 30, 2014
@jnothman
Member Author

Thanks @arjoly

@arjoly
Member

arjoly commented Aug 4, 2014

A last reviewer?

@arjoly
Member

arjoly commented Aug 4, 2014

It should probably be updated to take into account the sample weight support in the jaccard metric.

@jnothman
Member Author

jnothman commented Aug 4, 2014

It should probably be updated to take into account the sample weight support in the jaccard metric.

What's to update?

@arjoly
Member

arjoly commented Aug 5, 2014

Apparently nothing, sorry. I should have checked the implementation.

@arjoly
Member

arjoly commented Aug 11, 2014

A last reviewer? (randomly pinging @ogrisel, @vene)

@larsmans larsmans merged commit 15be75e into scikit-learn:master Aug 12, 2014
@arjoly
Member

arjoly commented Aug 12, 2014

Thanks @larsmans and @jnothman !!! :-)
