Invariance tests for clustering metrics #8102
Comments
See for instance #8101 (comment) |
This looks like a feasible first contribution, and I'd like to look into it. @jnothman Could you please help me with how to start and which files are involved? Thanks. |
Thanks. Perhaps take a look at the common tests for classification and regression metrics at sklearn/metrics/tests/test_common.py. |
@jnothman Okay, thanks. |
I think I understand what needs to be done. Are there any checks needed other than testing that the clustering metrics are invariant to label permutation and that their values decrease for less perfect clusterings? |
@jnothman I have written the code. Could you please check whether I am going in the right direction, and whether any more tests need to be added? Also, how do I run the test file on my laptop? When I run it in Spyder it gives no output. https://github.com/anki08/scikit-learn/blob/d4875f4a862a2fabf07d1e71b6f37f1bc6a88779/test_file.py |
Thank you, @anki08. At a very basic glance, this looks like it's heading in the right direction. |
We should have common tests for clustering metrics, including a check that labels can be permuted (e.g. 0 and 1 swapped) without changing the score. General properties, such as scores decreasing when the clustering is not perfect, can also be tested.
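As a rough illustration of the two properties described above, here is a minimal sketch in plain Python. It is not the test code that was proposed or merged for this issue. The `rand_index` function is a hand-written stand-in metric, and the helper names `check_permutation_invariance` and `check_score_drops_when_imperfect` are hypothetical, chosen only for this example.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Stand-in metric: fraction of sample pairs on which both labelings agree
    about whether the two samples share a cluster (plain Rand index)."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def check_permutation_invariance(metric, labels_true, labels_pred):
    """Hypothetical helper: renaming the predicted cluster ids must not change
    the score. With two clusters this is exactly the '0 and 1 swapped' case."""
    ids = sorted(set(labels_pred))
    # Rotate the ids: first -> second, second -> third, ..., last -> first.
    mapping = dict(zip(ids, ids[1:] + ids[:1]))
    permuted = [mapping[label] for label in labels_pred]
    return abs(metric(labels_true, labels_pred) - metric(labels_true, permuted)) < 1e-12


def check_score_drops_when_imperfect(metric, labels_true):
    """Hypothetical helper: moving one sample into a wrong cluster should
    score strictly lower than the perfect clustering."""
    perfect = metric(labels_true, labels_true)
    corrupted = list(labels_true)
    # Reassign the first sample to any other label that already exists.
    corrupted[0] = next(label for label in labels_true if label != labels_true[0])
    return metric(labels_true, corrupted) < perfect


# Toy data, made up for this example.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 2, 0]

assert check_permutation_invariance(rand_index, y_true, y_pred)
assert check_score_drops_when_imperfect(rand_index, y_true)
```

Both helpers take the metric as an argument, so a common test could presumably loop over the supervised clustering metrics (for instance `sklearn.metrics.adjusted_rand_score`) and apply the same checks to each. Metrics with a different signature, or ones where lower values are better, would need their own handling.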