FEA Add support for arbitrary metrics and informative initialization to MDS #32229
Co-authored-by: antoinebaker <[email protected]>
Co-authored-by: Adrin Jalali <[email protected]>
@antoinebaker As promised, now that my Classical MDS PR has been merged, I prepared a follow-up PR to modify the |
@adrinjalali Also pinging you here explicitly. This is a straightforward PR that completes the MDS-related changes I was planning. It would be great to have this merged in time for 1.8, so that it goes out together with the previous MDS-related PR... 🙏 Hope this is still feasible!
@adrinjalali Thanks a lot! All good comments, I have implemented all of them. One question. The code contained the following legacy snippet:

```python
if X.shape[0] == X.shape[1] and (self._metric != "precomputed"):
    warnings.warn(
        "The MDS API has changed. ``fit`` now constructs a"
        " dissimilarity matrix from data. To use a custom "
        "dissimilarity matrix, set "
        "``metric='precomputed'``."
    )
```

This warning is triggered whenever X has as many features as samples (admittedly an unlikely case, but it can happen), and in that case there is no way to get rid of it. Is this OK? Should we drop this warning altogether? It sounds like a future warning but is not implemented as a future warning...
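To make the concern concrete, here is a minimal sketch of the same shape-based heuristic outside of scikit-learn. The helper names `check_square_input` and `count_warnings` are hypothetical and exist only for this illustration; the warning text is also made up. The point is that a genuinely square feature matrix is indistinguishable from a dissimilarity matrix by shape alone, so the check fires even when the caller has done nothing wrong.

```python
import warnings


def check_square_input(shape, metric):
    """Hypothetical stand-in for the legacy check quoted above."""
    n_samples, n_features = shape
    # Only the shape is inspected, so a square feature matrix also matches.
    if n_samples == n_features and metric != "precomputed":
        warnings.warn(
            "Input is square but metric is not 'precomputed'; "
            "it will be treated as a feature matrix."
        )


def count_warnings(shape, metric):
    """Run the check and return how many warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        check_square_input(shape, metric)
    return len(caught)


print(count_warnings((5, 5), "euclidean"))    # square feature matrix
print(count_warnings((5, 3), "euclidean"))    # rectangular feature matrix
print(count_warnings((5, 5), "precomputed"))  # explicit dissimilarity matrix
```

In this sketch only the first call warns, which matches the situation described: the data are legitimate features, yet the shape check cannot tell.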
When that happens, we have more significant problems 😅 There are other places in the code where things break when the number of features is exactly the same as the number of samples and it's not a pairwise thing.
Okay, but I edited the warning text now to be more explicit and not to say that the "API has changed" (it was a long time ago, so that part is obsolete).
Hi @adrinjalali, let me know if there is anything I can still do here! 🙏
antoinebaker
left a comment
Here are a few suggestions, but I'm unfamiliar with the conventions for the deprecation cycle, so they should be confirmed by a more seasoned developer :)
To summarize the changes in 1.8:
- `dissimilarity` is renamed `metric`
- `metric` is renamed `metric_mds`
- `metric_params` is added
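As a rough illustration of how these renames map old keyword arguments onto new ones, here is a small sketch. The helper `translate_mds_kwargs` is hypothetical and is not part of scikit-learn; the use of `FutureWarning` is an assumption about how such a deprecation would typically be signalled, not a description of the actual implementation in this PR.

```python
import warnings


def translate_mds_kwargs(**old):
    """Map old-style MDS keyword arguments to the renamed ones (hypothetical)."""
    new = dict(old)
    # The old boolean `metric` toggled metric vs. non-metric MDS.
    if isinstance(new.get("metric"), bool):
        new["metric_mds"] = new.pop("metric")
    # The old `dissimilarity` selected the distance; it is now `metric`.
    if "dissimilarity" in new:
        warnings.warn(
            "`dissimilarity` is renamed `metric`", FutureWarning, stacklevel=2
        )
        new["metric"] = new.pop("dissimilarity")
    return new


print(translate_mds_kwargs(metric=False, dissimilarity="precomputed"))
```

The boolean `metric` is handled first so that the new string-valued `metric` does not collide with it when both old arguments are passed.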
Thanks @antoinebaker, this makes sense and indeed follows the official guidelines. I pushed your edits.
@antoinebaker Do you think there is anything else I should do here? I am not sure when the deadline for 1.8 is, but I am getting worried that this PR won't make it in time...
antoinebaker
left a comment
Apart from a nitpick, LGTM!
It's cool that the new initialization fixes the non-metric MDS examples (like plot_compare_methods and plot_manifold_sphere).
It would be nice if the second reviewer can confirm that the API changes are well handled.
adrinjalali
left a comment
Otherwise LGTM.
doc/whats_new/upcoming_changes/sklearn.manifold/32229.feature.rst
LGTM. We need a second reviewer here.
EDIT: I just saw that Antoine has reviewed, so this is good to go.
This is a follow-up to #31322 that added a classical MDS implementation as the `ClassicalMDS` class. As discussed over there, this PR does the following:

- Adds support for arbitrary metrics to the `MDS` class, following the example of `ClassicalMDS`.
- Changes the API of `MDS` to make it consistent with `ClassicalMDS` and other sklearn classes: the distance metric is now set via the `metric` argument, and metric/non-metric MDS can be toggled via the `metric_mds=True/False` argument. Backwards compatibility is ensured for the next scikit-learn versions.
- Adds informative initialization to the `MDS` class, to be made the default in the future.

PS. Apologies for lots of obsolete commits here :-( Just ignore them.
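For readers unfamiliar with what choosing a metric changes, the sketch below builds the pairwise dissimilarity matrix that an MDS embedding tries to reproduce, once with Euclidean and once with Manhattan distance. It is plain Python and does not call scikit-learn; the function names are hypothetical and chosen only for this illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pairwise_dissimilarities(X, metric):
    """Return the symmetric n x n dissimilarity matrix for the rows of X."""
    return [[metric(a, b) for b in X] for a in X]


X = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
D_euclidean = pairwise_dissimilarities(X, euclidean)
D_manhattan = pairwise_dissimilarities(X, manhattan)

print(D_euclidean[0])  # distances from the first point, Euclidean
print(D_manhattan[0])  # distances from the first point, Manhattan
```

The same data yield different target dissimilarities under the two metrics, so the resulting embeddings can differ as well.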