FEA Implement classical MDS #31322


Open: dkobak wants to merge 14 commits into base: main

dkobak (Contributor) commented May 6, 2025

Fixes #15272. Supersedes #22330.

This PR implements classical MDS, also known as principal coordinates analysis (PCoA) or Torgerson's scaling, see https://en.wikipedia.org/wiki/Multidimensional_scaling#Classical_multidimensional_scaling. As discussed in #22330, it is implemented as a new class ClassicalMDS.

Simple demonstration:

import matplotlib.pyplot as plt

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.manifold import ClassicalMDS

X, y = load_iris(return_X_y=True)

# PCA baseline, then classical MDS under three different metrics
Z1 = PCA(n_components=2).fit_transform(X)
Z2 = ClassicalMDS(n_components=2, metric="euclidean").fit_transform(X)
Z3 = ClassicalMDS(n_components=2, metric="cosine").fit_transform(X)
Z4 = ClassicalMDS(n_components=2, metric="manhattan").fit_transform(X)

fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(6, 6), layout="constrained")

axs.flat[0].scatter(Z1[:, 0], Z1[:, 1], c=y)
axs.flat[0].set_title("PCA")

axs.flat[1].scatter(Z2[:, 0], Z2[:, 1], c=y)
axs.flat[1].set_title("Classical MDS, Euclidean dist.")

# The sign of a component is arbitrary; the first one is flipped here,
# presumably so the orientation matches the other panels
axs.flat[2].scatter(-Z3[:, 0], Z3[:, 1], c=y)
axs.flat[2].set_title("Classical MDS, cosine dist.")

axs.flat[3].scatter(Z4[:, 0], Z4[:, 1], c=y)
axs.flat[3].set_title("Classical MDS, Manhattan dist.")

plt.show()

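[Figure "cmds": 2×2 grid of scatter plots of the iris data: PCA; classical MDS with Euclidean, cosine, and Manhattan distances]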

Classical MDS is also set as the default initialization for metric/non-metric MDS in the MDS class.

For consistency, this PR also adds support for non-Euclidean metrics to the MDS class.
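For readers unfamiliar with the method, the textbook form of classical MDS (double-center the squared distances, then take the top eigenvectors) can be sketched in plain NumPy. This is only an illustration of the technique, not the code from this PR; the function name `classical_mds_sketch` is made up for this example.

```python
import numpy as np


def classical_mds_sketch(D, n_components=2):
    """Embed n points given their (n, n) pairwise distance matrix D.

    Hypothetical helper for illustration; not the PR's ClassicalMDS class.
    """
    n = D.shape[0]
    # Centering matrix J = I - (1/n) * 1 1^T
    J = np.eye(n) - np.ones((n, n)) / n
    # Double-centered Gram matrix B = -1/2 * J * D^2 * J
    B = -0.5 * J @ (D**2) @ J
    # eigh returns eigenvalues in ascending order; keep the largest ones
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Non-Euclidean metrics can give negative eigenvalues; clip them at zero
    scales = np.sqrt(np.clip(eigvals[top], 0, None))
    return eigvecs[:, top] * scales


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

# Euclidean and Manhattan distance matrices built by broadcasting
D_euclidean = np.linalg.norm(X[:, None] - X[None], axis=-1)
D_manhattan = np.abs(X[:, None] - X[None]).sum(axis=-1)

Z_euclidean = classical_mds_sketch(D_euclidean)
Z_manhattan = classical_mds_sketch(D_manhattan)
```

With Euclidean distances the result coincides with the PCA scores up to the sign of each component, which is why the PCA and Euclidean panels in the demonstration look alike. With other metrics the double-centered matrix need not be positive semidefinite, hence the clipping step in this sketch.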


github-actions bot commented May 6, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: da28a2d.

dkobak (Contributor Author) commented May 8, 2025

Pinging @antoinebaker for a review :-) I have been doing some updates to this PR but am now finished with it.

antoinebaker (Contributor) commented

Thanks for the PR, @dkobak!

Could you maybe use this PR to implement ClassicalMDS only, and do the enhancements of MDS in a separate follow-up PR? I think it will ease the review and merge process.

dkobak (Contributor Author) commented May 12, 2025

> Could you maybe use this PR to implement ClassicalMDS only, and do the enhancements of MDS in a separate follow-up PR? I think it will ease the review and merge process.

All right. I took all changes to the MDS class out of this PR now. I will wait until this PR is merged and then do a follow-up PR to change the MDS class.

dkobak changed the title from "Implement classical MDS" to "FEA Implement classical MDS" on May 14, 2025

antoinebaker (Contributor) commented

Could you please add ClassicalMDS to the common test suite for estimators? It can be done by adding an entry to the INIT_PARAMS dict:

INIT_PARAMS = {
    AdaBoostClassifier: dict(n_estimators=5),
    AdaBoostRegressor: dict(n_estimators=5),
    ...

It will then be covered by the common estimator checks in sklearn/tests/test_common.py::test_estimators.

dkobak (Contributor Author) commented May 28, 2025

Hi @antoinebaker. I think the INIT_PARAMS dictionary is only needed to provide non-default params (like a small number of iterations), which is not needed for ClassicalMDS. An estimator does not have to be in INIT_PARAMS in order to be tested; in that case it will simply be tested with default params. Let me know if I am wrong.

antoinebaker (Contributor) left a comment

Here is a first round of reviews.

antoinebaker (Contributor) commented

> Hi @antoinebaker. I think the INIT_PARAMS dictionary is only needed to provide non-default params (like a small number of iterations), which for ClassicalMDS is not needed. An estimator does not have to be in INIT_PARAMS in order to be tested, it will simply be tested with default params in that case. Let me know if I am wrong.

Ah my bad, I think you're right. If ClassicalMDS appears in the sklearn/tests/test_common.py::test_estimators suite that's all good.
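The fallback behaviour agreed on here (estimators missing from the dictionary are built with their defaults) can be illustrated with a toy registry. Everything in this sketch is hypothetical: `TOY_INIT_PARAMS`, `construct_for_tests`, and the two estimator classes are invented for illustration and are not scikit-learn's actual test helpers.

```python
class FastEstimator:
    """Toy estimator whose defaults are already cheap enough for tests."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha


class SlowEstimator:
    """Toy estimator whose default would make the test suite slow."""

    def __init__(self, n_estimators=100):
        self.n_estimators = n_estimators


# Hypothetical stand-in for a dict of non-default constructor arguments.
# Only estimators that need overrides are listed.
TOY_INIT_PARAMS = {
    SlowEstimator: dict(n_estimators=5),
}


def construct_for_tests(estimator_cls):
    """Build an instance, falling back to defaults when no entry exists."""
    params = TOY_INIT_PARAMS.get(estimator_cls, {})
    return estimator_cls(**params)


# Both classes get an instance; only SlowEstimator receives overrides
instances = [construct_for_tests(cls) for cls in (FastEstimator, SlowEstimator)]
```

In this pattern the dictionary only tunes how an instance is built; whether an estimator is tested at all is decided elsewhere, which matches the point made above.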

adrinjalali (Member) commented

I'm not sure if we discussed this, but I favor this comment (#22330 (comment)) (except the default value) as an overall API, instead of introducing a new class. Any blockers for doing so?

dkobak (Contributor Author) commented Jun 3, 2025

@adrinjalali: Yes, we did discuss it. Just below the comment you linked to, @antoinebaker gave detailed reasons for why he prefers a separate class, please see here: #22330 (comment). He convinced me, and you wrote "I'm happy with the suggestions here" (#22330 (comment)), which is why I implemented a separate class...

adrinjalali (Member) commented

Hmm. Yeah fair enough. Just to avoid future surprises, maybe @lorentzenchr @GaelVaroquaux could also give their opinion?

antoinebaker (Contributor) commented Jun 4, 2025

The TLDR of #22330 (comment) is that ClassicalMDS (Principal Coordinates Analysis) implemented in this PR and the current MDS (SMACOF) implemented in sklearn don't have much in common except their names (different algorithms, objectives, arguments, attributes).
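To make the difference in objectives concrete, here are the standard textbook formulations (not quoted from either PR). The notation is an assumption of this note: $D^{(2)}$ is the matrix of squared input distances $d_{ij}^2$, $J$ is the centering matrix, and $Z \in \mathbb{R}^{n \times k}$ holds the embedding with rows $z_i$.

```latex
% Classical MDS (PCoA): minimize "strain", solved in closed form
% by the top-k eigenpairs of B
B = -\tfrac{1}{2}\, J D^{(2)} J, \qquad J = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\top, \qquad
\min_{Z}\ \bigl\lVert B - Z Z^\top \bigr\rVert_F^2

% Metric MDS via SMACOF: minimize "stress", solved iteratively
\min_{Z}\ \sum_{i<j} \bigl( d_{ij} - \lVert z_i - z_j \rVert \bigr)^2
```

The first problem matches inner products and has an eigendecomposition solution; the second matches distances directly and needs an iterative optimizer with an initialization.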

EDIT: removed outdated comment.


Successfully merging this pull request may close these issues.

sklearn MDS vs skbio PCoA
3 participants