Do we have a compelling reason to enforce tags? #18798


Open
thomasjpfan opened this issue Nov 9, 2020 · 5 comments
Labels: API, Developer API (third-party developer API related), Question

Comments

@thomasjpfan
Member

With #18797, we are making tags optional for third-party estimators. If there were a compelling reason to adopt tags, I would expect third-party developers to adopt them.

Currently, tags are used to hold an estimator's metadata. This metadata is mostly used in our estimator checks.

One of the selling points is that tags can replace the attributes we have on our estimators. We recently deprecated the _pairwise attribute in #18143, and we are working on deprecating _estimator_type in #17806. From this point of view, tags provide a namespace for estimator metadata, which can be used by other libraries that ingest estimators.
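To make the "namespace for metadata" point concrete, here is a minimal sketch of how a library that ingests estimators might read the pairwise flag. The helper `is_pairwise` and the three estimator classes are hypothetical names made up for illustration; the sketch assumes the `_more_tags` hook and the `pairwise` tag key, and it simplifies by calling only the most-derived `_more_tags` rather than merging tags across the class hierarchy.

```python
def is_pairwise(estimator):
    """Hypothetical consumer-side helper: does the estimator expect a pairwise matrix?"""
    # Look in the tags namespace first (assumes the `_more_tags` hook).
    more_tags = getattr(estimator, "_more_tags", None)
    if callable(more_tags):
        tags = more_tags()
        if "pairwise" in tags:
            return bool(tags["pairwise"])
    # Fall back to the deprecated private attribute.
    return bool(getattr(estimator, "_pairwise", False))


class TaggedKernelModel:
    """Hypothetical estimator that declares the metadata through tags."""

    def _more_tags(self):
        return {"pairwise": True}


class LegacyKernelModel:
    """Hypothetical estimator that still uses the private attribute."""

    _pairwise = True


class PlainModel:
    """Hypothetical estimator that declares nothing."""


print(is_pairwise(TaggedKernelModel()), is_pairwise(LegacyKernelModel()), is_pairwise(PlainModel()))
```

With tags, the consumer has one place to look instead of a growing set of private attributes.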

Are there any other reasons for third party developers to adopt tags?

@thomasjpfan changed the title from "Do we have a good enough reason to enforce tags?" to "Do we have a compelling reason to enforce tags?" on Nov 9, 2020

@glemaitre
Member

FWIW, we have started to use tags in our own test suite (e.g. to run some checks on samplers supporting Dask arrays and dataframes).
But this is currently limited because scikit-learn does not have a stable API for the testing framework (cf. #18750), which might hold back this use case for most third-party libraries.

#18797 is indeed making the use of tags optional. However, it should be noted that tags are currently enforced through BaseEstimator, and BaseEstimator does more than enforce tags. My point is that if we move toward enforcing tags, we might want a specific mixin that provides only the tag functionality.
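A rough sketch of what such a tags-only mixin could look like is below. `TagsMixin` and `StringSampler` are hypothetical names, not existing scikit-learn classes, and the two default tags shown are only a small illustrative subset. The merge order (base classes first, subclasses overriding) is an assumption modelled on how tags were collected at the time.

```python
# Illustrative subset of defaults; the real set of tags is larger.
_DEFAULT_TAGS = {"requires_y": False, "X_types": ["2darray"]}


class TagsMixin:
    """Hypothetical mixin that provides tags and nothing else (no get_params/set_params)."""

    def _more_tags(self):
        return {}

    def _get_tags(self):
        tags = dict(_DEFAULT_TAGS)
        # Walk the class hierarchy from base to derived so subclasses override parents.
        for klass in reversed(type(self).__mro__):
            more_tags = vars(klass).get("_more_tags")
            if more_tags is not None:
                tags.update(more_tags(self))
        return tags


class StringSampler(TagsMixin):
    """Hypothetical third-party sampler that accepts strings instead of 2D arrays."""

    def _more_tags(self):
        return {"X_types": ["string"]}


print(StringSampler()._get_tags())
```

A class built this way carries the metadata without inheriting parameter handling or representation logic from BaseEstimator.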

@rth
Member

rth commented Nov 11, 2020

> Are there any other reasons for third party developers to adopt tags?

None that I can think of. I think you summarized the main points well. But we can also ask for feedback from maintainers of scikit-learn compatible projects, to get a better idea of how they feel about tags.

@glemaitre
Member

> Are there any other reasons for third party developers to adopt tags?

Actually, there was an issue where a developer of a third-party package wanted to use tags programmatically in their own estimator. It might be difficult to find the comment in our issue tracker.

@adrinjalali
Member

Last year, when I was working on a training on "how to write your own estimator" for a language use case, the input was a list of strings rather than a numerical ndarray, so I had to use tags to pass check_estimator. To me, using tags is a must for third-party developers in all such use cases, and I see no reason why we shouldn't require them.

@adrinjalali
Member

Update: Tags are now a part of the public API via __sklearn_tags__, and they're used in sklearn.utils.validation.validate_data, which is also now the recommended way to validate input.

So I'd say we can move ahead with requiring tags.

@adrinjalali added the Developer API (third-party developer API related) label on Sep 17, 2024