Polymorphic clone #5080

Closed
@jnothman

sklearn.base.clone is defined to reconstruct an object of the argument's type with its constructor parameters (from get_params(deep=False)) recursively cloned and other attributes removed.
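To make that definition concrete, here is a minimal sketch of the behaviour in plain Python. `ToyEstimator` and `simple_clone` are hypothetical names invented for illustration; `simple_clone` is a simplified stand-in and not the actual `sklearn.base.clone` code.

```python
import copy


class ToyEstimator:
    """Hypothetical estimator following the get_params convention."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self, deep=False):
        return {"alpha": self.alpha}

    def fit(self, X, y):
        # Trailing underscore marks a fitted attribute.
        self.mean_ = sum(y) / len(y)
        return self


def simple_clone(obj):
    """Simplified stand-in for the clone behaviour described above."""
    if not hasattr(obj, "get_params"):
        # Plain values (numbers, strings, lists) are simply copied.
        return copy.deepcopy(obj)
    params = {
        name: simple_clone(value)
        for name, value in obj.get_params(deep=False).items()
    }
    # Rebuild from constructor parameters only; fitted state is dropped.
    return type(obj)(**params)


fitted = ToyEstimator(alpha=0.5).fit([[0], [1]], [2.0, 4.0])
copy_of_fitted = simple_clone(fitted)
```

In this sketch `copy_of_fitted` keeps `alpha` but has no `mean_`, which is exactly the behaviour that gets in the way of a frozen model.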

There are cases where I think the One Obvious Way to provide an API entails allowing polymorphic overriding of clone behaviour. In particular, my longstanding implementation of wrappers for memoized and frozen estimators relies on this, and I would like that library of utilities not to depend on a locally modified sklearn.base. So the change needs to be made in sklearn.base itself.

Let me try to explain. Let's say we want a way to freeze a model. That is, cloning it should not flush its fit attributes, and calling fit again should not affect it. A syntax like the following seems far and away the clearest:

est = freeze_model(MyEstimator().fit(special_X, special_Y))

It should be obvious that the standard definition of clone does not make this easy: we need to keep more than get_params returns, unless MyEstimator().__dict__ becomes a param of the freeze_model instance, which is pretty hacky.

Alternative syntax could be class decoration (freeze_model(MyEstimator)()) or mixin (class MyFrozenEstimator(MyEstimator, FrozenModel): pass) such that the first call to fit then sets a frozen model. These are not only uglier, but encounter the same problems.

Ideally this sort of estimator wrapper should pass through {set,get}_params of the wrapped estimator without adding underscored prefixes (not that this is so pertinent for a frozen model, but it is for other applications of similar wrappers). It should also delegate all attributes to the wrapped estimator. IMO this is also not possible without either making a mess of freeze_model.__init__ or redefining clone.
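A rough sketch of the pass-through idea, in plain Python. `DelegatingWrapper` and `ToyEstimator` are hypothetical names used only for illustration, not existing scikit-learn classes.

```python
class ToyEstimator:
    """Hypothetical estimator with a single constructor parameter."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self, deep=True):
        return {"alpha": self.alpha}

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self


class DelegatingWrapper:
    """Hypothetical wrapper that forwards parameters and attributes."""

    def __init__(self, estimator):
        self.estimator = estimator

    def get_params(self, deep=True):
        # Expose the wrapped estimator's parameters under their own names,
        # with no "estimator__" prefix.
        return self.estimator.get_params(deep=deep)

    def set_params(self, **params):
        self.estimator.set_params(**params)
        return self

    def __getattr__(self, name):
        # Only reached when normal lookup fails. The guard avoids infinite
        # recursion if "estimator" has not been set yet (e.g. during copying).
        if name == "estimator":
            raise AttributeError(name)
        return getattr(self.estimator, name)


wrapped = DelegatingWrapper(ToyEstimator(alpha=0.5).fit([[0], [1]], [2.0, 4.0]))
wrapped.set_params(alpha=2.0)
```

Note that rebuilding this wrapper as `type(wrapped)(**wrapped.get_params())` would call `DelegatingWrapper(alpha=2.0)`, which its constructor does not accept. That mismatch with the standard clone is the problem described above.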

So. Can we agree:

  • that it would not be a Bad Thing to allow polymorphism in cloning?
  • on a name for the polymorphic clone method: clone or clone_params or sklearn_clone?
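For discussion, here is one way the proposal could look, as a sketch in plain Python. It assumes the hook is called `sklearn_clone` (one of the candidate names above); `polymorphic_clone`, `FrozenModel` and `ToyEstimator` are hypothetical names, and none of this is existing scikit-learn code.

```python
import copy


def polymorphic_clone(obj):
    """Sketch of a clone that defers to the object when it offers a hook."""
    hook = getattr(obj, "sklearn_clone", None)  # hypothetical hook name
    if callable(hook):
        return hook()
    if not hasattr(obj, "get_params"):
        return copy.deepcopy(obj)
    params = {
        name: polymorphic_clone(value)
        for name, value in obj.get_params(deep=False).items()
    }
    # Default path: rebuild from constructor parameters, dropping fit state.
    return type(obj)(**params)


class ToyEstimator:
    """Hypothetical estimator with no clone hook."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self, deep=False):
        return {"alpha": self.alpha}

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self


class FrozenModel:
    """Hypothetical frozen wrapper: survives clone and ignores fit."""

    def __init__(self, estimator):
        self.estimator = estimator

    def get_params(self, deep=False):
        return {"estimator": self.estimator}

    def fit(self, X, y=None):
        # Frozen: refitting leaves the wrapped model untouched.
        return self

    def sklearn_clone(self):
        # Keep the fitted state instead of rebuilding from parameters.
        return FrozenModel(copy.deepcopy(self.estimator))


def freeze_model(estimator):
    return FrozenModel(estimator)


est = freeze_model(ToyEstimator(alpha=0.5).fit([[0], [1]], [2.0, 4.0]))
cloned = polymorphic_clone(est)
cloned.fit([[0]], [100.0])
```

Estimators without the hook still go through the default path, so existing behaviour would be unchanged for them; only objects that opt in get to decide what a clone means.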
