API Freezing estimators #8370
I've not checked whether any of these are reasonable use contexts for frozen models. |
• FrozenModel delegates all attribute access (get, set, del) to its wrapped estimator (except where specified)
I don't like the feeling of this. It's the type of behavior that creates
surprises in the long term for users and developers (and surprises mean
bugs).
• isinstance(freeze(obj), type(obj)) == True and isinstance(freeze(obj), FrozenModel) == True
Do we really need this? It does make things tricky.
As a general design rule, I tend to prefer when we are able to build
designs that do not rely on isinstance. When libraries grow large, I find
that these designs tend to drive complexity in the library in a
superlinear way.
I think that if we want to rely on inheritance, we'll have to dynamically
generate classes. I worry a lot about this. It's feasible, but the
intersection of the people who understand how it works and what it
entails with the people who understand machine learning is small.
|
We could give up on the isinstance criterion and require that users do not use isinstance with estimator types on the RHS except in testing and similar.
Looks like we're thinking in the same direction. If we can replace these
tests by something else, such as estimator tags (#6599), I think that we
have more chances of keeping a sane codebase.
|
I agree that we've got too many tricky criteria here, and it's not going to
work with all of them maintained.
|
I was checking how transfer learning is achieved in Keras. In this manner, you would not require a wrapper like FrozenModel. What are the drawbacks of such a design? |
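For context, the Keras approach referred to here marks the already-trained layers as non-trainable instead of wrapping them; a minimal sketch, assuming TensorFlow/Keras is installed (the toy layer sizes are illustrative):

# Sketch of the Keras idiom: freezing is a flag on the trained object itself,
# so no separate wrapper class is needed.
from tensorflow import keras

# A pre-trained feature extractor (stand-in for a real pre-trained network).
base = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
])
base.trainable = False  # freeze: these weights are excluded from further training

model = keras.Sequential([
    base,
    keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(...) now updates only the final Dense layer.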
What are the drawbacks of such a design?
Every estimator needs to be coded to check, in fit, whether it should actually fit. This can be done by implementing the fitting logic in something like "_fit" and having "fit" perform the check and then call "_fit". We've frowned on the complexity of such code so far.
|
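A minimal sketch of the fit/_fit split described above, assuming a hypothetical frozen flag on the estimator; none of this is existing scikit-learn API:

from sklearn.base import BaseEstimator

class FreezableEstimator(BaseEstimator):
    """Illustrative only: every estimator would need this boilerplate."""

    def __init__(self, frozen=False):
        self.frozen = frozen

    def _fit(self, X, y):
        # The real fitting logic would live here.
        self.mean_ = sum(y) / len(y)
        return self

    def fit(self, X, y):
        # Public fit only checks the flag, then delegates to _fit.
        if self.frozen:
            return self
        return self._fit(X, y)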
I don't think that's reasonable. More simply, it can be the responsibility of meta-estimators. Each could support frozen sub-estimators or not. The disadvantage is then that meta-estimators become more complicated, and their designers have another thing to remember... |
And I'm leaning towards that approach. I don't think we should modify |
Since it needs to be handled in each meta-estimator, it is open to buggy implementations (remembering to check for it in one place and not another), so we need to be careful. |
In discussion earlier, @GaelVaroquaux and I came to some agreement that a simple design goes as follows (more or less what I said in February):
I would further suggest that:
The main problem with this design is that meta-estimators become more complicated. We have already burdened them with things like caching. |
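As a rough illustration of the meta-estimator-responsibility idea discussed above, using a hypothetical frozen marker that each meta-estimator would have to remember to honour (again, not an existing API):

from sklearn.base import BaseEstimator, MetaEstimatorMixin, clone

class MyMetaEstimator(MetaEstimatorMixin, BaseEstimator):
    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        if getattr(self.estimator, "frozen", False):
            # Frozen: reuse the already-fitted sub-estimator untouched.
            self.estimator_ = self.estimator
        else:
            # Normal path: clone and refit, as meta-estimators do today.
            self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def predict(self, X):
        return self.estimator_.predict(X)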
#8374 may remain conceptually simpler for users (because it should work in all contexts), but it is a little more magical: it overwrites the fit method of frozen estimators with a special implementation. |
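Conceptually, that kind of approach can be pictured with the sketch below; this is only an illustration of the idea, not the actual code in #8374:

import copy

def freeze(estimator):
    """Return a copy of a fitted estimator whose fit is a no-op (sketch only)."""
    frozen = copy.deepcopy(estimator)

    def fit(*args, **kwargs):
        return frozen  # keep the learned state, never refit

    # Shadow the class-level fit on this one instance.
    frozen.fit = fit
    if hasattr(frozen, "transform"):
        frozen.fit_transform = lambda X, y=None, **kw: frozen.transform(X)
    # Caveat: instance-level closures like these do not pickle, which is one
    # reason a real implementation needs more machinery than this.
    return frozen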
Hi @jnothman - I saw that this issue has been inactive for a while - do you know if this is being addressed elsewhere? We have a use case for a Dynamic Classifier Ensemble library that could be solved by this implementation. |
Can you describe your use case, so that we can evaluate whether this is
indeed the best solution, and thus have more support for its inclusion?
|
Sure @jnothman. I am one of the developers of a library for Dynamic Ensemble Selection (DES) methods (the library is called DESlib). We are having problems making the classes compatible with GridSearch and other CV functions.

One of the main use cases of this library is to facilitate research in the field of dynamic selection of classifiers, which led to a design decision where the base classifiers are fit by the user, and the DES methods receive a pool of base classifiers that were already fit - this allows users to compare many DES techniques with the same base classifiers. Note that this could also be helpful for any ensemble technique, as it gives users more freedom in how to create the pool of classifiers (see the example code below).

However, this approach creates an issue with GridSearch, since the clone method (defined in sklearn.base) does not clone the classes as we would like. It does a shallow (non-deep) copy of the parameters, but we would like the pool of base classifiers to be deep-copied. I analyzed this issue and could not find a solution that does not require changes to the scikit-learn code. Here is the sequence of steps that causes the problem:

The problem is that, to my knowledge, there is no way for my classifier to inform "clone" that a parameter should always be deep copied. I see that other ensemble methods in sklearn always fit the base classifiers within the "fit" method of the ensemble, so this problem does not happen there. It seems to me that this github issue could address our problem, by allowing us to "freeze" the pool of classifiers inside the DES classes, such that they are always deep copied / re-used (i.e. they retain the trained parameters of the pool of classifiers).

Here is a short code example that reproduces the issue:

from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.ensemble import BaggingClassifier
from sklearn.datasets import load_iris
class MyClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, base_classifiers, k):
        self.base_classifiers = base_classifiers  # Base classifiers that are already trained
        self.k = k  # Simulate a parameter that we want to do a grid search on

    def fit(self, X_dsel, y_dsel):
        pass  # Here we would fit any parameters for the dynamic selection method, not the base classifiers

    def predict(self, X):
        return self.base_classifiers.predict(X)  # In practice the methods would do something with the predictions of each classifier
X, y = load_iris(return_X_y=True)
X_train, X_dsel, y_train, y_dsel = train_test_split(X, y, test_size=0.5)
base_classifiers = BaggingClassifier()
base_classifiers.fit(X_train, y_train)
clf = MyClassifier(base_classifiers, k=1)
params = {'k': [1, 3, 5, 7]}
grid = GridSearchCV(clf, params)
grid.fit(X_dsel, y_dsel)  # Raises an error because the bagging classifiers are not fitted

This code fails because the grid search clones clf, which in turn makes a shallow copy of base_classifiers (which will no longer be fitted). FYI, the actual failing test in our library is this one: https://github.com/Menelau/DESlib/blob/sklearn-estimators/deslib/tests/test_des_integration.py#L36 |
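The step that drops the fitted state is clone itself: it rebuilds every estimator-valued parameter from its constructor arguments, so the fitted attributes of the nested pool disappear. A small illustration, reusing the MyClassifier class from the snippet above:

from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier

X, y = load_iris(return_X_y=True)
pool = BaggingClassifier().fit(X, y)
wrapped = MyClassifier(pool, k=1)

cloned = clone(wrapped)
print(cloned.base_classifiers is pool)                   # False: a fresh copy
print(hasattr(cloned.base_classifiers, "estimators_"))   # False: fitted state is gone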
One variant we have considered is where the meta-estimator can mark some of its parameters as not to be cloned. Would that work? |
Just to confirm: would this mean that the clone would simply refer to the same object (e.g. the cloned estimator holding a reference to the very same fitted pool)? This should work for our use case, since we make no changes to the base classifiers after they are fit. |
Yes, the clone would refer to the same object.
|
Thanks for confirming. This would solve the problem for our use case. |
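As a side note from a later vantage point: scikit-learn 1.3+ added a __sklearn_clone__ hook (to the best of my knowledge) that lets an estimator customise what clone returns, which is one way to get this "same object" behaviour for the pool. A sketch, with illustrative names:

from sklearn.base import BaseEstimator, ClassifierMixin

class PoolSharingClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, base_classifiers, k=1):
        self.base_classifiers = base_classifiers  # already-fitted pool
        self.k = k

    def __sklearn_clone__(self):
        # Called by sklearn.base.clone on scikit-learn >= 1.3: return a fresh
        # instance that shares the very same fitted pool object.
        return type(self)(self.base_classifiers, k=self.k)

    def fit(self, X, y):
        return self

    def predict(self, X):
        return self.base_classifiers.predict(X)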
I also have a potential use case for this. I work with forecasts from weather models. For training models I have access to historic weather data (wind speed, say) to use as features. However, when making real-time forecasts that information is not available; I only have forecasts of those features. I want to be able to fit an estimator on the historical features, freeze it, and then use it within a pipeline to make predictions, without it being refit. |
At the sprint today we discussed four solutions for this, with a focus on transfer learning:
We have resolved to take the fourth option, since it involves no changes to clone, meta-estimators, etc., and to implement an example of it (serialising with joblib and/or onnx) as a prototype. The third was otherwise seen as the best solution available. |
Discussing this with @thomasjpfan: why do we want a frozen estimator to still pass isinstance checks against its original class? If I freeze an SVC, I shouldn't expect it to behave like an SVC... so why do we want this? (asking because this seems to be the main reason for rejecting one of the PRs, which is otherwise pretty solid) |
@GaelVaroquaux @jnothman :) ? |
Do we really want a frozen SVC to pretend it's an SVC and yet have its fit do nothing? |
Maybe that isn't an important constraint, just as long as its behaviours
are equivalent?
|
I'm not sure what you mean by that @jnothman |
I mean that being an instance of the frozen type is not necessarily
important as long as it has all attributes of the frozen type.
|
I can't quite tell from this discussion whether it is possible to freeze estimators. I gather the "open" setting means it is still in discussion. How badly can I be led astray by using a Pipeline? Clearly the individual steps of a Pipeline could be manipulated, so this would not be freezing. My main use case is that I do not want the estimators to be refit within a few of my functions / methods. I clearly cannot control what other users do offline. If freezing were possible, I might clone and freeze, but this would lose the reference to the original estimator, which a user might want to inspect. |
Regarding a static transformer (option 4 referenced above), maybe something like this?
The base model that is loaded (the frozen model) does not change, and it seems to survive a clone. |
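A sketch of one way such a static transformer could look, assuming the pre-fitted model was saved with joblib (the class and parameter names are illustrative, not the snippet originally posted):

import joblib
from sklearn.base import BaseEstimator, TransformerMixin

class StaticTransformer(BaseEstimator, TransformerMixin):
    """Wraps a model fitted and saved elsewhere; fit never retrains it."""

    def __init__(self, model_path):
        self.model_path = model_path  # the only "parameter" clone needs to copy

    def fit(self, X, y=None):
        # Load instead of fit: cloning or refitting cannot lose the learned
        # state, because the state always comes back from disk.
        self.model_ = joblib.load(self.model_path)
        return self

    def transform(self, X):
        return self.model_.predict(X).reshape(-1, 1)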
In terms of API, now that we have |
+1 to close as we even have the |
I recall a discussion IRL with users for whom having this in the library, so that they could just import the estimator, would have made their life much easier. I'm wondering if it was during the OpenML workshop and whether it was linked with some AutoML use cases. |
I'm also happy to have it inside the library. I've opened #29705 for this. |
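For readers arriving later: a frozen-estimator wrapper did eventually land in the library (sklearn.frozen.FrozenEstimator, available since roughly scikit-learn 1.6, to the best of my knowledge). Typical usage looks like this:

from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.frozen import FrozenEstimator       # scikit-learn >= 1.6 assumed
from sklearn.linear_model import LogisticRegression

X, y = make_classification(random_state=0)
clf = LogisticRegression().fit(X, y)

# Wrap the already-fitted classifier so the calibrator never refits it.
calibrated = CalibratedClassifierCV(FrozenEstimator(clf)).fit(X, y)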
TODO: motivate freezing: pipeline components, calibration, transfer/semi-supervised learning

This should probably be a SLEP, but I just want it saved somewhere.

Features required for estimator freezing:

• clone must have if isinstance(obj, FrozenModel): return obj (or do so via class polymorphism / singledispatch)
• FrozenModel delegates all attribute access (get, set, del) to its wrapped estimator (except where specified); the wrapped estimator is perhaps not available at FrozenModel().estimator but at some more munged name.
• FrozenModel has def fit(self, *args, **kwargs): return self
• FrozenModel has def fit_transform(self, *args, **kwargs): return fit(self, *args, **kwargs).transform(self, args[0]) (and similar for fit_predict?)
• isinstance(freeze(obj), type(obj)) == True and isinstance(freeze(obj), FrozenModel) == True
  • Apart from magic in type(freeze(obj)) (excluding __instancecheck__, which seems irrelevant), this appears to be the hardest criterion to fulfill
  • Hacks involving __class__ (!) and overloading of __reduce__: help! I think I've gone down the wrong path!!
• Frozen models should survive pickle and copy.[deep]copy
• freeze(some_list) will freeze every element of the list
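A rough sketch of a wrapper along the lines of the list above (attribute delegation plus no-op fit), deliberately ignoring the isinstance and pickling criteria; names and details are illustrative:

class FrozenModel:
    """Conceptual sketch only: does not attempt the isinstance criterion."""

    def __init__(self, estimator):
        # Store under a munged name so delegation does not shadow it.
        object.__setattr__(self, "_frozen_estimator", estimator)

    def __getattr__(self, name):
        # Only reached when normal lookup fails: delegate to the wrapped estimator.
        return getattr(object.__getattribute__(self, "_frozen_estimator"), name)

    def __setattr__(self, name, value):
        setattr(object.__getattribute__(self, "_frozen_estimator"), name, value)

    def __delattr__(self, name):
        delattr(object.__getattribute__(self, "_frozen_estimator"), name)

    def fit(self, *args, **kwargs):
        return self  # freezing: fit is a no-op

    def fit_transform(self, X, *args, **kwargs):
        return self.transform(X)  # transform is delegated to the wrapped estimator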