[MRG] MAINT add base class for voting and stacking #15084
@thomasjpfan @ogrisel @rth @adrinjalali So the naming of the base class is terrible, but I wanted to have a WIP PR so that we can see what is in common and whether it makes sense to merge the code. NB: the tests will fail because I did not add support for WDYT?
thomasjpfan
left a comment
I like this refactoring. The _BaseEnsembleHeterogeneousEstimator class has well-defined boundaries.
Good to be reviewed. I will open a PR to deprecate
Having the init makes it explicit that the estimators parameter is the
common denominator for all inherited classes.
I am fine with keeping it.
…On Wed, 2 Oct 2019 at 13:07, Nicolas Hug ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In sklearn/ensemble/base.py
<#15084 (comment)>
:
> @@ -178,3 +182,76 @@ def _partition_estimators(n_estimators, n_jobs):
     starts = np.cumsum(n_estimators_per_job)
     return n_jobs, n_estimators_per_job.tolist(), [0] + starts.tolist()
+
+
+class _BaseHeterogeneousEnsemble(MetaEstimatorMixin, _BaseComposition,
+                                 metaclass=ABCMeta):
+    """Base class for ensemble learners based on heterogeneous estimators."""
+    _required_parameters = ['estimators']
+
+    @property
+    def named_estimators(self):
+        return Bunch(**dict(self.estimators))
+
+    @abstractmethod
+    def __init__(self, estimators):
+        self.estimators = estimators
i'm suggesting to not have the init method
--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/
NicolasHug
left a comment
I guess I'm becoming increasingly skeptical about the relevance of inheritance in some cases (like here where all it does is set a single attribute). Makes the code easy to write, but often harder to understand.
But LGTM anyway.
doc/whats_new/v0.22.rst
Outdated
- |Fix| Stacking and Voting estimators now ensure that their underlying
  estimators are either all classifiers or all regressors.
  We introduced a new base class
  :class:`ensemble.base._BaseHeterogeneousEnsemble` to raise consistent error
Should we include a private class in the whats new? This can be something like:
Stacking and Voting estimators now raise consistent error messages.
I might have misunderstood what @NicolasHug meant by adding a link.
Did you mean mentioning the class, or did you expect something else?
Since we are not generating a new API doc for _BaseHeterogeneousEnsemble, there is nothing to link to: https://76528-843222-gh.circle-artifacts.com/0/doc/whats_new/v0.22.html#sklearn-ensemble
The links referred to the Stacking and Voting estimators, sorry if that wasn't clear. I agree we shouldn't link a private class. (and I'm also fine not linking the estimators... it's just a nit)
Oh OK, makes sense.
doc/whats_new/v0.22.rst
Outdated
- |Fix| Stacking and Voting estimators now ensure that their underlying
  estimators are either all classifiers or all regressors.
  We introduced a new base class
We do not need a "new base class" part?
:class:`ensemble.StackingClassifier`, :class:`ensemble.StackingRegressor`, :class:`ensemble.VotingClassifier`, and :class:`ensemble.VotingRegressor` now raise consistent error messages.
wrong edit
Thank you @glemaitre!
closes #15056
Create a base class for Voting* and Stacking*. Both are ensembles of multiple learner types.
They could share `get_params`, `set_params`, and the validation of `estimators` (as well as the fitted attributes).

This base class could be contrasted with ensembles of a single learner type, such as boosting (AdaBoost, GBDT), RF, and Bagging.
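To make the idea concrete, here is a minimal, dependency-free sketch of what such a shared base could look like. It is not the code merged in this PR: `HeterogeneousEnsembleSketch`, `VotingSketch`, `_validate_estimators`, and the dummy estimators are hypothetical names, `SimpleNamespace` stands in for scikit-learn's `Bunch`, and checking an `_estimator_type` attribute is an assumed way to tell classifiers from regressors.

```python
from abc import ABCMeta, abstractmethod
from types import SimpleNamespace


class HeterogeneousEnsembleSketch(metaclass=ABCMeta):
    """Hypothetical stand-in for the shared base class (not scikit-learn code)."""

    @abstractmethod
    def __init__(self, estimators):
        # `estimators` is a list of (name, estimator) pairs; it is the one
        # parameter that every subclass has in common.
        self.estimators = estimators

    @property
    def named_estimators(self):
        # Attribute-style access to the estimators by name.
        return SimpleNamespace(**dict(self.estimators))

    def _validate_estimators(self, expected_type):
        # Shared validation, so every subclass raises the same error messages.
        if not self.estimators:
            raise ValueError(
                "'estimators' should be a non-empty list of "
                "(name, estimator) tuples."
            )
        names, ests = zip(*self.estimators)
        if len(set(names)) != len(names):
            raise ValueError(f"Estimator names must be unique, got {list(names)!r}.")
        for name, est in self.estimators:
            # Assumption: estimators advertise their kind via `_estimator_type`.
            if getattr(est, "_estimator_type", None) != expected_type:
                raise ValueError(f"The estimator {name!r} should be a {expected_type}.")
        return names, ests


class VotingSketch(HeterogeneousEnsembleSketch):
    """Hypothetical subclass; a stacking-style class would look the same."""

    def __init__(self, estimators):
        super().__init__(estimators)


class DummyClassifier:
    _estimator_type = "classifier"


class DummyRegressor:
    _estimator_type = "regressor"
```

With this layout, `VotingSketch([("a", DummyClassifier()), ("r", DummyRegressor())])._validate_estimators("classifier")` raises a `ValueError` naming `'r'`, and any other subclass would report the same message because the check lives in one place.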