
[MRG] GridSearchCV.use_warm_start parameter for efficiency #8230


Open · wants to merge 47 commits into main
Conversation

@jnothman (Member) commented Jan 24, 2017

Alternative to #8226, provides a generic CV optimisation making use of warm_start.

The example, modified from #8226, shows the benefit of using this for optimising GBRT n_estimators.

[figure: timing comparison plot ("times")]

  • Basic implementation version 2 (use_warm_start takes a string/list) as at 1-Feb-2017
  • Parameter docstring
  • Tests
  • Example
  • Narrative docs

@raghavrv (Member):

This is amazing and simple!! Thanks for the PR!

@jnothman (Member, Author):

it needs a caveat emptor, and is not as readily user-friendly as #8226, but yes, it's simple.

@jnothman (Member, Author):

Presuming @agramfort had some interest in #8226, could you comment on whether you think this is sufficiently elegant/usable wrt API?

Review comment on:

    if k == 'warm_start' or k.endswith('__warm_start')})
    # one clone per fold
    out = parallel(delayed(_warm_fit_and_score)(candidate_params,
                                                clone(base_estimator),

Comment (Member):

the clone flushes the previous param values no? what am i missing?

@jnothman (Member, Author):

clone is per fold, not per parameter candidate. all params are passed to _warm_fit_and_score which fits each in turn.

@agramfort (Member):

do you get an error if you try to fit next with less estimators?

for a Lasso for example you would need to fit first with a high alpha and then reduce it.

@raghavrv can you see if this would work smoothly for Lasso? Would it match LassoCV perf?
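For reference, the Lasso pattern described here can be sketched directly with warm_start (a minimal illustration against released scikit-learn, not this PR's code):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=50, random_state=0)

# Warm-start a Lasso along a decreasing alpha path: each fit starts
# coordinate descent from the previous solution, as LassoCV does internally.
lasso = Lasso(warm_start=True)
for alpha in [1.0, 0.5, 0.1, 0.01]:
    lasso.set_params(alpha=alpha)
    lasso.fit(X, y)
```

Fitting in increasing-alpha order instead would lose this benefit, which is exactly the ordering concern raised here.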

@jnothman (Member, Author):

> do you get an error if you try to fit next with less estimators?

Never mind that; it's awkward if you modify a parameter like min_samples_split instead. That's why we need a caveat emptor; or why this might not be the right solution.

@jnothman (Member, Author):

> for a Lasso for example you would need to fit first with a high alpha and then reduce it.

Maybe these sorts of details, i.e. what is appropriate to change when using warm_start, should be noted in warm_start descriptions.
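For GBRT, the documented warm-startable change is growing n_estimators; a minimal sketch of the reuse this PR aims to exploit (illustration only, not the PR's code):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=300, random_state=0)

# With warm_start=True, raising n_estimators keeps the trees already
# fitted and only trains the additional ones.
gbrt = GradientBoostingClassifier(n_estimators=50, warm_start=True,
                                  random_state=0)
gbrt.fit(X, y)
gbrt.set_params(n_estimators=100)
gbrt.fit(X, y)  # trains only 50 more trees
```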

@agramfort (Member) commented Jan 25, 2017

My problem: how do you tell grid search which parameters can be warm-started and which cannot? E.g. in GBRT, n_estimators can be. What if you do:

gbrt = GradientBoostingClassifier(max_depth=3, n_estimators=100)
gbrt.fit(X, y)
gbrt.set_params(max_depth=5, n_estimators=50)

does it break?

The GradientBoostingClassifierCV is simpler API-wise, as it's obvious that only n_estimators can be warm-started.

@raghavrv (Member) commented Jan 25, 2017

BTW, not cloning for non-warm-startable params makes it break silently for those params. You can see that from the snippet below (when run on this branch):

import pandas as pd

from sklearn import datasets
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV


data_list = [datasets.load_iris(return_X_y=True),
             datasets.load_digits(return_X_y=True),
             datasets.make_hastie_10_2()]
names = ['Iris Data', 'Digits Data', 'Hastie Data']

search_max_depths = range(1, 5)

times = []
ests = []

for use_warm_start in [False, True]:
    for X, y in data_list:
        gb_gs = GridSearchCV(
            GradientBoostingClassifier(random_state=42),
            param_grid={'max_depth': search_max_depths},
            scoring='f1_micro', cv=3, refit=True, verbose=True,
            use_warm_start=use_warm_start).fit(X, y)
        times.append(gb_gs.cv_results_['mean_fit_time'].sum())
    ests.append(gb_gs)  # keep the search on the last dataset for each setting


# Compare cv_results_ without (ests[0]) and with (ests[1]) warm start
pd.DataFrame(ests[0].cv_results_)
pd.DataFrame(ests[1].cv_results_)

Now we have two options: either maintain a _WARM_START_PARAMS list for each class that supports warm start, or clearly document that this would lead to weird results if used on params that are not warm-startable and trust the user to follow that (not very explicit)...

@raghavrv (Member):

This makes me wonder whether it is time to revisit @amueller's suggestion elsewhere of fit_more(**new_params) or refit(**new_params), and let GridSearchCV use it if available...

Review comment on:

    **{k: True
       for k in base_estimator.get_params(deep=True)
       if k == 'warm_start' or k.endswith('__warm_start')})
    # one clone per fold

@raghavrv (Member) commented Jan 25, 2017:

If we can somehow let the estimator communicate to GridSearchCV which params can be searched this way, we could clone one estimator per fold for those params alone and use the previous technique for the rest. That way this solution would not produce weird results for params that do not make use of warm_start (while still getting the speedup from warm_start)...

WDYT @agramfort @jnothman @amueller
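One way an estimator could communicate this (all names here are purely hypothetical, not scikit-learn API) is a class-level declaration that the search consults before deciding whether to clone:

```python
# Hypothetical sketch: an estimator advertises which parameters may be
# changed between warm-started fits via a class attribute.
class WarmStartableGBRT:
    _warm_start_params = frozenset({"n_estimators"})

def can_warm_start(estimator_cls, changed_params):
    """Return True if every changed parameter is declared warm-startable."""
    declared = getattr(estimator_cls, "_warm_start_params", frozenset())
    return set(changed_params) <= declared
```

The search could then reuse a fitted clone when can_warm_start(...) holds for the parameters that differ between consecutive candidates, and fall back to a fresh clone otherwise.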

Comment (Member):

Also, this becomes complicated if you do a combined search over both n_estimators (warm-startable) and max_depth (non-warm-startable)...

Comment (Member):

Hmmm.. this is where EstimatorCV can be useful I think... You can do something like GridSearchCV(GradientBoostingClassifierCV(n_estimators_range=range(1, 10)), param_grid={'max_depth': range(1, 5)})...

@jnothman (Member, Author) commented Jan 25, 2017 via email

@jnothman (Member, Author) commented Jan 31, 2017 via email

@jnothman (Member, Author) commented Jan 31, 2017

I've implemented the changed design. Note that a search over both max_depth and n_estimators now makes sense, while it was not possible with the previous design nor with GradientBoostingClassifierCV.

@jnothman (Member, Author):

@agramfort, what do you think?

@jnothman jnothman changed the title [WIP] GridSearchCV.use_warm_start parameter for efficiency [MRG] GridSearchCV.use_warm_start parameter for efficiency Feb 1, 2017
@jnothman jnothman changed the title [MRG] GridSearchCV.use_warm_start parameter for efficiency [WIP] GridSearchCV.use_warm_start parameter for efficiency Feb 1, 2017
@thomasjpfan (Member) left a comment:

The benefits of this PR are quite big. I left an API question about extending this to RandomizedSearchCV.

Comment on lines +1191 to +1192:

    Candidate parameter settings will be reordered to maximise use of this
    efficiency feature.

@thomasjpfan (Member):

I think it's worth thinking about how to get this working for RandomizedSearchCV, so all the *SearchCV classes have a consistent API for defining the correct order.

What do you think of something very explicit like the following?

HalvingGridSearchCV(..., use_warm_start={"n_estimators": "increasing"})

In the future, if we have this information inside estimator tags, we can have a use_warm_start='auto'.

@jnothman (Member, Author):

Do you see this specification as part of the initial release, or a subsequent iteration? We could go ahead and implement this, but I wonder if the feature as presented is usable and forward compatible enough...?

@jnothman (Member, Author):

The test failures here are false alarms due to numpy/numpydoc#365

@jnothman (Member, Author):

This now has tests for _generate_warm_start_groups as requested by @thomasjpfan, and is ready for review.

@jnothman (Member, Author):

I still see this as beneficial in many cases. I've merged the latest main.

@thomasjpfan want to give it another go?

Review comment on:

    for use_warm_start in [None, "n_estimators"]:
        for X, y in data_list:
            gb_gs = GridSearchCV(
                GradientBoostingClassifier(random_state=42, warm_start=True),

@jnothman (Member, Author):

Perhaps we should update this to HistGradientBoostingClassifier?

@thomasjpfan (Member):

@jnothman I'm trying to think of a scikit-learn estimator that we can use to really show this off. HistGradientBoosting is the likely target, but I do not tend to see code that searches over the warm-startable parameter (max_iter).

@amueller (Member):

@thomasjpfan is that because of early stopping?

@thomasjpfan (Member):

> is that because of early stopping?

Now that I am looking over the search spaces of some AutoML libraries, it's a bit all over the place:

  • Some include max_iter in the search space
  • Some do not include max_iter in the search space

In that case, I think we can move forward with this PR.

Reading over: #15125

The trick proposed by @jnothman in #8230 is to transform the list generated by ParameterGrid from

    [{'a': 1, 'b': 3}, {'a': 1, 'b': 4}, {'a': 2, 'b': 3}, {'a': 2, 'b': 4}]

to

    [[{'a': 1, 'b': 3}, {'a': 1, 'b': 4}],
     [{'a': 2, 'b': 3}, {'a': 2, 'b': 4}]]

@jnothman Is this still the case? If so, does that mean when n_jobs=4, only 2 jobs will be spawned?
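The reordering can be sketched roughly like this (group_candidates is an illustrative name, not the PR's actual helper):

```python
from itertools import groupby

def group_candidates(candidates, warm_start_param):
    """Group candidates that differ only in the warm-startable parameter,
    sorting each group by that parameter's value so that one warm-started
    estimator per fold can fit the group sequentially."""
    def key(params):
        # All non-warm-start parameters identify a group.
        return sorted((k, v) for k, v in params.items()
                      if k != warm_start_param)
    ordered = sorted(candidates, key=key)
    return [sorted(group, key=lambda p: p[warm_start_param])
            for _, group in groupby(ordered, key=key)]

candidates = [{'a': 1, 'b': 3}, {'a': 1, 'b': 4},
              {'a': 2, 'b': 3}, {'a': 2, 'b': 4}]
groups = group_candidates(candidates, warm_start_param='b')
# groups == [[{'a': 1, 'b': 3}, {'a': 1, 'b': 4}],
#            [{'a': 2, 'b': 3}, {'a': 2, 'b': 4}]]
```

If this reading is right, each group must be fitted sequentially on a warm-started estimator, so the unit of parallelism would be (group, fold) pairs rather than (candidate, fold) pairs.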

@AhmedThahir:
Any updates?
