Description
I just tried running GridSearchCV on HistGradientBoostingClassifier, but it crashed with a ZeroDivisionError. I was running with verbose logging; the last output from GridSearchCV before the crash was:
[CV] learning_rate=1, max_iter=500, max_leaf_nodes=None, tol=1e-08, total=16.8min
[CV] learning_rate=1, max_iter=500, max_leaf_nodes=31, tol=1e-06 .....
Binning 0.015 GB of data: 0.089 s
Fitting gradient boosted rounds:
[1/500] 1 tree, 31 leaves, max depth = 8, in 0.042s
[2/500] 1 tree, 31 leaves, max depth = 9, in 0.058s
...
[489/500] 1 tree, 2 leaves, max depth = 1, in 0.016s
[490/500]
<crash>
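Some context from reading the traceback below: the crash is in grower._finalize_leaf, which computes the leaf value as -shrinkage * sum_gradients / (sum_hessians + l2_regularization). With the default l2_regularization=0 the denominator is exactly zero whenever the sum of hessians in the node is zero; for binary_crossentropy the per-sample hessian is p * (1 - p), which underflows to zero once the raw predictions saturate, and with learning_rate=1 over hundreds of rounds they apparently do (the trees have shrunk to 2 leaves by round 489). A minimal sketch of the failing arithmetic with plain Python floats (an illustration, not sklearn's code):

shrinkage = 1.0          # learning_rate=1 from the grid
l2_regularization = 0.0  # sklearn's default
sum_gradients = 0.0      # gradients vanish once the fit saturates
sum_hessians = 0.0       # p * (1 - p) underflows to 0 once p saturates

try:
    # Shape of the expression in grower._finalize_leaf (per the traceback):
    value = -shrinkage * sum_gradients / (sum_hessians + l2_regularization)
except ZeroDivisionError as e:
    print(e)  # float division by zero

# One possible guard (my assumption, not necessarily the right fix upstream):
denom = sum_hessians + l2_regularization
value = 0.0 if denom == 0.0 else -shrinkage * sum_gradients / denom

Zeroing the leaf value when the denominator vanishes is just one option; I don't know what the maintainers would prefer.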
Steps/Code to Reproduce
from sklearn.experimental import enable_hist_gradient_boosting  # noqa
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

parameters = {
    'learning_rate': (1, 0.1, 0.01),
    'max_iter': (100, 500, 1000),
    'max_leaf_nodes': (None, 31, 50),
    'tol': (1e-6, 1e-7, 1e-8)}
hgb = HistGradientBoostingClassifier(loss='binary_crossentropy', verbose=2)
clf = GridSearchCV(hgb, parameters, cv=5, verbose=2)
clf.fit(x, y)
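x and y are not defined in the snippet above; for a self-contained run, here is a sketch with synthetic data (make_classification is my stand-in for the real dataset, and I trimmed the grid to the single parameter combination that was running when it crashed):

from sklearn.datasets import make_classification
from sklearn.experimental import enable_hist_gradient_boosting  # noqa
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the real x, y (assumption: an easily separable binary task)
x, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Single combination taken from the verbose log at the time of the crash
parameters = {'learning_rate': (1,), 'max_iter': (500,),
              'max_leaf_nodes': (31,), 'tol': (1e-6,)}
hgb = HistGradientBoostingClassifier(loss='binary_crossentropy', verbose=2)
clf = GridSearchCV(hgb, parameters, cv=5, verbose=2)
clf.fit(x, y)  # should hit the same code path once the hessians reach zero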
Expected Results
Not to crash.
Actual Results
ZeroDivisionError Traceback (most recent call last)
<ipython-input-486-22ccf9e7ea94> in <module>
7 clf=GridSearchCV(hgb,parameters,cv=5,verbose=2)
8 # Order:
----> 9 clf.fit(x,y)
10 clf.best_estimator_
d:\programming\python\3.7.2\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
685 return results
686
--> 687 self._run_search(evaluate_candidates)
688
689 # For multi-metric evaluation, store the best_index_, best_params_ and
d:\programming\python\3.7.2\lib\site-packages\sklearn\model_selection\_search.py in _run_search(self, evaluate_candidates)
1146 def _run_search(self, evaluate_candidates):
1147 """Search all candidates in param_grid"""
-> 1148 evaluate_candidates(ParameterGrid(self.param_grid))
1149
1150
d:\programming\python\3.7.2\lib\site-packages\sklearn\model_selection\_search.py in evaluate_candidates(candidate_params)
664 for parameters, (train, test)
665 in product(candidate_params,
--> 666 cv.split(X, y, groups)))
667
668 if len(out) < 1:
d:\programming\python\3.7.2\lib\site-packages\joblib\parallel.py in __call__(self, iterable)
922 self._iterating = self._original_iterator is not None
923
--> 924 while self.dispatch_one_batch(iterator):
925 pass
926
d:\programming\python\3.7.2\lib\site-packages\joblib\parallel.py in dispatch_one_batch(self, iterator)
757 return False
758 else:
--> 759 self._dispatch(tasks)
760 return True
761
d:\programming\python\3.7.2\lib\site-packages\joblib\parallel.py in _dispatch(self, batch)
714 with self._lock:
715 job_idx = len(self._jobs)
--> 716 job = self._backend.apply_async(batch, callback=cb)
717 # A job can complete so quickly than its callback is
718 # called before we get here, causing self._jobs to
d:\programming\python\3.7.2\lib\site-packages\joblib\_parallel_backends.py in apply_async(self, func, callback)
180 def apply_async(self, func, callback=None):
181 """Schedule a func to be run"""
--> 182 result = ImmediateResult(func)
183 if callback:
184 callback(result)
d:\programming\python\3.7.2\lib\site-packages\joblib\_parallel_backends.py in __init__(self, batch)
547 # Don't delay the application, to avoid keeping the input
548 # arguments in memory
--> 549 self.results = batch()
550
551 def get(self):
d:\programming\python\3.7.2\lib\site-packages\joblib\parallel.py in __call__(self)
223 with parallel_backend(self._backend, n_jobs=self._n_jobs):
224 return [func(*args, **kwargs)
--> 225 for func, args, kwargs in self.items]
226
227 def __len__(self):
d:\programming\python\3.7.2\lib\site-packages\joblib\parallel.py in <listcomp>(.0)
223 with parallel_backend(self._backend, n_jobs=self._n_jobs):
224 return [func(*args, **kwargs)
--> 225 for func, args, kwargs in self.items]
226
227 def __len__(self):
d:\programming\python\3.7.2\lib\site-packages\sklearn\model_selection\_validation.py in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, return_n_test_samples, return_times, return_estimator, error_score)
512 estimator.fit(X_train, **fit_params)
513 else:
--> 514 estimator.fit(X_train, y_train, **fit_params)
515
516 except Exception as e:
d:\programming\python\3.7.2\lib\site-packages\sklearn\ensemble\_hist_gradient_boosting\gradient_boosting.py in fit(self, X, y)
255 min_samples_leaf=self.min_samples_leaf,
256 l2_regularization=self.l2_regularization,
--> 257 shrinkage=self.learning_rate)
258 grower.grow()
259
d:\programming\python\3.7.2\lib\site-packages\sklearn\ensemble\_hist_gradient_boosting\grower.py in __init__(self, X_binned, gradients, hessians, max_leaf_nodes, max_depth, min_samples_leaf, min_gain_to_split, max_bins, actual_n_bins, l2_regularization, min_hessian_to_split, shrinkage)
194 self.total_compute_hist_time = 0. # time spent computing histograms
195 self.total_apply_split_time = 0. # time spent splitting nodes
--> 196 self._intilialize_root(gradients, hessians, hessians_are_constant)
197 self.n_nodes = 1
198
d:\programming\python\3.7.2\lib\site-packages\sklearn\ensemble\_hist_gradient_boosting\grower.py in _intilialize_root(self, gradients, hessians, hessians_are_constant)
259 return
260 if sum_hessians < self.splitter.min_hessian_to_split:
--> 261 self._finalize_leaf(self.root)
262 return
263
d:\programming\python\3.7.2\lib\site-packages\sklearn\ensemble\_hist_gradient_boosting\grower.py in _finalize_leaf(self, node)
398 """
399 node.value = -self.shrinkage * node.sum_gradients / (
--> 400 node.sum_hessians + self.splitter.l2_regularization)
401 self.finalized_leaves.append(node)
402
ZeroDivisionError: float division by zero
Versions
System:
python: 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)]
executable: D:\Programming\Python\3.7.2\python.exe
machine: Windows-10-10.0.17763-SP0
BLAS:
macros:
lib_dirs:
cblas_libs: cblas
Python deps:
pip: 19.1.1
setuptools: 40.8.0
sklearn: 0.21.2
numpy: 1.16.2
scipy: 1.2.1
Cython: None
pandas: 0.24.2