Division by Zero running GridSearchCV with HistGradientBoostingClassifier #14014

@stevennic

Description


I tried running GridSearchCV on HistGradientBoostingClassifier, but it crashed with a ZeroDivisionError. I was running with verbose logging; the last status GridSearchCV reported before the crash was:

[CV]  learning_rate=1, max_iter=500, max_leaf_nodes=None, tol=1e-08, total=16.8min
[CV] learning_rate=1, max_iter=500, max_leaf_nodes=31, tol=1e-06 .....
Binning 0.015 GB of data: 0.089 s
Fitting gradient boosted rounds:
[1/500] 1 tree, 31 leaves, max depth = 8, in 0.042s
[2/500] 1 tree, 31 leaves, max depth = 9, in 0.058s
...
[489/500] 1 tree, 2 leaves, max depth = 1, in 0.016s
[490/500] 
<crash>

Steps/Code to Reproduce

from sklearn.experimental import enable_hist_gradient_boosting  # noqa
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import GridSearchCV  # was missing from the original snippet

parameters = {
    'learning_rate': (1, 0.1, 0.01),
    'max_iter': (100, 500, 1000),
    'max_leaf_nodes': (None, 31, 50),
    'tol': (1e-6, 1e-7, 1e-8)}
hgb = HistGradientBoostingClassifier(loss='binary_crossentropy', verbose=2)
clf = GridSearchCV(hgb, parameters, cv=5, verbose=2)
clf.fit(x, y)  # x, y: my binary classification data (not attached here)
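
For anyone without my data, the failing grid point can presumably be fit directly, skipping the search. A minimal sketch, assuming make_classification as a stand-in for my x and y (the verbose log above shows learning_rate=1, max_iter=500, max_leaf_nodes=31, tol=1e-06 was the candidate being evaluated when it crashed):

from sklearn.datasets import make_classification
from sklearn.experimental import enable_hist_gradient_boosting  # noqa
from sklearn.ensemble import HistGradientBoostingClassifier

# Synthetic stand-in for my x, y (assumption; the original data is not attached).
x, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# The single grid point GridSearchCV was evaluating at the time of the crash.
hgb = HistGradientBoostingClassifier(
    loss='binary_crossentropy', learning_rate=1, max_iter=500,
    max_leaf_nodes=31, tol=1e-6, verbose=2)
hgb.fit(x, y)  # may or may not hit the ZeroDivisionError on synthetic data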

Expected Results

Not to crash.

Actual Results

ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-486-22ccf9e7ea94> in <module>
      7 clf=GridSearchCV(hgb,parameters,cv=5,verbose=2)
      8 # Order:
----> 9 clf.fit(x,y)
     10 clf.best_estimator_

d:\programming\python\3.7.2\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
    685                 return results
    686 
--> 687             self._run_search(evaluate_candidates)
    688 
    689         # For multi-metric evaluation, store the best_index_, best_params_ and

d:\programming\python\3.7.2\lib\site-packages\sklearn\model_selection\_search.py in _run_search(self, evaluate_candidates)
   1146     def _run_search(self, evaluate_candidates):
   1147         """Search all candidates in param_grid"""
-> 1148         evaluate_candidates(ParameterGrid(self.param_grid))
   1149 
   1150 

d:\programming\python\3.7.2\lib\site-packages\sklearn\model_selection\_search.py in evaluate_candidates(candidate_params)
    664                                for parameters, (train, test)
    665                                in product(candidate_params,
--> 666                                           cv.split(X, y, groups)))
    667 
    668                 if len(out) < 1:

d:\programming\python\3.7.2\lib\site-packages\joblib\parallel.py in __call__(self, iterable)
    922                 self._iterating = self._original_iterator is not None
    923 
--> 924             while self.dispatch_one_batch(iterator):
    925                 pass
    926 

d:\programming\python\3.7.2\lib\site-packages\joblib\parallel.py in dispatch_one_batch(self, iterator)
    757                 return False
    758             else:
--> 759                 self._dispatch(tasks)
    760                 return True
    761 

d:\programming\python\3.7.2\lib\site-packages\joblib\parallel.py in _dispatch(self, batch)
    714         with self._lock:
    715             job_idx = len(self._jobs)
--> 716             job = self._backend.apply_async(batch, callback=cb)
    717             # A job can complete so quickly than its callback is
    718             # called before we get here, causing self._jobs to

d:\programming\python\3.7.2\lib\site-packages\joblib\_parallel_backends.py in apply_async(self, func, callback)
    180     def apply_async(self, func, callback=None):
    181         """Schedule a func to be run"""
--> 182         result = ImmediateResult(func)
    183         if callback:
    184             callback(result)

d:\programming\python\3.7.2\lib\site-packages\joblib\_parallel_backends.py in __init__(self, batch)
    547         # Don't delay the application, to avoid keeping the input
    548         # arguments in memory
--> 549         self.results = batch()
    550 
    551     def get(self):

d:\programming\python\3.7.2\lib\site-packages\joblib\parallel.py in __call__(self)
    223         with parallel_backend(self._backend, n_jobs=self._n_jobs):
    224             return [func(*args, **kwargs)
--> 225                     for func, args, kwargs in self.items]
    226 
    227     def __len__(self):

d:\programming\python\3.7.2\lib\site-packages\joblib\parallel.py in <listcomp>(.0)
    223         with parallel_backend(self._backend, n_jobs=self._n_jobs):
    224             return [func(*args, **kwargs)
--> 225                     for func, args, kwargs in self.items]
    226 
    227     def __len__(self):

d:\programming\python\3.7.2\lib\site-packages\sklearn\model_selection\_validation.py in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, return_n_test_samples, return_times, return_estimator, error_score)
    512             estimator.fit(X_train, **fit_params)
    513         else:
--> 514             estimator.fit(X_train, y_train, **fit_params)
    515 
    516     except Exception as e:

d:\programming\python\3.7.2\lib\site-packages\sklearn\ensemble\_hist_gradient_boosting\gradient_boosting.py in fit(self, X, y)
    255                     min_samples_leaf=self.min_samples_leaf,
    256                     l2_regularization=self.l2_regularization,
--> 257                     shrinkage=self.learning_rate)
    258                 grower.grow()
    259 

d:\programming\python\3.7.2\lib\site-packages\sklearn\ensemble\_hist_gradient_boosting\grower.py in __init__(self, X_binned, gradients, hessians, max_leaf_nodes, max_depth, min_samples_leaf, min_gain_to_split, max_bins, actual_n_bins, l2_regularization, min_hessian_to_split, shrinkage)
    194         self.total_compute_hist_time = 0.  # time spent computing histograms
    195         self.total_apply_split_time = 0.  # time spent splitting nodes
--> 196         self._intilialize_root(gradients, hessians, hessians_are_constant)
    197         self.n_nodes = 1
    198 

d:\programming\python\3.7.2\lib\site-packages\sklearn\ensemble\_hist_gradient_boosting\grower.py in _intilialize_root(self, gradients, hessians, hessians_are_constant)
    259             return
    260         if sum_hessians < self.splitter.min_hessian_to_split:
--> 261             self._finalize_leaf(self.root)
    262             return
    263 

d:\programming\python\3.7.2\lib\site-packages\sklearn\ensemble\_hist_gradient_boosting\grower.py in _finalize_leaf(self, node)
    398         """
    399         node.value = -self.shrinkage * node.sum_gradients / (
--> 400             node.sum_hessians + self.splitter.l2_regularization)
    401         self.finalized_leaves.append(node)
    402 

ZeroDivisionError: float division by zero
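
Looking at the last two frames: the grower finalizes the root as a leaf when sum_hessians is below min_hessian_to_split, and the leaf value then divides by node.sum_hessians + self.splitter.l2_regularization. Since l2_regularization defaults to 0 and I didn't set it, a hessian sum of exactly 0.0 (e.g. once boosting has converged, consistent with the 2-leaf trees near round 489) makes the denominator zero. A minimal sketch of the failing arithmetic; the names mirror the traceback, and the concrete values are my assumptions about the failing fold, not taken from the run:

# Paraphrase of _finalize_leaf in grower.py (values assumed for illustration).
shrinkage = 1.0           # learning_rate=1 from the failing grid point
sum_gradients = 0.0       # root node's gradient sum
sum_hessians = 0.0        # can reach exactly 0.0 once the model has converged
l2_regularization = 0.0   # scikit-learn's default, so nothing keeps the denominator > 0

value = -shrinkage * sum_gradients / (sum_hessians + l2_regularization)
# ZeroDivisionError: float division by zero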

Versions

System:
    python: 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)]
executable: D:\Programming\Python\3.7.2\python.exe
   machine: Windows-10-10.0.17763-SP0

BLAS:
    macros:
  lib_dirs:
cblas_libs: cblas

Python deps:
       pip: 19.1.1
setuptools: 40.8.0
   sklearn: 0.21.2
     numpy: 1.16.2
     scipy: 1.2.1
    Cython: None
    pandas: 0.24.2
