Following Bergstra NIPS 2011 (@jaberg for those in the know), it would be great to have a GridSearch-like object implementing Gaussian-process-based hyper-parameter optimization.
To keep things simple, we would optimize only over continuous parameters, i.e. in a hyper-cube. The API would specify an initial bounding box, and also whether each parameter should be varied in the log-domain or in the linear-domain.
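As a rough illustration of what that bounding-box spec could look like (the `(low, high, domain)` tuple format and the helper below are hypothetical, not an existing API), each parameter could be mapped to the unit cube according to its domain:

```python
import numpy as np

# Hypothetical search-space spec: (low, high, domain) per parameter.
param_space = {
    "C": (1e-3, 1e3, "log"),          # varied in the log-domain
    "gamma": (1e-4, 1e1, "log"),      # varied in the log-domain
    "epsilon": (1e-2, 1.0, "linear"), # varied in the linear-domain
}

def to_unit_cube(values, space):
    """Map raw parameter values into [0, 1]^d, honoring each domain."""
    out = []
    for v, (low, high, domain) in zip(values, space.values()):
        if domain == "log":
            out.append(np.log(v / low) / np.log(high / low))
        else:
            out.append((v - low) / (high - low))
    return np.array(out)
```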
Some notes from a discussion with James Bergstra:
- The algorithm goes roughly as follows (a minimal sketch in code follows this list):
  1. Compute the score on a set of random points in the specified hyper-cube.
  2. For i in budget:
     - 2.a fit a Gaussian process to the (parameters, score) pairs;
     - 2.b find the parameters that maximize the expectation of getting a lower score than the best currently available. For this, use a black-box scipy optimizer, such as simulated annealing.
- A lot of the gain comes from choosing the points sequentially, so, as we don't have a queue mechanism, we should do this sequentially for now. Parallelism can still be used in the internal cross-validation, or to compute the initial points.
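A minimal sketch of that loop, assuming a scalar score to minimize over a box of continuous parameters. `GaussianProcessRegressor`, `Matern`, and `dual_annealing` are stand-ins I picked for the GP and the black-box scipy optimizer; the name `gp_minimize` and its signature are made up for illustration:

```python
import numpy as np
from scipy.optimize import dual_annealing
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def gp_minimize(objective, bounds, n_init=5, budget=20, random_state=0):
    rng = np.random.RandomState(random_state)
    bounds = np.asarray(bounds, dtype=float)

    # 1. Score a set of random points in the specified hyper-cube.
    X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_init, len(bounds)))
    y = np.array([objective(x) for x in X])

    for _ in range(budget):
        # 2.a Fit a Gaussian process to the (parameters, score) pairs.
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                      normalize_y=True).fit(X, y)
        y_best = y.min()

        def neg_expected_improvement(x):
            # Expectation of improving on the best score so far,
            # under the GP posterior at x.
            mu, sigma = gp.predict(x.reshape(1, -1), return_std=True)
            mu, sigma = mu[0], max(sigma[0], 1e-12)
            z = (y_best - mu) / sigma
            return -((y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z))

        # 2.b Maximize expected improvement with a black-box scipy
        # optimizer (dual_annealing is a simulated-annealing variant).
        res = dual_annealing(neg_expected_improvement, bounds,
                             maxiter=100, seed=rng)
        X = np.vstack([X, res.x])
        y = np.append(y, objective(res.x))

    return X[y.argmin()], y.min()
```

Called e.g. as `gp_minimize(lambda x: np.sum((x - 1.0) ** 2), bounds=[(-5, 5), (-5, 5)])`. Keeping the loop strictly sequential matches the point above: each new candidate depends on the GP fitted to all previous evaluations.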
Remark: to compute the expectation of (X, y) with a Gaussian process, we should override score to use something more clever than the standard LinearRegression score.
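One possible replacement, sketched under the assumption that the GP exposes its posterior mean and standard deviation (as `GaussianProcessRegressor.predict(..., return_std=True)` does): score held-out points by their predictive log-density rather than by R². The helper name below is illustrative, not existing code:

```python
import numpy as np
from scipy.stats import norm

def predictive_log_likelihood(gp, X, y):
    # Hypothetical replacement for the default R^2 score: the mean
    # log-density of the observed scores under the GP posterior.
    mu, sigma = gp.predict(X, return_std=True)
    return norm.logpdf(y, loc=mu, scale=np.maximum(sigma, 1e-12)).mean()
```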