Description
Describe the workflow you want to enable
I find the current experimental implementation of HalvingGridSearchCV problematic. In the first rounds it tends to select candidates whose hyperparameters are adapted to small sample sizes and that turn out to be bad choices in hindsight, when it's too late to revisit them. Think of regularization strength, tree depth, number of leaves, etc. This is a problem with CV in general, but the 4/5 or 9/10 of the samples used for training in ordinary CV is a far cry from the roughly #samples / #candidates available per candidate in the first halving round.
Describe your proposed solution
I have the following suggestion, although TBH I haven't thoroughly thought it through: take a splitter as usual, and at each iteration of the splitter reduce the number of candidates, say by a factor of 2 or 3. For example, start with cv=5 and 100 candidates, fit them on folds 2-5 and compute scores on fold 1, discard the worse half of the candidates, then proceed to the next split with fold 2 as the test fold and the 50 remaining candidates, and so on. It obviously requires more resources than the current implementation, since every surviving candidate is always fit on cv-1 folds, but the candidates selected early would be better adapted to the later rounds; see the sketch below.
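A minimal sketch of what I have in mind, assuming a plain KFold splitter and a fixed elimination factor. The name fold_elimination_search is hypothetical, not existing scikit-learn API; this is just to illustrate the scheme, not a definitive implementation:

```python
# Hypothetical sketch of per-fold candidate elimination (not sklearn API).
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold, ParameterGrid


def fold_elimination_search(estimator, param_grid, X, y, cv=5, factor=2):
    """Fit all surviving candidates on each training split (cv-1 folds),
    score them on the held-out fold, and keep the top 1/factor."""
    candidates = list(ParameterGrid(param_grid))
    splitter = KFold(n_splits=cv, shuffle=True, random_state=0)
    for train_idx, test_idx in splitter.split(X):
        scores = [
            clone(estimator)
            .set_params(**params)
            .fit(X[train_idx], y[train_idx])
            .score(X[test_idx], y[test_idx])
            for params in candidates
        ]
        # Keep the best-scoring 1/factor of the candidates (at least one).
        n_keep = max(1, len(candidates) // factor)
        keep = np.argsort(scores)[::-1][:n_keep]
        candidates = [candidates[i] for i in keep]
    return candidates[0]


# Usage: with cv=5 and factor=2, 100 candidates shrink 100 -> 50 -> 25
# -> 12 -> 6 -> 3 over the five folds.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=500, noise=10.0, random_state=0)
best_params = fold_elimination_search(
    Ridge(), {"alpha": np.logspace(-3, 3, 100)}, X, y, cv=5, factor=2
)
```

Unlike HalvingGridSearchCV, every candidate is always evaluated on full-size training sets, which is exactly what makes it more expensive but less biased toward small-sample hyperparameters.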
Describe alternatives you've considered, if relevant
Implementing the above on top of GridSearchCV.
Additional context
No response