Allow LogisticRegression with lbfgs solver to control maxfun parameter of solver #27484


Open
vascosmota opened this issue Sep 27, 2023 · 12 comments

@vascosmota

Describe the workflow you want to enable

Similarly to what is mentioned on #9273

Training an MLP regressor (or classifier) using l-bfgs currently cannot run for more than (approx) 15000 iterations.
This artificial limit is caused by the call site to l-bfgs passing the MLP argument "max_iter" to the solver's "maxfun" argument (maximum number of function evaluations) but not to "maxiter" (maximum number of iterations), so no matter how large a "max_iter" value you pass to MLP, the iterations are capped by the default value of "maxiter" (15000).

Training a LogisticRegression classifier using l-bfgs currently cannot perform more than (approx) 15000 function evaluations.
This artificial limit is caused by the call site to l-bfgs passing the LogisticRegression argument "max_iter" to the solver's "maxiter" argument (maximum number of iterations) but not to "maxfun" (maximum number of function evaluations), so no matter how large a "max_iter" value you pass to LogisticRegression, the function evaluations are capped by the default value of "maxfun" (15000, defined at https://github.com/scipy/scipy/blob/bf776169c753fff655200dc15ae26db95a083b02/scipy/optimize/_lbfgsb_py.py#L212).
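The cap can be reproduced directly against scipy's optimizer, independently of scikit-learn (a minimal sketch; the Rosenbrock function is just a stand-in objective):

```python
import numpy as np
from scipy.optimize import minimize

def rosen(x):
    # Classic Rosenbrock function, used here only as an illustrative objective.
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)

x0 = np.zeros(8)

# A huge maxiter does not help: the run still stops once the number of
# function evaluations exceeds maxfun (set low here to make the effect
# visible; scipy's default for both options is 15000).
res = minimize(rosen, x0, method="L-BFGS-B",
               options={"maxiter": 10**6, "maxfun": 20})
print(res.success, res.nfev, res.message)
```

The run terminates unsuccessfully with a "evaluations exceeds limit" style message long before maxiter is reached, which is exactly what happens to LogisticRegression when maxfun is left at its default.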

Describe your proposed solution

When calling the l-bfgs solver, set maxfun to the same value as maxiter, or allow the user to control it.
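A hedged sketch of the proposed change at the call site (the function name below is illustrative, not the actual scikit-learn internals): forward max_iter to scipy's maxfun option as well as maxiter.

```python
import numpy as np
from scipy.optimize import minimize

def _fit_lbfgs(loss_grad, w0, max_iter):
    # Illustrative stand-in for the LogisticRegression call site.
    # Before the fix only "maxiter" was set, leaving "maxfun" at scipy's
    # default of 15000; setting both lets max_iter control the solver fully.
    return minimize(loss_grad, w0, method="L-BFGS-B", jac=True,
                    options={"maxiter": max_iter, "maxfun": max_iter})

# Tiny smoke test on a convex quadratic returning a (loss, gradient) pair.
res = _fit_lbfgs(lambda w: (w @ w, 2.0 * w), np.ones(3), max_iter=100)
print(res.success, res.x)
```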

As stated on #9274 by @daniel-perry,

Ideally you would want to pass in both a 'max_iter' and 'max_fun' argument to MLP, however 'max_fun' doesn't make sense for anything but l-bfgs, so using 'max_iter' to control both seems a reasonable compromise.

The same rationale could be applied here.

Describe alternatives you've considered, if relevant

No response

Additional context

No response

@vascosmota vascosmota added Needs Triage Issue requires triage New Feature labels Sep 27, 2023
@KartikeyBartwal

Starting to work on this feature.

@KartikeyBartwal

KartikeyBartwal commented Oct 1, 2023

Please share the piece of code closest to your problem statement.

@glemaitre glemaitre added Enhancement and removed New Feature Needs Triage Issue requires triage labels Oct 6, 2023
@glemaitre
Member

@lorentzenchr @ogrisel I assume that it would make sense to be able to control these low-level parameters sometimes. I am a bit skeptical about adding yet another parameter because it will be solver dependent. However, it could be handy to have a generic parameter for passing additional options specific to the underlying solver.

Do you have any thoughts on this?

@lorentzenchr
Member

Could we please first merge #26721 (needs a 2nd reviewer) and then see if we still need control over maxfun. I'm not opposed to it, just first things first.

@glemaitre
Member

@lorentzenchr I'll review today.

@lorentzenchr
Member

@vascosmota Could you provide a minimal example?

@vascosmota
Author

I am working on creating a minimal example, but it is proving harder than expected. I will update as soon as I have one.

@lorentzenchr
Member

+1 for this as I stumbled over a case where I need to set a higher maxfun.

@lorentzenchr
Member

The question is a bit: Should we add a maxfun parameter to estimators using lbfgs, or should we add a solver_options argument?

@afbarnard

Just started fixing this myself.

Design-wise, I would opt for passing an optimizer to fit and/or __init__. The idea is that the user has complete freedom to set up an optimizer of their choice and then they combine it with a model in order to do the fitting. However, this is not how scikit-learn has been put together.

The option that aligns most with the code that already exists is to just add a max_f_evals argument (or to just set maxfun to max_iter). Whether this or a general solver_options argument would be better seems (to me) to depend on a survey of the available optimizers to find out what other parameters users might want to control. However, reducing all the solver-related arguments to the arguments solver and solver_options seems like a good general solution short of passing an optimizer directly. This approach seems like large-scale design work, though, in that it would affect the constructor of every model that uses an optimizer for fitting. So, in that sense, are there short- and long-term solutions (small and big solutions) to be considered here? (Is there someone designated to make these calls?)
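As a hypothetical illustration of the solver_options idea being discussed (this API does not exist in scikit-learn; the class and names below are made up for the sketch):

```python
import numpy as np
from scipy.optimize import minimize

class TinyLbfgsEstimator:
    """Toy estimator showing how a generic solver_options dict could be
    forwarded verbatim to the underlying optimizer."""

    def __init__(self, max_iter=100, solver_options=None):
        self.max_iter = max_iter
        self.solver_options = solver_options or {}

    def fit(self, loss_grad, w0):
        # Options derived from the estimator's own parameters are extended
        # (or overridden) by whatever the user supplied in solver_options.
        options = {"maxiter": self.max_iter, **self.solver_options}
        self.result_ = minimize(loss_grad, w0, method="L-BFGS-B", jac=True,
                                options=options)
        return self

# Usage: raise the function-evaluation cap without a dedicated parameter.
est = TinyLbfgsEstimator(max_iter=200, solver_options={"maxfun": 50_000})
est.fit(lambda w: (w @ w, 2.0 * w), np.ones(4))
print(est.result_.success)
```

The appeal of this shape is that solver-specific knobs never leak into the estimator signature; the drawback, as noted above, is that the contents of solver_options are only meaningful for a particular solver.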

@afbarnard

Taking the design work done by the Optim.jl people as a proxy for surveying the available SciPy optimizers myself, they have the following standard options: x_tol, f_tol, g_tol, f_calls_limit, g_calls_limit, h_calls_limit, allow_f_increases, iterations, store_trace, show_trace, extended_trace, trace_simplex, show_every, callback, time_limit. See https://julianlsolvers.github.io/Optim.jl/stable/#user/config/#general-options.

@afbarnard

afbarnard commented Jan 17, 2024

Example fixes:

(I've ignored making the same changes to LogisticRegressionCV for now.)

Status: Needs decision

5 participants