Benchmark linear models in higher dimensions #7
Comments
@ogrisel So your suggestion is to keep the ratio …
I do not think DAAL allows …
Also, could you compare against …
Ok, let's close this issue then, as there is nothing to do on DAAL's side. One may argue that users still want to run linear regression in those regimes. For reference, a user also reported a related performance problem in scikit-learn: scikit-learn/scikit-learn#13923. I opened scikit-learn/scikit-learn#14268 and scikit-learn/scikit-learn#14269 on scikit-learn's side.
adding logistic regression and fixing indents in generated code
The current benchmarks only use 50 features for 1e6 samples. I would argue that this is not a case where one would use a linear model, as it would under-fit, and the same test accuracy could probably be reached much faster with 1e3 data points instead of 1e6, yielding a speedup on the order of 1000x.
It would therefore be more interesting to benchmark linear regression, ridge regression and logistic regression in regimes on the order of 1e3 to 1e5 features.
In particular, Ridge regression is likely to be most useful in cases where n_features >> n_samples; otherwise, linear regression (no penalty) is likely to give the same result.
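For illustration only, here is a minimal sketch of the kind of benchmark being suggested, timing linear, ridge and logistic regression in a wide (n_features >> n_samples) regime. The dataset sizes, solvers and the use of plain scikit-learn estimators rather than the project's own benchmark harness are all assumptions, and the dimensions are scaled down so the script runs quickly.

```python
# Sketch of a high-dimensional linear-model benchmark (assumed setup, not
# the project's actual benchmark code). Sizes are scaled down for speed.
from time import perf_counter

from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression, Ridge


def time_fit(estimator, X, y):
    """Return the wall-clock time of a single fit call."""
    tic = perf_counter()
    estimator.fit(X, y)
    return perf_counter() - tic


# n_features >> n_samples regime, where ridge regularization matters most.
n_samples, n_features = 1_000, 5_000

# Regression problems: plain least squares vs. ridge.
X, y = make_regression(n_samples=n_samples, n_features=n_features,
                       noise=1.0, random_state=0)
print("LinearRegression:", time_fit(LinearRegression(), X, y))
print("Ridge(alpha=1.0):", time_fit(Ridge(alpha=1.0), X, y))

# Classification problem: logistic regression.
Xc, yc = make_classification(n_samples=n_samples, n_features=n_features,
                             n_informative=50, random_state=0)
print("LogisticRegression:",
      time_fit(LogisticRegression(max_iter=1000), Xc, yc))
```

A fuller benchmark would sweep n_features over roughly 1e3 to 1e5, compare solvers, and record test accuracy alongside fit time, but the structure would stay the same as above.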