LogisticRegressionCV not compatible with LeaveOneGroupOut #8950
Comments
Yes, sigh. A workaround for now, as long as you don't need this in a nested
context, is to pass a list of (train, test) index pairs directly to cv...
I hope.
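A minimal sketch of that workaround, reusing the toy data from the report quoted below. It assumes `LogisticRegressionCV` accepts an iterable of (train, test) index pairs for `cv`, as its documentation describes; the name `splits` is only illustrative. The groups are consumed up front by calling `split` ourselves, so `fit` never needs them.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import LeaveOneGroupOut

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([1, 2, 1, 2])
groups = np.array([1, 1, 2, 2])

# Materialise the folds ourselves, so that groups is used here
# rather than inside LogisticRegressionCV.fit.
splits = list(LeaveOneGroupOut().split(X, y, groups))

# Pass the precomputed (train, test) index pairs instead of the splitter.
clf = LogisticRegressionCV(cv=splits)
clf.fit(X, y)
```

Because the folds are fixed index arrays, this only helps when the same X and y are passed to `fit`. In a nested setting (for example inside another cross-validation loop or `OneVsRestClassifier` with resampled data) the indices would no longer line up, which is the caveat mentioned above.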
This is another request for sample properties etc. (#4696). It perhaps also
suggests, from the perspective of maintainability, that we should move away
from specialised CV objects, in preference for something like *SearchCV
with use_warm_start (#8230).
On 29 May 2017 11:10 am, "Shudong Hao" ***@***.***> wrote:
In LogisticRegressionCV, the documentation says the argument cv can be
a cross-validation object from model_selection. However, when I pass
LeaveOneGroupOut as that argument, fit raises an error. I read the source
code, and I think the problem is that LogisticRegressionCV calls
the split method with only X and y, but LeaveOneGroupOut requires an
additional argument, groups, and there is nowhere to pass it.
This also causes problems when I use LogisticRegressionCV as the base
estimator for something like OneVsRestClassifier.
Here's a small example:
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.linear_model import LogisticRegressionCV
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([1, 2, 1, 2])
groups = np.array([1, 1, 2, 2])
logo = LeaveOneGroupOut()
clf = LogisticRegressionCV(cv=logo)
# fit has no way to receive groups, so this call raises the error below
clf.fit(X, y)
Error message:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/sklearn/linear_model/logistic.py", line 1581, in fit
    folds = list(cv.split(X, y))
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/sklearn/model_selection/_split.py", line 91, in split
    for test_index in self._iter_test_masks(X, y, groups):
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/sklearn/model_selection/_split.py", line 778, in _iter_test_masks
    raise ValueError("The groups parameter should not be None")
ValueError: The groups parameter should not be None
------------------------------
Darwin-15.6.0-x86_64-i386-64bit
Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 23 2015, 02:52:03)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
NumPy 1.11.1
SciPy 0.17.1
Scikit-Learn 0.18.1
The same here for GroupKFold and Gridsearch. The error I get is