LogisticRegressionCV not compatible with LeaveOneGroupOut #8950


Closed
ghost opened this issue May 29, 2017 · 3 comments · Fixed by #26525

Comments


ghost commented May 29, 2017

In LogisticRegressionCV, the documentation says the cv argument can be a cross-validation object from model_selection. However, when I pass LeaveOneGroupOut, fit raises an error. I read the source code, and I think the problem is that LogisticRegressionCV calls the split method with only X and y, whereas LeaveOneGroupOut requires an additional groups argument, and there is nowhere to pass it. The same problem comes up when I use LogisticRegressionCV as the base estimator for something like OneVsRestClassifier.

Here's a small example:

import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.linear_model import LogisticRegressionCV

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([1, 2, 1, 2])
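groups = np.array([1, 1, 2, 2])  # defined, but fit() has no parameter to receive it
logo = LeaveOneGroupOut()

clf = LogisticRegressionCV(cv=logo)
clf.fit(X, y)  # raises ValueError: cv.split(X, y) is called without groups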

Error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/sklearn/linear_model/logistic.py", line 1581, in fit
    folds = list(cv.split(X, y))
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/sklearn/model_selection/_split.py", line 91, in split
    for test_index in self._iter_test_masks(X, y, groups):
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/sklearn/model_selection/_split.py", line 778, in _iter_test_masks
    raise ValueError("The groups parameter should not be None")
ValueError: The groups parameter should not be None

Darwin-15.6.0-x86_64-i386-64bit
Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 23 2015, 02:52:03)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
NumPy 1.11.1
SciPy 0.17.1
Scikit-Learn 0.18.1
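
A possible workaround, offered as a sketch rather than a maintainer-endorsed fix: compute the group-aware folds yourself, where groups can be passed to split, and hand the resulting list of (train, test) index pairs to cv. This assumes LogisticRegressionCV accepts an iterable of precomputed splits the way other scikit-learn CV estimators do; the variable name splits is only illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.model_selection import LeaveOneGroupOut

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([1, 2, 1, 2])
groups = np.array([1, 1, 2, 2])

# Build the folds up front, where the groups array can be supplied.
splits = list(LeaveOneGroupOut().split(X, y, groups))

# Assumption: cv takes an iterable of (train, test) index arrays,
# so fit() no longer needs to know about groups at all.
clf = LogisticRegressionCV(cv=splits)
clf.fit(X, y)
```

The folds are fixed to this particular X, so they have to be recomputed whenever the training data changes, which also makes this awkward inside a wrapper such as OneVsRestClassifier.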

jnothman (Member) commented May 29, 2017 via email


anselal commented Sep 15, 2017

Same here with GroupKFold and GridSearchCV. The error I get is:

    for train, test in super(_BaseKFold, self).split(X, y, groups):
  File "F:\Development\IDE\Python35-keras2-tf-gpu\lib\site-packages\sklearn\model_selection\_split.py", line 91, in split
    for test_index in self._iter_test_masks(X, y, groups):
  File "F:\Development\IDE\Python35-keras2-tf-gpu\lib\site-packages\sklearn\model_selection\_split.py", line 103, in _iter_test_masks
    for test_index in self._iter_test_indices(X, y, groups):
  File "F:\Development\IDE\Python35-keras2-tf-gpu\lib\site-packages\sklearn\model_selection\_split.py", line 475, in _iter_test_indices
    raise ValueError("The groups parameter should not be None")
ValueError: The groups parameter should not be None


mgbckr commented Mar 26, 2020

LogisticRegressionCV just lacks the groups parameter in its fit method (and the corresponding implementation). GridSearchCV actually works fine if you pass the groups via fit parameters, i.e., .fit(X, y, groups=groups). So, we probably just need to copy that functionality from GridSearchCV.
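
For reference, the GridSearchCV behaviour described above might look like the following minimal sketch. It reuses the toy data from the original report; the C grid is an arbitrary illustration, not something taken from this thread.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, LeaveOneGroupOut

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([1, 2, 1, 2])
groups = np.array([1, 1, 2, 2])

# Illustrative grid over the regularization strength C.
param_grid = {"C": [0.1, 1.0, 10.0]}

search = GridSearchCV(LogisticRegression(), param_grid, cv=LeaveOneGroupOut())

# GridSearchCV.fit forwards groups to cv.split, so the splitter gets what it needs.
search.fit(X, y, groups=groups)
```

Omitting groups=groups here would reproduce the "The groups parameter should not be None" error reported earlier in the thread.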
