extend StratifiedKFold to float for regression #4757

@RNAer


It is important to stratify the samples according to y for cross-validation in regression models; otherwise, the training and validation sets may cover very different ranges of y. However, the current StratifiedKFold does not handle float targets:


$ x = sklearn.cross_validation.StratifiedKFold(np.random.random(9), 2)
/anaconda/envs/py3/lib/python3.4/site-packages/sklearn/cross_validation.py:417: Warning: The least populated class in y has only 1 members, which is too few. The minimum number of labels for any class cannot be less than n_folds=2.
  % (min_labels, self.n_folds)), Warning)

$ list(x)
[(array([], dtype=int64), array([0, 1, 2, 3, 4, 5, 6, 7, 8])),
 (array([0, 1, 2, 3, 4, 5, 6, 7, 8]), array([], dtype=int64))]

In case I am missing something: is there any reason why StratifiedKFold does not work properly for float targets?
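To make the request concrete, here is a minimal sketch of the kind of splitting I have in mind. The warning above suggests that every distinct float value is treated as its own class, so each "class" has a single member. The helper below, `stratified_regression_folds`, is a hypothetical name and not part of scikit-learn; it ranks the samples by y and deals them out to the folds round-robin, so that each fold spans roughly the full range of y. It uses only the standard library and is an illustration under those assumptions, not a proposed implementation.

```python
import random


def stratified_regression_folds(y, n_folds):
    """Yield (train_indices, test_indices) pairs with y spread across folds.

    Hypothetical helper for illustration; not a scikit-learn API.
    """
    if n_folds < 2 or n_folds > len(y):
        raise ValueError("n_folds must be between 2 and len(y)")

    # Rank the samples by their target value.
    order = sorted(range(len(y)), key=lambda i: y[i])

    # Deal the ranked samples out round-robin, so each fold receives one
    # sample from every consecutive block of n_folds neighbouring y values.
    fold_of = [0] * len(y)
    for rank, idx in enumerate(order):
        fold_of[idx] = rank % n_folds

    for k in range(n_folds):
        test = [i for i in range(len(y)) if fold_of[i] == k]
        train = [i for i in range(len(y)) if fold_of[i] != k]
        yield train, test


random.seed(0)
y = [random.random() for _ in range(9)]
folds = list(stratified_regression_folds(y, 2))
for train, test in folds:
    print(train, test)
```

With 9 samples and 2 folds this gives one test fold of 5 samples and one of 4, and neither side of any split is empty. Another option would be to bin y into quantiles first and pass the integer bin labels to StratifiedKFold, at the cost of choosing the number of bins.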
