More scoring flexibility for SelectKBest / SelectPercentile

# Issue 1: Allow scoring function to be a function that does not need to return pvalues

`SelectKBest` and `SelectPercentile` both require that `score_func` return a pair of arrays `(scores, pvalues)`. This rules out functions that return only a single array of per-feature scores, such as
- [mutual_info_regression](http://scikit-learn.org/dev/modules/generated/sklearn.feature_selection.mutual_info_regression.html#sklearn.feature_selection.mutual_info_regression)
- [mutual_info_classif](http://scikit-learn.org/dev/modules/generated/sklearn.feature_selection.mutual_info_classif.html#sklearn.feature_selection.mutual_info_classif)
- `lambda X: np.var(X, axis=0)` (in this case, `y` is `None`)

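To make the limitation concrete, here is a minimal sketch of an adapter that lets a scores-only function satisfy the `(scores, pvalues)` contract described above. The names `with_dummy_pvalues` and `variance_score` are hypothetical and not part of scikit-learn, and filling the p-values with NaN is only an assumption about what a placeholder could look like.

```python
import numpy as np


def with_dummy_pvalues(score_only_func):
    """Adapt a scores-only function to return a (scores, pvalues) pair.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    def score_func(X, y=None):
        scores = np.asarray(score_only_func(X, y), dtype=float)
        # No p-values exist for this scorer, so use NaN placeholders.
        pvalues = np.full_like(scores, np.nan)
        return scores, pvalues
    return score_func


def variance_score(X, y=None):
    # Unsupervised score: the variance of each column; `y` is ignored.
    return np.var(X, axis=0)


X = np.array([[0.0, 1.0, 10.0],
              [0.0, 2.0, 20.0],
              [0.0, 3.0, 30.0]])
scores, pvalues = with_dummy_pvalues(variance_score)(X)
```

Accepting a single array directly in `SelectKBest` and `SelectPercentile` would make this kind of adapter unnecessary, which is what this issue asks for.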
# Issue 2: Make a wrapper around functions that score an individual feature

Requested in [this comment](https://github.com/scikit-learn/scikit-learn/issues/6673#issuecomment-213856669).

This would be useful for certain scipy correlation functions, such as 
- [spearmanr](http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.spearmanr.html)
- [pearsonr](http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.pearsonr.html)

For example:

```
def make_feature_scorer(feature_scorer):
    """A wrapper around functions that return a score for each feature.

    `feature_scorer` should be a function that takes (X, y) as input, where
    y is allowed to be None.

    Examples
    --------
    >>> from sklearn.feature_selection import make_feature_scorer, SelectKBest
    >>> from scipy.stats import spearmanr
    >>> from sklearn.datasets import make_classification
    >>>
    >>> X, y = make_classification(random_state=0)
    >>> skb = SelectKBest(make_feature_scorer(spearmanr), k=10)
    >>> skb.fit(X, y)  # Calculates spearmanr for each feature in `X`
    >>> new_X = skb.transform(X)  # Selects the k=10 features with the greatest spearmanr
    """
```

Note that `make_feature_scorer` also needs to handle functions whose score can be negative. For example, [kendalltau](http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.kendalltau.html#scipy.stats.kendalltau) returns values in the range `[-1, 1]`, where `0` means `X` has no correlation with the response `y`. If the raw values were ranked as-is, a strongly negatively correlated feature would score below an uncorrelated one.
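One possible shape for the wrapper is sketched below. It rests on assumptions that are not settled in this issue: `feature_scorer` is called once per column as `feature_scorer(x, y)`, it returns either a scalar or a tuple whose first two entries are `(score, pvalue)`, and signed scores are ranked by absolute value. The `use_abs` parameter and the `pearson` helper are hypothetical and exist only for this illustration.

```python
import numpy as np


def make_feature_scorer(feature_scorer, use_abs=True):
    """Sketch: turn a per-feature scorer into an (X, y) -> (scores, pvalues) function.

    Assumes feature_scorer(x, y) returns a scalar score or a tuple starting
    with (score, pvalue). `use_abs` is a hypothetical option, not a settled API.
    """
    def score_func(X, y=None):
        X = np.asarray(X)
        scores, pvalues = [], []
        for j in range(X.shape[1]):
            result = feature_scorer(X[:, j], y)
            if isinstance(result, tuple):
                score, pvalue = result[0], result[1]
            else:
                # Scalar result: no p-value available.
                score, pvalue = result, np.nan
            scores.append(score)
            pvalues.append(pvalue)
        scores = np.asarray(scores, dtype=float)
        if use_abs:
            # Rank by strength of association, ignoring its sign.
            scores = np.abs(scores)
        return scores, np.asarray(pvalues, dtype=float)
    return score_func


def pearson(x, y):
    # Signed correlation in [-1, 1]; stands in for a SciPy scorer here.
    return np.corrcoef(x, y)[0, 1]


y = np.array([1.0, 2.0, 3.0, 4.0])
X = np.column_stack([y, -y, [1.0, -1.0, -1.0, 1.0]])
scores, pvalues = make_feature_scorer(pearson)(X, y)
signed_scores, _ = make_feature_scorer(pearson, use_abs=False)(X, y)
```

With `use_abs=True`, the perfectly anti-correlated second column ties with the first instead of ranking last. Whether taking the absolute value should be the default, or left to the caller, is a design question for the actual implementation.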

