Make KNeighborsClassifier.predict and KNeighborsRegressor.predict react the same way to X=None #29722

@dkobak

Describe the workflow you want to enable

Currently, KNeighborsRegressor.predict() accepts None as input, in which case it returns predictions for all samples in the training set, computed from each sample's nearest neighbors excluding the sample itself (consistent with the behavior of NearestNeighbors). However, KNeighborsClassifier.predict() does not accept None as input. This is inconsistent and should arguably be harmonized:

from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor, NearestNeighbors
import numpy as np

X = np.random.normal(size=(10, 5))
y = np.random.normal(size=(10, 1))

knn = NearestNeighbors(n_neighbors=3)
knn.fit(X)
knn.kneighbors() # works

knn = KNeighborsRegressor(n_neighbors=3)
knn.fit(X, y)
knn.predict(None)  # works (NB: None must be passed explicitly; predict() with no argument fails)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, np.ravel(y) > 0)
knn.predict(None) # fails with an error

Describe your proposed solution

My proposed solution is to make KNeighborsClassifier.predict(None) behave the same as KNeighborsRegressor.predict(None). As explained in #27747, the necessary fix requires changing only two lines of code.

Additional context

As explained in #27747, this would be a great feature: super useful and convenient for computing LOOCV accuracy simply via score(None, y). Using score(X, y), where X is the training set passed to fit(), gives a biased result because each training sample is included among its own neighbors.
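To illustrate the bias, here is a minimal sketch of how the leave-one-out accuracy can be obtained today without predict(None). It relies on kneighbors() called with no argument, which queries the training set and leaves each sample out of its own neighbor list. The toy data, the binary 0/1 labels, the manual majority vote, and the variable names are assumptions made for this example. It is a workaround, not the proposed fix.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Toy data (assumption for illustration): binary labels from the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = (X[:, 0] > 0).astype(int)

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# kneighbors() without X returns, for every training sample, its neighbors
# among the *other* training samples (the sample itself is excluded).
_, idx = clf.kneighbors()

# Manual majority vote; with k=3 and binary labels there are no ties.
loo_pred = (y[idx].sum(axis=1) >= 2).astype(int)
loo_acc_fast = np.mean(loo_pred == y)

# Reference value: explicit leave-one-out cross-validation (refits n times).
loo_acc_cv = cross_val_score(
    KNeighborsClassifier(n_neighbors=3), X, y, cv=LeaveOneOut()
).mean()

# In-sample accuracy: each sample counts itself among its own neighbors.
in_sample_acc = clf.score(X, y)

print(loo_acc_fast, loo_acc_cv, in_sample_acc)
```

The first two numbers should agree, while the in-sample value is computed with every sample voting for itself. With predict(None) supported on the classifier, the manual vote would reduce to a single score(None, y) call.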
