MAINT Move DistanceMetric under metrics #21177
Merged
jeremiedbb merged 20 commits into scikit-learn:main from jjerphan:move-dist-metric-under-metrics on Oct 8, 2021
20 commits
6cf1521 jjerphan: MAINT Move DistanceMetric under metrics
2e0fff9 jjerphan: Add whats_new entry
ec7ca1b jjerphan: Add a test for the deprecation cycle
4dbe651 jjerphan: Add a space to make Sphinx happy
a3e03cf jjerphan: Apply suggestions from review
e31e2f8 jjerphan: Merge branch 'main' into move-dist-metric-under-metrics
b6e54ba jjerphan: DOC Fix formatting in doc/whats_new/v1.1.rst
cd3cd5d jjerphan: Fix formatting
cb8223b lorentzenchr: FIX out of bound error in split_indices (#21130)
7cc80df he7d3r: DOC Remove unused import from example (#21253)
b1223e7 thomasjpfan: MAINT Enable and run black on examples (#20502)
12f46cc Pinky-Chaudhary: DOC Ensures that SplineTransformer passes numpydoc validation (#21248)
e11e820 thomasjpfan: BLD Fixes osx build by downgrading to 11.X (#21227)
ff7d9c6 rth: DOC Cross-link check_estimator and parametrize_with_checks (#21269)
39fd93f hongshaoyang: DOC Clarify use_idf in TfidfTransformer/TfidfVectorizer docstrings (#…
4abc00b jmloyola: DOC Ensures that SelfTrainingClassifier passes numpydoc validation (#…
eb2b5fa DimitriPapadopoulos: DOC Remove some str/unicode leftovers from Python 2 (#21270)
74f67ef jjerphan: Merge branch 'main' into move-dist-metric-under-metrics
46a6cf2 jjerphan: Re-introduce 'surrogate' for the wording and adapt docstrings accordi…
f00c134 jjerphan: Re-word even more for "rank-preserving surrogate"
@@ -1,14 +1,12 @@
-#!python
-#cython: boundscheck=False
-#cython: wraparound=False
-#cython: cdivision=True
+# cython: boundscheck=False
+# cython: cdivision=True
+# cython: initializedcheck=False
+# cython: wraparound=False
 
-cimport cython
 cimport numpy as np
-from libc.math cimport fabs, sqrt, exp, cos, pow
+from libc.math cimport sqrt, exp
 
-from ._typedefs cimport DTYPE_t, ITYPE_t, DITYPE_t
-from ._typedefs import DTYPE, ITYPE
+from ..utils._typedefs cimport DTYPE_t, ITYPE_t
 
 ######################################################################
 # Inline distance functions

Review comment on the added line "# cython: initializedcheck=False": Note that checks settable via […]. I preferred to turn it off here: in practice, it does not change anything, but it might for new interfaces such as the ones introduced in #20254.
@@ -60,7 +58,7 @@ cdef class DistanceMetric:
     cdef DTYPE_t dist(self, const DTYPE_t* x1, const DTYPE_t* x2,
                       ITYPE_t size) nogil except -1
 
-    cdef DTYPE_t rdist(self, DTYPE_t* x1, DTYPE_t* x2,
+    cdef DTYPE_t rdist(self, const DTYPE_t* x1, const DTYPE_t* x2,
                        ITYPE_t size) nogil except -1
 
     cdef int pdist(self, const DTYPE_t[:, ::1] X, DTYPE_t[:, ::1] D) except -1
@@ -1,8 +1,7 @@
-#!python
-#cython: boundscheck=False
-#cython: wraparound=False
-#cython: initializedcheck=False
-#cython: cdivision=True
+# cython: boundscheck=False
+# cython: cdivision=True
+# cython: initializedcheck=False
+# cython: wraparound=False
 
 # By Jake Vanderplas (2013) <[email protected]>
 # written for the scikit-learn project
@@ -19,7 +18,7 @@ cdef extern from "arrayobject.h":
                                      int typenum, void* data)
 
 
-cdef inline np.ndarray _buffer_to_ndarray(DTYPE_t* x, np.npy_intp n):
+cdef inline np.ndarray _buffer_to_ndarray(const DTYPE_t* x, np.npy_intp n):
     # Wrap a memory buffer with an ndarray. Warning: this is not robust.
     # In particular, if x is deallocated before the returned array goes
     # out of scope, this could cause memory errors. Since there is not
@@ -33,8 +32,8 @@ cdef inline np.ndarray _buffer_to_ndarray(DTYPE_t* x, np.npy_intp n):
 from libc.math cimport fabs, sqrt, exp, pow, cos, sin, asin
 cdef DTYPE_t INF = np.inf
 
-from ._typedefs cimport DTYPE_t, ITYPE_t, DITYPE_t, DTYPECODE
-from ._typedefs import DTYPE, ITYPE
+from ..utils._typedefs cimport DTYPE_t, ITYPE_t, DITYPE_t, DTYPECODE
+from ..utils._typedefs import DTYPE, ITYPE
 
 
 ######################################################################
@@ -98,7 +97,7 @@ cdef class DistanceMetric:
 
     Examples
     --------
-    >>> from sklearn.neighbors import DistanceMetric
+    >>> from sklearn.metrics import DistanceMetric
     >>> dist = DistanceMetric.get_metric('euclidean')
     >>> X = [[0, 1, 2],
              [3, 4, 5]]
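The docstring change in this hunk reflects the new import location. As a minimal sketch, code that has to run against releases from both before and after the move could probe both locations. The helper name `import_distance_metric` is hypothetical and not part of scikit-learn. The sketch assumes only that the class is exposed as `DistanceMetric` in `sklearn.metrics` or `sklearn.neighbors`, as the two docstring lines state.

```python
import importlib


def import_distance_metric():
    """Return DistanceMetric from the first module that exposes it, else None.

    Hypothetical helper: tries the new location first, then the old one.
    """
    for module_name in ("sklearn.metrics", "sklearn.neighbors"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # scikit-learn (or this submodule) is not importable here.
            continue
        cls = getattr(module, "DistanceMetric", None)
        if cls is not None:
            return cls
    return None


DistanceMetric = import_distance_metric()
if DistanceMetric is not None:
    # get_metric is the factory used in the docstring example above.
    euclidean = DistanceMetric.get_metric("euclidean")
```

Preferring the new location first avoids going through the old path, which this PR puts under a deprecation cycle.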
@@ -291,14 +290,13 @@ cdef class DistanceMetric:
 
     cdef DTYPE_t rdist(self, const DTYPE_t* x1, const DTYPE_t* x2,
                        ITYPE_t size) nogil except -1:
-        """Compute the reduced distance between vectors x1 and x2.
+        """Compute the rank-preserving surrogate distance between vectors x1 and x2.
 
         This can optionally be overridden in a base class.
 
-        The reduced distance is any measure that yields the same rank as the
-        distance, but is more efficient to compute. For example, for the
-        Euclidean metric, the reduced distance is the squared-euclidean
-        distance.
+        The rank-preserving surrogate distance is any measure that yields the same
+        rank as the distance, but is more efficient to compute. For example, for the
+        Euclidean metric, the surrogate distance is the squared-euclidean distance.
         """
         return self.dist(x1, x2, size)
 
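To illustrate what "rank-preserving surrogate" means in the docstring above, here is a small pure-Python sketch. It is not scikit-learn code. The function names and sample points are made up for illustration. It checks that sorting neighbours by squared Euclidean distance gives the same order as sorting by Euclidean distance, while skipping the square root.

```python
import math


def euclidean(x1, x2):
    """True Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))


def squared_euclidean(x1, x2):
    """Surrogate: same ordering as euclidean, but no square root."""
    return sum((a - b) ** 2 for a, b in zip(x1, x2))


def rank_by(metric, query, points):
    """Indices of points sorted from nearest to farthest under metric."""
    return sorted(range(len(points)), key=lambda i: metric(query, points[i]))


query = [0.0, 0.0]
points = [[3.0, 4.0], [1.0, 1.0], [0.0, 2.0], [5.0, 0.5]]

ranking_true = rank_by(euclidean, query, points)
ranking_surrogate = rank_by(squared_euclidean, query, points)
print(ranking_true == ranking_surrogate)  # True
```

The two rankings agree because the square root is strictly increasing on non-negative values, so it never changes the order of the distances.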
@@ -323,25 +321,24 @@ cdef class DistanceMetric:
         return 0
 
     cdef DTYPE_t _rdist_to_dist(self, DTYPE_t rdist) nogil except -1:
-        """Convert the reduced distance to the distance"""
+        """Convert the rank-preserving surrogate distance to the distance"""
         return rdist
 
     cdef DTYPE_t _dist_to_rdist(self, DTYPE_t dist) nogil except -1:
-        """Convert the distance to the reduced distance"""
+        """Convert the distance to the rank-preserving surrogate distance"""
         return dist
 
     def rdist_to_dist(self, rdist):
-        """Convert the Reduced distance to the true distance.
+        """Convert the rank-preserving surrogate distance to the distance.
 
-        The reduced distance, defined for some metrics, is a computationally
-        more efficient measure which preserves the rank of the true distance.
-        For example, in the Euclidean distance metric, the reduced distance
-        is the squared-euclidean distance.
+        The surrogate distance is any measure that yields the same rank as the
+        distance, but is more efficient to compute. For example, for the
+        Euclidean metric, the surrogate distance is the squared-euclidean distance.
 
         Parameters
         ----------
         rdist : double
-            Reduced distance.
+            Surrogate distance.
 
         Returns
         -------
@@ -351,12 +348,11 @@ cdef class DistanceMetric:
         return rdist
 
     def dist_to_rdist(self, dist):
-        """Convert the true distance to the reduced distance.
+        """Convert the true distance to the rank-preserving surrogate distance.
 
-        The reduced distance, defined for some metrics, is a computationally
-        more efficient measure which preserves the rank of the true distance.
-        For example, in the Euclidean distance metric, the reduced distance
-        is the squared-euclidean distance.
+        The surrogate distance is any measure that yields the same rank as the
+        distance, but is more efficient to compute. For example, for the
+        Euclidean metric, the surrogate distance is the squared-euclidean distance.
 
         Parameters
         ----------
@@ -366,7 +362,7 @@ cdef class DistanceMetric:
         Returns
         -------
         double
-            Reduced distance.
+            Surrogate distance.
         """
         return dist
 
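The two conversion methods in the hunks above are inverses of each other. The following pure-Python sketch is hypothetical and is not the Cython implementation. `BaseMetricSketch` mirrors the pass-through default shown in the diff, and `EuclideanSketch` follows the squared-Euclidean example given in the docstrings. Both class names are made up for illustration.

```python
import math


class BaseMetricSketch:
    """Default behaviour as in the diff: the surrogate is the distance itself."""

    def rdist_to_dist(self, rdist):
        return rdist

    def dist_to_rdist(self, dist):
        return dist


class EuclideanSketch(BaseMetricSketch):
    """Assumed override: the surrogate is the squared Euclidean distance."""

    def rdist_to_dist(self, rdist):
        # Squared distance back to true distance.
        return math.sqrt(rdist)

    def dist_to_rdist(self, dist):
        # True distance to squared distance.
        return dist * dist


metric = EuclideanSketch()
surrogate = metric.dist_to_rdist(3.0)        # 9.0
recovered = metric.rdist_to_dist(surrogate)  # 3.0
```

A search routine can therefore compare candidates in surrogate space and convert only the final results back to true distances.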
@@ -519,7 +515,7 @@ cdef class ChebyshevDistance(DistanceMetric):
 
     Examples
     --------
-    >>> from sklearn.neighbors.dist_metrics import DistanceMetric
+    >>> from sklearn.metrics.dist_metrics import DistanceMetric
     >>> dist = DistanceMetric.get_metric('chebyshev')
     >>> X = [[0, 1, 2],
     ...      [3, 4, 5]]