
Commit 7fb4261

DOC add missing docstring + improve inline comment
1 parent 1224fb6 commit 7fb4261

1 file changed: +21 -2 lines changed

sklearn/metrics/tests/test_pairwise_distances_reduction.py

Lines changed: 21 additions & 2 deletions
@@ -111,6 +111,18 @@ def assert_no_missing_neighbors(
     indices_row_b,
     threshold,
 ):
+    """Compare the indices of neighbors in two result sets.
+
+    Any neighbor index with a distance below the precision threshold should
+    match one in the other result set. We ignore the last few neighbors beyond
+    the threshold as those can typically be missing due to rounding errors.
+
+    For radius queries, the threshold is just the radius minus the expected
+    precision level.
+
+    For k-NN queries, it is the maximum distance to the k-th neighbor minus the
+    expected precision level.
+    """
     mask_a = dist_row_a < threshold
     mask_b = dist_row_b < threshold
     missing_from_b = np.setdiff1d(indices_row_a[mask_a], indices_row_b)
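The masking and set-difference steps in this hunk can be tried in isolation. The following is a minimal sketch, not the test helper itself: `find_missing_neighbors` is a hypothetical name, the symmetric `missing_from_a` computation is an assumption (the hunk only shows `missing_from_b`), and the sample distances, indices, and threshold are made up for illustration.

```python
import numpy as np


def find_missing_neighbors(dist_a, dist_b, idx_a, idx_b, threshold):
    """Hypothetical helper: list neighbor indices missing from each result set.

    Only neighbors strictly closer than `threshold` are required to appear
    in the other result set; neighbors at or beyond it are ignored.
    """
    mask_a = dist_a < threshold
    mask_b = dist_b < threshold
    # Indices that A reports below the threshold but B does not report at all.
    missing_from_b = np.setdiff1d(idx_a[mask_a], idx_b)
    # Assumed symmetric check in the other direction.
    missing_from_a = np.setdiff1d(idx_b[mask_b], idx_a)
    return missing_from_a, missing_from_b


# Made-up 3-NN results for a single query point.
dist_a = np.array([0.1, 0.5, 1.0])
idx_a = np.array([4, 7, 2])
threshold = 0.9  # illustrative value, not derived from rtol/atol here

# Case 1: the sets differ only on the last neighbor, which lies beyond
# the threshold, so nothing is reported as missing.
ok_a, ok_b = find_missing_neighbors(
    dist_a, np.array([0.1, 0.5, 1.0]), idx_a, np.array([4, 7, 9]), threshold
)

# Case 2: the sets differ on a neighbor well below the threshold,
# so both directions report a missing index.
bad_a, bad_b = find_missing_neighbors(
    dist_a, np.array([0.1, 0.5, 1.0]), idx_a, np.array([4, 8, 2]), threshold
)
```

In the first case the disagreement is tolerated because it sits in the rounding-error zone; in the second case index 7 is missing from B and index 8 is missing from A.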
@@ -179,8 +191,15 @@ def assert_compatible_argkmin_results(
             atol,
         )
 
-        # Check that any neighbor with distances below the rounding error threshold have
-        # matching indices.
+        # Check that any neighbor with distances below the rounding error
+        # threshold have matching indices. The threshold is the distance to the
+        # k-th neighbor minus the expected precision level:
+        #
+        # (1 - rtol) * dist_k - atol
+        #
+        # Where dist_k is defined as the maximum distance to the k-th neighbor
+        # among the two result sets. This way of defining the threshold is
+        # stricter than taking the minimum of the two.
         threshold = (1 - rtol) * np.maximum(
             np.max(dist_row_a), np.max(dist_row_b)
         ) - atol
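To see why taking the maximum of the two k-th neighbor distances is the stricter choice, the threshold can be computed both ways on a small example. This is a sketch under stated assumptions: `argkmin_threshold` and its `reduce` parameter are hypothetical names introduced for the comparison, and the distances and the deliberately coarse `rtol` are made-up values.

```python
import numpy as np


def argkmin_threshold(dist_row_a, dist_row_b, rtol, atol, reduce=np.maximum):
    """Hypothetical helper: (1 - rtol) * dist_k - atol.

    `reduce` picks dist_k from the two k-th neighbor distances;
    np.maximum mirrors the expression shown in the diff.
    """
    dist_k = reduce(np.max(dist_row_a), np.max(dist_row_b))
    return (1 - rtol) * dist_k - atol


# Made-up distances for one query row from two implementations.
dist_row_a = np.array([1.0, 2.0, 3.0])
dist_row_b = np.array([1.0, 2.0, 3.5])
rtol, atol = 1e-2, 0.0  # coarse tolerance, chosen only to make the gap visible

strict = argkmin_threshold(dist_row_a, dist_row_b, rtol, atol, np.maximum)
loose = argkmin_threshold(dist_row_a, dist_row_b, rtol, atol, np.minimum)

# Number of neighbors in row A whose indices must match under each threshold.
checked_strict = int(np.sum(dist_row_a < strict))
checked_loose = int(np.sum(dist_row_a < loose))
```

With the maximum, the threshold is larger, so more neighbors fall below it and are required to have matching indices. With the minimum, the last neighbor of row A would escape the check.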
