Specifying 'cosine' as metric in KDTree throws error #25364

@01-vyom

Describe the bug

I am trying to use the KDTree algorithm with cosine as the distance metric. I started with scipy's implementation, but it does not support cosine as a metric. I then turned to the sklearn implementation. Its documentation says the supported metrics can be found in scipy.spatial.distance and distance_metrics, and both of these list cosine as supported. The user guide also suggests that cosine is supported, since it refers to cosine as cosine_distance. However, KDTree still throws an error when I pass that metric.
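For reference, a quick way to see which metrics each neighbors algorithm accepts is the `sklearn.neighbors.VALID_METRICS` dictionary. This is only a sketch for inspecting the installed version; the exact contents of each list may differ between releases, so no output is shown here.

```python
from sklearn.neighbors import VALID_METRICS

# Metrics accepted by the tree-based and brute-force neighbor searches
# in the installed scikit-learn version.
kd_tree_metrics = sorted(VALID_METRICS["kd_tree"])
brute_metrics = sorted(VALID_METRICS["brute"])

print("kd_tree:", kd_tree_metrics)
print("cosine accepted by kd_tree:", "cosine" in kd_tree_metrics)
print("cosine accepted by brute:", "cosine" in brute_metrics)
```

If cosine shows up only under the brute-force entry, that would be consistent with the error reported below.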

Steps/Code to Reproduce

import numpy as np
from sklearn.neighbors import KDTree
rng = np.random.RandomState(0)
X = rng.random_sample((10, 3))  # 10 points in 3 dimensions
tree = KDTree(X, leaf_size=2, metric="cosine")

Expected Results

No error is thrown

Actual Results

Traceback (most recent call last):
  File "sklearn/metrics/_dist_metrics.pyx", line 270, in sklearn.metrics._dist_metrics.DistanceMetric.get_metric
KeyError: 'cosine'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sklearn/neighbors/_binary_tree.pxi", line 844, in sklearn.neighbors._kd_tree.BinaryTree.__init__
  File "sklearn/metrics/_dist_metrics.pyx", line 272, in sklearn.metrics._dist_metrics.DistanceMetric.get_metric
ValueError: Unrecognized metric 'cosine'
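A possible workaround, sketched under the assumption that only the cosine-distance ranking of neighbors is needed and that no row of X is all zeros: scale every row to unit L2 norm and build the tree with the Euclidean metric. For unit vectors a and b, ||a - b||^2 = 2 * (1 - cos(a, b)), so the Euclidean ordering matches the cosine-distance ordering and the distances can be converted back. The names `X_unit` and `cosine_dist` are mine, not part of sklearn.

```python
import numpy as np
from sklearn.neighbors import KDTree
from sklearn.preprocessing import normalize

rng = np.random.RandomState(0)
X = rng.random_sample((10, 3))  # 10 points in 3 dimensions

# Scale each row to unit L2 norm (assumes no all-zero rows).
X_unit = normalize(X)

# Euclidean distance is accepted by KDTree.
tree = KDTree(X_unit, leaf_size=2, metric="euclidean")

# Query the 3 nearest neighbors of the first point (itself included).
dist, ind = tree.query(X_unit[:1], k=3)

# Convert Euclidean distances between unit vectors to cosine distances:
# ||a - b||^2 = 2 * (1 - cos(a, b))  =>  1 - cos(a, b) = ||a - b||^2 / 2
cosine_dist = dist ** 2 / 2.0

print(ind)
print(cosine_dist)
```

This does not make `metric="cosine"` work; it only substitutes an equivalent Euclidean search on normalized data.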

Versions

System:
    python: 3.8.10 (default, Mar 15 2022, 12:22:08)  [GCC 9.4.0]
executable: /bin/python3
   machine: Linux-5.4.0-132-generic-x86_64-with-glibc2.29

Python dependencies:
      sklearn: 1.2.0
          pip: 22.3.1
   setuptools: 65.6.3
        numpy: 1.23.3
        scipy: 1.9.3
       Cython: 0.29.30
       pandas: 1.4.2
   matplotlib: 3.6.2
       joblib: 1.2.0
threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /usr/local/lib/python3.8/dist-packages/numpy.libs/libopenblas64_p-r0-742d56dc.3.20.so
        version: 0.3.20
threading_layer: pthreads
   architecture: SkylakeX
    num_threads: 28

       user_api: openmp
   internal_api: openmp
         prefix: libgomp
       filepath: /usr/local/lib/python3.8/dist-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
        version: None
    num_threads: 28

       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /usr/local/lib/python3.8/dist-packages/scipy.libs/libopenblasp-r0-41284840.3.18.so
        version: 0.3.18
threading_layer: pthreads
   architecture: SkylakeX
    num_threads: 28

       user_api: openmp
   internal_api: openmp
         prefix: libgomp
       filepath: /usr/local/lib/python3.8/dist-packages/torch/lib/libgomp-a34b3233.so.1
        version: None
    num_threads: 14
