Array API support for k-means #26585

Open
@ogrisel

Description

This is an early issue to publicly discuss whether we can use the Array API (see #22352) for k-means and make it run on GPUs, using PyTorch in particular.

@fcharras has already run some promising experiments using the raw PyTorch API. @fcharras, could you link to a gist with your code?

Unfortunately, the current state of the Array API is likely too limiting because, AFAIK, it does not yet expose the equivalent of torch.cdist, torch.Tensor.expand and torch.Tensor.scatter_add_.
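To make the three operations concrete, here is a minimal sketch of one Lloyd iteration written against the raw PyTorch API. It is not @fcharras's code; the function name `lloyd_step_torch` and the handling of empty clusters (keeping the previous center) are assumptions made for illustration only.

```python
import torch


def lloyd_step_torch(X, centers):
    """One Lloyd iteration (hypothetical helper, for illustration).

    X: (n_samples, n_features), centers: (n_clusters, n_features).
    """
    # Pairwise Euclidean distances, shape (n_samples, n_clusters).
    dist = torch.cdist(X, centers)
    labels = dist.argmin(dim=1)

    # Broadcast each label across the feature axis without copying.
    index = labels.unsqueeze(1).expand(-1, X.shape[1])

    # Accumulate the samples of each cluster into its row of `sums`.
    sums = torch.zeros_like(centers).scatter_add_(0, index, X)
    counts = torch.bincount(labels, minlength=centers.shape[0]).to(X.dtype)

    # Assumption: an empty cluster keeps its previous center.
    means = sums / counts.clamp(min=1).unsqueeze(1)
    new_centers = torch.where(counts.unsqueeze(1) > 0, means, centers)
    return labels, new_centers
```

Each of the three calls maps to one step: `torch.cdist` for the assignment distances, `expand` to build the scatter index, and `scatter_add_` for the per-cluster sums.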

The purpose of this issue is to identify precisely what in the current state of the Array API is blocking us, and to discuss potential solutions:

  • use this use case to report our needs to the Array API standardization committee, so that the spec evolves and benefits everybody;
  • alternatively, explore the use of a multi-dispatch system such as uarray, which is being adopted in scipy, to make it possible to maintain a PyTorch-specific optimized code path alongside a slower yet generic Array API code path and a NumPy-optimized code path that would rely on our current Cython code;
  • decide that the estimator-level engine API proposed in [DRAFT] Engine plugin API and engine entry point for Lloyd's KMeans #25535 is the only sane way to make this estimator run on GPUs (which I now personally doubt).
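For comparison, the "slower yet generic" code path mentioned in the second option could look roughly like the sketch below. It avoids a cdist-like primitive by expanding the squared distance as ||x||² - 2 x·c + ||c||², and replaces the scatter-add with a dense one-hot matrix product. The function name `lloyd_step_generic` and the `xp` namespace argument are hypothetical, and the sketch has only been exercised with NumPy as the namespace.

```python
import numpy as np


def lloyd_step_generic(X, centers, xp=np):
    """One Lloyd iteration using only basic array operations (sketch)."""
    n_clusters = centers.shape[0]

    # Squared distances via ||x||^2 - 2 x.c + ||c||^2, shape (n_samples, n_clusters).
    x_sq = xp.sum(X * X, axis=1, keepdims=True)
    c_sq = xp.sum(centers * centers, axis=1)
    d2 = x_sq - 2.0 * (X @ centers.T) + c_sq
    labels = xp.argmin(d2, axis=1)

    # Dense one-hot encoding of the labels, shape (n_samples, n_clusters).
    mask = labels[:, None] == xp.arange(n_clusters)[None, :]
    one_hot = xp.where(mask, 1.0, 0.0)

    # The matrix product plays the role of the scatter-add.
    sums = one_hot.T @ X
    counts = xp.sum(one_hot, axis=0)

    # Assumption: an empty cluster keeps its previous center.
    means = sums / xp.maximum(counts, 1.0)[:, None]
    new_centers = xp.where(counts[:, None] > 0, means, centers)
    return labels, new_centers
```

The one-hot matrix costs O(n_samples × n_clusters) memory, which is one reason such a path would be expected to be slower than a dedicated scatter-add.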
