Some people (including @betatim, @ogrisel, @jjerphan and I) have been devising a plugin system that would open up sklearn estimators to other external implementations, and in particular implementations with GPU backends - see #22438.
Some of the plugins we're considering can materialize the data in memory with an array library that is compatible with the Array API - namely CuPy and dpctl.tensor.
One thing we've found is that, internally, those plugins can benefit from directly using BaseEstimator._validate_data and check_array from scikit-learn for the input validation and preparation step.
Describe your proposed solution
To enable this, it would be nice to be able to pass an asarray_fn to check_array and _validate_data, which would be called instead of xp.asarray in _asarray_with_order. This would let the plugin convert the input data directly to an array type that it supports (e.g. cupy or dpctl.tensor) while still reusing the existing validation code in check_array.
The override can be necessary when the asarray function of the array library implements a superset of the Array API that the plugin needs, but that check_array currently does not use because it is not part of the Array API (for instance, the order argument isn't passed to asarray for array libraries other than NumPy).
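To make the idea concrete, here is a minimal sketch of how such a hook could behave. It is not scikit-learn's actual code: check_array_sketch and _asarray_with_order_sketch are hypothetical, simplified stand-ins for check_array and _asarray_with_order, and plugin_asarray stands in for something like cupy.asarray, using NumPy so that the example runs without a GPU.

```python
import numpy as np


def _asarray_with_order_sketch(array, dtype=None, order=None, asarray_fn=None):
    """Hypothetical, simplified stand-in for sklearn's _asarray_with_order."""
    if asarray_fn is not None:
        # Proposed hook: the plugin controls the conversion and receives
        # `order`, even though `order` is not part of the Array API.
        return asarray_fn(array, dtype=dtype, order=order)
    # Default path of this sketch: plain NumPy conversion.
    return np.asarray(array, dtype=dtype, order=order)


def check_array_sketch(array, *, dtype=None, order=None, asarray_fn=None):
    """Hypothetical, heavily reduced version of check_array."""
    converted = _asarray_with_order_sketch(
        array, dtype=dtype, order=order, asarray_fn=asarray_fn
    )
    # The existing validation logic would be reused as-is after conversion.
    if converted.ndim != 2:
        raise ValueError(f"Expected 2D array, got {converted.ndim}D array instead.")
    return converted


# Records the `order` values the plugin conversion function receives.
calls = []


def plugin_asarray(array, dtype=None, order=None):
    """Stand-in for a plugin's conversion function (e.g. wrapping cupy.asarray)."""
    calls.append(order)
    return np.asarray(array, dtype=dtype, order=order)


X = check_array_sketch(
    [[1, 2], [3, 4]], dtype=np.float64, order="F", asarray_fn=plugin_asarray
)
```

With this shape, the plugin only supplies the conversion step, and the dimensionality check (and, in the real function, the rest of the validation) stays in one place.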