Numpy float32 giving buffer mismatch error (as double) in Cython code of Isotonic regression #15004

@narendramukherjee

Description

The _make_unique function in _isotonic.pyx raises a buffer dtype mismatch error when the input is float32 (somewhere along the way an array seems to be recast as double). The error does not occur with float64.

(Larger story: I am using CalibratedClassifierCV with isotonic calibration on the probabilities produced by an XGBoost model, which are float32. CalibratedClassifierCV does not check for float64 before passing the probabilities to IsotonicRegression, which is how I ran into the problem.)

Steps/Code to Reproduce

Minimal example:

import numpy as np
from sklearn.isotonic import IsotonicRegression

a = np.random.normal(size=100)
b = np.random.choice([0, 1], p=[0.5, 0.5], size=100)
regress = IsotonicRegression()
# Casting X to float32 triggers the ValueError on sklearn 0.21.2
regress.fit(a.astype('float32'), b)

The problem does not arise if a is in its default type of float64.
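
Since float64 input works, a user-side workaround is to upcast the input before fitting. Below is a minimal sketch using only NumPy; the helper name `to_float64` is hypothetical and not part of scikit-learn, and the random array merely stands in for float32 model output.

```python
import numpy as np


def to_float64(values):
    """Return `values` as a float64 ndarray (no copy if it already is one)."""
    return np.asarray(values, dtype=np.float64)


rng = np.random.RandomState(0)
a32 = rng.normal(size=100).astype("float32")  # stand-in for float32 probabilities
a64 = to_float64(a32)
print(a32.dtype, a64.dtype)
```

Calling `regress.fit(to_float64(a32), b)` should then follow the same path as the float64 case that works in the example above.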

Expected Results

No error is thrown, as is the case when a is float64.

Actual Results

~/anaconda3/envs/xgboost_test/lib/python3.7/site-packages/sklearn/isotonic.py in fit(self, X, y, sample_weight)
    333         # Transform y by running the isotonic regression algorithm and
    334         # transform X accordingly.
--> 335         X, y = self._build_y(X, y, sample_weight)
    336 
    337         # It is necessary to store the non-redundant part of the training set

~/anaconda3/envs/xgboost_test/lib/python3.7/site-packages/sklearn/isotonic.py in _build_y(self, X, y, sample_weight, trim_duplicates)
    271         X, y, sample_weight = [array[order] for array in [X, y, sample_weight]]
    272         unique_X, unique_y, unique_sample_weight = _make_unique(
--> 273             X, y, sample_weight)
    274 
    275         # Store _X_ and _y_ to maintain backward compat during the deprecation

sklearn/_isotonic.pyx in sklearn._isotonic._make_unique()

ValueError: Buffer dtype mismatch, expected 'float' but got 'double'

Versions

In [114]: sklearn.show_versions()                                              

System:
    python: 3.7.4 (default, Aug 13 2019, 15:17:50)  [Clang 4.0.1 (tags/RELEASE_401/final)]
executable: /Users/nmukherjee/anaconda3/envs/xgboost_test/bin/python
   machine: Darwin-18.7.0-x86_64-i386-64bit

BLAS:
    macros: SCIPY_MKL_H=None, HAVE_CBLAS=None
  lib_dirs: /Users/nmukherjee/anaconda3/envs/xgboost_test/lib
cblas_libs: mkl_rt, pthread

Python deps:
       pip: 19.2.2
setuptools: 41.0.1
   sklearn: 0.21.2
     numpy: 1.16.5
     scipy: 1.3.1
    Cython: None
    pandas: 0.25.1

Dealing with the problem

Either figure out how to handle float32 within IsotonicRegression, or check for (and recast to) float64 within CalibratedClassifierCV.
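
Handling float32 inside IsotonicRegression could look roughly like the sketch below, which brings all three arrays to a single floating dtype before they reach the typed Cython code. It assumes the mismatch comes from X arriving as float32 while y or sample_weight are float64, which I have not confirmed in the source; `harmonize_dtypes` is a hypothetical helper, not existing scikit-learn code.

```python
import numpy as np


def harmonize_dtypes(X, y, sample_weight=None):
    """Cast X, y and sample_weight to one common floating dtype.

    Hypothetical helper: float32 and float64 inputs keep their precision,
    any other dtype of X falls back to float64.
    """
    X = np.asarray(X)
    dtype = X.dtype if X.dtype in (np.float32, np.float64) else np.float64
    X = X.astype(dtype, copy=False)
    y = np.asarray(y, dtype=dtype)
    if sample_weight is None:
        # Default to unit weights in the same dtype as X
        sample_weight = np.ones(len(y), dtype=dtype)
    else:
        sample_weight = np.asarray(sample_weight, dtype=dtype)
    return X, y, sample_weight


X32 = np.array([0.1, 0.2, 0.3], dtype=np.float32)
labels = np.array([0, 1, 1])  # integer labels, as in the minimal example
X_out, y_out, w_out = harmonize_dtypes(X32, labels)
print(X_out.dtype, y_out.dtype, w_out.dtype)
```

The alternative, recasting to float64 within CalibratedClassifierCV, is simpler but gives up the memory savings of float32 input.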
