Description
The _make_unique function in _isotonic.pyx raises a buffer dtype mismatch error when the input is float32 (it seems that part of the data is recast to double along the way). The error does not occur with float64 input.
(Larger story: I am using CalibratedClassifierCV with isotonic calibration with the probabilities produced by an XGBoost model (which are in float32). CalibratedClassifierCV does not check for float64 before shipping off the probabilities to IsotonicRegression, which is what started the problem).
Steps/Code to Reproduce
Minimal example:
import numpy as np
from sklearn.isotonic import IsotonicRegression

a = np.random.normal(size=100)
b = np.random.choice([0, 1], p=[0.5, 0.5], size=100)
regress = IsotonicRegression()
regress.fit(a.astype('float32'), b)
The problem does not arise if a is in its default type of float64.
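For comparison, a user-side workaround is to cast the float32 input to float64 before fitting. This is a minimal sketch rather than a fix: the seeded RandomState and the names a32 and a64 are assumptions made here for reproducibility, not part of the original report.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Seeded generator so the sketch is reproducible (an assumption, not in the report).
rng = np.random.RandomState(0)
a32 = rng.normal(size=100).astype(np.float32)  # float32, as produced by e.g. XGBoost
b = rng.choice([0, 1], p=[0.5, 0.5], size=100)

# Cast to float64 before handing the data to IsotonicRegression.
a64 = a32.astype(np.float64)

regress = IsotonicRegression()
regress.fit(a64, b)
pred = regress.predict(a64)
```

The cast copies the array, which is negligible for calibration-sized inputs but worth keeping in mind for very large ones.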
Expected Results
No error is thrown, just as when a is float64.
Actual Results
~/anaconda3/envs/xgboost_test/lib/python3.7/site-packages/sklearn/isotonic.py in fit(self, X, y, sample_weight)
333 # Transform y by running the isotonic regression algorithm and
334 # transform X accordingly.
--> 335 X, y = self._build_y(X, y, sample_weight)
336
337 # It is necessary to store the non-redundant part of the training set
~/anaconda3/envs/xgboost_test/lib/python3.7/site-packages/sklearn/isotonic.py in _build_y(self, X, y, sample_weight, trim_duplicates)
271 X, y, sample_weight = [array[order] for array in [X, y, sample_weight]]
272 unique_X, unique_y, unique_sample_weight = _make_unique(
--> 273 X, y, sample_weight)
274
275 # Store _X_ and _y_ to maintain backward compat during the deprecation
sklearn/_isotonic.pyx in sklearn._isotonic._make_unique()
ValueError: Buffer dtype mismatch, expected 'float' but got 'double'
Versions
In [114]: sklearn.show_versions()
System:
python: 3.7.4 (default, Aug 13 2019, 15:17:50) [Clang 4.0.1 (tags/RELEASE_401/final)]
executable: /Users/nmukherjee/anaconda3/envs/xgboost_test/bin/python
machine: Darwin-18.7.0-x86_64-i386-64bit
BLAS:
macros: SCIPY_MKL_H=None, HAVE_CBLAS=None
lib_dirs: /Users/nmukherjee/anaconda3/envs/xgboost_test/lib
cblas_libs: mkl_rt, pthread
Python deps:
pip: 19.2.2
setuptools: 41.0.1
sklearn: 0.21.2
numpy: 1.16.5
scipy: 1.3.1
Cython: None
pandas: 0.25.1
Dealing with the problem
Either make IsotonicRegression handle float32 input, or have CalibratedClassifierCV check for (and recast to) float64 before passing the probabilities to IsotonicRegression.
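The "check and recast" option could look roughly like the following. This is only a sketch under stated assumptions: to_float64 is a hypothetical helper name, not an existing scikit-learn function, and where such a check would live inside CalibratedClassifierCV is left open.

```python
import numpy as np


def to_float64(values):
    """Return `values` as a float64 ndarray (hypothetical helper).

    np.asarray only converts when needed: an ndarray that is already
    float64 is returned as-is, without a copy.
    """
    return np.asarray(values, dtype=np.float64)


probs32 = np.array([0.1, 0.5, 0.9], dtype=np.float32)
probs64 = np.array([0.1, 0.5, 0.9], dtype=np.float64)

converted = to_float64(probs32)   # new float64 array
unchanged = to_float64(probs64)   # same object, no copy
```

Because already-float64 input passes through untouched, a check like this would add no overhead for the default dtype.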