Description
Describe the bug
pandas Copy-on-Write (CoW) will be enabled by default in pandas 3.0.
For example, I found today that TargetEncoder.fit raises a ValueError
when CoW is enabled.
Other estimators are probably affected as well and could be found by running the tests with CoW enabled.
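One way to look for similar failures without pandas in the loop is to hand a function read-only copies of its inputs and see whether it raises. The sketch below assumes that read-only NumPy arrays are a fair stand-in for what pandas exposes under CoW. `fails_on_readonly`, `center_in_place` and `center_copy` are hypothetical names made up for illustration; they are not part of scikit-learn or pandas.

```python
import numpy as np


def fails_on_readonly(func, *arrays):
    """Call func with read-only copies of arrays; report whether it raised ValueError.

    Hypothetical helper for illustration only.
    """
    frozen = []
    for a in arrays:
        c = np.array(a, copy=True)  # private copy so the caller's data is untouched
        c.setflags(write=False)     # mimic a read-only array, as exposed under CoW
        frozen.append(c)
    try:
        func(*frozen)
    except ValueError:
        return True
    return False


def center_in_place(y):
    y -= y.mean()  # writes into its input, so read-only data is rejected
    return y


def center_copy(y):
    return y - y.mean()  # allocates a new array; the input is only read


data = [4.0, 5.0, 6.0, 7.0]
in_place_fails = fails_on_readonly(center_in_place, data)
copy_fails = fails_on_readonly(center_copy, data)
```

The same idea could be applied to an estimator's fit method, but that has not been tried here.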
Steps/Code to Reproduce
import pandas as pd
from sklearn.preprocessing import TargetEncoder

pd.options.mode.copy_on_write = True
df = pd.DataFrame({
    "x": ["a", "b", "c", "c"],
    "y": [4., 5., 6., 7.]
})
t = TargetEncoder(target_type="continuous")
t.fit(df[["x"]], df["y"])
Expected Results
No error.
Actual Results
ValueError Traceback (most recent call last)
Cell In[2], line 10
5 df = pd.DataFrame({
6 "x": ["a", "b", "c", "c"],
7 "y": [4., 5., 6., 7.]
8 })
9 t = TargetEncoder(target_type="continuous")
---> 10 t.fit(df[["x"]], df["y"])
File ~/.conda/envs/jhop311/lib/python3.11/site-packages/sklearn/base.py:1152, in _fit_context.<locals>.decorator.<locals>.wrapper(estimator, *args, **kwargs)
1145 estimator._validate_params()
1147 with config_context(
1148 skip_parameter_validation=(
1149 prefer_skip_nested_validation or global_skip_validation
1150 )
1151 ):
-> 1152 return fit_method(estimator, *args, **kwargs)
File ~/.conda/envs/jhop311/lib/python3.11/site-packages/sklearn/preprocessing/_target_encoder.py:203, in TargetEncoder.fit(self, X, y)
186 @_fit_context(prefer_skip_nested_validation=True)
187 def fit(self, X, y):
188 """Fit the :class:`TargetEncoder` to X and y.
189
190 Parameters
(...)
201 Fitted encoder.
202 """
--> 203 self._fit_encodings_all(X, y)
204 return self
File ~/.conda/envs/jhop311/lib/python3.11/site-packages/sklearn/preprocessing/_target_encoder.py:332, in TargetEncoder._fit_encodings_all(self, X, y)
330 if self.smooth == "auto":
331 y_variance = np.var(y)
--> 332 self.encodings_ = _fit_encoding_fast_auto_smooth(
333 X_ordinal, y, n_categories, self.target_mean_, y_variance
334 )
335 else:
336 self.encodings_ = _fit_encoding_fast(
337 X_ordinal, y, n_categories, self.smooth, self.target_mean_
338 )
File sklearn/preprocessing/_target_encoder_fast.pyx:82, in sklearn.preprocessing._target_encoder_fast._fit_encoding_fast_auto_smooth()
File stringsource:660, in View.MemoryView.memoryview_cwrapper()
File stringsource:350, in View.MemoryView.memoryview.__cinit__()
ValueError: buffer source array is read-only
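The final ValueError is the one a Cython typed memoryview raises when it is declared without const and is given a read-only buffer. With CoW enabled, pandas exposes column data as read-only NumPy arrays, which would explain why y triggers it. A minimal sketch of the condition and of a copy-based workaround follows, using only NumPy. `as_writeable` is a hypothetical helper, and the read-only array is created by hand here rather than obtained from pandas.

```python
import numpy as np


def as_writeable(a):
    """Return a unchanged if it is writeable, else a writeable copy (hypothetical helper)."""
    a = np.asarray(a)
    return a if a.flags.writeable else a.copy()


y = np.array([4.0, 5.0, 6.0, 7.0])
y.setflags(write=False)    # stand-in for the read-only array behind df["y"] under CoW

y_safe = as_writeable(y)
y_safe[0] = 0.0            # succeeds: the copy owns writeable memory, y is unchanged
```

If this diagnosis is right, passing as_writeable(df["y"].to_numpy()) as y might sidestep the error until the Cython signature accepts read-only input. That has not been verified against scikit-learn 1.3.2.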
Versions
System:
python: 3.11.3 | packaged by conda-forge | (main, Apr 6 2023, 08:57:19) [GCC 11.3.0]
executable: /home/jhopfens/.conda/envs/jhop311/bin/python
machine: Linux-3.10.0-1160.99.1.el7.x86_64-x86_64-with-glibc2.17
Python dependencies:
sklearn: 1.3.2
pip: 23.0.1
setuptools: 67.6.1
numpy: 1.25.2
scipy: 1.11.2
Cython: 3.0.0
pandas: 2.1.0
matplotlib: 3.7.2
joblib: 1.2.0
threadpoolctl: 3.1.0
Built with OpenMP: True
threadpoolctl info:
user_api: blas
internal_api: openblas
prefix: libopenblas
filepath: /home/jhopfens/.conda/envs/jhop311/lib/python3.11/site-packages/numpy.libs/libopenblas64_p-r0-5007b62f.3.23.dev.so
version: 0.3.23.dev
threading_layer: pthreads
architecture: SkylakeX
num_threads: 64
user_api: openmp
internal_api: openmp
prefix: libgomp
filepath: /home/jhopfens/.conda/envs/jhop311/lib/python3.11/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
version: None
num_threads: 128
user_api: blas
internal_api: openblas
prefix: libopenblas
filepath: /home/jhopfens/.conda/envs/jhop311/lib/python3.11/site-packages/scipy.libs/libopenblasp-r0-23e5df77.3.21.dev.so
version: 0.3.21.dev
threading_layer: pthreads
architecture: SkylakeX
num_threads: 64