_safe_indexing triggers SettingWithCopyWarning when used with slice #31290

Open
@MarcoGorelli

Describe the bug

Here's something I noticed while looking into #31127

The test

pytest sklearn/utils/tests/test_indexing.py::test_safe_indexing_pandas_no_settingwithcopy_warning

checks that a copy is produced and that no SettingWithCopyWarning is raised.

Indeed, no warning is raised there, but why is _safe_indexing with a slice allowed to not make a copy? Is this intentional?

Based on responses, I can suggest what to do instead in #31127

(I am a little surprised that _safe_indexing otherwise always makes copies, given that much of the discussion in #28341 centered on wanting to avoid copies.)

Steps/Code to Reproduce

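import pandas as pd

from sklearn.utils import _safe_indexing

X = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]})
# Select the first two rows with a slice
subset = _safe_indexing(X, slice(0, 2), axis=0)
# Writing into the result emits SettingWithCopyWarning (seen with pandas 2.2.2)
subset.iloc[0, 0] = 10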

Expected Results

No SettingWithCopyWarning

Actual Results

/home/marcogorelli/scikit-learn-dev/t.py:13: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  subset.iloc[0, 0] = 10
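
For comparison, here is a minimal sketch of what an explicit copy would look like on the caller's side. It uses plain pandas only: X.iloc[slice(0, 2)] is an assumed stand-in for what _safe_indexing does with a slice, and the .copy() call is a possible workaround, not necessarily the fix scikit-learn would adopt. The warning is matched by class name so the snippet does not depend on where pandas exposes SettingWithCopyWarning.

```python
import warnings

import pandas as pd

X = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]})

# Assumption: plain iloc slicing stands in for _safe_indexing(X, slice(0, 2), axis=0).
# The explicit .copy() detaches the subset from X.
subset = X.iloc[slice(0, 2)].copy()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning instead of printing it
    subset.iloc[0, 0] = 10

# Keep only SettingWithCopyWarning, matched by name to stay version-agnostic
copy_warnings = [
    w for w in caught if w.category.__name__ == "SettingWithCopyWarning"
]

print(len(copy_warnings))  # number of SettingWithCopyWarning instances recorded
print(X.iloc[0, 0])        # value in the original frame after the write
```

With the explicit copy, the write lands only in subset and X is left untouched, at the cost of the extra allocation that the discussion in #28341 wanted to avoid.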

Versions

System:
    python: 3.11.11 (main, Dec  4 2024, 08:55:07) [GCC 11.4.0]
executable: /home/marcogorelli/scikit-learn-dev/.venv/bin/python
   machine: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.35

Python dependencies:
      sklearn: 1.7.dev0
          pip: 24.2
   setuptools: None
        numpy: 2.1.0
        scipy: 1.14.0
       Cython: 3.0.11
       pandas: 2.2.2
   matplotlib: None
       joblib: 1.4.2
threadpoolctl: 3.5.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 16
         prefix: libscipy_openblas
       filepath: /home/marcogorelli/scikit-learn-dev/.venv/lib/python3.11/site-packages/numpy.libs/libscipy_openblas64_-ff651d7f.so
        version: 0.3.27
threading_layer: pthreads
   architecture: SkylakeX

       user_api: blas
   internal_api: openblas
    num_threads: 16
         prefix: libscipy_openblas
       filepath: /home/marcogorelli/scikit-learn-dev/.venv/lib/python3.11/site-packages/scipy.libs/libscipy_openblas-c128ec02.so
        version: 0.3.27.dev
threading_layer: pthreads
   architecture: SkylakeX

       user_api: openmp
   internal_api: openmp
    num_threads: 16
         prefix: libgomp
       filepath: /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
        version: None
