ValueError: buffer source array is read-only #25247

Closed
@rxk2rxk

Describe the bug

Hi,

When training a RandomForestClassifier on multiple cores (n_jobs=-1), I get the following error (full traceback below):

ValueError: buffer source array is read-only

This doesn't happen when using just one core or when a small dataset is used for training (by subsampling a large one).
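
The dependence on both process count and dataset size is consistent with joblib's documented memmapping of large input arrays, which worker processes receive as read-only buffers; the report does not confirm that this is the cause here. As a minimal sketch of how a read-only array can be detected and swapped for a writeable copy, assuming plain NumPy arrays (`ensure_writeable` is a hypothetical helper, not part of scikit-learn or joblib):

```python
import numpy as np


def ensure_writeable(arr):
    """Return arr unchanged if it is writeable, otherwise a writeable copy."""
    if arr.flags.writeable:
        return arr
    return arr.copy()  # a fresh copy owns its data and is writeable


# Simulate the kind of read-only array a worker process may receive.
frozen = np.arange(5, dtype=np.float64)
frozen.setflags(write=False)

thawed = ensure_writeable(frozen)
thawed[0] = 42.0  # succeeds; the same assignment on `frozen` would raise ValueError
```

Copying duplicates the array in memory, so this is better suited to diagnosing where a read-only buffer enters the computation than to working around the error on a large dataset.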

It's not straightforward to provide reproducible code as this happens with a fairly large dataset (~100K training records).

The code is running on a MacBook Pro (6-Core Intel Core i7, Monterey 12.5.1) under Python 3.9 (see version info below).

Note: A similar error is mentioned in bug reports #15851 and #16331 - but it appears this issue has not been fully fixed.

Thanks,
Ron

Steps/Code to Reproduce

Here are the relevant lines of code:

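    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.feature_selection import SelectPercentile, chi2
    from sklearn.pipeline import Pipeline

    # TF-IDF features -> chi2 feature selection -> calibrated random forest
    clf = Pipeline([
        ('tfidf', TfidfVectorizer(ngram_range=(1, 1),
                                  use_idf=True,
                                  max_df=1.0,
                                  max_features=None
                                  )
         ),
        ('chi2p', SelectPercentile(chi2, percentile=100)),
        ('clf', CalibratedClassifierCV(RandomForestClassifier(random_state=None,
                                                              max_depth=50,
                                                              class_weight='balanced',
                                                              n_jobs=-1
                                                              )
                                       )
         )
    ])

    # data_train (text documents) and target_train (labels) are not shown in this report
    clf.fit(data_train, target_train)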
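
Since the original dataset cannot be shared, a synthetic corpus of comparable size may help others attempt a reproduction. The following is a minimal sketch using only the standard library; `make_synthetic_corpus` and all of its parameters are hypothetical, and whether random text of this shape triggers the same error is untested.

```python
import random


def make_synthetic_corpus(n_docs, n_classes=5, vocab_size=1000, doc_len=20, seed=0):
    """Generate random whitespace-separated documents with random integer labels."""
    rng = random.Random(seed)  # local generator so results are repeatable
    vocab = [f"tok{i}" for i in range(vocab_size)]
    docs = []
    labels = []
    for _ in range(n_docs):
        # Each document is doc_len tokens drawn uniformly from the vocabulary.
        docs.append(" ".join(rng.choices(vocab, k=doc_len)))
        labels.append(rng.randrange(n_classes))
    return docs, labels
```

Calling `make_synthetic_corpus(100_000)` returns two lists that could stand in for `data_train` and `target_train` in the `clf.fit` call above; varying `n_docs` would show whether the error depends on dataset size in the way described.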

Expected Results

No error is thrown.

Actual Results

Traceback (most recent call last):
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/DRG/sklearn_sgdclassifier_autocoder.py", line 365, in <module>
    clf_drg_code = trainClassifier(df_train, target_column, num_estimators, model_filename)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/DRG/sklearn_sgdclassifier_autocoder.py", line 233, in trainClassifier
    clf = ensembleClassifier(df_train, target_column, num_estimators)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/DRG/sklearn_sgdclassifier_autocoder.py", line 155, in ensembleClassifier
    clf.fit(data_train, target_train)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/sklearn/ensemble/_voting.py", line 347, in fit
    return super().fit(X, transformed_y, sample_weight)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/sklearn/ensemble/_voting.py", line 83, in fit
    self.estimators_ = Parallel(n_jobs=self.n_jobs)(
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 1085, in __call__
    if self.dispatch_one_batch(iterator):
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 901, in dispatch_one_batch
    self._dispatch(tasks)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 819, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/_parallel_backends.py", line 597, in __init__
    self.results = batch()
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 288, in __call__
    return [func(*args, **kwargs)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 288, in <listcomp>
    return [func(*args, **kwargs)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/sklearn/utils/fixes.py", line 117, in __call__
    return self.function(*args, **kwargs)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/sklearn/ensemble/_base.py", line 46, in _fit_single_estimator
    estimator.fit(X, y)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/sklearn/pipeline.py", line 406, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/sklearn/calibration.py", line 396, in fit
    self.calibrated_classifiers_ = parallel(
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 1085, in __call__
    if self.dispatch_one_batch(iterator):
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 901, in dispatch_one_batch
    self._dispatch(tasks)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 819, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/_parallel_backends.py", line 597, in __init__
    self.results = batch()
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 288, in __call__
    return [func(*args, **kwargs)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 288, in <listcomp>
    return [func(*args, **kwargs)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/sklearn/utils/fixes.py", line 117, in __call__
    return self.function(*args, **kwargs)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/sklearn/calibration.py", line 578, in _fit_classifier_calibrator_pair
    estimator.fit(X_train, y_train, **fit_params_train)
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/sklearn/ensemble/_forest.py", line 474, in fit
    trees = Parallel(
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 1098, in __call__
    self.retrieve()
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/parallel.py", line 975, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/Users/ron.katriel/PycharmProjects/Classifier/COST/venv/lib/python3.9/site-packages/joblib/_parallel_backends.py", line 567, in wrap_future_result
    return future.result(timeout=timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 440, in result
    return self.__get_result()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
ValueError: buffer source array is read-only

Process finished with exit code 1

Versions

System:
    python: 3.9.0 (v3.9.0:9cf6752276, Oct  5 2020, 11:29:23)  [Clang 6.0 (clang-600.0.57)]
executable: /Users/ron.katriel/PycharmProjects/Classifier/COST/venv/bin/python
   machine: macOS-10.16-x86_64-i386-64bit
Python dependencies:
      sklearn: 1.2.0
          pip: 22.3.1
   setuptools: 65.6.3
        numpy: 1.23.3
        scipy: 1.9.1
       Cython: None
       pandas: 1.5.2
   matplotlib: 3.5.3
       joblib: 1.2.0
threadpoolctl: 3.1.0
Built with OpenMP: True
