HistGradientBoostingRegressor is slower when torch not imported #26752

Closed
@davidgilbertson

Describe the bug

This is perhaps not a bug but an opportunity for improvement. I've noticed that scikit-learn runs considerably faster if I happen to import torch before any sklearn imports.

This first block of code runs much slower:

from sklearn.ensemble import HistGradientBoostingRegressor
import numpy as np


X = np.random.random(size=(50, 10000))
y = np.random.random(size=50)

estimator = HistGradientBoostingRegressor(verbose=True)
estimator.fit(X, y)

Than this second block of code:

import torch  # The only difference
from sklearn.ensemble import HistGradientBoostingRegressor
import numpy as np


X = np.random.random(size=(50, 10000))
y = np.random.random(size=50)

estimator = HistGradientBoostingRegressor(verbose=True)
estimator.fit(X, y)

Here are the run times over 6 runs each on my actual code, the only difference being an import of torch:

[image: run times for 6 runs each, with and without the torch import]
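
A minimal sketch of how the comparison above could be timed outside of "my actual code". It assumes each variant should run in a fresh interpreter so that import order is controlled; `time_snippet`, `BODY`, `WITHOUT_TORCH` and `WITH_TORCH` are illustrative names, not anything from scikit-learn or torch.

```python
import subprocess
import sys
import time

# The two variants from the report, as source strings. They differ only in
# the leading `import torch` line. (verbose=True is dropped to keep output quiet.)
BODY = """
from sklearn.ensemble import HistGradientBoostingRegressor
import numpy as np

X = np.random.random(size=(50, 10000))
y = np.random.random(size=50)

HistGradientBoostingRegressor().fit(X, y)
"""
WITHOUT_TORCH = BODY
WITH_TORCH = "import torch  # The only difference\n" + BODY


def time_snippet(code, runs=6):
    """Run `code` in a fresh interpreter `runs` times; return wall-clock seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the child interpreter exits with an error.
        subprocess.run([sys.executable, "-c", code], check=True)
        timings.append(time.perf_counter() - start)
    return timings
```

Calling `time_snippet(WITHOUT_TORCH)` and `time_snippet(WITH_TORCH)` gives six timings each. Note that these timings include interpreter start-up and import time, so the cost of importing torch counts against the second variant.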

I know it's confusing that I'm importing torch but not using it, so to be clear, I don't use the torch module in any way in the script. I just happened to stumble across the performance improvement at one point when I imported torch for some other purpose. It's literally just sitting there as an 'unused import' making my code run much faster.

I've tested with a few other regressors, including RandomForestRegressor and GradientBoostingRegressor, and I don't see any difference with those.

I compared os.environ in both cases and they're the same. I looked at sklearn.base.get_config() and they're identical in both cases too. I notice that torch sets OMP_NUM_THREADS to 10, while without the torch import this value is set to 20 (on my machine with 20 cores). But even manually setting this to 10 doesn't bridge the gap.
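
One way to check the thread counts mentioned above is to list the OpenMP runtimes loaded in the process. This is a sketch that assumes threadpoolctl is installed (it appears in the Versions section below); `openmp_pools` and `current_openmp_pools` are hypothetical helper names.

```python
def openmp_pools(info):
    """Return (filepath, num_threads) for every OpenMP runtime in `info`.

    `info` is a list of dicts shaped like threadpoolctl.threadpool_info() output.
    """
    return [
        (entry["filepath"], entry["num_threads"])
        for entry in info
        if entry.get("user_api") == "openmp"
    ]


def current_openmp_pools():
    """Inspect the running process (requires threadpoolctl to be installed)."""
    from threadpoolctl import threadpool_info  # imported lazily on purpose

    return openmp_pools(threadpool_info())
```

Running `current_openmp_pools()` after the imports in each variant should show which libgomp copies are loaded and how many threads each one reports.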

I don't know enough about torch or sklearn to work out what else is going on. I'm guessing someone who's worked on HistGradientBoostingRegressor might know? It seems like there's a nice performance gain to be found somewhere in here.

Steps/Code to Reproduce

As above

Expected Results

Should be max fast all the time.

Actual Results

Is not max fast unless I import torch.

Also as a general thing it would be nice to be able to pass n_jobs to the constructor. Having something use all 20 cores is not always the fastest way.
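
In the absence of an n_jobs parameter, a possible workaround is to cap the OpenMP thread pools around the fit call. This is a sketch assuming threadpoolctl's `threadpool_limits` context manager reaches the OpenMP runtime that the estimator uses; `fit_with_thread_cap` is a hypothetical helper, not part of scikit-learn.

```python
def fit_with_thread_cap(estimator, X, y, max_threads):
    """Fit `estimator` while OpenMP thread pools are capped at `max_threads`."""
    from threadpoolctl import threadpool_limits  # imported lazily on purpose

    # The limit applies only inside the `with` block and is restored afterwards.
    with threadpool_limits(limits=max_threads, user_api="openmp"):
        return estimator.fit(X, y)
```

For example, `fit_with_thread_cap(HistGradientBoostingRegressor(), X, y, max_threads=10)` would try 10 threads instead of the default 20 on this machine.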

Versions

System:
    python: 3.10.8 (main, Oct 12 2022, 19:14:26) [GCC 9.4.0]
executable: /home/davidg/.virtualenvs/learning/bin/python
   machine: Linux-5.15.90.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Python dependencies:
      sklearn: 1.2.2
          pip: 23.1.2
   setuptools: 59.5.0
        numpy: 1.24.3
        scipy: 1.10.1
       Cython: 0.29.33
       pandas: 2.0.1
   matplotlib: 3.7.0
       joblib: 1.2.0
threadpoolctl: 3.1.0
Built with OpenMP: True
threadpoolctl info:
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /home/davidg/.virtualenvs/learning/lib/python3.10/site-packages/numpy.libs/libopenblas64_p-r0-15028c96.3.21.so
        version: 0.3.21
threading_layer: pthreads
   architecture: Haswell
    num_threads: 20
       user_api: openmp
   internal_api: openmp
         prefix: libgomp
       filepath: /home/davidg/.virtualenvs/learning/lib/python3.10/site-packages/torch/lib/libgomp-a34b3233.so.1
        version: None
    num_threads: 10
       user_api: openmp
   internal_api: openmp
         prefix: libgomp
       filepath: /home/davidg/.virtualenvs/learning/lib/python3.10/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
        version: None
    num_threads: 20
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /home/davidg/.virtualenvs/learning/lib/python3.10/site-packages/scipy.libs/libopenblasp-r0-41284840.3.18.so
        version: 0.3.18
threading_layer: pthreads
   architecture: Haswell
    num_threads: 20
