deadlock in multioutput #8543

Closed

yupbank opened this issue Mar 6, 2017 · 6 comments

Comments

@yupbank
Contributor

yupbank commented Mar 6, 2017

Description

MultiOutputClassifier.fit never ends when the base classifier supports n_jobs.

Steps/Code to Reproduce

Example:

import numpy as np
import sklearn.datasets as datasets
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

# 100000 samples, 200 features, and 1000 (identical) output columns
x, y = datasets.make_classification(n_samples=100000, n_features=200)
multi_y = np.hstack([y[:, np.newaxis] for i in xrange(1000)])

# both the base estimator and the multi-output wrapper request all cores
base_old = LogisticRegression(solver='lbfgs', n_jobs=-1)
multi_clf_never_end = MultiOutputClassifier(base_old, n_jobs=-1)
multi_clf_never_end.fit(x, multi_y)

Expected Results

multi_clf_never_end fits as expected.

Actual Results

It blocks; fit never returns.

Versions

Darwin-16.4.0-x86_64-i386-64bit
('Python', '2.7.13 (default, Dec 18 2016, 07:03:39) \n[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)]')
('NumPy', '1.11.2')
('SciPy', '0.16.1')
('Scikit-Learn', '0.19.dev0')

@rth
Member

rth commented Mar 6, 2017

@yupbank Doesn't sound like a deadlock. You are training LogisticRegression 1000 times on a (100000, 200) input dataset. Doing that on a (1000, 200) dataset already takes ~1 min on a 4-core CPU. I'm not sure of the exact scaling of LogisticRegression with n_samples when using lbfgs, but this would probably take hours to compute.

MultiOutputClassifier.fit never ends when the base classifier supports n_jobs.

How long have you waited?
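
For scale, here is a minimal sketch of how to check that estimate (it assumes the same data and solver as the snippet above): time one LogisticRegression fit on the full dataset and multiply by the number of output columns, since MultiOutputClassifier fits one estimator per column.

import time

import sklearn.datasets as datasets
from sklearn.linear_model import LogisticRegression

# same data shape as in the report: 100000 samples, 200 features
x, y = datasets.make_classification(n_samples=100000, n_features=200)

clf = LogisticRegression(solver='lbfgs')
start = time.time()
clf.fit(x, y)  # one of the 1000 fits MultiOutputClassifier would run
single_fit = time.time() - start

# MultiOutputClassifier fits one clone per output column, so the serial total
# is roughly 1000 * single_fit; n_jobs divides that at best by the core count
print('one fit: %.1fs, ~%.2f CPU-hours for 1000 outputs'
      % (single_fit, 1000 * single_fit / 3600.0))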

@yupbank
Contributor Author

yupbank commented Mar 6, 2017

~40 minutes.

@rth
Member

rth commented Mar 6, 2017

Using how many CPUs? This could take ≳ 1.3 CPU-hours by my estimate...

@rth
Member

rth commented Mar 6, 2017

Also, using n_jobs > 1 actually slows things down here.

Using a 100-column output (instead of 1000 as in your example) on a 4-core CPU:

  • MultiOutputClassifier(base_old, n_jobs=1):
    $ time python /tmp/test.py
    
    real    0m42.301s
    user    2m42.168s
    sys     0m2.268s
    
  • MultiOutputClassifier(base_old, n_jobs=4):
    $ time python /tmp/test.py
    
    real    2m49.912s
    user    9m4.964s
    sys     2m8.304s
    

So using n_jobs=4 slows the computation down almost exactly 4x here, probably for the same reason as #8216: joblib.Parallel pickling / memmapping overhead...
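
A minimal, self-contained sketch of the comparison above (timing from within Python rather than with the shell's time, and assuming the same data as before with a 100-column output):

import time

import numpy as np
import sklearn.datasets as datasets
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

x, y = datasets.make_classification(n_samples=100000, n_features=200)
multi_y = np.hstack([y[:, np.newaxis] for i in range(100)])  # 100 output columns

base_old = LogisticRegression(solver='lbfgs', n_jobs=-1)
for n_jobs in (1, 4):
    clf = MultiOutputClassifier(base_old, n_jobs=n_jobs)
    start = time.time()
    clf.fit(x, multi_y)
    print('MultiOutputClassifier n_jobs=%d: %.1fs' % (n_jobs, time.time() - start))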

@yupbank
Contributor Author

yupbank commented Mar 6, 2017

Interesting... well, I have made this finish within ~3 minutes with a 1000-column output, using an updated loss function and gradient.

@lesteve
Member

lesteve commented Mar 10, 2017

It's hard to tell whether there is an actual problem here, or whether it is just that your snippet takes a long time to run. I am going to close this one; @yupbank, feel free to reopen if you have some new information to add to this issue.

@lesteve lesteve closed this as completed Mar 10, 2017