[MRG] Change dataset for test_classifier_chain_vs_independent_models #9255
Conversation
sklearn/tests/test_multioutput.py
```diff
-    X_test = X[2000:, :]
-    Y_train = Y[:2000, :]
-    Y_test = Y[2000:, :]
+    X, y = make_classification(n_samples=1000,
```
Does `generate_multilabel_dataset_with_correlations` not work as is?
@jnothman Thanks. I think we need a random state here to ensure a deterministic result. (In the previous test, the author's purpose was to ensure that different calculation methods get the same result, so a random state was not needed.) There are indeed some cases where ClassifierChain gets a worse result.
Ordinarily we tend to include a random state at all invocations in the tests to avoid occasional failures... (Although I wish that we had marked all tests that should work for any random state.) Might be cleanest to add a `random_state` parameter to that helper.
@jnothman Thanks. I also think that is good. For this test, the improvement of the model is around 1% with both the original dataset and the new dataset. Since the new dataset relies on a random state, I'm really worried about test failures if we don't fix one.
@adamklec, could you remind me whether there was any special motivation for this test being applied to yeast? We're having trouble with the mldata servers' unreliability, and could do without depending on them for tests to pass.
It is probably worth amending the author's function to avoid duplicated effort now and in the future.
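As an illustration, a minimal sketch of what an amended helper might look like. This is an assumption, not the test's actual code: the parameter values and the binary-encoding trick are illustrative; only the added `random_state` argument is the point here.

```python
import numpy as np
from sklearn.datasets import make_classification


def generate_multilabel_dataset_with_correlations(random_state=0):
    # Draw a multiclass problem, then binary-encode the class index so
    # that the resulting label columns are correlated by construction.
    X, y = make_classification(n_samples=1000,
                               n_features=100,
                               n_classes=16,
                               n_informative=10,
                               random_state=random_state)
    # 16 classes -> 4 correlated binary label columns.
    Y_multi = np.array([[int(bit) for bit in format(label, '04b')]
                        for label in y])
    return X, Y_multi
```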
```diff
-        chain = ClassifierChain(LogisticRegression(),
-                                order=np.array([0, 2, 4, 6, 8, 10,
-                                                12, 1, 3, 5, 7, 9,
-                                                11, 13]))
+        chain = ClassifierChain(LogisticRegression())
```
Why the change in orderings?
The specified ordering was chosen on the basis that the test only aimed to show that there exists some order for which the chain exceeds the independent models in performance. (Perhaps a comment should note this.) If the default order works, that's fine.
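For reference, a sketch of the comparison under discussion, using the default order instead of the explicit one. Names and the train/test split are illustrative and assume the helper sketched above:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.multioutput import ClassifierChain

X, Y = generate_multilabel_dataset_with_correlations(random_state=0)
X_train, Y_train = X[:600], Y[:600]
X_test, Y_test = X[600:], Y[600:]

# Independent per-label baseline.
ovr = OneVsRestClassifier(LogisticRegression())
ovr.fit(X_train, Y_train)
Y_pred_ovr = ovr.predict(X_test)

# Chain with the default order [0, 1, ..., n_labels - 1]; the old
# explicit order was merely one order known to beat the baseline.
chain = ClassifierChain(LogisticRegression())
chain.fit(X_train, Y_train)
Y_pred_chain = chain.predict(X_test)
```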
Thanks. It seems there's already such a comment at the beginning of the function. If there's anything more to clarify, please leave a comment.
LGTM, merging.
@jnothman Sorry for the delayed response. Your explanation of why I used the yeast dataset is correct. I wanted to write a test asserting that, in the presence of correlated classes, ClassifierChain outperforms independent models. For some reason I was having difficulty doing this using the built-in synthetic dataset generators.
Reference Issue
Fixes #9254
What does this implement/fix? Explain your changes.
I follow the author's method from the previous test to construct a dataset. I don't use the function provided by the author (`generate_multilabel_dataset_with_correlations`) because we need a random state to ensure a deterministic result; sometimes the better model doesn't get the better result.
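To illustrate why the fixed random state matters, here is a hedged sketch of the kind of assertion the test makes, reusing the variables from the comparison sketch above (`jaccard_similarity_score` is the 0.19-era metric name, later renamed `jaccard_score`):

```python
from sklearn.metrics import jaccard_similarity_score

# With label correlations present, the chain should beat the
# independent models. On an unseeded dataset this inequality can
# occasionally flip, which is why the PR fixes the random state.
assert (jaccard_similarity_score(Y_test, Y_pred_chain) >
        jaccard_similarity_score(Y_test, Y_pred_ovr))
```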
Any other comments?