DOC Revisit SVM C scaling example #25115
Conversation
Thanks for the PR!
examples/svm/plot_svm_scale_c.py
Outdated
# Now, we can define a linear SVC with the `l1` penalty.
# L1-penalty case
# ---------------
# In the L1 case, theory says that prediction consistency (i.e. that under
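For context, a minimal sketch of what the `l1`-penalized linear SVC discussed in this diff might look like. This is illustrative code, not the example's actual source; the parameter values are assumptions.

```python
# Hypothetical sketch: an l1-penalized LinearSVC as discussed in this diff.
# With penalty="l1", LinearSVC requires loss="squared_hinge" and dual=False.
from sklearn.svm import LinearSVC

model_l1 = LinearSVC(
    penalty="l1",          # sparsity-inducing penalty
    loss="squared_hinge",  # the only loss supported with the l1 penalty
    dual=False,            # the l1 penalty is only available in the primal
    C=1.0,                 # illustrative value; the example sweeps over C
)
```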
While updating this example, have you come across a reference for the "theory says ..." claim?
(It would help resolve #4657)
maybe Theorem 5.1. in https://arxiv.org/pdf/0801.1095.pdf ?
> maybe Theorem 5.1. in arxiv.org/pdf/0801.1095.pdf ?
That theorem establishes an approximate equivalence between the Lasso and the Dantzig selector. I don't think it is related to the claim that "it is not possible for the learned estimator to predict as well as a model knowing the true distribution because of the bias of the L1", simply because it is not true that the L1 norm always introduces bias. I really think we should remove that claim.
OK, I played a bit with this. I agree with @glemaitre that if you don't scale C, the ramp-up is not aligned in the L1 case, yet the maximum is better aligned. If you instead rescale C in the L1 case by sqrt(1/n_samples), the curves are even more aligned.

The reason I tried this is that for the Lasso, asymptotic theory says that lambda should scale with 1/sqrt(n_samples). See e.g. Theorem 3 in https://arxiv.org/pdf/1402.1700.pdf, or https://arxiv.org/pdf/0801.1095.pdf, where the regularization parameter r is always assumed to be proportional to 1/sqrt(n_samples).

What I would suggest is to say that scaling C in the L1 case aligns the ramp-up, but the peak is what matters, and the current behavior with no scaling is pretty OK when it comes to aligning the peaks. My 2c.
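A rough sketch of the rescaling described above (my own illustrative code, not code from the PR): scale C by sqrt(1/n_samples) in the L1 case so the effective regularization strength follows the 1/sqrt(n_samples) asymptotics from the Lasso theory. The dataset and parameter values are assumptions.

```python
# Illustrative sketch of scaling C by sqrt(1 / n_samples) in the L1 case.
# Dataset shape, C_base, and subset sizes are assumptions for demonstration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(
    n_samples=1000, n_features=50, n_informative=5, random_state=0
)

C_base = 1.0
scores = {}
for n in (200, 1000):
    # the rescaling under discussion: C ~ C_base * sqrt(1 / n_samples)
    C_scaled = C_base * np.sqrt(1.0 / n)
    clf = LinearSVC(
        penalty="l1", loss="squared_hinge", dual=False, C=C_scaled, max_iter=10000
    )
    clf.fit(X[:n], y[:n])
    scores[n] = clf.score(X[:n], y[:n])
```

In the actual example, one would compare full validation curves over a grid of C values rather than single fits.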
Thanks @agramfort. We can change the example accordingly with a better narration :).
❤
examples/svm/plot_svm_scale_c.py
Outdated
# Now, we can define a linear SVC with the `l1` penalty.
# L1-penalty case
# ---------------
# In the L1 case, theory says that prediction consistency (i.e. that under given
Since we mention "theory says", I think we should refer to the article cited by Alex.
I am not sure the claim "theory says" is justified by the cited documents, mostly because "prediction consistency" and "model consistency" aren't standard terms in machine learning. I do cite the references in current lines 145 to 148, as they are more relevant at that level of the discussion.
I could still try to rephrase this paragraph to avoid such concepts and keep the underlying idea: L1 may set some coefficients to zero, reducing variance/increasing bias even in the limit where the sample size grows to infinity.
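To illustrate that underlying idea with a small sketch (my own code, not part of the PR): with a small C, the l1 penalty can drive the coefficients of uninformative features exactly to zero. All names and values below are assumptions for demonstration.

```python
# Illustrative sketch: l1 regularization zeroing out coefficients.
# A strongly regularized fit (small C) on data with few informative features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(
    n_samples=200, n_features=30, n_informative=3, n_redundant=0, random_state=0
)
clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=0.01)
clf.fit(X, y)

# Count exact zeros: with strong l1 regularization, many uninformative
# features are dropped, reducing variance at the cost of some bias.
n_zero = int(np.sum(clf.coef_ == 0))
```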
Indeed, this would be nice.
Otherwise, LGTM.
LGTM. Thanks @ArturoAmorQ
Co-authored-by: ArturoAmorQ <[email protected]> Co-authored-by: Guillaume Lemaitre <[email protected]>
Reference Issues/PRs
Follow-up of #21776. See also #779.
What does this implement/fix? Explain your changes.
This example had room for improvement in terms of wording and clarity of scope. Hopefully this PR fixes it.
Any other comments?
This PR removes one of the two synthetic datasets used in the previous narrative, so a single sparse dataset is now used to demo both the L1 and L2 penalties.