DOC Improve documentation regarding some pitfalls in interpretation #20451

jygerardy · 2021-07-03T02:00:54Z

Reference Issues/PRs

Fixes #19413

What does this implement/fix? Explain your changes.

In the "Common pitfalls in interpretation of coefficients of linear models" example:

- add a warning against giving a causal interpretation to coefficients;
- add a warning that interpretations drawn from a model may not apply to the data generating process when the model is poor or the sample is not representative of the population.

Add a tutorial showing, via simulation, that coefficients can be biased in the presence of unobserved confounders.
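As a rough illustration of the kind of simulation described above, here is a minimal sketch. It is not the code from this PR: the variable names (`confounder`, `treatment`, `outcome`) and all coefficient values are assumptions chosen for the illustration. It fits one linear model that observes the confounder and one that does not, then compares the estimated coefficient of `treatment` with its true value.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_samples = 10_000

# Hypothetical data generating process (names and values are assumptions):
# `confounder` influences both `treatment` and `outcome`.
confounder = rng.normal(size=n_samples)
treatment = 0.8 * confounder + rng.normal(size=n_samples)
true_effect = 1.0
outcome = true_effect * treatment + 2.0 * confounder + rng.normal(size=n_samples)

# Model that observes the confounder: the coefficient of `treatment`
# should be close to the true effect.
X_full = np.column_stack([treatment, confounder])
coef_with_confounder = LinearRegression().fit(X_full, outcome).coef_[0]

# Model that does not observe the confounder: the coefficient of
# `treatment` absorbs part of the confounder's effect and is biased.
X_partial = treatment.reshape(-1, 1)
coef_without_confounder = LinearRegression().fit(X_partial, outcome).coef_[0]

print(f"true effect:                {true_effect:.2f}")
print(f"estimate with confounder:   {coef_with_confounder:.2f}")
print(f"estimate without confounder: {coef_without_confounder:.2f}")
```

Under these assumed values, the model without the confounder overestimates the effect of `treatment`, even though it may still predict `outcome` reasonably well.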

Any other comments?

glemaitre

I think this is a nice addition. I added a couple of style changes, but the example itself looks good to me. We would still need more insights from people with more expertise in causal inference.

ping @dsleo @GaelVaroquaux

glemaitre · 2021-07-21T15:30:20Z

examples/inspection/plot_causal_interpretation.py

+print(__doc__)
+
+import numpy as np
+from sklearn.linear_model import LinearRegression


You can move this import next to where it is used; the same goes for numpy.

dsleo · 2021-07-23T07:33:54Z

Full disclosure: we work with @jygerardy, and I pointed him to this issue after the last tech committee. @jygerardy also has strong expertise in causal inference (shameless article plug).

thomasjpfan

Thank you for the PR @jygerardy!

(Thinking about the data generating process first feels very... Bayesian :D)

jygerardy · 2021-07-26T18:32:39Z

Thank you for all the useful suggestions @glemaitre @dsleo @thomasjpfan !
I added them all.

thomasjpfan

I left some small comments, otherwise I think this PR is ready to go!

jygerardy · 2021-12-08T13:56:51Z

I think we're good now @glemaitre @thomasjpfan.
Thanks!

ogrisel · 2021-12-20T08:57:18Z

It seems that there was a problem in the last merge: many unrelated commits ended up in this branch. Would you mind starting this PR again with only the few files that are originally intended to be edited by the PR?

glemaitre · 2022-12-01T17:31:01Z

I updated the example with the following changes:

- sync with main
- split the analysis of the 2 predictive models into 2 sections
- add a plot to compare the coefficients of the true generative model and the predictive models

@ArturoAmorQ do you want to have a look at this example as well and give a review?
I think that it can go in the 1.3 release.
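A comparison of true and estimated coefficients like the one mentioned above could be set up roughly as follows. This is only a sketch under stated assumptions, not the code added to the example: the feature names (`x_observed`, `x_treatment`, `x_hidden`), the coefficient values, and the helper `fitted_coef` are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n_samples = 5_000

# Hypothetical generative model: names and coefficient values are assumptions.
true_coef = pd.Series({"x_observed": 0.5, "x_treatment": 1.0, "x_hidden": 2.0})

x_hidden = rng.normal(size=n_samples)
X = pd.DataFrame(
    {
        "x_observed": rng.normal(size=n_samples),
        # `x_treatment` is correlated with the hidden variable.
        "x_treatment": 0.7 * x_hidden + rng.normal(size=n_samples),
        "x_hidden": x_hidden,
    }
)
y = X @ true_coef + rng.normal(scale=0.5, size=n_samples)


def fitted_coef(columns):
    """Fit a linear model on the given columns and return its coefficients."""
    model = LinearRegression().fit(X[columns], y)
    return pd.Series(model.coef_, index=columns)


# One column per model; features a model never saw show up as NaN.
comparison = pd.DataFrame(
    {
        "true generative model": true_coef,
        "model with x_hidden": fitted_coef(["x_observed", "x_treatment", "x_hidden"]),
        "model without x_hidden": fitted_coef(["x_observed", "x_treatment"]),
    }
)
print(comparison.round(2))
```

If matplotlib is installed, `comparison.plot.barh()` draws the same table as grouped horizontal bars, which makes the gap between the true and the estimated coefficient of `x_treatment` easy to see.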

ArturoAmorQ

Are you still working on this PR @jygerardy? If so, here is a first batch of comments.

ArturoAmorQ · 2022-12-06T10:02:59Z

examples/inspection/plot_linear_model_coefficient_interpretation.py

+# Warning: data and model quality
+# -------------------------------
+#
+# Keep in mind that the outcome `y` and features `X` are the product
+# of a data generating process that is hidden from us. Machine
+# learning models are trained to approximate the unobserved
+# mathematical function that links `X` to `y` from sample data. As a
+# result, any interpretation made about a model may not necessarily
+# generalize to the true data generating process. This is especially
+# true when the model is of bad quality or when the sample data is
+# not representative of the population.


Instead of creating a new section, I would add this text as a note on the current line 21, i.e. in the header. This way such an important statement will gain visibility.
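To make the warning quoted above concrete, here is a small sketch of how a poor model combined with a non-representative sample can lead to a misleading interpretation. Everything in it is an assumption made for illustration: the quadratic data generating process, the sampling ranges, and the helper `draw_sample` do not come from the PR.

```python
import numpy as np

rng = np.random.default_rng(0)


def draw_sample(low, high, n_samples=5_000):
    """Draw (x, y) pairs from a hypothetical quadratic process y = x**2 + noise."""
    x = rng.uniform(low, high, size=n_samples)
    y = x**2 + rng.normal(scale=0.1, size=n_samples)
    return x, y


# Sample covering the whole (assumed) population range of x.
x_population, y_population = draw_sample(-1.0, 1.0)
# Non-representative sample: only positive values of x were collected.
x_biased, y_biased = draw_sample(0.0, 1.0)

# A straight line is a poor model of a quadratic relationship.
# np.polyfit returns the coefficients from highest degree to lowest.
slope_population = np.polyfit(x_population, y_population, deg=1)[0]
slope_biased = np.polyfit(x_biased, y_biased, deg=1)[0]

print(f"slope fitted on the representative sample: {slope_population:.2f}")
print(f"slope fitted on the biased sample:         {slope_biased:.2f}")
```

Read naively, the slope fitted on the biased sample suggests that `y` increases with `x`, while the slope fitted on the representative sample suggests almost no relationship. Neither describes the assumed quadratic process, which is the sense in which an interpretation of a model may not carry over to the data generating process.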

betatim · 2022-12-08T13:00:09Z

This is a nice example that helps with (what I think of as) common misconceptions about, and limits of, interpretability.

Is there something I can help with to move this forward/towards merge?

glemaitre · 2022-12-08T15:20:53Z

I will apply the changes and make sure the CI works. Then we can make a final review and merge upon the three approvals.

haiatn · 2022-12-16T12:46:33Z

Waiting for this to merge. Good job

…cikit-learn#20451) Co-authored-by: Jean-Yves Gerardy <[email protected]> Co-authored-by: Jean-Yves Gerardy <[email protected]> Co-authored-by: Thomas J. Fan <[email protected]> Co-authored-by: Olivier Grisel <[email protected]> Co-authored-by: Guillaume Lemaitre <[email protected]> Co-authored-by: Arturo Amor <[email protected]> Co-authored-by: Tim Head <[email protected]>

Jean-Yves Gerardy added 6 commits July 1, 2021 11:12

Add causal tutorial and warning

4a0757e

Fix typos

07b5ebd

Fix typos

9d72e9b

Fix pep8 violations

2b8ac8a

Fix typo

a8f77f2

Fix pep8 violations

5fb0801

glemaitre changed the title ~~[MRG] Fix Improve documentation regarding some pitfalls in interpretation~~ DOC Improve documentation regarding some pitfalls in interpretation Jul 21, 2021

glemaitre reviewed Jul 21, 2021

github-actions bot added the Documentation label Jul 21, 2021

Jean-Yves Gerardy added 2 commits July 22, 2021 15:26

Add suggestions

75a75eb

Fix pep8 violation

5ec5b10

dsleo reviewed Jul 23, 2021

examples/inspection/plot_causal_interpretation.py (3 threads, outdated, resolved)

examples/inspection/plot_linear_model_coefficient_interpretation.py (3 threads, outdated, resolved)

thomasjpfan reviewed Jul 24, 2021

examples/inspection/plot_causal_interpretation.py (2 threads, outdated, resolved)

Jean-Yves Gerardy added 3 commits July 26, 2021 11:44

Add suggestions and fix typos

5f76950

Merge branch 'main' into causal_interpretation

f26389d

Add suggestions

5e3f801

Subsitute RandomState for Seed

5f22102

thomasjpfan reviewed Oct 5, 2021

github-actions bot added the cython label Dec 7, 2021

glemaitre self-requested a review December 14, 2021 18:34

Update examples/inspection/plot_causal_interpretation.py

c6b4538

Co-authored-by: Thomas J. Fan <[email protected]>

jygerardy and others added 5 commits January 12, 2022 11:37

Update examples/inspection/plot_causal_interpretation.py

9b45dff

Co-authored-by: Thomas J. Fan <[email protected]>

Update examples/inspection/plot_causal_interpretation.py

b4c6fac

Co-authored-by: Thomas J. Fan <[email protected]>

Update examples/inspection/plot_causal_interpretation.py

970b753

Co-authored-by: Thomas J. Fan <[email protected]>

fix typos and change grid order

40ba0ba

reformat plot_causal_interpretation

21958f9

jygerardy force-pushed the causal_interpretation branch from 5e0d8a6 to 21958f9 Compare January 12, 2022 16:39

Merge branch 'main' into causal_interpretation

9bc814b

cmarmo added the Waiting for Reviewer label Feb 14, 2022

Merge branch 'main' into causal_interpretation

1eb78df

jjerphan removed the cython label Jul 29, 2022

glemaitre added 3 commits December 1, 2022 16:04

Merge remote-tracking branch 'origin/main' into pr/jygerardy/20451

409a06e

reformat

a0ba2e2

DOC plot the coefficients

38af3b9

glemaitre added this to the 1.3 milestone Dec 1, 2022

avoid multiple pandas import

01a7da7

betatim reviewed Dec 2, 2022

examples/inspection/plot_linear_model_coefficient_interpretation.py (1 thread, outdated, resolved)

ArturoAmorQ reviewed Dec 6, 2022

glemaitre and others added 2 commits December 8, 2022 16:38

Apply suggestions from code review

b555b34

Co-authored-by: Arturo Amor <[email protected]> Co-authored-by: Tim Head <[email protected]>

Merge branch 'main' into causal_interpretation

f9d5286

glemaitre approved these changes Jan 11, 2023

glemaitre merged commit c892ade into scikit-learn:main Jan 11, 2023

ArturoAmorQ mentioned this pull request Jan 20, 2023

DOC Improve visibility of warning message on example "Pitfalls in the interpretation of coefficients of linear models" #25441

Merged
