MNT Configure sphinx linkcheck to be more useful #23577

Merged 1 commit on Jun 15, 2022

@lesteve lesteve commented Jun 10, 2022

Right now, we rarely run make linkcheck because there is too much noise in the output and it takes a while.

Here is the list of changes that this PR introduces:

  • do not run the examples when running linkcheck
  • exclude the whats_new files from linkcheck: they contain a lot of GitHub links and take a long time to check (on my machines, ~15 minutes when checking whats_new files, ~3 minutes when not checking them). Alternatively, we could check only a few of the latest whats_new files.
  • set a GitHub token from an environment variable, if it is set, to avoid GitHub rate limits
  • set a timeout to fail faster on some problematic websites
  • ignore local links (e.g. in image directive target). There may be a better way but I have not found it ...
  • allow redirects, which turns redirects into warnings rather than broken links
  • use a browser-like user agent to decrease the number of falsely broken links, i.e. links that linkcheck identifies as broken but that work fine in a browser
  • ignore some broken links. There are more broken links; I am planning to open a meta-issue about well-identified broken links in the near future. I would say a fair fraction of them are links to articles that have since moved somewhere else.
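
For readers who have not seen the diff, the linkcheck-related items above could look roughly like the following fragment of a Sphinx conf.py. This is a sketch, not the merged code: the option names are standard Sphinx settings, but the patterns, the timeout value, the user agent string and the GITHUB_TOKEN variable name are all assumptions chosen for illustration. Skipping the examples is handled outside these options and is not shown.

```python
import os

# Skip the changelog pages, which contain many GitHub links.
# The pattern is an assumption about the docs layout.
linkcheck_exclude_documents = [r"whats_new/.*"]

# Fail faster on slow or unresponsive sites (seconds; value is illustrative).
linkcheck_timeout = 10

# Redirects matching a (source pattern -> target pattern) pair count as working;
# other redirects are reported as warnings. The doi.org entry is an example only.
linkcheck_allowed_redirects = {r"https://doi\.org/.+": r".*"}

linkcheck_ignore = [
    # Relative links to local files, e.g. "./page.html" or "../images/plot.png".
    r"^\.\.?/",
]

# Browser-like user agent (illustrative string) to reduce false "broken" reports.
user_agent = "Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0"

# Send a GitHub token only when one is available in the environment.
# GITHUB_TOKEN is an assumed variable name.
github_token = os.getenv("GITHUB_TOKEN")
linkcheck_request_headers = (
    {"https://github.com/": {"Authorization": f"token {github_token}"}}
    if github_token
    else {}
)
```

Keeping the token optional means the same configuration works on a contributor's machine without any setup, at the cost of hitting rate limits sooner.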

@thomasjpfan thomasjpfan left a comment

Thank you for working on this.

When running this locally, there are quite a few errors for redirects. Moving forward, should we update those links with the URL it ended up redirecting to?

The alternative would be to add a bunch more links into linkcheck_allowed_redirects.

lesteve commented Jun 13, 2022

When running this locally, there are quite a few errors for redirects. Moving forward, should we update those links with the URL it ended up redirecting to?

I think that by defining linkcheck_allowed_redirects, all redirects are treated as warnings, not errors. Do you see errors e.g. "broken" something in the line rather than "redirect"?

Overall I think broken links should be fixed first. I find redirect links tolerable and I would avoid writing a complex linkcheck_allowed_redirects to get rid of all of them. At worst, one day the redirect stops working, and we have to fix it because it will be a broken link.

An example of a redirect not worth fixing:

Examples of redirects that may be worth fixing:

  • moved projects on github
  • links to old numpy doc (e.g. docs.scipy.org for numpy doc)
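
To make the redirect discussion concrete, here is a small sketch of how a linkcheck_allowed_redirects mapping separates tolerated redirects from reported ones, following the behaviour described in the Sphinx documentation: a redirect counts as working when the original URL matches a source pattern and the final URL matches the paired target pattern. The helper classify_redirect and all URLs are hypothetical and only mimic that rule; this is not Sphinx's own code.

```python
import re


def classify_redirect(url, new_url, allowed_redirects):
    """Return "working" if the redirect matches an allowed pair, else "redirected"."""
    for source_pattern, target_pattern in allowed_redirects.items():
        if re.match(source_pattern, url) and re.match(target_pattern, new_url):
            return "working"
    # Not covered by any pair: linkcheck would report it as a redirect warning.
    return "redirected"


# Illustrative mapping: tolerate any redirect that starts at doi.org.
allowed = {r"https://doi\.org/.+": r".*"}

# A DOI link redirecting to a (made-up) publisher page is tolerated.
doi_status = classify_redirect(
    "https://doi.org/10.1000/example",
    "https://publisher.example.org/article",
    allowed,
)

# A (made-up) moved GitHub project is not covered, so it is still reported.
moved_status = classify_redirect(
    "https://github.com/old-org/project",
    "https://github.com/new-org/project",
    allowed,
)
```

A short mapping like this keeps stable, intentional redirects quiet while moved projects and outdated documentation hosts continue to show up as warnings that can be fixed later.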

@ogrisel ogrisel left a comment

I tried the PR locally and it runs as expected. +1 for merging this and opening follow-up PRs to deal with the contextual redirects we want to ignore, and to fix the broken links or update the permanent redirects in our doc.

r"https://github.com/conda-forge/miniforge#miniforge",
r"https://stackoverflow.com/questions/5836335/"
"consistently-create-same-random-numpy-array/5837352#comment6712034_5837352",
]

We might want to do a separate PR to remove testimonials of organizations that no longer exist.

@ogrisel ogrisel merged commit a5828d6 into scikit-learn:main Jun 15, 2022

ogrisel commented Jun 15, 2022

I merged without waiting for @thomasjpfan's +1 because this PR itself is a net improvement that does not directly impact scikit-learn users, and we can better address suggestions for improvement in follow-up PRs.

@lesteve lesteve deleted the configure-linkcheck branch June 15, 2022 08:56
@thomasjpfan

Do you see errors e.g. "broken" something in the line rather than "redirect"?

You are correct, I see warnings and not errors.

ogrisel pushed a commit to ogrisel/scikit-learn that referenced this pull request Jul 11, 2022
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Aug 4, 2022