Back-reference optimizations #272

QuLogic · 2017-07-29T00:33:44Z

Finding back-references involves downloading a file for every possibly used object in order to do two things:

Check for the correct extension
Check for the correct anchor

All this downloading makes back-referencing slow, so instead:

Download the index page, parse out the DOCUMENTATION_OPTIONS and find the FILE_SUFFIX from there. If it's not defined, it's a really old Sphinx so assume .rst.html.
Use the same method of finding the anchor as the JavaScript search code.

With these two changes, we can stop downloading all those files and just download 2 per back-reference URL. Additionally, use Sphinx's built-in parser to cut down on the parsing code.
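The option parsing described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `parse_documentation_options` and `backref_suffix` are hypothetical, and it assumes the index page declares `var DOCUMENTATION_OPTIONS = {...};` with one `KEY: value` pair per line, where values are quoted strings, `true`/`false`, or integers.

```python
import re


def parse_documentation_options(html):
    """Hypothetical helper: extract DOCUMENTATION_OPTIONS from an index page.

    Assumes one `KEY: value` pair per line inside the braces.
    """
    match = re.search(r"var DOCUMENTATION_OPTIONS = \{(.*?)\};", html, re.DOTALL)
    if match is None:
        raise ValueError("DOCUMENTATION_OPTIONS not found")

    docopts = {}
    for line in match.group(1).splitlines():
        line = line.strip().rstrip(",")
        if not line:
            continue
        # Split on the first colon only, so values such as './' stay intact.
        key, _, value = line.partition(":")
        key = key.strip().strip("'\"")
        value = value.strip()
        if value in ("true", "false"):
            value = value == "true"
        elif value[:1] in ("'", '"'):
            value = value[1:-1]  # drop the surrounding quotes
        else:
            value = int(value)  # anything else is assumed to be an integer
        docopts[key] = value
    return docopts


def backref_suffix(docopts):
    """Fall back to '.rst.html' when FILE_SUFFIX is missing (very old Sphinx)."""
    return docopts.get("FILE_SUFFIX", ".rst.html")
```

A value that is neither quoted, boolean, nor an integer raises `ValueError` in this sketch; a real implementation would need to decide how to treat such entries.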

QuLogic · 2017-07-29T04:57:42Z

sphinx_gallery/docs_resolv.py

-        if full_name in self._searchindex['objects']:
-            value = self._searchindex['objects'][full_name]
-            if isinstance(value, dict):
-                value = value[next(iter(value.keys()))]


Note that this is completely wrong. For example, in plot_gallery_version, np.random.random is linked to np.random.lognormal. I believe this is also in part due to #273.

I just tested this PR, but in this case it does not even recognize np.random.random. I'll check again with #273.

Definitely should be because of #273.
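One plausible reading of the mis-link, sketched with a toy dictionary: if the lookup key is a module prefix whose value is a dict of names, taking the first key returns whichever object happens to come first. The layout of `objects` below and both helper names are assumptions for illustration, not the actual search index or the code in `docs_resolv.py`.

```python
# Toy stand-in for the "objects" part of a search index.
# Assumed layout: {prefix: {name: (doc index, type index, priority, anchor)}}
objects = {
    "numpy.random": {
        "lognormal": (0, 0, 1, ""),
        "random": (1, 0, 1, ""),
    },
}


def first_entry(objects, prefix):
    """Mimic the removed lines: take whichever entry comes first."""
    value = objects[prefix]
    if isinstance(value, dict):
        value = value[next(iter(value.keys()))]
    return value


def exact_entry(objects, full_name):
    """Split 'pkg.mod.name' into prefix and name, then look up that exact name."""
    prefix, _, name = full_name.rpartition(".")
    return objects.get(prefix, {}).get(name)
```

With this toy data, `first_entry(objects, "numpy.random")` returns the `lognormal` entry regardless of which name was requested, while `exact_entry(objects, "numpy.random.random")` returns the entry for `random`.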

Titan-C

This looks good. Thank you for doing this. I would love to have a test for the _get_link, not sure how to do it yet.

Titan-C · 2017-07-29T11:16:37Z

sphinx_gallery/docs_resolv.py

+        else:
+            value = int(value)
+
+        docopts[key] = value


Can you add a test for this function? Or at least move the parsing to a separate function and test that part.

Test added, but this parsing is a separate function already?

Titan-C · 2017-07-29T11:17:28Z

sphinx_gallery/docs_resolv.py

@@ -30,6 +30,9 @@

 from io import StringIO

+from sphinx.search import js_index
+from sphinx.util import jsdump


This is not used

Titan-C · 2017-07-29T12:03:54Z

sphinx_gallery/docs_resolv.py

+        else:
+            value = int(value)
+
+        docopts[key] = value


Can you provide a test for this function? Or at least move this parsing to an independent function and test that part.

+1 for tests

choldgraf · 2017-07-29T15:17:49Z

this sounds like a great addition, thanks @QuLogic ... I don't have a ton of experience w/ intersphinx code but I agree that it'd be good to test any new parsing functionality built in here.

choldgraf · 2017-08-01T03:19:45Z

just a ping that #273 is merged now!

Titan-C · 2017-08-01T10:50:52Z

This works well on my machine. I relaunched Circle CI, but it did not run the build on top of master to include #273, so I could not check the other versions.
+1 after a rebase, just to check.

larsoner · 2017-08-01T21:15:10Z

On my OSX machine, on current master it gets stuck on the embedding step. With this PR things at least don't hang, so +1 for merge from my end!

QuLogic · 2017-08-01T21:28:07Z

Rebased and np.random.random is now pointing to the correct function here.

Titan-C · 2017-08-02T10:40:55Z

Merged. Thanks @QuLogic

QuLogic mentioned this pull request Jul 29, 2017

Various documentation updates matplotlib/matplotlib#8955

Merged

1 task

QuLogic commented Jul 29, 2017

Titan-C requested changes Jul 29, 2017

QuLogic force-pushed the backref-opt branch from 0325d8a to aec089f Compare July 30, 2017 03:46

QuLogic mentioned this pull request Jul 31, 2017

Log what URLs fail to be resolved. #274

Closed

Titan-C approved these changes Aug 1, 2017

QuLogic added 6 commits August 1, 2017 17:18

Use Sphinx searchindex parser.

a725b77

This removes a lot of fragile manual parsing that is already maintained elsewhere.

Also download Sphinx options with search index.

5cc6afe

Determine backref file extension from options.

edba080

Remove unused options to SphinxDocLinkResolver.

1186bfb

There are only two callers and neither uses these options.

Find link anchor directly from searchindex.

0c73dcc

This uses the data in the search index exactly as the JavaScript code does, meaning there's no need to search through the HTML of the page to check for the anchor.
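A rough sketch of resolving an anchor from index data alone, following the rule the JavaScript search code is understood to apply: an empty anchor means the full object name, and `-` means the object type name joined to the full name. The toy index layout, the `docnames` key, the integer `objnames` keys, and the `resolve_link` helper are all assumptions for illustration, not the implementation in this commit.

```python
# Toy index; the real search index may differ in keys and layout.
toy_index = {
    "docnames": ["reference/generated/numpy.random.random", "reference/misc"],
    "objects": {
        "numpy.random": {
            "random": (0, 0, 1, ""),      # "" -> anchor is the full name
            "seed": (1, 0, 1, "-"),       # "-" -> anchor is "<type>-<full name>"
            "rand": (1, 0, 1, "custom"),  # anything else is used as given
        },
    },
    "objnames": {0: ("py", "function", "Python function")},
}


def resolve_link(index, full_name, base_url, file_suffix=".html"):
    """Hypothetical helper: build a URL with anchor without fetching the page."""
    prefix, _, name = full_name.rpartition(".")
    entry = index["objects"].get(prefix, {}).get(name)
    if entry is None:
        return None  # object is not documented in this index

    doc_idx, type_idx, _priority, anchor = entry
    if anchor == "":
        anchor = full_name
    elif anchor == "-":
        anchor = index["objnames"][type_idx][1] + "-" + full_name

    page = index["docnames"][doc_idx] + file_suffix
    return base_url.rstrip("/") + "/" + page + "#" + anchor
```

Because everything needed is already in the index, no per-object page download is required to confirm that the anchor exists.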

Add URL to HTTPError warning.

80b92d0

Otherwise, it's very vague what URL is failing.

QuLogic force-pushed the backref-opt branch from aec089f to 80b92d0 Compare August 1, 2017 21:18

Titan-C merged commit 525120e into sphinx-gallery:master Aug 2, 2017

QuLogic deleted the backref-opt branch August 2, 2017 19:10

Titan-C mentioned this pull request Aug 24, 2017

Caching the "Embedding documentation hyperlinks in examples" #286

Closed

GaelVaroquaux mentioned this pull request Aug 24, 2017

Crasher in doc_resolv, in js_index.loads #287

Closed
