-
Notifications
You must be signed in to change notification settings - Fork 207
ENH: Parallel gallery generation #877
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5d2372d
to
9baa7d8
Compare
Thinking about this a bit more, I'd expect this not to work for (at least) the matplotlib, mayavi, and pyvista scrapers because these are all global-state based. And then there will be tricky interactions with |
I'll copy that over to the issue so more people can tell me how I'm wrong :) |
maybe the sphinx folks (@samdoran or @tk0miya) could give hints, or sklearn (@Titan-C @GaelVaroquaux) would be interested in speeding up their builds too ? |
It seems this handler is invoked on the bootstrap process of Sphinx (on builder-inited event). So there is no special support from Sphinx framework. |
indeed, we're looking at parallelizing the gallery jobs at the extension level |
@jschueller there has been some renewed interest in this so I'm taking a stab at actually making it work. I think it's close but I need to fix a test and maybe a couple of examples, we'll see! |
this is great! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to work! Thanks a lot
Do you plan a new release with this ?
I still need to figure out how to test this. And also decide if But yes I think once this is in and maybe @jschueller you and @lagru run some preliminary tests we could cut a release. I'll probably write in the docs that it's new/experimental, though, since I do expect there to be issues just because parallelization seems to always create them!
|
I already tested it on a real project (openturns) and it just works |
Okay pushed a doc update -- @lucyleeow feel free to review and merge if happy when you get a chance! |
Okay I realized that for MNE-Python we treat all warnings in doc builds/examples as errors, so I needed to prevent raising the MNE-PythonNo examples (noplot)This build takes ~8 min. All examples, parallel=1 (serial)Four slowest examples took:
the full doc build takes about an hour:
So ~52 minutes to build the examples. All examples, parallel=4My machine has 4 physical cores so I tried this next and saw a bit of CPU oversubscription by eye and looking at the four slowest examples:
but the overall build time was cut in half (hooray!):
And the example part of that is ~19min. So our speedup for the example-running portion is ~52min down to ~19min! A factor of ~2.7 which is pretty good I think! There is one fancy example that has a custom scraper in MNE that I don't think I could actually get to work properly in parallel based on what it does (creates HTML files, sets them to be copied later, etc.). I think we'll want something like a |
Sorry, I've missed the ping, I'll take a quick look today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I pushed a small commit to note version support drop in CHANGES.rst
I know you've noted more fixes to do, happy to look again later.
"sphinx": ("https://www.sphinx-doc.org/en/master", None), | ||
"pandas": ("https://pandas.pydata.org/pandas-docs/stable/", None), | ||
} | ||
intersphinx_mapping = get_intersphinx_mapping( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
f"{gallery_conf['show_memory']=} disabled due to " | ||
f"{gallery_conf['parallel']=}." | ||
) | ||
gallery_conf["show_memory"] = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question - how is the update to gallery_conf["show_memory"]
passed outside the function ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a dict,
gallery_conf` is mutable so the change does not need to be passed back, changing it here changes it everywhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Welp I returned gallery_conf unnecessarily in my configuration clean up PR then. Will fix this later
sphinx_gallery/gen_rst.py
Outdated
elif "passing" in out_vars: | ||
assert "stale" not in out_vars | ||
gallery_conf["passing_examples"].append(src_file) | ||
elif "stale" in out_vars: # non-executable files have none of these three |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are "these three"?
Would it be clearer to say that 'stale' examples are 'not re-executed'? or something similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing to at the top of the three conditionals:
# n.b. non-executable files have none of these three variables defined,
# so the last conditional must be "elif" not just "else"
if sys.platform in ("win32", "darwin"): | ||
sleep, timeout = (1, 2) | ||
else: | ||
sleep, timeout = (0.5, 1) | ||
proc = subprocess.Popen( | ||
[sys.executable, "-c", f"import time, sys; time.sleep({sleep}); sys.exit(0)"], | ||
close_fds=True, | ||
) | ||
memories = memory_usage(proc, interval=1e-3, timeout=timeout) | ||
proc.communicate(timeout=timeout) | ||
# On OSX sometimes the last entry can be None | ||
memories = [mem for mem in memories if mem is not None] + [0.0] | ||
memory_base = max(memories) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my info - why are we deleting? 😬
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It never really worked robustly. On macOS it almost always errored to open and measure the subprocess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay pushing a commit, let me know what you think @lucyleeow !
sphinx_gallery/gen_rst.py
Outdated
elif "passing" in out_vars: | ||
assert "stale" not in out_vars | ||
gallery_conf["passing_examples"].append(src_file) | ||
elif "stale" in out_vars: # non-executable files have none of these three |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing to at the top of the three conditionals:
# n.b. non-executable files have none of these three variables defined,
# so the last conditional must be "elif" not just "else"
if sys.platform in ("win32", "darwin"): | ||
sleep, timeout = (1, 2) | ||
else: | ||
sleep, timeout = (0.5, 1) | ||
proc = subprocess.Popen( | ||
[sys.executable, "-c", f"import time, sys; time.sleep({sleep}); sys.exit(0)"], | ||
close_fds=True, | ||
) | ||
memories = memory_usage(proc, interval=1e-3, timeout=timeout) | ||
proc.communicate(timeout=timeout) | ||
# On OSX sometimes the last entry can be None | ||
memories = [mem for mem in memories if mem is not None] + [0.0] | ||
memory_base = max(memories) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It never really worked robustly. On macOS it almost always errored to open and measure the subprocess
f"{gallery_conf['show_memory']=} disabled due to " | ||
f"{gallery_conf['parallel']=}." | ||
) | ||
gallery_conf["show_memory"] = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a dict,
gallery_conf` is mutable so the change does not need to be passed back, changing it here changes it everywhere
@lucyleeow okay to merge this now? |
Yes I'll merge. Sorry I totally missed that you said fixes can be in follow up PR!
Just for info, what |
Should we release or should we wait for maybe #1313 ? |
Dont wait! (kidding :) |
Oh and forgot to say:
Inside |
still any plans for a new release ? I see PR1313 and PR1344 are done |
@lucyleeow you up for making a release soon? If not then I can do it |
"stale" | ||
True if the example was stale. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc looks incorrect? On line 1315, the 'stale' key is set to a string of something.
PS, thanks for getting this done; it's going to speed up our CI a bunch now that we can enable parallel builds. |
It's probably joblib/joblib#883; it restarts the worker if it uses "too much" memory, which is a bit arbitrary. I also had to disable the warning-as-error to get Matplotlib to build; it might be a good idea to document that somewhere. |
I didn't realize other projects also had their own strict |
Closes #25