[BLD] CircleCI failed when a PR generates sphinx warnings #15355
Please use keywords in the commit message to perform a full doc build here
96cf750
to
c19c4e3
Compare
@lucyleeow and @glemaitre have just found an interesting case in #15333: see the corresponding _changed.html.
Is it easy to determine which examples are relevant to a modified page?
At a glance I would say yes, in principle grepping
@jnothman I'm not sure this is the most elegant way to do it but it works...
I think @adrinjalali would love the bash command in this PR.
Hello there, just to let you know that there are no sphinx warnings right now in master (wow!) :)
This looks pretty good to me. For the
And to be clear, this just tries to be a bit smarter at detecting new warnings. It'll still fail to detect a new warning introduced in an example if no example or documentation is touched but the estimator raises a new warning. Is that a correct statement?
Also, not having sphinx warnings on master has been a dream of mine lol. Awesome.
The problem is that I'm grepping in the
Totally, I'm only looking for warnings raised by the sphinx build, not by the example script execution.
Cool. Thanks @cmarmo , so happy you're taking care of these.
Thank you @cmarmo for working on this!
build_tools/circle/build_doc.sh
Outdated
do
if [ $af != "build_tools/circle/build_doc.sh" ]
then
page_figure=$(grep figure $af | grep auto_example | awk -F "/" '{print $NF}' | sed 's/sphx_glr_//' | awk -F "_" '{OFS="_";$NF=""; print $0}')
I think this is missing the examples root directory.
Please see comment regarding a chained bash script.
You are right: the example directory is not taken into account.
But the example directory is irrelevant to this line of code. As said by @adrinjalali, I'm looking for examples in modified .rst pages in order to build them even if they have not been modified.
It looks to me like examples in that directory are never included in the .rst pages because they are independent pages. Am I wrong?
Sorry... I didn't understand your comment, my answer is nonsense... you mean in the script path, right?
Well, I tested the build process and apparently the EXAMPLES_PATTERN argument accepts a regexp, so no need for the complete path...
@cmarmo's code is, I feel, more readable
Yes, I forgot that EXAMPLES_PATTERN is a regex, so the current solution works.
If @cmarmo's version is more readable, we should go with the current version. I would slightly prefer to only include .rst files in filenames to reduce the number of items in the loop.
build_tools/circle/build_doc.sh
Outdated
@@ -58,6 +58,25 @@ get_build_type() {
return
fi
changed_examples=$(echo "$filenames" | grep -E "^examples/(.*/)*plot_")
if [ -n "$filenames" ]
then
for af in ${filenames[@]}
For this section how do you feel about:
if [ -n "$filenames" ]; then
examples_in_rst=$(echo "$filenames" | \
grep -E "rst$" | \
xargs grep -shE "(figure|image)::" | \
grep auto_examples | \
awk -F "../auto_examples/" '{print $NF}' | \
sed 's/images\/sphx_glr_//' | \
sed -e 's/_[[:digit:]][[:digit:]][[:digit:]].png/.py/' | \
uniq | \
sed -e 's/^/examples\//')
changed_examples="${changed_examples}\n${examples_in_rst}"
fi
... well that's awesome... and clearly you had a lot of fun in building it! ;)
Is that a cryptic way to ask me to put your code in this PR? :)
Thanks for the [:digit:] thing: I looked for it for an entire afternoon without luck... clearly not the right keywords on google search...
From a bash newbie who always wonders what this black magic is: I would appreciate a couple of comments above the commands :)
@thomasjpfan maybe having two-line commands with comments would make it easier to understand in a couple of weeks what these things are doing, for people like me who are not good with bash.
WDYT?
--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/
build_tools/circle/build_doc.sh
Outdated
if [ $af != "build_tools/circle/build_doc.sh" ]
then
page_figure=$(grep figure $af | grep auto_example | awk -F "/" '{print $NF}' | sed 's/sphx_glr_//' | awk -F "_" '{OFS="_";$NF=""; print $0}')
page_image=$(grep image $af | grep auto_example | awk -F "/" '{print $NF}' | sed 's/sphx_glr_//' | awk -F "_" '{OFS="_";$NF=""; print $0}')
Should we update changed_examples as we find them in the loop?
@glemaitre Yes the chain can be broken down into a few logical steps:
if [ -n "$filenames" ]; then
# get rst files
rst_files="$(echo "$filenames" | grep -E "rst$")"
# get lines with figure or images
img_fig_lines="$(echo "$rst_files" | xargs grep -shE "(figure|image)::")"
# get only auto_examples
auto_example_files="$(echo "$img_fig_lines" | grep auto_examples | awk -F "../auto_examples/" '{print $NF}')"
# remove "images/sphx_glr_" from path and replace _\d\d\d.png with .py
image_paths="$(echo "$auto_example_files" | sed 's/images\/sphx_glr_//' | sed -e 's/_[[:digit:]][[:digit:]][[:digit:]].png/.py/')"
# get unique values
examples_in_rst="$(echo "$image_paths" | uniq)"
fi
I hope this is clearer @cmarmo. You can use some of these ideas if you wish in your solution.
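To make the steps above concrete, here is a minimal, self-contained sketch that runs the same kind of pipeline on a throwaway .rst file. The sample file, its contents, the variable name `script_paths`, the stricter regexes and the use of `sort -u` (instead of `uniq`, which only collapses adjacent duplicates) are assumptions for illustration, not what the PR ships.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sample page standing in for a modified .rst file.
tmpdir="$(mktemp -d)"
cat > "$tmpdir/clustering.rst" <<'EOF'
.. figure:: ../auto_examples/cluster/images/sphx_glr_plot_kmeans_digits_001.png
   :target: ../auto_examples/cluster/plot_kmeans_digits.html
.. image:: ../auto_examples/cluster/images/sphx_glr_plot_kmeans_digits_002.png
.. figure:: ../auto_examples/cluster/images/sphx_glr_plot_dbscan_001.png
EOF

# Stand-in for the list of files changed in the PR (one path per line).
filenames="$tmpdir/clustering.rst
build_tools/circle/build_doc.sh"

# keep only .rst files
rst_files="$(echo "$filenames" | grep -E "\.rst$")"
# keep only figure/image directive lines (-s: ignore unreadable files, -h: no file name prefix)
img_fig_lines="$(echo "$rst_files" | xargs grep -shE "(figure|image)::")"
# keep what follows ../auto_examples/
auto_example_files="$(echo "$img_fig_lines" | grep auto_examples | awk -F "[.][.]/auto_examples/" '{print $NF}')"
# drop "images/sphx_glr_" and turn the _NNN.png suffix into .py
script_paths="$(echo "$auto_example_files" | sed 's/images\/sphx_glr_//' | sed -e 's/_[[:digit:]][[:digit:]][[:digit:]]\.png$/.py/')"
# de-duplicate and prefix with examples/
examples_in_rst="$(echo "$script_paths" | sort -u | sed -e 's/^/examples\//')"

echo "$examples_in_rst"
rm -rf "$tmpdir"
```

The two images of the k-means example collapse into a single script path, which is the point of the de-duplication step.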
IMO, this is nicer for maintainability if you want non-bash wizards to understand what is going on and be able to modify it if needed.
I think it's fine to keep it as a chain as long as the meaning of each step is documented.
For instance I have no idea what
I'm back after a lot of mistakes and some more tests (see here)...
No worries. It's bash, there are almost infinite ways of doing almost anything lol. I was happy with that solution, and I'm happier with this solution. I'm not too picky when it comes to bash. As long as it does the job, I'm fine with it. Thanks @cmarmo
Thanks for the comments detailing the different steps in parsing the content of the changed rst files.
I am not very familiar with the Circle CI configuration but it seems that the status named:
ci/circleci: doc artifact — Link to 0/doc/_changed.html
does not appear in the github interface of the PR that has a failed warning check: cmarmo#5
I think the configuration should be updated to always have this link in the github UI whenever the _changed.html artifact is generated and uploaded.
Also here are a few more suggestions:
build_tools/circle/build_doc.sh
Outdated
auto_example_files="$(echo "$img_fig_lines" | grep auto_examples | awk -F "/" '{print $NF}')"

# remove "sphx_glr_" from path and accept replace _\d\d\d.png with .py
image_paths="$(echo "$auto_example_files" | sed 's/sphx_glr_//' | sed -e 's/_[[:digit:]][[:digit:]][[:digit:]].png/.py/')"
Those are no longer image paths but python source files for the examples that generated the images:
- image_paths="$(echo "$auto_example_files" | sed 's/sphx_glr_//' | sed -e 's/_[[:digit:]][[:digit:]][[:digit:]].png/.py/')"
+ examples_in_changed_rst="$(echo "$auto_example_files" | sed 's/sphx_glr_//' | sed -e 's/_[[:digit:]][[:digit:]][[:digit:]].png/.py/')"
and then:
# Filter duplicates:
examples_in_changed_rst="$(echo "$examples_in_changed_rst" | uniq)"
Fixed... I think
build_tools/circle/build_doc.sh
Outdated
then
# pattern for examples to run is the last line of output
echo BUILD: detected examples/ filename modified in $git_range: $pattern
echo $pattern
It seems to me that $pattern is undefined at this point.
Right. Sorry, I've rewritten this part of the script. Let me know.
build_tools/circle/build_doc.sh
Outdated
if [[ -n "$changed_examples" ]]
then
echo BUILD: detected examples/ filename modified in $git_range: $changed_examples
pattern=$(echo "$changed_examples" | paste -sd '|')
if [[ -n "$examples_in_rst" ]]
then
pattern=$(echo "$changed_examples" | paste -sd '|')"|"$examples_in_rst
else
pattern=$(echo "$changed_examples" | paste -sd '|')
fi
# pattern for examples to run is the last line of output
echo "$pattern"
echo BUILD: detected examples/ filename modified in $git_range: $pattern
echo $pattern
return
else
if [[ -n "$examples_in_rst" ]]
then
# pattern for examples to run is the last line of output
echo BUILD: detected examples/ filename modified in $git_range: $pattern
echo $pattern
return
fi
I think we should be able to find a way to simplify this. Maybe examples_in_changed_rst could be kept as a list of separate lines (without the final | paste -sd '|'):
examples_in_changed_rst="$(echo "$image_paths" | uniq)"
and then the concatenation in the pattern could be done as follows:
pattern=""
if [[ -n "$changed_examples" ]]
then
pattern="$(echo "$changed_examples" | paste -sd '|')"
fi
if [[ -n "$examples_in_changed_rst" ]]
then
if [[ -n "$pattern" ]]
then
pattern="$pattern|$(echo "$examples_in_changed_rst" | paste -sd '|')"
else
pattern="$(echo "$examples_in_changed_rst" | paste -sd '|')"
fi
fi
if [[ -n "$pattern" ]]
then
echo BUILD: detected examples/ filename modified in $git_range: $pattern
echo $pattern
fi
In retrospect I am not sure that this solution is that much simpler or more readable...
Rewritten
@@ -204,5 +238,12 @@ then
echo "$warnings" | sed 's/\/home\/circleci\/project\//<li>/g'
echo '</ul></body></html>'
) > 'doc/_build/html/stable/_changed.html'

if [ $check ]
I think you can do this without introducing the check variable:
if [ -n "$warnings" ]
then
echo "There are Sphinx Warnings in the documentation!"
echo "Please check doc/_build/html/stable/_changed.html"
exit 1
fi
I've tried, but I was using grep in order to do this and grep returns exit status 1 when it cannot find anything, so the script exited due to the set -e... I can dig deeper if it's reeeally necessary...
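The interaction described here can be reproduced in isolation. The sketch below is only an illustration: the helper name `extract_warnings` and the two sample logs are hypothetical, and `|| true` is one common way (not necessarily the one used in the PR) to stop a no-match `grep` from tripping `set -e`.

```shell
#!/usr/bin/env bash
set -e

# Hypothetical helper: print the WARNING lines of a build log, if any.
extract_warnings() {
    # grep exits with status 1 when no line matches; "|| true" keeps
    # "set -e" from aborting the script in that case.
    echo "$1" | grep "WARNING" || true
}

# Made-up log excerpts, for illustration only.
clean_log="building [html]: targets for 2 source files
build succeeded."
noisy_log="building [html]: targets for 2 source files
doc/modules/tree.rst:12: WARNING: undefined label"

clean_warnings="$(extract_warnings "$clean_log")"
noisy_warnings="$(extract_warnings "$noisy_log")"

for warnings in "$clean_warnings" "$noisy_warnings"; do
    if [ -n "$warnings" ]; then
        echo "Sphinx warnings found: $warnings"
    else
        echo "no warnings"
    fi
done
```

With the guard in place, a plain `[ -n "$warnings" ]` test is enough to decide whether to fail the build, without a separate flag variable.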
Replying to myself:
This is because there is a specific github app registered on the main scikit-learn repo to add this link: #14731. So apparently there is nothing more to do.
build_tools/circle/build_doc.sh
Outdated
auto_example_files="$(echo "$img_fig_lines" | grep auto_examples | awk -F "/" '{print $NF}')"

# remove "sphx_glr_" from path and accept replace _\d\d\d.png with .py
image_paths="$(echo "$auto_example_files" | sed 's/sphx_glr_//' | sed -e 's/_[[:digit:]][[:digit:]][[:digit:]].png/.py/')"
Looks like the thumb images are still here, we need to add another filter for py files:
echo "$auto_example_files" | sed 's/sphx_glr_//' | sed -e 's/_[[:digit:]][[:digit:]][[:digit:]].png/.py/' | grep -sh ".py$"
Could you please explain to me what are thumb images? Thanks ! :)
(I mean, you probably mean thumbnails somewhere, but I don't know examples of that)
Sphinx gallery generates files like IMAGE_NAME_001.png and IMAGE_NAME_thumb.png, where the thumbs are used for thumbnails.
I was running some of these commands and saw some appear here.
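A small sketch of why the extra filter helps, using made-up file names that follow the two naming patterns mentioned above. The variable names are hypothetical; the anchored `\.png$` and `\.py$` patterns are a slightly stricter variant of the ones suggested in the review.

```shell
#!/usr/bin/env bash
# Hypothetical image names as they might be extracted from an .rst page.
auto_example_files="sphx_glr_plot_dbscan_001.png
sphx_glr_plot_dbscan_thumb.png"

# The _NNN.png suffix becomes .py; the thumbnail keeps its .png suffix,
# so the final grep drops it.
script_names="$(echo "$auto_example_files" \
    | sed 's/sphx_glr_//' \
    | sed -e 's/_[[:digit:]][[:digit:]][[:digit:]]\.png$/.py/' \
    | grep '\.py$')"

echo "$script_names"
```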
And where are the _thumb.png files linked? Could they be linked anywhere in .rst files?
build_tools/circle/build_doc.sh
Outdated
image_paths="$(echo "$auto_example_files" | sed 's/sphx_glr_//' | sed -e 's/_[[:digit:]][[:digit:]][[:digit:]].png/.py/')"

# get unique values
examples_in_rst="$(echo "$image_paths" | uniq | paste -sd '|')"
Can we update the changed examples here with:
if [[ -n "$examples_in_rst" ]]
then
changed_examples="$changed_examples\n$examples_in_rst"
fi
The rest of the code from here can be left unchanged.
If you take the example of cmarmo#5, in my local bash, the code
changed_examples="$changed_examples\n$examples_in_rst"
outputs
+ changed_examples='examples/svm/plot_custom_kernel.py examples/svm/plot_iris_svc.py\nplot_cluster_comparison.py plot_kmeans_assumptions.py plot_kmeans_digits.py plot_mini_batch_kmeans.py plot_affinity_propagation.py plot_mean_shift.py plot_segmentation_toy.py plot_coin_segmentation.py plot_linkage_comparison.py plot_agglomerative_dendrogram.py plot_ward_structured_vs_unstructured.py plot_agglomerative_clustering.py plot_agglomerative_clustering_metrics.py plot_dbscan.py plot_optics.py plot_birch_vs_minibatchkmeans.py plot_adjusted_for_chance_measures.py plot_tree_regression.py plot_iris_dtc.py plot_tree_regression.py plot_tree_regression_multioutput.py plot_multioutput_face_completion.py'
The \n character is printed as is; it is not translated into an EOL... if this is different in your bash, this is a portability issue anyway.
I think the current version is ok from this point of view. WDYT?
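The behaviour reported here can be checked with a short sketch. The two example values are placeholders; the point is only that `\n` inside double quotes stays two literal characters in bash, whereas bash's `$'\n'` quoting yields a real newline. Whether `echo` later expands the backslash sequence depends on the shell, which is presumably the portability concern raised above.

```shell
#!/usr/bin/env bash
# Placeholder values for illustration.
changed_examples="examples/svm/plot_iris_svc.py"
examples_in_rst="plot_dbscan.py"

# "\n" inside double quotes stays a backslash followed by "n".
literal="$changed_examples\n$examples_in_rst"
# $'\n' (bash ANSI-C quoting) inserts a real newline.
joined="$changed_examples"$'\n'"$examples_in_rst"

# printf '%s\n' prints its argument verbatim, so counting lines shows the difference.
literal_lines="$(printf '%s\n' "$literal" | wc -l | tr -d ' ')"
joined_lines="$(printf '%s\n' "$joined" | wc -l | tr -d ' ')"
echo "literal: $literal_lines line(s), joined: $joined_lines line(s)"
```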
Is there any way to add $examples_in_rst into $changed_examples so we leave the rest of the code unchanged?
That's all my drama... :( .. could not find it.
@thomasjpfan you've got it! :)
It is a little strange that the paste statement is not:
paste -s -d '|' -
(with the -). I needed the - to work locally.
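For reference, a minimal sketch of the form with the explicit `-` operand, which tells `paste` to read standard input. To my understanding GNU `paste` falls back to standard input when no file is given while some BSD-derived versions insist on an operand, which would explain the difference seen locally; treat that as an assumption rather than a verified diagnosis. The two paths are placeholders.

```shell
#!/usr/bin/env bash
# Join one-path-per-line input into a single "|"-separated pattern.
# The trailing "-" names standard input explicitly.
pattern="$(printf '%s\n' "examples/a/plot_x.py" "examples/b/plot_y.py" | paste -s -d '|' -)"
echo "$pattern"
```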
scripts_names="$(echo "$auto_example_files" | sed 's/sphx_glr_//' | sed -e 's/_[[:digit:]][[:digit:]][[:digit:]].png/.py/')"

# get unique values
examples_in_rst="$(echo "$scripts_names" | uniq )"
What about:
if [[ -n "$examples_in_rst" ]]
then
changed_examples="$changed_examples|$examples_in_rst"
fi
And leave the rest unchanged. My goal is to reduce the number of if statements after this point.
Then if $changed_examples is empty, your regular expression for EXAMPLES_PATTERN will begin with |: this generates an error in every regexp checker I've tested (maybe you know why?) and always crashes my checks in cmarmo#5
I agree. Is our Makefile tolerant of an extra | when either changed_examples or examples_in_rst is empty?
Would this verbose way work?
if [[ -n "$examples_in_rst" ]]
then
if [[ -n "$changed_examples" ]]
then
changed_examples="$changed_examples|$examples_in_rst"
else
changed_examples="$examples_in_rst"
fi
fi
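If fewer `if` statements are the goal, the same "no leading or trailing |" rule can also be written with bash parameter expansion. This is only a possible alternative sketch, and the function name `join_patterns` is hypothetical; the nested-`if` version above is arguably easier to read for people less familiar with bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join two possibly empty patterns with "|",
# emitting the separator only when both sides are non-empty.
join_patterns() {
    local first="$1" second="$2"
    # ${first:+...} expands to "..." only if first is non-empty;
    # the inner ${second:+|} adds the separator only if second is non-empty too.
    printf '%s' "${first:+$first${second:+|}}$second"
}

join_patterns "examples/svm/plot_iris_svc.py" "plot_dbscan.py"; echo
join_patterns "" "plot_dbscan.py"; echo
join_patterns "examples/svm/plot_iris_svc.py" ""; echo
```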
Yes, I go for it.
Next time you write the code and I'll do the review? I feel like the entire process will be way more efficient... ;)
06b8119
to
2fef5e7
Compare
@cmarmo #15633 was to test an idea from @ogrisel: #15355 (comment)
I think this PR is pretty much ready to be merged!
@cmarmo My last comment is:
if [ "$warnings" != "/home/circleci/project/ no warnings" ]
then
echo "Sphinx generated warnings when building the documentation related to files modified in this PR."
echo "Please check doc/_build/html/stable/_changed.html"
exit 1
fi
as seen in https://github.com/scikit-learn/scikit-learn/pull/15633/files which removes the need for the
Given how much effort you have put into this issue, I would prefer to merge this PR.
Dear Thomas (@thomasjpfan) I would rather not.
Many things are trickier than we think at first. But maybe messy bash scripting is the wrong thing to do in this repo in any case: should we rather write this script in python to improve maintainability?
Well, as @thomasjpfan said, so much effort has been put into the bash approach that I would rather see #15633 merged. The code is pretty clear and new sphinx warnings are already popping up in master.
Reference Issues/PRs
Closes #6025
What does this implement/fix? Explain your changes.
This PR makes the CircleCI documentation build fail when modified files generate sphinx warnings.
Any other comments?
Note that the documentation is still rendered and visible in artifacts as before, but CircleCI exits with status 1. An example is visible here