Make Azure Pipelines optional on GitHub PRs #84018
Comments
The Azure Pipelines jobs have been reimplemented as GitHub actions which are better integrated with GitHub:
Azure Pipelines runs the same jobs, but it looks slower. It is a voting (required) check, so a PR cannot be merged until it completes. I propose to simply remove the job. I already proposed it on python-dev: In this thread: |
Deleting the files is not the right first step. First, it needs to be changed to a non-required check. Then, I can use the web UI to disable it from starting. *Then*, we can remove *some* of the files in the directory. Others are used for the official release, and have to stay. PR 18769 should *not* be merged. |
Ok, I closed it.
Who is allowed to do that? |
Brett did it the first time. I'm having too much trouble with GitHub right now to find the current admins. |
Yes, I can do it. And to answer Victor's question on the PR that he closed, we can make any individual status check required. So probably:
Just let me know when we are ready to merge a PR and I will switch off Azure Pipelines and make these checks required. |
There's no PR required. We need to change the required check so when I disable new builds from running it won't break existing PRs. As for actually changing the files in the repo, I don't see any hurry there. We can clean up a few of them, and maybe I can move the release build scripts into the PC folder (though would have to backport that through to 3.7). But I'm ready to disable the builds from running as soon as they're not a required check. |
Great! I saw a macOS failure this week which was caused by an internal CI error. I hope that it will be fixed soon. In the meantime, I would suggest not making the macOS job mandatory, but paying attention to its status manually ;-) Apart from macOS, the other jobs look reliable. |
Actually, I just realized we can't make these status checks required because they don't always run. :) Our Actions are smart enough to not run when they aren't necessary, i.e. doc changes don't run the rest of the checks. And so making the OS-specific tests required would block doc tests. Basically we would either have to waste runs on things that aren't really necessary but then be able to require runs, or we have to just rely on people paying attention to failures. I'm personally for the latter. |
Are you sure the required check isn't just for failures? Surely GitHub is smarter than requiring checks that it can tell aren't required (as it's their logic to include/exclude, not ours). If they're not, I now have many internal contacts there, so we can probably get it fixed :) |
It was an old issue that if required checks didn't run it would block, but hopefully it's fixed. :) I have gone ahead and removed the Azure Pipelines requirement from 3.7, 3.8, and master and flipped on the check requirements for the ones I listed. |
Looks like it isn't fixed... #18774 |
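The blocking behaviour described in the comments above can be modelled in a few lines. This is only a sketch of the merge gate as the thread describes it, not GitHub's actual implementation; the check names and the `can_merge` helper are hypothetical.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    PENDING = "pending"


def can_merge(required, reported):
    """Return True only if every required check has reported success.

    `required` is a set of check names; `reported` maps check names to Status.
    A required check that never ran is absent from `reported` and so blocks.
    """
    return all(reported.get(name) is Status.SUCCESS for name in required)


# Hypothetical check names, for illustration only.
required = {"docs", "tests-windows"}

# Docs-only PR: the path filter skipped the Windows tests, so nothing reported.
docs_only_pr = {"docs": Status.SUCCESS}

# Code PR: both checks ran and passed.
code_pr = {"docs": Status.SUCCESS, "tests-windows": Status.SUCCESS}

print(can_merge(required, docs_only_pr))
print(can_merge(required, code_pr))
```

Under this model the docs-only PR stays blocked even though nothing failed, which is the situation reported in #18774.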
Instead of not running the job, is it technically possible to modify the jobs to do nothing for docs only changes? .travis.yml works like that: before_install:
|
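The idea in the comment above (always start the job, but return early for documentation-only changes so that a status is still reported) might look like the following sketch. The path patterns and function names are assumptions made for illustration; they are not copied from `.travis.yml` or from any workflow file in the repository.

```python
import re

# Assumed patterns for "documentation-only" paths (hypothetical).
DOC_PATTERN = re.compile(r"(\.rst$)|(^Doc/)|(^Misc/)")


def is_docs_only(changed_files):
    """Return True if there are changes and every path matches DOC_PATTERN."""
    return bool(changed_files) and all(
        DOC_PATTERN.search(path) for path in changed_files
    )


def run_job(changed_files):
    """Always report a result, but skip the expensive work for docs-only changes."""
    if is_docs_only(changed_files):
        return "skipped tests: docs-only change"
    return "ran tests"


print(run_job(["Doc/library/os.rst"]))
print(run_job(["Doc/library/os.rst", "Lib/os.py"]))
```

Because the job itself always runs and finishes, a required check built this way would report a status for every PR instead of never reporting one.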
Adding a screenshot here so I can point people at it. Let's not rush into complicating the build steps yet - AP is basically fine. We should switch back the required checks (@brett?) If anything, let's add a "condition: false" to the macOS build to disable it and rely on the non-required GH check for now. |
I've turned off the required checks for GH Actions and flipped Azure Pipelines back on. And to answer Victor's question, yes, you can make things conditional at the workflow, job, and job step level. I don't know what would happen if the check was moved from workflow to job level. |
It would make the job definition significantly more complicated, and I don't want to do that just to work around an issue with github until we've got positive confirmation that the behaviour is intentional and won't change. |
I've disabled macOS builds on Pipelines, so now they're essentially advisory through the GitHub Actions build. I also pinged some contacts about the not-very-useful behaviour of required checks vs. path filters. So will see what they say. |
I cannot merge a PR until it completes. It re-runs jobs which are already run as GH Actions. There is another annoying issue with Azure Pipelines. When a job fails randomly for whatever reason, the job cannot be re-run, even if I log in to Microsoft Azure. Usually, the workaround is to close/reopen a PR to re-run all CIs. Except that for a backport PR created automatically by the miss-islington bot, when I close the PR, the bot removes its branch and so the PR cannot be reopened. Well, the second workaround is to ask the bot to create a new backport PR. That's what I did for PR 19276 of bpo-40121. It's annoying to have to use *two* workarounds. On the other hand, Travis CI is not currently required, and I don't understand why. Is it possible to make Travis CI required and make Azure Pipelines not required? |
Yes, but I don't want to do that as we have had equivalent flakiness issues with Travis, which is why it isn't required ATM. The only way to prevent flaky CI from blocking anything is to simply make no CI required and trust core devs not to merge unless they are certain they know why a CI run failed (although I don't know what that does to miss-islington). Past that is being extremely specific about what CI is considered stable enough to block on, and that would probably need to be down to the OS level on top of what is being tested. |
That's what everyone said when Travis was required and before it went flaky the last time. ;) The point is I don't want to keep flipping on and off required checks based on whatever CI people deem flaky or not at any one time. |
I created bpo-40188: "Azure Pipelines jobs failing randomly with: Unable to connect to azure.archive.ubuntu.com". |
Another Azure Pipeline failure on my #19769 PR, it looks like a random networking failure. Sadly, I had to close/reopen my PR since there is no button to restart only the failed job, or even to restart all Azure Pipeline jobs. This retriggers all CI jobs :-( The win64 job of Azure Pipelines PR fails to build Python because it failed to fetch bzip2: Fetching bzip2-1.0.6... |
Sadly (again), closing/reopening a PR re-runs all CIs. At the first run, the GH Action macOS job passed. At the second run, the "Tests" step of the GH Action macOS job failed, but its logs give me no clue: https://github.com/python/cpython/pull/19776/checks?check_run_id=627916923 2020-04-28T23:33:03.5559341Z ##[section]Starting: Request a runner to run this job On the web UI, I see that 6 steps completed, only the last "Tests" step failed. But can't I see the logs of the other steps? I would prefer to be able to merge a PR even when Azure Pipelines fails: make the job optional. Fortunately, the GH Action macOS job is optional and so I can merge my PR ;-) Note: I'm not sure if it's the right place to report the GH Action macOS failure, but it seems to be related to Azure Pipelines. |
Oops, I looked at two different PRs. In fact, the two CI failures are unrelated. |
Oh, I encountered the same trouble twice :( |
Another issue: I still see "Azure Pipelines PR Expected — Waiting for status to be reported" 15 min after I created my PR :-/ Technically, I created the PR and then pushed a second commit to the PR. The only option is to close/reopen the PR to re-trigger *all* CIs :-/ |
Best place to report workflow issues or to have discussions about it is https://github.com/python/core-workflow/. Otherwise there were so many posts I didn't find an explicit ask of what you wanted changed, Victor. |
I would like to make Azure Pipelines optional on GitHub PRs. I changed the issue title to make my request more explicit. |
Done. You will need to check that miss-islington doesn't solely rely on required checks passing but instead all CI checks passing, otherwise this just turned off gating for PRs when auto-merging. And I'm going to say future requests for this sort of stuff should happen either on the core-workflow issue tracker or on discuss.python.org for better visibility. |
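The difference between the two auto-merge policies mentioned above can be sketched as follows. How miss-islington actually decides to merge is not stated in this thread, so both functions are hypothetical, and the example assumes that no other check remains required once Azure Pipelines is made optional.

```python
def required_checks_pass(statuses, required):
    """Policy A: merge when every *required* check reports success."""
    return all(statuses.get(name) == "success" for name in required)


def all_checks_pass(statuses):
    """Policy B: merge only when every reported check is a success."""
    return bool(statuses) and all(
        state == "success" for state in statuses.values()
    )


# Hypothetical state of a backport PR after Azure Pipelines became optional.
statuses = {"Azure Pipelines PR": "failure", "Travis CI": "success"}
required = set()  # assumption: nothing is required any more

print(required_checks_pass(statuses, required))
print(all_checks_pass(statuses))
```

With an empty required set, policy A is vacuously satisfied and would merge despite the failing run, while policy B still holds the PR back. That is the gating concern raised in the comment.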
Thanks.
I have no idea how miss-islington checks CIs.
I'm used to reporting buildbot failures on bugs.python.org. Almost all issues are Python bugs, rather than issues specific to the buildbots themselves. I'm fine with reporting Azure Pipeline issues at core-workflow. I created python/core-workflow#365 " Make Travis CI (and Windows x64 ?) mandatory" :-) |
Bugs in Python should continue to be reported here. Requests to change the workflow should be discussed on one of the core-workflow groups (I think Discourse is the primary one now, right?). Once an action is agreed upon, it gets tracked on the core-workflow tracker. That's how we decided to turn Travis off and Azure Pipelines on in the first place. Let's just hope that Travis has stabilised compared to when we switched away from it, and maybe they have enough capacity now to handle our busy periods. |
Oh, and Victor, you should probably email python-dev to let everyone know you requested this change and it's been made. Otherwise people may be surprised that it changed without any discussion or notification. This is especially important if we have to disable all platforms other than Linux to avoid blocking PRs. |
Can't we be more flexible depending on the stability of CIs over the last weeks? I mean making a CI optional if it becomes flaky, but also trying to make a CI mandatory when it becomes stable. In my experience, no CI is reliable and the stability varies a lot over time. In the past, the macOS job was very reliable. I have no idea why it became so flaky, but I don't have the bandwidth to investigate. Moreover, it seems like some issues are internal to Azure Pipelines / GH Actions, and I don't have access to these. I'm trying to do the best with my limited time. |
FWIW, I took a quick look at it and, with nothing to go on in the way of visible messages, the best guess I could come up with is that the test run step is hitting a time out and that, in that case, no status is shown. Anyone know if that is a reasonable guess? The next question would be why are the tests taking that long on that macOS instance. |
Steve:
Ok.
I'm not sure of what you mean by "no discussion", this issue has many comments.
I would be more confident if we could make at least one Windows job mandatory. I have no opinion on msg363405, so I'm fine with Brett's choice ("we have to just rely on people paying attention to failures"). I don't know how to modify the Windows job to do nothing if it's a documentation change only. macOS was already non-voting (optional), no? |
I think it depends on the timeout. Some of my Ubuntu builds occasionally get hard-stuck on tkinter tests, so apparently it's possible for that to spoil CI. But I believe Pipelines is going to try and terminate the process "nicely" first. |
Let's say, no consensus. There were three votes cast in this discussion - yours (+1), mine (-1) and Brett's (I'll assume +0 because he made the change, despite saying he didn't want to ;) ). Meanwhile, *everyone* is impacted, some people very negatively. The rest of the dev team need to know that it was a deliberate change.
Yes, so would I :)
I can do it when I get time, but it's not very high on my list. I suggest looking at the Azure Pipelines definition, kind of like how I looked at the Travis definition to figure it out.
Only because you complained about it here :) That was PR 18818 |
I understood that such issues should be discussed in the Core Workflow category of Discourse, so I created: "Make one Windows CI job mandatory" I suggest continuing the discussion there. |
Me:
Steve:
Alright, I forgot about the whole history. Well, it's not my fault if macOS decided to fail :-) I did my part, I fixed os.getgrouplist() which started (!?) to fail on the macOS job of Azure (in fact, it was an old issue which wasn't noticed previously): https://bugs.python.org/issue40014 I'm not sure what to do with the macOS job which never starts or fails with empty logs. I don't see what we can do on the Python side. It *seems* to be more on the Azure side, which is a black box to me. Steve, maybe you could ask around at Microsoft? If you feel that you can do something to unblock the situation, please open an issue. Note: I would also prefer to have a voting macOS job, but it's not like I can fix the macOS job myself, so I let others handle this one ;-) |
No because I'm tired of flipping CI on and off as mandatory based on the whims of CI systems and their stability. Either people need to accept CI is flaky or everyone needs to be careful in how they merge PRs by checking failures are legit. And that's why I flipped off Azure Pipelines: I am not changing any more branch protections until a full discussion is had somewhere and there's consensus on what should be mandatory and stay mandatory for several months barring emergencies. |
I created a follow-up issue to again have a mandatory Windows pre-commit CI: bpo-40548. |
I wanted to wait until the situation was being clarified. I fixed the "documentation only" issue in GitHub Action workflow. I sent an email to python-committers rather than python-dev, core devs are the first concerned by workflow changes: https://mail.python.org/archives/list/[email protected]/thread/B6WVI254L7GEOCKUOHZ6XBZD4GCLAIBV/ Slowly, it seems like the situation is being resolved. |