"Skia Gold received an unapproved image in post-submit" failures incorrectly reported as flaky #105915
I've marked this issue P1 because the linked issues (https://github.com/flutter/flutter/issues?q=is%3Aissue+issues%3A+105613+105608+105614) were all reported as P1.
Unfortunately there is no easy way for the flake bot to analyze raw error messages and distinguish between failure types when detecting flakes (it sees only pass or fail).
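One possible direction, sketched here as a minimal illustration: if the flake bot had access to failure logs, it could match them against a list of known non-flake error patterns before counting a retried-then-green build as a flake. All names and the pattern list below are hypothetical, not Cocoon's actual API.

```python
import re

# Hypothetical list of error patterns that indicate a legitimate red
# (e.g. an unapproved Skia Gold image) rather than a flake.
KNOWN_NON_FLAKE_PATTERNS = [
    re.compile(r"Skia Gold received an unapproved image in post-submit"),
]


def is_known_non_flake(failure_log: str) -> bool:
    """Return True if the failure log matches a known non-flake error."""
    return any(p.search(failure_log) for p in KNOWN_NON_FLAKE_PATTERNS)


def count_as_flake(failed_then_passed: bool, failure_log: str) -> bool:
    """A build that failed and then passed on retry counts as a flake
    only if its original failure is not a known non-flake error."""
    return failed_then_passed and not is_known_non_flake(failure_log)
```

With this kind of filter, an "unapproved image" failure that later passes after manual approval would no longer inflate the flake statistics, while genuine flakes would still be counted.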
We already do this. If the workflow is not followed, the tree turns red. That is why it is being accidentally marked flaky.
I am wondering if this should be caught in presubmit tests?
Please see #93515 for more context.
Thanks for the context @Piinks. I would repurpose this issue to support filtering customized test failure errors out of flake detection. Maybe a flag to skip
As this issue happened on only one commit, and is not blocking anything, decreasing to P4 and moving to technical debt.
Trying to understand the workflow. If people don't follow the correct way, this turns the post-submit CI (tree) red, and the tree is expected to stay red. However, our auto-retry will rerun the failed builds and try to green the tree as soon as possible; a successful retry then counts toward the flake statistics. Regarding the pre-submit check:
As there is no easy way for the flake bot to exclude such failures/flakes, I am looking for potential workarounds for these golden image errors. Based on the dashboard, this issue has happened on multiple commits in recent days and caused the tree to go red intermittently.
Sometimes golden file tests can be flaky, in that they do not produce the same image every time. I have only seen this happen on CanvasKit image tests, but @yjbanov is adding some fuzziness to the image test to reduce that flakiness. That is what happened in the case of 2 above. In the first case, the PR did introduce image changes, but the flutter-gold check can never go red in presubmit; doing so would break the engine auto-roller when it introduces image changes. That is why flutter-gold holds a pending state in presubmit until images are approved. There may not be a way to filter these tests out of flaky reports; they are not easy to distinguish. If the tree goes (correctly) red on an unapproved image, someone can go to https://flutter-gold.skia.org/, approve the image, and then it will pass on a retry.
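For readers unfamiliar with "fuzziness" in image tests: the idea is to tolerate small per-channel differences and a small fraction of differing bytes instead of requiring byte-exact equality. A minimal sketch follows; the thresholds and function name are assumptions for illustration, not the actual matcher used by flutter-gold or the web engine tests.

```python
def images_match(golden: list, candidate: list,
                 max_channel_delta: int = 2,
                 max_diff_ratio: float = 0.001) -> bool:
    """Fuzzy comparison of two flat RGBA byte buffers of equal length.

    A byte 'differs' only if it deviates from the golden by more than
    max_channel_delta; the images match if the fraction of differing
    bytes is at most max_diff_ratio.
    """
    if len(golden) != len(candidate) or not golden:
        return False
    differing = sum(
        1 for g, c in zip(golden, candidate) if abs(g - c) > max_channel_delta
    )
    return differing / len(golden) <= max_diff_ratio
```

A strict matcher is the special case `max_channel_delta=0, max_diff_ratio=0.0`; loosening either knob trades false reds (flakes) for the risk of missing tiny real regressions.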
I see. So it seems we choose to block the framework tree later on, instead of failing the engine roller? IIUC, when a golden image changes in an engine roll PR, it needs either manual approval in presubmit (which we are not doing), or manual approval in postsubmit when it turns the tree red (which is what we are doing now). Why do we not fail earlier? The latter blocks the entire development workflow, and both need manual intervention anyway. Did I miss anything? /cc @zanderso
Oh no, this is what we are doing. The engine sheriff approves images if the roll introduces new images.
Yeah, the engine roll shouldn't turn red, but should be held pending, waiting for manual approval. (I think the notification that a roll is in this state probably needs improvement. Sometimes the sheriff doesn't notice for several hours.)
Synced with @Piinks; the workflow does make sense. Now we have two issues on the gold server side which contribute to flakiness on our side.
Fuzzy matching is only used for the HTML renderer. CanvasKit uses strict matching, like the non-web version.
@godofredoc Any update?
I think so. I agree, though, that it's low priority right now. In an ideal world, our flaky bot detector could be a bit smarter about this scenario.
These three issues were automatically marked as P1 flakes: https://github.com/flutter/flutter/issues?q=is%3Aissue+issues%3A+105613+105608+105614. That is, in part, because of "unapproved image in post-submit" failures reported by Skia Gold. These Skia Gold failures should not contribute to the flaky test statistics; see for example #105613 (comment)