
Allow configuration of commit batch size in ci.yaml #130499


Closed

dnfield opened this issue Jul 13, 2023 · 18 comments
Labels
a: tests ("flutter test", flutter_test, or one of our tests)
team-infra (Owned by Infrastructure team)

Comments

@dnfield
Contributor

dnfield commented Jul 13, 2023

Context: Firebase Test Lab has physical devices, only some of which are highly available. If we run tests too frequently on a device that is not highly available, they will time out. This is not an issue for virtual devices.

#130497 appears to be the result of running a test on a busy day on a device that probably isn't highly available right now.

It would be great if we could tell Cocoon to run this test in batches of, say, 30 commits, so that it runs more or less once a day on a busy day (it's okay if it doesn't run at all on some days). Today the yaml file says that all targets run on every commit.
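
A purely hypothetical sketch of what such a knob could look like in .ci.yaml (the batch_size key does not exist today; it is the feature being requested here):

  targets:
    - name: Linux firebase_oriol33_abstract_method_smoke_test
      # Hypothetical: schedule this target at most once per 30 commits,
      # instead of on every commit.
      batch_size: 30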

@godofredoc @CaseyHillers for input

@reidbaker @gmackall @zanderso @jonahwilliams fyi

@dnfield dnfield added the a: tests and team-infra labels Jul 13, 2023
@keyonghan
Contributor

Cocoon already batch-schedules framework tasks with a batch size of 6; it is the back-filling logic that later fills in the tasks skipped within a batch.
One thing we can do is add a flag in .ci.yaml to skip backfilling for the target.
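
A sketch of that proposal, with the flag name backfill chosen for illustration (this is the flag being proposed here, not something .ci.yaml supports yet):

  targets:
    - name: Linux firebase_oriol33_abstract_method_smoke_test
      properties:
        # Proposed: don't back-fill the commits this target skipped
        # within a scheduling batch (the batch size is currently 6).
        backfill: "false"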

Question: if we can tolerate a test running at a batch size of 30 commits, or even not running at all on some days, is the test important enough to validate the tree?

@gmackall
Member

A context question about the particular test on the linked PR: I see that it was enabled in presubmit and therefore running in the checks on PRs. It was probably running a good bit more than 30 times a day then, right?

I ask because 1) I want to confirm that we do in fact run Firebase tests in presubmit (I didn't know this, if so), and 2) I don't have a good understanding of the ratio of presubmit runs to postsubmit runs.

@zanderso
Member

@keyonghan The request is to configure the batch size on a per-test basis. The tests are important, but for cases where FTL lacks capacity for specific devices, running more frequently will cause tests to time out while waiting for available devices, and close the tree.

@keyonghan
Contributor

keyonghan commented Jul 13, 2023

@keyonghan The request is to configure the batch size on a per-test basis. The tests are important, but for cases where FTL lacks capacity for specific devices, running more frequently will cause tests to time out while waiting for available devices, and close the tree.

This should be doable, but I am concerned about the case where a real tree-breaking change lands within the batch. Say a test has a batch size of 30 and the 2nd commit in the batch contains a breaking change; then we can only catch the breakage up to ~30 commits later.

Also, is there any other use case in addition to this FTL test? Does it make more sense to mark it as flaky (as Dan has done now) so that the staging pool validates it all the time? Though it may fail consistently there, it will not block the tree and will not miss validation of any breaking change. Marking a test as flaky or changing the batch size (if supported) each needs a PR, and each needs a revert PR when bot/device capacity is back.

@dnfield
Contributor Author

dnfield commented Jul 13, 2023

The basic problem is that we'd like to test on hardware that does not have enough availability to test on every commit.

We cannot run those tests on presubmit, as that will be too much.

We currently have two options:

  • Mark the test as flaky indefinitely and manually check if it's failing.
  • Mark the test as non-flaky and deal with flakes.

If we run the test less frequently, it should be less flaky (because we won't be overloading the availability of devices in FTL).
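
For reference, a minimal sketch of the first option as it looks in .ci.yaml today, assuming the existing bringup flag (which routes a target to the staging pool so it doesn't gate the tree):

  targets:
    - name: Linux firebase_oriol33_abstract_method_smoke_test
      # Marks the target as flaky: it runs in the staging pool and
      # does not block the tree.
      bringup: true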

@dnfield
Contributor Author

dnfield commented Jul 13, 2023

Ideally, FTL would give us an API to check whether a device is available, but that does not exist AFAIK.

@dnfield
Contributor Author

dnfield commented Jul 13, 2023

I started a thread in the internal Flutter/FTL group to see if we can figure out why this test took so long to time out and whether there's a better option the FTL team can give us too.

@keyonghan
Contributor

Mark the test as flaky indefinitely and manually check if it's failing.

The target is being validated in the staging pool (after being marked as flaky). See a passing build: https://ci.chromium.org/ui/p/flutter/builders/staging/Linux%20firebase_oriol33_abstract_method_smoke_test/484/overview

@keyonghan
Contributor

Another thing we want to do in the future is to disable the presubmit run with presubmit: false. While it was enabled, it was being validated in the try pool: https://github.com/flutter/flutter/pull/130497/files
That may be one reason for the high consumption of device capacity.
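
A minimal sketch of that change in .ci.yaml, assuming the per-target presubmit flag described above (other fields are illustrative):

  targets:
    - name: Linux firebase_oriol33_abstract_method_smoke_test
      # Skip the try pool on PRs; run in postsubmit only.
      presubmit: false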

@dnfield
Contributor Author

dnfield commented Jul 13, 2023

Ahh so maybe we should just mark the ones with lower availability as presubmit: false?

@keyonghan
Contributor

Ahh so maybe we should just mark the ones with lower availability as presubmit: false?

That should help. The number of presubmit runs is much larger than the number of postsubmit runs.

@reidbaker
Contributor

The basic problem is that we'd like to test on hardware that does not have enough availability to test on every commit.

We cannot run those tests on presubmit, as that will be too much.

We currently have two options:

  • Mark the test as flaky indefinitely and manually check if it's failing.
  • Mark the test as non-flaky and deal with flakes.

If we run the test less frequently, it should be less flaky (because we won't be overloading the availability of devices in FTL).

The issue here is that the test is not flaky; the infrastructure running the test is flaky, and we have a way to deal with that. We also have the option of extending the timeout.
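
A minimal sketch of the timeout option in .ci.yaml, assuming the per-target timeout field (in minutes; the value here is illustrative):

  targets:
    - name: Linux firebase_oriol33_abstract_method_smoke_test
      # Give the task more headroom to wait for a device to free up.
      timeout: 60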

@dnfield
Contributor Author

dnfield commented Jul 13, 2023

Right now, here's the list of device availability. The device we're using for this test has "Medium" availability; there is a high-availability API 33 device (panther) we should try instead (see the sketch after the table).

 gcloud firebase test android list-device-capacities
┌──────────────┬─────────────────────────────┬───────────────┬─────────────────┐
│   MODEL_ID   │          MODEL_NAME         │ OS_VERSION_ID │ DEVICE_CAPACITY │
├──────────────┼─────────────────────────────┼───────────────┼─────────────────┤
│ 1610         │ vivo 1610                   │ 23            │ Medium          │
│ F01L         │ F-01L                       │ 27            │ High            │
│ FRT          │ Nokia 1                     │ 27            │ High            │
│ G8142        │ G8142                       │ 25            │ Low             │
│ HWCOR        │ COR-L29                     │ 27            │ Medium          │
│ HWMHA        │ MHA-L29                     │ 24            │ Medium          │
│ SH-01L       │ SH-01L                      │ 28            │ High            │
│ TC77         │ TC77                        │ 27            │ Low             │
│ a10          │ SM-A105FN                   │ 29            │ High            │
│ a51          │ SM-A515U                    │ 31            │ Low             │
│ b0q          │ SM-S908U1                   │ 33            │ High            │
│ bluejay      │ Pixel 6a                    │ 32            │ Medium          │
│ blueline     │ Pixel 3                     │ 28            │ High            │
│ cactus       │ Redmi 6A                    │ 27            │ High            │
│ cheetah      │ Pixel 7 Pro                 │ 33            │ Medium          │
│ crownqlteue  │ SM-N960U1                   │ 29            │ Medium          │
│ dreamlte     │ SM-G950F                    │ 28            │ High            │
│ f2q          │ SM-F916U1                   │ 30            │ Medium          │
│ felix        │ Pixel Fold                  │ 33            │ High            │
│ felix_camera │ Pixel Fold (Camera-enabled) │ 33            │ Low             │
│ grandppltedx │ SM-G532G                    │ 23            │ High            │
│ griffin      │ XT1650                      │ 24            │ Low             │
│ gts3lltevzw  │ SM-T827V                    │ 28            │ Low             │
│ gts8uwifi    │ SM-X900                     │ 33            │ High            │
│ hammerhead   │ Nexus 5                     │ 23            │ Medium          │
│ harpia       │ Moto G Play                 │ 23            │ Medium          │
│ java         │ Motorola G20                │ 30            │ High            │
│ lv0          │ LG-AS110                    │ 23            │ High            │
│ oriole       │ Pixel 6                     │ 31            │ High            │
│ oriole       │ Pixel 6                     │ 32            │ Medium          │
│ oriole       │ Pixel 6                     │ 33            │ Medium          │
│ panther      │ Pixel 7                     │ 33            │ High            │
│ pettyl       │ moto e5 play                │ 27            │ Medium          │
│ q2q          │ SM-F926U1                   │ 30            │ Low             │
│ q2q          │ SM-F926U1                   │ 31            │ Low             │
│ r11          │ Google Pixel Watch          │ 30            │ Medium          │
│ redfin       │ Pixel 5                     │ 30            │ High            │
│ sailfish     │ Pixel                       │ 25            │ Medium          │
│ starqlteue   │ SM-G960U1                   │ 26            │ High            │
│ tangorpro    │ Pixel Tablet                │ 33            │ High            │
│ x1q          │ SM-G981U1                   │ 29            │ High            │
└──────────────┴─────────────────────────────┴───────────────┴─────────────────┘
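
A sketch of pointing the target at panther instead, reusing the target name from the staging build linked above; the physical_devices property name and its argument format are assumptions for illustration, not confirmed from the real .ci.yaml:

  targets:
    - name: Linux firebase_oriol33_abstract_method_smoke_test
      properties:
        # Device-selection args forwarded to `gcloud firebase test android run`;
        # panther (Pixel 7, API 33) is listed as High capacity above.
        physical_devices: >-
          ["--device", "model=panther,version=33"]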

@CaseyHillers
Contributor

Should we disable FTL on the release branches? RCs don't batch tasks.

@dnfield
Contributor Author

dnfield commented Jul 13, 2023

RCs also only run pretty infrequently, right?

@CaseyHillers
Contributor

RCs also only run pretty infrequently, right?

Based on Q2 data, there were about 2 runs per workday from RCs.

@godofredoc
Contributor

Based on the comments, it seems the solution was to mark the test as presubmit: false.

@github-actions

github-actions bot commented Sep 5, 2023

This thread has been automatically locked since there has not been any recent activity after it was closed. If you are still experiencing a similar issue, please open a new bug, including the output of flutter doctor -v and a minimal reproduction of the issue.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 5, 2023