sanity HWTest on release builders is unreliable: causes test team / TPM manual work
Issue description

Builders regularly fail the sanity phase. When this happens, the test suites are not run: bvt-cq, bvt-inline, or the out-of-band suites. However, the test team has learned to ignore this failing sanity-phase signal; we just kick off the test suites ourselves and release the build anyway.

On the most recent stable build that we released, two boards failed sanity:
https://luci-milo.appspot.com/buildbot/chromeos_release/daisy_skate-release%20release-R63-10032.B/67
https://luci-milo.appspot.com/buildbot/chromeos_release/kefka-release%20release-R63-10032.B/67

On the most recent beta build that we released, 15+ boards failed sanity:
https://cros-goldeneye.corp.google.com/chromeos/console/monitorRelease?releaseName=M64-BETA-CHROMEOS-7

As is, the failing sanity phase just gives TPMs and the test team more work, because we kick off the suites ourselves. Can we kick off the suites anyway when sanity fails, or is there a better signal we can use that tells us "this build will not even boot if you tried it, so there is no point in kicking off the test suites"?
Jan 25 2018
I think the stated problem is important, but I don't think the stated solution is right. The sanity test is supposed to do exactly that: check whether the build will even boot. If sanity tests are failing too often, and hence getting ignored, the direction we want to go is to make them a real signal and keep that signal healthy. Towards this:
- cros-infra should create signals from metrics around release sanity test failures, preferably per-channel (see the sketch after this list).
- cros-infra should ensure that infra errors do not cause sanity failures too often. My spot checking of the failures above showed that they were all timeouts. Also, our failure mode in the case of an HWTest timeout is stupid: issue 730729
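A minimal sketch of the kind of per-channel signal suggested above, assuming a hypothetical list of build records with channel, sanity result, and failure reason fields (these field names are illustrative and do not come from the actual cros-infra metrics pipeline):

```python
from collections import defaultdict

# Hypothetical build record shape, e.g.:
# {"channel": "stable", "board": "kefka",
#  "sanity_passed": False, "failure_reason": "timeout"}

def sanity_failure_rates(builds):
    """Compute per-channel sanity failure rates and the share caused by timeouts."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    timeouts = defaultdict(int)
    for b in builds:
        channel = b["channel"]
        totals[channel] += 1
        if not b["sanity_passed"]:
            failures[channel] += 1
            if b.get("failure_reason") == "timeout":
                timeouts[channel] += 1
    return {
        channel: {
            "failure_rate": failures[channel] / totals[channel],
            "timeout_share": (timeouts[channel] / failures[channel]
                              if failures[channel] else 0.0),
        }
        for channel in totals
    }
```

A dashboard or alert built on numbers like these would make it visible when the sanity signal is dominated by infra timeouts rather than by genuine boot failures.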
Jan 25 2018
I don't think the M-66 label is agreed upon. Putting it into the OKR bucket, though.
Mar 23 2018
+Kalin: this would help with your OOB problem too
Mar 23 2018
Thanks David. My problem is similar but a little (or more than a little) different, because:
- I and a few other TEs have dedicated high-touch pools of lab hosts that are not part of the general lab pools.
- I want to run my suites on my pools not just on green builds, but also when the build is red and the bvt suites have run.
Can there be a solution where lab pools that are separate and dedicated to OOB suites are served by the scheduler even when the build is red (and the bvt suites have started)?
Comment 1 by jkop@chromium.org, Jan 25 2018
Status: Assigned (was: Untriaged)