UnitTest failures in pfq-informational are not causing the builder to fail
Issue description

Starting here: https://uberchromegw.corp.google.com/i/chromeos.chrome/builders/tricky-tot-chrome-pfq-informational/builds/339

We started to see this output in the unit tests:

WARNING: The following packages failed once or more, but succeeded upon retry. This might indicate incorrect dependencies.
chromeos-base/vpn-manager-0.0.1-r1197
chromeos-base/libchromeos-ui-0.0.1-r200
@@@STEP_WARNINGS@@@

However, the failures were not individual test failures; the test phase itself is failing mysteriously:

vpn-manager-0.0.1-r1197: * ERROR: chromeos-base/vpn-manager-0.0.1-r1197::chromiumos failed (test phase):
vpn-manager-0.0.1-r1197: * (no error message)

This started showing up in the PFQ with enough frequency to prevent any successful runs in the last several days. We need to get more information from this type of failure and cause the informational builder to fail so that we can catch these earlier.
,
Apr 11 2016
From what I can tell, chromite is working as intended -- it retries ebuilds and unit tests at most once per ebuild. Is the bug that you don't want these retries, or is it to find the root cause of the unit test flake?
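For readers unfamiliar with the behaviour being described, a rough sketch of "retry each package at most once and report what only passed on retry" could look like the following. This is an illustration only, not chromite's actual code: `run_with_one_retry` and the `run_test` callable are hypothetical names introduced here.

```python
from typing import Callable, Iterable, List, Tuple


def run_with_one_retry(
    packages: Iterable[str],
    run_test: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Run each package's tests, retrying a failed package at most once.

    `run_test` is a hypothetical callable returning True on success.
    Returns (flaky, failed): packages that passed only on the retry, and
    packages that failed both attempts.
    """
    flaky: List[str] = []
    failed: List[str] = []
    for pkg in packages:
        if run_test(pkg):
            continue  # Passed on the first attempt.
        if run_test(pkg):
            # Failed once, then succeeded: this is the case the
            # "failed once or more, but succeeded upon retry" warning covers.
            flaky.append(pkg)
        else:
            failed.append(pkg)
    return flaky, failed
```

Under this scheme a package in `flaky` only produces a warning, which is why the builder in this report stayed green.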
,
Apr 11 2016
+jdufault, apparently this is getting triggered due to this change: https://chromium-review.googlesource.com/#/c/335250/
,
Apr 11 2016
akeshet@ - could we skip the retry if the failure was not due to a particular test failing? That is, in this case all tests passed, but the test run itself failed because it did not clean up after itself properly (or for some other reason), and we detected that. When that happens, I think we would like the builder to fail.
,
Apr 11 2016
FYI, revert is at https://chromium-review.googlesource.com/#/c/338151.
,
Apr 11 2016
Failure to clean up properly is a test problem and should be flagged as such.
,
Apr 11 2016
So, to be clear, there are two separate issues here: a) The bug causing the unit tests to fail, reverted for now (comment #5). b) The fact that this failure did not cause the informational builder to fail, making it harder to identify. (a) is what revealed (b), but this issue should track (b). I'm not actually sure that we should repeat unit tests at all - those really, really shouldn't be flaky. However, if we do, we should only do so if an individual test fails. Also, since this is failing semi-consistently on the PFQ, are we not repeating the unit tests on the PFQ? Whatever we do, it should be consistent.
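The policy suggested in this comment (retry only when an individual test fails, otherwise fail the builder) could be sketched as below. This is a hypothetical illustration, not existing chromite code: `PhaseResult`, `Action`, and `decide` are names made up for this sketch, and it assumes the runner can tell whether any individual test failed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Action(Enum):
    PASS = "pass"
    RETRY = "retry"
    FAIL = "fail"


@dataclass
class PhaseResult:
    """Outcome of one test-phase run for a package (hypothetical shape)."""
    exit_ok: bool                                           # Did the phase exit cleanly?
    failed_tests: List[str] = field(default_factory=list)   # Individual test failures.


def decide(result: PhaseResult, retries_done: int, max_retries: int = 1) -> Action:
    """Choose what to do after a test-phase run."""
    if result.exit_ok:
        return Action.PASS
    if not result.failed_tests:
        # The phase failed although no individual test failed (for example,
        # the test did not clean up after itself). Do not retry; fail loudly.
        return Action.FAIL
    if retries_done < max_retries:
        return Action.RETRY
    return Action.FAIL
```

With this shape, the "(no error message)" failure from the issue description would map to `Action.FAIL` on the first attempt instead of being hidden by a successful retry.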
,
Apr 11 2016
Looks like we are retrying unit tests on the PFQ: https://uberchromegw.corp.google.com/i/chromeos/builders/tricky-chrome-pfq/builds/1743/steps/UnitTest/logs/stdio Perhaps the PFQ builder environment is such that the flake is more likely.
,
Apr 11 2016
It doesn't actually appear to be flaky on the continuous builder: it fails consistently the first time, but not the second time. It is not clear why it sometimes fails the second time on the PFQ.
,
Apr 11 2016
CPU load on the builder is probably a factor.
,
Apr 11 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform2/+/98e1516a40e126465e2cd7a10d65fdfb1610de70

commit 98e1516a40e126465e2cd7a10d65fdfb1610de70
Author: Jacob Dufault <jdufault@chromium.org>
Date: Mon Apr 11 18:15:09 2016

Revert "common-mk: Kill any auxiliary child processes after the child terminates."

The commit introduced some flaky failures, reverting to remove the flakiness.

This reverts commit fd24e9b9796336cb7506e17e8719b9395ec308bc.

BUG= chromium:602304

Change-Id: I03c24f86a8fb2fef804e2ae2c7aab11e61ec7f64
Reviewed-on: https://chromium-review.googlesource.com/338151
Commit-Ready: Jacob Dufault <jdufault@chromium.org>
Tested-by: Jacob Dufault <jdufault@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/98e1516a40e126465e2cd7a10d65fdfb1610de70/common-mk/platform2_test.py
,
Apr 11 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform2/+/a6fd3dab187237d0bb509abd0e28a95effd00466

commit a6fd3dab187237d0bb509abd0e28a95effd00466
Author: Jacob Dufault <jdufault@chromium.org>
Date: Mon Apr 11 18:15:09 2016

Revert "common-mk: Kill any auxiliary child processes after the child terminates."

The commit introduced some flaky failures, reverting to remove the flakiness.

This reverts commit fd24e9b9796336cb7506e17e8719b9395ec308bc.

BUG= chromium:602304

Change-Id: I03c24f86a8fb2fef804e2ae2c7aab11e61ec7f64
Reviewed-on: https://chromium-review.googlesource.com/338151
Commit-Ready: Jacob Dufault <jdufault@chromium.org>
Tested-by: Jacob Dufault <jdufault@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>
(cherry picked from commit 98e1516a40e126465e2cd7a10d65fdfb1610de70)
Reviewed-on: https://chromium-review.googlesource.com/338172
Reviewed-by: Ilja Friedel <ihf@chromium.org>
Tested-by: Ilja Friedel <ihf@chromium.org>

[modify] https://crrev.com/a6fd3dab187237d0bb509abd0e28a95effd00466/common-mk/platform2_test.py
,
May 10 2016
,
Nov 14 2016
Comment 1 by steve...@chromium.org
, Apr 11 2016