betty-pre-cq flaky.
Issue description

VMTests in the betty PreCQ are overly flaky. Can we restructure VMTest retries to be more reliable, perhaps by retesting only the failures instead of the entire test suite?

Here is an example failure with no CLs: https://luci-milo.appspot.com/buildbot/chromiumos.tryserver/pre_cq/72859

The overall success stats for betty-pre-cq over the last month:
betty-pre-cq: 74% successes 4% timeouts 2573 builds.

By way of comparison with another default PreCQ builder:
samus-no-vmtest-pre-cq: 87% successes 3% timeouts 2328 builds.
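To make the "only retest failures" idea concrete, here is a minimal sketch. It is not how the VMTest stage is implemented; `run_with_retries`, the test names, and the pass/fail callables are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List


def run_with_retries(tests: Dict[str, Callable[[], bool]],
                     max_attempts: int = 2) -> List[str]:
    """Run every test once, then re-run only the tests that failed.

    Each value in `tests` is a hypothetical callable returning True on pass.
    Returns the names of tests that still fail after all attempts.
    """
    pending = list(tests)
    for _ in range(max_attempts):
        # Keep only the tests that failed on this attempt.
        pending = [name for name in pending if not tests[name]()]
        if not pending:
            break
    return pending


if __name__ == "__main__":
    attempts = {"flaky": 0}

    def stable() -> bool:
        return True

    def flaky() -> bool:
        # Fails on the first call, passes on the second.
        attempts["flaky"] += 1
        return attempts["flaky"] >= 2

    print(run_with_retries({"stable": stable, "flaky": flaky}))
```

With this shape a passing test runs exactly once, so a single flaky test costs one extra test run rather than a second pass over the whole suite.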
Dec 13 2017
Unless someone has a better way to improve betty stability, these should be duped together, and the data here should boost priority of the original bug. betty does appear to be the least stable of our PreCQ builders, though I would have to enhance "cros stats" slightly to prove that.
Dec 13 2017
I don't see these as duplicate. There is likely a legitimate fixable product bug that contributes to VMTest flake. That can be resolved independently from the vmtest retry semantics.
Dec 15 2017
There are two failures in the log above affecting two different CTS runs. Each of these runs has the same symptoms though:
1) Android comes up.
2) Basic connection via adb is established.
3) Android doesn't respond to tradefed.

Overall this looks like a product issue (apparently betty only?). I could try harder to restart/recover Android if this is common. But presumably we have issues that crept into betty.

---

Backing off though, a builder that not only builds but also tests will always show the worst success rate. After all, it needs to build and pass all tests.
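The last point can be put in rough numbers. The sketch below assumes, purely for illustration, that stages fail independently and that samus-no-vmtest-pre-cq (87%) is a fair stand-in for betty's non-VMTest stages; neither assumption is established in this thread. The function name is hypothetical.

```python
def implied_stage_pass_rate(overall_rate: float,
                            other_stages_rate: float) -> float:
    """Pass rate one stage needs so that independent stages multiply
    out to overall_rate."""
    if not 0 < other_stages_rate <= 1:
        raise ValueError("other_stages_rate must be in (0, 1]")
    return overall_rate / other_stages_rate


# Success rates quoted in the issue description.
BETTY_OVERALL = 0.74
NO_VMTEST_BASELINE = 0.87

vmtest_rate = implied_stage_pass_rate(BETTY_OVERALL, NO_VMTEST_BASELINE)
print(f"implied VMTest pass rate: {vmtest_rate:.2f}")
```

Under those assumptions the VMTest stage alone would be passing about 85% of the time, which is why a build-and-test builder trails a build-only one even when each stage looks reasonably healthy in isolation.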
Dec 15 2017
Dec 15 2017
betty-pre-cq seems pretty flaky recently, but betty-paladin doesn't look bad, even though afaict they're running the same tests (smoke suite). Could it be the difference between baremetal and GCE?
Dec 15 2017
That's a thought. I would expect performance differences between the two, which could affect timing sensitive tests. There shouldn't be many other differences, but could be a few, for example, network connections and behavior.
Dec 18 2017
I looked at a bunch of logs and vmtest can take a long time to start Android. I will relax timeouts for betty.
Dec 19 2017
I've created crbug.com/795976 to create a betty-tot-paladin builder.
Dec 19 2017
Looks like the Chrome/Android start times on my change's pre-cq/GCE run were 3*63s and 1*80s. No wonder it is hard to pass when the first timeout was 60s (120s on the second attempt, which often failed because the VM had problems rebooting?). This change increases the login timeouts: https://chromium-review.googlesource.com/#/c/chromiumos/third_party/autotest/+/833502/ But note that betty failure recovery (via reboot) is likely not functioning.
Dec 19 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/6ad5aba532a163eb5b5f34c4204e5a716797ce67

commit 6ad5aba532a163eb5b5f34c4204e5a716797ce67
Author: Ilja H. Friedel <ihf@chromium.org>
Date: Tue Dec 19 12:35:28 2017

tradefed_test: tune login timeouts.

We are interested in fairly tight login timeouts for the CQ to not wait too long in case of problems. But it appears that we need to be able to relax the login timeout for some boards like betty, which can run on slow GCE instances.
This change
- increases the regular Chrome login timeout from 60s to 90s.
- increases betty timeout from 60s to 300s.

BUG= chromium:794707
TEST=pre-cq will test.

Change-Id: Ifeea56cd609395a052ea7ee059de450a504b73b2
Reviewed-on: https://chromium-review.googlesource.com/833502
Commit-Ready: Ilja H. Friedel <ihf@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Po-Hsien Wang <pwang@chromium.org>

[modify] https://crrev.com/6ad5aba532a163eb5b5f34c4204e5a716797ce67/server/cros/tradefed_test.py
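The per-board timeout described in the commit message could look roughly like the sketch below. This is not the code from `tradefed_test.py`; the constant and function names are hypothetical, and only the 90s and 300s values come from the commit message.

```python
# Hypothetical names; only the numeric values come from the commit message.
DEFAULT_LOGIN_TIMEOUT_S = 90          # regular Chrome login timeout (was 60s)
BOARD_LOGIN_TIMEOUT_S = {
    "betty": 300,                     # betty can run on slow GCE instances
}


def login_timeout_for(board: str) -> int:
    """Return the login timeout in seconds for a board, falling back to
    the default when the board has no override."""
    return BOARD_LOGIN_TIMEOUT_S.get(board, DEFAULT_LOGIN_TIMEOUT_S)


if __name__ == "__main__":
    for board in ("betty", "samus"):
        print(board, login_timeout_for(board))
```

Keeping the overrides in one table leaves the default tight for the CQ while making each slow board's exception explicit.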
Comment 1 by akes...@chromium.org, Dec 13 2017