Issue 909074

Issue metadata

Status: Fixed
Owner:
Closed: Dec 5
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Fuchsia
Pri: 1
Type: Bug-Regression

Blocked on:
issue 909936
issue 910029
issue 911160



fuchsia_x64 causing a massive spike in false rejects.

Project Member Reported by erikc...@chromium.org, Nov 28

Issue description

In the last week, there's been a ~10X spike in false rejects:
https://viceroy.corp.google.com/chrome_infra/Chromium/cq_slo_dash

According to go/top-cq-flakes, almost half of the flakes are due to INVALID_TEST_RESULTS, and almost all of those are happening on fuchsia_x64.

Example build with INVALID_TEST_RESULTS:
https://ci.chromium.org/b/8928759600134030912

Example error:
"""
2018-11-26 15:26:19,581:INFO:root:Connecting to Fuchsia using SSH.
2018-11-26 15:27:25,783:ERROR:root:Timeout limit reached.
2018-11-26 15:27:25,783:INFO:root:Shutting down QEMU.
Traceback (most recent call last):
  File "/b/s/w/ir/build/fuchsia/test_runner.py", line 117, in <module>
    sys.exit(main())
  File "/b/s/w/ir/build/fuchsia/test_runner.py", line 91, in main
    target.Start()
  File "/b/s/w/ir/build/fuchsia/qemu_target.py", line 160, in Start
    self._WaitUntilReady();
  File "/b/s/w/ir/build/fuchsia/target.py", line 158, in _WaitUntilReady
    raise FuchsiaTargetException('Couldn\'t connect using SSH.')
target.FuchsiaTargetException: Couldn't connect using SSH.
"""

Similarly, flakiness due to TEST_FAILURE is also mainly on fuchsia_x64.
Example build: https://ci.chromium.org/b/8928762758760847120
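
For readers unfamiliar with this failure mode: the traceback above ends in a wait-until-ready step that gives up once a time limit is reached. The snippet below is only a minimal sketch of that general pattern (poll a readiness probe until a deadline), not the actual code in build/fuchsia/target.py. The names `wait_until_ready`, `probe`, and `TargetNotReadyError`, and the 60-second default, are all assumptions made for illustration.

```python
import time


class TargetNotReadyError(Exception):
    """Hypothetical error, standing in for FuchsiaTargetException."""


def wait_until_ready(probe, timeout_secs=60.0, interval_secs=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True or the deadline passes.

    probe is any zero-argument callable (for example, one that attempts
    an SSH connection). Returns the number of attempts made. The clock
    and sleep parameters exist only so the loop can be tested without
    real waiting.
    """
    deadline = clock() + timeout_secs
    attempts = 0
    while True:
        attempts += 1
        if probe():
            return attempts
        if clock() >= deadline:
            raise TargetNotReadyError("Couldn't connect using SSH.")
        sleep(interval_secs)
```

A loop like this turns a guest that never accepts connections into a hard error after the time limit, which the CQ then reports as an infrastructure-style failure rather than a test failure.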

 
Attachments:
Screen Shot 2018-11-27 at 7.48.59 PM.png (440 KB)
Screen Shot 2018-11-27 at 7.52.24 PM.png (401 KB)
Cc: dpranke@chromium.org st...@chromium.org
Cc: fdegans@chromium.org
Components: Internals>PlatformIntegration
Labels: -Type-Bug -Pri-3 M-73 OS-Fuchsia Pri-1 Type-Bug-Regression
Owner: w...@chromium.org
Status: Started (was: Untriaged)
Auto-assigning, as today's Gardener.  Thanks for the data, Erik.

tl;dr: A Fuchsia SDK roll about a week ago introduced several sources of flakiness, and it coincided with us losing buildbot failure notifications. The roll landed only because the CQ retried the flaked bots.

We've rolled back to an older SDK (see issue 907804 and issue 908125) and are expecting upstream fixes for both that and issue 908895 shortly.
Issue 908404 has been merged into this issue.
Labels: Infra-Platform-Test
Blockedon: 911160 909936 910029
Issue 909936 tracks SSH failing to even connect to the Fuchsia/QEMU guests (the INVALID_TEST_RESULTS issue).
Issue 910029 tracks flakiness due to a race condition between ScopedTempDir and SimpleBackendImpl. It's a cross-platform issue, but it likely only affected Fuchsia because we were running without TestLauncher retries until issue 911160 was resolved.
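
To illustrate why missing retries would expose a flake that other platforms hide, here is a rough back-of-the-envelope sketch. It assumes each attempt fails independently with the same probability, which real flakes may not satisfy. The function name and the 10% flake rate are made up for illustration, not measured values.

```python
def pass_probability(flake_rate, retries):
    """Chance that a flaky test passes in at least one of 1 + retries attempts.

    Assumes each attempt fails independently with probability flake_rate.
    """
    return 1.0 - flake_rate ** (retries + 1)


# Illustrative numbers only: a test that flakes 10% of the time.
no_retries = pass_probability(0.10, retries=0)    # about 0.9
with_retries = pass_probability(0.10, retries=2)  # about 0.999
```

Under these assumptions, two retries cut the visible failure rate from roughly 1 in 10 to roughly 1 in 1000, so the same race can go unnoticed on platforms that retry.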
Status: Fixed (was: Started)
Marking this as Fixed: Android & Windows have been back to dominating false rejects since November 30th.