
Issue 911160

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 4
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Fuchsia
Pri: 1
Type: Bug

Blocking:
issue 909074




Set bot-mode command-line or environment option on Fuchsia bots

Project Member Reported by erikc...@chromium.org, Dec 3

Issue description

content_unittests is flaky on Fuchsia right now:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/fuchsia_x64/162201
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/fuchsia_x64/162204

When tests flake, they are not retried. This causes the CQ to 'retry without patch' and 'retry with patch'. If the latter fails, the whole build gets retried. Adding a retry at the test-suite layer would significantly cut cycle time for this flaky test suite.

 
Status: WontFix (was: Assigned)
This issue is not actually a test-suite flake, but a platform/OS flake triggered by the size of this particular suite.

We absolutely should _not_ add any further retries to patch over this: the SDK roll that pulled in the flakiness only landed because of the existing retries in our CQ!

We're working with the Fuchsia folks to get a fix or workaround ASAP, and will disable this suite in the meantime to stabilize the CQ.
wez: Let's have a 1:1 to discuss further. 
FYI content_unittests is now disabled on waterfall+CQ pending upstream fixes. :)
Status: Assigned (was: WontFix)
Summary: Explicitly specify retry-on-failure behavior for 'with patch' steps. (was: Add test-suite layer retries to content_unittests on fuchsia.)
According to wez@, content_unittests is supposed to retry failing tests 3 times. It doesn't do so because we set a filter file, which disables the default retries:
--test-launcher-filter-file=../../testing/buildbot/filters/fuchsia.content_unittests.filter

We should explicitly set the retry-on-failure behavior for 'with patch' to ensure that we have the desired behavior.
Components: Tests>Flaky
Labels: -Pri-3 M-73 OS-Fuchsia Pri-1
Owner: jbudorick@chromium.org
Summary: Set (was: Explicitly specify retry-on-failure behavior for 'with patch' steps.)
Default retry-limit logic is here: https://cs.chromium.org/chromium/src/base/test/launcher/test_launcher.cc?l=989

and the filter file is applied here:
https://cs.chromium.org/chromium/src/base/test/launcher/test_launcher.cc?l=1063

so I think a filter or filter file no longer disables the default retries.

Looking at the actual test-bot output, I see that it sets:
    CHROME_DEVEL_SANDBOX=/opt/chromium/chrome_sandbox
    CHROME_HEADLESS=1
    LANG=en_US.UTF-8
which is incorrect: bot-mode is now indicated either by the CHROMIUM_TEST_LAUNCHER_BOT_MODE environment variable or by the --test-launcher-bot-mode command-line switch.

So the reason we're not getting the default retry limit is that the TestLauncher doesn't think it is running on a bot, because our bot is misconfigured.
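
For anyone following along, here is a rough Python model of the check described above. The real logic is C++ in test_launcher.cc and is not reproduced here. The function names are hypothetical, the limit of 3 comes from the earlier comment in this thread, and both "0 retries outside bot-mode" and "presence of the variable is enough" are assumptions made for illustration.

```python
# Hypothetical model of the bot-mode check; not the actual TestLauncher code.
BOT_MODE_ENV_VAR = "CHROMIUM_TEST_LAUNCHER_BOT_MODE"
BOT_MODE_SWITCH = "--test-launcher-bot-mode"


def bot_mode_enabled(argv, environ):
    """Return True if either the switch or the environment variable is set."""
    # Assumption: the mere presence of the variable counts, whatever its value.
    return BOT_MODE_SWITCH in argv or BOT_MODE_ENV_VAR in environ


def default_retry_limit(argv, environ, bot_retry_limit=3):
    """Return the default retry limit: bot_retry_limit on bots, else 0 (assumed)."""
    return bot_retry_limit if bot_mode_enabled(argv, environ) else 0


if __name__ == "__main__":
    # The environment the Fuchsia bot actually sets, per the output quoted above.
    fuchsia_bot_env = {
        "CHROME_DEVEL_SANDBOX": "/opt/chromium/chrome_sandbox",
        "CHROME_HEADLESS": "1",
        "LANG": "en_US.UTF-8",
    }
    argv = ["content_unittests"]
    print(default_retry_limit(argv, fuchsia_bot_env))              # 0
    print(default_retry_limit(argv + [BOT_MODE_SWITCH], fuchsia_bot_env))  # 3
```

Under this model, none of the three variables the bot sets is the one that is checked, so the launcher falls back to the non-bot default.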

jbudorick, can you please take a look?
Summary: Set bot-mode command-line or environment option on Fuchsia bots (was: Set )
Cc: erikc...@chromium.org
Components: Internals>PlatformIntegration
Perhaps an interesting data point in discussion around retries: It seems that the Fuchsia bots have _never_ had the default retry-limit enabled, because of this configuration issue. :O
Owner: w...@chromium.org
Status: Started (was: Assigned)
AFAICT other bots get --test-launcher-bot-mode applied by the mb.py script, whereas Fuchsia is special-cased at https://cs.chromium.org/chromium/src/tools/mb/mb.py?l=1111

We just need to add the command-line option there, and fix our runner scripts to respect it.
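
A minimal sketch of what "respecting" the option in a runner script could look like, assuming an argparse-based wrapper. The function name `build_child_args` is hypothetical and this is not the code in build/fuchsia/test_runner.py. Only the --test-launcher-bot-mode switch itself comes from this thread.

```python
import argparse


def build_child_args(argv):
    """Parse the wrapper's arguments and return those to forward to the test binary."""
    parser = argparse.ArgumentParser()
    # Recognise the switch explicitly so the wrapper can act on it if needed.
    parser.add_argument("--test-launcher-bot-mode", action="store_true",
                        default=False)
    # Anything the wrapper does not recognise is kept and forwarded untouched.
    args, passthrough = parser.parse_known_args(argv)

    child_args = list(passthrough)
    if args.test_launcher_bot_mode:
        # Pass the switch through so the test launcher sees it.
        child_args.append("--test-launcher-bot-mode")
    return child_args


if __name__ == "__main__":
    print(build_child_args(["--test-launcher-bot-mode", "--gtest_filter=Foo.*"]))
```

The point of the sketch is only that a wrapper which parses its own flags has to forward this one explicitly. Otherwise the switch is consumed by the wrapper and never reaches the launcher.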
Project Member

Comment 9 by bugdroid1@chromium.org, Dec 4

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9d5c0b51ab70bf829330938ad7f62f5a88f27bb5

commit 9d5c0b51ab70bf829330938ad7f62f5a88f27bb5
Author: Wez <wez@chromium.org>
Date: Tue Dec 04 00:53:44 2018

[Fuchsia] Enable TestLauncher bot-mode for all test-suite runs.

Add pass-through for --test-launcher-bot-mode and set it via mb.py for
all non-script steps that we run on Fuchsia.  The main difference this
makes is to enable the TestLauncher's default bot-mode retry-limit.

Bug: 911160
Change-Id: I036c0d466c37dfdcd64c7faebdfc7d25ee6a00a5
Reviewed-on: https://chromium-review.googlesource.com/c/1358636
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Kevin Marshall <kmarshall@chromium.org>
Reviewed-by: Erik Chen <erikchen@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#613360}
[modify] https://crrev.com/9d5c0b51ab70bf829330938ad7f62f5a88f27bb5/build/fuchsia/test_runner.py
[modify] https://crrev.com/9d5c0b51ab70bf829330938ad7f62f5a88f27bb5/tools/mb/mb.py

Status: Fixed (was: Started)
Labels: Infra-Platform-Test
Blocking: 909074
