New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 714203 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner: ----
Closed: Sep 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug



Sign in to add a comment

linux_layout_tests_layout_ng is failing in "webkit_tests" step and not retrying results

Project Member Reported by qyears...@chromium.org, Apr 21 2017

Issue description

This new optional try bot linux_layout_tests_layout_ng (see  bug 706183 ) appears to not be behaving correctly (I believe).

See recent builds:
https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_layout_tests_layout_ng
Example build: https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_layout_tests_layout_ng/builds/42

For comparison, see this similar builder:

https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_layout_tests_slimming_paint_v2
Example build: https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_layout_tests_slimming_paint_v2/builds/3978

I think that it's supposed to have a "webkit_tests (without patch)", and I don't think that it should have a "webkit_tests" step.
 
Labels: Test-failure

Comment 2 by glebl@chromium.org, Apr 24 2017

Cc: glebl@chromium.org
This is weird, since some recent builds have passed:
https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_layout_tests_layout_ng

Something must be flaky, but I'm not sure what. There are no discernible infra differences between the recent passing and failing builds.
As far as I can see it seems to be doing the retry stuff?

Comment 5 by kojii@chromium.org, May 7 2017

Cc: kojii@chromium.org
qyearsley@, could we increase the threshold for tests to exit early on the linux_layout_tests_layout_ng?

Due to its nature of running different layout engine, most new tests fail/crash/timeout, and bot exiting tests early will cause "without patch" phase to fail. I've added more to FlagExpectations and that turned the bot back normal.

From CL https://chromium-review.googlesource.com/c/469254/ I'm guessing it's in
scripts/slave/recipes/chromium_trybot.expected/full_tryserver_chromium_linux_linux_layout_tests_layout_ng.json
      "--exit-after-n-failures",
      "5000",
      "--exit-after-n-crashes-or-timeouts",
      "100",

I don't know what are the good values. Last week we've got 1000-2000 new failures in a week, so I guess 5000 for both might be good.

Comment 6 by kojii@chromium.org, May 7 2017

Forgot to write why, here's my analysis. The example build:
> Example build: https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_layout_tests_layout_ng/builds/42

and such builds have "Testing exited early." in test results page:
https://storage.googleapis.com/chromium-layout-test-archives/linux_layout_tests_layout_ng/42/layout-test-results/results.html

It looks like we don't try "without patch" pass, probably because comparing results is meaningless if some tests didn't run?
We currently assume that if you have a large number of failures the problem is your patch.

The theory is that the tree is never so broken that it would be meaningful to diff the result. The with/without patch is only really designed to detect the case where there are a small number of failing tests currently committed.

Even if we wanted too to enable it for a large number of failures we can't currently. The "without patch" only runs tests which previously failed -- to do this it needs to pass them on the command line currently. The command line has a limited number of bytes which limits us to passing roughly 10-20 test names. With layout tests on swarming rolling out we currently rerun the *whole* test suite which could possibly allow meaningful diffs.

Comment 8 by kojii@chromium.org, May 8 2017

Thank you tansell, the theory makes sense for normal cases. But for NG bots, since many of new tests are likely to fail, we'd like to do something else.

Could you explain a bit more on what we do for swarming? "The whole test suites" mean it never exists early? If that's possible, that can work for us too.


Comment 9 by kojii@chromium.org, May 10 2017

tansell@, could you mind to explain a bit more on what swarming does? Does it mean never to early exit, by removing all early exit options? If that's possible, I think we want it for this bot too.

If not, I have a bit difficulty to understand when you said only 10-20 tests is possible, when we have 5000 failures and 100 crashes-or-timeouts.

If it's not possible to change it, well, it looks like last week was special, we've got only 100 or so new failures in this 3 days this week, so we can probably live with the current settings.

Comment 10 by kojii@chromium.org, May 12 2017

Quinten, do you happen to know what's possible, or what other bots do? If total # has the limitation, we're probably fine with something like:
       "--exit-after-n-failures",
      "3000",
      "--exit-after-n-crashes-or-timeouts",
      "2000",
but 100 crashes to exit early is a bit challenging for LayoutNG bot.
Project Member

Comment 11 by bugdroid1@chromium.org, May 14 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6e27c23b503ebc4cef57d104f40950aae9dc1bcc

commit 6e27c23b503ebc4cef57d104f40950aae9dc1bcc
Author: kojii <kojii@chromium.org>
Date: Sun May 14 08:25:28 2017

Re-land: Trigger linux_layout_tests_layout_ng for LayoutNG changes

This is a re-land of r463177, after 23k lines were added to:
  LayoutTests/FlagExpectations/enable-blink-features=LayoutNG

Change PRESUBMIT.py for third_party/WebKit to trigger the LayoutNG test
bot, linux_layout_tests_layout_ng, which runs all layout tests with NG.

R=thakis@chromium.org, eae@chromium.org, glebl@chromium.org
BUG= 706183 ,  714203 

Review-Url: https://codereview.chromium.org/2854783002
Cr-Commit-Position: refs/heads/master@{#471625}

[modify] https://crrev.com/6e27c23b503ebc4cef57d104f40950aae9dc1bcc/third_party/WebKit/PRESUBMIT.py

Labels: -Test-failure Test-Layout
Status: Started (was: Unconfirmed)
Sorry for the slow response here -- I think that adding a LayoutTests/FlagExpectations/enable-blink-features=LayoutNG file is the nicest solution here, and after that it looks like the number of failures relatively small, the retry without patch is being done, and (https://luci-milo.appspot.com/buildbot/tryserver.chromium.linux/linux_layout_tests_layout_ng/168)

Now, it looks like jobs are running with and without patch and the number of failures is relatively small:
https://luci-milo.appspot.com/buildbot/tryserver.chromium.linux/linux_layout_tests_layout_ng/168

Should this be considered fixed? Should we add any remaining expectations before marking this as fixed?

Comment 13 by kojii@chromium.org, May 15 2017

As above, it'd be great if you can increase "--exit-after-n-crashes-or-timeouts".

It's not special for someone to add one wpt directory and new 100 tests starts to crash/timeouts for ng bot. We then have to wait for someone to update the FlagExpectations.

If this is possible without too much efforts, it'd make our life a little easier. If it's hard for whatever reasons, I think we can live without this.

Comment 14 by kojii@chromium.org, May 16 2017

Someone added ~200 tests and the bot starts failing again.

I'm updating FlagExpectations, but considering increasing threshold is really appreciated.
Re #14: Currently, the threshold is set in once place (recipe_modules/chromium_tests/steps.py) for all bots.

If possible, I don't think we should have a builder-specific threshold, since that adds a little bit of complexity.

CL https://chromium-review.googlesource.com/c/506780/ would change the threshold for all builders.

Comment 16 by kojii@chromium.org, May 17 2017

Oh, I see, thank you Quinten for having a look, and I understand it's much larger work than I thought. And thank you for the CL.
Cc: dpranke@chromium.org
I don't think we should have bots that are routinely failing 100's of tests, that makes it too hard to distinguish new failures from old ones.

I'm not sure I understand the context for this, but it sounds like what is happening is that someone starts to import a new directory under wpt, and the normal bots run fine, but the layout_ng bots crash all over the place? 

If so, then maybe we should either (a) make sure we run layout_ng as part of the import tryjobs, so that we don't import tests that are totally broken in LayoutNG, or, if we don't want to block imports on LayoutNG yet, add blanket Skip expectations in the FlagExpectations file so that LayoutNG isn't affected by (most) new imports?

I don't think we should increase the crash/timeout number from 100 to 500 for all bots. Most builds should never get anywhere close to a 100 failures, so increasing that number has little to no upside for anything but the case described above, which as noted I think we should avoid by other means.

If the above options are not viable, then we should figure out how to make things work so that you can increase the timeouts for just one bot. This should be easy once we switch to swarming and can configure things via the JSON files, so maybe that's the thing to do next?

Comment 18 by kojii@chromium.org, May 23 2017

The purpose of layout_ng bot is to detect regressions for where we have worked on, and only enabled when code in core/layout/ng was changed. Large part of layout isn't even implemented yet, and that new tests to crash/fail is fine at this moment.

layout_ng bot should not block importing new tests. It is not the purpose of the bot. If a set of tests is important, we add virtual/layout_ng. layout_ng bot is like a virtual/layout_ng for all tests to measure our progress and protect unexpected regressions without forcing everyone in the team to run the whole tests twice. For the purpose of the layout_ng bot, maybe eae@ can speak better.

I'm ok for only layout_ng bot to have 500, or if it's too troublesome for you, I'll try to maintain FlagExpectations more often and team can live with while I'm off-work.
Cc: -tansell@chromium.org
Status: WontFix (was: Started)
I think we can close this. Swarming helped this very much, and the frequency of this happening has down to manageable level.

Sign in to add a comment