New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 674692 link

Starred by 3 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocked on:
issue 739522

Blocking:
issue 725160



Sign in to add a comment

[WPT import] DCHECK enabled for CQ bots but not for Blink try bots.

Project Member Reported by qyears...@chromium.org, Dec 15 2016

Issue description

In the W3C test auto-import today (https://codereview.chromium.org/2579783002) the initial try jobs had a Timeout result when running imported/wpt/workers/opaque-origin.html; so, the Timeout expectation was added added to TestExpectations and the regular CQ was tried. On the win_chromium_rel_ng and linux_chromium_rel_ng builders, the result was a crash rather than timeout.

For both Win and Linux, the crash log started with something like:
STDOUT: CONSOLE ERROR: line 4: Uncaught NotSupportedError: Failed to construct 'BroadcastChannel': Can't create BroadcastChannel in an opaque origin
STDOUT: #CRASHED - renderer
STDERR: [1:12:1215/090125.390914:2985388043:FATAL:WorkerOrWorkletScriptController.cpp(297)] Check failed: !m_globalScope->shouldSanitizeScriptError(state.m_location->url(), NotSharableCrossOrigin). 

This is a DCHECK at https://chromium.googlesource.com/chromium/src/+/master/third_party/WebKit/Source/bindings/core/v8/WorkerOrWorkletScriptController.cpp#296.

Now, the Blink try bots have DCHECK disabled so that the layout test results are consistent with the continuous release builders (https://chromium.googlesource.com/chromium/src/+/master/tools/mb/mb_config.pyl#453)

But, the CQ try bots have "release_trybot" in their config names, meaning that DCHECK is enabled. This is good for other tests besides the layout tests, since it may help to catch some problems.

But, as a consequence, this means that sometimes the CQ bots ({linux,mac,win}_chromium_rel) will crash in some tests where the Blink try bots and waterfall would not have any crashes -- which means that what happens on the Blink try bots cannot reliably predict what will happen on the CQ.


Idea #1: For W3C auto-imports, maybe we don't want to run layout tests using the default try bots, and instead we want try again with the tryserver.blink try bots?

Idea #2: Maybe we actually never want to run layout tests with DCHECK turned on on the CQ, because this is inconsistent with the layout tests on the waterfall -- it's possible for someone to check in a test with a Crash expectation which passes the CQ, but doesn't crash (and thus fails) on the waterfall. Maybe we actually do want separate dedicated layout test try bots on tryserver.chromium.{android,linux,mac,win} (which would all have DCHECK disabled, and a subset of them would also be on the CQ)? And no separate tryserver.blink waterfall?

Dirk, what do you think? Does #2 sound like a good idea long-term? Does #1 sound like a reasonable solution in the shorter term?
 
idea #1 is out, you can't easily turn off the default bots.

Is this actually blocking the roll? Is there a way we can easily manually work around this?

If this only happens 1-2 times a month (or less) I would either manually work around it when it happens. So, that's what I'd probably do if we can for now.

If it happens frequently, we'd need to automate this. In this case, I'd probably modify the importer to *also* run the default jobs w/ dcheck enabled, and then have the importer merge the expectations from the two sets of builders.

We intentionally only run release+dcheck in the CQ, because we can't afford to run both release and debug tests, and release by itself won't catch debug failures (but dcheck catches most of them). We also don't want to have to run both release and release+dcheck, since that's double the work.

How does that sound?
So far, it happens once a month or less, and working around it without changing any try bots would be OK.

Manually working around it involves patching in the CL made by the autoroller, and adding a Crash expectation to get it passed the CQ. (Actually, does that sound correct? The new crash expectation applies only to builders with DCHECK enabled, not to normal Release builds.)

It's also possible to change the w3c-test-autoroller without changing any try bots: We'd want to also look at results from the CQ try bots, and if there are crash results there, then we'd add a Crash expectation and try again.


Anyway, hypothetically, do you think it would be a bad idea to do add separate layout test CQ builders that build Release without DCHECK and run just the layout tests? This would imply adding another 3-4 CQ builders which do a build of content_shell etc. and run layout tests for each CQ run. This seems a little more correct for layout tests, but maybe not worth it overall?
I think it's not worth it overall to add more builders, but we can maybe revisit once we have swarmed layout tests running well all the times it should be running and we can  see the impact on fleet capacity.

Comment 4 by foolip@chromium.org, Dec 16 2016

Does a dry run not use the same configuration as the CQ does? If it does, then maybe a dry run can be used to get the baselines? (And if not, why isn't everything on fire?)
A CQ dry run uses the same configuration as the normal CQ as far as I know -- both use Release builds with DCHECK on.

The reason why we can't use *only* the CQ bots *_chromium_rel_ng to get the baselines now is that they don't cover all supported platforms -- they only cover one version of mac, windows and linux, rather than all supported versions.

But once a dry run is done, we can indeed check the results of those runs and see which tests crashed on those bots; if a test crashed on (some of) the CQ bots but not on the other Blink try bots, then that will usually indicate that it crashed due to tripping a DCHECK.
Labels: TE-NeedsTriageHelp
Labels: -TE-NeedsTriageHelp
Status: Available (was: Unconfirmed)
Summary: [W3C auto-import] DCHECK enabled for CQ bots but not for Blink try bots. (was: W3C auto-import: DCHECK enabled for CQ bots but not for Blink try bots.)
Next action for this issue: Add functionality to the auto-import code (in webkitpy/w3c/deps_updater.py) to check the results of the CQ, and if there are Crash results, add a Crash expectation for the test, then try the CQ again.

As long as there aren't often new crashes in tests triggered by DCHECK, this is probably not very high priority.
Components: Blink>Infra>Predictability
Components: -Blink>Infra
A thought:

In general, and not just for import, there's no set of expectations that's really specific and correct that lets layout tests with DCHECK failures get through the CQ. The most specific and correct expectation would be:
  crbug.com/xxx [ Debug ] test [ Crash ]

But the configuration for CQ builders is still considered release, so this expectation wouldn't apply then. This means that to get through the CQ, we have to use the expectation:
  crbug.com/xxx test [ Crash Pass ]

But this doesn't explicitly say that the crash is related to DCHECK, this test will apply non-flaky on the continuous builders.

In most cases, for non-imported tests, the best way to solve this would be to never commit any tests that trigger DCHECK failures in the first place. But for imported tests we can't control this.

Anyway, my best idea now for how to handle this would still be to modify the auto-importer code (now in webkitpy/w3c/test_importer.py) to check the results of the CQ, and if there are Crash results, add a flaky crash expectation and try the CQ again. Another alternative would be to trigger *_chromium_rel_ng after first uploading the import CL, update expectations to include both the results there and on the try bots.
Blocking: 725160
Components: Blink>Infra>Ecosystem
Components: -Blink>Infra>Predictability
Summary: [WPT import] DCHECK enabled for CQ bots but not for Blink try bots. (was: [W3C auto-import] DCHECK enabled for CQ bots but not for Blink try bots.)
Cc: qyears...@chromium.org
 Issue 752406  has been merged into this issue.
Cc: robertma@chromium.org
robertma@, this is a P2 issue that hasn't seen any activity for 60 days. Should we demote it to P3 if it's not causing regular problems, or should it actually be fixed soonish?
It only causes problems every once in a while (every 3-4 weeks?), but when this happens the wpt-import bot remains red for quite a while until we either manually skip the offending tests or fix the code that's triggering the DCHECKs, which affects the Q4 OKR to reduce WPT import latency.

This last happened earlier this week: builds #2521 up to #2536 (Sept 30th to Oct 2nd) were red until https://chromium-review.googlesource.com/689357 fixed the underlying issue.
Blockedon: 739522
Labels: -Pri-2 Pri-3
Sorry this is actually my first time noticing this issue.

In the past three months, the importer was tripped by DCHECK twice I think, which is lower than the threshold (1 or 2 times a month, same as suggested by dpranke@) I'd like to prioritize automation. It's therefore safe to downgrade to P3.

Regarding the potential pragmatic fix discussed here (esp. comment 11), I think it's blocked on issue 739522 (rebaselining from CQ), which is also a P3.
Cc: raphael....@intel.com
 Issue 830490  has been merged into this issue.
It happened in  issue 830490  again last week, but wasn't all bad, the human involvement probably increased the speed at which the Blink bugs were fixed.

Sign in to add a comment