New issue
Advanced search Search tips

Issue 794424 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug



Sign in to add a comment

PrePaintTreeWalk/PaintPropertyTreeBuilder code change may pass CQ bots but break release-without-DCHECK builds

Project Member Reported by vmi...@chromium.org, Dec 13 2017

Issue description

third_party/WebKit/Source/core/paint/* changes don't trigger Blink release-without-DCHECK try jobs.  Do we have enough try bot bandwidth to add a trigger?

It would help avoid build breaks e.g.

change: third_party/WebKit/Source/core/paint/* changes don't trigger Blink try jobs.
errors missed in cq:  https://ci.chromium.org/buildbot/chromium.webkit/WebKit%20Linux%20Trusty/37183
 
Owner: wangxianzhu@chromium.org
Status: Assigned (was: Untriaged)
Summary: PrePaintTreeWalk/PaintPropertyTreeBuilder code change may pass CQ bots but break release-without-DCHECK builds (was: third_party/WebKit/Source/core/paint/* changes don't trigger Blink try jobs.)
The CL did trigger blink try bots and pass all of them. However, the change happens to crash release-without-DCHECK builds only. With DCHECK we force some special code path causes the crash situation doesn't happen. We need to examine the code to avoid such situation.

Comment 2 by vmi...@chromium.org, Dec 13 2017

Yeah.  It should maybe trigger master.tryserver.blink:linux_trusty_blink_rel.

Interestingly, some changes outside of Blink paint do trigger that, for example https://chromium-review.googlesource.com/c/chromium/src/+/778242.

For the CL you mentioned, the Cq-Include-Trybots line was added by this presubmit rule: https://cs.chromium.org/chromium/src/cc/PRESUBMIT.py?rcl=3a4ad4f8be59183ec01c7a8d5167c5e441dc5469&l=349. I think we can do the same for particular paint code which contains special DCHECK code paths which should be covered by a non-DCHECK blink bot.

Comment 4 by vmi...@chromium.org, Dec 13 2017

We add "master.tryserver.blink:linux_trusty_blink_rel" to try bots for CC and V8.  Let's consider adding for Paint, or Blink in general?
Cc: qyears...@chromium.org
+qyearsley

I'm not sure if linux_trusty_blink_rel has enough capacity to support all blink CLs. Triggering a bot for all blink CLs might need the generic CQ technique instead of a presubmit rule. I think it will be beneficial to including a non-DCHECK blink bot in CQ.  

Nevertheless the bot should has already enough capacity for paint CLs. We already use include linux_layout_tests_slimming_paint_v2 which seems to have similar (maybe even smaller) capacity, for paint CLs. Will try it.
Some data of linux_trusty_blink_rel (based on git log -- there may be better data source):
- It has 7 bots;
- Its current load is 10 builds per day on average in a 1 to 26 range.

With blink paint CLs added, it's load will be 15 builds per day on average in a 2 to 40 range (based on the current load data of linux_layout_tests_slimming_paint_v2 which is currently automatically added in CQ for blink paint CLs). linux_layout_tests_slimming_paint_v2 has 4 bots.

qyearsley@ do you think the load is proper for the current number of bots?
The number of slaves determines how many builds can happen at once - so 15-40 builds per day should be fine, as long as a max of 7 at any one time is OK.

The other factor that affects capacity is the swarming pool that's used for layout tests and the number of shards. linux_trusty_blink_rel uses the config of https://cs.chromium.org/chromium/src/testing/buildbot/chromium.webkit.json, which specifies 6 shards from Ubuntu-14.04 swarming bots. I believe there's no bottleneck there since there are a lot of bots with that OS (https://chromium-swarm.appspot.com/botlist?c=id&c=task&c=status&f=os%3AUbuntu-14.04).

So, I do think the load is proper for the number of bots.
Thanks qyearsley@ for the confirmation.

A related question: can we included a non-DCHECK testing bot in CQ for all CLs?
Cc: dpranke@chromium.org
I believe that we don't include both non-DCHECK + DCHECK bots in CQ due to capacity limits - adding CQ trybots requires a lot more capacity than adding optional trybots.

+dpranke who would know whether this is the case.
Description: Show this description
Project Member

Comment 11 by bugdroid1@chromium.org, Dec 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/fb51b1885ed5d11ff92e055ea0d837f7926b9d00

commit fb51b1885ed5d11ff92e055ea0d837f7926b9d00
Author: Xianzhu Wang <wangxianzhu@chromium.org>
Date: Wed Dec 13 18:24:02 2017

Include linux_trusty_blink_rel in CQ for paint changes

Some paint code has special DCHECK paths. Some CL may pass with DCHECK
but fail without DCHECK. We need a CQ bot to ensure the non-DCHECK path
is not broken.

Bug:  794424 
Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2
Change-Id: I184efbd8e07a14bd6f4d4767d42c2d14007b957a
Reviewed-on: https://chromium-review.googlesource.com/823762
Reviewed-by: Chris Harrelson <chrishtr@chromium.org>
Commit-Queue: Xianzhu Wang <wangxianzhu@chromium.org>
Cr-Commit-Position: refs/heads/master@{#523827}
[modify] https://crrev.com/fb51b1885ed5d11ff92e055ea0d837f7926b9d00/third_party/WebKit/PRESUBMIT.py
[modify] https://crrev.com/fb51b1885ed5d11ff92e055ea0d837f7926b9d00/third_party/WebKit/Source/core/paint/PaintLayerPainter.cpp

Status: Fixed (was: Assigned)
Sorry I just found a problem of the numbers in #c6: they count only the CLs successfully landed using the bots, but not jobs for dry run and webkit-patch rebaseline-cl and failed CQ jobs. Based on the bot status page, the load of linux_trusty_blink_rel in November was about 50 per day on average in range of 7~98. Adding blink paint CLs will increase the load by about 0.5 times. Let's see what will happen after the new presubmit rule.
We limit the number of configurations in the CQ both for capacity reasons ($$) and for maintainability reasons; for the latter case, for example, every additional configuration we add is another chance for a flaky test to hurt us.

We have to balance this off against having failures get through the CQ and just show up on the waterfalls, of course, and so we try to pay attention to that and make sure things aren't changing over time.

Release w/o DCHECK failures are actually pretty rare, so I wouldn't be inclined to want to add additional configs for them.

Separately, it's important that we try to make sure that tests run the same way with or without DCHECKs, since we don't have a way of distinguishing between the two in TestExpectations (and I don't want to add one, because I don't want to encourage tests that do behave differently).

Sign in to add a comment