telemetry_perf_unittests causing a DCHECK in paint on chromium.linux/Linux Tests (dbg)(1) |
||||||||
Issue descriptionFiled by sheriff-o-matic@appspot.gserviceaccount.com on behalf of clamy@google.com telemetry_perf_unittests failing on chromium.linux/Linux Tests (dbg)(1) Builders failed on: - Linux Tests (dbg)(1): https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29 I suspect this was caused by the latest catapult roll: https://chromium-review.googlesource.com/c/chromium/src/+/1256077
,
Oct 2
mattcary: can you check if this was caused by your CL from the roll: 2018-10-02 mattcary@chromium.org Tracing: add java_base* executable memory metrics.
,
Oct 2
Looking. Telemetry perf tests are timing out on one shard. It seems unlikely to be my change, which added a small amount of extra processing of memory stats in order to add a new metric. The perf tests have been sporadically timing out since yesterday, before my change was submitted. So I suspect that whatever caused that to start happening hasn't fixed itself yet :) There seems to be something strange going on in the SystemHealthSmokeTests, but as that seems to be a generated benchmark I'm not entirely sure what's going on.
,
Oct 3
I'm not clear how to read the output from the telemetry perf tests, but based on https://chromium-swarm.appspot.com/task?id=4053080c07276210&refresh=10&show_raw=1 it looks like there may be a crash happening? """ Found minidump via globbing in minidump dir ... Thread 0 (crashed) 0 libc-2.19.so + 0x36c37 ... Found by: given as instruction pointer in context 1 libbase.so!base::debug::BreakDebugger() + 0x18 ... 8 libbase.so!~LogMessage [logging.cc : 833 + 0x3e] ... 9 libblink_core.so!blink::FindVisualRectNeedingUpdateScopeBase::FindVisualRectNeedingUpdateScopeBase(blink::LayoutObject const&, blink::PaintInvalidatorContext const&, blink::LayoutRect const&) + 0x121 ... ~LogMessage implies a DCHECK or LOG(FATAL) to me. The FindVisualRectNeedingUpdateScopeBase constructor does have two of them: DCHECK(context.tree_builder_context_actually_needed_); DCHECK(context.NeedsVisualRectUpdate(object)); A long shot, but wangxianzhu@ was involved in fixing issue 836097 which also mentioned a DCHECK similar to the latter (but in a different context), so I'm cc-ing them. wangxianzhu@, can you add any information about FindVisualRectNeedingUpdateScopeBase ? (Please note this may be a red herring and I may be way off base).
,
Oct 3
Note: I am seeing this minidump consistently in the failing telemetry_perf_unittests cases; every run I've checked since 2018-10-02 10:04 AM.
,
Oct 3
vmpstr@, this seems related to the recent change about paint invalidation dirty flags (https://chromium-review.googlesource.com/c/chromium/src/+/1237246). It seems the case that some ancestor previously needed visual rect update but we just marked it through the ancestor chain of some descendant needing visual rect update.
,
Oct 3
Good call, that seems plausible to me. Reassigning the bug as I'm clearly out of my depth here.
,
Oct 3
We had two optimizations here. One that is mentioned in #6, which could cause the DCHECK(context.NeedsVisualRectUpdate(object)); dcheck. We also had one in https://chromium-review.googlesource.com/c/chromium/src/+/1252941 which could cause DCHECK(context.tree_builder_context_actually_needed_); to trigger as well. The latter is in the blamelist of https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/74779 which is the first build where it started failing a lot. I'm going to revert a part of that change that I suspect cause this and then investigate.
,
Oct 3
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5d1aacaa63cd61c15db0d96d63047a29a527cd35 commit 5d1aacaa63cd61c15db0d96d63047a29a527cd35 Author: Vladimir Levin <vmpstr@chromium.org> Date: Wed Oct 03 19:36:43 2018 Partially revert an isolation boundary optimization. This patch reverts a part of crrev.com/367a97e. This seems to have caused DCHECK failures on one of the telemetry tests. R=chrishtr@chromium.org, pdr@chromium.org, wangxianzhu@chromium.org Bug: 891270 Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_slimming_paint_v2;master.tryserver.blink:linux_trusty_blink_rel Change-Id: Ib99768154470fe69796e3a47645ffb1c981661a9 Reviewed-on: https://chromium-review.googlesource.com/c/1259298 Reviewed-by: Philip Rogers <pdr@chromium.org> Reviewed-by: Xianzhu Wang <wangxianzhu@chromium.org> Reviewed-by: Chris Harrelson <chrishtr@chromium.org> Commit-Queue: vmpstr <vmpstr@chromium.org> Cr-Commit-Position: refs/heads/master@{#596325} [modify] https://crrev.com/5d1aacaa63cd61c15db0d96d63047a29a527cd35/third_party/blink/renderer/core/paint/pre_paint_tree_walk.cc
,
Oct 3
I can confirm locally that it's the latter dcheck from #8: on www.westpac.com.au (telemetry archive): [1:1:1003/131039.617050:FATAL:find_paint_offset_and_visual_rect_needing_update.h(72)] Check failed: context.tree_builder_context_actually_needed_. #0 0x7f23945fae3d base::debug::StackTrace::StackTrace() #1 0x7f239430388a base::debug::StackTrace::StackTrace() #2 0x7f239437369b logging::LogMessage::~LogMessage() #3 0x7f23761de961 blink::FindVisualRectNeedingUpdateScopeBase::FindVisualRectNeedingUpdateScopeBase() #4 0x7f2376f1cbf0 blink::FindObjectVisualRectNeedingUpdateScope::FindObjectVisualRectNeedingUpdateScope() #5 0x7f2376f1b65c blink::PaintInvalidator::InvalidatePaint() #6 0x7f2376f7d1ba blink::PrePaintTreeWalk::WalkInternal() ... From initial logging: LayoutBlockFlow UL class='slider js-slider' descendant needs paint property update LayoutListItem (relative positioned) LI class='slide slide-classic' needs paint property update LayoutBlockFlow (positioned) <pseudo:before> needs paint property update LayoutTextFragment (anonymous) not required ^^^^^^^^^^^^^^^^^^ - that one actually needs a visual rect update. Will update if I can reduce, or if I have further leads.
,
Oct 3
Here's the bug: When creating a tree builder context, we consider only the parent context's subtree needs visual rect update. That fails, so we don't actually need the tree builder context. Then, we update for self (this is a debug build so we always create a context). After that we hit this path: https://cs.chromium.org/chromium/src/third_party/blink/renderer/core/paint/pre_paint_tree_walk.cc?dr&sq=package:chromium&g=0&l=334 in this case, the clip_changed is an inherited property from the parent. Because of this, we set the subtree needs visual rect update, but we don't have a valid tree builder context (we wouldn't have it in release), which triggers a dcheck later on. The fix is probably to remove this line as well, since it follows the same reasoning: clip changing should not cause the subtree to need visual rect updates _especially_ if the clip changed is inherited from the parent. I'll be writing a test and put up a patch with the reland of #10 as well as the fix to ensure we don't set the subtree needs visual rect update here as well.
,
Oct 4
+sullivan, +mattcary The patch in 9 fixed the crash, which I'm no longer seeing in builds. However, the builder isn't green all the way. Ie there are still telemetry_perf_unittests and telemetry_unittests failures very frequently. I think this needs to be re-triaged. I'm renaming this issue to be specific to the DCHECK that was happening.
,
Oct 4
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3fa6c505ea3f2e3dc229164c357052b11ca81023 commit 3fa6c505ea3f2e3dc229164c357052b11ca81023 Author: Vladimir Levin <vmpstr@chromium.org> Date: Thu Oct 04 19:00:56 2018 Reland of visual rect subtree walk optimization. This patch relands a previous revert (crrev.com/5d1aacaa) with a fix. The problem was that if self clip changed, then it would set a bitfield requiring a subtree walk, but at that point we already don't have a valid tree builder context. This should only have triggered in debug, but it's good to fix anyway since the release will cause an unnecessary recursion. Added a test that fails the same DCHECK with just a pure re-land, and passes with the additional changes. R=chrishtr@chromium.org, pdr@chromium.org, wangxianzhu@chromium.org Bug: 891270 Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_slimming_paint_v2;master.tryserver.blink:linux_trusty_blink_rel Change-Id: I20bec21e474fe22752eaf2737e43a121125e20e2 Reviewed-on: https://chromium-review.googlesource.com/c/1261930 Reviewed-by: Xianzhu Wang <wangxianzhu@chromium.org> Commit-Queue: vmpstr <vmpstr@chromium.org> Cr-Commit-Position: refs/heads/master@{#596788} [modify] https://crrev.com/3fa6c505ea3f2e3dc229164c357052b11ca81023/third_party/blink/renderer/core/paint/pre_paint_tree_walk.cc [modify] https://crrev.com/3fa6c505ea3f2e3dc229164c357052b11ca81023/third_party/blink/renderer/core/paint/pre_paint_tree_walk_test.cc
,
Oct 4
,
Oct 4
,
Oct 5
Thanks for your fix! I'll look at the bot failures and create a bug as appropriate.
,
Oct 5
The linux telemetry tests seem stable now. |
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by clamy@chromium.org
, Oct 2Status: Assigned (was: Available)