Remove workarounds for null layers and invalid property tree indices |
|||||||||||||||||||
Issue descriptionWe shouldn't have null layers and invalid property tree indices on the compositor thread. But, we have many crash reports with them and have added workarounds. The actual reason for these null/invalid values should be investigated and the workarounds should be removed.
,
May 26 2017
One thing we could do to get a repro is make the DCHECKs in tree synchronizer that check for valid property tree indices and valid layers into CHECKs https://cs.chromium.org/chromium/src/cc/trees/tree_synchronizer.cc?q=tree_synchronizer.cc&sq=package:chromium&dr&l=105
,
May 26 2017
There are many places with workarounds for this, the recent ones are : https://codereview.chromium.org/2901393004/ , https://codereview.chromium.org/2889433002
,
May 26 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/954e46372100531c3e780f6c64da5fd91e988f4b commit 954e46372100531c3e780f6c64da5fd91e988f4b Author: jaydasika <jaydasika@chromium.org> Date: Fri May 26 23:17:19 2017 cc : Workaround for null layer impls and invalid property tree indices Temporary Workarounds to prevent crashing when we a null layer_impl in draw_property_utils::FindLayersThatNeedUpdates and invalid property tree indices in draw_property_utils::LayerShouldBeSkippedInternal BUG=725851, 726423 CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel Review-Url: https://codereview.chromium.org/2901393004 Cr-Commit-Position: refs/heads/master@{#475178} [modify] https://crrev.com/954e46372100531c3e780f6c64da5fd91e988f4b/cc/trees/draw_property_utils.cc
,
May 30 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d86317b087ace8959d3f68936fb346a87a9b0ea3 commit d86317b087ace8959d3f68936fb346a87a9b0ea3 Author: Khushal <khushalsagar@chromium.org> Date: Tue May 30 21:07:27 2017 cc : Workaround for null layer impls and invalid property tree indices Temporary Workarounds to prevent crashing when we a null layer_impl in draw_property_utils::FindLayersThatNeedUpdates and invalid property tree indices in draw_property_utils::LayerShouldBeSkippedInternal BUG=725851, 726423 CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel Review-Url: https://codereview.chromium.org/2901393004 Cr-Original-Commit-Position: refs/heads/master@{#475178} Review-Url: https://codereview.chromium.org/2915653002 . Cr-Commit-Position: refs/branch-heads/3071@{#725} Cr-Branched-From: a106f0abbf69dad349d4aaf4bcc4f5d376dd2377-refs/heads/master@{#464641} [modify] https://crrev.com/d86317b087ace8959d3f68936fb346a87a9b0ea3/cc/trees/draw_property_utils.cc
,
Jun 2 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5867068aa6d1677d3ff8a9da36ba793757b171aa commit 5867068aa6d1677d3ff8a9da36ba793757b171aa Author: khushalsagar <khushalsagar@chromium.org> Date: Fri Jun 02 22:11:04 2017 cc: Workaround invalid property tree state on Layers. The impl thread is somehow getting Layers with invalid property tree indices which cause crashes during UpdateDrawProperties. Don't add them to the visible layer list to keep from crashing later in the stack. BUG=728511, 727209, 726423 CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel Review-Url: https://codereview.chromium.org/2918163002 Cr-Commit-Position: refs/heads/master@{#476811} [modify] https://crrev.com/5867068aa6d1677d3ff8a9da36ba793757b171aa/cc/layers/layer_impl.cc [modify] https://crrev.com/5867068aa6d1677d3ff8a9da36ba793757b171aa/cc/layers/layer_impl.h [modify] https://crrev.com/5867068aa6d1677d3ff8a9da36ba793757b171aa/cc/trees/draw_property_utils.cc
,
Jun 2 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5867068aa6d1677d3ff8a9da36ba793757b171aa commit 5867068aa6d1677d3ff8a9da36ba793757b171aa Author: khushalsagar <khushalsagar@chromium.org> Date: Fri Jun 02 22:11:04 2017 cc: Workaround invalid property tree state on Layers. The impl thread is somehow getting Layers with invalid property tree indices which cause crashes during UpdateDrawProperties. Don't add them to the visible layer list to keep from crashing later in the stack. BUG=728511, 727209, 726423 CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel Review-Url: https://codereview.chromium.org/2918163002 Cr-Commit-Position: refs/heads/master@{#476811} [modify] https://crrev.com/5867068aa6d1677d3ff8a9da36ba793757b171aa/cc/layers/layer_impl.cc [modify] https://crrev.com/5867068aa6d1677d3ff8a9da36ba793757b171aa/cc/layers/layer_impl.h [modify] https://crrev.com/5867068aa6d1677d3ff8a9da36ba793757b171aa/cc/trees/draw_property_utils.cc
,
Jun 6 2017
Most crash reports seem to indicate that this was observed more with OOPIF enabled.
,
Jun 21 2017
,
Jun 21 2017
Issue 728511 has been merged into this issue.
,
Jun 21 2017
Issue 730020 has been merged into this issue.
,
Jun 21 2017
Issue 733066 has been merged into this issue.
,
Jun 21 2017
,
Jun 21 2017
Users experienced this crash on the following builds: Win Beta 60.0.3112.32 - 0.17 CPM, 140 reports, 138 clients (signature cc::draw_property_utils::`anonymous namespace'::LayerShouldBeSkippedInternal<cc::LayerImpl>) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Jun 21 2017
Users experienced this crash on the following builds: Win Beta 60.0.3112.32 - 0.58 CPM, 483 reports, 454 clients (signature cc::draw_property_utils::FindLayersThatNeedUpdates) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Jun 21 2017
Users experienced this crash on the following builds: Win Dev 61.0.3128.0 - 0.81 CPM, 242 reports, 219 clients (signature cc::LayerImpl::render_target_effect_tree_index) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Jun 21 2017
Users experienced this crash on the following builds: Win Canary 61.0.3136.0 - 0.74 CPM, 8 reports, 8 clients (signature cc::LayerImpl::CanUseLCDText) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Jun 22 2017
,
Jun 22 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1523e66b397af8bd58682fee98052cccbd4bd16c commit 1523e66b397af8bd58682fee98052cccbd4bd16c Author: khushalsagar <khushalsagar@chromium.org> Date: Thu Jun 22 02:02:30 2017 cc: Add CHECKS to diagnose invalid property tree indices on impl thread. The property tree state should be valid on the main thread after the UpdateLayers stage which builds/updates the tree and on the sync tree after the commit completes. Add CHECKs to validate that. BUG= 726423 R=enne@chromium.org CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel Review-Url: https://codereview.chromium.org/2951853004 Cr-Commit-Position: refs/heads/master@{#481402} [modify] https://crrev.com/1523e66b397af8bd58682fee98052cccbd4bd16c/cc/trees/layer_tree_host.cc
,
Jun 26 2017
Issue 736736 has been merged into this issue.
,
Jun 26 2017
So the change above points that the property tree state on main layers is valid before the commit but is not being synced correctly to LayerImpls. I thought it might be that we are updating the indices somewhere but not marking the layer for push properties but there is only one place that updates these indices and it does do that (https://cs.chromium.org/chromium/src/cc/layers/layer.cc?q=layer.cc&dr&l=943). Any other ideas?
,
Jun 26 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1ec4a223a498c48e5f510a580c9890479f5714b6 commit 1ec4a223a498c48e5f510a580c9890479f5714b6 Author: khushalsagar <khushalsagar@chromium.org> Date: Mon Jun 26 21:05:41 2017 cc: Add CHECKs for diagnosing invalid property tree indices post-commit. Include mask layers when validating property tree indices post layer update, and peform this check right before starting commit as well. BUG= 726423 R=weiliangc@chromium.org CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel Review-Url: https://codereview.chromium.org/2962563002 Cr-Commit-Position: refs/heads/master@{#482394} [modify] https://crrev.com/1ec4a223a498c48e5f510a580c9890479f5714b6/cc/trees/layer_tree_host.cc
,
Jun 27 2017
Issue 737272 has been merged into this issue.
,
Jun 27 2017
Issue 729496 has been merged into this issue.
,
Jun 27 2017
Users experienced this crash on the following builds: Linux Dev 61.0.3135.4 - 0.66 CPM, 3 reports, 2 clients (signature cc::ScrollTree::MaxScrollOffset) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Jun 28 2017
Can we get this addressed before M61 hits stable?
,
Jun 28 2017
Getting closer to the problem. So with https://codereview.chromium.org/2962563002, we check the validity of the property tree state on layers at 2 points. 1) During UpdateLayers, immediately after rebuilding/updating the PropertyTrees in PropertyTreeBuilder::BuildPropertyTrees. After this point, these indices should not change in cc until the main frame ends. 2) Right before starting the commit in LayerTreeHost::FinishCommitOnImplThread. Its crashing at 2) which implies that someone tried to mutate this state during painting. The only place where SetFooTreeIndex call is made on a Layer outside of PropertyTree building is from SetLayerTreeHost which invalidates all of them. We have DCHECKs for making sure that the layer properties are not changed during paint. I'm going to change the ones in SetFooTreeIndex to CHECKs and the stack trace should point to where that is coming from.
,
Jun 28 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ac130a51d426be4e44cf2d8743e284cbd360e408 commit ac130a51d426be4e44cf2d8743e284cbd360e408 Author: khushalsagar <khushalsagar@chromium.org> Date: Wed Jun 28 22:27:51 2017 cc: Don't touch layer state during painting. The state on Layers, including property tree state, should not be changed during painting. Change DHCECKs for it to CHECKs at a few places to diagnose a crash. BUG= 726423 R=enne@chromium.org CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel Review-Url: https://codereview.chromium.org/2957373002 Cr-Commit-Position: refs/heads/master@{#483174} [modify] https://crrev.com/ac130a51d426be4e44cf2d8743e284cbd360e408/cc/layers/layer.cc
,
Jun 28 2017
We are seeing and increase on render crashes in Chrome OS after these last CLs merged to the last builds going from 1 to 78 and counting https://docs.google.com/spreadsheets/d/1PkObKunssh2WeLPNYIzrRa6312g-e_0CKY5MEINnnpA/edit#gid=702180007 khushalsagar@ can you evaluate?
,
Jun 28 2017
Discussed offline. The crashes we are seeing spiking are just from the crash moving one place to another. We traded: https://crash.corp.google.com/browse?q=product.name%3D%27Chrome_ChromeOS%27%20AND%20product.Version%20in%20(%2759.0.3071.91%27%2C%2759.0.3071.113%27)%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27cc%3A%3Adraw_property_utils%3A%3AFindLayersThatNeedUpdates%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D for https://crash.corp.google.com/browse?q=product.name%3D%27Chrome_ChromeOS%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27cc%3A%3ALayerImpl%3A%3ACanUseLCDText%27%20AND%20product.Version%20in%20(%2759.0.3071.91%27%2C%2759.0.3071.113%27)%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27cc%3A%3ALayerImpl%3A%3ACanUseLCDText%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#samplereports
,
Jun 30 2017
The crashes I'm seeing today in 61.0.3144.0 don't seem consistent with the analysis in Comment 27 because I see crashes with cc::LayerTreeHost::FinishCommitOnImplThread on the callstack, but no crashes from those CHECKs added in the commit in Comment 28. The CHECKs added prevent the transform_tree_index from changing, but what prevents a layer from being added to the host later? What guarantees that all layers will have SetTransformTreeIndex called with index!=kInvalidNodeId at some point? What prevents a layer from being added to two hosts (I notice in crash f853999e08000000 that the layer has a 2 refcount; is that typical?)
,
Jul 5 2017
Any updates?
,
Jul 6 2017
Re#31: If a layer is added to the host later, it unconditionally invalidates all property tree indices (https://cs.chromium.org/chromium/src/cc/layers/layer.cc?l=141), which should hit the CHECKs added for property tree indices being changed during paint in #28. I expected this to happen during paint because between updating PropertyTrees in UpdateLayers and committing, this is the only point where we make calls into blink that could inadvertently change this state. During BeginMainFrame, if we proceeded to the commit stage, we ensure that we go through the UpdateLayers stage which is the only point that updates/rebuilds PropertyTrees and their indices on all layers, other than all indices being invalidated when a layer is added/removed from a host. I would expect us to hit this CHECK (https://cs.chromium.org/chromium/src/cc/trees/layer_tree_host.cc?dr&l=781), if valid indices were not set for any layer during this stage. A Layer can not be on 2 hosts. When it is added to the tree, it binds to the host for this tree in SetLayerTreeHost. The 2 refcount could be explained from the layer's parent and WebLayerImpl holding a reference to it. But yes, I'm confused that the CHECKs in #28 didn't change anything...
,
Jul 6 2017
Another crazy idea, could we have a commit without going through UpdateLayers? I think (https://cs.chromium.org/chromium/src/cc/trees/proxy_main.cc?q=proxy_main.cc&dr&l=216) evicted_ui_resources could cause that. But it would be incorrect to have a layer added to the tree, which has invalid indices, and not have that request a commit anyway. Just to be certain that we are not hitting this case, could we always go through UpdateLayers if we are going to be proceed to a commit? It looks like the ui resource eviction is the only path which could avoid that and its not a significant optimization. The eviction cases are rare, and if there was nothing to update, there shouldn't be much work done anyway.
,
Jul 7 2017
crbug/725851 was merged into this bug, that crash is happening on M-60 and the crash stack is seen on Android: https://crash.corp.google.com/browse?q=product.name%3D%27Chrome%27%20AND%20custom_data.ChromeCrashProto.ptype%3D%27renderer%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27cc%3A%3Adraw_property_utils%3A%3AFindLayersThatNeedUpdates%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D
,
Jul 7 2017
Oops, was using wrong query in c#35, removing Android
,
Jul 11 2017
Issue 733087 has been merged into this issue.
,
Jul 11 2017
A friendly reminder that M61 branch is coming soon on 07/20! Your bug is labelled as Stable ReleaseBlock, pls make sure to land the fix ASAP to trunk. This way we branch M61 from a high quality trunk. Thank you.
,
Jul 12 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a21fb782d3d27a44ffbd174f37352f34a1de8159 commit a21fb782d3d27a44ffbd174f37352f34a1de8159 Author: Khushal <khushalsagar@chromium.org> Date: Wed Jul 12 21:48:41 2017 cc: Always update layers before commit. In the case of ui resource eviction on the impl thread, layers may not be updated prior to a commit on the main thread. Bug: 726423 Change-Id: Ia6874df31084b53619155959a5ed85694e23fdf5 Reviewed-on: https://chromium-review.googlesource.com/564159 Reviewed-by: Weiliang Chen <weiliangc@chromium.org> Commit-Queue: Khushal <khushalsagar@chromium.org> Cr-Commit-Position: refs/heads/master@{#486103} [modify] https://crrev.com/a21fb782d3d27a44ffbd174f37352f34a1de8159/cc/trees/proxy_main.cc
,
Jul 12 2017
If we have a fix, any chance we can get this into 59? This is currently blocking stable and we want to get a new 59 out for Chrome OS next week (with a build in the next day or so such that we can test this week).
,
Jul 12 2017
I'm not certain that the change above will fix the issue, we need to get some feedback from canary. Removing the 59 blocker since it won't be possible to get a fix stable enough to merge by next week.
,
Jul 15 2017
[Stability sheriff] Khushal, can you check if this crash has been fixed on canary? If this is fixed, please decide if you want to merge into M60. It's not clear to me what the latest crash signature is. Can you please add the crash link for that? Thanks!
,
Jul 15 2017
The change above made it to the 61.0.3156.4 canary. Here is the crash link: https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.ptype%3D%27renderer%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27cc%3A%3A%60anonymous%20namespace%5C%27%3A%3AEnsureValidIndicesOnLayer%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#-property-selector,samplereports:5,productversion I haven't seen any crashes since then but I want to give it a couple of more days to be certain before merging.
,
Jul 17 2017
Thanks for the update. Please update this bug when you're ready to merge. FYI there's an M60 beta release planned for Wed Jul 19.
,
Jul 18 2017
Merge request for change in #39.
,
Jul 18 2017
This bug requires manual review: We are only 6 days from stable. Please contact the milestone owner if you have questions. Owners: amineer@(Android), cmasso@(iOS), josafat@(ChromeOS), bustamante@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 18 2017
Thanks for the fix. We are only 6 days away from Stable, and I'd like to ensure if this change is really needed. How commonly is this occurring, and what are the implications if we wait until M61? We are too close to M60 stable promotion, and if this isn't truly a RB, my recommendation is to wait until M61.
,
Jul 18 2017
Its a renderer crasher. This is what the crash rate is like on 59 which also had the bug: https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%3D%27cc%3A%3ALayerImpl%3A%3ACanUseLCDText%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27cc%3A%3ALayerImpl%3A%3ACanUseLCDText%27%20AND%20product.Version%20LIKE%20%27%2559.%25%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D Nothing drastic if we wait, just the crash will still be there. If the frequency doesn't warrant a merge, 61 should have the fix.
,
Jul 19 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b5b55e3f2de94682b20407e0a699a8fa27c31832 commit b5b55e3f2de94682b20407e0a699a8fa27c31832 Author: Khushal <khushalsagar@chromium.org> Date: Wed Jul 19 00:24:31 2017 cc: Remove CHECKs and workarounds for invalid property tree state. Bug: 726423 Change-Id: I856b099cc571dbb114259c5e21fcb8c4acbd9bf7 Reviewed-on: https://chromium-review.googlesource.com/576462 Commit-Queue: Khushal <khushalsagar@chromium.org> Reviewed-by: enne <enne@chromium.org> Cr-Commit-Position: refs/heads/master@{#487685} [modify] https://crrev.com/b5b55e3f2de94682b20407e0a699a8fa27c31832/cc/layers/layer.cc [modify] https://crrev.com/b5b55e3f2de94682b20407e0a699a8fa27c31832/cc/layers/layer_impl.cc [modify] https://crrev.com/b5b55e3f2de94682b20407e0a699a8fa27c31832/cc/layers/layer_impl.h [modify] https://crrev.com/b5b55e3f2de94682b20407e0a699a8fa27c31832/cc/trees/draw_property_utils.cc [modify] https://crrev.com/b5b55e3f2de94682b20407e0a699a8fa27c31832/cc/trees/layer_tree_host.cc [modify] https://crrev.com/b5b55e3f2de94682b20407e0a699a8fa27c31832/cc/trees/layer_tree_host_common.cc
,
Jul 19 2017
My recommendation is to wait until M61. Since these are present in M59, and it's quite late (we've missed last M60 beta), I say let's wait until M61. +bustamante@ to confirm.
,
Jul 19 2017
Yeah I agree with waiting until M61.
,
Jul 19 2017
I just saw this crash in telemetry_perf_unittests on android_n5x_swarming_rel: 07-19 19:22:42.663 19050 19111 F chromium: [FATAL:property_tree.cc(368)] Check failed: property_trees.scroll_tree.current_scroll_offset( scroll_node->element_id) == scroll_offset. https://build.chromium.org/p/tryserver.chromium.android/builders/android_n5x_swarming_rel https://chromium-swarm.appspot.com/task?id=37750e1b0334b110&refresh=10&show_raw=1 There are many other failures for that test in the last 200 builds but I don't know if that's related to property trees. Are these crashes related to this bug?
,
Jul 19 2017
Its not a null node, so I doubt its related to this change. I think pdr@ was working on something with scroll tree crashes on telemetry tests. pdr@, do these crashes look related?
,
Jul 19 2017
Per #50 and #51 punting to M61
,
Jul 20 2017
I think the crashes in comment 52 are unrelated to this bug. Are they still happening? If so, please file a bug and assign to me to look into.
,
Jul 20 2017
Marking fixed since we don't need a merge to 60. |
|||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||
Comment 1 by weiliangc@chromium.org
, May 26 2017