New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 707281 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Apr 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 2
Type: Task



Sign in to add a comment

SPv2: Assorted virtual/threaded/animation LayoutTests crashing in PropertyTrees::GetAnimationScales.

Project Member Reported by wkorman@chromium.org, Mar 31 2017

Issue description

~10 layout tests crashing at ToT in both debug and release, I'm pretty sure this is a recent regression, investigating.

% ./third_party/WebKit/Tools/Scripts/run-webkit-tests --additional-driver-flag=--enable-slimming-paint-v2 virtual/threaded/animations

virtual/threaded/animations/composited-animations-rotate-zero-degrees.html	crash log sample
virtual/threaded/animations/responsive-neutral-keyframe.html	crash log sample stderr
virtual/threaded/animations/timing-functions.html	crash log sample stderr
virtual/threaded/animations/transitions-retarget.html	crash log sample
virtual/threaded/animations/animation-offscreen-to-onscreen.html	crash log sample stderr
virtual/threaded/animations/inline-block-transform.html	crash log sample
virtual/threaded/animations/3d/matrix-transform-type-animation.html	crash log sample stderr
virtual/threaded/animations/viewport-unit-animation-responsive.html	crash log sample stderr
virtual/threaded/animations/img-element-transform.html	crash log sample
virtual/threaded/animations/element-animate-positive-delay.html

Potentially related to existing already filed bugs but looking into these specifically right now as I'm seeing continued seemingly flaky crashing across a range of tests including these and others in attempting to land https://codereview.chromium.org/2724083002/

Sample crash stack for virtual/threaded/animations/transitions-retarget.html and similar for virtual/threaded/animations/img-element-transform.html:

#4 0x7ff49a7b9e40 cc::LayerImpl::GetMutatorHost()
#5 0x7ff49aa9c6e4 cc::PropertyTrees::GetAnimationScales()
#6 0x7ff49a7db6fb cc::PictureLayerImpl::RecalculateRasterScales()
#7 0x7ff49a7da442 cc::PictureLayerImpl::UpdateTiles()
#8 0x7ff49aa6db66 cc::LayerTreeImpl::UpdateDrawProperties()
#9 0x7ff49aa343ad cc::LayerTreeHostImpl::UpdateSyncTreeAfterCommitOrImplSideInvalidation()
#10 0x7ff49aa342cc cc::LayerTreeHostImpl::CommitComplete()
#11 0x7ff49aadea06 cc::ProxyImpl::ScheduledActionCommit()
#12 0x7ff49a955038 cc::Scheduler::ProcessScheduledActions()
#13 0x7ff49a9560fe cc::Scheduler::NotifyReadyToCommit()
#14 0x7ff49aada94c cc::ProxyImpl::NotifyReadyToCommitOnImpl()

See also  http://crbug.com/692310#c15 
 
Suspect change:

https://codereview.chromium.org/2738573002 - Streamline the presentation of ImageBitmapRenderingContext

Double checking to see if this is indeed cause.

Background -- this looks to be the first build that shows a batch of tests like this failing:

https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_layout_tests_slimming_paint_v2/builds/3484

A near-preceding build that does not show those failing:

https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_layout_tests_slimming_paint_v2/builds/3482
LOL that change is not yet landed. Research continues.
Cc: weiliangc@chromium.org
Bisected to breaking change as:

http://crrev.com/2758343002 cc: Use Element Id to Record Animation Changes

I will look more into why this broke SPv2 composited animations. But to prevent similar in future, pdr@ when can we auto-add the SPv2 trybot to changes in cc?
Pinged  http://crbug.com/703478  on bot capacity. Propose that we send note to graphics-dev@ asking whether now is a good time to add to existing cc/PRESUBMIT.py.
Cc: sunxd@chromium.org
The crash seen in PropertyTrees::GetAnimationScales is around a line modified as part of:

http://crrev.com/2105673003 - cc:: Compute animation scale on demand

I wonder whether:

http://crrev.com/2758343002 - cc: Use Element Id to Record Animation Changes

somehow missed invalidating/marking-for-update the transform cache where transforms are being animated. From the animation scale change description above:

"...computing animation scale may reuse animation scale data which
belongs to other transform nodes, so we need a PropertyTreesCachedData
to store this information, and invalidate it every time we update the
transform tree."

What we are specifically seeing however around the crash is that we somehow have a transform node with an owning_layer_id = -1. How does that happen? Is it ever a reasonable thing to see? Going to compare what we saw there before and after http://crrev.com/2758343002.
At the change immediately preceding http://crrev.com/2758343002 we never see a call to PropertyTrees::GetAnimationScales for virtual/threaded/animations/img-element-transform.html under SPv2.

Syncing to http://crrev.com/2758343002 and running test case in content shell we see a succession of calls to PropertyTrees::GetAnimationScales and then eventually the noted crash. Sample log below.

Compare was done synced to:

1. git checkout 1f4b78a69bd2994e484c7e5e7dc5743bead8a721 (breaking change)
2. git checkout 14d16f0ea00e7716032e6452ee8d04d3a54d9e59 (change before the breaking change)

The test case I'm currently focusing on as one of the batch of failures is a simple test that attempts to prove that an image element can be animated with a simple rotation transform. So, it creates an image element and then calls target.animate({transform: ['rotate(0deg)', 'rotate(180deg)']}, 100000) and animation.ready. We see the crash after animate() but before the ready promise is resolved.

Next planned step is to understand what about the patch leads to new calls to PropertyTrees::GetAnimationScales. There are only two callsites:

a. PictureLayerImpl::RecalculateRasterScales
b. a recursive call within PropertyTrees::GetAnimationScales which can't be the culprit since we see no logging at all synced to #2.

So, the change must have introduced a call to PictureLayerImpl::RecalculateRasterScales that wasn't seen previously.

It's possible this is a red herring, ex. that while the change did introduce these differences, they are not directly responsible for the bug. But everything starts somewhere.

Sample run command:

% ./out/Default/content_shell --expose-internals-for-testing --enable-blink-features=WebAnimationsAPI,CSSAdditiveAnimations,StackedCSSPropertyAnimations,WebAnimationsSVG --enable-slimming-paint-v2 --enable-threaded-compositing --disable-composited-antialiasing third_party/WebKit/LayoutTests/virtual/threaded/animations/img-element-transform.html

Log from running at #1 with instrumentation:

[1:5:0403/165458.261434:4502700905863:ERROR:property_tree.cc(1814)] PropertyTrees::GetAnimationScales [transform_node_id=4].
[1:5:0403/165458.261791:4502700906157:ERROR:property_tree.cc(1814)] PropertyTrees::GetAnimationScales [transform_node_id=3].
[1:5:0403/165458.262020:4502700906382:ERROR:property_tree.cc(1814)] PropertyTrees::GetAnimationScales [transform_node_id=2].
[1:5:0403/165458.262238:4502700906688:ERROR:property_tree.cc(1814)] PropertyTrees::GetAnimationScales [transform_node_id=1].
[1:5:0403/165458.262605:4502700907142:ERROR:property_tree.cc(1814)] PropertyTrees::GetAnimationScales [transform_node_id=0].
[1:5:0403/165458.263068:4502700907609:ERROR:property_tree.cc(1872)] About to enter if-sequence [node=0x12e1b9f1da20].
[1:5:0403/165458.263524:4502700908072:ERROR:property_tree.cc(1887)] Case 2 [node=0x12e1b9f1da20].
[1:5:0403/165458.263997:4502700908473:ERROR:property_tree.cc(1872)] About to enter if-sequence [node=0x12e1b9f1db70].
[1:5:0403/165458.264405:4502700908889:ERROR:property_tree.cc(1887)] Case 2 [node=0x12e1b9f1db70].
[1:5:0403/165458.264798:4502700909264:ERROR:property_tree.cc(1872)] About to enter if-sequence [node=0x12e1b9f1dcc0].
[1:5:0403/165458.265156:4502700909553:ERROR:property_tree.cc(1887)] Case 2 [node=0x12e1b9f1dcc0].
[1:5:0403/165458.265444:4502700909887:ERROR:property_tree.cc(1872)] About to enter if-sequence [node=0x12e1b9f1de10].
[1:5:0403/165458.265806:4502700910356:ERROR:property_tree.cc(1887)] Case 2 [node=0x12e1b9f1de10].
[1:5:0403/165458.266270:4502700910743:ERROR:property_tree.cc(1872)] About to enter if-sequence [node=0x12e1b9f1df60].
[1:5:0403/165458.266637:4502700911058:ERROR:property_tree.cc(1907)] PropertyTrees::GetAnimationScales [node=0x12e1b9f1df60, owning_layer_id=-1].
[1:5:0403/165458.266982:4502700911400:FATAL:property_tree.cc(1909)] Check failed: layer_impl. 

PictureLayerImpl::RecalculateRasterScales is called at both sync points.

After http://crrev.com/2758343002 draw_properties().screen_space_transform_is_animating is true after the animate() call.

Before the change it is always false and so we do not enter the block in PictureLayerImpl::RecalculateRasterScales that calls PropertyTrees::GetAnimationScales.

I've been trying to comment out logic in the change to see what introduces this state difference but I haven't found it yet.
Need to better understand:
- ComputeDrawPropertiesOfVisibleLayers()
- layer->draw_properties().screen_space_transform_is_animating
- transform_node->to_screen_is_potentially_animated

At breaking change:

[1:5:0403/174454.534495:4505697178891:ERROR:draw_property_utils.cc(935)] ComputeDrawPropertiesOfVisibleLayers [layer=33, node=4, to_screen_is_potentially_animated=1].
[1:5:0403/174454.535672:4505697180118:ERROR:picture_layer_impl.cc(1001)] PictureLayerImpl::RecalculateRasterScales [draw_properties().screen_space_transform_is_animating=1].
(...subsequent calls to PropertyTrees::GetAnimationScales which end up crashing)

Before breaking change:

[1:5:0403/174833.350908:4505915995274:ERROR:draw_property_utils.cc(935)] ComputeDrawPropertiesOfVisibleLayers [layer=33, node=4, to_screen_is_potentially_animated=0].
[1:5:0403/174833.351687:4505915997652:ERROR:picture_layer_impl.cc(1001)] PictureLayerImpl::RecalculateRasterScales [draw_properties().screen_space_transform_is_animating=0].
(...no calls to PropertyTrees::GetAnimationScales because screen_space_transform_is_animating is always false)
Breaking change causes transform node id=4 to have has_potential_animation=1 which propagates to to_screen_is_potentially_animated.

At breaking change:

TransformTree::UpdateAnimationProperties [node=4, to_screen_is_potentially_animated=1, has_potential_animation=1, ancestor_is_animating=0].
TransformTree::UpdateAnimationProperties [node=4, to_screen_is_potentially_animated=1, has_potential_animation=1, ancestor_is_animating=0].
ComputeDrawPropertiesOfVisibleLayers [layer=33, node=4, to_screen_is_potentially_animated=1].
PictureLayerImpl::RecalculateRasterScales [draw_properties().screen_space_transform_is_animating=1].

Before breaking change:

TransformTree::UpdateAnimationProperties [node=4, to_screen_is_potentially_animated=0, has_potential_animation=0, ancestor_is_animating=0].

From further instrumentation it appears that LayerTreeHostImpl::ElementIsAnimatingChanged is setting transform_node->has_potential_animation = true whereas previously we didn't.

In LayerTreeHostImpl we used to have:

LayerImpl* layer = tree->LayerByElementId(element_id);
if (layer)
  layer->OnIsAnimatingChanged(mask, state);

Now we have essentially inlined OnIsAnimatingChanged, and we call it irrespective of whether there is a layer, but neither of those should matter.

Previously Layer::OnIsAnimatingChanged called transform_tree.UpdateNodeFromOwningLayerId(id()) and was always seeing a null transform node. Log excerpt for before:

(Calling animate.)
Layer::OnIsAnimatingChanged checking transform property [id=31, transform_node=0]
(Called animate.)
Layer::OnIsAnimatingChanged checking transform property [id=15, transform_node=0].

but now we are using the element-id-to-transform-node map via transform_tree.FindNodeFromElementId(element_id) and we find the transform node for the animation's involved element id, and so we proceed which leads to setting transform_node->has_potential_animation.

The before-breaking-change behavior seems conceptually wrong given that the test case does have a transform animation. Perhaps new behavior is correct but we are not properly set up with scale cache data or something like that.
For SPv2 we never set cc::TransformNode.owning_layer_id in PropertyTreeManager, though we do for other node types. I'm not sure whether this was an intentional oversight.

The owning layer id is currently required for the scale computation, hence we see crash noted in bug original description.

Setting it as we do for other node types fixes the crash and seems an appropriate fix until we rework the involved cc-side logic to use element id and property nodes rather than layers (presuming that's feasible). Preparing a change for review now.

I will sub-title the above bug investigation research notes "Blind Descent: The Quest to Discover the Deepest Cave on Earth":

https://www.amazon.com/Blind-Descent-Quest-Discover-Deepest/dp/0812979494
It should definitely be feasible to stop using owner id.
Cc: chrishtr@chromium.org
Notes to briefly scope what would be obviously involved in modifying cc animation logic to avoid using layer here, focusing on all refs to cc::TransformNode.owning_layer_id. The specific issue this bug hit is (3).

1. LayerUtils::GetAnimationBounds: we get the layer from the node in order to call
   LayerUtils::HasTransformAnimationThatInflatesBounds and LayerImpl::TransformAnimationBoundsForBox.

   a. LayerUtils::HasTransformAnimationThatInflatesBounds: pass-through that calls LayerImpl::HasTransformAnimationThatInflatesBounds, which in turn calls GetMutatorHost()->HasTransformAnimationThatInflatesBounds(element_id()). So rather than knowing the mutator host and element id through having looked up the layer holding that data, we'd need a way to instead get from the transform node to the element id and associated mutator host. Unsure whether something exists to enable this, but not hard to add, assuming overhead is acceptable (see also bug link in (4) below).

   b. LayerImpl::TransformAnimationBoundsForBox: looks same as 1(a) above.

2. draw_property_utils TransformTreeIndexForBackfaceVisibility : checks layer id vs the transform node's owning layer id. Could use element id instead, requires element id field on node, or node-to-element-id map, see similar in (4) below.

3. PropertyTrees::GetAnimationScales: looks up layer via node's owning layer id in order to:

   a. call MutatorHost::MaximumTargetScale() with layer's element id. Fixing is similar to 1(a) above.

   b. call MutatorHost::AnimationStartScale() with layer's element id, and LayerImpl::GetElementTypeForAnimation(). Fixing the first part is similar to 1(a) above. LayerImpl::GetElementTypeForAnimation() scrutinizes LayerImpl::IsActive(), so we'd need some other way to determine whether the involved animation is active on its involved layer / element-id. Not sure if something already exists to enable, but could add.

4. TransformNode.operator==: change to use element_id, should be straightforward except that cc::TransformNode does not have element_id on it directly currently. See also discussion in  http://crbug.com/702774#c5 . Perhaps there is a map we can use instead, or we'd need to add field or build a reverse map.
Project Member

Comment 13 by bugdroid1@chromium.org, Apr 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f03d05f724aabaf8e9db6df23bb38dbc1fe04ff6

commit f03d05f724aabaf8e9db6df23bb38dbc1fe04ff6
Author: wkorman <wkorman@chromium.org>
Date: Thu Apr 06 20:29:01 2017

Set owning layer id on transform nodes in SPv2.

http://crrev.com/2758343002 switched LayerTreeHostImpl to
look up transform nodes by element id rather than layer id.

This led to identifying a transform animation change in
SPv2 in some cases that were not identified previously.

This in turn led to raster scale computation crashing due
to expecting an owning layer id on the involved transform
node, whereas our SPv2 compositing logic that builds the
cc transform node was not setting an owning layer id.

We should update cc-side animation logic to use element
id and property tree nodes wherever possible rather than
layers, but in interim, setting an owning layer id on the
cc transform node is an acceptable solution.

BUG= 707281 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2

Review-Url: https://codereview.chromium.org/2800983002
Cr-Commit-Position: refs/heads/master@{#462605}

[modify] https://crrev.com/f03d05f724aabaf8e9db6df23bb38dbc1fe04ff6/third_party/WebKit/LayoutTests/FlagExpectations/enable-slimming-paint-v2
[modify] https://crrev.com/f03d05f724aabaf8e9db6df23bb38dbc1fe04ff6/third_party/WebKit/Source/platform/graphics/compositing/PaintArtifactCompositorTest.cpp
[modify] https://crrev.com/f03d05f724aabaf8e9db6df23bb38dbc1fe04ff6/third_party/WebKit/Source/platform/graphics/compositing/PropertyTreeManager.cpp

Status: Fixed (was: Started)

Sign in to add a comment