New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 875319 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Oct 17
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression

Blocking:
issue 876508



Sign in to add a comment

161.8% regression in memory.desktop at 583519:583572

Project Member Reported by tdres...@chromium.org, Aug 17

Issue description

See the link to graphs below.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=875319

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=382a5c758ae065e283210d38ec205d6ef55481daeeaaeafba75425974cd6cd26


Bot(s) for this bug's original alert(s):

linux-perf
Cc: fsam...@chromium.org
Owner: fsam...@chromium.org
Status: Assigned (was: Untriaged)
📍 Found a significant difference after 1 commit.
https://pinpoint-dot-chromeperf.appspot.com/job/118692ea640000

Enable Viz Display Compositor on perf bots by fsamuel@chromium.org
https://chromium.googlesource.com/chromium/src/+/209e15b44e0c2c2d127b5c9a4416b996d663cc5e
4.589e+06 → 1.127e+07 (+6.677e+06)

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions
Cc: kylec...@chromium.org sadrul@chromium.org samans@chromium.org
Owner: kylec...@chromium.org
Reassigning to kylechar@ for triage. 

Comment 6 Deleted

Cc: -sunn...@chromium.org -ericrk@chromium.org -piman@chromium.org
I meant to add the above to  https://crbug.com/876508  instead.
Cc: backer@chromium.org piman@chromium.org
This is a real regression. How texture allocations work have changed a bit since the original alert and now tiles use SharedImages. I can still reproduce the original regression when running with --disable-features=VizDisplayCompositor vs --enable-features=VizDisplayCompositor.

The browser calls SharedImageInterfaceProxy::CreateSharedImage() for tile textures. A bit after the video goes full screen, that browser tries to free that tile memory and calls SharedImageInterfaceProxy::DestroySharedImage() for most of the tile textures.

SharedImageInterfaceProxy::DestroySharedImage() enqueues a deferred message at [1]. Without OOP-D those messages get flushed pretty quickly and memory is freed. With OOP-D those messages aren't getting flushed while the fullscreen video is open, probably because the browser UI compositor is no longer drawing anything and that GPU channel wouldn't be doing much. The IPC to destroy shared images isn't sent and memory isn't freed until the video is no longer full screen.

I've attached traces for oop-d disabled and enabled. The entries in the GPU process for gpu/shared-images/client_0x1 show the increase in memory.

piman/backer: Does this sound like something that needs to be fixed?

[1] https://cs.chromium.org/chromium/src/gpu/ipc/client/shared_image_interface_proxy.cc?l=52&rcl=d385e20f394c56a28715b7720401d3172cab4d04
Sorry, I don't see attachments.

This should be fixed. I'm happy to take a look early next week. I helped debug a similar issue where we just needed to FlushPendingWork (https://cs.chromium.org/chromium/src/cc/resources/resource_pool.cc?rcl=113faa68a0173e2667de80f546ddc81a5365a7ad&l=512).
Adding the attachments for real.
trace_disabled.html
3.8 MB View Download
trace_oopd.html
3.8 MB View Download
Cc: -backer@chromium.org
Owner: backer@chromium.org
Blocking: 876508
Status: Started (was: Assigned)
Looking...
This is similar to what I've seen before so pretty easy to find.

UI related GPU memory is reclaimed in ResourcePool::EvictExpiredResources. There's logic in there to reclaim memory that hasn't been reused in the 5 seconds since it was last freed. When we reclaim memory, we do it as a message to the GPU service (either as glDeleteTextures via command buffer pre-SharedImage or EnqueueDeferredMessage post-SharedImage). These messages are not received GPU service side until they are flushed. Without OOP-D, we had some pressure on the channel to flush it due to the draw calls issued by the display compositor. With OOP-D we no longer have this pressure.

Now there is logic in ResourcePool::EvictExpiredResources to force a flush, but it only kicks in if |unused_resources_.empty()|. I'm guessing we're recycling enough UI resources in this test that we never go down to 0.
It would be the main thread context created in VizProcessTransportFactory that needs to do the flush. It's created at [1] but also has automatic flushes set to false.

[1] https://cs.chromium.org/chromium/src/content/browser/compositor/viz_process_transport_factory.cc?l=81&rcl=8a1ad3a1a881b6fbc20c9c2ca03c46b94e272447
Actually I'm not sure if it's the main thread context, it could be the worker context, but both are created in VizProcessTransportFactory in the same way.
Project Member

Comment 18 by bugdroid1@chromium.org, Oct 17

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9bf42619ce76159d90ab4b968208ad620c042ac2

commit 9bf42619ce76159d90ab4b968208ad620c042ac2
Author: Jonathan Backer <backer@chromium.org>
Date: Wed Oct 17 14:39:23 2018

Bound time to delete cc::Resource

With OOP-D, we no longer have draw call pressure inducing flushes.
https://crbug.com/c/875319 suggests that there are cases where we
reuse tiles enough in the UI compositor that we never hit the
 unused_resources_.empty() condition in the EvictExpiredResource method.

This CL adds a deadline for the flush, so that we wait at most 1 second.

Bug:  875319 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: Ifa8d516699be7ae1eff47db01f8be55a0aae92c8
Reviewed-on: https://chromium-review.googlesource.com/c/1284435
Commit-Queue: Jonathan Backer <backer@chromium.org>
Reviewed-by: Antoine Labour <piman@chromium.org>
Reviewed-by: Sunny Sachanandani <sunnyps@chromium.org>
Cr-Commit-Position: refs/heads/master@{#600387}
[modify] https://crrev.com/9bf42619ce76159d90ab4b968208ad620c042ac2/cc/resources/resource_pool.cc
[modify] https://crrev.com/9bf42619ce76159d90ab4b968208ad620c042ac2/cc/resources/resource_pool.h

Status: Fixed (was: Started)
Fixes when I run the telemetry test locally.

Sign in to add a comment