New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 876508 link

Starred by 7 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 19
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression

Blocked on:
issue 875319



Sign in to add a comment

OOP-D telemetry memory regressions

Project Member Reported by hjd@google.com, Aug 21

Issue description

See the link to graphs below.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=876508

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=b49beb9ff54d16c79c806eb7bf5bdc54f93810eb17d2fe2d106f3fd97e83e3e9


Bot(s) for this bug's original alert(s):

Win 7 Nvidia GPU Perf
Cc: fsam...@chromium.org
Owner: fsam...@chromium.org
Status: Assigned (was: Untriaged)
📍 Found a significant difference after 1 commit.
https://pinpoint-dot-chromeperf.appspot.com/job/12c1185e640000

Enable Viz Display Compositor on perf bots by fsamuel@chromium.org
https://chromium.googlesource.com/chromium/src/+/209e15b44e0c2c2d127b5c9a4416b996d663cc5e
1.913e+07 → 2.089e+07 (+1.768e+06)

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions
Cc: tdres...@chromium.org samans@chromium.org kylec...@chromium.org sadrul@chromium.org ericrk@chromium.org
 Issue 875017  has been merged into this issue.
Issue 875317 has been merged into this issue.
 Issue 875318  has been merged into this issue.
Issue 875335 has been merged into this issue.
 Issue 875363  has been merged into this issue.
Issue 875378 has been merged into this issue.
Issue 875420 has been merged into this issue.
Issue 876004 has been merged into this issue.
 Issue 876013  has been merged into this issue.
 Issue 876084  has been merged into this issue.
Cc: sande...@chromium.org
 Issue 875508  has been merged into this issue.
 Issue 876031  has been merged into this issue.
Owner: kylec...@chromium.org
Cc: piman@chromium.org sunn...@chromium.org
(posting on the correct bug this time)

The cc memory regression is an accounting problem. DisplayResourceProvider and ResourcePool are both counting the same texture memory. 

Let RP, T1, T2, DRP be MemoryAllocationDumps. RP is from ResourcePool that created the tile. T1 is from the TextureManager instance for the command buffer in the same client that has the ResourcePool. T2 is from the TextureManager instance for the command buffer in the same client as the DisplayResourceProvider. DRP is from DisplayResourceProvider. Let (guid:values) define a GUID, generated from values, linking MemoryAllocationDumps together. Let <-> indicate an edge between vertices in the memory dump graph. The way memory dumps are added today is the following:

RP <-> (guid:command_buffer_id1+client_texture_id) <-> T1 <-> (guid:command_buffer_id1+service_texture_id)
(guid:command_buffer_id2+service_texture_id) <-> T2 <-> (guid:command_buffer_id2+client_texture_id) <-> DRP

If command_buffer_id1 and command_buffer_id2 are the same then all of the memory dumps for a texture are linked together. The memory will only be counted once, although the trace viewer UI doesn't display this link correctly.

Without OOP-D browser textures the command buffers have the same id, so the memory is only counted once. Without OOP-d renderer textures have a different command buffer id than DisplayResourceProvider, so the memory is double counted. With OOP-D both renderer and browser command buffers are different than DisplayResourceProvider command buffer and memory is double counted. The regression here is the extra double counted memory.

A potential fix is to change the GUID for the service_texture_id to be generated based on the share group, like so:

RP <-> (guid:command_buffer_id/client_texture_id) <-> T1 <-> (guid:share_group_id/service_texture_id) <-> T2 <-> (guid:command_buffer_id/client_texture_id) <-> DRP

Normally contexts need to be in the same share group to use the same texture. This would work correctly for browser and renderer, with and without OOP-D. The complication I've run into here so far is that when using pass through command buffer textures are shared between contexts in different share groups. Using a constant share group id for the GUID with pass through command buffer might work?

sunnyps came to a similar conclusion on  https://crbug.com/875017#c4 .
+ericrk
Cc: geoffl...@chromium.org
Components: Internals>GPU>Internals
For passthrough, contexts use different (driver-level) GLShareGroup, to be 1:1 with client ContextGroups, but texture IDs are still shared across share groups thanks to an ANGLE extension, so for the purpose of tracking service side textures, it's just as if they're in the same share group, so we can definitely create a constant share group id for them.
Summary: OOP-D telemetry memory regressions (was: 6.4% regression in system_health.memory_desktop at 583498:583567)
Just had an offline discussion with piman@.  Even though we use separate share groups, the EGL_ANGLE_display_texture_share_group extension enables sharing of texture objects (only) across all contexts.  And on other platforms we have a single GLShareGroup instance anyway.  I suggest not using share group tracing guid for generation service texture guids, or using a constant value for all share groups if that makes sense.
I'll change my implementation in https://crrev.com/c/1211955 to not use the share group id. That's even simpler.
Project Member

Comment 14 by bugdroid1@chromium.org, Sep 14

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d167a866a8f6c4d270830501a6fbd623128a32ad

commit d167a866a8f6c4d270830501a6fbd623128a32ad
Author: kylechar <kylechar@chromium.org>
Date: Fri Sep 14 20:16:29 2018

Improve GL texture memory reporting.

When using InProcessCommandBuffer for OOP-D we weren't tracking texture
memory correctly as there was no MemoryTracker instance created along
with the ContextGroup.

This CL fixes that by pulling GpuCommandBufferMemoryTracker out of
CommandBufferStub into it's own header. GpuCommandBufferMemoryTracker is
generalized a bit not to require a GpuChannel* on construction but
otherwise functions the same. For platforms other than WebView,
InProcessCommandBuffer will create a GpuCommandBufferMemoryTracker.

Unfortunately just implementing MemoryTracker doesn't fix memory
reporting. MemoryAllocatorDumps for the same texture from multiple
instances of TextureManager aren't linked together. This is because
ShareGroupTracingGUID() doesn't return a share group id, it returns a
command buffer id, so TextureManagers for different command buffers
won't generate the same GUID.

Fix this by removing ShareGroupTracingGUID(). All contexts are expected
to be in the same share group, or virtualized to the same context in
ANGLE, so differentiating between share groups isn't necessary. Add
CommandBufferId() for when the command buffer id is desired.

Bug:  876508 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: Ie084061d321ffa49ca7c31619707e60dfca3dd97
Reviewed-on: https://chromium-review.googlesource.com/1211955
Reviewed-by: Antoine Labour <piman@chromium.org>
Reviewed-by: Bo <boliu@chromium.org>
Reviewed-by: Jonathan Backer <backer@chromium.org>
Commit-Queue: kylechar <kylechar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#591448}
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/android_webview/browser/deferred_gpu_command_service.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/android_webview/browser/deferred_gpu_command_service.h
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/base/trace_event/memory_infra_background_whitelist.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/BUILD.gn
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/buffer_manager.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/gles2_cmd_decoder_unittest.cc
[add] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/gpu_command_buffer_memory_tracker.cc
[add] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/gpu_command_buffer_memory_tracker.h
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/memory_tracking.h
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/mocks.h
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/raster_decoder_unittest.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/renderbuffer_manager.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/texture_manager.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/texture_manager_unittest.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/command_buffer_task_executor.h
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/gpu_in_process_thread_service.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/gpu_in_process_thread_service.h
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/in_process_command_buffer.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/service/command_buffer_stub.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/service/gles2_command_buffer_stub.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/service/raster_command_buffer_stub.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/service/shared_image_stub.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/service/shared_image_stub.h
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/ui/gl/trace_util.cc
[modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/ui/gl/trace_util.h

Confirming at #14 caused the expected drop in cc::effective_size on Android, Mac and Linux. That aspect of double counting should be fixed. No change seen on Windows due to missing MemoryDumpReporter code with passthrough command decoder.

https://chromeperf.appspot.com/report?sid=eee4efc070809d130323f1f210cba1c3a364d4e0549c2ca793b1360a5ad680f6
Cc: mustaq@chromium.org
 Issue 879199  has been merged into this issue.
Cc: farahcharab@google.com
 Issue 879203  has been merged into this issue.
Cc: -farahcharab@google.com -mustaq@chromium.org
Removing people that got cc'd from the merged bugs.
Project Member

Comment 19 by bugdroid1@chromium.org, Sep 18

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1853b734e013824b9a25a24f172ccd6a15cdf2b6

commit 1853b734e013824b9a25a24f172ccd6a15cdf2b6
Author: kylechar <kylechar@chromium.org>
Date: Tue Sep 18 22:57:35 2018

Have VizProcessContextProvider use InProcessCommandBuffer directly.

VizProcessContextProvider relies on GLInProcessContext to own/initialize
some classes necessary to support a command buffer. We'll need to access
even more internal state from GLInProcessContext for memory reporting
which means adding another accessor. Instead of adding another
accessors, move initialization logic into VizProcessContextProvider.
This removes a step of indirection and makes the code a bit easier to
understand.

Bug:  876508 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: Ib66ad5596e6c349695a25279b65792260ea89105
Reviewed-on: https://chromium-review.googlesource.com/1231829
Reviewed-by: Bo <boliu@chromium.org>
Reviewed-by: Antoine Labour <piman@chromium.org>
Commit-Queue: kylechar <kylechar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#592241}
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/android_webview/browser/aw_render_thread_context_provider.cc
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/cc/test/test_in_process_context_provider.cc
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/common/gl_helper_benchmark.cc
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/common/gl_helper_unittest.cc
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/common/yuv_readback_unittest.cc
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/service/BUILD.gn
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/service/display_embedder/viz_process_context_provider.cc
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/service/display_embedder/viz_process_context_provider.h
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/gpu/ipc/client/gpu_in_process_context_tests.cc
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/gpu/ipc/gl_in_process_context.cc
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/gpu/ipc/gl_in_process_context.h
[modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/ui/compositor/test/in_process_context_provider.cc

Project Member

Comment 20 by bugdroid1@chromium.org, Sep 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e7eca0a627c8289ec11efd27a26df8fe5c81e7e7

commit e7eca0a627c8289ec11efd27a26df8fe5c81e7e7
Author: kylechar <kylechar@chromium.org>
Date: Wed Sep 19 15:49:18 2018

Add memory dump support to VizProcessContextProvider.

VizProcessContextProvider is missing memory reporting logic found in
ContextProviderCommandBuffer. Make VizProcessContextFactory a
MemoryDumpProvider so display compositor gpu memory is reported again.

This is expected to have an impact on the gpu and skia memory tracing
categories. This isn't a regression itself, it's just fixing memory
reporting that was broken when VizDisplayCompositor experiment was
enabled by fieldtrial test config.

Bug:  876508 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: Ia885fccd122dbfc8909f62e0a970c10ab3499632
Reviewed-on: https://chromium-review.googlesource.com/1209985
Reviewed-by: Jonathan Backer <backer@chromium.org>
Commit-Queue: kylechar <kylechar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#592417}
[modify] https://crrev.com/e7eca0a627c8289ec11efd27a26df8fe5c81e7e7/components/viz/service/display_embedder/viz_process_context_provider.cc
[modify] https://crrev.com/e7eca0a627c8289ec11efd27a26df8fe5c81e7e7/components/viz/service/display_embedder/viz_process_context_provider.h

Issue 860342 has been merged into this issue.
Blockedon: 892850
There was an increase in malloc'd memory and gpu memory with OOP-D enabled. The gpu memory category increase was mainly after landing fixes for memory tracing with OOP-D, not right away when OOP-D was enabled. This increase is primary due to command buffer and transfer buffer memory.

Before OOP-D with a single browser window there was one context for the ui::Compositor/viz::Display. There is a 1mb of command buffer and some transfer buffer memory allocated in shared memory for the context.

After OOP-D with a single browser window there is still one context for the ui::Compositor. There is also a second context for the viz::Display in the GPU process. There is no process boundary being crossed here so command buffer and transfer buffer memory isn't in shared memory.

With OOP-D there is usually going to be one extra context (at most one extra since ui::Compositors share the same context now). This represents a real memory usage increase for the extra command buffer and transfer buffer memory. In the future the display compositor(s) won't need command buffers and will talk directly to skia, but until then it's a regression to live with.

The buffer memory used for the OOP-D display compositors is also reported differently. The IPC buffer memory uses shared memory, and memory tracing will link together the gpu and shared_memory dumps, so that the shared_memory category ends up with an effective size of 0 and gpu category counts the memory. The OOP-D display compositor buffer memory isn't shared_memory but instead malloc'd. The malloc category doesn't have detailed allocations, so we can't link up the gpu memory dump with a malloc memory dump to only count one. Both gpu and malloc categories will count the buffer memory as a result. This is just a reporting issue but it makes the regression look worse.

I think  crbug.com/892850  is the only OOP-D memory regression left to deal with.
To be clear,  crbug.com/892850  is an issue with instrumentation --- it is not a regression in performance. There is a narrow window of time when some tiles are not attributed to the client (time between SharedImage creation and RasterTask completing on compositor impl thread). This will manifest as a reported regression in GPU process memory usage.

It's also likely that we won't fix the instrumentation (see https://bugs.chromium.org/p/chromium/issues/detail?id=892850#c11), because fixes for the instrumentation would hurt performance (e.g. unnecessary locking).

I wouldn't block on that bug...
Blockedon: -892850 875319
Ah, I meant to link/block on  crbug.com/875319  instead.
Status: Fixed (was: Assigned)
Cc: robertph...@google.com herb@google.com allanmac@google.com maxlg@chromium.org
 Issue 880384  has been merged into this issue.

Sign in to add a comment