Issue metadata
Sign in to add a comment
|
OOP-D telemetry memory regressions |
||||||||||||||||||||||
Issue descriptionSee the link to graphs below.
,
Aug 21
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/12c1185e640000
,
Aug 21
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/12c1185e640000 Enable Viz Display Compositor on perf bots by fsamuel@chromium.org https://chromium.googlesource.com/chromium/src/+/209e15b44e0c2c2d127b5c9a4416b996d663cc5e 1.913e+07 → 2.089e+07 (+1.768e+06) Understanding performance regressions: http://g.co/ChromePerformanceRegressions
,
Aug 22
Issue 875017 has been merged into this issue. Issue 875317 has been merged into this issue. Issue 875318 has been merged into this issue. Issue 875335 has been merged into this issue. Issue 875363 has been merged into this issue. Issue 875378 has been merged into this issue. Issue 875420 has been merged into this issue. Issue 876004 has been merged into this issue. Issue 876013 has been merged into this issue. Issue 876084 has been merged into this issue.
,
Aug 22
,
Aug 22
Issue 876031 has been merged into this issue.
,
Aug 22
,
Sep 10
(posting on the correct bug this time) The cc memory regression is an accounting problem. DisplayResourceProvider and ResourcePool are both counting the same texture memory. Let RP, T1, T2, DRP be MemoryAllocationDumps. RP is from ResourcePool that created the tile. T1 is from the TextureManager instance for the command buffer in the same client that has the ResourcePool. T2 is from the TextureManager instance for the command buffer in the same client as the DisplayResourceProvider. DRP is from DisplayResourceProvider. Let (guid:values) define a GUID, generated from values, linking MemoryAllocationDumps together. Let <-> indicate an edge between vertices in the memory dump graph. The way memory dumps are added today is the following: RP <-> (guid:command_buffer_id1+client_texture_id) <-> T1 <-> (guid:command_buffer_id1+service_texture_id) (guid:command_buffer_id2+service_texture_id) <-> T2 <-> (guid:command_buffer_id2+client_texture_id) <-> DRP If command_buffer_id1 and command_buffer_id2 are the same then all of the memory dumps for a texture are linked together. The memory will only be counted once, although the trace viewer UI doesn't display this link correctly. Without OOP-D browser textures the command buffers have the same id, so the memory is only counted once. Without OOP-d renderer textures have a different command buffer id than DisplayResourceProvider, so the memory is double counted. With OOP-D both renderer and browser command buffers are different than DisplayResourceProvider command buffer and memory is double counted. The regression here is the extra double counted memory. A potential fix is to change the GUID for the service_texture_id to be generated based on the share group, like so: RP <-> (guid:command_buffer_id/client_texture_id) <-> T1 <-> (guid:share_group_id/service_texture_id) <-> T2 <-> (guid:command_buffer_id/client_texture_id) <-> DRP Normally contexts need to be in the same share group to use the same texture. This would work correctly for browser and renderer, with and without OOP-D. The complication I've run into here so far is that when using pass through command buffer textures are shared between contexts in different share groups. Using a constant share group id for the GUID with pass through command buffer might work? sunnyps came to a similar conclusion on https://crbug.com/875017#c4 .
,
Sep 10
+ericrk
,
Sep 10
For passthrough, contexts use different (driver-level) GLShareGroup, to be 1:1 with client ContextGroups, but texture IDs are still shared across share groups thanks to an ANGLE extension, so for the purpose of tracking service side textures, it's just as if they're in the same share group, so we can definitely create a constant share group id for them.
,
Sep 11
,
Sep 13
Just had an offline discussion with piman@. Even though we use separate share groups, the EGL_ANGLE_display_texture_share_group extension enables sharing of texture objects (only) across all contexts. And on other platforms we have a single GLShareGroup instance anyway. I suggest not using share group tracing guid for generation service texture guids, or using a constant value for all share groups if that makes sense.
,
Sep 13
I'll change my implementation in https://crrev.com/c/1211955 to not use the share group id. That's even simpler.
,
Sep 14
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d167a866a8f6c4d270830501a6fbd623128a32ad commit d167a866a8f6c4d270830501a6fbd623128a32ad Author: kylechar <kylechar@chromium.org> Date: Fri Sep 14 20:16:29 2018 Improve GL texture memory reporting. When using InProcessCommandBuffer for OOP-D we weren't tracking texture memory correctly as there was no MemoryTracker instance created along with the ContextGroup. This CL fixes that by pulling GpuCommandBufferMemoryTracker out of CommandBufferStub into it's own header. GpuCommandBufferMemoryTracker is generalized a bit not to require a GpuChannel* on construction but otherwise functions the same. For platforms other than WebView, InProcessCommandBuffer will create a GpuCommandBufferMemoryTracker. Unfortunately just implementing MemoryTracker doesn't fix memory reporting. MemoryAllocatorDumps for the same texture from multiple instances of TextureManager aren't linked together. This is because ShareGroupTracingGUID() doesn't return a share group id, it returns a command buffer id, so TextureManagers for different command buffers won't generate the same GUID. Fix this by removing ShareGroupTracingGUID(). All contexts are expected to be in the same share group, or virtualized to the same context in ANGLE, so differentiating between share groups isn't necessary. Add CommandBufferId() for when the command buffer id is desired. Bug: 876508 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Change-Id: Ie084061d321ffa49ca7c31619707e60dfca3dd97 Reviewed-on: https://chromium-review.googlesource.com/1211955 Reviewed-by: Antoine Labour <piman@chromium.org> Reviewed-by: Bo <boliu@chromium.org> Reviewed-by: Jonathan Backer <backer@chromium.org> Commit-Queue: kylechar <kylechar@chromium.org> Cr-Commit-Position: refs/heads/master@{#591448} [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/android_webview/browser/deferred_gpu_command_service.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/android_webview/browser/deferred_gpu_command_service.h [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/base/trace_event/memory_infra_background_whitelist.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/BUILD.gn [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/buffer_manager.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/gles2_cmd_decoder_unittest.cc [add] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/gpu_command_buffer_memory_tracker.cc [add] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/gpu_command_buffer_memory_tracker.h [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/memory_tracking.h [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/mocks.h [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/raster_decoder_unittest.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/renderbuffer_manager.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/texture_manager.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/command_buffer/service/texture_manager_unittest.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/command_buffer_task_executor.h [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/gpu_in_process_thread_service.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/gpu_in_process_thread_service.h [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/in_process_command_buffer.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/service/command_buffer_stub.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/service/gles2_command_buffer_stub.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/service/raster_command_buffer_stub.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/service/shared_image_stub.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/gpu/ipc/service/shared_image_stub.h [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/ui/gl/trace_util.cc [modify] https://crrev.com/d167a866a8f6c4d270830501a6fbd623128a32ad/ui/gl/trace_util.h
,
Sep 17
Confirming at #14 caused the expected drop in cc::effective_size on Android, Mac and Linux. That aspect of double counting should be fixed. No change seen on Windows due to missing MemoryDumpReporter code with passthrough command decoder. https://chromeperf.appspot.com/report?sid=eee4efc070809d130323f1f210cba1c3a364d4e0549c2ca793b1360a5ad680f6
,
Sep 18
,
Sep 18
,
Sep 18
Removing people that got cc'd from the merged bugs.
,
Sep 18
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1853b734e013824b9a25a24f172ccd6a15cdf2b6 commit 1853b734e013824b9a25a24f172ccd6a15cdf2b6 Author: kylechar <kylechar@chromium.org> Date: Tue Sep 18 22:57:35 2018 Have VizProcessContextProvider use InProcessCommandBuffer directly. VizProcessContextProvider relies on GLInProcessContext to own/initialize some classes necessary to support a command buffer. We'll need to access even more internal state from GLInProcessContext for memory reporting which means adding another accessor. Instead of adding another accessors, move initialization logic into VizProcessContextProvider. This removes a step of indirection and makes the code a bit easier to understand. Bug: 876508 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Change-Id: Ib66ad5596e6c349695a25279b65792260ea89105 Reviewed-on: https://chromium-review.googlesource.com/1231829 Reviewed-by: Bo <boliu@chromium.org> Reviewed-by: Antoine Labour <piman@chromium.org> Commit-Queue: kylechar <kylechar@chromium.org> Cr-Commit-Position: refs/heads/master@{#592241} [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/android_webview/browser/aw_render_thread_context_provider.cc [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/cc/test/test_in_process_context_provider.cc [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/common/gl_helper_benchmark.cc [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/common/gl_helper_unittest.cc [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/common/yuv_readback_unittest.cc [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/service/BUILD.gn [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/service/display_embedder/viz_process_context_provider.cc [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/components/viz/service/display_embedder/viz_process_context_provider.h [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/gpu/ipc/client/gpu_in_process_context_tests.cc [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/gpu/ipc/gl_in_process_context.cc [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/gpu/ipc/gl_in_process_context.h [modify] https://crrev.com/1853b734e013824b9a25a24f172ccd6a15cdf2b6/ui/compositor/test/in_process_context_provider.cc
,
Sep 19
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e7eca0a627c8289ec11efd27a26df8fe5c81e7e7 commit e7eca0a627c8289ec11efd27a26df8fe5c81e7e7 Author: kylechar <kylechar@chromium.org> Date: Wed Sep 19 15:49:18 2018 Add memory dump support to VizProcessContextProvider. VizProcessContextProvider is missing memory reporting logic found in ContextProviderCommandBuffer. Make VizProcessContextFactory a MemoryDumpProvider so display compositor gpu memory is reported again. This is expected to have an impact on the gpu and skia memory tracing categories. This isn't a regression itself, it's just fixing memory reporting that was broken when VizDisplayCompositor experiment was enabled by fieldtrial test config. Bug: 876508 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel Change-Id: Ia885fccd122dbfc8909f62e0a970c10ab3499632 Reviewed-on: https://chromium-review.googlesource.com/1209985 Reviewed-by: Jonathan Backer <backer@chromium.org> Commit-Queue: kylechar <kylechar@chromium.org> Cr-Commit-Position: refs/heads/master@{#592417} [modify] https://crrev.com/e7eca0a627c8289ec11efd27a26df8fe5c81e7e7/components/viz/service/display_embedder/viz_process_context_provider.cc [modify] https://crrev.com/e7eca0a627c8289ec11efd27a26df8fe5c81e7e7/components/viz/service/display_embedder/viz_process_context_provider.h
,
Oct 9
Issue 860342 has been merged into this issue.
,
Oct 11
There was an increase in malloc'd memory and gpu memory with OOP-D enabled. The gpu memory category increase was mainly after landing fixes for memory tracing with OOP-D, not right away when OOP-D was enabled. This increase is primary due to command buffer and transfer buffer memory. Before OOP-D with a single browser window there was one context for the ui::Compositor/viz::Display. There is a 1mb of command buffer and some transfer buffer memory allocated in shared memory for the context. After OOP-D with a single browser window there is still one context for the ui::Compositor. There is also a second context for the viz::Display in the GPU process. There is no process boundary being crossed here so command buffer and transfer buffer memory isn't in shared memory. With OOP-D there is usually going to be one extra context (at most one extra since ui::Compositors share the same context now). This represents a real memory usage increase for the extra command buffer and transfer buffer memory. In the future the display compositor(s) won't need command buffers and will talk directly to skia, but until then it's a regression to live with. The buffer memory used for the OOP-D display compositors is also reported differently. The IPC buffer memory uses shared memory, and memory tracing will link together the gpu and shared_memory dumps, so that the shared_memory category ends up with an effective size of 0 and gpu category counts the memory. The OOP-D display compositor buffer memory isn't shared_memory but instead malloc'd. The malloc category doesn't have detailed allocations, so we can't link up the gpu memory dump with a malloc memory dump to only count one. Both gpu and malloc categories will count the buffer memory as a result. This is just a reporting issue but it makes the regression look worse. I think crbug.com/892850 is the only OOP-D memory regression left to deal with.
,
Oct 11
To be clear, crbug.com/892850 is an issue with instrumentation --- it is not a regression in performance. There is a narrow window of time when some tiles are not attributed to the client (time between SharedImage creation and RasterTask completing on compositor impl thread). This will manifest as a reported regression in GPU process memory usage. It's also likely that we won't fix the instrumentation (see https://bugs.chromium.org/p/chromium/issues/detail?id=892850#c11), because fixes for the instrumentation would hurt performance (e.g. unnecessary locking). I wouldn't block on that bug...
,
Oct 11
,
Oct 19
,
Jan 3
Issue 880384 has been merged into this issue. |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Aug 21