1%-16379.8% regression in loading.desktop at 585103:585228
Issue description: Regression on mac low-end bot.
,
Aug 23
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/139811b1640000
,
Aug 23
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/139811b1640000 Turn on OOP Raster on mac bots by enne@chromium.org https://chromium.googlesource.com/chromium/src/+/ee35f329c8e615505326a3b4df9ae4b7764037ae 2.662e+05 → 4.387e+07 (+4.36e+07) Understanding performance regressions: http://g.co/ChromePerformanceRegressions
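For reference, the percentage in the issue title can be reproduced from the two values Pinpoint reports here. This is a minimal sketch; the helper name `percent_change` is made up for illustration, and the inputs are the rounded figures from the bot message, so the result only approximately matches the 16379.8% in the title.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    if before == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (after - before) / before * 100.0


# Values as printed by Pinpoint (rounded to four significant digits).
before, after = 2.662e5, 4.387e7

delta = after - before
change = percent_change(before, after)

print(f"delta  = {delta:.3e}")
print(f"change = {change:.1f}%")
```

The small gap to the headline figure is consistent with the dashboard computing the percentage from unrounded values.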
,
Aug 28
,
Aug 28
This has some Android bugs duped in as well, which are definitely unrelated to the mac regression.
,
Sep 21
The time to first paint cold regressions all seem like dupes of issue 877587. There are a number of memory issues here (gpu, skia, cc effective size) that need investigating. There are also a number of mean frame time, input event latency, and percentage smooth issues that need investigating. The blink perf bindings regressions seem unrelated. cpu_time_percentage_avg/TrivialFullscreenVideoPageSharedPageState appears to have recovered.
,
Sep 21
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/15dfff97640000
,
Sep 21
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/150390c7640000
,
Sep 21
Here's a set of representative examples from each regression category:

ChromiumPerf/Android Nexus5 Perf/system_health.memory_mobile / memory:chrome:all_processes:reported_by_chrome:gpu:effective_size_avg / load_tools / load_tools_dropbox
graph: https://chromeperf.appspot.com/report?sid=2174ca7332af0d54b1efae111c34e42b4170461199e5c968972a9bb9cfdc47eb&rev=585228
pinpoint bisect: https://pinpoint-dot-chromeperf.appspot.com/job/15dfff97640000

ChromiumPerf/mac-10_13_laptop_high_end-perf/system_health.memory_desktop / memory:chrome:all_processes:reported_by_chrome:gpu:effective_size_avg / browse_media / browse_media_youtube
graph: https://chromeperf.appspot.com/report?sid=77c618b51edab76fc747a3e167f8c843b7476aa7eadaff618905d708cd43a678&start_rev=584306&end_rev=592771
pinpoint bisect: https://pinpoint-dot-chromeperf.appspot.com/job/150390c7640000

ChromiumPerf/mac-10_13_laptop_high_end-perf/system_health.memory_desktop / memory:chrome:all_processes:reported_by_os:system_memory:private_footprint_size_avg / browse_media / browse_media_pinterest
graph: https://chromeperf.appspot.com/report?sid=e9523f720237fda34c7be4288976db8b5319bc9b4cd3fef917cb8d59d0f47fda&rev=585143
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/1328ec3f640000

ChromiumPerf/mac-10_12_laptop_low_end-perf/system_health.memory_desktop / memory:chrome:all_processes:reported_by_chrome:skia:effective_size_avg / browse_media / browse_media_flickr_infinite_scroll
graph: https://chromeperf.appspot.com/report?sid=ebb114f166b1861a9165b1624e9a6e451ef301f2ff0311e2e1c5f8832f8bad0d&rev=585139
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/1744b7a8e40000

ChromiumPerf/mac-10_12_laptop_low_end-perf/system_health.memory_desktop / memory:chrome:all_processes:reported_by_chrome:skia:effective_size_avg / browse_news / browse_news_flipboard
graph: https://chromeperf.appspot.com/report?sid=d18c4442b50c8d289c332c550afdba1541519bff9d5d36189075d7fc194c5bda&rev=585139
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/132d9fa7640000

ChromiumPerf/mac-10_13_laptop_high_end-perf/system_health.memory_desktop / memory:chrome:all_processes:reported_by_chrome:cc:effective_size_avg / load_news / load_news_bbc
graph: https://chromeperf.appspot.com/report?sid=4e33ffef8ac779121d2a282a8bfc6fa10dfb98e466b9a2f85543ad5b058315c8&rev=585143
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/127b92b8e40000

ChromiumPerf/mac-10_13_laptop_high_end-perf/system_health.memory_desktop / memory:chrome:all_processes:reported_by_chrome:malloc:allocated_objects_size_avg / browse_news / browse_news_reddit
graph: https://chromeperf.appspot.com/report?sid=d0e401e8d7d100680e12dc6f6f059bb4f4d6eca9f96e5585fcef27f8ee965443&rev=585143
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/139cc597640000

ChromiumPerf/mac-10_12_laptop_low_end-perf/rendering.desktop / thread_raster_cpu_time_per_frame / web_animation_value_type_path
graph: https://chromeperf.appspot.com/report?sid=a05aa0a4c1b9279003f6de7d6d37cd888430187960bfb0751736e82026a2f746&rev=585139
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/11d4f3db640000

ChromiumPerf/mac-10_12_laptop_low_end-perf/rendering.desktop / input_event_latency_discrepancy / twitter_2018
graph: https://chromeperf.appspot.com/report?sid=b41fb48b73a591cb3d5f189f9b2a3c58567815a971fcf9483de3573352b73694&rev=585139
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/11d47670e40000

ChromiumPerf/mac-10_12_laptop_low_end-perf/rendering.desktop / mean_frame_time_renderer_compositor / web_animation_value_type_path
graph: https://chromeperf.appspot.com/report?sid=bf6fb2acdab99c56da8c8e99f7590cdefd1364ba32d39930953579d8d629ace2&rev=585139
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/17b05188e40000

ChromiumPerf/mac-10_12_laptop_low_end-perf/rendering.desktop / mean_frame_time / css_value_type_path
graph: https://chromeperf.appspot.com/report?sid=a863f344166703e2f26524f68a89d060b40e8e42d7aae257d408c61ab0a97b6f&rev=585139
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/16f796a4e40000

ChromiumPerf/mac-10_12_laptop_low_end-perf/rendering.desktop / percentage_smooth / css_value_type_shadow
graph: https://chromeperf.appspot.com/report?sid=87366d38d304155b0afed5d01f14f7e6cd3480ab4342eba02983783f09017799&rev=585139
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/131626e0e40000

ChromiumPerf/mac-10_12_laptop_low_end-perf/rendering.desktop / percentage_smooth / web_animation_value_type_shadow
graph: https://chromeperf.appspot.com/report?sid=16f9a635a356b94cca6bc2ba64cd55a1c04078206f0dd6d332390aa22bef16f1&rev=585139
pinpoint trace: https://pinpoint-dot-chromeperf.appspot.com/job/15d47670e40000
,
Sep 22
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/150390c7640000 Turn on OOP Raster on mac bots by enne@chromium.org https://chromium.googlesource.com/chromium/src/+/ee35f329c8e615505326a3b4df9ae4b7764037ae 2.844e+07 → 5.297e+07 (+2.453e+07) Understanding performance regressions: http://g.co/ChromePerformanceRegressions Benchmark documentation link: https://bit.ly/system-health-benchmarks
,
Sep 24
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/15dfff97640000 Remove Modern flag and hardcode #isChromeModernDesignEnabled to true by twellington@chromium.org https://chromium.googlesource.com/chromium/src/+/5955e6d67f3799a0cdd9dba7df8fa5460499f99f 2.19e+06 → 2.279e+06 (+8.926e+04) Understanding performance regressions: http://g.co/ChromePerformanceRegressions Benchmark documentation link: https://bit.ly/system-health-benchmarks
,
Sep 24
The change in #11 is for Android only (it wouldn't affect a desktop loading metric). There was a detected GPU memory regression, which is tracked in issue 877247. Removing myself as I don't think my change is applicable to this bug. Please feel free to re-add me if that's not the case.
,
Sep 24
Agreed that issue 877247 captures the Android changes that were grouped up in this bug, thanks!
,
Sep 25
Re: "thread_raster_cpu_time_per_frame / web_animation_value_type_path". Looking at the traces, there's nothing that pops out (e.g. image decodes, etc). It's just that everything bottoms out in ~10ms RasterCHROMIUM vs ~2ms GpuRasterBuffer::Playback. I think some more directed profiling on Mac will be required to understand this one.
,
Sep 27
Re: "thread_raster_cpu_time_per_frame / web_animation_value_type_path". Running this through Instruments makes it look like the majority of the cost is mapped memory when allocating in the transfer cache. It doesn't look like there are a lot of paths involved. There are ~630 paths, where each one repeats twice (so the second one is cached). The cached path is 1% of the time vs 95% of the time in creating the transfer cache entry. It looks like paths aren't reused from frame to frame.
,
Sep 27
I guess this is where we need to consider the optimization for inlining the data for small transfer cache entries in the command buffer instead of using mapped memory?
,
Sep 27
Re: "thread_raster_cpu_time_per_frame / web_animation_value_type_path". Each path is also only 44 bytes, so I think it's just that "300 transfer cache transactions is too many". I'll try reverting the "use transfer cache for paths" change and seeing what that looks like on the bots.
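As a back-of-the-envelope check on the figures in the last few comments, the sketch below computes the total path payload per frame. It assumes one reading of the numbers above: ~630 path draws per frame, each distinct path drawn twice, so roughly half of them need a new transfer cache entry. The constant names are made up for illustration.

```python
# Figures quoted in the comments above; how they combine is an assumption.
PATH_DRAWS_PER_FRAME = 630   # "~630 paths"
DRAWS_PER_UNIQUE_PATH = 2    # "each one repeats twice"
BYTES_PER_PATH = 44          # "each path is also only 44 bytes"

# Only the first draw of each path creates a transfer cache entry;
# the second draw hits the cache.
unique_entries = PATH_DRAWS_PER_FRAME // DRAWS_PER_UNIQUE_PATH
payload_bytes = unique_entries * BYTES_PER_PATH

print(f"new transfer cache entries per frame: {unique_entries}")
print(f"total path payload per frame: {payload_bytes} bytes")
```

Under that reading the payload is on the order of 14 KB per frame, which supports the point that the cost comes from the number of transactions rather than from the amount of data.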
,
Oct 8
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c8d0d44728928704030f8a901174d649f164cbe5

commit c8d0d44728928704030f8a901174d649f164cbe5
Author: Adrienne Walker <enne@chromium.org>
Date: Mon Oct 08 19:55:31 2018

Use transfer buffer for small transfer cache entries

OOP-R in RasterImplementation currently allocates the entire free space available in the transfer buffer so that it doesn't have to measure first then serialize and can just optimistically serialize into that space. This means that other transfer buffer consumers can't use it, and are instead forced to use mapped memory. This turns out to not be very efficient for large numbers of small allocations.

To sidestep this problem, small transfer cache entries are serialized into the heap when encountered. Then when raster is complete, it shrinks the transfer buffer down to the correct size and these small transfer cache entries are added after it. Then the commands for these entries are submitted first before raster.

For example, for three transfer cache tasks (A B C) and one raster (R) the transfer buffer and ring buffer will look like this:

transfer buffer ring buffer: R A B C
command buffer: A B C R

This patch also loosens the restriction on gpu::RingBuffer that there can be only one in use block at any time. This leads to potential exhaustion issues because the ring buffer won't reallocate while there are in use blocks. To avoid this, this optimization is only used when there is room in the ring buffer without waiting.

An alternative to this patch would have been to have yet another transfer buffer only for the transfer cache or to rewrite oopr serialization, but both of those are more invasive solutions.

On OSX, on the rendering.desktop telemetry benchmark, on the story web_animation_value_type_path, this results in the following results:

gpu-r:
raster cpu time: 2.011ms
gpu cpu time: 12.891ms
total time: 14.902ms

oop-r:
raster cpu time: 7.314ms <- the bug
gpu cpu time: 3.804ms
total time: 11.118ms

oop-r + this patch:
raster cpu time: 0.812ms
gpu cpu time: 3.901ms
total time: 4.713ms

Bug: 804380, 877168
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: I586cbca2acb13b8de7d3490c5eb5d6b415f6eda5
Reviewed-on: https://chromium-review.googlesource.com/c/1262955
Commit-Queue: enne <enne@chromium.org>
Reviewed-by: Antoine Labour <piman@chromium.org>
Cr-Commit-Position: refs/heads/master@{#597654}

[modify] https://crrev.com/c8d0d44728928704030f8a901174d649f164cbe5/cc/paint/transfer_cache_serialize_helper.h
[modify] https://crrev.com/c8d0d44728928704030f8a901174d649f164cbe5/cc/paint/transfer_cache_unittest.cc
[modify] https://crrev.com/c8d0d44728928704030f8a901174d649f164cbe5/gpu/command_buffer/client/client_transfer_cache.cc
[modify] https://crrev.com/c8d0d44728928704030f8a901174d649f164cbe5/gpu/command_buffer/client/client_transfer_cache.h
[modify] https://crrev.com/c8d0d44728928704030f8a901174d649f164cbe5/gpu/command_buffer/client/raster_implementation.cc
[modify] https://crrev.com/c8d0d44728928704030f8a901174d649f164cbe5/gpu/command_buffer/client/raster_implementation.h
[modify] https://crrev.com/c8d0d44728928704030f8a901174d649f164cbe5/gpu/command_buffer/client/ring_buffer.cc
[modify] https://crrev.com/c8d0d44728928704030f8a901174d649f164cbe5/gpu/command_buffer/client/ring_buffer.h
[modify] https://crrev.com/c8d0d44728928704030f8a901174d649f164cbe5/gpu/command_buffer/client/transfer_buffer_unittest.cc
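The ordering described in this commit message (small entries held on the heap during raster, appended after the shrunken raster block, then submitted before the raster command) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Chromium implementation: `RasterPass`, its methods, and `SMALL_ENTRY_LIMIT` are hypothetical names, and the ring-buffer room check and mapped-memory fallback mentioned in the message are left out.

```python
from dataclasses import dataclass, field

# Hypothetical size cutoff; the commit message does not state the real one.
SMALL_ENTRY_LIMIT = 64


@dataclass
class RasterPass:
    """Toy model of one raster pass sharing a transfer buffer."""
    deferred: list = field(default_factory=list)  # (name, size) kept on the heap
    raster_bytes: int = 0

    def add_transfer_cache_entry(self, name: str, size: int) -> bool:
        """Defer a small entry; return False if it is too large to defer."""
        if size > SMALL_ENTRY_LIMIT:
            return False  # the real code would take another path here
        self.deferred.append((name, size))
        return True

    def serialize_raster(self, size: int) -> None:
        """Record how much of the optimistically reserved space raster used."""
        self.raster_bytes = size

    def finish(self):
        """Shrink the raster block, append deferred entries, order commands."""
        # Buffer layout: raster data first, small entries placed after it.
        buffer_layout = ["R"] + [name for name, _ in self.deferred]
        # Command order: entries must exist before raster refers to them.
        command_order = [name for name, _ in self.deferred] + ["R"]
        return buffer_layout, command_order


raster_pass = RasterPass()
for entry_name in ("A", "B", "C"):
    raster_pass.add_transfer_cache_entry(entry_name, 44)
raster_pass.serialize_raster(4096)

buffer_layout, command_order = raster_pass.finish()
print("transfer buffer:", " ".join(buffer_layout))
print("command buffer: ", " ".join(command_order))
```

With three 44-byte entries and one raster, the model yields the same `R A B C` buffer layout and `A B C R` command order as the example in the commit message.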
,
Oct 8
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/6bbcf5cf66c5e9540c93857d2fac6a357b273884 commit 6bbcf5cf66c5e9540c93857d2fac6a357b273884 Author: enne <enne@chromium.org> Date: Mon Oct 08 22:33:20 2018 Revert "Use transfer buffer for small transfer cache entries" This reverts commit c8d0d44728928704030f8a901174d649f164cbe5. Reason for revert: causing flaky webgl test failures Original change's description: > Use transfer buffer for small transfer cache entries > > OOP-R in RasterImplementation currently allocates the entire free space > available in the transfer buffer so that it doesn't have to measure > first then serialize and can just optimistically serialize into that > space. This means that other transfer buffer consumers can't use it, > and are instead forced to use mapped memory. This turns out to not > be very efficient for large numbers of small allocations. > > To sidestep this problem, small transfer cache entries are serialized > into the heap when encountered. Then when raster is complete, it > shrinks the transfer buffer down to the correct size and these small > transfer cache entries are added after it. Then the commands for these > entries are submitted first before raster. > > For example, for three transfer cache tasks (A B C) and one raster (R) > the transfer buffer and ring buffer will look like this: > > transfer buffer ring buffer: R A B C > command buffer: A B C R > > This patch also loosens the restriction on gpu::RingBuffer that there > can be only one in use block at any time. This leads to potential > exhaustion issues because the ring buffer won't reallocate while there > are in use blocks. To avoid this, this optimization is only used when > there is room in the ring buffer without waiting. > > An alternative to this patch would have been to have yet another > transfer buffer only for the transfer cache or to rewrite oopr > serialization, but both of those are more invasive solutions. 
> > On OSX, on the rendering.desktop telemetry benchmark, on the story > web_animation_value_type_path, this results in the following results: > > gpu-r: > raster cpu time: 2.011ms > gpu cpu time: 12.891ms > total time: 14.902ms > > oop-r: > raster cpu time: 7.314ms <- the bug > gpu cpu time: 3.804ms > total time: 11.118ms > > oop-r + this patch: > raster cpu time: 0.812ms > gpu cpu time: 3.901ms > total time: 4.713ms > > Bug: 804380 , 877168 > Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel > Change-Id: I586cbca2acb13b8de7d3490c5eb5d6b415f6eda5 > Reviewed-on: https://chromium-review.googlesource.com/c/1262955 > Commit-Queue: enne <enne@chromium.org> > Reviewed-by: Antoine Labour <piman@chromium.org> > Cr-Commit-Position: refs/heads/master@{#597654} TBR=enne@chromium.org,piman@chromium.org,ericrk@chromium.org Change-Id: Ia135febae14f3515a8a6447071d3ddd94bd0e89f No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: 804380 , 877168 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Reviewed-on: https://chromium-review.googlesource.com/c/1269632 Reviewed-by: enne <enne@chromium.org> Commit-Queue: enne <enne@chromium.org> Cr-Commit-Position: refs/heads/master@{#597708} [modify] https://crrev.com/6bbcf5cf66c5e9540c93857d2fac6a357b273884/cc/paint/transfer_cache_serialize_helper.h [modify] https://crrev.com/6bbcf5cf66c5e9540c93857d2fac6a357b273884/cc/paint/transfer_cache_unittest.cc [modify] https://crrev.com/6bbcf5cf66c5e9540c93857d2fac6a357b273884/gpu/command_buffer/client/client_transfer_cache.cc [modify] 
https://crrev.com/6bbcf5cf66c5e9540c93857d2fac6a357b273884/gpu/command_buffer/client/client_transfer_cache.h [modify] https://crrev.com/6bbcf5cf66c5e9540c93857d2fac6a357b273884/gpu/command_buffer/client/raster_implementation.cc [modify] https://crrev.com/6bbcf5cf66c5e9540c93857d2fac6a357b273884/gpu/command_buffer/client/raster_implementation.h [modify] https://crrev.com/6bbcf5cf66c5e9540c93857d2fac6a357b273884/gpu/command_buffer/client/ring_buffer.cc [modify] https://crrev.com/6bbcf5cf66c5e9540c93857d2fac6a357b273884/gpu/command_buffer/client/ring_buffer.h [modify] https://crrev.com/6bbcf5cf66c5e9540c93857d2fac6a357b273884/gpu/command_buffer/client/transfer_buffer_unittest.cc
,
Oct 12
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/458236cb920e35864c7983b2b7e9c301a29edde9 commit 458236cb920e35864c7983b2b7e9c301a29edde9 Author: Adrienne Walker <enne@chromium.org> Date: Fri Oct 12 22:09:46 2018 Reland "Use transfer buffer for small transfer cache entries" This is a reland of c8d0d44728928704030f8a901174d649f164cbe5 Original change's description: > Use transfer buffer for small transfer cache entries > > OOP-R in RasterImplementation currently allocates the entire free space > available in the transfer buffer so that it doesn't have to measure > first then serialize and can just optimistically serialize into that > space. This means that other transfer buffer consumers can't use it, > and are instead forced to use mapped memory. This turns out to not > be very efficient for large numbers of small allocations. > > To sidestep this problem, small transfer cache entries are serialized > into the heap when encountered. Then when raster is complete, it > shrinks the transfer buffer down to the correct size and these small > transfer cache entries are added after it. Then the commands for these > entries are submitted first before raster. > > For example, for three transfer cache tasks (A B C) and one raster (R) > the transfer buffer and ring buffer will look like this: > > transfer buffer ring buffer: R A B C > command buffer: A B C R > > This patch also loosens the restriction on gpu::RingBuffer that there > can be only one in use block at any time. This leads to potential > exhaustion issues because the ring buffer won't reallocate while there > are in use blocks. To avoid this, this optimization is only used when > there is room in the ring buffer without waiting. > > An alternative to this patch would have been to have yet another > transfer buffer only for the transfer cache or to rewrite oopr > serialization, but both of those are more invasive solutions. 
> > On OSX, on the rendering.desktop telemetry benchmark, on the story > web_animation_value_type_path, this results in the following results: > > gpu-r: > raster cpu time: 2.011ms > gpu cpu time: 12.891ms > total time: 14.902ms > > oop-r: > raster cpu time: 7.314ms <- the bug > gpu cpu time: 3.804ms > total time: 11.118ms > > oop-r + this patch: > raster cpu time: 0.812ms > gpu cpu time: 3.901ms > total time: 4.713ms > > Bug: 804380 , 877168 > Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel > Change-Id: I586cbca2acb13b8de7d3490c5eb5d6b415f6eda5 > Reviewed-on: https://chromium-review.googlesource.com/c/1262955 > Commit-Queue: enne <enne@chromium.org> > Reviewed-by: Antoine Labour <piman@chromium.org> > Cr-Commit-Position: refs/heads/master@{#597654} Bug: 804380 , 877168 Change-Id: I5c7b0d3b43c0c6f57eb7e5a9c43b4b8ecb22518a Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Reviewed-on: https://chromium-review.googlesource.com/c/1277949 Commit-Queue: enne <enne@chromium.org> Reviewed-by: Antoine Labour <piman@chromium.org> Cr-Commit-Position: refs/heads/master@{#599376} [modify] https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/cc/paint/transfer_cache_serialize_helper.h [modify] https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/cc/paint/transfer_cache_unittest.cc [modify] https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/gpu/command_buffer/client/client_transfer_cache.cc [modify] https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/gpu/command_buffer/client/client_transfer_cache.h [modify] 
https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/gpu/command_buffer/client/raster_implementation.cc [modify] https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/gpu/command_buffer/client/raster_implementation.h [modify] https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/gpu/command_buffer/client/ring_buffer.cc [modify] https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/gpu/command_buffer/client/ring_buffer.h [modify] https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/gpu/command_buffer/client/ring_buffer_test.cc [modify] https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/gpu/command_buffer/client/transfer_buffer.cc [modify] https://crrev.com/458236cb920e35864c7983b2b7e9c301a29edde9/gpu/command_buffer/client/transfer_buffer_unittest.cc
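The timing figures quoted in the commit message can be cross-checked with a few lines of arithmetic. This is a small sketch using only the numbers from the message; the variable names are made up for illustration, and the speedup ratios are derived here rather than stated in the source.

```python
# (raster cpu time, gpu cpu time) in milliseconds, from the commit message.
timings_ms = {
    "gpu-r": (2.011, 12.891),
    "oop-r": (7.314, 3.804),
    "oop-r + this patch": (0.812, 3.901),
}

# Total time per configuration is the sum of the two CPU times.
totals_ms = {name: raster + gpu for name, (raster, gpu) in timings_ms.items()}
for name, total in totals_ms.items():
    print(f"{name:>18}: total = {total:.3f} ms")

# Ratios relative to the patched configuration.
raster_speedup = timings_ms["oop-r"][0] / timings_ms["oop-r + this patch"][0]
total_speedup = totals_ms["oop-r"] / totals_ms["oop-r + this patch"]
print(f"raster cpu time improves about {raster_speedup:.1f}x over unpatched oop-r")
print(f"total time improves about {total_speedup:.1f}x over unpatched oop-r")
```

The sums reproduce the totals listed in the message, and the patch brings raster CPU time below the original gpu-r figure as well.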
,
Jan 2
Khushal said he was going to investigate mac oopr issues, so assigning this bug to him.
,
Jan 9
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/15b00c51940000
,
Jan 9
😿 Pinpoint job stopped with an error. https://pinpoint-dot-chromeperf.appspot.com/job/15b00c51940000 All of the runs failed. The most common error (20/20 runs) was: SwarmingTaskError: The swarming task failed with state "BOT_DIED".
,
Jan 9
I caught up late on the thread and had started the bisect for the Android regression before I realized it was a dupe of issue 877247. I've re-triaged those alerts to the same issue to avoid any confusion.
,
Jan 9
Starting with memory regressions: The one in gpu:effective_size_avg seems to be all transfer buffer memory. It's not totally unexpected, since OOP raster does use the transfer buffer instead of the command buffer for raster command serialization. But we've had a lot of changes in this area since this change, including some for auto-shrinking the buffer, so I'll do a local run to see what the current state is.
,
Jan 11
I did a local run for browse:media:imgur to compare GPU and OOP raster and looked at memory:chrome:all_processes:reported_by_os:private_footprint_size. The value for OOP was 429M and for GPU 435M over 3 runs. The slightly lower OOP value comes from one run reporting 362M; otherwise all runs in both cases reported 430M. Going to try a pinpoint run for this benchmark.
,
Jan 11
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/1283d477940000
,
Jan 11
I tried a local run for the memory.desktop benchmark, case TrivialGifPageSharedPageState, and interestingly OOP had better numbers for memory:chrome:all_processes:reported_by_os:system_memory:private_footprint_size than GPU. This was over 5 runs and the results were consistent across all runs: OOP was at ~120M while GPU was at ~150M. I tried digging through the sub-categories in the 2 traces to see if I could find something to explain the difference, and for the most part they look the same. There is some additional cc resource memory in the GPU process in the OOP case, which I'm assuming is the display compositor (likely a timing thing). Skia has 8M worth of GPU resources in the OOP case; since this memory is cleaned up when idle, this is again something that can be affected by timing, because the cleanup moved from the renderer to the GPU process. Lastly, there is 16M worth of extra transfer buffer memory in the OOP case, which, while not surprising, I want to understand better. In comparison, GPU just has an additional 1M worth of command buffer memory. So all in all, OOP has more memory reported in the chrome categories, but GPU has more memory reported_by_os.
,
Jan 11
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/1522057b940000
,
Jan 11
Also, for gpu:effective_size_avg, the difference due to transfer buffer memory is what is reported by the renderer. The gpu:effective_size_avg reported by the GPU process also has a difference, but that's just a reporting change. This memory used to be reported in the renderer under cc:images but is now reported in the GPU process under gpu:transfer_cache. The deltas between the 2 line up, which shows that it's just a reporting change.
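To make the "reporting change" argument concrete, the sketch below compares per-process, per-category memory before and after and checks that the deltas cancel. The category names come from the comment above; the 24 MiB figure and the `category_deltas` helper are purely hypothetical and are not measurements from this bug.

```python
def category_deltas(before: dict, after: dict) -> dict:
    """Per-(process, category) change in reported memory, in MiB."""
    keys = sorted(set(before) | set(after))
    return {key: after.get(key, 0) - before.get(key, 0) for key in keys}


# Hypothetical values in MiB, chosen only to illustrate the check.
before = {("renderer", "cc:images"): 24, ("gpu", "gpu:transfer_cache"): 0}
after = {("renderer", "cc:images"): 0, ("gpu", "gpu:transfer_cache"): 24}

deltas = category_deltas(before, after)
net_change = sum(deltas.values())

for (process, category), delta in deltas.items():
    print(f"{process:>8} {category:<20} {delta:+d} MiB")
print(f"net change across processes: {net_change:+d} MiB")
```

When the decrease in one category matches the increase in the other, the net change is zero, which is the signature of memory being attributed to a different process rather than newly allocated.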
,
Jan 11
😿 Pinpoint job stopped with an error. https://pinpoint-dot-chromeperf.appspot.com/job/1522057b940000 guid
,
Jan 11
😿 Pinpoint job stopped with an error. https://pinpoint-dot-chromeperf.appspot.com/job/1283d477940000 guid
Comment 1 by 42576172...@developer.gserviceaccount.com
, Aug 23