Issue metadata
Sign in to add a comment
|
5.6%-12.9% regression in memory:chrome:all_processes:reported_by_os:system_memory:private_footprint_size at 603087:603145 |
||||||||||||||||||||||
Issue descriptionSee the link to graphs below.
,
Oct 29
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/15082489e40000
,
Oct 29
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/15082489e40000 Make browser UI to use OOPR as well when OOPR is enabled. by penghuang@chromium.org https://chromium.googlesource.com/chromium/src/+/392cff8628eed1ce42f41362d2328f60768e2ce5 memory:chrome:all_processes:reported_by_os:system_memory:private_footprint_size: 1.349e+08 → 1.466e+08 (+1.167e+07) Understanding performance regressions: http://g.co/ChromePerformanceRegressions Benchmark documentation link: None
,
Oct 30
,
Oct 30
I'll take a look at this.
,
Oct 30
I believe that this is because Skia is caching more in the GPU process. This is one exampled I looked at: https://chromeperf.appspot.com/report?sid=38dd9c2f33aa2cc5d19a2fc5c2641d52812840fdece6358fc45493c02bee41d0&start_rev=601054&end_rev=603888. I examined traces before and after we OOP-Red the UI. The total memory jumped about 5MB. The increase in total size is attributable to the GPU process. If you examine the GPU process in the memory dumps, you will see the Skia category increases (2.1 MB vs 6.7 MB). Fortunately, the regression readily repros. I did a small hack: --- a/gpu/command_buffer/service/raster_decoder_context_state.cc +++ b/gpu/command_buffer/service/raster_decoder_context_state.cc @@ -97,6 +97,7 @@ void RasterDecoderContextState::InitializeGrContext( bool RasterDecoderContextState::OnMemoryDump( const base::trace_event::MemoryDumpArgs& args, base::trace_event::ProcessMemoryDump* pmd) { + PurgeMemory(base::MemoryPressureListener::MemoryPressureLevel::MEMORY_PRESSURE_LEVEL_CRITICAL); if (gr_context) DumpGrMemoryStatistics(gr_context.get(), pmd, base::nullopt); return true; This tells Skia to release as much as possible before we do a memory dump. This hack erased the difference in my local repro.
,
Oct 30
+cc khushalsagar Maybe we need to tune gpu/command_buffer/service/gr_cache_controller.cc ?
,
Oct 30
GrCacheController should work the same way as the ContextCacheController on the client side for how it triggers the cleanup. Maybe there is a minor timing difference in getting to the idle state in the client vs service side that is affecting whether the cleanup happens before or after the dump?
,
Oct 30
I think the biggest difference is that we are now sharing the cache. - With GPU raster for renderer and UI, the cache is client side and not shared. Each client has it's own ContextCacheController and there is no interaction. If either renderer or UI is idle for a while, it will freeGpuResources independently. - With OOP-R for renderer and UI, the cache is service side and shared. The GrCacheController will only freeGpuResources if renderer and UI are idle for a while (e.g. open new tab with cursor blinking in omnibox and cursor blink will prevent freeing of old tabs cache). I think overall it's an improvement because a shared service side cache is not as fragmented. Moreover, we're well under our cache cap size of 20MB -- if we hit the limit, I believe that we would drop purge LRU items.
,
Oct 30
Per your observation: the switch from distributed to global cache ought to actually reduce total system memory? Can we measure this? It would be nice to show that the viz work has given us a whole-system improvement.
,
Oct 30
There are some cases which do show improvements in overall memory usage: https://chromeperf.appspot.com/group_report?rev=603112. I'm guessing this ends up depending on what the combined renderer + ui cache usage was for each story? I'd still expect the shared cache to be a net win even in cases where the ui is static but the renderer is doing continuous raster. Since even if don't go completely idle we still do time based eviction of things (https://cs.chromium.org/chromium/src/gpu/command_buffer/service/gr_cache_controller.cc?type=cs&g=0&l=37), so that should evict things used only by the ui. The timing for that means this may not show up in our benchmarks.
,
Oct 30
,
Oct 31
To elaborate Khusal's point, I found an interesting example to show some of the timing problems with our test infrastructure. First I added some additional trace instrumentation (https://pastebin.com/ZjuRsW1G) Then I ran telemetry locally enable cc,gpu categories. Attached is a trace for TrivialGifPageSharedPageState. Notable timing info: - test duration is 12 seconds, so performDeferredCleanup(15s) is a no-op - Gif animation means that we are never idle enough service side to freeGpuResources
,
Oct 31
+cc bsalomon There may be an opportunity to tune things service side further. I'm guessing that with client side GPU raster (ContextCacheController) that we often freeGpuResources (1 second of inactivity) well before we performDeferredCleanup (wasn't used for 15 seconds). As freeGpuResources is much stronger, we would only see 15 seconds constant in exceptional cirumstances (animated GIF in a loop as above). Things change server side with the shared GrContext. We are less likely to hit freeGpuResources after 1 second of inactivity. I think that this is especially true with animated ads. Maybe we should tune 15 seconds down? I see that bsalomon@ originally set it to 30 seconds. And then he further reduced it to 15 seconds. Maybe we could go as low as 5 or 10?
,
Oct 31
15s was chosen arbitrarily. It likely would not be a problem to reduce it.
,
Nov 7
I did a CT memory run (https://ct.skia.org/results/cluster-telemetry/tasks/chromium_perf_runs/backer-20181106202303/html/index.html) reducing that constant from 15s to 5s. The results were underwhelming (-0.224%) in memory:chrome:all_processes:reported_by_os:private_footprint_size. Maybe this is because linux (which CT uses) is instrumented differently than OSX and this measure does not capture GPU memory on linux? CT did report a > 10% memory decrease used by Skia in the GPU process (memory:chrome:gpu_process:reported_by_chrome:skia:effective_size). I'm going to kick off another CT run for rendering. And I don't see major regressions in performance (total CPU and smoothness), we'll land it.
,
Nov 7
,
Nov 14
Here's the rendering results: https://ct.skia.org/results/cluster-telemetry/tasks/chromium_perf_runs/backer-20181108162853/html/index.html The descrepancies are slight --- probably just noise. I will put CL up for review.
,
Nov 15
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a388764c6e5d04915eed520baa5ea6b6b939efdb commit a388764c6e5d04915eed520baa5ea6b6b939efdb Author: Jonathan Backer <backer@chromium.org> Date: Thu Nov 15 23:36:17 2018 Cleanup old resources faster These resources are mostly scratch space. As per crbug.com/899940 : - most of this is scratch space textures for intermediate results cached to avoid GPU memory churn - with the UI using OOP-R we don't go idle as often (e.g. cursor blink, throbber) - 15 seconds is a long time to keep it around (many of our regression tests don't run this long) Bug: 899940 Change-Id: Idc077c98551978afa97b18be3dd61784f8898806 Reviewed-on: https://chromium-review.googlesource.com/c/1320790 Reviewed-by: Brian Salomon <bsalomon@google.com> Commit-Queue: Jonathan Backer <backer@chromium.org> Cr-Commit-Position: refs/heads/master@{#608579} [modify] https://crrev.com/a388764c6e5d04915eed520baa5ea6b6b939efdb/gpu/command_buffer/service/gr_cache_controller.cc
,
Nov 26
Change in comment #19 did have some effect, but did not fix the regression. https://chromeperf.appspot.com/report?sid=507f10daa470fb5f1723d0dde1421deecda24a6fb5b24c08a795212bfb929d4a&start_rev=601065&end_rev=610749 I'll take another look.
,
Nov 26
There is a 4MB texture with OOP-R UI showing up in GPU process Skia memory stats. It is 1024 wide and 2048 long. I believe that it's a glyph cache based on this stack that I captured when the texture is generated: 5 libskia.dylib 0x000000011805dccc GrGLTexture::init(GrSurfaceDesc const&, GrGLTexture::IDDesc const&) + 364 6 libskia.dylib 0x000000011805de28 GrGLTexture::GrGLTexture(GrGLGpu*, SkBudgeted, GrSurfaceDesc const&, GrGLTexture::IDDesc const&, GrMipMapsStatus) + 344 7 libskia.dylib 0x000000011803a7c0 GrGLGpu::onCreateTexture(GrSurfaceDesc const&, SkBudgeted, GrMipLevel const*, int) + 768 8 libskia.dylib 0x0000000117f2854c GrGpu::createTexture(GrSurfaceDesc const&, SkBudgeted, GrMipLevel const*, int) + 460 9 libskia.dylib 0x0000000117f285f4 GrGpu::createTexture(GrSurfaceDesc const&, SkBudgeted) + 20 10 libskia.dylib 0x0000000117f64ccc GrResourceProvider::createTexture(GrSurfaceDesc const&, SkBudgeted, GrResourceProvider::Flags) + 268 11 libskia.dylib 0x0000000117f813d8 GrSurfaceProxy::createSurfaceImpl(GrResourceProvider*, int, bool, GrSurfaceFlags, GrMipMapped) const + 696 12 libskia.dylib 0x0000000117f81a7e GrSurfaceProxy::instantiateImpl(GrResourceProvider*, int, bool, GrSurfaceFlags, GrMipMapped, GrUniqueKey const*) + 334 13 libskia.dylib 0x0000000117f87359 GrTextureProxy::instantiate(GrResourceProvider*) + 105 14 libskia.dylib 0x0000000117f21520 GrDrawOpAtlas::addToAtlas(GrResourceProvider*, unsigned long long*, GrDeferredUploadTarget*, int, int, void const*, SkIPoint16*) + 512 15 libskia.dylib 0x000000011800e176 GrAtlasManager::addToAtlas(GrResourceProvider*, GrGlyphCache*, GrTextStrike*, unsigned long long*, GrDeferredUploadTarget*, GrMaskFormat, int, int, void const*, SkIPoint16*) + 230 16 libskia.dylib 0x000000011801106a GrTextStrike::addGlyphToAtlas(GrResourceProvider*, GrDeferredUploadTarget*, GrGlyphCache*, GrAtlasManager*, GrGlyph*, SkGlyphCache*, GrMaskFormat, bool) + 3034 17 libskia.dylib 0x000000011801878e bool GrTextBlob::VertexRegenerator::doRegen<false, false, true, false>(GrTextBlob::VertexRegenerator::Result*) + 894 18 libskia.dylib 0x0000000117fa6284 GrAtlasTextOp::onPrepareDraws(GrMeshDrawOp::Target*) + 2212 19 libskia.dylib 0x0000000117f564e7 GrRenderTargetOpList::onPrepare(GrOpFlushState*) + 423 20 libskia.dylib 0x0000000117f2efc2 GrOpList::prepare(GrOpFlushState*) + 98 21 libskia.dylib 0x0000000117f1bb24 GrDrawingManager::executeOpLists(int, int, GrOpFlushState*) + 452 22 libskia.dylib 0x0000000117f1b3c8 GrDrawingManager::flush(GrSurfaceProxy*, int, GrBackendSemaphore*) + 2504 23 libskia.dylib 0x0000000117f1bf89 GrDrawingManager::prepareSurfaceForExternalIO(GrSurfaceProxy*, int, GrBackendSemaphore*) + 185 24 libskia.dylib 0x0000000117f51552 GrRenderTargetContext::prepareForExternalIO(int, GrBackendSemaphore*) + 290 25 libskia.dylib 0x000000011808023a SkGpuDevice::flushAndSignalSemaphores(int, GrBackendSemaphore*) + 58 26 libgles2.dylib 0x0000000132fe134a gpu::raster::RasterDecoderImpl::DoEndRasterCHROMIUM() + 314 27 libgles2.dylib 0x0000000132fd38b9 gpu::raster::RasterDecoderImpl::HandleEndRasterCHROMIUM(unsigned int, void const volatile*) + 9 28 libgles2.dylib 0x0000000132fd9617 gpu::error::Error gpu::raster::RasterDecoderImpl::DoCommandsImpl<false>(unsigned int, void const volatile*, int, int*) + 519
,
Nov 26
That is the glyph cache. Before OOP-R each renderer process would have had its own glyph cache texture (assuming the page contains text).
,
Nov 26
I've confirmed that this is definitely the GrContext::fGlyphCache. It is not released unless we call GrContext::freeGpuResources, which we only do if we go idle for 1 second in GrCacheController. As per comment #14, we are less likely to go idle for 1 second when UI is also using OOP-R.
,
Nov 26
We could make it so that the glyph cache does get purged, but it seems like a bad idea to me as it's highly likely going to be recreated the next time a frame needs to be produced and a fresh atlas will mean repopulating the texture with glyphs.
,
Nov 26
I confirmed that the 1 second idle mechanism does work. It does not freeGpuResources in this test because the test animates right up until the memory dump. I'm marking this as WontFix (works as intended) because it's more and artifact of the test than a user visible regression. In general, sharing the GlyphCache will be beneficial (often have 1 instead of N) and we will reclaim it on idle. |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Oct 29