Reduce size of ShaderTranslatorCache on low-end devices |
||||||||||||||
Issue descriptionIn the analysis in https://docs.google.com/document/d/144DyQmYkqa7hcn7pM8lKqvP5Zajo3otG14ufdP0c4Kc/edit, it was observed that on a 512MB device, about 450KB is held onto when viewing about:blank by ShaderTranslatorCache. jmadill@ said "Caching shader translators ensures we don't have to recreate resources inside of ANGLE's translator every time a new type of context with different sets of extensions are started." On a 512MB device (isLowEndDevice() returns true) we tend to strongly favor memory to speed when the tradeoff arises, so it doesn't seem appropriate to hold onto this.
,
Feb 22 2017
How urgent is this? If it's really urgent, someone else has to take this because my plate is already full of "urgent" tasks.
,
Feb 22 2017
This isn't drop-everything urgent, but priority for the quarter. We'd like to make a dent in memory usage by M60.
,
Feb 23 2017
OK, then I can take a shot later.
,
Feb 24 2017
,
Apr 26 2017
Ping on this bug. We are getting close to M60.
,
Apr 26 2017
Can someone from the Clank team take this from Mo?
,
Apr 26 2017
To be sure, we're happy to provide guidance and reviews. It's probably just plumbing, but is a chunk of work.
,
Apr 26 2017
It would be a good start if someone who understands the code could describe the work that needs to happen (e.g. do the above mentioned guidance). Then depending on what this involves I can try to find someone for this if zmo@ is not able to do this.
,
Apr 26 2017
,
Apr 27 2017
GLES2DecoderImpl::InitializeShaderTranslator is where the ShaderTranslatorCache is used. I believe if the calls to shader_translator_cache()->GetTranslator(...) are replaced with explicit instantiation/initialization of ShaderTranslator (se implementation of ShaderTranslatorCache::GetTranslator), then nothing will be kept in the cache. However, I do worry that this will actually make things worse (both memory-wise and perf-wise), because then it means we won't share the translators between different contexts (and we use several typically), but keep a reference to each translator anyway (vertex_translator_/fragment_translator_ in GLES2DecoderImpl). A way to take it further is to not even cache the translator in the GLES2DecoderImpl, but instead recreate it at every shader compilation (e.g. by making GLES2DecoderImpl::GetTranslator actually create the translator, instead of using the context fields). I do however worry that this would be extremely prohibitive performance-wise, in particular startup (we pre-compile many shaders for the compositor), and cause a lot of jank during page rasterization (GPU raster typically needs many shaders, which need the translators, and recreating them could contribute to blocking the GPU main thread). Definitely worth measuring before going down that path. So I don't think there's necessarily a low-hanging fruit. Something we could look into is to tap into and/or introduce signals that we can release memory. E.g. we free the shared memory command buffer and do some cleanup when the page becomes invisible (see ContextCacheController::ClientBecameNotVisible), but we could do more (e.g. free the translator in GLES2DecoderImpl::DoFlushDriverCachesCHROMIUM), and do it more often (e.g. when going idle on Svelte/Go, instead of just when going invisible).
,
May 11 2017
,
May 11 2017
We recently discovered the GPU driver's own shader compiler also holds on to multiple megabytes of memory indefinitely after compile. ES2 spec provides glReleaseShaderCompiler() to hint the driver into releasing it. We currently don't have any callsites of it, but we probably will add some, similarly at times like after a few seconds of idleness, or invisible. It would then be natural to free up the ShaderTranslatorCache on the same signal. Looks like there's an old TODO suggesting that coupling in shader_translator_cache.h.
,
May 11 2017
Eric found that glReleaseShaderCompiler calls on Android One device seems to give us 1-2 MB memory wins. But, unfortunately the mali driver doesn't seem to do anything on this call. As the spec says it is only a hint for releasing memory. Not just 512, also 1GB is considered low-end. The initialization of the translator takes upto 400KB (gles2::ShaderTranslator::Init). After which the program cache can grow to maximum 900KiB as seen in benchmarks.
,
May 23 2017
,
Jul 17 2017
,
Jul 20 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/df412bd5a319c948a882c7d9b620696facd5ff51 commit df412bd5a319c948a882c7d9b620696facd5ff51 Author: Geoff Lang <geofflang@chromium.org> Date: Thu Jul 20 18:29:29 2017 Lazily create shader translators and implement glReleaseShaderCompiler ANGLE's shader translators hold a significant amount of memory for the builtin symbols. By lazily creating the shader translator, it may never be intialized at all if all compiles hit the program binary cache. BUG= 695135 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel Change-Id: I095dd2efa33e111726db893cd14c3c3574bdd6ce Reviewed-on: https://chromium-review.googlesource.com/570540 Reviewed-by: Kenneth Russell <kbr@chromium.org> Commit-Queue: Geoff Lang <geofflang@chromium.org> Cr-Commit-Position: refs/heads/master@{#488317} [modify] https://crrev.com/df412bd5a319c948a882c7d9b620696facd5ff51/gpu/command_buffer/service/gles2_cmd_decoder.cc [modify] https://crrev.com/df412bd5a319c948a882c7d9b620696facd5ff51/gpu/command_buffer/service/shader_manager.cc
,
Jul 25 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/65d1cd6e7454a3fd383d0deeaaad67918f97be3b commit 65d1cd6e7454a3fd383d0deeaaad67918f97be3b Author: Geoff Lang <geofflang@chromium.org> Date: Tue Jul 25 17:34:04 2017 Revert "Lazily create shader translators and implement glReleaseShaderCompiler" This reverts commit df412bd5a319c948a882c7d9b620696facd5ff51. Reason for revert: Translator released too early from shader objects causing the program binary cache to miss more often. BUG=748202 BUG= 747865 Original change's description: > Lazily create shader translators and implement glReleaseShaderCompiler > > ANGLE's shader translators hold a significant amount of memory for the > builtin symbols. By lazily creating the shader translator, it may never > be intialized at all if all compiles hit the program binary cache. > > BUG= 695135 > > CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel > > Change-Id: I095dd2efa33e111726db893cd14c3c3574bdd6ce > Reviewed-on: https://chromium-review.googlesource.com/570540 > Reviewed-by: Kenneth Russell <kbr@chromium.org> > Commit-Queue: Geoff Lang <geofflang@chromium.org> > Cr-Commit-Position: refs/heads/master@{#488317} TBR=geofflang@chromium.org,kbr@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 695135 Change-Id: Ieb3ec4069b263654f86818cd3d1f5ab83fc1c4ab Cq-Include-Trybots: master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel Reviewed-on: https://chromium-review.googlesource.com/585087 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Geoff Lang <geofflang@chromium.org> Cr-Commit-Position: refs/heads/master@{#489354} [modify] https://crrev.com/65d1cd6e7454a3fd383d0deeaaad67918f97be3b/gpu/command_buffer/service/gles2_cmd_decoder.cc [modify] https://crrev.com/65d1cd6e7454a3fd383d0deeaaad67918f97be3b/gpu/command_buffer/service/shader_manager.cc
,
Jul 31 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/6ab76823251e56cfd00174f8621a7f63abfe0f25 commit 6ab76823251e56cfd00174f8621a7f63abfe0f25 Author: Geoff Lang <geofflang@chromium.org> Date: Mon Jul 31 13:59:58 2017 Revert "Lazily create shader translators and implement glReleaseShaderCompiler" This reverts commit df412bd5a319c948a882c7d9b620696facd5ff51. Reason for revert: Translator released too early from shader objects causing the program binary cache to miss more often. BUG=748202 BUG= 747865 Original change's description: > Lazily create shader translators and implement glReleaseShaderCompiler > > ANGLE's shader translators hold a significant amount of memory for the > builtin symbols. By lazily creating the shader translator, it may never > be intialized at all if all compiles hit the program binary cache. > > BUG= 695135 > > CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel > > Change-Id: I095dd2efa33e111726db893cd14c3c3574bdd6ce > Reviewed-on: https://chromium-review.googlesource.com/570540 > Reviewed-by: Kenneth Russell <kbr@chromium.org> > Commit-Queue: Geoff Lang <geofflang@chromium.org> > Cr-Commit-Position: refs/heads/master@{#488317} TBR=geofflang@chromium.org, kbr@chromium.org (cherry picked from commit 65d1cd6e7454a3fd383d0deeaaad67918f97be3b) Bug: 695135 Change-Id: Ieb3ec4069b263654f86818cd3d1f5ab83fc1c4ab Cq-Include-Trybots: master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel Reviewed-on: https://chromium-review.googlesource.com/585087 Reviewed-by: Geoff Lang <geofflang@chromium.org> Commit-Queue: Geoff Lang <geofflang@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#489354} Reviewed-on: https://chromium-review.googlesource.com/593853 Cr-Commit-Position: refs/branch-heads/3163@{#149} Cr-Branched-From: ff259bab28b35d242e10186cd63af7ed404fae0d-refs/heads/master@{#488528} [modify] https://crrev.com/6ab76823251e56cfd00174f8621a7f63abfe0f25/gpu/command_buffer/service/gles2_cmd_decoder.cc [modify] https://crrev.com/6ab76823251e56cfd00174f8621a7f63abfe0f25/gpu/command_buffer/service/shader_manager.cc
,
Aug 9 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/30efb7b6a95c12b2d476e3b117147d90e247a743 commit 30efb7b6a95c12b2d476e3b117147d90e247a743 Author: Geoff Lang <geofflang@chromium.org> Date: Wed Aug 09 14:59:23 2017 Lazily create shader translators and implement glReleaseShaderCompiler ANGLE's shader translators hold a significant amount of memory for the builtin symbols. By lazily creating the shader translator, it may never be intialized at all if all compiles hit the program binary cache. BUG= 695135 Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel Change-Id: I4038a98a67f8ae8be7c99710f4957b59cd37c399 Reviewed-on: https://chromium-review.googlesource.com/596537 Commit-Queue: Geoff Lang <geofflang@chromium.org> Reviewed-by: Antoine Labour <piman@chromium.org> Cr-Commit-Position: refs/heads/master@{#492981} [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/gles2_cmd_decoder.cc [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/gles2_cmd_decoder_unittest_base.cc [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/memory_program_cache_unittest.cc [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/mocks.h [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/program_manager_unittest.cc [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/shader_manager.cc [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/shader_manager.h [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/shader_manager_unittest.cc [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/shader_translator.cc [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/shader_translator.h [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/shader_translator_unittest.cc [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/test_helper.cc [modify] https://crrev.com/30efb7b6a95c12b2d476e3b117147d90e247a743/gpu/command_buffer/service/test_helper.h
,
Sep 20 2017
,
Sep 28 2017
Geoff, is there any work remaining on this bug?
,
Sep 28 2017
The biggest gains have been made I think. We could do more work eventually but this bug could be closed.
,
Sep 28 2017
What do you have in mind for follow-ups on this? Let's file bugs with "LowMemory" label to keep track and close this one.
,
Feb 2 2018
|
||||||||||||||
►
Sign in to add a comment |
||||||||||||||
Comment 1 by jmad...@chromium.org
, Feb 22 2017Owner: zmo@chromium.org