Issue metadata
Sign in to add a comment
|
Extremely long calls to GLES2DecoderImpl::ClearLevel on macOS during tab-switch |
||||||||||||||||||||||
Issue descriptionLocally I'm seeing hundred-millisecond-long hangs in GLES2DecoderImpl::ClearLevel. Attaching snapshot of a trace.
,
Jun 2 2018
(moving comment over from issue 848464, cause that is distinct from this): What appears in the trace is four calls to GLES2DecoderImpl::ClearLevel, which weigh in at ~200msec total (causing a perceptible jank). GLES2DecoderImpl::ClearLevel is allocating 8 MB of zeros, then telling the driver to upload them to the texture via the most inefficient upload path known to man (glTexSubImage2D ... btw, that path involves copying the data to a staging buffer inside the driver before doing the actual upload).
,
Jun 2 2018
The situation is as follows:
We allocate a ginormous texture -- 1024x2048. And we draw to the 58x12 pixel region in the top-left.
Then we bind that texture to read from, and we see that the remaining bazillion pixels have not been initialized. And so we go and issue three clear calls to clear a ~1024x2048 region.
I suspect that the texture is being allocated somewhere by Skia -- the shader source that is accessing the texture is
GLES2DecoderImpl::ClearUnclearedTextures...
#version 100
precision mediump float;
uniform mediump float ualpha_Stage1;
uniform lowp sampler2D uTextureSampler_0_Stage0;
varying highp vec2 vTextureCoords_Stage0;
varying highp float vTexIndex_Stage0;
varying mediump vec4 vinColor_Stage0;
void main() {
mediump vec4 outputCoverage_Stage0;
{
mediump vec4 texColor;
{
texColor = texture2D(uTextureSampler_0_Stage0, vTextureCoords_Stage0, -0.5);
}
outputCoverage_Stage0 = texColor;
}
{
outputCoverage_Stage0.w = max(max(outputCoverage_Stage0.x, outputCoverage_Stage0.y), outputCoverage_Stage0.z);
gl_FragColor = ualpha_Stage1 * outputCoverage_Stage0;
}
}
,
Jun 2 2018
Btw, the shader is created via this stack ... but that just looks like a "go create all my shaders" block, so it's not that interesting.
GrGLCompileAndAttachShader(...
GrGLProgramBuilder::compileAndAttachShaders(...
GrGLProgramBuilder::finalize(...
GrGLProgramBuilder::CreateProgram(...
GrGLGpu::ProgramCache::refProgram(...
GrGLGpu::flushGLState(...
GrGLGpu::draw(...
GrGpuRTCommandBuffer::draw(...
GrOpFlushState::executeDrawsAndUploadsForMeshDrawOp(...
GrOp::execute(...
GrRenderTargetOpList::onExecute(...
GrDrawingManager::executeOpLists(...
GrDrawingManager::internalFlush(...
GrDrawingManager::prepareSurfaceForExternalIO(...
GrContextPriv::prepareSurfaceForExternalIO(...
SkImage::MakeBackendTextureFromSkImage(...
cc::(...
cc::GpuImageDecodeCache::UploadImageIfNecessary(...
,
Jun 2 2018
,
Jun 5 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/fc70961580f6f6bd2498fd3242cb6ab35362cefc commit fc70961580f6f6bd2498fd3242cb6ab35362cefc Author: Christopher Cameron <ccameron@chromium.org> Date: Tue Jun 05 01:08:16 2018 Prefer to use glClear to clear uninitialized textures Skia will frequently allocate large temporary buffers (1k x 2k), write to a small part of the buffer, and then read from that small part. To ensure that we do not read from uninitialized memory, the GPU process will then clear all except for that small region of the buffer. This clear is, for historical reasons, implemented as a glTexSubImage2D, and we end up throwing ~8MB of data through glTexSubImage2D during tab switch, which can take ~200 msec, and produce noticeable jank. Split the pre-existing code to clear a level using glClear into a ClearLevelUsingGL function, add support for color buffers (it was previously only used for depth/stencil buffers), and use it for any uploads of more than 4k (an arbitrary choice based on my observation that we often do very small clears as well as very large clears). Bug: 848952 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel Change-Id: Iba6f36c44dae9f3be3fa431ff0e64dbd8e706e8a Reviewed-on: https://chromium-review.googlesource.com/1083782 Commit-Queue: ccameron <ccameron@chromium.org> Reviewed-by: Sunny Sachanandani <sunnyps@chromium.org> Cr-Commit-Position: refs/heads/master@{#564324} [modify] https://crrev.com/fc70961580f6f6bd2498fd3242cb6ab35362cefc/gpu/command_buffer/service/gles2_cmd_decoder.cc [modify] https://crrev.com/fc70961580f6f6bd2498fd3242cb6ab35362cefc/gpu/command_buffer/service/gles2_cmd_decoder_unittest_drawing.cc
,
Jun 6 2018
ClearLevel is fixed, but there are more issues that need attention, at least on this configuration.
,
Jul 6
,
Jul 10
Hi there, It appears that the fix for this issue is causing devices with OpenGL ES 2.0 to hit a DCHECK in GLES2DecoderImpl::ClearLevel [1]. The error returned by glGetError() is GL_INVALID_ENUM. This DCHECK is consistently hit a few seconds after the browser is launched. Disclaimer: I'm very much out of my depth when it comes to anything GL-related, but... It looks like we're now taking the codepath that calls glBindFramebufferEXTFn() with GL_DRAW_FRAMEBUFFER_EXT as the 'target' param where we weren't previously (when prefer_use_gl_clear is true). This is fine in GLES 3.0, as the documentation states you can pass in GL_DRAW_FRAMEBUFFER, GL_READ_FRAMEBUFFER, or GL_FRAMEBUFFER as the 'target' param to glBindFramebuffer() [2]. However, for GLES 2.0, the documentation states that you must only pass in GL_FRAMEBUFFER as the 'target' param, otherwise a GL_INVALID_ENUM error is generated [3]. I suspect this is what is causing the DCHECK to be hit in GLES2DecoderImpl::ClearLevel. The only thing that's a bit weird is that previously we would make this same call under certain circumstances (now, in the must_use_gl_clear == true path), and one of the conditions is that feature_info_->gl_version_info().is_es2 evaluates to true. But since that path calls GLES2DecoderImpl::ClearLevel with a non-GL_FRAMEBUFFER target, it's not a path we would want to take on GLES 2.0 devices. I'm not sure if maybe that path was very rarely (if ever) taken or what, but it seemed contradictory to me. You should be able to reproduce the issue pretty easily - I've been able to reproduce it by building the chrome_public_apk target (Android) and sideloading it on a GLES 2.0 device (in my case, the Gen 3 Fire TV). Anything with a Mali-400/450/470 GPU will be limited to GLES 2.0. You'll need to have fieldtrial_testing_like_official_build=true in your build arguments, otherwise OOP rasterization will be enabled and a different code path that avoids the issue ends up being taken [4][5]. [1] https://cs.chromium.org/chromium/src/gpu/command_buffer/service/gles2_cmd_decoder.cc?l=13168&rcl=77f7da163cfb91ce8b85164109b324bdd56966d3 [2] https://www.khronos.org/registry/OpenGL-Refpages/es3.0/html/glBindFramebuffer.xhtml [3] https://www.khronos.org/registry/OpenGL-Refpages/es2.0/xhtml/glBindFramebuffer.xml [4] https://chromium.googlesource.com/chromium/src/+/d82c4fabc5cc6cbe6b475186ed9fa0216980e533%5E%21/ [5] https://cs.chromium.org/chromium/src/gpu/ipc/service/gpu_channel.cc?l=636&rcl=121ba4c5b400bbd3aaf4d0c5fb82f293f8091e0e
,
Jul 10
Agreed, on ES2 we should be using GL_FRAMEBUFFER instead (but still keep GL_DRAW_FRAMEBUFFER if SupportsSeparateFramebufferBinds() - i.e use GetDrawFramebufferTarget()).
,
Jul 11
,
Aug 6
I don't mean to pester, just a friendly ping. Wanted to make sure this issue getting re-opened didn't fall in the cracks.
,
Aug 7
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/9e1dd1485923316a9fcee03891706263a8c148bc commit 9e1dd1485923316a9fcee03891706263a8c148bc Author: Christopher Cameron <ccameron@chromium.org> Date: Tue Aug 07 19:49:46 2018 Take SupportsSeparateFramebufferBinds into account in ClearLevelUsingGL Query SupportsSeparateFramebufferBinds to determine if the framebuffer target should be GL_DRAW_FRAMEBUFFER_EXT or GL_FRAMEBUFFER_EXT. Update tests so that one test uses an ES2 context and tests target GL_FRAMEBUFFER_EXT, while the other test uses an ES3 context and tests target GL_DRAW_FRAMEBUFFER_EXT. Bug: 848952 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel Change-Id: If9ff227498e10aef912f38e7bc26700c7513ed87 Reviewed-on: https://chromium-review.googlesource.com/1165597 Commit-Queue: ccameron <ccameron@chromium.org> Reviewed-by: Antoine Labour <piman@chromium.org> Cr-Commit-Position: refs/heads/master@{#581312} [modify] https://crrev.com/9e1dd1485923316a9fcee03891706263a8c148bc/gpu/command_buffer/service/gles2_cmd_decoder.cc [modify] https://crrev.com/9e1dd1485923316a9fcee03891706263a8c148bc/gpu/command_buffer/service/gles2_cmd_decoder_unittest_drawing.cc |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by ccameron@chromium.org
, Jun 1 2018Status: Duplicate (was: Assigned)