New issue
Advanced search Search tips

Issue 848952 link

Starred by 5 users

Issue metadata

Status: Assigned
Merged: issue 848464
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 3
Type: Bug



Sign in to add a comment

Extremely long calls to GLES2DecoderImpl::ClearLevel on macOS during tab-switch

Project Member Reported by ccameron@chromium.org, Jun 1 2018

Issue description

Locally I'm seeing hundred-millisecond-long hangs in GLES2DecoderImpl::ClearLevel.

Attaching snapshot of a trace.
 
clearlevel.png
31.4 KB View Download
Mergedinto: 848464
Status: Duplicate (was: Assigned)
Cc: sdy@chromium.org ellyjo...@chromium.org erikc...@chromium.org
Status: Assigned (was: Duplicate)
(moving comment over from issue 848464, cause that is distinct from this):

What appears in the trace is four calls to GLES2DecoderImpl::ClearLevel, which weigh in at ~200msec total (causing a perceptible jank).

GLES2DecoderImpl::ClearLevel is allocating 8 MB of zeros, then telling the driver to upload them to the texture via the most inefficient upload path known to man (glTexSubImage2D ... btw, that path involves copying the data to a staging buffer inside the driver before doing the actual upload).
Cc: sunn...@chromium.org piman@chromium.org
The situation is as follows:

We allocate a ginormous texture -- 1024x2048. And we draw to the 58x12 pixel region in the top-left.

Then we bind that texture to read from, and we see that the remaining bazillion pixels have not been initialized. And so we go and issue three clear calls to clear a ~1024x2048 region.

I suspect that the texture is being allocated somewhere by Skia -- the shader source that is accessing the texture is


GLES2DecoderImpl::ClearUnclearedTextures...
#version 100

precision mediump float;
uniform mediump float ualpha_Stage1;
uniform lowp sampler2D uTextureSampler_0_Stage0;
varying highp vec2 vTextureCoords_Stage0;
varying highp float vTexIndex_Stage0;
varying mediump vec4 vinColor_Stage0;
void main() {
    mediump vec4 outputCoverage_Stage0;
    {
        mediump vec4 texColor;
        {
            texColor = texture2D(uTextureSampler_0_Stage0, vTextureCoords_Stage0, -0.5);
        }
        outputCoverage_Stage0 = texColor;
    }
    {
        outputCoverage_Stage0.w = max(max(outputCoverage_Stage0.x, outputCoverage_Stage0.y), outputCoverage_Stage0.z);
        gl_FragColor = ualpha_Stage1 * outputCoverage_Stage0;
    }
}

Btw, the shader is created via this stack ... but that just looks like a "go create all my shaders" block, so it's not that interesting.
    GrGLCompileAndAttachShader(...
    GrGLProgramBuilder::compileAndAttachShaders(...
    GrGLProgramBuilder::finalize(... 
    GrGLProgramBuilder::CreateProgram(...
    GrGLGpu::ProgramCache::refProgram(...
    GrGLGpu::flushGLState(...
    GrGLGpu::draw(...
    GrGpuRTCommandBuffer::draw(...
    GrOpFlushState::executeDrawsAndUploadsForMeshDrawOp(...
    GrOp::execute(...
    GrRenderTargetOpList::onExecute(...
    GrDrawingManager::executeOpLists(...
    GrDrawingManager::internalFlush(...
    GrDrawingManager::prepareSurfaceForExternalIO(...
    GrContextPriv::prepareSurfaceForExternalIO(...
    SkImage::MakeBackendTextureFromSkImage(...
    cc::(...
    cc::GpuImageDecodeCache::UploadImageIfNecessary(...


Summary: Extremely long calls to GLES2DecoderImpl::ClearLevel on macOS during tab-switch (was: Extremely long calls to GLES2DecoderImpl::ClearLevel on macOS)
Project Member

Comment 6 by bugdroid1@chromium.org, Jun 5 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/fc70961580f6f6bd2498fd3242cb6ab35362cefc

commit fc70961580f6f6bd2498fd3242cb6ab35362cefc
Author: Christopher Cameron <ccameron@chromium.org>
Date: Tue Jun 05 01:08:16 2018

Prefer to use glClear to clear uninitialized textures

Skia will frequently allocate large temporary buffers (1k x 2k), write
to a small part of the buffer, and then read from that small part.

To ensure that we do not read from uninitialized memory, the GPU
process will then clear all except for that small region of the buffer.

This clear is, for historical reasons, implemented as a glTexSubImage2D,
and we end up throwing ~8MB of data through glTexSubImage2D during tab
switch, which can take ~200 msec, and produce noticeable jank.

Split the pre-existing code to clear a level using glClear into a
ClearLevelUsingGL function, add support for color buffers (it was
previously only used for depth/stencil buffers), and use it for any
uploads of more than 4k (an arbitrary choice based on my observation
that we often do very small clears as well as very large clears).

Bug: 848952
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: Iba6f36c44dae9f3be3fa431ff0e64dbd8e706e8a
Reviewed-on: https://chromium-review.googlesource.com/1083782
Commit-Queue: ccameron <ccameron@chromium.org>
Reviewed-by: Sunny Sachanandani <sunnyps@chromium.org>
Cr-Commit-Position: refs/heads/master@{#564324}
[modify] https://crrev.com/fc70961580f6f6bd2498fd3242cb6ab35362cefc/gpu/command_buffer/service/gles2_cmd_decoder.cc
[modify] https://crrev.com/fc70961580f6f6bd2498fd3242cb6ab35362cefc/gpu/command_buffer/service/gles2_cmd_decoder_unittest_drawing.cc

ClearLevel is fixed, but there are more issues that need attention, at least on this configuration.
Status: Fixed (was: Assigned)
Hi there,

It appears that the fix for this issue is causing devices with OpenGL ES 2.0 to hit a DCHECK in GLES2DecoderImpl::ClearLevel [1]. The error returned by glGetError() is GL_INVALID_ENUM.

This DCHECK is consistently hit a few seconds after the browser is launched.

Disclaimer: I'm very much out of my depth when it comes to anything GL-related, but...

It looks like we're now taking the codepath that calls glBindFramebufferEXTFn() with GL_DRAW_FRAMEBUFFER_EXT as the 'target' param where we weren't previously (when prefer_use_gl_clear is true).

This is fine in GLES 3.0, as the documentation states you can pass in GL_DRAW_FRAMEBUFFER, GL_READ_FRAMEBUFFER, or GL_FRAMEBUFFER as the 'target' param to glBindFramebuffer() [2].
However, for GLES 2.0, the documentation states that you must only pass in GL_FRAMEBUFFER as the 'target' param, otherwise a GL_INVALID_ENUM error is generated [3]. I suspect this is what is causing the DCHECK to be hit in GLES2DecoderImpl::ClearLevel.

The only thing that's a bit weird is that previously we would make this same call under certain circumstances (now, in the must_use_gl_clear == true path), and one of the conditions is that feature_info_->gl_version_info().is_es2 evaluates to true. But since that path calls GLES2DecoderImpl::ClearLevel with a non-GL_FRAMEBUFFER target, it's not a path we would want to take on GLES 2.0 devices.
I'm not sure if maybe that path was very rarely (if ever) taken or what, but it seemed contradictory to me.

You should be able to reproduce the issue pretty easily - I've been able to reproduce it by building the chrome_public_apk target (Android) and sideloading it on a GLES 2.0 device (in my case, the Gen 3 Fire TV).
Anything with a Mali-400/450/470 GPU will be limited to GLES 2.0.
You'll need to have fieldtrial_testing_like_official_build=true in your build arguments, otherwise OOP rasterization will be enabled and a different code path that avoids the issue ends up being taken [4][5].

[1] https://cs.chromium.org/chromium/src/gpu/command_buffer/service/gles2_cmd_decoder.cc?l=13168&rcl=77f7da163cfb91ce8b85164109b324bdd56966d3
[2] https://www.khronos.org/registry/OpenGL-Refpages/es3.0/html/glBindFramebuffer.xhtml
[3] https://www.khronos.org/registry/OpenGL-Refpages/es2.0/xhtml/glBindFramebuffer.xml
[4] https://chromium.googlesource.com/chromium/src/+/d82c4fabc5cc6cbe6b475186ed9fa0216980e533%5E%21/
[5] https://cs.chromium.org/chromium/src/gpu/ipc/service/gpu_channel.cc?l=636&rcl=121ba4c5b400bbd3aaf4d0c5fb82f293f8091e0e
Agreed, on ES2 we should be using GL_FRAMEBUFFER instead (but still keep GL_DRAW_FRAMEBUFFER if SupportsSeparateFramebufferBinds() - i.e use GetDrawFramebufferTarget()).
Components: -Internals>GPU Internals>GPU>Internals
Status: Assigned (was: Fixed)
I don't mean to pester, just a friendly ping. Wanted to make sure this issue getting re-opened didn't fall in the cracks.

Project Member

Comment 13 by bugdroid1@chromium.org, Aug 7

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9e1dd1485923316a9fcee03891706263a8c148bc

commit 9e1dd1485923316a9fcee03891706263a8c148bc
Author: Christopher Cameron <ccameron@chromium.org>
Date: Tue Aug 07 19:49:46 2018

Take SupportsSeparateFramebufferBinds into account in ClearLevelUsingGL

Query SupportsSeparateFramebufferBinds to determine if the framebuffer
target should be GL_DRAW_FRAMEBUFFER_EXT or GL_FRAMEBUFFER_EXT.

Update tests so that one test uses an ES2 context and tests target
GL_FRAMEBUFFER_EXT, while the other test uses an ES3 context and tests
target GL_DRAW_FRAMEBUFFER_EXT.

Bug: 848952
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: If9ff227498e10aef912f38e7bc26700c7513ed87
Reviewed-on: https://chromium-review.googlesource.com/1165597
Commit-Queue: ccameron <ccameron@chromium.org>
Reviewed-by: Antoine Labour <piman@chromium.org>
Cr-Commit-Position: refs/heads/master@{#581312}
[modify] https://crrev.com/9e1dd1485923316a9fcee03891706263a8c148bc/gpu/command_buffer/service/gles2_cmd_decoder.cc
[modify] https://crrev.com/9e1dd1485923316a9fcee03891706263a8c148bc/gpu/command_buffer/service/gles2_cmd_decoder_unittest_drawing.cc

Sign in to add a comment