Need to cache GL_VIEWPORT in command buffer |
||||
Issue descriptionIt's been found in this workspace: https://github.com/mtrofin/wasm-android-demo that on Android, the wasm version of the demo is much slower than the native version. All of the time is going into the glGetIntegerv(GL_VIEWPORT, ...); call. The reason is that that value is uncached: https://cs.chromium.org/chromium/src/gpu/command_buffer/client/gles2_implementation.cc?type=cs&l=917 It looks like this kills performance on the Nexus 5, at least. (I couldn't reproduce the slowdown on either a Nexus 5X or Pixel.) Applying the attached patch to the JavaScript sources and viewing webGL/webgl-teapots.htm out of that workspace should slow down the performance. This value should be cached in the command buffer client.
,
Dec 7 2016
Reproducible on N5. Adding the attached patch (to an older Chrome 44 which was installed on the device) slows it down enormously.
,
Dec 7 2016
To further confirm, the attached patch addresses the problem. There are quite a few other parameters that may benefit from caching. (https://cs.chromium.org/chromium/src/gpu/command_buffer/client/gles2_implementation.cc?sq=package:chromium&dr=CSs&rcl=1481055549&l=755)
,
Dec 7 2016
At least at the command buffer level, I'm not 100% sure when the viewport state might need to be fetched from the driver. Is there automatic updating of the viewport if the underlying window surface is resized? I'm having a difficult time finding calls to glViewport inside the compositor.
,
Dec 7 2016
Oh, that patch works for the teapots demo because we call glViewport at app initialization time. Its intention here is to validate that, indeed, caching would address the perf issue.
,
Dec 7 2016
In theory the initial viewport depends on the window that's being rendered to. We could either grab that and cache it on first use or define it to be (0,0). We'd also need to clamp it to GL_MAX_VIEWPORT_DIMS. Other than that caching should be doable. By the spec I don't think the driver should be modifying it after the context is created. The compositor sets its viewport at https://cs.chromium.org/chromium/src/cc/output/gl_renderer.cc?q=cc/+setviewport&sq=package:chromium&dr=CSs&l=3234
,
Dec 7 2016
We already cache the viewport at ContextState. We will need to wire it up to Get* calls + clamping to MAX_VIEWPORT_DIMS before returning. Let me take this.
,
Dec 7 2016
What about the other, currently-uncached, parameters?
,
Dec 7 2016
Ah, it's client side caching. A bit more work but still trivia.
,
Dec 7 2016
That's the thing - it's bad programming to query these values all the time. Clients should track them on their side.
,
Dec 7 2016
I don't disagree -- but in this situation, a simple query killed performance of this test case. I think we should try to avoid major performance pitfalls like this one.
,
Dec 7 2016
Moreover, the scenario is: C/C++ code shared between wasm/asm.js and native (e.g. Android). In our case, the glGetIntegerv call came from such shared code, which itself came from an Android demo.
,
Dec 7 2016
In general, each context state caching on the client side adds code duplication - we will need to do both client side validation and service side validation. That's against design philosophy. That said, I am OK with caching certain states to improve specific cases. For this time, let me just cache viewport.
,
Dec 9 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c95386fb055bd88bc65debba5e9ea817cd272b6b commit c95386fb055bd88bc65debba5e9ea817cd272b6b Author: zmo <zmo@chromium.org> Date: Fri Dec 09 23:48:55 2016 Cache GL's viewport on the GPU command buffer client side. This is to tune the perf of a demo where glGetInteger(GL_VIEWPORT) is frequently called. BUG= 671872 TEST= R=kbr@chromium.org CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel Review-Url: https://codereview.chromium.org/2558933003 Cr-Commit-Position: refs/heads/master@{#437702} [modify] https://crrev.com/c95386fb055bd88bc65debba5e9ea817cd272b6b/gpu/command_buffer/build_gles2_cmd_buffer.py [modify] https://crrev.com/c95386fb055bd88bc65debba5e9ea817cd272b6b/gpu/command_buffer/client/client_context_state.cc [modify] https://crrev.com/c95386fb055bd88bc65debba5e9ea817cd272b6b/gpu/command_buffer/client/client_context_state.h [modify] https://crrev.com/c95386fb055bd88bc65debba5e9ea817cd272b6b/gpu/command_buffer/client/gles2_implementation.cc [modify] https://crrev.com/c95386fb055bd88bc65debba5e9ea817cd272b6b/gpu/command_buffer/client/gles2_implementation_impl_autogen.h [modify] https://crrev.com/c95386fb055bd88bc65debba5e9ea817cd272b6b/gpu/command_buffer/client/gles2_implementation_unittest.cc [modify] https://crrev.com/c95386fb055bd88bc65debba5e9ea817cd272b6b/gpu/command_buffer/common/capabilities.cc [modify] https://crrev.com/c95386fb055bd88bc65debba5e9ea817cd272b6b/gpu/command_buffer/common/capabilities.h [modify] https://crrev.com/c95386fb055bd88bc65debba5e9ea817cd272b6b/gpu/command_buffer/service/gl_utils.cc [modify] https://crrev.com/c95386fb055bd88bc65debba5e9ea817cd272b6b/gpu/command_buffer/service/gles2_cmd_decoder.cc
,
Dec 9 2016
,
Dec 10 2016
Verified on a Nexus 5 - thanks! Is there a way to add some sort of regression testing for this?
,
Dec 10 2016
This is really a bad way of using WebGL APIs - a program can just query a GL state in every draw call and kill the performance - if we optimize (by validating and caching all states on the client side) to cater all bad coding patterns, that will be too much a burden and totally destroy our design. So this is just a one time exception - I really don't see a need to do perf regression on this.
,
Dec 10 2016
See my earlier comment about the wasm scenario and code portability with native (#12) If the pattern is unusual in native, and the guidance for native is akin to the one for webgl, then we don't need this exception either. If the pattern is usual in native, then we need to first understand what having different performance guarantees for GL APIs in native and wasm would do to the web platform as a whole.
,
Dec 10 2016
Command buffer has been serving "native" code (Nacl) and WebGL for a long time. Surprisingly this porting issue doesn't come up until now. The current command buffer design definitely is inefficient if client code does a lot of context state querying. If we decide to optimize all of them, we pay a price (extra validation on the client side). I am not saying we shouldn't, but we really need to evaluate the situation (the code we plan to port over and do they query context states on a per draw call base a lot).
,
Dec 10 2016
The NaCL scenario is interesting, indeed. I wonder what a good strategy may be for vetting out whether what we're seeing is an unfortunate isolated example (the original C code comes from an isolated Android demo). Maybe I'll look at Unity code generation for native... any other suggestions would be welcome!
,
Dec 12 2016
Just realized the NaCl issue probably never came up because we had no NaCl on Android. I still owe an analysis of C/C++ OpenGL coding patterns, as I promised. |
||||
►
Sign in to add a comment |
||||
Comment 1 by kbr@chromium.org
, Dec 7 2016