
Issue 747159

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 1
Type: Bug
Proj-XR

Blocked on:
issue 750286
issue 750385
issue 770924




Frames rendered in fast/slow pattern, fence related?

Project Member Reported by klausw@chromium.org, Jul 20 2017

Issue description

TL;DR: very mysterious frame timing, where one frame takes twice the expected time followed by an almost-instant frame. Not sure if this is a misbehaving fence, or if rendering is really slow due to pipelining?

Chrome Version: ToT M61 @ 86245656c46e084d9c80bcef68e6181053226d1e (Jul 20)
OS: Android

What steps will reproduce the problem?
(1) Open https://webvr.info/samples/test-slow-render.html?renderScale=1&heavyGpu=1&cubeScale=0.3&workTime=4
(2) Tap "Enter VR"
(3) Collect an event trace

What is the expected result?

Reasonable framerate, all frames take about the same time

What happens instead?

Framerate is 30-35fps on a Pixel XL, and the suspicious thing is that timing shows an alternating sequence of slow and fast frames.

See the attached event trace; I've added an image with annotations drawn in to show the frame lifecycle.

The blocking factor seems to be the DrawFrameSubmitWhenReady polling for a GLFenceEGL to signal that drawing has completed. This takes about 30ms for one frame, then essentially zero for the next frame.

Another noteworthy detail is that CPU usage drops to near zero while waiting. This may be normal; I don't know if GPU usage would show up here, since the VrShellGl rendering context is a direct ui/gl/ context that doesn't use the CrGpuMain command buffer.

The code active here is:

https://cs.chromium.org/chromium/src/chrome/browser/android/vr_shell/vr_shell_gl.cc?l=876

    // [draw image: a simple fullscreen quad copying a SurfaceTexture]

    // Continue with submit once a GL fence signals that current drawing
    // operations have completed.
    std::unique_ptr<gl::GLFenceEGL> fence = base::MakeUnique<gl::GLFenceEGL>();
    task_runner_->PostTask(
        FROM_HERE,
        base::Bind(&VrShellGl::DrawFrameSubmitWhenReady,
                   weak_ptr_factory_.GetWeakPtr(), frame_index, frame.release(),
                   render_info_primary_.head_pose, base::Passed(&fence)));

...

    // Poll until the fence signals completion, then submit the frame.
    while (fence->ClientWaitWithTimeoutNanos(
               kWebVRFenceCheckTimeout.InMicroseconds() * 1000) ==
           EGL_TIMEOUT_EXPIRED) {
      // Not signaled yet; keep polling.
    }
    frame.Submit(*buffer_viewport_list_, mat);

Anyone have an idea what may be going on here? I vaguely remember that the Adreno GPU has different rendering modes with tradeoffs between latency and throughput; is it possible that it's toggling between modes in some way and trying to render frames in parallel?

Or is there an issue where fences don't signal completion even though the drawing operations are actually done?

I once saw a nearly identical effect in a prototype when misusing server waits (eglWaitSyncKHR): I emitted two server waits in a row, and the second one blocked GPU operations, including checking earlier fences. However, as far as I know, neither my code nor the GVR code for the following steps uses server waits. Does Chrome use them internally somewhere in the GPU process?
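
For reference, here is a minimal sketch of the client-wait vs. server-wait distinction in question (not Chrome code; assumes the EGL_KHR_fence_sync / EGL_KHR_wait_sync entry points are already loaded, as ui/gl does):

  #include <EGL/egl.h>
  #include <EGL/eglext.h>
  #include <GLES2/gl2.h>

  void WaitExamples() {
    EGLDisplay dpy = eglGetCurrentDisplay();
    EGLSyncKHR sync = eglCreateSyncKHR(dpy, EGL_SYNC_FENCE_KHR, nullptr);
    glFlush();  // The fence must reach the GPU before anything waits on it.

    // Client wait: blocks the calling CPU thread until the fence signals,
    // returning EGL_CONDITION_SATISFIED (0x30F6) on success.
    EGLint result = eglClientWaitSyncKHR(dpy, sync, 0, EGL_FOREVER_KHR);
    (void)result;

    // Server wait: returns immediately on the CPU, but all GL commands
    // issued on this context afterwards stall on the GPU until the fence
    // signals. Two of these back to back can serialize unrelated GPU work,
    // which matches the prototype misbehavior described above.
    eglWaitSyncKHR(dpy, sync, 0);

    eglDestroySyncKHR(dpy, sync);
  }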

Anything else I could dig up to try to help debug this?

FWIW, the code I measured with has a 1-line change to disable vsync-aligned JS rAF timing; this doesn't change the framerate significantly but shows the pattern more clearly:

https://cs.chromium.org/chromium/src/chrome/browser/android/vr_shell/vr_shell_gl.cc?q=vr_shell_gl.cc+file:%5Esrc/+package:%5Echromium$&dr&l=1094
-  if (surfaceless_rendering_ || !pending_vsync_) {
+  if (!pending_vsync_) {

 
trace_totnoalign-fastslowpattern.html (4.2 MB)
totunalign-fastslow.png (246 KB)

Comment 1 by klausw@chromium.org, Jul 20 2017

I've attached a trace and screenshot from the Snapdragon Profiler, though I'm not sure how to interpret it. Its sampling resolution seems to be about 0.1s, so it's not possible to see the alternating-frame pattern. "GPU % Utilization" shows as less than 50%, but I'm unsure if that's just a reporting error; I haven't really seen numbers much higher than 60% or so even in other applications I've tested.

fastslow.csv (884 KB)
fastslow-profiler.png (233 KB)

Comment 2 by klausw@chromium.org, Jul 20 2017

Labels: Proj-VR GPU-Qualcomm

Comment 3 by klausw@chromium.org, Jul 20 2017

Out of paranoia I added some debugging output to ui/gl/gl_fence_egl.cc to make sure I'm not doing something silly like destroying/recreating the fences instead of passing them along. It seems to confirm the behavior.

For reference:
third_party/khronos/EGL/egl.h:#define EGL_TIMEOUT_EXPIRED               0x30F5
third_party/khronos/EGL/egl.h:#define EGL_CONDITION_SATISFIED           0x30F6

The GL calls being executed are basically this:

  sync_ = eglCreateSyncKHR(eglGetCurrentDisplay(), EGL_SYNC_FENCE_KHR, NULL);
  glFlush();
  ...
  result = eglClientWaitSyncKHR(display_, sync_, flags, timeout /*= 2000 ns */);

Here's a few frames in sequence:

07-20 16:40:08.307 20047 21275 I chromium: [INFO:gl_fence_egl.cc(25)] GLFenceEGL;;; this=0xc38adc40 CONSTRUCTOR
07-20 16:40:08.308 20047 21275 I chromium: [INFO:gl_fence_egl.cc(29)] GLFenceEGL;;; this=0xc38adc40 flushed
07-20 16:40:08.308 20047 21275 I chromium: [INFO:vr_shell_gl.cc(879)] DrawFrame;;; fence=0xc38adc40
07-20 16:40:08.311 20047 21275 I chromium: [INFO:gl_fence_egl.cc(58)] ClientWaitWithTimeoutNanos;;; this=0xc38adc40 result=0x30f5
07-20 16:40:08.339 20047 21275 I chatty  : uid=10133(u0_a133) Thread-21 identical 12 lines
07-20 16:40:08.341 20047 21275 I chromium: [INFO:gl_fence_egl.cc(58)] ClientWaitWithTimeoutNanos;;; this=0xc38adc40 result=0x30f5
07-20 16:40:08.343 20047 21275 I chromium: [INFO:gl_fence_egl.cc(58)] ClientWaitWithTimeoutNanos;;; this=0xc38adc40 result=0x30f6
Elapsed: ~36ms

07-20 16:40:08.349 20047 21275 I chromium: [INFO:gl_fence_egl.cc(25)] GLFenceEGL;;; this=0xc38ad9e0 CONSTRUCTOR
07-20 16:40:08.350 20047 21275 I chromium: [INFO:gl_fence_egl.cc(29)] GLFenceEGL;;; this=0xc38ad9e0 flushed
07-20 16:40:08.350 20047 21275 I chromium: [INFO:vr_shell_gl.cc(879)] DrawFrame;;; fence=0xc38ad9e0
07-20 16:40:08.350 20047 21275 I chromium: [INFO:gl_fence_egl.cc(58)] ClientWaitWithTimeoutNanos;;; this=0xc38ad9e0 result=0x30f6
Elapsed: ~1ms

07-20 16:40:08.371 20047 21275 I chromium: [INFO:gl_fence_egl.cc(25)] GLFenceEGL;;; this=0xc72dce80 CONSTRUCTOR
07-20 16:40:08.371 20047 21275 I chromium: [INFO:gl_fence_egl.cc(29)] GLFenceEGL;;; this=0xc72dce80 flushed
07-20 16:40:08.371 20047 21275 I chromium: [INFO:vr_shell_gl.cc(879)] DrawFrame;;; fence=0xc72dce80
07-20 16:40:08.374 20047 21275 I chromium: [INFO:gl_fence_egl.cc(58)] ClientWaitWithTimeoutNanos;;; this=0xc72dce80 result=0x30f5
07-20 16:40:08.403 20047 21275 I chatty  : uid=10133(u0_a133) Thread-21 identical 12 lines
07-20 16:40:08.405 20047 21275 I chromium: [INFO:gl_fence_egl.cc(58)] ClientWaitWithTimeoutNanos;;; this=0xc72dce80 result=0x30f5
07-20 16:40:08.406 20047 21275 I chromium: [INFO:gl_fence_egl.cc(58)] ClientWaitWithTimeoutNanos;;; this=0xc72dce80 result=0x30f6
Elapsed: ~35ms

07-20 16:40:08.417 20047 21275 I chromium: [INFO:gl_fence_egl.cc(25)] GLFenceEGL;;; this=0xc38ad9a0 CONSTRUCTOR
07-20 16:40:08.417 20047 21275 I chromium: [INFO:gl_fence_egl.cc(29)] GLFenceEGL;;; this=0xc38ad9a0 flushed
07-20 16:40:08.418 20047 21275 I chromium: [INFO:vr_shell_gl.cc(879)] DrawFrame;;; fence=0xc38ad9a0
07-20 16:40:08.418 20047 21275 I chromium: [INFO:gl_fence_egl.cc(58)] ClientWaitWithTimeoutNanos;;; this=0xc38ad9a0 result=0x30f6
Elapsed: <1ms

Comment 4 by klausw@chromium.org, Jul 21 2017

Cc: kbr@chromium.org
+kbr since you were involved in some of the WebVR discussions previously, do you have any ideas what could explain the very odd timing? Or do you think it would make sense to discuss this with someone from Qualcomm in case the unusual rendering loop is confusing the driver?

Comment 5 by klausw@chromium.org, Jul 21 2017

Correction: billorr@ pointed me in the right direction. There is actually significant overlap in rendering between frames; the code starts setting up the mailbox for transfer before the waitForPreviousRenderToComplete step. See the corrected interpretation of the original trace.


totunalign-fastslow-v2.png (250 KB)

Comment 6 by klausw@chromium.org, Jul 21 2017

It does seem that the effect is caused by parallel GPU work in different contexts, though it's odd that it's so drastic.

I tried an experiment moving the wait-for-previous-frame-to-render to before the mailbox setup step. This evens out the frame timing nicely but only improves the framerate very slightly; see the new traces.

I guess it may work better to start the next frame's rendering slightly earlier to improve parallelism, but not so early as to bog down the GPU with parallel work.
totnoalign-earlywait.png (193 KB)

Comment 7 by klausw@chromium.org, Jul 21 2017

+missing attachment for the trace itself
trace_totnoalign-earlywait.html (4.3 MB)

Comment 8 by kbr@chromium.org, Jul 21 2017

Cc: sunn...@chromium.org briander...@chromium.org
I'm not really familiar with this code but a couple of things jump out at me from the trace. In trace_totnoalign-earlywait.html, VRDisplay::GetImage is making two long calls to SyncChannel::Send, taking between 1.3 ms and 1.6 ms each; VRDisplay::EnsureMailbox then calls SyncChannel::Send taking 1.5 ms; and finally VRDisplay::SubmitFrame takes about 0.75 ms. I mentioned on one of the earlier CLs that it seemed like a bad idea to be calling ClientWaitSync for synchronization purposes; not sure whether that CL has landed, but making round-trips to the GPU process from the renderer process each frame should be avoided at all costs.

It's another question why waitForPreviousRenderToFinish is taking upwards of 10 ms per frame and sometimes 18 ms per frame.

The GPU process doesn't appear to be very busy in that trace; it's doing about 10 ms of work periodically. So it seems to me it should be possible to optimize things and get 60 FPS in this case.

You should dig deeper to understand exactly the behavior you want for a simple case, and make sure you're getting it per tracing. In my cursory understanding of the code, I suggest producing an EGLSync object from the producer as well as the texture it renders, and then the VRShell should pick it up, do an eglWaitSync (not eglClientWaitSync) and draw that texture.
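
A minimal sketch of that producer/consumer handoff (illustrative only, not existing Chrome code; |shared_texture| is assumed to be a texture visible to both contexts):

  // Producer context, after rendering into |shared_texture|:
  EGLDisplay dpy = eglGetCurrentDisplay();
  EGLSyncKHR frame_done = eglCreateSyncKHR(dpy, EGL_SYNC_FENCE_KHR, nullptr);
  glFlush();
  // ...hand |shared_texture| and |frame_done| over to the VrShell side...

  // Consumer (VrShell) context: queue a GPU-side wait instead of blocking
  // the CPU with eglClientWaitSyncKHR, then draw using the texture.
  eglWaitSyncKHR(dpy, frame_done, 0);
  glBindTexture(GL_TEXTURE_2D, shared_texture);
  // ...draw the fullscreen quad...
  eglDestroySyncKHR(dpy, frame_done);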

Once the traces for the simple case look good, then go forward and try the more complex GPU-bound cases.

If absolutely necessary, you might do an operation like draining the graphics pipe once per frame via glFinish(), but I wouldn't introduce this unless you've found it is necessary due to overloading the GPU.

Comment 9 by enne@chromium.org, Jul 21 2017

Status: Available (was: Untriaged)
@kbr, thank you for the feedback. Would you be willing to do a VC to discuss this in more detail?

> The GPU process doesn't appear to be very busy in that trace; it's doing about 10 ms of work periodically. So it seems to me it should be possible to optimize things and get 60 FPS in this case.

Yes, the GPU process is not very busy, but I think the main bottleneck in this case is the actual GPU rendering, not the GPU process or renderer.

This is a synthetic test that simulates GPU-render-bound applications; it uses an extremely expensive fragment shader to stress the GPU but only has a handful of simple draw calls. If I understand it right, this means that the GPU process would not be expected to be very busy, since it only has to issue a few commands to the underlying GL driver, and any time spent waiting for rendering to complete would generally not show directly in the CrGpuMain trace unless there's a blocking GL call, right?

> It's another question why waitForPreviousRenderToFinish is taking upwards of 10 ms per frame and sometimes 18 ms per frame.

waitForPreviousRenderToFinish is the sequencing that's used to prevent overloading the GPU. The intent is that we have one frame that's wrapping up GPU rendering while the next frame is executing JS and issuing GL commands to the GPU process. In this case longish wait times are expected since there's minimal JS work. (There's even an intentional 4ms busy wait in the test application before submitFrame to simulate a more complex JS render setup.) So even if the submitFrame overhead were lower, it would just spend more time waiting in the waitForPreviousRenderToFinish step.

The counterintuitive behavior I saw was that I get more consistent and slightly better performance by intentionally *delaying* sending commands to the GPU process to reduce parallelism; that's the change from comment #6, where I forced waiting for the previous frame before creating the mailbox and doing the associated steps. The original intent was to improve parallelism by letting the GPU start work on the next frame while the current one is finishing rendering. That usually worked, but in this case it led to some very odd timing.

The specific problem with the original trace was the huge variation in rendering times. I'm not sure if the GPU is indeed taking that long to finish frames or if the fences don't signal completion even though rendering is already done, but it looks as if it's buffering one frame and finishing a pair of frames together at almost exactly the same time. Part of the issue is that the mailbox setup (which involves a pixel copy) happens at around the same time as the previous frame is rendering, and this seems to lead to bad scheduling.

> I'm not really familiar with this code but a couple of things jump out at me from the trace. In trace_totnoalign-earlywait.html, VRDisplay::GetImage is making two long calls to SyncChannel::Send, taking between 1.3 ms and 1.6 ms each; VRDisplay::EnsureMailbox then calls SyncChannel::Send taking 1.5 ms; and finally VRDisplay::SubmitFrame takes about 0.75 ms. I mentioned on one of the earlier CLs that it seemed like a bad idea to be calling ClientWaitSync for synchronization purposes; not sure whether that CL has landed, but making round-trips to the GPU process from the renderer process each frame should be avoided at all costs.

Yes, I'm aware that the current rendering pipeline is sadly inefficient. The first two SyncChannel::Send calls happen inside rendering_context_->GetImage(), which isn't our code. In previous discussions I got the recommendation to use that interface for accessing the canvas content in preparation for turning it into a mailbox image.

At a high level, what we want to happen at this point is just to create a mailbox and an associated sync token; that's what gets passed to VRDisplay::SubmitFrame. In most cases this SubmitFrame is fairly fast (0.2ms or so); it's an async Mojo call. If there's a way to get the sync token and mailbox without GPU process round trips, I'd be in favor of using that. However, I thought that a requirement for sync tokens is that the underlying fence must be flushed to the GPU process; doesn't that inherently involve a round trip? I'm not sure if an unverified sync token would be suitable in this case or if it would help with parallelism, but it sounds as if this would require changes to existing Blink interfaces.

The ClientWaitSync change has landed, those are the DrawFrameSubmitWhenReady src_func blocks in VrShellGl where it's basically polling for a previous frame to finish rendering. Once that fence signals completion, it unblocks the waitForPreviousRenderToFinish step.

> You should dig deeper to understand exactly the behavior you want for a simple case, and make sure you're getting it per tracing. In my cursory understanding of the code, I suggest producing an EGLSync object from the producer as well as the texture it renders, and then the VRShell should pick it up, do an eglWaitSync (not eglClientWaitSync) and draw that texture. 

The EGLSync is a bit tricky since we'd need to do a cross-context and cross-process sync. I do have a prototype that uses this as part of a replacement for the mailbox transfer, but that's a separate issue.

We draw the texture before waiting for the existing fence to complete; the eglClientWaitSync is just a post-draw check to see when it finishes. It's only a bottleneck if an application needs almost all of a frame's JS time budget in addition to its GPU budget, but that's not the case for this test.

> If absolutely necessary, you might do an operation like draining the graphics pipe once per frame via glFinish(), but I wouldn't introduce this unless you've found it is necessary due to overloading the GPU.

I was doing a glFinish in earlier versions; the current behavior was generally more efficient than the glFinish, except for cases like this one where the increased parallelism seems to result in pathological behavior :-/

FYI, just for completeness, there are also additional GL contexts involved. VrShellGl in the browser has a command buffer connection to the GPU process to retrieve the mailbox, and uses a separate ui/gl context to draw this into GVR's renderbuffer. GVR has its own high-priority context which does async distortion shading at 120fps (two eye views per frame); that contributes some background load which isn't directly visible in these traces.

Comment 11 by kbr@chromium.org, Jul 24 2017

Owner: klausw@chromium.org
Status: Assigned (was: Available)
klausw@: happy to chat via GVC, but it would be more productive for you to catch sunnyps@. He understands the synchronization primitives much better than I do, and can advise on how you can avoid round trips to the GPU process.

> This is a synthetic test that simulates GPU-render-bound applications, it uses an extremely expensive fragment shader to stress the GPU but only has a handful of simple draw calls. If I understand it right, this means that the GPU process would not be expected to be very busy since it only has to issue a few commands to the underlying GL driver, and any waiting time for rendering to complete would in general not show directly in the CrGpuMain trace unless there's a blocking GL call, right?

I would still expect some call on the GPU process side to take a long time waiting for the GPU to complete its work; perhaps eglSwapBuffers, or something else.

Still encourage you to try to get rid of the client-side waits. Due to Chrome's deep and multi-process rendering pipeline, swamping the GPU is a recurring problem, but different approaches to throttling the client have been taken and at this point have been successful without doing client-side waits per frame.

Klaus, let me assign this bug to you as we'll need you to drive the investigation.

I've experimented a bit and hit an interesting extreme case. I mostly disabled the waitForPreviousRenderToFinish throttling, and now the renderer seems to be creating dozens of frames and mailboxes while the consumer in VrShellGl gets way behind. If I select the CrGpuMain commands from the timespan 11s to 13s, I see 68 DoProduceTextureCHROMIUM calls but only a single DoSwapBuffers call, which would correspond to a frame being consumed by drawing onto a transfer Surface. Latency is horrific; the completion fence took >800ms. The in-app framerate claims to be ~43fps, but the actual screen updates are more like 0.1fps.

@sunnyps, is this maybe related to the GPU scheduling introduced recently per https://docs.google.com/document/d/1hjVckIpb9WBE7A9HUxAmutRJEgSkZO_JAFfgwOE-8NE/edit / issue 514813? A user reported a significant slowdown going from M59 to M60 which I couldn't reproduce myself, but I'm unsure to what extent this may depend on being enrolled in experiments.

I'll try disabling GPU scheduling and/or tweaking priorities; I'm not sure if the VR composition in VrShellGl is for some reason accidentally ending up with low priority.
trace_earlyreportrendered-superslow.html (5.9 MB)
For context re my previous comment, VrShellGl is supposed to run the following code once per received submitFrame call:

https://cs.chromium.org/chromium/src/chrome/browser/android/vr_shell/mailbox_to_surface_bridge.cc?q=mailbox_to_sur&sq=package:chromium&l=251

  gl_->WaitSyncTokenCHROMIUM(mailbox.sync_token.GetConstData());

  GLuint sourceTexture = gl_->CreateAndConsumeTextureCHROMIUM(
      GL_TEXTURE_2D, mailbox.mailbox.name);
  DrawQuad(sourceTexture);
  ...
  gl_->SwapBuffers();
  return true;

Based on the trace, DoCreateAndConsumeTextureINTERNAL is being called 68 times, which would be the expected count, but DoSwapBuffers only once in that time period. Strange. I vaguely remember there's a throttling mechanism for SwapBuffers that may apply here?
Correction to comment #13: it looks like CreateAndConsumeTextureCHROMIUM is called twice per frame in normal operation, so this would match a single call to the mailbox_to_surface_bridge function. But the mystery remains why it is called so rarely.

I'm adding extra logging to see if it's related to SwapBuffers throttling.
The GPU command buffer scheduler was enabled in M61. Can you try running with --disable-features=GpuScheduler? (I'm still reading through all the comments on this bug.)
@sunnyps, I tried this:

$ ./build/android/adb_chrome_public_command_line --v=0 --disable-features=GpuScheduler

It's still showing the fast/slow pattern; see the attached trace_fastslow-disablegpuscheduler.html. As far as I can tell, that made no difference.

The performance regression from M59 to M60 is likely a red herring. It turns out that there are at least two different "Cardboard (2015)" headset configurations out in the wild with different render resolutions; I've seen 1672x922 and 1818x1144. This difference seems sufficient to explain some of the performance variations I've seen. FYI, the actual resolution used is visible in traces as an arg to SwapBuffers calls in the GPU process.

I can't see SwapBuffers throttling (see comment #13) after adding a TRACE_EVENT0 in GLES2Implementation::SwapBuffers for the WaitForToken call, so that also doesn't seem related.

I was able to fix the slow/fast issue and increase FPS from ~44fps to ~54fps by adding attribs.context_priority = gl::ContextPriorityLow; at the end of GenerateGLContextAttribs in gpu/command_buffer/service/service_utils.cc; see the attached trace_hack-lowprio.html. This changes the driver-level GL context priority per the EGL_IMG_context_priority extension. The result is that the WebGL context runs at low priority, the VrShellGl copy at medium priority, and the GVR async reprojection (not visible in the event traces) at high priority.
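
For reference, that attribute roughly maps to the following at the EGL level (a sketch assuming the EGL_IMG_context_priority extension; not the actual Chromium plumbing):

  const EGLint kLowPriorityAttribs[] = {
      EGL_CONTEXT_CLIENT_VERSION, 2,
      EGL_CONTEXT_PRIORITY_LEVEL_IMG, EGL_CONTEXT_PRIORITY_LOW_IMG,
      EGL_NONE,
  };
  // |display| and |config| as used for the context being created.
  EGLContext context =
      eglCreateContext(display, config, EGL_NO_CONTEXT, kLowPriorityAttribs);
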
trace_fastslow-disablegpuscheduler.html (4.2 MB)
trace_hack-lowprio.html (4.3 MB)
Some notes from a VC with sunnyps@, bajones@, and billorr@ yesterday; please let me know if I misunderstood things.

Confirmed that the issue seems to be unrelated to the GPU scheduler; disabling it had no effect. Also, if I remember right, the issue exists in M60 as well, where the scheduler wasn't active yet.

We don't know what specifically is causing the odd timing, other than what appears to be unfortunate scheduling decisions involving a very busy virtualized context in the GPU process that's starving the VrShellGl native context.

Changing the driver-level context priority seems like an invasive change and would need further discussion.

In the meantime, my plan is to tweak the timing to avoid this situation, for example by ensuring that there's a minimum elapsed time between submitting a previous frame and flushing GL commands for the following one. This doesn't necessarily need to be based on exact sync points or unblocking via IPC messages; it is likely sufficient to just measure average times and apply an approximate delay to prevent overstuffing.
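
A sketch of that approximate-delay idea (all names here are hypothetical, not actual VrShellGl members):

  // Don't flush frame N+1's GL commands until the previous submit has had
  // roughly one average render time to complete on the GPU.
  base::TimeDelta avg_render_time_;   // Rolling average, updated per frame.
  base::TimeTicks last_submit_time_;

  void VrShellGl::MaybeDeferFlush(const base::Closure& flush_next_frame) {
    base::TimeTicks earliest = last_submit_time_ + avg_render_time_;
    base::TimeDelta delay = earliest - base::TimeTicks::Now();
    if (delay <= base::TimeDelta()) {
      flush_next_frame.Run();  // Enough time has elapsed; flush immediately.
    } else {
      task_runner_->PostDelayedTask(FROM_HERE, flush_next_frame, delay);
    }
  }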

Separately, sunnyps@ suggested switching from the current verified sync tokens, which involve a synchronous IPC, to unverified sync tokens that don't need it. This would involve some logic changes in VrShellGl to verify the tokens before using them, but should improve the available JavaScript execution time.
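
Roughly what that would look like (a sketch against gpu::gles2::GLES2Interface, where |gl| is the interface pointer; exact signatures have varied across Chromium revisions):

  // Renderer side: generate an unverified token; no synchronous IPC here.
  gpu::SyncToken sync_token;
  gl->GenUnverifiedSyncTokenCHROMIUM(sync_token.GetData());
  // ...send |sync_token| along with the mailbox via the async Mojo call...

  // Consumer (VrShellGl) side: verify the token first (can be batched),
  // then queue a server-side wait before consuming the mailbox.
  GLbyte* tokens[] = {sync_token.GetData()};
  gl->VerifySyncTokensCHROMIUM(tokens, 1);
  gl->WaitSyncTokenCHROMIUM(sync_token.GetConstData());
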
Blockedon: 750286
Blockedon: 750385
Cc: kainino@chromium.org
Update: this issue does *not* reproduce on a Pixel (non-XL) running Android N. I suspect this may be related specifically to the Android O Qualcomm driver as used on the Pixel XL; in my case this was marlin-userdebug 8.0.0 OPR1.170623.011.
Update to the update: I do still see a fast/slow pattern on a Pixel XL running Android N. Since the non-XL Pixel and the Pixel XL have very similar hardware, I suspect the difference may just be timing related, with the effect triggering or not depending on details such as render resolution.

For example, adding &renderScale=0.8 on the Pixel XL results in fairly smooth 60fps without the slow/fast pattern, and this roughly simulates the resolution difference between the devices (2560x1440 vs 1920x1080).

Sorry about the confusion; it seems difficult to track down what exactly is going on without more insight into driver-level details.
Labels: -M-61
Status: ExternalDependency (was: Assigned)
I filed b/65157989 asking Qualcomm for help.
Status: Fixed (was: ExternalDependency)
This seems to have gotten fixed unexpectedly as part of issue 769488, specifically change r505155 from https://chromium-review.googlesource.com/687832 "Enforce flush ordering between contexts in GPU process on macOS".

I'll update the external bug and close this one for now; we can revisit this separately if needed. (I'm still seeing fast/slow rendering in a GpuMemoryBuffer/AHardwareBuffer based prototype.)
Status: Assigned (was: Fixed)
This has started up again, caused by https://chromium-review.googlesource.com/c/chromium/src/+/696666.

Comment 27 by kbr@chromium.org, Oct 5 2017

Blockedon: 770924
It looks like the various blocking issues have been fixed. Is this still an issue?
Status: Fixed (was: Assigned)
Yes, closing. We've found workarounds for the issue.

If I understand it right, the problematic sequence was something like this:

- frame A's rendering is still in progress; for example, its final texture copy to GVR's drawing buffer hasn't been issued yet.

- frame B (the following frame) has completed WebGL preparation and is ready to issue commands

- frame B's GL commands get flushed to the GPU

- since both frame A and frame B share the same (virtualized) GL context, the GPU decides to complete the work for frame B now.

- frame A's drawing commands get issued (i.e. the texture copy to GVR); these get queued behind frame B's work on the GPU

The end result is that frames A and B complete at basically the same time.

The workarounds were to avoid prematurely flushing the next frame's drawing commands, and a followup aims to use the new GpuFence mechanism from http://crrev.com/680135 (not yet submitted) to explicitly separate the frames with a GL fence and a server wait.
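
That fence-plus-server-wait separation, sketched with standard GL ES 3.0 sync objects (illustrative only; the actual followup builds on the GpuFence mechanism referenced above):

  // After issuing frame A's final drawing commands:
  GLsync frame_a_done = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
  glFlush();

  // Before issuing frame B's drawing commands: a server-side wait, so the
  // GPU can't schedule frame B's work ahead of frame A's, while the CPU
  // stays free to keep preparing frame B.
  glWaitSync(frame_a_done, 0, GL_TIMEOUT_IGNORED);
  glDeleteSync(frame_a_done);
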
Labels: Test-Complete
Components: Blink>WebXR
