New issue
Advanced search Search tips

Issue 763113 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Sep 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 1
Type: Bug
Proj-XR



Sign in to add a comment

CompositorView callback degrades WebVR performance

Project Member Reported by klausw@chromium.org, Sep 7 2017

Issue description

This is a followup to issue 729164 "Android O - Presentation does not start if automatically triggered by inserting device into headset".

The core issue there seemed to be that surfaceRedrawNeededAsync were not being run reliably when entering VR, and amp@ fixed that in https://codereview.chromium.org/2930113002 aka r478698 .

However, a bisect of a WebVR performance degradation pointed to that change. Specifically, it seems that firing this callback triggers a performance decrease.

What do these callbacks do? It's showing as com.android.internal.view.SurfaceCallbackHelper$1 which isn't very illuminating.

To confirm this, I added a hack to schedule the callback with a 10-second delay when executed from setVisibility. Then, the WebVR test starts at fairly stable ~50fps, then sharply drops to ~32fps after the callback executes, and stays there.

The debug log also shows the connection. The FPS counter is a sliding window, so it doesn't drop instantly, but it starts dropping after the callback ran.

09-07 13:14:29.300  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=49.9975
09-07 13:14:29.320  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=50.1585
09-07 13:14:29.342  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=50.3492
09-07 13:14:29.362  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=50.0653
09-07 13:14:29.381  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=49.9488
09-07 13:14:29.400  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=49.8723
09-07 13:14:29.422  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=49.6537
09-07 13:14:29.444  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=49.2739
09-07 13:14:29.463  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=49.596
09-07 13:14:29.482  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=49.8321
09-07 13:14:29.490  8474  8474 I cr_CompositorView: running runnable com.android.internal.view.SurfaceCallbackHelper$1@b89cc29
09-07 13:14:29.503  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=49.1077
09-07 13:14:29.532  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=47.2197
09-07 13:14:29.563  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=45.2313
09-07 13:14:29.593  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=43.3302
09-07 13:14:29.624  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=41.127
09-07 13:14:29.656  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=39.1451
09-07 13:14:29.687  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=37.7095
09-07 13:14:29.717  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=36.6531
09-07 13:14:29.749  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=35.0179
09-07 13:14:29.779  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=33.6571
09-07 13:14:29.809  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=32.7046
09-07 13:14:29.840  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=32.4244
09-07 13:14:29.872  8474  9550 I chromium: [INFO:vr_shell_gl.cc(2240)] ;;; fps=32.3913

Let me know if you have suggestions on investigating this further.

Caveat: these numbers are from a prototype with a different rendering path than current WebVR. I'll separately run tests with a ToT version to confirm.
 
Screenshot_20170907-131434.png
4.0 MB View Download
Owner: khushals...@chromium.org
khushalsagar@, could you take a look? aelias@ thought you may know more about this, and thought that possibly something complicated like exposing a preexisting bug is going on here.

I've attached a trace. The callback trigger is about 2.1s into the trace, note how the "WebVR FPS" counter starts dropping and the "WebVR frame time" increasing proportionately, specifically the render time increases. (I'm pretty sure it's an instant drop, the gradual slope is due to using a sliding window average for these counters.)

That matches with increased waiting time for the GL fences to signal render completion in the VrShellGL thread. Everything else looks about unchanged at first glance.

This sounds like a GPU bottleneck - is it doing extra background work somewhere, or reconfiguring something to a lower-power mode or similar?
trace_wip-callback-slowdown.html
4.5 MB View Download
I've confirmed on a ToT build that this is also affected. See trace, framerate drops from ~26fps to ~17fps when the callback fires.

Test page: https://webvr.info/samples/test-slow-render.html?heavyGpu=1&cubeScale=0.3&workTime=4&standardSize=true&renderScale=1.6
trace_canary-callback-slowdown.html
3.9 MB View Download
Here's output from "adb shell dumpsys SurfaceFlinger", if I remember right the transition happens between -3.txt and -4.txt.

Here's a diff between the first and last ones:

$ diff -U 0 surfaceflinger-prototype-fastslow-1.txt surfaceflinger-prototype-fastslow-6.txt 
--- surfaceflinger-prototype-fastslow-1.txt	2017-09-07 14:51:26.944238779 -0700
+++ surfaceflinger-prototype-fastslow-6.txt	2017-09-07 14:51:41.772331134 -0700
@@ -15,2 +15,2 @@
-  < 1 frames: 1479.097 s (24.3%)
-  < 2 frames: 1556.204 s (25.6%)
+  < 1 frames: 1486.208 s (24.3%)
+  < 2 frames: 1563.942 s (25.6%)
@@ -22 +22 @@
-  7+ frames: 2835.809 s (46.6%)
+  7+ frames: 2835.809 s (46.4%)
@@ -165 +165 @@
-            mTexName=5 mCurrentTexture=0
+            mTexName=5 mCurrentTexture=1
@@ -170 +170 @@
-              default-size=[2560x84] default-format=1 transform-hint=04 frame-counter=3210
+              default-size=[2560x84] default-format=1 transform-hint=04 frame-counter=3211
@@ -173,2 +173,2 @@
-             >[00:0x7630232420] state=ACQUIRED 0x763023fa00 frame=3210 [  84x2560: 128,  1]
-              [01:0x7630232ce0] state=FREE     0x763023f500 frame=3209 [  84x2560: 128,  1]
+             >[01:0x7630232ce0] state=ACQUIRED 0x763023f500 frame=3211 [  84x2560: 128,  1]
+              [00:0x7630232420] state=FREE     0x763023fa00 frame=3210 [  84x2560: 128,  1]
@@ -199 +199 @@
-   type=0, hwcId=0, layerStack=0, (1440x2560), ANativeWindow=0x762e5c1010 (8:8:8:8), orient= 1 (type=00000003), flips=197335, isSecure=1, powerMode=2, activeConfig=0, numLayers=2
+   type=0, hwcId=0, layerStack=0, (1440x2560), ANativeWindow=0x762e5c1010 (8:8:8:8), orient= 1 (type=00000003), flips=198223, isSecure=1, powerMode=2, activeConfig=0, numLayers=2
@@ -219,2 +219,2 @@
-  last eglSwapBuffers() time: 47.188000 us
-  last transaction time     : 28.750000 us
+  last eglSwapBuffers() time: 53.698000 us
+  last transaction time     : 46.042000 us
@@ -231 +231 @@
-  events-delivered: 382787
+  events-delivered: 383676

surfaceflinger-prototype-fastslow-1.txt
21.8 KB View Download
surfaceflinger-prototype-fastslow-2.txt
21.8 KB View Download
surfaceflinger-prototype-fastslow-3.txt
21.8 KB View Download
surfaceflinger-prototype-fastslow-4.txt
21.8 KB View Download
surfaceflinger-prototype-fastslow-5.txt
21.8 KB View Download
surfaceflinger-prototype-fastslow-6.txt
21.8 KB View Download
Cc: -amp@chromium.org khushals...@chromium.org
Owner: klausw@chromium.org
Status: Started (was: Untriaged)
Quick update after a long chat with racarr@. 

surfaceRedrawNeededAsync seems to be a red herring. If I comment out the handler for that in CompositorSurfaceManager, there are no callbacks (events get run synchronously), and it's just always slow. Also, we seem to have had a WebVR performance regression on Android N also.

Current hypothesis is that there's some unintended extra hardware-level compositing or similar going on that consumes GPU performance. Failing to run the callback seems to have the effect that it leaves the compositing layer in an incomplete state and this may make it ineligible for compositing, so it's not spending resources on doing so.

Core issue seems to be that setting the 2D browser window to invisible isn't working properly:

+ Layer 0x762e31b000 (SurfaceView - org.chromium.chrome/org.chromium.chrome.browser.ChromeTabbedActivity#0)
dataspace=Default (0), pixelformat=RGBx_8888 alpha=1.000, flags=0x00000002, tr=[1.00, 0.00][0.00, 1.00]

This shows flags=0x2 which apparently means OPAQUE. It's supposed to have the 0x1 bit set for HIDDEN.

I'll look into the view hierarchy to see what's going on there.
Even if you hide the rest of the view hierarchy it doesn't seem to make a noticeable difference.

(Try adding this to enterVrWithCorrectWindowMode:
 View v = mActivity.getWindow().findViewById(R.id.menu_anchor_stub);
 v.setVisibility(View.INVISIBLE);
 View v1 = mActivity.getWindow().findViewById(R.id.action_bar_root);
 v1.setVisibility(View.INVISIBLE);
)
there might be another way to stop surfaceflinger composition, using the transparent region.  i think that (but cannot at the moment find the code for), if the layer's transparent region covers the entire layer, then surfaceflinger omits it from composition.  this would likely have the same effect as setting HIDDEN explicitly.

i don't know which mechanism was used before when this was working, but i mention it for completeness.

i also recall, quite some time ago, that in non-fullscreen mode, hiding the omnibox used to be sufficient to remove the app surface from surfaceflinger composition.  now, it isn't.  i think this change was intentional to reduce jank when showing the omnibox.  don't know if it's related to any of this either, but it sounds similar.
For comparison purposes, I've attached "adb shell dumpsys SurfaceFlinger" output from running Daydream Home. It's also showing a SurfaceView + Launcher layer pair, and the HWC properties look about the same. I'm not sure how to interpret this output, let me know if I'm missing something.
surfaceflinger-daydreamhome.txt
38.8 KB View Download
I've added mthiesse@'s changes from comment #5, using View.GONE instead of View.INVISIBLE, and on top of that added:

        mActivity.getWindow().setLayout(1024, 512);

It's still showing the slowdown (from 30fps to 20fps using ToT + delayed-callback hack).

The background is all black, and the VR view is clipped to this shrunken window. I wanted to sanity check that it's all in one window with no other content that it would be rendering on top of, and that GVR's drawing happens within this one window and not as a separate render path that would bypass normal rendering.

SurfaceFlinger dump attached for that experiment, not sure if it's very illuminating.

I also tried setLayout(0, 0) for a zero-sized window, then the screen is all black and "Display 0 HWC layers" is empty. 
surfaceflinger-bad-window-1024x512.txt
29.0 KB View Download
Status: WontFix (was: Started)
Sigh, this bug appears to be invalid. Apparently all that's happening is that the callback is required for the AndroidCompat.setSustainedPerformanceMode(mActivity, true); call to take effect, so all that's happening is that we're not using sustained performance mode when we were expecting it to be active.

If I manually toggle sustained performance mode while presenting, I see exactly the performance delta corresponding to the callback firing.
Components: Blink>WebXR

Sign in to add a comment