WebXR non-exclusive session causes renderer process segfault on L/N
Issue description
Starting a WebXR non-exclusive session on Android L or N causes a segfault in the renderer process. This was the cause of Issue 814367, where all WebXR tests that used non-exclusive sessions were failing. For whatever reason, this doesn't happen on the M bots, which is why it wasn't caught in the CQ.
Example log output:
02-21 03:38:35.625 23455 23471 E chromium: [ERROR:texture_manager.cc(2585)] [.Offscreen-For-WebGL-0x76b724e800]GL ERROR :GL_INVALID_VALUE : glTexImage2D: dimensions out of range
02-21 03:38:35.626 23424 23439 E chromium: [ERROR:XRWebGLDrawingBuffer.cpp(247)] Framebuffer incomplete
02-21 03:38:35.632 993 1315 I nanohub : osLog: [BMI160] accPower: on=1, state=3
02-21 03:38:35.633 993 1315 I nanohub : osLog: [BMI160] gyrSetRate: rate=409600, latency=2499584, state=4
02-21 03:38:35.642 23392 23392 I chromium: [INFO:CONSOLE(0)] "[.Offscreen-For-WebGL-0x76b724e800]GL ERROR :GL_INVALID_VALUE : glTexImage2D: dimensions out of range", source: file:///storage/emulated/0/chromium_tests_root/chrome/test/data/vr/e2e_test_files/html/generic_webxr_page.html (0)
02-21 03:38:35.649 23455 23471 E chromium: [ERROR:gles2_cmd_decoder.cc(4656)] [.Offscreen-For-WebGL-0x76b724e800]GL ERROR :GL_INVALID_FRAMEBUFFER_OPERATION : glClear: framebuffer incomplete
02-21 03:38:35.649 23455 23471 E chromium: [ERROR:texture_manager.cc(2585)] [.Offscreen-For-WebGL-0x76b724e800]GL ERROR :GL_INVALID_VALUE : glTexImage2D: dimensions out of range
--------- beginning of crash
02-21 03:38:35.659 23424 23439 F libc : Fatal signal 11 (SIGSEGV), code 1, fault addr 0x0 in tid 23439 (CrRendererMain)
02-21 03:38:35.659 523 523 W : debuggerd: handling request: pid=23424 uid=99052 gid=99052 tid=23439
02-21 03:38:35.683 993 1315 I nanohub : osLog: [BMI160] gyrSetRate: rate=409600, latency=2499584, state=3
02-21 03:38:35.685 993 1315 I nanohub : osLog: [BMI160] accSetRate: rate=409600, latency=2499584, state=12
02-21 03:38:35.687 993 1315 I nanohub : osLog: [BMI160] accSetRate: rate=409600, latency=2499584, state=3
02-21 03:38:35.727 23516 23516 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
02-21 03:38:35.727 23516 23516 F DEBUG : Build fingerprint: 'google/marlin/marlin:7.1.1/NMF26U/3562008:userdebug/dev-keys'
02-21 03:38:35.727 23516 23516 F DEBUG : Revision: '0'
02-21 03:38:35.728 23516 23516 F DEBUG : ABI: 'arm64'
02-21 03:38:35.728 23516 23516 F DEBUG : pid: 23424, tid: 23439, name: CrRendererMain >>> org.chromium.chrome:sandboxed_process0 <<<
02-21 03:38:35.728 23516 23516 F DEBUG : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
02-21 03:38:35.728 23516 23516 F DEBUG : x0 0000000000000000 x1 0000000000000001 x2 0000000000000000 x3 0000000000000007
02-21 03:38:35.728 23516 23516 F DEBUG : x4 0000000000000036 x5 00000076b5d96888 x6 000000769ba3b664 x7 725e4c51405e4b46
02-21 03:38:35.728 23516 23516 F DEBUG : x8 22e72869be838085 x9 22e72869be838085 x10 0000000000000036 x11 0000000000000000
02-21 03:38:35.728 23516 23516 F DEBUG : x12 0000000000000060 x13 00000076acd60a74 x14 0000000000000000 x15 2e8ba2e8ba2e8ba3
02-21 03:38:35.728 23516 23516 F DEBUG : x16 00000076ba9215b0 x17 00000076ba8c8600 x18 000000769710e5b8 x19 00000038e4573ff8
02-21 03:38:35.729 23516 23516 F DEBUG : x20 00000038e4574000 x21 000000769bbea5e0 x22 00000076b5d9a4e8 x23 0000000000000000
02-21 03:38:35.729 23516 23516 F DEBUG : x24 000000769af951d8 x25 0000000000000019 x26 00000076b5d9a4e8 x27 0000000000000001
02-21 03:38:35.729 23516 23516 F DEBUG : x28 000000769b511000 x29 00000076b5d96f00 x30 00000076988ed560
02-21 03:38:35.729 23516 23516 F DEBUG : sp 00000076b5d96e70 pc 00000076963d4684 pstate 0000000060000000
02-21 03:38:35.732 23516 23516 F DEBUG :
02-21 03:38:35.732 23516 23516 F DEBUG : backtrace:
02-21 03:38:35.732 23516 23516 F DEBUG : #00 pc 0000000000373684 /data/app/org.chromium.chrome-1/base.apk (offset 0x321a000)
02-21 03:38:35.732 23516 23516 F DEBUG : #01 pc 000000000288c55c /data/app/org.chromium.chrome-1/base.apk (offset 0x321a000)
,
Feb 21 2018
As a note, this also seems to happen on our FYI bot's locally attached Pixel, which is running O, so I'm not sure why this didn't repro locally on my Pixel running O...
,
Feb 21 2018
A few more things:
1. The FYI bot's local device actually got swapped recently to a Pixel XL.
2. I'm unable to repro locally on a Pixel flashed to the same OS build as the FYI bot and provisioned using the same provisioning script.
3. Thanks to Issue 812428, I'm unable to repro the issue on the swarming bots.
4. I was unable to repro on a Pixel XL with Canary and the WebXR magic window sample.
,
Feb 22 2018
This is repro-able in a hacky way by modifying the bot test spec config files to run the VR tests on the linux_android_rel_ng trybot. See https://chromium-review.googlesource.com/c/chromium/src/+/929522.
,
Feb 22 2018
I'm unable to get the tests to run properly on the debug K trybot. I ran the logcat output from the release K trybot through the Android stack tool, but since all the debug info is stripped out on that bot, it's not very useful:
signal 11 (SIGSEGV) at 0x00000000 (code=1), thread 25277 (CrRendererMain)
pid: 25263, tid: 25277, name: CrRendererMain >>> org.chromium.chrome:sandboxed_process0 <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 00000000
r0 00000000 r1 00000001 r2 00000000 r3 00000000
r4 7bbcd3e8 r5 75da38c0 r6 772283e0 r7 75da3928
r8 75da3b4c r9 75da3aa4 sl 75da3aa0 fp 00000000
ip 789baca9 sp 75da38c0 lr 7a106f27 pc 789baca8
Stack Trace:
RELADDR FUNCTION FILE:LINE
000ebca8 __aeabi_memset ??:0:0
01837f25 SkTSect<SkDCubic, SkDQuad>::binarySearchCoin(SkTSect<SkDQuad, SkDCubic>*, double, double, double*, double*) ??:0:0
,
Mar 2 2018
I managed to get a better stack trace now that swarming works again:
signal 11 (SIGSEGV), code 1, fault addr 0x0 in tid 5712 (CrRendererMain)
pid: 5697, tid: 5712, name: CrRendererMain >>> org.chromium.chrome:sandboxed_process0 <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
r0 00000000 r1 00000001 r2 00000000 r3 80000000
r4 4f64d3e8 r5 eceb8a90 r6 cd8e55b8 r7 eceb8af8
r8 eceb8d1c r9 eceb8c74 sl eceb8c70 fp 00000000
ip cfdd004d sp eceb8a90 lr d13f7de3 pc cfdd004c
Stack Trace:
RELADDR FUNCTION FILE:LINE
0168b04c GrGpuRTCommandBuffer::clearStencilClip(GrFixedClip const&, bool) ../../third_party/skia/src/image/SkImage.cpp:162:0
02cb2de1 blink::ImageLayerBridge::SetImage(scoped_refptr<blink::StaticBitmapImage>) ../../third_party/WebKit/Source/platform/graphics/gpu/ImageLayerBridge.cpp:53:55
,
Mar 3 2018
That's a much better lead than anything else I've found on this so far. (I've also been failing to repro this on my device.) Thanks for digging deeper on this! I just encountered something possibly related today, so I'm going to pull on that thread and see where I get.
,
Mar 15 2018
Apparently we're requesting a 2940 x 4173 frame buffer, which is larger than the max supported values.
,
Mar 15 2018
Good find! I'm going to guess that a single allocation of a buffer that large isn't going to push it over the edge, but the fact that we allocate several of them for a swap chain (and keep the magic window ones allocated when exclusive mode is started) probably adds up to simply too much allocation for the system. I'll put some clamps on these values and try to be a bit more intelligent about cleaning them up.
,
Mar 15 2018
What appears to be happening is that we try to run XRWebGLDrawingBuffer::Resize to go from a 0 x 0 buffer to 2940 x 4173. This fails here because the requested size is too large: https://cs.chromium.org/chromium/src/gpu/command_buffer/service/gles2_cmd_decoder.cc?q=gles2_cmd_decoder.cc&sq=package:chromium&dr&l=9078 . We then find out that the framebuffer is incomplete here: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/graphics/gpu/XRWebGLDrawingBuffer.cpp?q=xrwebgldrawingbuffer&sq=package:chromium&dr=CSs&l=247 , but we don't do anything about it. My guess is that we then later try to use the incomplete framebuffer, which causes the crash.
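To make the suspected failure mode concrete, here is a minimal sketch of a resize that reports failure so the caller can avoid using an incomplete framebuffer. It is not Chromium code: `DrawingBufferSketch`, `ResizeAndCheck`, and the `max_size` limit are hypothetical stand-ins, and the real allocation and completeness checks go through GL rather than a simple comparison.

```cpp
// Hypothetical stand-in for a drawing buffer; NOT XRWebGLDrawingBuffer.
struct DrawingBufferSketch {
  int max_size;  // Assumed per-dimension device limit (illustrative only).
  int width = 0;
  int height = 0;
  bool framebuffer_complete = false;

  // Returns false when the requested size cannot be allocated. The buffer is
  // then marked incomplete and must not be handed to later stages.
  bool Resize(int new_width, int new_height) {
    if (new_width <= 0 || new_height <= 0 || new_width > max_size ||
        new_height > max_size) {
      framebuffer_complete = false;
      return false;
    }
    width = new_width;
    height = new_height;
    framebuffer_complete = true;
    return true;
  }
};

// Caller-side pattern: act on the failure instead of ignoring it.
inline bool ResizeAndCheck(int max_size, int width, int height) {
  DrawingBufferSketch buffer{max_size};
  if (!buffer.Resize(width, height)) {
    return false;  // Skip presenting; the framebuffer is incomplete.
  }
  return buffer.framebuffer_complete;
}
```

With an assumed limit of 4096, a 2940 x 4173 request fails the check, so the caller knows not to use the buffer.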
,
Mar 15 2018
To summarize some offline discussion, the root cause appears to be some really, really weird behavior of setting the canvas width/height to 100%. When set to specific px values, e.g. 300 x 300, the offset width/height are correctly reported as such and the test passes. When we set the width/height to 100%, offset width is reported as 980, which is then multiplied by the DPR of 3 to get 2940. What's really strange is that screen.width reports 360 (the correct value for a Nexus 5) at all points during the test, so 100% should be converted to 360px...
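The width arithmetic above can be written out as a small sketch. `ToDevicePixels` is a hypothetical helper, not a Chromium or web platform function; the numbers are the ones quoted in this comment.

```cpp
// Hypothetical helper: converts a CSS-pixel dimension to device pixels
// using the device pixel ratio (DPR).
inline int ToDevicePixels(int css_pixels, int device_pixel_ratio) {
  return css_pixels * device_pixel_ratio;
}

// With the DPR of 3 quoted above:
//   ToDevicePixels(980, 3) == 2940  (offset width actually reported)
//   ToDevicePixels(360, 3) == 1080  (what screen.width would imply)
```

So the requested backbuffer width is nearly three times what a correctly resolved 100% width would have produced.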
,
Mar 15 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/69d3f0648a5962ea3a2a4d0e5244b2be57afbae1

commit 69d3f0648a5962ea3a2a4d0e5244b2be57afbae1
Author: Brandon Jones <bajones@chromium.org>
Date: Thu Mar 15 23:30:57 2018

Ensured XRWebGLLayers clamp their framebuffer size.

In some cases extreme output canvas sized were causing failed allocations and incomplete framebuffers, which made the ImageLayerBridge choke. This patch both clamps the backbuffer size to the max texture size and, if an incomplete framebuffer is detected, produces black 1x1 images for the ImageLayerBridge to consume instead of attempting to pass the texture that failed to allocate.

Bug: 814460
Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel;master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2
Change-Id: Idba50e08052767360423018e06bc65f1f87c4d14
Reviewed-on: https://chromium-review.googlesource.com/964796
Reviewed-by: Brian Sheedy <bsheedy@chromium.org>
Reviewed-by: Ian Vollick <vollick@chromium.org>
Commit-Queue: Brandon Jones <bajones@chromium.org>
Cr-Commit-Position: refs/heads/master@{#543549}

[add] https://crrev.com/69d3f0648a5962ea3a2a4d0e5244b2be57afbae1/third_party/WebKit/LayoutTests/xr/xrWebGLLayer_non_exclusive_adjust_size.html
[modify] https://crrev.com/69d3f0648a5962ea3a2a4d0e5244b2be57afbae1/third_party/WebKit/Source/platform/graphics/gpu/XRWebGLDrawingBuffer.cpp
[modify] https://crrev.com/69d3f0648a5962ea3a2a4d0e5244b2be57afbae1/third_party/WebKit/Source/platform/graphics/gpu/XRWebGLDrawingBuffer.h
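For readers who want the clamping idea in isolation, a minimal sketch follows. It is not the code from this commit: `Size` and `ClampToMaxTextureSize` are hypothetical names, the limit of 4096 used in the comments is only an assumed example value, and the 1x1 black fallback for incomplete framebuffers is not shown.

```cpp
#include <algorithm>

// Hypothetical size type for illustration only.
struct Size {
  int width;
  int height;
};

// Clamps each dimension independently to the range [1, max_texture_size].
// Assumption: per-dimension clamping without preserving aspect ratio; the
// actual patch may differ.
inline Size ClampToMaxTextureSize(Size requested, int max_texture_size) {
  return Size{
      std::min(std::max(requested.width, 1), max_texture_size),
      std::min(std::max(requested.height, 1), max_texture_size)};
}

// Example with an assumed limit of 4096: a 2940 x 4173 request becomes
// 2940 x 4096, which stays within the assumed limit.
```

Clamping keeps the allocation within what the device can provide, at the cost of rendering at a lower resolution than the page asked for.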
,
Mar 20 2018
Marking this as fixed since the clamping CL fixed the segfaults. I've filed Issue 823563 to track investigation/fixing of the root cause (incorrectly reported window size).
,
Comment 1 by bsheedy@chromium.org
, Feb 21 2018