Viz Android GL Out Of Memory Error
Issue description

OS: Android
Bot: luci.chromium.try/android-kitkat-arm-rel
Test suite: content_browsertests
Tests: most of them
Example failing run: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/android-kitkat-arm-rel/33098

It appears that on this bot we run out of GL memory as GL error 1285 is "GL_OUT_OF_MEMORY 0x0505"

[FATAL:gl_context.cc(323)] Check failed: error == GL_NO_ERROR || error == GL_CONTEXT_LOST_KHR. GL error was: 1285
[ERROR:test_suite.cc(303)] Currently running: DomSerializerTests.SubResourceForElementsInNonHTMLNamespace
Searching for native crashes in: /b/swarming/w/itqgQo5o/tmpOXlAgX
Unknown Android release, consider passing --packed-lib.
Reading Android symbols from: /b/swarming/w/ir
Searching for Chrome symbols from within: /b/swarming/w/ir/out/Release/lib.unstripped:/b/swarming/w/ir/out/Release

Stack Trace:
RELADDR FUNCTION FILE:LINE
024bd9d9 logging::LogMessage::~LogMessage() ??:0:0
024891df gl::GLContext::MakeVirtuallyCurrent(gl::GLContext*, gl::GLSurface*) ??:0:0
02c9797b gpu::GLContextVirtual::MakeCurrent(gl::GLSurface*) ??:0:0
02cbd287 gpu::gles2::GLES2DecoderImpl::MakeCurrent() ??:0:0
02da7ee7 gpu::CommandBufferStub::MakeCurrent() ??:0:0
02da7d4f gpu::CommandBufferStub::OnMessageReceived(IPC::Message const&) ??:0:0
02dacd6d gpu::GpuChannel::HandleMessageHelper(IPC::Message const&) ??:0:0
02dabf33 gpu::GpuChannel::HandleMessage(IPC::Message const&) ??:0:0
02c79083 gpu::Scheduler::RunNextTask() ??:0:0
023aaaad base::internal::Invoker<base::internal::BindState<void (viz::TestLayerTreeFrameSink::*)(), base::WeakPtr<viz::TestLayerTreeFrameSink> >, void ()>::RunOnce(base::internal::BindStateBase*) ??:0:0
024b4921 base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) ??:0:0
024c50e1 base::MessageLoop::RunTask(base::PendingTask*) ??:0:0
024c534d base::MessageLoop::DeferOrRunPendingTask(base::PendingTask) ??:0:0
024c5545 base::MessageLoop::DoWork() ??:0:0
024c723d base::MessagePumpDefault::Run(base::MessagePump::Delegate*) ??:0:0
024c4d35 base::MessageLoop::Run(bool) ??:0:0
024d6efd base::RunLoop::Run() ??:0:0
0250659f base::Thread::Run(base::RunLoop*) ??:0:0
02506733 base::Thread::ThreadMain() ??:0:0
025284a3 base::(anonymous namespace)::ThreadFunc(void*) ??:0:0
0000d173 <UNKNOWN> /system/lib/libc.so
0000d30b <UNKNOWN>
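For reference, the decimal code in the log can be mapped back to its symbolic name. The following is a minimal sketch, not Chromium code: the constant names and both helper functions are hypothetical, and the numeric values are redeclared locally from the OpenGL ES headers so the snippet compiles without them. The second helper mirrors the condition quoted in the FATAL line.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Numeric values as defined by the OpenGL ES headers, redeclared locally
// (under hypothetical names) so this sketch needs no GL headers.
constexpr uint32_t kGlNoError = 0x0000;
constexpr uint32_t kGlInvalidEnum = 0x0500;
constexpr uint32_t kGlInvalidValue = 0x0501;
constexpr uint32_t kGlInvalidOperation = 0x0502;
constexpr uint32_t kGlOutOfMemory = 0x0505;
constexpr uint32_t kGlInvalidFramebufferOperation = 0x0506;
constexpr uint32_t kGlContextLostKhr = 0x0507;

// The log prints the error in decimal: 1285 == 0x0505.
static_assert(kGlOutOfMemory == 1285, "1285 is GL_OUT_OF_MEMORY");

// Hypothetical helper: maps a glGetError()-style code to its symbolic name.
std::string GlErrorName(uint32_t error) {
  switch (error) {
    case kGlNoError: return "GL_NO_ERROR";
    case kGlInvalidEnum: return "GL_INVALID_ENUM";
    case kGlInvalidValue: return "GL_INVALID_VALUE";
    case kGlInvalidOperation: return "GL_INVALID_OPERATION";
    case kGlOutOfMemory: return "GL_OUT_OF_MEMORY";
    case kGlInvalidFramebufferOperation:
      return "GL_INVALID_FRAMEBUFFER_OPERATION";
    case kGlContextLostKhr: return "GL_CONTEXT_LOST_KHR";
    default: return "unknown GL error";
  }
}

// Hypothetical helper mirroring the condition quoted in the FATAL line:
// only "no error" and "context lost" are tolerated.
bool PassesMakeCurrentCheck(uint32_t error) {
  return error == kGlNoError || error == kGlContextLostKhr;
}
```

With these helpers, GlErrorName(1285) yields "GL_OUT_OF_MEMORY" and PassesMakeCurrentCheck(1285) is false, which matches the failed check in the log.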
Jul 16
OUT_OF_MEMORY is misleading. It's thrown because we can't allocate a new buffer in the Android BufferQueue for our surface: the BufferQueue has already been torn down, so the error doesn't mean the allocation itself failed. This is probably a shutdown ordering issue. It only appears on K, so later OS versions may have become more lenient about these kinds of ordering issues. Will see if I can make the ordering more deterministic and prevent this.
Aug 7
Aug 7
Note that this is also responsible for the chrome_public_test_vr_apk failures on the KitKat bot.
Aug 7
Aug 19
I've spent a fair amount of time on this, and it's unfortunately a real puzzle: nothing about our GL command stream seems wrong, and changing which components shut down first/second doesn't seem to help. The error seems to pop up after we switch virtual contexts and restore state. Maybe in restoring state we're re-binding something that's been deleted, leading to this error? Adding glFinish at various points also makes this issue go away or become flaky, which is a sign of some sort of driver issue...

The cleanest repro case I've seen is:
1) We delete RenderWorker, causing glContext to be un-bound.
2) We switch to RenderCompositor, causing a real context switch and a full state restore.
3) We glFinish - no errors at this point.
4) RenderCompositor issues a glFlush - no-op.
5) We switch to DisplayCompositor, causing a virtual context switch (no actual switch) and a partial state restore.
6) We glFinish, triggering OOM.

This is really weird, as we switch to the real context and finish in (3), at which point there are no errors. The only thing we do between that point and the error is flush and restore state. I suspect something about the state restore is hitting a timing issue / driver bug in K. Maybe some external Android resource that's bound to EGL/etc. is being deleted, and when we try to restore various buffers/etc. we hit the issue? We don't appear to be restoring a framebuffer though, so not quite sure. Will keep looking on Monday.
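The glFinish-based bisection described above can be sketched roughly as follows. This is an illustration only, not the instrumentation actually used: GlHooks and ErrorBisector are hypothetical names, and the two callables stand in for glFinish() and glGetError() so the snippet compiles without GL. As noted in the comment above, inserting finishes changes timing and can hide the very error being chased, so an empty result is not proof that the sequence is correct.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for glFinish() and glGetError(); a real harness
// would bind these to the actual GL entry points.
struct GlHooks {
  std::function<void()> finish;
  std::function<uint32_t()> get_error;
};

// Records which labelled checkpoint first observes a pending GL error.
class ErrorBisector {
 public:
  explicit ErrorBisector(GlHooks hooks) : hooks_(std::move(hooks)) {}

  // Drains queued work, then reads and records the pending error (0 = none).
  uint32_t Checkpoint(const std::string& label) {
    hooks_.finish();
    const uint32_t error = hooks_.get_error();
    results_.push_back({label, error});
    return error;
  }

  // Label of the first checkpoint that saw a non-zero error, or "" if none.
  std::string FirstFailure() const {
    for (const auto& result : results_) {
      if (result.second != 0) return result.first;
    }
    return "";
  }

 private:
  GlHooks hooks_;
  std::vector<std::pair<std::string, uint32_t>> results_;
};
```

In the repro above, a checkpoint placed at step (3) would record no error and one placed at step (6) would record 0x0505, narrowing the cause to the flush and partial state restore in between.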
Aug 19
One side point: we only ever have one CompositorImpl for now, and we *never* actually cleanly tear down Chrome in the wild (we always just kill the process), so this is really a test-only issue. It may be fine to just have the tests kill the GPU process, preventing / hiding these issues.
Aug 22
Fix in flight.
Aug 23
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/7c6485d8afdd53f95b44ebc545351cafb835ecaa

commit 7c6485d8afdd53f95b44ebc545351cafb835ecaa
Author: Eric Karl <ericrk@chromium.org>
Date: Thu Aug 23 23:58:15 2018

Android OOP-D: Tear down display when going invisible

When Android goes invisible in OOP-D, it wasn't tearing down the display, which can lead to GL issues as we continue to use GL after the window (used to create the GL surface) is destroyed.

In order to tear down the display for Viz, we need to invalidate our root frame sink ID. This change refactors things so that we always invalidate the root frame sink ID on going invisible, and re-register it on becoming visible. This allows both viz/non-viz to share the same logic. As registering/unregistering isn't doing much in non-viz case, this doesn't add significant overhead there.

Bug: 863049
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel
Change-Id: I1589e402185fd9e2cdb007d3d8cd739f303ad48a
Reviewed-on: https://chromium-review.googlesource.com/1184376
Reviewed-by: Khushal <khushalsagar@chromium.org>
Reviewed-by: Tom Sepez <tsepez@chromium.org>
Commit-Queue: Eric Karl <ericrk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#585664}

[modify] https://crrev.com/7c6485d8afdd53f95b44ebc545351cafb835ecaa/components/viz/service/frame_sinks/root_compositor_frame_sink_impl.cc
[modify] https://crrev.com/7c6485d8afdd53f95b44ebc545351cafb835ecaa/components/viz/service/frame_sinks/root_compositor_frame_sink_impl.h
[modify] https://crrev.com/7c6485d8afdd53f95b44ebc545351cafb835ecaa/content/browser/renderer_host/compositor_impl_android.cc
[modify] https://crrev.com/7c6485d8afdd53f95b44ebc545351cafb835ecaa/content/browser/renderer_host/compositor_impl_android.h
[modify] https://crrev.com/7c6485d8afdd53f95b44ebc545351cafb835ecaa/services/viz/privileged/interfaces/compositing/display_private.mojom
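The shape of the change described in the commit message can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual Chromium implementation: FrameSinkRegistry, CompositorModel, SetVisible and HasDisplay are hypothetical names, and the registry is reduced to a set of IDs. The point it shows is that the root frame sink ID is invalidated on going invisible and re-registered on becoming visible, so nothing keeps using the display after the window is gone.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

using FrameSinkId = uint32_t;  // Simplified; hypothetical for this sketch.

// Hypothetical registry: a frame sink ID "has a display" only while
// it is registered here.
class FrameSinkRegistry {
 public:
  void Register(FrameSinkId id) { registered_.insert(id); }
  void Invalidate(FrameSinkId id) { registered_.erase(id); }
  bool IsRegistered(FrameSinkId id) const {
    return registered_.count(id) > 0;
  }

 private:
  std::set<FrameSinkId> registered_;
};

// Hypothetical compositor model: visibility changes drive registration,
// so the same logic applies regardless of what backs the registry.
class CompositorModel {
 public:
  CompositorModel(FrameSinkRegistry* registry, FrameSinkId root_id)
      : registry_(registry), root_id_(root_id) {}

  void SetVisible(bool visible) {
    if (visible == visible_) return;  // No change, nothing to do.
    visible_ = visible;
    if (visible_) {
      registry_->Register(root_id_);    // Becoming visible: re-register.
    } else {
      registry_->Invalidate(root_id_);  // Going invisible: tear down.
    }
  }

  bool HasDisplay() const { return registry_->IsRegistered(root_id_); }

 private:
  FrameSinkRegistry* registry_;
  FrameSinkId root_id_;
  bool visible_ = false;
};
```

Tying registration to visibility keeps a single code path for both cases mentioned in the commit message, at the cost of a register/invalidate pair on every visibility change.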
Aug 24
Comment 1 by ericrk@chromium.org, Jul 15
Status: Started (was: Untriaged)