--in-process-gpu can cause "loop to OOM"-hang
Reported by
hu...@vewd.com,
Jan 4 2018
|
||
Issue descriptionChrome Version: 63 What steps will reproduce the problem? 1. Apply attached patch to simulate the "bad" timing. 2. $ content_shell --in-process-gpu 3. Close the window by clicking its X (CTRL+C doesn't reproduce the bug). What is the expected result? Browser closes. What happens instead? Browse gets stuck. htop shows that it keeps allocating more and more memory. After a minute it will get killed by the Linux OOM killer. Background: At Vewd (downstream chromium) this OOM happens randomly on automated tests that quickly spawns and shuts down a --in-process-gpu browser.
,
Jan 24 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4ce9b19bec721fedcdcd06f69829207979ac85b8 commit 4ce9b19bec721fedcdcd06f69829207979ac85b8 Author: Elly Fong-Jones <ellyjones@chromium.org> Date: Wed Jan 24 14:45:08 2018 Revert "Fix OOM crash when the in-process GPU thread starts slowly" This reverts commit 94ec4045f6d1f18b5c724c43efd0c0f7fc58c535. Reason for revert: Sheriffs think this introduced test crashes in content_browsertests and other browser test targets on debug builds. Every test in these targets crashes instantly like: [ RUN ] AccessibilityActionBrowserTest.IncrementDecrementActions [1/1146] AccessibilityActionBrowserTest.IncrementDecrementActions (CRASHED) [ RUN ] AccessibilityActionBrowserTest.CanvasGetImageScale [2/1146] AccessibilityActionBrowserTest.CanvasGetImageScale (CRASHED) This is a speculative revert as we haven't yet reproed locally. Original change's description: > Fix OOM crash when the in-process GPU thread starts slowly > > Background: > Sometimes (depending on the execution order of browser's > GPU vs IO thread), the IO thread is shutting down while > the GPU thread starts up. > > Problem: > BrowserIO can end up in an endless loop where it tries to > create GPU channels => Linux OOM: > > SendOutstandingReplies()'s first while-loop never exits. > > ~GpuProcessHost > GpuProcessHost::SendOutstandingReplies > BrowserGpuChannelHostFactory::EstablishRequest::OnEstablishedOnIO > BrowserGpuChannelHostFactory::EstablishRequest::EstablishOnIO > GpuProcessHost::EstablishGpuChannel // Queues a channel_request. > => SendOutstandingReplies sees another request => endless loop. > > This happens because of OnEstablishedOnIO's retry-logic. > > Solution: > (1) Run EstablishOnIO async. That gives > SendOutstandingReplies a chance to exit its loop. > (2) Synchronously Stop() the GPU thread before we destruct > its dependencies (here: tell it to stop starting). > > Q: What happens if we don't stop the GPU thread > before we destruct BrowserChildProcessHostImpl? > A: We get another race (and even a crash if we use > the bug description's timing): > > [15658:15698:FATAL:outgoing_broker_client_invitation.cc(52)] > 0 0x7fe6a522e12d base::debug::StackTrace::StackTrace() > 1 0x7fe6a522c02c base::debug::StackTrace::StackTrace() > 2 0x7fe6a52c51ea logging::LogMessage::~LogMessage() > 3 0x7fe69ef8262b mojo::edk::OutgoingBrokerClientInvitation > ::ExtractInProcessMessagePipe() > 4 0x7fe6a60592d8 content::ChildThreadImpl::Init() > 5 0x7fe6a605c032 content::ChildThreadImpl::ChildThreadImpl() > 6 0x7fe6a5eb94cf content::GpuChildThread::GpuChildThread() > 7 0x7fe6a5eb9e47 content::GpuChildThread::GpuChildThread() > 8 0x7fe6a5ee0b23 content::InProcessGpuThread::Init() > 9 0x7fe6a54c5b47 base::Thread::ThreadMain() > 10 0x7fe6a54a4161 base::(anonymous namespace)::ThreadFunc() > 11 0x7fe6ac4d56ba start_thread > 12 0x7fe691ab63dd clone > > Manual test: Follow the bug description's steps. > Automated test: Run content_browsertests. > > Bug: 799002 > Change-Id: Ic3e7234fbcf401f7788181eb0b558a238aebf3bc > Reviewed-on: https://chromium-review.googlesource.com/850358 > Commit-Queue: Hugo Holgersson <hugoh@vewd.com> > Reviewed-by: Dirk Pranke <dpranke@chromium.org> > Reviewed-by: Antoine Labour <piman@chromium.org> > Cr-Commit-Position: refs/heads/master@{#531505} TBR=boliu@chromium.org,dpranke@chromium.org,piman@chromium.org,hugoh@vewd.com Change-Id: Id9a9aedaa03a2ade4477f5a9662050e1f2137cd3 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: 799002 Reviewed-on: https://chromium-review.googlesource.com/883223 Reviewed-by: Elly Fong-Jones <ellyjones@chromium.org> Commit-Queue: Elly Fong-Jones <ellyjones@chromium.org> Cr-Commit-Position: refs/heads/master@{#531527} [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/content/browser/gpu/DEPS [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/content/browser/gpu/browser_gpu_channel_host_factory.cc [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/content/browser/gpu/gpu_process_host.cc [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/content/browser/gpu/gpu_process_host.h [delete] https://crrev.com/84d1bb7b2829ea3750e5ebb80e46989853aeb4c3/content/browser/gpu/in_process_gpu_thread_browsertests.cc [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/content/test/BUILD.gn [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/testing/buildbot/chromium.chromiumos.json [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/testing/buildbot/chromium.fyi.json [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/testing/buildbot/chromium.memory.json [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/testing/buildbot/filters/BUILD.gn [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/testing/buildbot/filters/mojo.fyi.viz.content_browsertests.filter [delete] https://crrev.com/84d1bb7b2829ea3750e5ebb80e46989853aeb4c3/testing/buildbot/filters/mus.content_browsertests.filter [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/testing/buildbot/filters/viz.content_browsertests.filter [modify] https://crrev.com/4ce9b19bec721fedcdcd06f69829207979ac85b8/testing/buildbot/test_suites.pyl
,
Jan 26 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa commit f5f8cf9fd888af13b6ca20af7d4d100d3c354efa Author: Hugo Holgersson <hugoh@vewd.com> Date: Fri Jan 26 12:27:08 2018 Reland: Fix OOM crash when the in-process GPU thread starts slowly This CL got reverted because Linux debug builds crashed when running browser tests. I reproduced the crashes locally and fixed them with: - "//content/public/gpu:gpu_sources", + "//content/public/gpu", > Background: > Sometimes (depending on the execution order of browser's > GPU vs IO thread), the IO thread is shutting down while > the GPU thread starts up. > > Problem: > BrowserIO can end up in an endless loop where it tries to > create GPU channels => Linux OOM: > > SendOutstandingReplies()'s first while-loop never exits. > > ~GpuProcessHost > GpuProcessHost::SendOutstandingReplies > BrowserGpuChannelHostFactory::EstablishRequest::OnEstablishedOnIO > BrowserGpuChannelHostFactory::EstablishRequest::EstablishOnIO > GpuProcessHost::EstablishGpuChannel // Queues a channel_request. > => SendOutstandingReplies sees another request => endless loop. > > This happens because of OnEstablishedOnIO's retry-logic. > > Solution: > (1) Run EstablishOnIO async. That gives > SendOutstandingReplies a chance to exit its loop. > (2) Synchronously Stop() the GPU thread before we destruct > its dependencies (here: tell it to stop starting). > > Q: What happens if we don't stop the GPU thread > before we destruct BrowserChildProcessHostImpl? > A: We get another race (and even a crash if we use > the bug description's timing): > > [15658:15698:FATAL:outgoing_broker_client_invitation.cc(52)] > 0 0x7fe6a522e12d base::debug::StackTrace::StackTrace() > 1 0x7fe6a522c02c base::debug::StackTrace::StackTrace() > 2 0x7fe6a52c51ea logging::LogMessage::~LogMessage() > 3 0x7fe69ef8262b mojo::edk::OutgoingBrokerClientInvitation > ::ExtractInProcessMessagePipe() > 4 0x7fe6a60592d8 content::ChildThreadImpl::Init() > 5 0x7fe6a605c032 content::ChildThreadImpl::ChildThreadImpl() > 6 0x7fe6a5eb94cf content::GpuChildThread::GpuChildThread() > 7 0x7fe6a5eb9e47 content::GpuChildThread::GpuChildThread() > 8 0x7fe6a5ee0b23 content::InProcessGpuThread::Init() > 9 0x7fe6a54c5b47 base::Thread::ThreadMain() > 10 0x7fe6a54a4161 base::(anonymous namespace)::ThreadFunc() > 11 0x7fe6ac4d56ba start_thread > 12 0x7fe691ab63dd clone > > Manual test: Follow the bug description's steps. > Automated test: Run content_browsertests. Bug: 799002 Change-Id: I1d61475b35abf11c98c6a6a21be6987f6eebdb0f Reviewed-on: https://chromium-review.googlesource.com/886483 Reviewed-by: Antoine Labour <piman@chromium.org> Reviewed-by: Dirk Pranke <dpranke@chromium.org> Commit-Queue: Hugo Holgersson <hugoh@vewd.com> Cr-Commit-Position: refs/heads/master@{#531989} [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/content/browser/gpu/DEPS [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/content/browser/gpu/browser_gpu_channel_host_factory.cc [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/content/browser/gpu/gpu_process_host.cc [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/content/browser/gpu/gpu_process_host.h [add] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/content/browser/gpu/in_process_gpu_thread_browsertests.cc [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/content/test/BUILD.gn [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/testing/buildbot/chromium.chromiumos.json [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/testing/buildbot/chromium.fyi.json [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/testing/buildbot/chromium.memory.json [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/testing/buildbot/filters/BUILD.gn [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/testing/buildbot/filters/mojo.fyi.viz.content_browsertests.filter [add] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/testing/buildbot/filters/mus.content_browsertests.filter [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/testing/buildbot/filters/viz.content_browsertests.filter [modify] https://crrev.com/f5f8cf9fd888af13b6ca20af7d4d100d3c354efa/testing/buildbot/test_suites.pyl
,
Jan 29 2018
|
||
►
Sign in to add a comment |
||
Comment 1 by bugdroid1@chromium.org
, Jan 24 2018