chrome://tracing hangs indefinitely if any renderers are in the Suspended state
Issue description

Chrome Version: 57.0.2951.0 dev (64-bit)
OS: ChromeOS Panther

What steps will reproduce the problem?
(1) Enable the MemoryCoordinator.
(2) Open lots of tabs, so that one or more tabs are caused to suspend.
(3) Open chrome://tracing.
(4) Hit Record.
(5) Kill all the suspended renderer processes.

What is the expected result?
Expect that at #4, the dialog to choose the tracing to perform is immediately displayed.

What happens instead?
The dialog is never displayed. When all the suspended renderers are killed at #5 the dialog suddenly pops up.

See also issue 675269 - while we are in this "hung" state between Record and the dialog popping up, closing the tracing window can lead to a browser crash.
,
Dec 20 2016
Some tips to help debug this: when you open the record dialog, the browser process, under the hood, starts and stops tracing very quickly. This warms up the categories in all processes and discovers the category names used to populate the checkboxes. The "Stop" is asynchronous: the browser process sends a TracingMsg_EndTracing IPC message and then waits for all the child processes to reply with a TracingHostMsg_EndTracingAck IPC. So I suspect that MC causes the child's ChildTraceMessageFilter::OnEndTracing() to somehow never complete and never send back the EndTracingAck, causing the browser to wait forever (or maybe until some long timeout is hit). Now the question is: what is not responding, and why? One candidate is that TraceLog::FlushInternal in turn posts a message to all task runners of the local process (well, to any task runner which has at least one event, which realistically means all task runners) and expects them to reply. So if any of the task runners is kept frozen, the child will never ack to the browser, which in turn keeps the tracing UI in that suspended state.
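The wait-for-all-acks behaviour described above can be modeled in a few lines. This is a minimal C++ sketch, not Chromium code: the class EndTracingAckTracker and all of its methods are hypothetical names, and the rule that a vanished child no longer needs to ack is an assumption chosen to match the observed behaviour (the dialog appears once the suspended renderers are killed).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <utility>

// Hypothetical model, not Chromium code: tracks which child processes still
// owe an acknowledgement after the browser asks them to stop tracing.
class EndTracingAckTracker {
 public:
  explicit EndTracingAckTracker(std::function<void()> on_all_acked)
      : on_all_acked_(std::move(on_all_acked)) {}

  // Models sending the stop request to every child in |child_ids|.
  void BeginStop(const std::set<int>& child_ids) {
    assert(!stop_in_progress_);  // Only one stop at a time in this sketch.
    pending_ = child_ids;
    stop_in_progress_ = true;
    MaybeFinish();
  }

  // Models receiving the ack from one child.
  void OnAck(int child_id) {
    pending_.erase(child_id);
    MaybeFinish();
  }

  // Assumption: a child whose process goes away no longer needs to ack.
  void OnChildGone(int child_id) {
    pending_.erase(child_id);
    MaybeFinish();
  }

  std::size_t pending_count() const { return pending_.size(); }

 private:
  // Runs the completion callback once, when no child is left to wait for.
  void MaybeFinish() {
    if (!stop_in_progress_ || !pending_.empty()) return;
    stop_in_progress_ = false;
    if (on_all_acked_) on_all_acked_();
  }

  std::function<void()> on_all_acked_;
  std::set<int> pending_;
  bool stop_in_progress_ = false;
};
```

With three children where one never calls OnAck(), the callback stays unfired no matter how long you wait; calling OnChildGone() for that child fires it immediately. There is deliberately no timeout in this model, which is the failure mode under discussion.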
,
Dec 20 2016
Thanks Primiano for the tips. They're really helpful. +tasak@ I'm not familiar with the actual suspend logic in the renderer (MC just calls SuspendRenderer() and ResumeRenderer()), but it seems that we should resume the scheduler when the renderer receives TracingMsg_EndTracing. There will be other messages for which we should also resume renderers.
,
Dec 20 2016
Ironically, TraceLog::FlushInternal has a timeout mechanism [1] to prevent what I described in #2 (waiting forever for a thread task runner to flush). However, the timeout is itself implemented as a PostDelayedTask. So if you are preventing delayed tasks from running, you are also killing the timeout mechanism that prevents this from happening in the first place :) [1] https://cs.chromium.org/chromium/src/base/trace_event/trace_log.cc?rcl=0&l=905
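To make the self-defeating timeout concrete, here is a small single-threaded C++ sketch with a virtual clock. It is an illustration under stated assumptions, not the real base::TaskRunner or TraceLog code: FakeTaskRunner, FlushState, StartFlushWithTimeout and the timeout value passed in are all hypothetical.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a task runner; time only moves via AdvanceBy().
class FakeTaskRunner {
 public:
  void PostDelayedTask(std::function<void()> task, int delay_ms) {
    queue_.push_back({now_ms_ + delay_ms, std::move(task)});
  }

  // While suspended, due tasks stay queued instead of running.
  void set_suspended(bool suspended) { suspended_ = suspended; }

  // Advances the virtual clock, then runs every due task (in posting order)
  // unless the runner is suspended.
  void AdvanceBy(int ms) {
    now_ms_ += ms;
    if (suspended_) return;
    std::vector<Entry> due;
    std::vector<Entry> remaining;
    for (Entry& entry : queue_) {
      if (entry.run_at_ms <= now_ms_) {
        due.push_back(std::move(entry));
      } else {
        remaining.push_back(std::move(entry));
      }
    }
    queue_ = std::move(remaining);
    for (Entry& entry : due) entry.task();
  }

 private:
  struct Entry {
    int run_at_ms;
    std::function<void()> task;
  };

  std::vector<Entry> queue_;
  int now_ms_ = 0;
  bool suspended_ = false;
};

struct FlushState {
  bool finished = false;
  bool timed_out = false;
};

// Models a flush whose only safety net is a delayed task on the same runner.
// |state| must outlive the posted task.
void StartFlushWithTimeout(FakeTaskRunner& runner, FlushState& state,
                           int timeout_ms) {
  runner.PostDelayedTask(
      [&state] {
        if (state.finished) return;
        state.finished = true;
        state.timed_out = true;
      },
      timeout_ms);
}
```

If the runner is suspended before the flush starts, advancing the clock well past the timeout leaves FlushState::finished false; the timeout only fires after set_suspended(false). A watchdog that has to survive suspension would need to run somewhere the suspension does not reach, which this sketch does not attempt.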
,
Dec 20 2016
Suspending renderers with MC seems broken and we need to make some changes in MC messaging between renderers and browser. I'll disable suspending tentatively.
,
Dec 20 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/b1536fd7e1fa4dd7559e2c12e290e79cf949e9fe

commit b1536fd7e1fa4dd7559e2c12e290e79cf949e9fe
Author: bashi <bashi@chromium.org>
Date: Tue Dec 20 04:49:30 2016

Disable suspending renderer when memory coordinator is enabled

Just calling SuspendRenderer() in OnMemoryStateChange() breaks many things. We need a reliable and consistent way to suspend renderers. Until we come up with a solution, disable renderer suspending when memory coordinator is enabled.

In follow-up CLs I'll add some tests to prevent this kind of regressions.

BUG=675735, 675811
Review-Url: https://codereview.chromium.org/2590073002
Cr-Commit-Position: refs/heads/master@{#439708}

[modify] https://crrev.com/b1536fd7e1fa4dd7559e2c12e290e79cf949e9fe/content/renderer/render_thread_impl.cc
,
Dec 21 2016
,
Dec 22 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/ff1e6242e014db17e8bb736741595705c81dc774

commit ff1e6242e014db17e8bb736741595705c81dc774
Author: tasak <tasak@google.com>
Date: Thu Dec 22 04:53:55 2016

Disable PurgeAndSuspend when MemoryCoordinator is enabled.

PurgeAndSuspend depends on OnMemoryStateChange(). However, always suspending renderer in OnMemoryStateChange(), it breaks many things. So only enabling such suspend / resume feature when only PurgeAndSuspend is enabled.

BUG=675735, 675811
Review-Url: https://codereview.chromium.org/2595813002
Cr-Commit-Position: refs/heads/master@{#440340}

[modify] https://crrev.com/ff1e6242e014db17e8bb736741595705c81dc774/content/renderer/render_thread_impl.cc
,
Jan 18 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/d224a9dddebc82b819515e4d9c35737c7a92e9d0

commit d224a9dddebc82b819515e4d9c35737c7a92e9d0
Author: tasak <tasak@google.com>
Date: Wed Jan 18 00:44:29 2017

Stop suspending renderer and changing purge interval to 20min

- Since SuspendRenderer sometimes causes very long style recalc and layout (c.f. youtube.com/), disable tab suspension and use default background task throttle.
- also changed time-to-resuspension to 20min. This means, purge background tab's memory every 20min (to be precise 20min+10sec).
- the original document about purge+suspend is https://docs.google.com/document/d/1EgLimgxWK5DGhptnNVbEGSvVn6Q609ZJaBkLjEPRJvI/edit?usp=sharing
- the difference between the original purge+suspend and purge+suspend with this patch is https://docs.google.com/document/d/1qIrXsi9BuoAgNu6lVGTcGhbAm2TLPRb_JZ0OhynLjR0/edit?usp=sharing

BUG=675735, 607077
Review-Url: https://codereview.chromium.org/2624063002
Cr-Commit-Position: refs/heads/master@{#444209}

[modify] https://crrev.com/d224a9dddebc82b819515e4d9c35737c7a92e9d0/chrome/browser/memory/tab_manager.cc
[modify] https://crrev.com/d224a9dddebc82b819515e4d9c35737c7a92e9d0/chrome/browser/memory/tab_manager_unittest.cc
[modify] https://crrev.com/d224a9dddebc82b819515e4d9c35737c7a92e9d0/content/renderer/render_thread_impl.cc
[modify] https://crrev.com/d224a9dddebc82b819515e4d9c35737c7a92e9d0/content/renderer/render_thread_impl.h
,
Jan 19 2017
Tested on Windows 7 using Chrome M57 (57.0.2986.0), following these steps: (1) Enabled the MemoryCoordinator flag. (2) Opened lots of tabs; none of the tabs crashed. (3) Opened chrome://tracing. (4) Hit Record, and a pop-up titled "record a new trace" was displayed. Screencast attached for reference. @bashi: Could you please check the attached screencast and let us know if we missed any steps in verifying the issue? Thanks!
,
Jan 19 2017
FWIW issue 675269 covers the MemoryCoordinator tab-suspend behaviour that triggered the hung renderer in this case, but that particular cause is incidental to this bug. This bug is for chrome://tracing not coping correctly with "hung" renderer processes.
,
Jan 27 2017
Not sure I have fully caught up on the situation. Are you saying that even after bashi reverted the suspend, you still experience tracing getting stuck?
,
Jan 27 2017
#10: Sorry, I missed the comments. Could you try to repro with ToT if you have time? FWIW, I've been using tracing on a custom Chromium build with tab suspending for a while and it works well. I'm not sure about the root cause of this bug, though.
,
Feb 10 2017
,
Feb 10 2017
,
Apr 11 2017
We gave up suspending tabs for now and this shouldn't occur anymore. Let me close this. Please re-open this if this is still happening on ToT.
Comment 1 by haraken@chromium.org
, Dec 20 2016