Chrome can leave renderers frozen after cancelled suspend attempt |
|||||||
Issue description(Moving from http://crosbug.com/p/57040.) There appears to be a race in Chrome's renderer freezing/suspending code, which I think may only be enabled on some systems (including samus). In some cases, chromeos::RendererFreezer actually handles powerd's SuspendDone D-Bus signal *before* it handles the corresponding SuspendImminent signal, resulting in it thawing the not-frozen renderers, then freezing them, and then just leaving them frozen. Here's what that looks like with some additional logging enabled: [2011:2011:0914/104629:VERBOSE1:update_display_configuration_task.cc(64)] OnDisplaysUpdated: new_display_state=SINGLE new_power_state=ALL_OFF flags=0 force_configure=0 display_count=1 [2011:2011:0914/104629:VERBOSE1:display_configurator.cc(217)] EnterState: display=SINGLE power=ALL_OFF [2011:2011:0914/104629:ERROR:renderer_freezer.cc(61)] XXX RendererFreezer::SuspendDone [2011:2011:0914/104629:ERROR:freezer_cgroup_process_manager.cc(168)] XXX FreezerCgroupProcessManager::ThawRenderers [2011:3304:0914/104629:ERROR:freezer_cgroup_process_manager.cc(87)] XXX FileWorker::ThawRenderers start [2011:2011:0914/104629:VERBOSE1:display_configurator.cc(877)] SetDisplayPower: power_state=ALL_ON flags=0, configure timer=Stopped [2011:3304:0914/104629:ERROR:freezer_cgroup_process_manager.cc(98)] XXX FileWorker::ThawRenderers end: result=1 [2011:2011:0914/104629:ERROR:renderer_freezer.cc(135)] XXX RendererFreezer::OnThawRenderersComplete: success=1 [2011:2011:0914/104629:VERBOSE1:display_configurator.cc(1029)] OnConfigured: success=1 new_display_state=SINGLE new_power_state=ALL_OFF [2011:2011:0914/104629:VERBOSE1:display_manager.cc(534)] OnNativeDisplaysChanged(0): # of current displays=1 [2011:2011:0914/104629:ERROR:renderer_freezer.cc(52)] XXX RendererFreezer::SuspendImminent [2011:2011:0914/104629:ERROR:freezer_cgroup_process_manager.cc(161)] XXX FreezerCgroupProcessManager::FreezeRenderers [2011:2011:0914/104629:VERBOSE1:display_color_manager_chromeos.cc(218)] No ICC file found with product id: 30e42e04 for display id: 13761487533244416 [2011:3304:0914/104629:ERROR:freezer_cgroup_process_manager.cc(73)] XXX FileWorker::FreezeRenderers start [2011:3304:0914/104629:ERROR:freezer_cgroup_process_manager.cc(83)] XXX FileWorker::FreezeRenderers end [2011:2011:0914/104630:VERBOSE1:update_display_configuration_task.cc(64)] OnDisplaysUpdated: new_display_state=SINGLE new_power_state=ALL_ON flags=0 force_configure=0 display_count=1 [2011:2011:0914/104630:VERBOSE1:display_configurator.cc(217)] EnterState: display=SINGLE power=ALL_ON [2011:2011:0914/104630:VERBOSE1:arc_net_host_impl.cc(631)] New default network: /service/1 [2011:2011:0914/104630:VERBOSE1:arc_net_host_impl.cc(631)] New default network: /service/1 [2011:2011:0914/104630:VERBOSE1:display_configurator.cc(1029)] OnConfigured: success=1 new_display_state=SINGLE new_power_state=ALL_ON [2011:2011:0914/104630:VERBOSE1:display_manager.cc(563)] OnNativeDisplaysChanged(1):ManagedDisplayInfo[13761487533244416] native bounds=0,0 2560x1700, size=2560x1700, scale=2.000000, overscan=0,0,0,0, rotation=0, ui-scale=1.000000, touchscreen=yes, input_devices=[5] [2011:2011:0914/104630:VERBOSE1:display_color_manager_chromeos.cc(218)] No ICC file found with product id: 30e42e04 for display id: 13761487533244416 The problem appears to be that PowerManagerClientImpl doesn't actually call RendererFreezer::SuspendImminent synchronously when it sees the SuspendImminent signal. Instead, it waits until all of callbacks created by GetSuspendReadinessCallback() have been run asynchronously. It shouldn't do this if SuspendDone has already been received by then, but it does, and I think that that's when we handle these out-of-order. I suspect that this bug has always been around.
,
Sep 14 2016
Yeah, something like that would probably work. I think I find the suspend_is_pending_ member a bit confusing (it really means "we still need to report suspend readiness"), so I might try to rework this class a bit. It's pretty complicated so I'm also going to see if it's possible to add tests for it.
,
Sep 15 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/751cb04fca2fb7861de5ad396d2b08f18471fa9a commit 751cb04fca2fb7861de5ad396d2b08f18471fa9a Author: derat <derat@chromium.org> Date: Wed Sep 14 23:56:35 2016 chromeos: Add suspend delay tests for PowerManagerClient. Add initial unit tests for chromeos::PowerManagerClient's suspend-delay-handling code. More coming soon. BUG= 646912 Review-Url: https://codereview.chromium.org/2345593002 Cr-Commit-Position: refs/heads/master@{#418716} [modify] https://crrev.com/751cb04fca2fb7861de5ad396d2b08f18471fa9a/chromeos/chromeos.gyp [modify] https://crrev.com/751cb04fca2fb7861de5ad396d2b08f18471fa9a/chromeos/dbus/power_manager_client.h [add] https://crrev.com/751cb04fca2fb7861de5ad396d2b08f18471fa9a/chromeos/dbus/power_manager_client_unittest.cc
,
Sep 15 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0ef935c2ea78249e96fb76fbc3bb91a79ab32afb commit 0ef935c2ea78249e96fb76fbc3bb91a79ab32afb Author: derat <derat@chromium.org> Date: Thu Sep 15 14:32:29 2016 dbus: Mock MockObjectProxy::SetNameOwnerChangedCallback(). ObjectProxy's real method calls Bus::AssertOnOriginThread() and causes in some log spam (but surprisingly doesn't crash). Override the method in MockObjectProxy to prevent this. Also update chromeos::PowerManagerClientTest to expect this call. BUG= 646912 Review-Url: https://codereview.chromium.org/2340683003 Cr-Commit-Position: refs/heads/master@{#418864} [modify] https://crrev.com/0ef935c2ea78249e96fb76fbc3bb91a79ab32afb/chromeos/dbus/power_manager_client_unittest.cc [modify] https://crrev.com/0ef935c2ea78249e96fb76fbc3bb91a79ab32afb/dbus/mock_object_proxy.h
,
Sep 15 2016
bugdroid is being lazy and/or broken, but this should be fixed by https://codereview.chromium.org/2340153002/ .
,
Sep 15 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742 commit 7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742 Author: derat <derat@chromium.org> Date: Thu Sep 15 16:50:31 2016 chromeos: Fix renderer-freezing race during aborted suspend. Fix a (long-standing?) race where chromeos::PowerManagerClient could notify RendererFreezer that a suspend attempt was imminent *after* already telling it that the attempt was done. Also make PowerManagerClient not notify powerd about regular or dark suspend readiness after it's already seen a SuspendDone signal. Add a bunch of unit tests exercising this code, too, and rework the comments on some previously-added tests. BUG= 646912 Review-Url: https://codereview.chromium.org/2340153002 Cr-Commit-Position: refs/heads/master@{#418887} [modify] https://crrev.com/7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742/chromeos/dbus/power_manager_client.cc [modify] https://crrev.com/7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742/chromeos/dbus/power_manager_client_unittest.cc
,
Sep 19 2016
Can we request merge of this to M54?
,
Sep 20 2016
Is a merge really necessary? The fix is somewhat tricky, and I think that this bug has been present for at least a year.
,
Oct 25 2016
It's likely that this bug is responsible for some of the M54 "Files app / Help / crosh not responding" reports in issue 657515 . The fix has been out on M55 for a while now, so I think it's prudent to revisit merging it to M54. Specifically, I'd be merging the non-test part of https://codereview.chromium.org/2340153002 (it also added tests, but those are dependent on additional changes and I'd like to simplify the merge). Note that we'll also need to merge the non-test part of https://codereview.chromium.org/2403733003 to fix issue 648580, which was exposed by the fix for this bug. It's not zero-risk (especially since the original fix exposed another bug), but I haven't heard of any problems since the second fix went in. I think I'd prefer merging, since we otherwise have a bug with a known trigger (aborted suspend attempt completes before Chrome finishes preparing for suspend) that's already showing up in the wild. Note that the occurrence rate of this is probably hardware-dependent since it's a race.
,
Oct 25 2016
ok, approving this CL as per c#9
,
Oct 25 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/49505227e435b61ed3d9a61974508d34ecc9f291 commit 49505227e435b61ed3d9a61974508d34ecc9f291 Author: derat <derat@chromium.org> Date: Tue Oct 25 23:36:51 2016 chromeos: Fix renderer-freezing race during aborted suspend. Fix a (long-standing?) race where chromeos::PowerManagerClient could notify RendererFreezer that a suspend attempt was imminent *after* already telling it that the attempt was done. Also make PowerManagerClient not notify powerd about regular or dark suspend readiness after it's already seen a SuspendDone signal. (Unit tests omitted for merge to M54 since they depend on additional changes.) BUG= 646912 NOTRY=true NOPRESUBMIT=true Review-Url: https://codereview.chromium.org/2340153002 Cr-Commit-Position: refs/heads/master@{#418887} (cherry picked from commit 7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742) Review-Url: https://codereview.chromium.org/2446423002 Cr-Commit-Position: refs/branch-heads/2840@{#774} Cr-Branched-From: 1ae106dbab4bddd85132d5b75c670794311f4c57-refs/heads/master@{#414607} [modify] https://crrev.com/49505227e435b61ed3d9a61974508d34ecc9f291/chromeos/dbus/power_manager_client.cc
,
Oct 25 2016
I'm merging the changes now: https://codereview.chromium.org/2446423002/ https://codereview.chromium.org/2451013002/ To verify, please try to repro issue 657515 in a build containing these changes: suspend and resume a few times and check that the Files app (and any existing tabs) remain operational. Please also verify that you don't observe Chrome hangs or crashes after suspending and resume (and particularly after closing and opening the lid quickly to trigger cancelled suspend attempts). If you see problems and are in dev mode, I'd be interested in seeing the output from running "cat /sys/fs/cgroup/freezer/chrome_renderers/to_be_frozen/freezer.state" on VT2.
,
Oct 26 2016
Not Reproduced on M 54.0.2840.79/8743.76.0 .Marking verified. If problems reoccur will update the bug.
,
Oct 26 2016
Issue 657515 has been merged into this issue.
,
Oct 27 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/49505227e435b61ed3d9a61974508d34ecc9f291 commit 49505227e435b61ed3d9a61974508d34ecc9f291 Author: derat <derat@chromium.org> Date: Tue Oct 25 23:36:51 2016 chromeos: Fix renderer-freezing race during aborted suspend. Fix a (long-standing?) race where chromeos::PowerManagerClient could notify RendererFreezer that a suspend attempt was imminent *after* already telling it that the attempt was done. Also make PowerManagerClient not notify powerd about regular or dark suspend readiness after it's already seen a SuspendDone signal. (Unit tests omitted for merge to M54 since they depend on additional changes.) BUG= 646912 NOTRY=true NOPRESUBMIT=true Review-Url: https://codereview.chromium.org/2340153002 Cr-Commit-Position: refs/heads/master@{#418887} (cherry picked from commit 7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742) Review-Url: https://codereview.chromium.org/2446423002 Cr-Commit-Position: refs/branch-heads/2840@{#774} Cr-Branched-From: 1ae106dbab4bddd85132d5b75c670794311f4c57-refs/heads/master@{#414607} [modify] https://crrev.com/49505227e435b61ed3d9a61974508d34ecc9f291/chromeos/dbus/power_manager_client.cc |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by chirantan@chromium.org
, Sep 14 2016