New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 646912 link

Starred by 11 users

Issue metadata

Status: Verified
Owner:
Closed: Sep 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Chrome
Pri: 2
Type: Bug



Sign in to add a comment

Chrome can leave renderers frozen after cancelled suspend attempt

Project Member Reported by derat@chromium.org, Sep 14 2016

Issue description

(Moving from http://crosbug.com/p/57040.)

There appears to be a race in Chrome's renderer freezing/suspending code, which I think may only be enabled on some systems (including samus).

In some cases, chromeos::RendererFreezer actually handles powerd's SuspendDone D-Bus signal *before* it handles the corresponding SuspendImminent signal, resulting in it thawing the not-frozen renderers, then freezing them, and then just leaving them frozen. Here's what that looks like with some additional logging enabled:

[2011:2011:0914/104629:VERBOSE1:update_display_configuration_task.cc(64)] OnDisplaysUpdated: new_display_state=SINGLE new_power_state=ALL_OFF flags=0 force_configure=0 display_count=1
[2011:2011:0914/104629:VERBOSE1:display_configurator.cc(217)] EnterState: display=SINGLE power=ALL_OFF
[2011:2011:0914/104629:ERROR:renderer_freezer.cc(61)] XXX RendererFreezer::SuspendDone
[2011:2011:0914/104629:ERROR:freezer_cgroup_process_manager.cc(168)] XXX FreezerCgroupProcessManager::ThawRenderers
[2011:3304:0914/104629:ERROR:freezer_cgroup_process_manager.cc(87)] XXX FileWorker::ThawRenderers start
[2011:2011:0914/104629:VERBOSE1:display_configurator.cc(877)] SetDisplayPower: power_state=ALL_ON flags=0, configure timer=Stopped
[2011:3304:0914/104629:ERROR:freezer_cgroup_process_manager.cc(98)] XXX FileWorker::ThawRenderers end: result=1
[2011:2011:0914/104629:ERROR:renderer_freezer.cc(135)] XXX RendererFreezer::OnThawRenderersComplete: success=1
[2011:2011:0914/104629:VERBOSE1:display_configurator.cc(1029)] OnConfigured: success=1 new_display_state=SINGLE new_power_state=ALL_OFF
[2011:2011:0914/104629:VERBOSE1:display_manager.cc(534)] OnNativeDisplaysChanged(0): # of current displays=1
[2011:2011:0914/104629:ERROR:renderer_freezer.cc(52)] XXX RendererFreezer::SuspendImminent
[2011:2011:0914/104629:ERROR:freezer_cgroup_process_manager.cc(161)] XXX FreezerCgroupProcessManager::FreezeRenderers
[2011:2011:0914/104629:VERBOSE1:display_color_manager_chromeos.cc(218)] No ICC file found with product id: 30e42e04 for display id: 13761487533244416
[2011:3304:0914/104629:ERROR:freezer_cgroup_process_manager.cc(73)] XXX FileWorker::FreezeRenderers start
[2011:3304:0914/104629:ERROR:freezer_cgroup_process_manager.cc(83)] XXX FileWorker::FreezeRenderers end
[2011:2011:0914/104630:VERBOSE1:update_display_configuration_task.cc(64)] OnDisplaysUpdated: new_display_state=SINGLE new_power_state=ALL_ON flags=0 force_configure=0 display_count=1
[2011:2011:0914/104630:VERBOSE1:display_configurator.cc(217)] EnterState: display=SINGLE power=ALL_ON
[2011:2011:0914/104630:VERBOSE1:arc_net_host_impl.cc(631)] New default network: /service/1
[2011:2011:0914/104630:VERBOSE1:arc_net_host_impl.cc(631)] New default network: /service/1
[2011:2011:0914/104630:VERBOSE1:display_configurator.cc(1029)] OnConfigured: success=1 new_display_state=SINGLE new_power_state=ALL_ON
[2011:2011:0914/104630:VERBOSE1:display_manager.cc(563)] OnNativeDisplaysChanged(1):ManagedDisplayInfo[13761487533244416] native bounds=0,0 2560x1700, size=2560x1700, scale=2.000000, overscan=0,0,0,0, rotation=0, ui-scale=1.000000, touchscreen=yes, input_devices=[5]
[2011:2011:0914/104630:VERBOSE1:display_color_manager_chromeos.cc(218)] No ICC file found with product id: 30e42e04 for display id: 13761487533244416

The problem appears to be that PowerManagerClientImpl doesn't actually call RendererFreezer::SuspendImminent synchronously when it sees the SuspendImminent signal. Instead, it waits until all of callbacks created by GetSuspendReadinessCallback() have been run asynchronously. It shouldn't do this if SuspendDone has already been received by then, but it does, and I think that that's when we handle these out-of-order.

I suspect that this bug has always been around.
 
I'm surprised we haven't been bitten by this earlier :-/  I think this can be fixed by setting suspend_is_pending_ to false in PowerManagerClient::SuspendDoneReceived 

Comment 2 by derat@chromium.org, Sep 14 2016

Yeah, something like that would probably work. I think I find the suspend_is_pending_ member a bit confusing (it really means "we still need to report suspend readiness"), so I might try to rework this class a bit. It's pretty complicated so I'm also going to see if it's possible to add tests for it.
Project Member

Comment 3 by bugdroid1@chromium.org, Sep 15 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/751cb04fca2fb7861de5ad396d2b08f18471fa9a

commit 751cb04fca2fb7861de5ad396d2b08f18471fa9a
Author: derat <derat@chromium.org>
Date: Wed Sep 14 23:56:35 2016

chromeos: Add suspend delay tests for PowerManagerClient.

Add initial unit tests for chromeos::PowerManagerClient's
suspend-delay-handling code. More coming soon.

BUG= 646912 

Review-Url: https://codereview.chromium.org/2345593002
Cr-Commit-Position: refs/heads/master@{#418716}

[modify] https://crrev.com/751cb04fca2fb7861de5ad396d2b08f18471fa9a/chromeos/chromeos.gyp
[modify] https://crrev.com/751cb04fca2fb7861de5ad396d2b08f18471fa9a/chromeos/dbus/power_manager_client.h
[add] https://crrev.com/751cb04fca2fb7861de5ad396d2b08f18471fa9a/chromeos/dbus/power_manager_client_unittest.cc

Project Member

Comment 4 by bugdroid1@chromium.org, Sep 15 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0ef935c2ea78249e96fb76fbc3bb91a79ab32afb

commit 0ef935c2ea78249e96fb76fbc3bb91a79ab32afb
Author: derat <derat@chromium.org>
Date: Thu Sep 15 14:32:29 2016

dbus: Mock MockObjectProxy::SetNameOwnerChangedCallback().

ObjectProxy's real method calls Bus::AssertOnOriginThread()
and causes in some log spam (but surprisingly doesn't
crash). Override the method in MockObjectProxy to prevent
this.

Also update chromeos::PowerManagerClientTest to expect this
call.

BUG= 646912 

Review-Url: https://codereview.chromium.org/2340683003
Cr-Commit-Position: refs/heads/master@{#418864}

[modify] https://crrev.com/0ef935c2ea78249e96fb76fbc3bb91a79ab32afb/chromeos/dbus/power_manager_client_unittest.cc
[modify] https://crrev.com/0ef935c2ea78249e96fb76fbc3bb91a79ab32afb/dbus/mock_object_proxy.h

Comment 5 by derat@chromium.org, Sep 15 2016

Status: Fixed (was: Started)
bugdroid is being lazy and/or broken, but this should be fixed by https://codereview.chromium.org/2340153002/ .
Project Member

Comment 6 by bugdroid1@chromium.org, Sep 15 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742

commit 7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742
Author: derat <derat@chromium.org>
Date: Thu Sep 15 16:50:31 2016

chromeos: Fix renderer-freezing race during aborted suspend.

Fix a (long-standing?) race where
chromeos::PowerManagerClient could notify RendererFreezer
that a suspend attempt was imminent *after* already telling
it that the attempt was done.

Also make PowerManagerClient not notify powerd about regular
or dark suspend readiness after it's already seen a
SuspendDone signal.

Add a bunch of unit tests exercising this code, too, and
rework the comments on some previously-added tests.

BUG= 646912 

Review-Url: https://codereview.chromium.org/2340153002
Cr-Commit-Position: refs/heads/master@{#418887}

[modify] https://crrev.com/7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742/chromeos/dbus/power_manager_client.cc
[modify] https://crrev.com/7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742/chromeos/dbus/power_manager_client_unittest.cc

Can we request merge of this to M54?

Comment 8 by derat@chromium.org, Sep 20 2016

Is a merge really necessary? The fix is somewhat tricky, and I think that this bug has been present for at least a year.

Comment 9 by derat@chromium.org, Oct 25 2016

Cc: agnescheng@chromium.org josa...@chromium.org
Labels: Merge-Request-54
It's likely that this bug is responsible for some of the M54 "Files app / Help / crosh not responding" reports in  issue 657515 . The fix has been out on M55 for a while now, so I think it's prudent to revisit merging it to M54.

Specifically, I'd be merging the non-test part of https://codereview.chromium.org/2340153002 (it also added tests, but those are dependent on additional changes and I'd like to simplify the merge).

Note that we'll also need to merge the non-test part of https://codereview.chromium.org/2403733003 to fix issue 648580, which was exposed by the fix for this bug.

It's not zero-risk (especially since the original fix exposed another bug), but I haven't heard of any problems since the second fix went in.

I think I'd prefer merging, since we otherwise have a bug with a known trigger (aborted suspend attempt completes before Chrome finishes preparing for suspend) that's already showing up in the wild. Note that the occurrence rate of this is probably hardware-dependent since it's a race.
Labels: -Merge-Request-54 M-54 ReleaseBlock-Stable Merge-Approved-54 OS-Chrome
ok, approving  this CL as per c#9 


Project Member

Comment 11 by bugdroid1@chromium.org, Oct 25 2016

Labels: -merge-approved-54 merge-merged-2840
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/49505227e435b61ed3d9a61974508d34ecc9f291

commit 49505227e435b61ed3d9a61974508d34ecc9f291
Author: derat <derat@chromium.org>
Date: Tue Oct 25 23:36:51 2016

chromeos: Fix renderer-freezing race during aborted suspend.

Fix a (long-standing?) race where
chromeos::PowerManagerClient could notify RendererFreezer
that a suspend attempt was imminent *after* already telling
it that the attempt was done.

Also make PowerManagerClient not notify powerd about regular
or dark suspend readiness after it's already seen a
SuspendDone signal.

(Unit tests omitted for merge to M54 since they depend on
additional changes.)

BUG= 646912 
NOTRY=true
NOPRESUBMIT=true
Review-Url: https://codereview.chromium.org/2340153002
Cr-Commit-Position: refs/heads/master@{#418887}
(cherry picked from commit 7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742)

Review-Url: https://codereview.chromium.org/2446423002
Cr-Commit-Position: refs/branch-heads/2840@{#774}
Cr-Branched-From: 1ae106dbab4bddd85132d5b75c670794311f4c57-refs/heads/master@{#414607}

[modify] https://crrev.com/49505227e435b61ed3d9a61974508d34ecc9f291/chromeos/dbus/power_manager_client.cc

Comment 12 by derat@chromium.org, Oct 25 2016

I'm merging the changes now:

https://codereview.chromium.org/2446423002/
https://codereview.chromium.org/2451013002/

To verify, please try to repro  issue 657515  in a build containing these changes: suspend and resume a few times and check that the Files app (and any existing tabs) remain operational.

Please also verify that you don't observe Chrome hangs or crashes after suspending and resume (and particularly after closing and opening the lid quickly to trigger cancelled suspend attempts).

If you see problems and are in dev mode, I'd be interested in seeing the output from running "cat /sys/fs/cgroup/freezer/chrome_renderers/to_be_frozen/freezer.state" on VT2.
Status: Verified (was: Fixed)
Not Reproduced on M 54.0.2840.79/8743.76.0 .Marking verified.
If problems reoccur will update the bug.

Comment 14 by derat@chromium.org, Oct 26 2016

Cc: rdevlin....@chromium.org fukino@chromium.org keta...@chromium.org xiy...@chromium.org abodenha@chromium.org rookrishna@chromium.org derat@chromium.org rohi...@chromium.org weifangsun@chromium.org piman@chromium.org dhadd...@chromium.org sdantul...@chromium.org r...@chromium.org
 Issue 657515  has been merged into this issue.
Project Member

Comment 15 by bugdroid1@chromium.org, Oct 27 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/49505227e435b61ed3d9a61974508d34ecc9f291

commit 49505227e435b61ed3d9a61974508d34ecc9f291
Author: derat <derat@chromium.org>
Date: Tue Oct 25 23:36:51 2016

chromeos: Fix renderer-freezing race during aborted suspend.

Fix a (long-standing?) race where
chromeos::PowerManagerClient could notify RendererFreezer
that a suspend attempt was imminent *after* already telling
it that the attempt was done.

Also make PowerManagerClient not notify powerd about regular
or dark suspend readiness after it's already seen a
SuspendDone signal.

(Unit tests omitted for merge to M54 since they depend on
additional changes.)

BUG= 646912 
NOTRY=true
NOPRESUBMIT=true
Review-Url: https://codereview.chromium.org/2340153002
Cr-Commit-Position: refs/heads/master@{#418887}
(cherry picked from commit 7e1b5a3c7d7ae30ed6ff862f6257ec617da9d742)

Review-Url: https://codereview.chromium.org/2446423002
Cr-Commit-Position: refs/branch-heads/2840@{#774}
Cr-Branched-From: 1ae106dbab4bddd85132d5b75c670794311f4c57-refs/heads/master@{#414607}

[modify] https://crrev.com/49505227e435b61ed3d9a61974508d34ecc9f291/chromeos/dbus/power_manager_client.cc

Sign in to add a comment