Ensure lock is always shown after suspend if show lock on sleep pref is set |
||||||||||||
Issue descriptionThere are some reports that chrome lock screen never comes up. I wonder if there are some race conditions when chrome crashes, ie: 1 device is going to sleep/requested to show lock screen 2 chrome sends dbus message to show lock screen, chrome crashes very shortly after 3 session manager doesn't think we're in a lock state yet, so it restarts chrome as normal 4 session manager processes lock screen request 5 chrome starts back up without lock screen 6 user opens device a few hours later without lock screen The race is between 2-4 As a mitigating strategy we may want to check if the show lock on sleep pref is set and we're coming out of suspend, then the lock screen should be shown if it is not already. But we should try to see if there is and then and fix the underlying race.
,
Jul 26
Where are the reports? Do the logs indicate that Chrome actually crashed right after requesting that the screen be locked?
,
Jul 26
I don't have any hard reports on-hand, but this specific complaint has been surfacing consistently every once in a while (ie, the latest is https://www.google.com/tools/feedback/reports/85567837443).
,
Jul 26
[removing RVG; it's fine to link to feedback reports or other internal URLs as long as the URLs themselves don't reveal confidential information] The proposed mitigation sounds reasonable to me. As an actual fix, Chrome could support starting in a locked state. Issue 820644 is conveniently already assigned to you. :-)
,
Jul 26
+mnissler@ if he has any preference on the mitigation discussed below. rkc@ and I talked a bit offline about this, and we found at least one race condition. I inserted a LOG(FATAL) at [1], which led to lock not being shown after putting the lid down. The race is as follows: 1 chrome is sent a suspend event 2 chrome tells session manager it wants to lock 3 session manager tells chrome to lock If a crash happens between 1 and 2, then session manager is never told about the lock state and restarts chrome. The user then has full in-session access after resuming the device. I see two clear mitigation paths: 1) chrome listens for suspend done events, and if the lock screen is not shown then show it 2) session manager listens to suspend events; if chrome crashes after suspend has started we kick the user out of the session and go back to login - 1 has a better UX, as the user remains in-session (though they lost all session state due to the crash). - 1 seems a bit less reliable, ie, what if suspend event is not delivered to Chrome due to chrome starting late? What about two crashes in a row? - 2 seems a bit more robust - there are fewer moving parts - 2 has the disadvantage that even if the user does not have lock state pref, they will get kicked out of session. We can probably work around this by telling session manager about the preference value and it can change behavior as needed My preference is approach 2. 1: https://cs.chromium.org/chromium/src/chromeos/dbus/power_manager_client.cc?l=796&rcl=05e02f2c0db74dd5d08f9331033c9f4b7c0e7212
,
Jul 26
,
Jul 26
I suspect that approach 1 has a race of its own. It would probably be possible for the restarted Chrome to only register for SuspendDone after the signal has already been emitted, resulting in it not knowing that the system just resumed and not displaying the lock screen. Approach 2 also seems more robust to me, but I think that there's still a race there. Chrome could crash as the user is closing the lid (i.e. before suspend) and miss the SuspendImminent signal as a result. Probably session_manager could CHECK() if Chrome isn't running[1] when SuspendImminent is emitted or if Chrome exits at any point between SuspendImminent and SuspendDone. In other words, approach 2 with the added condition that Chrome needs to be already running at SuspendImminent. 1. Defined here as "Chrome has registered for D-Bus signals from powerd"
,
Jul 26
(You already pointed out the race in approach 1.) :-P
,
Sep 5
I need to think about this more, but here is my knee-jerk security perspective before I call it a day: 1. Anything that depends on different components going through a specific sequence of steps to finally bring up the screen lock is likely to be fragile. 2. Purely event-driven mechanisms are fragile since crashes may not leave around sufficient information to recover. You need to be able to tell the correct state the system should be in after a crash at any point just by looking at device state. In other words, I think we need an approach with a more fundamental guarantee that a crash at any point on the critical path between "user closes lid" and "everything agrees the screen is locked" can't leave the system in a state where we can't tell whether we're locked or not. To accomplish that, I think we generally need to switch to an approach that doesn't exclusively rely on events to trigger transitions, but give the decision makers a chance to inquire state to correctly recover from crashes. A couple ideas: 1. Make powerd track a sequence of states the device went through, tagged by monotonically increasing IDs. This allows other components to inquire and see whether they have missed a suspend cycle. 2. Centralize the decision whether we're locked or not either in Chrome or session_manager, but not share that responsibility across both of them. Let's assume we have #1 and #2. Then, whenever Chrome starts up, it can do two things: 1. Examine a persistent flag it keeps whether the last known state was locked or not. 2. Get unprocessed power state transitions from powerd and determine which of them it hasn't acted on yet (it needs to persist the most recent power state it completely processed). If there is anything unprocessed, it can replay these transitions to achieve the correct state. Full disclaimer: The above is what my brain produces at midnight, so take it with a grain of salt ;-)
,
Sep 13
The current powerd/session_manager/Chrome interactions around suspend/resume and screen-locking are already extremely baroque and hard to understand: https://chromium.googlesource.com/chromiumos/platform2/+/master/login_manager/#screen-locking https://chromium.googlesource.com/chromiumos/platform2/+/master/power_manager/docs/suspend_resume.md etc. I'm hesitant about anything that adds additional complexity. Maybe we can just do something simple here like making session_manager stop the session if it sees a Chrome crash within X seconds of a suspend/resume cycle. If that results in too many false positives, we should probably work harder to reduce browser crashes.
,
Sep 17
"The only thing that ever improved security is reducing complexity" - who/thomasdullien I would agree that we should probably not try to address this problem by adding complexity to the very mechanism that's enforcing (or trying to enforce) the property that we want. An unexpected login screen on a Chrome crash (because of a stopped session as per #11) is probably better than an unlocked screen.
,
Sep 17
IIUC, what Jacob is suggesting is - if session_manager sees suspend, sees a crash before seeing a resume, stop the session. What comment 11 is suggesting is, if session_manager sees suspend, sees a crash within X seconds (whether or not a resume has been seen), stop the session. I am not sure what the latter approach will buy us though? Is it trying to prevent the case where we have a suspend and a resume and Chrome misses both the events?
,
Sep 17
Yes, see #8. I want us to end the session if we see a Chrome crash X seconds before SuspendImminent or Y seconds after SuspendDone. It might be possible to handle all the corner cases here, but we'll have poor/nonexistent test coverage for it and it's likely to just regress later. I'm tired of investigating reports of systems not locking and want us to err on the side of security.
,
Sep 17
I completely understand and empathize with wanting to err on the side of security. However, would it not be possible to add specific tests here that test for a various bunch of these scenarios? I understand it may be hard with the current infrastructure but if we invested some time, we 'should' be able to inject a crash into Chrome at various stages (while artificially delaying the delivery of the SuspendIminent/SuspendDone signals to Chrome). I am a bit uncomfortable with simply not even trying to get a solution that covers the corner cases (which does seem possible here) and using a hammer to solve the problem instead. I am also not convinced that the hammer will actually solve all the problems. For example, how many seconds should X and Y be? We've seen delays of higher than 16s while locking on slower Chromebooks earlier. I am completely fine with going with the hammer for now - as long as we do keep a more elegant(?) solution as a long-term goal.
,
Sep 17
> We've seen delays of higher than 16s while locking on slower Chromebooks earlier. The delay here should be much, much shorter than that - it is only when chrome is in a transition state after receiving a suspend event but not telling session manager about it (see comment #6, 1-2). The 16s comes from loading the lock screen, which is after this transition period.
,
Sep 17
Right - my point however is that it was a number far exceeding what we had thought could be possible. Which is why there is always a slight risk with having just a 'x seconds' based approach. There could also be unknown deadlocks which may cause a Chrome to not receive the suspend imminent event, ever- hence making any crash afterwards still bring up an unlocked screen. I am not particularly familiar with the session_manager and powerd code though. If the consensus is that having a "stop session if crash within x seconds of SuspendImminent" will definitely stop every possible escape from the lockscreen due to this race, then sure, I'm fine with settling on that as the long term solution.
,
Sep 26
Our integration tests that try to log into Chrome don't consistently pass (issue 888520, issue 884454 , issue 880823 , issue 880818 , issue 877720 , issue 875263 , issue 865813, etc.). I think we'll have far worse reliability problems in tests that log in and then try to trigger Chrome crashes at precise instances while we're simultaneously suspending the system and locking its screen. I'm looking into making a session_manager change for this.
,
Oct 30
,
Oct 30
,
Oct 30
I'm working on this now. No need for an arbitrary RBS. :-P
,
Oct 30
I mostly put that out of habit. I usually put RBS on bugs that I want to make sure don't slip through the cracks - it helps me personally track them better.
,
Oct 30
derat@, which solution are working on?
I believe you're adding a check in session_manager that if there is a crash within x seconds of a suspend event (either SuspendImminent or SuspendDone) to restart the session.
Can we make the check be
1. after SuspendImminent but before SuspendDone always restart the session
2. xxx seconds after SuspendDone always restart the session
This will eliminate a potential interval between SuspendImminent and SuspendDone where we do not restart the session.
I think if this is implemented as a separate state machine complexity will remain low, ie, instead of trying to combine both states into a single enum, split them out, eg
Merged state machine:
* Idle -> SuspendImminent -> ReceivedChromeLock -> SuspendDone -> ReceivedChromeUnlock -> Idle
Split state machine:
* Awake -> SuspendImminent -> SuspendDone -> CrashBrieflyAfterSuspendDoneUntilTimeout -> Awake
* Unlocked -> ReceivedChromeLock -> ReceivedChromeUnlock -> Unlocked
Then if either state machine wants to restart session manager restarts. Splitting the state machines out avoids a bunch of different transitions and edge cases (ie, SuspendDone arrives at session_manager after ReceivedChromeUnlock).
,
Oct 30
#23: Your "Can we make the check be" conditions are exactly what I've written and am about to send to mnissler@* for review as soon as I verify them locally. :-) I'm planning to add some additional logic to end the session if Chrome crashes X seconds *before* SuspendImminent in a later change, but that's trickier due to session_manager's life choices. * or you, if you'd like!
,
Oct 30
Ah, perfect, thanks! The bit before suspend imminent also seems like a nice improvement.
,
Oct 30
> The bit before suspend imminent also seems like a nice improvement. I think it's required as part of this. session_manager and Chrome receive SuspendImminent simultaneously, so we need to handle the case where Chrome crashes quickly and session_manager receives SIGCHLD about Chrome before it receives SuspendImminent from powerd.
,
Oct 31
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform2/+/a02c44455923d917dac48d30bd067a9ac7f5fd41 commit a02c44455923d917dac48d30bd067a9ac7f5fd41 Author: Daniel Erat <derat@chromium.org> Date: Wed Oct 31 07:54:26 2018 login: Log explanation when choosing not to restart Chrome. Make session_manager log a human-readable explanation when it chooses not to restart Chrome (e.g. due to a crash while the screen is locked). BUG= chromium:867970 TEST=manual: sent SIGSEGV to chrome after locking the screen and saw "Ending session rather than restarting browser: screen is locked." in /var/log/messages Change-Id: I36ae45f75cc738fd5bda42c4d8bd7424789095ec Reviewed-on: https://chromium-review.googlesource.com/1308683 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: Dan Erat <derat@chromium.org> [modify] https://crrev.com/a02c44455923d917dac48d30bd067a9ac7f5fd41/login_manager/session_manager_interface.h [modify] https://crrev.com/a02c44455923d917dac48d30bd067a9ac7f5fd41/login_manager/mock_session_manager.h [modify] https://crrev.com/a02c44455923d917dac48d30bd067a9ac7f5fd41/login_manager/session_manager_impl_test.cc [modify] https://crrev.com/a02c44455923d917dac48d30bd067a9ac7f5fd41/login_manager/session_manager_service.cc [modify] https://crrev.com/a02c44455923d917dac48d30bd067a9ac7f5fd41/login_manager/session_manager_impl.cc [modify] https://crrev.com/a02c44455923d917dac48d30bd067a9ac7f5fd41/login_manager/session_manager_process_test.cc [modify] https://crrev.com/a02c44455923d917dac48d30bd067a9ac7f5fd41/login_manager/session_manager_impl.h
,
Oct 31
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform2/+/555de928024b54c421affe7488483ac4cb169556 commit 555de928024b54c421affe7488483ac4cb169556 Author: Daniel Erat <derat@chromium.org> Date: Wed Oct 31 07:54:27 2018 login: Stop session if Chrome crashes during/after suspend. Make session_manager end the user's session if a Chrome crash is observed during suspend (i.e. between powerd's SuspendImminent and SuspendDone D-Bus signals) or within 5 seconds of SuspendDone being received. This is intended to reduce the window of time where an inopportune Chrome crash can result in the screen not being locked before suspending. BUG= chromium:867970 TEST=added unit tests; also manually verified that the session ends if i kill chrome soon after resuming or when i modify chrome to CHECK(false) in ash::PowerEventObserver::SuspendImminent Change-Id: I2d2b7a60d8e7b4cddeda87afb63a493bac531f9a Reviewed-on: https://chromium-review.googlesource.com/1308684 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: Dan Erat <derat@chromium.org> [modify] https://crrev.com/555de928024b54c421affe7488483ac4cb169556/login_manager/session_manager_impl.cc [modify] https://crrev.com/555de928024b54c421affe7488483ac4cb169556/login_manager/session_manager_impl_test.cc [modify] https://crrev.com/555de928024b54c421affe7488483ac4cb169556/login_manager/session_manager_service.cc [modify] https://crrev.com/555de928024b54c421affe7488483ac4cb169556/login_manager/session_manager_impl.h
,
Nov 3
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform2/+/fe80d32ea23e9ba46303c663094422244642cb36 commit fe80d32ea23e9ba46303c663094422244642cb36 Author: Daniel Erat <derat@chromium.org> Date: Sat Nov 03 04:55:00 2018 login: Stop session for suspend after Chrome crash. Make session_manager stop the user session when it receives a SuspendImminent D-Bus signal from powerd within five seconds of restarting Chrome (typically due to a browser crash). This should further reduce the likelihood of a Chrome crash preventing the screen from being locked before the system suspends. BUG= chromium:867970 TEST=added unit test; also manually verified that session is stopped after killing chrome and then running powerd_dbus_suspend Change-Id: I3e75343ab7e556ebe4b6a235729d87efbbbf03cd Reviewed-on: https://chromium-review.googlesource.com/1312237 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: Jacob Dufault <jdufault@chromium.org> [modify] https://crrev.com/fe80d32ea23e9ba46303c663094422244642cb36/login_manager/mock_process_manager_service.h [modify] https://crrev.com/fe80d32ea23e9ba46303c663094422244642cb36/login_manager/session_manager_service.h [modify] https://crrev.com/fe80d32ea23e9ba46303c663094422244642cb36/login_manager/session_manager_impl_test.cc [modify] https://crrev.com/fe80d32ea23e9ba46303c663094422244642cb36/login_manager/session_manager_service.cc [modify] https://crrev.com/fe80d32ea23e9ba46303c663094422244642cb36/login_manager/process_manager_service_interface.h [modify] https://crrev.com/fe80d32ea23e9ba46303c663094422244642cb36/login_manager/session_manager_impl.cc [modify] https://crrev.com/fe80d32ea23e9ba46303c663094422244642cb36/login_manager/session_manager_impl.h
,
Nov 4
I think we can mostly close the remaining issue of the unlocked desktop potentially being briefly visible after resume (even though Chrome is already dead and the session ended at that point) if there's some way for session_manager to clear the framebuffer. Stephane or Dominik, do you have any suggestions there? Untested, but would something like "/sbin/frecon --clear 0x000000" work?
,
Nov 8
It wouldn't work generally, for example on platforms with AFBC we can't just clear the buffer like this, we also need to clear the metadata.
,
Nov 9
Got it; thanks. Well, let's see how things work with the existing changes -- they should prevent access to the unlocked desktop, at least. If people are still occasionally seeing a glimpse of the desktop on resume after a Chrome crash, we can look into adding more hacks^W carefully-designed changes to try to prevent that from happening. |
||||||||||||
►
Sign in to add a comment |
||||||||||||
Comment 1 by jdufault@chromium.org
, Jul 26