New issue
Advanced search Search tips

Issue 889078 link

Starred by 2 users

Issue metadata

Status: WontFix
Owner: ----
Closed: Nov 25
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

Improve chrome.New's ability to determine cause of login failures

Project Member Reported by hidehiko@chromium.org, Sep 25

Issue description

Currently, in chrome package looks at whether cryptohome dir is mounted,
but there's a bug in chrome that prevented login from succeeding.
(cf crbug.com/888520,  crbug.com/884454 ).

It should be able to observe SessionManager.SessionStateChanged D-Bus signal instead.

 
Cc: apronin@chromium.org xiy...@chromium.org
Summary: Improve chrome.New's ability to determine cause of login failures (was: Observer SessionManager.SessionStateChanged signal from chrome package.)
Thanks for filing this!

I should mention that I don't understand the big picture of how Chrome login works (especially during testing), so I'm not completely sure that we'll be able to use SessionStateChanged to distinguish between Chrome and cryptohome bugs. I'll rename this bug to make that more obvious.

Here's the main flow of what chrome.Chrome.logIn does right now (at http://cs/chromeos_public/src/platform/tast-tests/src/chromiumos/tast/local/chrome/chrome.go?l=453). I just patterned this after what Telemetry and Autotest do.

a) wait for OOBE DevTools target
b) connect to OOBE DevTools target
c) wait for Oobe.loginForTesting
d) call Oobe.loginForTesting
e) wait for cryptohome user mount to appear
f) wait for OOBE DevTools target to disappear

Most of the test timeouts that I'm seeing happen at e). I'm wondering if we can insert an additional step between d) and e) that will identify issues within Chrome; right now I end up bothering Andrey too often about failures that aren't cryptohomed's fault.

I'm not aware of any documentation describing the interactions between Chrome, session_manager, and cryptohomed during login. I'll try adding some debug logging to Chrome to see if I can get a better idea of the order in which things happen.

Xiyuan, do you have any suggestions here?
Hmm. The SessionStateChanged signal appears to be emitted after the cryptohome mount appears, at least on my caroline device. Here's a combined log sequence from several sources:

2018/09/25 [10:28:22.431] Logging in as user "testuser@gmail.com"
[4343:4343:0925/102822.458503:ERROR:gaia_screen_handler.cc(780)] XXX GaiaScreenHandler::HandleCompleteLogin
^^^ Chrome sees Oobe.loginForTesting call
[4343:4343:0925/102822.461301:ERROR:cryptohome_client.cc(752)] XXX CryptohomeClient::MountEx
[4343:4343:0925/102822.466537:ERROR:cryptohome_client.cc(752)] XXX CryptohomeClient::MountEx
^^^ Chrome asks cryptohome to mount user dir, I think
2018/09/25 [10:28:22.494] Waiting for cryptohome /home/user/a15eb7996a42a661cec694d149dfcebefafcddd5
2018/09/25 [10:28:23.140] Waiting for OOBE to be dismissed
^^^ test has seen user dir
[4343:4343:0925/102823.666117:ERROR:user_session_manager.cc(586)] XXX UserSessionManager::StartSession
[4343:4343:0925/102823.673308:ERROR:user_session_manager.cc(1092)] XXX UserSessionManager::StartCrosSession
[4343:4343:0925/102823.673380:ERROR:session_manager_client.cc(197)] XXX SessionManagerClient::StartSession
^^^ Chrome tells session_manager about login
2018-09-25T10:28:23.683141-07:00 INFO session_manager[4313]: [INFO:session_manager_impl.cc(562)] Starting user session
2018-09-25T10:28:23.683396-07:00 INFO session_manager[4313]: [INFO:session_manager_impl.cc(570)] Emitting D-Bus signal SessionStateChanged:started
^^^ session_manager emits signal
2018/09/25 [10:28:24.136] Waiting for SessionStateChanged "started" D-Bus signal from session_manager
2018/09/25 [10:28:24.136] Got SessionStateChanged signal

I still don't know what happens when login fails, though. If there's a cryptohomed issue, will session_manager still emit SessionStateChanged without the user dir showing up, for instance?
Cc: achuith@chromium.org
+achuith in case he has suggestions too

ExistingUserController is the bridge between login screen UI and cryptohome. It has Login (offline login) and CompleteLogin (online). These two calls would create a LoginPerformer to delegate to auth request to CryptohomeAuthenticator and gets back the auth result. If all goo, ExistingUserController::OnAuthSuccess will be called and kick start a user session via UserSessionManager. Otherwise, ExistingUserController::OnAuthFailure will be called and we show some error message.

Interactions with cryptohomed are in CryptohomeAuthenticator. Interactions with session_manager daemon are mostly in UserSessionManager.

Not sure what we want to add between step d) and e) in #c1. We have various login time markers via LoginEventRecorder::AddLoginTimeMarker [1]. You can see the markers on device (cat /home/chronos/user/login-times). The markers depict different stages during login. We can follow that and see whether they are interesting. 

[1] https://cs.chromium.org/chromium/src/chromeos/login_event_recorder.cc?rcl=e69e0dfe287599fc21f26fc68869ab3f969f77e0&l=30
Status: WontFix (was: Untriaged)
The login-times file seems interesting, but based on its location in the user's encrypted home dir, it seems like it's unlikely to be available when debugging these failures.

I think that this situation is better now: there was only one "not mounted" error on release builders in the last 14 days, in the eve-release build at http://stainless/browse/chromeos-autotest-results/260797740-chromeos-test/ . Judging from all of the tpm_managerd and cryptohomed crashes there, it was likely an issue with chapsd not running rather than a login issue.

I'm going to WontFix this since I think that most of the causes of these failures have been fixed now. I'm hopeful that things will improve further after  issue 901363  is fixed, too.

Sign in to add a comment