New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 884454 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Oct 16
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug-Regression



Sign in to add a comment

Many Autotest and Tast UI test failures on CQ due to Chrome login not succeeding

Project Member Reported by derat@chromium.org, Sep 15

Issue description

There were a bunch of failures in the ui.ChromeLogin Tast test on paladin builders in R71-11066.0.0-rc2: http://stainless/search?exclude_retried=true&first_date=2018-09-11&master_builder_name=&builder_name_number=&shard=&exclude_acts=true&builder_name=-paladin%24&master_builder_name_number=&owner=&retry=&exclude_cts=false&exclude_non_production=true&hostname=&board=&test=%5Etast%5C.ui%5C.ChromeLogin%24&suite=&build=%5ER71%5C-11066%5C.0%5C.0%5C-rc2%24&status=FAIL&status=ERROR&status=ABORT&reason=&waterfall=&exclude_not_run=true&last_date=2018-09-15&exclude_non_release=false&exclude_au=true&model=&view=list

They all failed with "Chrome login failed: /home/user/b6f6fc1fb54359f6e0567f6cadbfcb6590a672a4 not mounted: context deadline exceeded (last error: doesn't exist)".

I'll dig into the logs. I don't think there have been any recent changes to the tast or tast-tests repositories that could've affected login, so I suspect it's a real problem in the system image.
 
Hmm, it's weirdly inconsistent. In the edgar run at http://stainless/browse/chromeos-autotest-results/238049885-chromeos-test/, ui.ChromeCrashLoggedIn (which also logs in) passed but then ui.ChromeLogin failed shortly afterward:

2018/09/14 20:00:16 Started test ui.ChromeCrashLoggedIn
2018/09/14 20:00:16 [20:00:15.674] Restarting ui job
2018/09/14 20:00:18 [20:00:17.586] Waiting for org.chromium.SessionManager D-Bus service
2018/09/14 20:00:18 [20:00:17.619] Asking session_manager to enable Chrome testing
2018/09/14 20:00:18 [20:00:17.622] Waiting for Chrome to write its debugging port to /home/chronos/DevToolsActivePort
2018/09/14 20:00:19 [20:00:19.287] Checking cryptohomed service
2018/09/14 20:00:19 [20:00:19.289] Removing cryptohome for testuser@gmail.com
2018/09/14 20:00:20 [20:00:19.366] Finding OOBE DevTools target
2018/09/14 20:00:20 [20:00:19.810] Connecting to Chrome at ws://127.0.0.1:36436/devtools/page/4595B468395DE7BBC7CDC8F3A54E2685
2018/09/14 20:00:20 [20:00:20.079] Waiting for OOBE
2018/09/14 20:00:26 [20:00:25.615] Logging in as user "testuser@gmail.com"
2018/09/14 20:00:26 [20:00:25.863] Waiting for cryptohome /home/user/b6f6fc1fb54359f6e0567f6cadbfcb6590a672a4
2018/09/14 20:00:30 [20:00:30.109] Waiting for OOBE to be dismissed
2018/09/14 20:00:34 [20:00:33.600] Sending SIGSEGV to root Chrome process 28017
2018/09/14 20:00:34 [20:00:33.616] Waiting for 18 Chrome process(es) to exit
2018/09/14 20:00:34 [20:00:34.130] All Chrome processes exited
2018/09/14 20:00:34 [20:00:34.131] Found expected Chrome minidump file /home/chronos/crash/a8b65990-a213-4c27-84680ebd-6c0f10e8.dmp
2018/09/14 20:00:34 [20:00:34.131] Removing new crash file /home/chronos/crash/a8b65990-a213-4c27-84680ebd-6c0f10e8.dmp
2018/09/14 20:00:34 Completed test ui.ChromeCrashLoggedIn in 18.465s with 0 error(s)
...
2018/09/14 20:00:46 Started test ui.ChromeLogin
2018/09/14 20:00:46 [20:00:45.558] Restarting ui job
2018/09/14 20:00:47 [20:00:46.737] Waiting for org.chromium.SessionManager D-Bus service
2018/09/14 20:00:47 [20:00:46.758] Asking session_manager to enable Chrome testing
2018/09/14 20:00:47 [20:00:46.771] Waiting for Chrome to write its debugging port to /home/chronos/DevToolsActivePort
2018/09/14 20:00:49 [20:00:48.630] Checking cryptohomed service
2018/09/14 20:00:49 [20:00:48.631] Removing cryptohome for testuser@gmail.com
2018/09/14 20:00:49 [20:00:48.779] Finding OOBE DevTools target
2018/09/14 20:00:49 [20:00:49.026] Connecting to Chrome at ws://127.0.0.1:44678/devtools/page/A740E408F9055FAED6058FA2EADD3232
2018/09/14 20:00:50 [20:00:49.656] Waiting for OOBE
2018/09/14 20:00:56 [20:00:55.478] Logging in as user "testuser@gmail.com"
2018/09/14 20:00:56 [20:00:55.668] Waiting for cryptohome /home/user/b6f6fc1fb54359f6e0567f6cadbfcb6590a672a4
2018/09/14 20:02:43 [20:02:42.765] cryptohome status:
{
   "installattrs": {
      "first_install": true,
      "initialized": true,
      "invalid": false,
      "lockbox_index": 536870916,
      "lockbox_nvram_version": 2,
      "secure": true,
      "size": 0,
      "version": 1
   },
   "mounts": [ {
      "enterprise": false,
      "keysets": [  ],
      "mounted": false,
      "owner": "b6f6fc1fb54359f6e0567f6cadbfcb6590a672a4",
      "type": "none"
   } ],
   "tpm": {
      "being_owned": false,
      "can_connect": true,
      "can_decrypt": false,
      "can_encrypt": false,
      "can_load_srk": true,
      "can_load_srk_pubkey": true,
      "enabled": true,
      "has_context": true,
      "has_cryptohome_key": false,
      "has_key_handle": false,
      "last_error": 0,
      "owned": true,
      "srk_vulnerable_roca": true
   }
}
2018/09/14 20:02:43 [20:02:42.797] /sys/class/tpm/tpm0/device/owned contains "1"
2018/09/14 20:02:43 [20:02:42.799] Error at chrome_login.go:72: Chrome login failed: /home/user/b6f6fc1fb54359f6e0567f6cadbfcb6590a672a4 not mounted: context deadline exceeded (last error: doesn't exist)
Cc: xiy...@chromium.org achuith@chromium.org
Labels: Hotlist-TreeCloser
Owner: jdufault@chromium.org
Summary: ui.ChromeLogin test failures on CQ due to Chrome login not succeeding (was: ui.ChromeLogin test failures on CQ with due to cryptohome dir not showing up)
Here's part of Chrome's log from the successful test:

[28017:28017:0914/200025.843168:VERBOSE1:gaia_screen_handler.cc(780)] HandleCompleteLogin
[28017:28017:0914/200025.844086:VERBOSE1:existing_user_controller.cc(574)] Setting flow from PerformLogin
[28017:28017:0914/200025.844151:VERBOSE1:user_flow.cc(21)] Flow 0xf1e41275440 got host 0xf1e3f98edc0
[28017:28017:0914/200025.844610:VERBOSE1:login_performer.cc(287)] Online login completion started.
[28017:28017:0914/200025.845236:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 0
[28017:28017:0914/200025.854688:VERBOSE1:gaia_screen_handler.cc(1171)] LoadAuthExtension, force: 1, offline: 0
[28017:28017:0914/200025.871998:ERROR:device_event_log_impl.cc(159)] [20:00:25.871] Login: cryptohome_util.cc:131 GetKeyDataEx failed with no GetKeyDataReply extension in reply.
[28017:28017:0914/200025.903455:ERROR:device_event_log_impl.cc(159)] [20:00:25.903] Login: cryptohome_authenticator.cc:147 MountEx failed. Error: 32
[28017:28017:0914/200025.903558:ERROR:device_event_log_impl.cc(159)] [20:00:25.903] Login: cryptohome_authenticator.cc:971 Cryptohome failure: state(AuthState)=1, code(cryptohome::MountError)=32
[28017:28017:0914/200025.903604:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 6
[28017:28017:0914/200026.384357:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: oobe, webui_visible_: 0
[28017:28017:0914/200026.384755:VERBOSE1:webui_login_view.cc(608)] Login WebUI >> login-prompt-visible
[28017:28017:0914/200026.614179:INFO:signin_screen_handler.cc(1483)] Login WebUI >> active: 1, source: gaia-signin
[28017:28017:0914/200027.233555:VERBOSE1:gaia_screen_handler.cc(621)] Auth extension finished loading
[28017:28017:0914/200027.338360:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: gaia-loading, webui_visible_: 1
[28017:28017:0914/200029.771131:VERBOSE1:wizard_controller.cc(1935)] Resolved local timezone={dstOffset=3600.000000, rawOffset=-28800.000000, timeZoneId='America/Los_Angeles', timeZoneName='Pacific Daylight Time', error_message='', status=0 (OK)}.
[28017:28017:0914/200029.771493:VERBOSE1:wizard_controller.cc(1957)] Resolve TimeZone: setting timezone to 'America/Los_Angeles'
[28017:28017:0914/200031.375158:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 13
[28017:28017:0914/200031.375265:VERBOSE1:cryptohome_authenticator.cc(701)] Login success
[28017:28017:0914/200031.375573:VERBOSE1:login_performer.cc(87)] LoginSuccess hash: b6f6fc1fb54359f6e0567f6cadbfcb6590a672a4
[28017:28017:0914/200031.375625:VERBOSE1:user_session_manager.cc(590)] Starting user session.
[28017:28017:0914/200031.472821:VERBOSE1:input_events_blocker.cc(19)] InputEventsBlocker 0xf1e41a6e300 created.
[28017:28017:0914/200031.473110:VERBOSE1:arc_util.cc(196)] ARC is not available.
[28017:28017:0914/200031.521585:ERROR:input_method_manager_impl.cc(1113)] IMEEngine for "jkghodnilhceideoidjikpgommlajknk" is not registered
[28017:28017:0914/200031.521846:VERBOSE1:user_session_manager.cc(731)] Setting first login prefs
[28017:28017:0914/200031.522274:VERBOSE1:user_session_manager.cc(1244)] Seed IdentityManager and SigninManagerBase with the authenticated account info, success=1
[28017:28017:0914/200031.771567:VERBOSE1:input_events_blocker.cc(28)] InputEventsBlocker 0xf1e41a6e300 destroyed.
[28017:28017:0914/200032.267247:INFO:chrome_cryptauth_service.cc(135)] Primary account with refresh token not yet available; waiting before starting CryptAuth managers.
[28017:28017:0914/200032.526366:VERBOSE1:input_events_blocker.cc(19)] InputEventsBlocker 0xf1e42332330 created.
[28017:28017:0914/200032.536652:VERBOSE1:user_session_manager.cc(1989)] Launching browser...
[28017:28017:0914/200032.771158:VERBOSE1:input_events_blocker.cc(28)] InputEventsBlocker 0xf1e42332330 destroyed.
EOF

And here's the same portion from the failing test:

[29528:29528:0914/200055.641710:VERBOSE1:gaia_screen_handler.cc(780)] HandleCompleteLogin
[29528:29528:0914/200055.667078:VERBOSE1:existing_user_controller.cc(574)] Setting flow from PerformLogin
[29528:29528:0914/200055.667166:VERBOSE1:user_flow.cc(21)] Flow 0xca57ea78510 got host 0xca57c828700
[29528:29528:0914/200055.673152:VERBOSE1:login_performer.cc(287)] Online login completion started.
[29528:29528:0914/200055.693068:ERROR:device_event_log_impl.cc(159)] [20:00:55.693] Login: cryptohome_util.cc:131 GetKeyDataEx failed with no GetKeyDataReply extension in reply.
[29528:29528:0914/200055.693232:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 0
[29528:29528:0914/200055.718446:ERROR:device_event_log_impl.cc(159)] [20:00:55.718] Login: cryptohome_authenticator.cc:147 MountEx failed. Error: 32
[29528:29528:0914/200055.718550:ERROR:device_event_log_impl.cc(159)] [20:00:55.718] Login: cryptohome_authenticator.cc:971 Cryptohome failure: state(AuthState)=1, code(cryptohome::MountError)=32
[29528:29528:0914/200055.718595:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 6
[29528:29528:0914/200100.085979:VERBOSE1:gaia_screen_handler.cc(803)] Gaia is loaded
[29528:29528:0914/200100.086092:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: gaia-signin, webui_visible_: 1
[29528:29528:0914/200100.693907:WARNING:CONSOLE(18996)] "Unrecognized message from GAIA: showConfirmCancel", source: chrome://oobe/login.js (18996)
EOF

It looks to me like Chrome is getting stuck. This may be related to  issue 880823 .

Jacob, can you take a look?
Cc: alemate@chromium.org rsorokin@chromium.org emaxx@chromium.org
https://storage.cloud.google.com/chromeos-autotest-results/238049586-chromeos-test/chromeos6-row2-rack14-host13/tast/results/system_logs/chrome/chrome_20180914-195602 from a different failed run also ends with some of the same errors:

[5691:5691:0914/195605.594332:VERBOSE1:gaia_screen_handler.cc(1171)] LoadAuthExtension, force: 1, offline: 0
[5691:5691:0914/195605.618750:INFO:signin_screen_handler.cc(1483)] Login WebUI >> active: 1, source: gaia-signin
[5691:5691:0914/195605.654736:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: oobe, webui_visible_: 0
[5691:5691:0914/195605.676611:VERBOSE1:gaia_screen_handler.cc(621)] Auth extension finished loading
[5691:5691:0914/195605.680060:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: gaia-loading, webui_visible_: 1
[5691:5691:0914/195605.686117:ERROR:CONSOLE(217)] "Unexpected condition on chrome://oobe/gaia-signin: Could not find value for languageList", source: chrome://resources/js/load_time_data.js (217)
[5691:5691:0914/195605.699179:VERBOSE1:gaia_screen_handler.cc(780)] HandleCompleteLogin
[5691:5691:0914/195605.699723:VERBOSE1:existing_user_controller.cc(574)] Setting flow from PerformLogin
[5691:5691:0914/195605.699759:VERBOSE1:user_flow.cc(21)] Flow 0x6a5096c4060 got host 0x6a509829540
[5691:5691:0914/195605.700065:VERBOSE1:login_performer.cc(287)] Online login completion started.
[5691:5691:0914/195605.700216:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 0
[5691:5691:0914/195605.700887:ERROR:device_event_log_impl.cc(159)] [19:56:05.700] Login: cryptohome_util.cc:131 GetKeyDataEx failed with no GetKeyDataReply extension in reply.
[5691:5691:0914/195605.710665:ERROR:device_event_log_impl.cc(159)] [19:56:05.710] Login: cryptohome_authenticator.cc:147 MountEx failed. Error: 32
[5691:5691:0914/195605.710729:ERROR:device_event_log_impl.cc(159)] [19:56:05.710] Login: cryptohome_authenticator.cc:971 Cryptohome failure: state(AuthState)=1, code(cryptohome::MountError)=32
[5691:5691:0914/195605.710753:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 6
[5691:5691:0914/195607.969967:VERBOSE1:gaia_screen_handler.cc(803)] Gaia is loaded
[5691:5691:0914/195607.970083:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: gaia-signin, webui_visible_: 1
[5691:5691:0914/195608.937328:WARNING:CONSOLE(18996)] "Unrecognized message from GAIA: showConfirmCancel", source: chrome://oobe/login.js (18996)
EOF

I don't know if these are expected or not. (It'd be nice if we had fewer expected errors.)

Could a server-side GAIA change be causing this?
Components: -Tests>Tast Infra>Client>ChromeOS
Summary: Many Autotest and Tast UI test failures on CQ due to Chrome login not succeeding (was: ui.ChromeLogin test failures on CQ due to Chrome login not succeeding)
This isn't specific to Tast; Chrome login (at least from tests) is actually broken. A ton of Autotest tests that log in also started failing in R71-11066.0.0-rc2 -- see http://stainless/search?view=matrix&row=test&col=build&first_date=2018-09-11&last_date=2018-09-15&builder_name=-paladin%24&test=%5Elogin_&exclude_cts=false&exclude_not_run=true&exclude_non_release=false&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=true .
Hmm. Stainless now shows that there were tons of failures in R71-11066.0.0-rc2 and R71-11067.0.0-rc1, but none in R71-11068.0.0-rc1 or R71-11068.0.0-rc2.

It's important that the cause of these failures is still determined and addressed, since we can't be in a state where Chrome login starts failing across all tests for unknown reasons.
[5691:5691:0914/195605.700887:ERROR:device_event_log_impl.cc(159)] [19:56:05.700] Login: cryptohome_util.cc:131 GetKeyDataEx failed with no GetKeyDataReply extension in reply.
[5691:5691:0914/195605.710665:ERROR:device_event_log_impl.cc(159)] [19:56:05.710] Login: cryptohome_authenticator.cc:147 MountEx failed. Error: 32
[5691:5691:0914/195605.710729:ERROR:device_event_log_impl.cc(159)] [19:56:05.710] Login: cryptohome_authenticator.cc:971 Cryptohome failure: state(AuthState)=1, code(cryptohome::MountError)=32

Looks like something went wrong with the user key. apronin@ WDYT?
32 => MOUNT_ERROR_USER_DOES_NOT_EXIST = 1 << 5

This happens for new user to the device and inline with the successful test where CryptohomeAuthenticator::AuthState goes from 0 (CONTINUE) -> 6 (CREATE_NEW) -> 13 (ONLINE_LOGIN).

I suspect there is a racing condition. In successful login, "Gaia is loaded" log line happens after auth state changes to 13 (ONLINE_LOGIN). But in failed case, this happens after 6 (CREATE_NEW). I suspect that breaks the test login flow and chrome probably stays on Gaia and wait for input.

Not sure about autotest failures in #4. Somehow, it does not show chrome logs and kernel logs in a few tests that I looked.
Another strange thing is this log line in failed test:

[5691:5691:0914/195605.686117:ERROR:CONSOLE(217)] "Unexpected condition on chrome://oobe/gaia-signin: Could not find value for languageList", source: chrome://resources/js/load_time_data.js (217)

This is from "chrome://oobe/gaia-signin". It is for the Gaia dialog for views-based login.

For the failed test, it looks we first start chrome with "--show-webui-login" at 2018-09-15T02:56:01. The chrome is restarted at 2018-09-15T02:56:02 without that switch. So tast is actually testing against the views-based login. Is this expected?
Cc: r...@chromium.org
If --show-webui-login is set, it's presumably coming from the fallback code introduced in session_manager by https://chromium-review.googlesource.com/956311.

Jacob, if we're comfortable sticking with Views-based login at this point, can you remove that code from session_manager?
We are launching in M70, so testing the webui login seems be a strict waste of time on our bots.

I'm not saying we should be testing WebUI login. From #8, it sounds like the session_manager code to fall back to WebUI login may be getting triggered unintentionally in tests. I'm wondering if it's safe to remove it now to remove the risk of that happening.
Ah - yes, I think so. WebUI login is not being tested at all, so at this point, falling back to WebUI login is probably more dangerous than helpful.

The fallback code should be removed completely.
I'll send out a CL to disable the --show-webui-login fallback in session manager, though I am concerned about the race condition xiyuan@ mentioned in comment #7.
Quick question re deprecated webui-login:
So, if I see 

[1665:1665:0829/114945.712199:VERBOSE1:login_display_host_webui.cc(795)] Login WebUI >> WEBUI_VISIBLE

here: b/113359995#comment46,
that means we are somehow using a deprecated/untested UI workflow on that device? 

We are trying to understand what's causing parallel enrollment workflows there, and webui-login might be some thing to pay close attention to, since it shouldn't have happened? Or that logline above is harmless?
That line indicates webui-based login or oobe. Enrollment implies oobe, and the webui experience will remain default for oobe (ie, there is no views oobe implementation).
Status: Fixed (was: Assigned)
Please reopen if needed.

Sign in to add a comment