Issue metadata
Sign in to add a comment
|
Many Autotest and Tast UI test failures on CQ due to Chrome login not succeeding |
||||||||||||||||||||||
Issue descriptionThere were a bunch of failures in the ui.ChromeLogin Tast test on paladin builders in R71-11066.0.0-rc2: http://stainless/search?exclude_retried=true&first_date=2018-09-11&master_builder_name=&builder_name_number=&shard=&exclude_acts=true&builder_name=-paladin%24&master_builder_name_number=&owner=&retry=&exclude_cts=false&exclude_non_production=true&hostname=&board=&test=%5Etast%5C.ui%5C.ChromeLogin%24&suite=&build=%5ER71%5C-11066%5C.0%5C.0%5C-rc2%24&status=FAIL&status=ERROR&status=ABORT&reason=&waterfall=&exclude_not_run=true&last_date=2018-09-15&exclude_non_release=false&exclude_au=true&model=&view=list They all failed with "Chrome login failed: /home/user/b6f6fc1fb54359f6e0567f6cadbfcb6590a672a4 not mounted: context deadline exceeded (last error: doesn't exist)". I'll dig into the logs. I don't think there have been any recent changes to the tast or tast-tests repositories that could've affected login, so I suspect it's a real problem in the system image.
,
Sep 15
Here's part of Chrome's log from the successful test:
[28017:28017:0914/200025.843168:VERBOSE1:gaia_screen_handler.cc(780)] HandleCompleteLogin
[28017:28017:0914/200025.844086:VERBOSE1:existing_user_controller.cc(574)] Setting flow from PerformLogin
[28017:28017:0914/200025.844151:VERBOSE1:user_flow.cc(21)] Flow 0xf1e41275440 got host 0xf1e3f98edc0
[28017:28017:0914/200025.844610:VERBOSE1:login_performer.cc(287)] Online login completion started.
[28017:28017:0914/200025.845236:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 0
[28017:28017:0914/200025.854688:VERBOSE1:gaia_screen_handler.cc(1171)] LoadAuthExtension, force: 1, offline: 0
[28017:28017:0914/200025.871998:ERROR:device_event_log_impl.cc(159)] [20:00:25.871] Login: cryptohome_util.cc:131 GetKeyDataEx failed with no GetKeyDataReply extension in reply.
[28017:28017:0914/200025.903455:ERROR:device_event_log_impl.cc(159)] [20:00:25.903] Login: cryptohome_authenticator.cc:147 MountEx failed. Error: 32
[28017:28017:0914/200025.903558:ERROR:device_event_log_impl.cc(159)] [20:00:25.903] Login: cryptohome_authenticator.cc:971 Cryptohome failure: state(AuthState)=1, code(cryptohome::MountError)=32
[28017:28017:0914/200025.903604:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 6
[28017:28017:0914/200026.384357:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: oobe, webui_visible_: 0
[28017:28017:0914/200026.384755:VERBOSE1:webui_login_view.cc(608)] Login WebUI >> login-prompt-visible
[28017:28017:0914/200026.614179:INFO:signin_screen_handler.cc(1483)] Login WebUI >> active: 1, source: gaia-signin
[28017:28017:0914/200027.233555:VERBOSE1:gaia_screen_handler.cc(621)] Auth extension finished loading
[28017:28017:0914/200027.338360:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: gaia-loading, webui_visible_: 1
[28017:28017:0914/200029.771131:VERBOSE1:wizard_controller.cc(1935)] Resolved local timezone={dstOffset=3600.000000, rawOffset=-28800.000000, timeZoneId='America/Los_Angeles', timeZoneName='Pacific Daylight Time', error_message='', status=0 (OK)}.
[28017:28017:0914/200029.771493:VERBOSE1:wizard_controller.cc(1957)] Resolve TimeZone: setting timezone to 'America/Los_Angeles'
[28017:28017:0914/200031.375158:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 13
[28017:28017:0914/200031.375265:VERBOSE1:cryptohome_authenticator.cc(701)] Login success
[28017:28017:0914/200031.375573:VERBOSE1:login_performer.cc(87)] LoginSuccess hash: b6f6fc1fb54359f6e0567f6cadbfcb6590a672a4
[28017:28017:0914/200031.375625:VERBOSE1:user_session_manager.cc(590)] Starting user session.
[28017:28017:0914/200031.472821:VERBOSE1:input_events_blocker.cc(19)] InputEventsBlocker 0xf1e41a6e300 created.
[28017:28017:0914/200031.473110:VERBOSE1:arc_util.cc(196)] ARC is not available.
[28017:28017:0914/200031.521585:ERROR:input_method_manager_impl.cc(1113)] IMEEngine for "jkghodnilhceideoidjikpgommlajknk" is not registered
[28017:28017:0914/200031.521846:VERBOSE1:user_session_manager.cc(731)] Setting first login prefs
[28017:28017:0914/200031.522274:VERBOSE1:user_session_manager.cc(1244)] Seed IdentityManager and SigninManagerBase with the authenticated account info, success=1
[28017:28017:0914/200031.771567:VERBOSE1:input_events_blocker.cc(28)] InputEventsBlocker 0xf1e41a6e300 destroyed.
[28017:28017:0914/200032.267247:INFO:chrome_cryptauth_service.cc(135)] Primary account with refresh token not yet available; waiting before starting CryptAuth managers.
[28017:28017:0914/200032.526366:VERBOSE1:input_events_blocker.cc(19)] InputEventsBlocker 0xf1e42332330 created.
[28017:28017:0914/200032.536652:VERBOSE1:user_session_manager.cc(1989)] Launching browser...
[28017:28017:0914/200032.771158:VERBOSE1:input_events_blocker.cc(28)] InputEventsBlocker 0xf1e42332330 destroyed.
EOF
And here's the same portion from the failing test:
[29528:29528:0914/200055.641710:VERBOSE1:gaia_screen_handler.cc(780)] HandleCompleteLogin
[29528:29528:0914/200055.667078:VERBOSE1:existing_user_controller.cc(574)] Setting flow from PerformLogin
[29528:29528:0914/200055.667166:VERBOSE1:user_flow.cc(21)] Flow 0xca57ea78510 got host 0xca57c828700
[29528:29528:0914/200055.673152:VERBOSE1:login_performer.cc(287)] Online login completion started.
[29528:29528:0914/200055.693068:ERROR:device_event_log_impl.cc(159)] [20:00:55.693] Login: cryptohome_util.cc:131 GetKeyDataEx failed with no GetKeyDataReply extension in reply.
[29528:29528:0914/200055.693232:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 0
[29528:29528:0914/200055.718446:ERROR:device_event_log_impl.cc(159)] [20:00:55.718] Login: cryptohome_authenticator.cc:147 MountEx failed. Error: 32
[29528:29528:0914/200055.718550:ERROR:device_event_log_impl.cc(159)] [20:00:55.718] Login: cryptohome_authenticator.cc:971 Cryptohome failure: state(AuthState)=1, code(cryptohome::MountError)=32
[29528:29528:0914/200055.718595:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 6
[29528:29528:0914/200100.085979:VERBOSE1:gaia_screen_handler.cc(803)] Gaia is loaded
[29528:29528:0914/200100.086092:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: gaia-signin, webui_visible_: 1
[29528:29528:0914/200100.693907:WARNING:CONSOLE(18996)] "Unrecognized message from GAIA: showConfirmCancel", source: chrome://oobe/login.js (18996)
EOF
It looks to me like Chrome is getting stuck. This may be related to issue 880823 .
Jacob, can you take a look?
,
Sep 15
https://storage.cloud.google.com/chromeos-autotest-results/238049586-chromeos-test/chromeos6-row2-rack14-host13/tast/results/system_logs/chrome/chrome_20180914-195602 from a different failed run also ends with some of the same errors: [5691:5691:0914/195605.594332:VERBOSE1:gaia_screen_handler.cc(1171)] LoadAuthExtension, force: 1, offline: 0 [5691:5691:0914/195605.618750:INFO:signin_screen_handler.cc(1483)] Login WebUI >> active: 1, source: gaia-signin [5691:5691:0914/195605.654736:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: oobe, webui_visible_: 0 [5691:5691:0914/195605.676611:VERBOSE1:gaia_screen_handler.cc(621)] Auth extension finished loading [5691:5691:0914/195605.680060:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: gaia-loading, webui_visible_: 1 [5691:5691:0914/195605.686117:ERROR:CONSOLE(217)] "Unexpected condition on chrome://oobe/gaia-signin: Could not find value for languageList", source: chrome://resources/js/load_time_data.js (217) [5691:5691:0914/195605.699179:VERBOSE1:gaia_screen_handler.cc(780)] HandleCompleteLogin [5691:5691:0914/195605.699723:VERBOSE1:existing_user_controller.cc(574)] Setting flow from PerformLogin [5691:5691:0914/195605.699759:VERBOSE1:user_flow.cc(21)] Flow 0x6a5096c4060 got host 0x6a509829540 [5691:5691:0914/195605.700065:VERBOSE1:login_performer.cc(287)] Online login completion started. [5691:5691:0914/195605.700216:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 0 [5691:5691:0914/195605.700887:ERROR:device_event_log_impl.cc(159)] [19:56:05.700] Login: cryptohome_util.cc:131 GetKeyDataEx failed with no GetKeyDataReply extension in reply. [5691:5691:0914/195605.710665:ERROR:device_event_log_impl.cc(159)] [19:56:05.710] Login: cryptohome_authenticator.cc:147 MountEx failed. Error: 32 [5691:5691:0914/195605.710729:ERROR:device_event_log_impl.cc(159)] [19:56:05.710] Login: cryptohome_authenticator.cc:971 Cryptohome failure: state(AuthState)=1, code(cryptohome::MountError)=32 [5691:5691:0914/195605.710753:VERBOSE1:cryptohome_authenticator.cc(811)] Resolved state to: 6 [5691:5691:0914/195607.969967:VERBOSE1:gaia_screen_handler.cc(803)] Gaia is loaded [5691:5691:0914/195607.970083:VERBOSE1:signin_screen_handler.cc(1437)] Login WebUI >> loginVisible, src: gaia-signin, webui_visible_: 1 [5691:5691:0914/195608.937328:WARNING:CONSOLE(18996)] "Unrecognized message from GAIA: showConfirmCancel", source: chrome://oobe/login.js (18996) EOF I don't know if these are expected or not. (It'd be nice if we had fewer expected errors.) Could a server-side GAIA change be causing this?
,
Sep 15
This isn't specific to Tast; Chrome login (at least from tests) is actually broken. A ton of Autotest tests that log in also started failing in R71-11066.0.0-rc2 -- see http://stainless/search?view=matrix&row=test&col=build&first_date=2018-09-11&last_date=2018-09-15&builder_name=-paladin%24&test=%5Elogin_&exclude_cts=false&exclude_not_run=true&exclude_non_release=false&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=true .
,
Sep 15
Hmm. Stainless now shows that there were tons of failures in R71-11066.0.0-rc2 and R71-11067.0.0-rc1, but none in R71-11068.0.0-rc1 or R71-11068.0.0-rc2. It's important that the cause of these failures is still determined and addressed, since we can't be in a state where Chrome login starts failing across all tests for unknown reasons.
,
Sep 17
[5691:5691:0914/195605.700887:ERROR:device_event_log_impl.cc(159)] [19:56:05.700] Login: cryptohome_util.cc:131 GetKeyDataEx failed with no GetKeyDataReply extension in reply. [5691:5691:0914/195605.710665:ERROR:device_event_log_impl.cc(159)] [19:56:05.710] Login: cryptohome_authenticator.cc:147 MountEx failed. Error: 32 [5691:5691:0914/195605.710729:ERROR:device_event_log_impl.cc(159)] [19:56:05.710] Login: cryptohome_authenticator.cc:971 Cryptohome failure: state(AuthState)=1, code(cryptohome::MountError)=32 Looks like something went wrong with the user key. apronin@ WDYT?
,
Sep 17
32 => MOUNT_ERROR_USER_DOES_NOT_EXIST = 1 << 5 This happens for new user to the device and inline with the successful test where CryptohomeAuthenticator::AuthState goes from 0 (CONTINUE) -> 6 (CREATE_NEW) -> 13 (ONLINE_LOGIN). I suspect there is a racing condition. In successful login, "Gaia is loaded" log line happens after auth state changes to 13 (ONLINE_LOGIN). But in failed case, this happens after 6 (CREATE_NEW). I suspect that breaks the test login flow and chrome probably stays on Gaia and wait for input. Not sure about autotest failures in #4. Somehow, it does not show chrome logs and kernel logs in a few tests that I looked.
,
Sep 17
Another strange thing is this log line in failed test: [5691:5691:0914/195605.686117:ERROR:CONSOLE(217)] "Unexpected condition on chrome://oobe/gaia-signin: Could not find value for languageList", source: chrome://resources/js/load_time_data.js (217) This is from "chrome://oobe/gaia-signin". It is for the Gaia dialog for views-based login. For the failed test, it looks we first start chrome with "--show-webui-login" at 2018-09-15T02:56:01. The chrome is restarted at 2018-09-15T02:56:02 without that switch. So tast is actually testing against the views-based login. Is this expected?
,
Sep 17
If --show-webui-login is set, it's presumably coming from the fallback code introduced in session_manager by https://chromium-review.googlesource.com/956311. Jacob, if we're comfortable sticking with Views-based login at this point, can you remove that code from session_manager?
,
Sep 18
We are launching in M70, so testing the webui login seems be a strict waste of time on our bots.
,
Sep 18
I'm not saying we should be testing WebUI login. From #8, it sounds like the session_manager code to fall back to WebUI login may be getting triggered unintentionally in tests. I'm wondering if it's safe to remove it now to remove the risk of that happening.
,
Sep 18
Ah - yes, I think so. WebUI login is not being tested at all, so at this point, falling back to WebUI login is probably more dangerous than helpful. The fallback code should be removed completely.
,
Sep 18
I'll send out a CL to disable the --show-webui-login fallback in session manager, though I am concerned about the race condition xiyuan@ mentioned in comment #7.
,
Sep 18
Quick question re deprecated webui-login: So, if I see [1665:1665:0829/114945.712199:VERBOSE1:login_display_host_webui.cc(795)] Login WebUI >> WEBUI_VISIBLE here: b/113359995#comment46, that means we are somehow using a deprecated/untested UI workflow on that device? We are trying to understand what's causing parallel enrollment workflows there, and webui-login might be some thing to pay close attention to, since it shouldn't have happened? Or that logline above is harmless?
,
Sep 18
That line indicates webui-based login or oobe. Enrollment implies oobe, and the webui experience will remain default for oobe (ie, there is no views oobe implementation).
,
Sep 19
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform2/+/5e795a139057d15181250f6bdd8f489b1d749fa9 commit 5e795a139057d15181250f6bdd8f489b1d749fa9 Author: Jacob Dufault <jdufault@google.com> Date: Wed Sep 19 22:40:36 2018 login: Remove --show-webui-login fallback on crash BUG= chromium:822434 , chromium:884454 TEST=cros_workon_make --board=eve chromeos-base/chromeos-login --test Change-Id: I4aa9303914b55ba8aed3d98ee18b3409f9bdcd7f Reviewed-on: https://chromium-review.googlesource.com/1232017 Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com> Commit-Ready: Jacob Dufault <jdufault@chromium.org> Tested-by: Jacob Dufault <jdufault@chromium.org> Reviewed-by: Dan Erat <derat@chromium.org> [modify] https://crrev.com/5e795a139057d15181250f6bdd8f489b1d749fa9/login_manager/login_metrics.h [modify] https://crrev.com/5e795a139057d15181250f6bdd8f489b1d749fa9/login_manager/browser_job_test.cc [modify] https://crrev.com/5e795a139057d15181250f6bdd8f489b1d749fa9/login_manager/mock_metrics.h [modify] https://crrev.com/5e795a139057d15181250f6bdd8f489b1d749fa9/login_manager/browser_job.cc [modify] https://crrev.com/5e795a139057d15181250f6bdd8f489b1d749fa9/login_manager/login_metrics.cc
,
Oct 16
Please reopen if needed. |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by derat@chromium.org
, Sep 15