tast.ui.MashLogin crash in chrome_user_manager_util::IsGuestSessionAllowed |
|||||||||
Issue descriptionThis doesn't look particularly mash-related. Any ideas, guys? Maybe UserManager is null here? https://cs.chromium.org/chromium/src/chrome/browser/chromeos/login/existing_user_controller.cc?type=cs&q=updatelogindisplay&sq=package:chromium&g=0&l=410 https://luci-milo.appspot.com/buildbot/chromeos/caroline-release/1809 http://stainless/browse/chromeos-autotest-results/205807781-chromeos-test/ . If you look under the tast/results/ directory (e.g. search for "full.txt" on that page), you can see the files that Tast actually produced. It looks like there are some Chrome crashes that didn't get symbolized properly (sigh), although "tast symbolize" works here: Operating system: Linux 0.0.0 Linux 3.18.0-17852-g2b40b9e8fc30 #1 SMP PREEMPT Tue Jun 5 12:16:55 PDT 2018 x86_64 CPU: amd64 family 6 model 78 stepping 3 1 CPU GPU: UNKNOWN Crash reason: SIGSEGV Crash address: 0x0 Process uptime: not available Thread 0 (crashed) 0 chrome!chromeos::chrome_user_manager_util::IsGuestSessionAllowed(chromeos::CrosSettings const*) [chrome_user_manager_util.cc : 117 + 0x0] rax = 0x0000000000000000 rdx = 0x0000000000000003 rcx = 0x4a651429d194f900 rbx = 0x00003efc9d7807e8 rsi = 0x00003efc9d7807e8 rdi = 0x00003efc9d768880 rbp = 0x00007ffc222bd280 rsp = 0x00007ffc222bd240 r8 = 0x000000000000000b r9 = 0x73656d614e726573 r10 = 0x65efda3b407aa36c r11 = 0x00007e0e614f11e0 r12 = 0x0000000000000000 r13 = 0x00003efc9de5c9a0 r14 = 0x00003efc9d7e2440 r15 = 0x00003efc9d780798 rip = 0x00005b777d3cf867 Found by: given as instruction pointer in context 1 chrome!chromeos::ExistingUserController::UpdateLoginDisplay(std::__1::vector<user_manager::User*, std::__1::allocator<user_manager::User*> > const&) [existing_user_controller.cc : 410 + 0xc] rbx = 0x00003efc9d7e2440 rbp = 0x00007ffc222bd320 rsp = 0x00007ffc222bd290 r12 = 0x0000000000000000 r13 = 0x00003efc9de5c9a0 r14 = 0x0000000000000000 r15 = 0x00003efc9d780798 rip = 0x00005b777d34d3a9 Found by: call frame info 2 chrome!chromeos::LoginDisplayHostWebUI::OnStartSignInScreen(chromeos::LoginScreenContext const&) [existing_user_controller.cc : 370 + 0xb] rbx = 0x00003efc9eb7ac00 rbp = 0x00007ffc222bd4b0 rsp = 0x00007ffc222bd330 r12 = 0x00007ffc222bd7b0 r13 = 0x00003efc9de5c9a0 r14 = 0x0000000000000000 r15 = 0x00003efc9d780798 rip = 0x00005b777d3ae8e0 Found by: call frame info 3 chrome!chromeos::LoginDisplayHostCommon::StartSignInScreen(chromeos::LoginScreenContext const&) [login_display_host_common.cc : 118 + 0xc] rbx = 0x00003efc9de5c9a0 rbp = 0x00007ffc222bd620 rsp = 0x00007ffc222bd4c0 r12 = 0x00003efc9f2005a0 r13 = 0x00003efc9d74ac00 r14 = 0x00007ffc222bd7b0 r15 = 0x00003efc9d780798 rip = 0x00005b777d3a936a Found by: call frame info 4 chrome!chromeos::WizardController::ShowLoginScreen(chromeos::LoginScreenContext const&) [wizard_controller.cc : 497 + 0x8] rbx = 0x00003efc9f216a80 rbp = 0x00007ffc222bd780 rsp = 0x00007ffc222bd630 r12 = 0x00003efc9f2005a0 r13 = 0x00003efc9d74ac00 r14 = 0x00007ffc222bd7b0 r15 = 0x00007ffc222bd638 rip = 0x00005b777d3dafe5 Found by: call frame info 5 chrome!<name omitted> [signin_screen_handler.cc : 0 + 0x8] rbx = 0x0000000000000000 rbp = 0x00007ffc222bd900 rsp = 0x00007ffc222bd790 r12 = 0x00003efc9f2005a0 r13 = 0x00003efc9d74ac00 r14 = 0x00003efc9d790180 r15 = 0x00003efc9f216a80 rip = 0x00005b777d3d8a1f Found by: call frame info 6 chrome!chromeos::WizardController::Init(chromeos::OobeScreen) [wizard_controller.cc : 357 + 0x8] rbx = 0x0000000000000000 rbp = 0x00007ffc222bdab0 rsp = 0x00007ffc222bd910 r12 = 0x00003efc9f2005a0 r13 = 0x00003efc9d74ac00 r14 = 0x00003efc9d790180 r15 = 0x00003efc9f216a80 rip = 0x00005b777d3d8082 Found by: call frame info 7 chrome!chromeos::LoginDisplayHostWebUI::StartWizard(chromeos::OobeScreen) [login_display_host_webui.cc : 612 + 0x8] rbx = 0x0000000000000001 rbp = 0x00007ffc222bdc20 rsp = 0x00007ffc222bdac0 r12 = 0x00003efc9f2005a0 r13 = 0x00003efc9de5c9a0 r14 = 0x0000000000000028 r15 = 0x0000000000000000 rip = 0x00005b777d3adced Found by: call frame info 8 chrome!(anonymous namespace)::ShowLoginWizardFinish(chromeos::OobeScreen, chromeos::StartupCustomizationDocument const*) [login_display_host_webui.cc : 223 + 0x9] rbx = 0x00003efc9d703ce0 rbp = 0x00007ffc222bde60 rsp = 0x00007ffc222bdc30 r12 = 0x00003efc9d74ac00 r13 = 0x00003efc9de5c9a0 r14 = 0x00003efc9eacecf0 r15 = 0x0000000000000028 rip = 0x00005b777d3b27fe Found by: call frame info 9 chrome!(anonymous namespace)::TriggerShowLoginWizardFinish(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::unique_ptr<(anonymous namespace)::ShowLoginWizardSwitchLanguageCallbackData, std::__1::default_delete<(anonymous namespace)::ShowLoginWizardSwitchLanguageCallbackData> >) [login_display_host_webui.cc : 268 + 0x5] rbx = 0x00007ffc222bded0 rbp = 0x00007ffc222bde90 rsp = 0x00007ffc222bde70 r12 = 0x0000000000000028 r13 = 0x00003efc9eacecf0 r14 = 0x00003efc9eafff80 r15 = 0x00003efc9eaced60 rip = 0x00005b777d3b1e41 Found by: call frame info 10 chrome!chromeos::ShowLoginWizard(chromeos::OobeScreen) [login_display_host_webui.cc : 1283 + 0x5] rbx = 0x00003efc9dc465c0 rbp = 0x00007ffc222be110 rsp = 0x00007ffc222bdea0 r12 = 0x0000000000000028 r13 = 0x00003efc9eacecf0 r14 = 0x00003efc9eafff80 r15 = 0x00003efc9eaced60 rip = 0x00005b777d3b1a52 Found by: call frame info 11 chrome!chromeos::ChromeSessionManager::Initialize(base::CommandLine const&, Profile*, bool) [chrome_session_manager.cc : 80 + 0xa] rbx = 0x00007ffc222be1b0 rbp = 0x00007ffc222be310 rsp = 0x00007ffc222be120 r12 = 0x00005b77847a3876 r13 = 0x00003efc9d703ce0 r14 = 0x00005b7785419350 r15 = 0x0000000000000001 rip = 0x00005b777d389f22 Found by: call frame info 12 chrome!chromeos::ChromeBrowserMainPartsChromeos::PostProfileInit() [chrome_browser_main_chromeos.cc : 938 + 0x7] rbx = 0x0000000000000000 rbp = 0x00007ffc222be470 rsp = 0x00007ffc222be320 r12 = 0x00003efc9ead3aa0 r13 = 0x00003efc9d713000 r14 = 0x00007ffc222be320 r15 = 0x00003efc9d713000 rip = 0x00005b777d25a5aa Found by: call frame info 13 chrome!ChromeBrowserMainParts::PreMainMessageLoopRun() [chrome_browser_main.cc : 1884 + 0x9] rbx = 0x00003efc9d782540 rbp = 0x00007ffc222be6e0 rsp = 0x00007ffc222be480 r12 = 0x00003efc9ead3aa0 r13 = 0x00003efc9d713000 r14 = 0x000000013527f88f r15 = 0x002ecc841eb6c655 rip = 0x00005b777e01777e Found by: call frame info
,
Jun 13 2018
This is still happening periodically in the lab. Here are some symbolized crash dumps: https://stainless.corp.google.com/browse/chromeos-autotest-results/207944040-chromeos-test/
,
Jun 14 2018
Another possibility is that we cleared "Local State" but not the policy blob file. Hence, we have an owner but not in the known user list. Would the DUT starts the tests in a clean state that clears files?
,
Jun 14 2018
The test login code is here: https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/HEAD/src/chromiumos/tast/local/chrome/chrome.go I modeled it on what Telemetry does. We typically call session_manager's EnableChromeTesting method, clear the user's cryptohome, and then call Oobe.loginForTesting. Chrome is restarted with --oobe-skip-postlogin and --disable-gaia-services, in case either of those are relevant.
,
Jun 15 2018
Thanks for the code link. I am more interested to find out how DUT is prepared before running the test. Are we removing both "Local State" and policy blob file?
,
Jun 15 2018
This is just a regular DUT in the lab. The rest is in the file that I linked to, starting in the New function. We: - write a test extension to disk - call EnableChromeTesting per https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/HEAD/src/chromiumos/tast/local/chrome/chrome.go#237 - connect to Chrome using CDP - tell cryptohomed to remove the user's cryptohome - log in per https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/HEAD/src/chromiumos/tast/local/chrome/chrome.go#443 So I wouldn't expect us to be removing Local State or the policy blob, unless session_manager does that as part of EnableChromeTesting. Also note that this isn't crashing consistently, as far as I know, so it seems likely to be a race.
,
Jun 15 2018
The race is probably because owner account id is only set after policy blob is verified. When we are lucky, the owner account id is not set when the login screen shows so we skipped the FindUser(owner_account_id)->GetType() call. Not sure why FindUser could return nullptr while we have an owner account. Is there a way to get the "Local State" file on the failed DUT?
,
Jun 15 2018
Hmm. cautotest currently lists the state of the DUT (chromeos6-row1-rack23-host7) state as "verifying", and it doesn't respond when I try to SSH to it. The test log also shows that all other tests that tried to interact with Chrome in the same run had problems: 2018/06/05 17:01:49 ui.ARCStartup [ FAIL ] Failed to connect to Chrome: cdp.Runtime: Evaluate: websocket: close 1006 (abnormal closure): unexpected EOF 2018/06/05 17:01:49 ui.ChromeCrashLoggedIn [ FAIL ] failed to read Chrome debugging port from /home/chronos/DevToolsActivePort: context deadline exceeded 2018/06/05 17:01:49 ui.ChromeCrashNotLoggedIn [ FAIL ] failed to read Chrome debugging port from /home/chronos/DevToolsActivePort: context deadline exceeded 2018/06/05 17:01:49 ui.ChromeLogin [ FAIL ] Failed to connect to Chrome: failed to read Chrome debugging port from /home/chronos/DevToolsActivePort: context deadline exceeded 2018/06/05 17:01:49 ui.MashLogin [ FAIL ] Chrome probably crashed on startup: failed to read Chrome debugging port from /home/chronos/DevToolsActivePort: context deadline exceeded I think it's likely that Chrome was in a bad state on the DUT. James, have you seen the same crash in any other runs?
,
Jun 18 2018
I have not seen this recently. chrome is usually crashing in ozone initialization right now, so it might not be getting to this code. I can't be sure.
,
Jun 19 2018
,
Jun 19 2018
,
Jun 19 2018
Hmm. Unfortunately, neither of those DUTs (chromeos4-row13-rack10-host9 or chromeos4-row13-rack10-host10) appear to still be in the bad state -- the baked-in copies of ui.ChromeLogin and ui.MashLogin pass on both now rather than failing. Xiyuan, if Local State would be helpful in debugging this, I can update the test to collect it. Are there any other files or commands that would help in gathering info?
,
Jun 19 2018
And the policy blob "/var/lib/whitelist/policy.*" that contains the owner email if possible.
,
Jun 19 2018
Okay! I've sent https://crrev.com/c/1107204 to collect those files.
,
Jun 21 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/10248a86deeacaf9233b4c1405670eb75d30c0b6 commit 10248a86deeacaf9233b4c1405670eb75d30c0b6 Author: Daniel Erat <derat@chromium.org> Date: Thu Jun 21 08:05:44 2018 tast-tests: Make ui.MashLogin collect debug files. Make the ui.MashLogin test collect /home/chronos/Local State and /var/lib/whitelist/policy.* on failure to try to track down Chrome crashes in IsGuestSessionAllowed(). BUG= chromium:850139 TEST=manual: ran it after inverting failure check and verified that expected files are in test's output dir Change-Id: Id8ee2397d8ad0537927ebc68d009cf3a55e7f722 Reviewed-on: https://chromium-review.googlesource.com/1107204 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: James Cook <jamescook@chromium.org> Reviewed-by: Shuhei Takahashi <nya@chromium.org> [modify] https://crrev.com/10248a86deeacaf9233b4c1405670eb75d30c0b6/src/chromiumos/tast/local/bundles/cros/ui/mash_login.go
,
Jun 23 2018
The files are collected now, but this test is super-noisy due to issue 850168. :-/ (I also accidentally clobbered the error message in #15, which I'll fix now.)
,
Jun 23 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/8102d60b79fc415924cd3083fa2aa05fc36f470b commit 8102d60b79fc415924cd3083fa2aa05fc36f470b Author: Daniel Erat <derat@chromium.org> Date: Sat Jun 23 05:51:23 2018 tast-tests: Don't clobber error in ui.MashLogin. Fix an issue introduced by 10248a86deea where ui.MashLogin overwrote Chrome errors and reported '<nil>' instead. BUG= chromium:850139 TEST=built and ran test Change-Id: Id448c23cfcfc04b4f971e6ca6aba15ac77c85c3c Reviewed-on: https://chromium-review.googlesource.com/1112762 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: James Cook <jamescook@chromium.org> [modify] https://crrev.com/8102d60b79fc415924cd3083fa2aa05fc36f470b/src/chromiumos/tast/local/bundles/cros/ui/mash_login.go
,
Jun 26 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/0fd9562402a18268fddbda77fd741f7c1e731db7 commit 0fd9562402a18268fddbda77fd741f7c1e731db7 Author: Daniel Erat <derat@chromium.org> Date: Tue Jun 26 14:24:57 2018 tast-tests: Move Chrome debugging to ui.ChromeLogin. Move some debug code that captures Local State and policy files from ui.MashLogin to ui.ChromeLogin. I added this to help investigate IsGuestSessionAllowed segfaults in Chrome, but ui.MashLogin is failing frequently due to an unrelated issue (https://crbug.com/850168) and it's easier to identify these segfaults in ui.ChromeLogin. BUG= chromium:850139 TEST=ran tests Change-Id: Iebb092e46ef84b16c7229eede9a0cc790a9f159e Reviewed-on: https://chromium-review.googlesource.com/1114314 Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: James Cook <jamescook@chromium.org> Reviewed-by: Shuhei Takahashi <nya@chromium.org> Commit-Queue: Dan Erat <derat@chromium.org> [modify] https://crrev.com/0fd9562402a18268fddbda77fd741f7c1e731db7/src/chromiumos/tast/local/bundles/cros/ui/chrome_login.go [modify] https://crrev.com/0fd9562402a18268fddbda77fd741f7c1e731db7/src/chromiumos/tast/local/bundles/cros/ui/mash_login.go
,
Jun 27 2018
I finally got Local State and a policy file, but it looks like Xiyuan is out for a while. Jacob, are you familiar with this code in Chrome? The attached files are from http://stainless/browse/chromeos-autotest-results/212061640-chromeos-test/ . The crash dump is from a little bit earlier than that test failure, but Chrome looks like it was crashing every time with this stack.
,
Jun 27 2018
Some possibly-interesting bits from Local State:
"KnownUsers": [
],
"profile": {
"info_cache": {
"Default": {
"avatar_icon": "chrome:\/\/theme\/IDR_PROFILE_AVATAR_26",
"background_apps": false,
"gaia_id": "",
"is_ephemeral": false,
"is_omitted_from_profile_list": false,
"is_using_default_avatar": true,
"is_using_default_name": true,
"managed_user_id": "",
"name": "First user",
"user_name": ""
}
},
"last_active_profiles": [
]
},
I didn't decode the policy file, but I see a "first_user@nowhere.com" string in it.
I wouldn't be surprised if tests are leaving Chrome in a weird state. Should I just make the crashing code check for a null pointer?
,
Jun 27 2018
"first_user@nowhere.com" appears in a bunch of Autotest login and cryptohome tests: https://cs.corp.google.com/search/?q=%22first_user@nowhere.com%22&m=100&det=matsel&sq=package:%5Echromeos_(internal%7Cpublic)$&type=cs Should Tast clear /var/lib/whitelist/policy.* before restarting Chrome?
,
Jun 28 2018
Yep, empty "KnownUsers" in Local state is the problem. It looks like Local State is nuked but we kept the policy blob around. Adding a null check sgtm.
,
Jun 29 2018
Sent https://crrev.com/c/1121377 as a speculative workaround.
,
Jul 3
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e432cbc9d85b6d09c10410069c46a9480c69c122 commit e432cbc9d85b6d09c10410069c46a9480c69c122 Author: Daniel Erat <derat@chromium.org> Date: Tue Jul 03 15:22:43 2018 chromeos: Speculative fix for segfault with missing owner. Try to avoid a crash in a situation we've seen where a lab system has a policy file referencing an owner but the user list in the Local State file is empty. Bug: 850139 Change-Id: I4fad13685d8a8975e6b8f1e41a65bf37cdfcd89d Reviewed-on: https://chromium-review.googlesource.com/1121377 Reviewed-by: Alexander Alekseev <alemate@chromium.org> Commit-Queue: Dan Erat <derat@chromium.org> Cr-Commit-Position: refs/heads/master@{#572221} [modify] https://crrev.com/e432cbc9d85b6d09c10410069c46a9480c69c122/chrome/browser/chromeos/login/users/chrome_user_manager_util.cc [modify] https://crrev.com/e432cbc9d85b6d09c10410069c46a9480c69c122/chrome/browser/ui/webui/chromeos/login/gaia_screen_handler.cc
,
Jul 16
Chrome finally got uprevved, and I don't think that this is happening anymore.
,
Aug 24
Issue 873834 has been merged into this issue.
,
Sep 6
Per issue 873834, this appears to be hitting real users on M68. Since it's apparently triggered by a broken system configuration (the policy file says that device is owned, but KnownUsers in Local State doesn't list the owner), I'm not sure whether the workaround will improve things much, but it's typically better not to segfault. :-P Requesting a merge to M-69, although the stable release is coming up soon.
,
Sep 6
This bug requires manual review: Request affecting a post-stable build Please contact the milestone owner if you have questions. Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Sep 11
,
Sep 11
Sigh, I must've misread the release calendar earlier. https://crrev.com/c/1121377 landed on July 3, so it's already included in M69 (branched July 19). |
|||||||||
►
Sign in to add a comment |
|||||||||
Comment 1 by xiy...@chromium.org
, Jun 6 2018