New issue
Advanced search Search tips

Issue 850139 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 16
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

tast.ui.MashLogin crash in chrome_user_manager_util::IsGuestSessionAllowed

Project Member Reported by jamescook@chromium.org, Jun 6 2018

Issue description

This doesn't look particularly mash-related. Any ideas, guys?

Maybe UserManager is null here?

https://cs.chromium.org/chromium/src/chrome/browser/chromeos/login/existing_user_controller.cc?type=cs&q=updatelogindisplay&sq=package:chromium&g=0&l=410

https://luci-milo.appspot.com/buildbot/chromeos/caroline-release/1809

 http://stainless/browse/chromeos-autotest-results/205807781-chromeos-test/ . If you look under the tast/results/ directory (e.g. search for "full.txt" on that page), you can see the files that Tast actually produced.

It looks like there are some Chrome crashes that didn't get symbolized properly (sigh), although "tast symbolize" works here:

Operating system: Linux
                  0.0.0 Linux 3.18.0-17852-g2b40b9e8fc30 #1 SMP PREEMPT Tue Jun 5 12:16:55 PDT 2018 x86_64
CPU: amd64
     family 6 model 78 stepping 3
     1 CPU

GPU: UNKNOWN

Crash reason:  SIGSEGV
Crash address: 0x0
Process uptime: not available

Thread 0 (crashed)
 0  chrome!chromeos::chrome_user_manager_util::IsGuestSessionAllowed(chromeos::CrosSettings const*) [chrome_user_manager_util.cc : 117 + 0x0]
    rax = 0x0000000000000000   rdx = 0x0000000000000003
    rcx = 0x4a651429d194f900   rbx = 0x00003efc9d7807e8
    rsi = 0x00003efc9d7807e8   rdi = 0x00003efc9d768880
    rbp = 0x00007ffc222bd280   rsp = 0x00007ffc222bd240
     r8 = 0x000000000000000b    r9 = 0x73656d614e726573
    r10 = 0x65efda3b407aa36c   r11 = 0x00007e0e614f11e0
    r12 = 0x0000000000000000   r13 = 0x00003efc9de5c9a0
    r14 = 0x00003efc9d7e2440   r15 = 0x00003efc9d780798
    rip = 0x00005b777d3cf867
    Found by: given as instruction pointer in context
 1  chrome!chromeos::ExistingUserController::UpdateLoginDisplay(std::__1::vector<user_manager::User*, std::__1::allocator<user_manager::User*> > const&) [existing_user_controller.cc : 410 + 0xc]
    rbx = 0x00003efc9d7e2440   rbp = 0x00007ffc222bd320
    rsp = 0x00007ffc222bd290   r12 = 0x0000000000000000
    r13 = 0x00003efc9de5c9a0   r14 = 0x0000000000000000
    r15 = 0x00003efc9d780798   rip = 0x00005b777d34d3a9
    Found by: call frame info
 2  chrome!chromeos::LoginDisplayHostWebUI::OnStartSignInScreen(chromeos::LoginScreenContext const&) [existing_user_controller.cc : 370 + 0xb]
    rbx = 0x00003efc9eb7ac00   rbp = 0x00007ffc222bd4b0
    rsp = 0x00007ffc222bd330   r12 = 0x00007ffc222bd7b0
    r13 = 0x00003efc9de5c9a0   r14 = 0x0000000000000000
    r15 = 0x00003efc9d780798   rip = 0x00005b777d3ae8e0
    Found by: call frame info
 3  chrome!chromeos::LoginDisplayHostCommon::StartSignInScreen(chromeos::LoginScreenContext const&) [login_display_host_common.cc : 118 + 0xc]
    rbx = 0x00003efc9de5c9a0   rbp = 0x00007ffc222bd620
    rsp = 0x00007ffc222bd4c0   r12 = 0x00003efc9f2005a0
    r13 = 0x00003efc9d74ac00   r14 = 0x00007ffc222bd7b0
    r15 = 0x00003efc9d780798   rip = 0x00005b777d3a936a
    Found by: call frame info
 4  chrome!chromeos::WizardController::ShowLoginScreen(chromeos::LoginScreenContext const&) [wizard_controller.cc : 497 + 0x8]
    rbx = 0x00003efc9f216a80   rbp = 0x00007ffc222bd780
    rsp = 0x00007ffc222bd630   r12 = 0x00003efc9f2005a0
    r13 = 0x00003efc9d74ac00   r14 = 0x00007ffc222bd7b0
    r15 = 0x00007ffc222bd638   rip = 0x00005b777d3dafe5
    Found by: call frame info
 5  chrome!<name omitted> [signin_screen_handler.cc : 0 + 0x8]
    rbx = 0x0000000000000000   rbp = 0x00007ffc222bd900
    rsp = 0x00007ffc222bd790   r12 = 0x00003efc9f2005a0
    r13 = 0x00003efc9d74ac00   r14 = 0x00003efc9d790180
    r15 = 0x00003efc9f216a80   rip = 0x00005b777d3d8a1f
    Found by: call frame info
 6  chrome!chromeos::WizardController::Init(chromeos::OobeScreen) [wizard_controller.cc : 357 + 0x8]
    rbx = 0x0000000000000000   rbp = 0x00007ffc222bdab0
    rsp = 0x00007ffc222bd910   r12 = 0x00003efc9f2005a0
    r13 = 0x00003efc9d74ac00   r14 = 0x00003efc9d790180
    r15 = 0x00003efc9f216a80   rip = 0x00005b777d3d8082
    Found by: call frame info
 7  chrome!chromeos::LoginDisplayHostWebUI::StartWizard(chromeos::OobeScreen) [login_display_host_webui.cc : 612 + 0x8]
    rbx = 0x0000000000000001   rbp = 0x00007ffc222bdc20
    rsp = 0x00007ffc222bdac0   r12 = 0x00003efc9f2005a0
    r13 = 0x00003efc9de5c9a0   r14 = 0x0000000000000028
    r15 = 0x0000000000000000   rip = 0x00005b777d3adced
    Found by: call frame info
 8  chrome!(anonymous namespace)::ShowLoginWizardFinish(chromeos::OobeScreen, chromeos::StartupCustomizationDocument const*) [login_display_host_webui.cc : 223 + 0x9]
    rbx = 0x00003efc9d703ce0   rbp = 0x00007ffc222bde60
    rsp = 0x00007ffc222bdc30   r12 = 0x00003efc9d74ac00
    r13 = 0x00003efc9de5c9a0   r14 = 0x00003efc9eacecf0
    r15 = 0x0000000000000028   rip = 0x00005b777d3b27fe
    Found by: call frame info
9  chrome!(anonymous namespace)::TriggerShowLoginWizardFinish(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::unique_ptr<(anonymous namespace)::ShowLoginWizardSwitchLanguageCallbackData, std::__1::default_delete<(anonymous namespace)::ShowLoginWizardSwitchLanguageCallbackData> >) [login_display_host_webui.cc : 268 + 0x5]
    rbx = 0x00007ffc222bded0   rbp = 0x00007ffc222bde90
    rsp = 0x00007ffc222bde70   r12 = 0x0000000000000028
    r13 = 0x00003efc9eacecf0   r14 = 0x00003efc9eafff80
    r15 = 0x00003efc9eaced60   rip = 0x00005b777d3b1e41
    Found by: call frame info
10  chrome!chromeos::ShowLoginWizard(chromeos::OobeScreen) [login_display_host_webui.cc : 1283 + 0x5]
    rbx = 0x00003efc9dc465c0   rbp = 0x00007ffc222be110
    rsp = 0x00007ffc222bdea0   r12 = 0x0000000000000028
    r13 = 0x00003efc9eacecf0   r14 = 0x00003efc9eafff80
    r15 = 0x00003efc9eaced60   rip = 0x00005b777d3b1a52
    Found by: call frame info
11  chrome!chromeos::ChromeSessionManager::Initialize(base::CommandLine const&, Profile*, bool) [chrome_session_manager.cc : 80 + 0xa]
    rbx = 0x00007ffc222be1b0   rbp = 0x00007ffc222be310
    rsp = 0x00007ffc222be120   r12 = 0x00005b77847a3876
    r13 = 0x00003efc9d703ce0   r14 = 0x00005b7785419350
    r15 = 0x0000000000000001   rip = 0x00005b777d389f22
    Found by: call frame info
12  chrome!chromeos::ChromeBrowserMainPartsChromeos::PostProfileInit() [chrome_browser_main_chromeos.cc : 938 + 0x7]
    rbx = 0x0000000000000000   rbp = 0x00007ffc222be470
    rsp = 0x00007ffc222be320   r12 = 0x00003efc9ead3aa0
    r13 = 0x00003efc9d713000   r14 = 0x00007ffc222be320
    r15 = 0x00003efc9d713000   rip = 0x00005b777d25a5aa
    Found by: call frame info
13  chrome!ChromeBrowserMainParts::PreMainMessageLoopRun() [chrome_browser_main.cc : 1884 + 0x9]
    rbx = 0x00003efc9d782540   rbp = 0x00007ffc222be6e0
    rsp = 0x00007ffc222be480   r12 = 0x00003efc9ead3aa0
    r13 = 0x00003efc9d713000   r14 = 0x000000013527f88f
    r15 = 0x002ecc841eb6c655   rip = 0x00005b777e01777e
    Found by: call frame info

 
Cc: alemate@chromium.org
+alemate

The crash is on chrome_user_manager_util.cc L117 (introduced in https://chromium-review.googlesource.com/c/chromium/src/+/1067682).

I suspect it is because "FindUser(owner_account_id)" returns null. Not sure how it happened though because that we checked owner_account_id to be valid and the owner should be in the user list to be found.

Maybe UserManager does not like the owner account id set in 

https://cs.chromium.org/chromium/src/chrome/browser/chromeos/login/users/chrome_user_manager_impl.cc?rcl=f73246ba7034539dbda70502ec21b53b06cd68b3&l=760
This is still happening periodically in the lab. Here are some symbolized crash dumps:

https://stainless.corp.google.com/browse/chromeos-autotest-results/207944040-chromeos-test/

Comment 3 by xiy...@chromium.org, Jun 14 2018

Another possibility is that we cleared "Local State" but not the policy blob file. Hence, we have an owner but not in the known user list.

Would the DUT starts the tests in a clean state that clears files?

Comment 4 by derat@chromium.org, Jun 14 2018

Cc: achuith@chromium.org
The test login code is here: https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/HEAD/src/chromiumos/tast/local/chrome/chrome.go

I modeled it on what Telemetry does. We typically call session_manager's EnableChromeTesting method, clear the user's cryptohome, and then call Oobe.loginForTesting. Chrome is restarted with --oobe-skip-postlogin and --disable-gaia-services, in case either of those are relevant.

Comment 5 by xiy...@chromium.org, Jun 15 2018

Thanks for the code link.

I am more interested to find out how DUT is prepared before running the test. Are we removing both "Local State" and policy blob file?

Comment 6 by derat@chromium.org, Jun 15 2018

This is just a regular DUT in the lab. The rest is in the file that I linked to, starting in the New function. We:

- write a test extension to disk
- call EnableChromeTesting per https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/HEAD/src/chromiumos/tast/local/chrome/chrome.go#237
- connect to Chrome using CDP
- tell cryptohomed to remove the user's cryptohome
- log in per https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/HEAD/src/chromiumos/tast/local/chrome/chrome.go#443

So I wouldn't expect us to be removing Local State or the policy blob, unless session_manager does that as part of EnableChromeTesting.

Also note that this isn't crashing consistently, as far as I know, so it seems likely to be a race.

Comment 7 by xiy...@chromium.org, Jun 15 2018

The race is probably because owner account id is only set after policy blob is verified. When we are lucky, the owner account id is not set when the login screen shows so we skipped the FindUser(owner_account_id)->GetType() call.

Not sure why FindUser could return nullptr while we have an owner account.

Is there a way to get the "Local State" file on the failed DUT?

Comment 8 by derat@chromium.org, Jun 15 2018

Hmm. cautotest currently lists the state of the DUT (chromeos6-row1-rack23-host7) state as "verifying", and it doesn't respond when I try to SSH to it.

The test log also shows that all other tests that tried to interact with Chrome in the same run had problems:

2018/06/05 17:01:49 ui.ARCStartup              [ FAIL ] Failed to connect to Chrome: cdp.Runtime: Evaluate: websocket: close 1006 (abnormal closure): unexpected EOF
2018/06/05 17:01:49 ui.ChromeCrashLoggedIn     [ FAIL ] failed to read Chrome debugging port from /home/chronos/DevToolsActivePort: context deadline exceeded
2018/06/05 17:01:49 ui.ChromeCrashNotLoggedIn  [ FAIL ] failed to read Chrome debugging port from /home/chronos/DevToolsActivePort: context deadline exceeded
2018/06/05 17:01:49 ui.ChromeLogin             [ FAIL ] Failed to connect to Chrome: failed to read Chrome debugging port from /home/chronos/DevToolsActivePort: context deadline exceeded
2018/06/05 17:01:49 ui.MashLogin               [ FAIL ] Chrome probably crashed on startup: failed to read Chrome debugging port from /home/chronos/DevToolsActivePort: context deadline exceeded

I think it's likely that Chrome was in a bad state on the DUT. James, have you seen the same crash in any other runs?
I have not seen this recently. chrome is usually crashing in ozone initialization right now, so it might not be getting to this code. I can't be sure.

Comment 12 by derat@chromium.org, Jun 19 2018

Hmm. Unfortunately, neither of those DUTs (chromeos4-row13-rack10-host9 or chromeos4-row13-rack10-host10) appear to still be in the bad state -- the baked-in copies of ui.ChromeLogin and ui.MashLogin pass on both now rather than failing.

Xiyuan, if Local State would be helpful in debugging this, I can update the test to collect it. Are there any other files or commands that would help in gathering info?
And the policy blob "/var/lib/whitelist/policy.*" that contains the owner email if possible.

Comment 14 by derat@chromium.org, Jun 19 2018

Okay! I've sent https://crrev.com/c/1107204 to collect those files.
Project Member

Comment 15 by bugdroid1@chromium.org, Jun 21 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/10248a86deeacaf9233b4c1405670eb75d30c0b6

commit 10248a86deeacaf9233b4c1405670eb75d30c0b6
Author: Daniel Erat <derat@chromium.org>
Date: Thu Jun 21 08:05:44 2018

tast-tests: Make ui.MashLogin collect debug files.

Make the ui.MashLogin test collect /home/chronos/Local State
and /var/lib/whitelist/policy.* on failure to try to track
down Chrome crashes in IsGuestSessionAllowed().

BUG= chromium:850139 
TEST=manual: ran it after inverting failure check and
     verified that expected files are in test's output dir

Change-Id: Id8ee2397d8ad0537927ebc68d009cf3a55e7f722
Reviewed-on: https://chromium-review.googlesource.com/1107204
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: James Cook <jamescook@chromium.org>
Reviewed-by: Shuhei Takahashi <nya@chromium.org>

[modify] https://crrev.com/10248a86deeacaf9233b4c1405670eb75d30c0b6/src/chromiumos/tast/local/bundles/cros/ui/mash_login.go

Comment 16 by derat@chromium.org, Jun 23 2018

The files are collected now, but this test is super-noisy due to issue 850168. :-/

(I also accidentally clobbered the error message in #15, which I'll fix now.)
Project Member

Comment 17 by bugdroid1@chromium.org, Jun 23 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/8102d60b79fc415924cd3083fa2aa05fc36f470b

commit 8102d60b79fc415924cd3083fa2aa05fc36f470b
Author: Daniel Erat <derat@chromium.org>
Date: Sat Jun 23 05:51:23 2018

tast-tests: Don't clobber error in ui.MashLogin.

Fix an issue introduced by 10248a86deea where ui.MashLogin
overwrote Chrome errors and reported '<nil>' instead.

BUG= chromium:850139 
TEST=built and ran test

Change-Id: Id448c23cfcfc04b4f971e6ca6aba15ac77c85c3c
Reviewed-on: https://chromium-review.googlesource.com/1112762
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: James Cook <jamescook@chromium.org>

[modify] https://crrev.com/8102d60b79fc415924cd3083fa2aa05fc36f470b/src/chromiumos/tast/local/bundles/cros/ui/mash_login.go

Project Member

Comment 18 by bugdroid1@chromium.org, Jun 26 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/tast-tests/+/0fd9562402a18268fddbda77fd741f7c1e731db7

commit 0fd9562402a18268fddbda77fd741f7c1e731db7
Author: Daniel Erat <derat@chromium.org>
Date: Tue Jun 26 14:24:57 2018

tast-tests: Move Chrome debugging to ui.ChromeLogin.

Move some debug code that captures Local State and policy
files from ui.MashLogin to ui.ChromeLogin. I added this to
help investigate IsGuestSessionAllowed segfaults in Chrome,
but ui.MashLogin is failing frequently due to an unrelated
issue (https://crbug.com/850168) and it's easier to identify
these segfaults in ui.ChromeLogin.

BUG= chromium:850139 
TEST=ran tests

Change-Id: Iebb092e46ef84b16c7229eede9a0cc790a9f159e
Reviewed-on: https://chromium-review.googlesource.com/1114314
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: James Cook <jamescook@chromium.org>
Reviewed-by: Shuhei Takahashi <nya@chromium.org>
Commit-Queue: Dan Erat <derat@chromium.org>

[modify] https://crrev.com/0fd9562402a18268fddbda77fd741f7c1e731db7/src/chromiumos/tast/local/bundles/cros/ui/chrome_login.go
[modify] https://crrev.com/0fd9562402a18268fddbda77fd741f7c1e731db7/src/chromiumos/tast/local/bundles/cros/ui/mash_login.go

Comment 19 by derat@chromium.org, Jun 27 2018

I finally got Local State and a policy file, but it looks like Xiyuan is out for a while.

Jacob, are you familiar with this code in Chrome?

The attached files are from http://stainless/browse/chromeos-autotest-results/212061640-chromeos-test/ . The crash dump is from a little bit earlier than that test failure, but Chrome looks like it was crashing every time with this stack.
log.txt
1.3 KB View Download
Local State.txt
4.1 KB View Download
policy.1
638 bytes Download
chrome.20180626.234329.5636.dmp.txt
119 KB View Download

Comment 20 by derat@chromium.org, Jun 27 2018

Some possibly-interesting bits from Local State:

  "KnownUsers": [
    
  ],

  "profile": {
    "info_cache": {
      "Default": {
        "avatar_icon": "chrome:\/\/theme\/IDR_PROFILE_AVATAR_26",
        "background_apps": false,
        "gaia_id": "",
        "is_ephemeral": false,
        "is_omitted_from_profile_list": false,
        "is_using_default_avatar": true,
        "is_using_default_name": true,
        "managed_user_id": "",
        "name": "First user",
        "user_name": ""
      }
    },
    "last_active_profiles": [
      
    ]
  },

I didn't decode the policy file, but I see a "first_user@nowhere.com" string in it.

I wouldn't be surprised if tests are leaving Chrome in a weird state. Should I just make the crashing code check for a null pointer?

Comment 21 by derat@chromium.org, Jun 27 2018

Cc: mnissler@chromium.org
"first_user@nowhere.com" appears in a bunch of Autotest login and cryptohome tests: https://cs.corp.google.com/search/?q=%22first_user@nowhere.com%22&m=100&det=matsel&sq=package:%5Echromeos_(internal%7Cpublic)$&type=cs

Should Tast clear /var/lib/whitelist/policy.* before restarting Chrome?
Yep, empty "KnownUsers" in Local state is the problem. It looks like Local State is nuked but we kept the policy blob around.

Adding a null check sgtm. 

Comment 23 by derat@chromium.org, Jun 29 2018

Owner: derat@chromium.org
Status: Started (was: Untriaged)
Sent https://crrev.com/c/1121377 as a speculative workaround.
Project Member

Comment 24 by bugdroid1@chromium.org, Jul 3

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e432cbc9d85b6d09c10410069c46a9480c69c122

commit e432cbc9d85b6d09c10410069c46a9480c69c122
Author: Daniel Erat <derat@chromium.org>
Date: Tue Jul 03 15:22:43 2018

chromeos: Speculative fix for segfault with missing owner.

Try to avoid a crash in a situation we've seen where a lab
system has a policy file referencing an owner but the user
list in the Local State file is empty.

Bug:  850139 
Change-Id: I4fad13685d8a8975e6b8f1e41a65bf37cdfcd89d
Reviewed-on: https://chromium-review.googlesource.com/1121377
Reviewed-by: Alexander Alekseev <alemate@chromium.org>
Commit-Queue: Dan Erat <derat@chromium.org>
Cr-Commit-Position: refs/heads/master@{#572221}
[modify] https://crrev.com/e432cbc9d85b6d09c10410069c46a9480c69c122/chrome/browser/chromeos/login/users/chrome_user_manager_util.cc
[modify] https://crrev.com/e432cbc9d85b6d09c10410069c46a9480c69c122/chrome/browser/ui/webui/chromeos/login/gaia_screen_handler.cc

Status: Fixed (was: Started)
Chrome finally got uprevved, and I don't think that this is happening anymore.
Issue 873834 has been merged into this issue.
Labels: Merge-Request-69
Per issue 873834, this appears to be hitting real users on M68. Since it's apparently triggered by a broken system configuration (the policy file says that device is owned, but KnownUsers in Local State doesn't list the owner), I'm not sure whether the workaround will improve things much, but it's typically better not to segfault. :-P

Requesting a merge to M-69, although the stable release is coming up soon.
Project Member

Comment 28 by sheriffbot@chromium.org, Sep 6

Labels: -Merge-Request-69 Merge-Review-69 Hotlist-Merge-Review
This bug requires manual review: Request affecting a post-stable build
Please contact the milestone owner if you have questions.
Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Merge-Review-69 Merge-Approved-69 M-69
Labels: -Hotlist-Merge-Review -Merge-Approved-69
Sigh, I must've misread the release calendar earlier. https://crrev.com/c/1121377 landed on July 3, so it's already included in M69 (branched July 19).

Sign in to add a comment