login_multipleSessions HW test is flaky. |
|||||||||||||
Issue descriptionFollow up for crbug.com/740828
,
Jul 11 2017
derat@, could you triage?
,
Jul 11 2017
Here's the output from the failure linked in the other bug:
07/10 22:24:20.391 DEBUG| utils:0212| Running 'status ui'
07/10 22:24:20.405 DEBUG| utils:0212| Running 'stop ui'
07/10 22:24:21.815 DEBUG| utils:0280| [stdout] ui stop/waiting
07/10 22:24:21.816 DEBUG| utils:0212| Running 'status ui'
07/10 22:24:21.826 DEBUG| utils:0212| Running 'bootstat_get_last login-prompt-visible'
07/10 22:24:21.837 DEBUG| utils:0212| Running 'start ui'
07/10 22:24:21.963 DEBUG| utils:0280| [stdout] ui start/running, process 22292
07/10 22:24:21.964 DEBUG| utils:0212| Running 'bootstat_get_last login-prompt-visible'
07/10 22:24:22.973 DEBUG| utils:0212| Running 'bootstat_get_last login-prompt-visible'
07/10 22:24:23.980 DEBUG| utils:0212| Running 'bootstat_get_last login-prompt-visible'
07/10 22:24:24.989 DEBUG| utils:0212| Running 'bootstat_get_last login-prompt-visible'
07/10 22:24:25.006 DEBUG| cros_disks:0071| Getting D-Bus proxy object on bus "org.chromium.Cryptohome" and path "/org/chromium/Cryptohome"
07/10 22:24:25.008 DEBUG| cros_disks:0111| Handling D-Bus signal "AsyncCallStatus(async_id, return_status, return_code)" on interface "org.chromium.CryptohomeInterface"
07/10 22:24:25.018 DEBUG| test:0363| starting before_iteration_hooks
07/10 22:24:25.055 INFO | base_sysinfo:0392| ChromeOS BOARD = link_1.8GHz_4GB
07/10 22:24:25.055 DEBUG| utils:0212| Running 'logger "autotest starting iteration /usr/local/autotest/results/default/login_MultipleSessions/sysinfo/iteration.1 on link_1.8GHz_4GB"'
07/10 22:24:25.062 DEBUG| test:0366| before_iteration_hooks completed
07/10 22:24:25.063 DEBUG| test:0380| starting test(run_once()), test details follow
()
07/10 22:24:25.069 DEBUG| cryptohome:0609| Issued call <dbus.proxies._DeferredMethod instance at 0x7f46352fa440> with async_id 15
07/10 22:24:25.070 DEBUG| cros_disks:0133| Waiting for D-Bus signal "AsyncCallStatus"
07/10 22:24:25.171 DEBUG| cros_disks:0138| Received D-Bus signal "AsyncCallStatus({'return_status': dbus.Boolean(True), 'async_id': dbus.Int32(15), 'return_code': dbus.Int32(0)})"
07/10 22:24:25.173 DEBUG| cryptohome:0609| Issued call <dbus.proxies._ProxyMethod instance at 0x7f46352fa440> with async_id 16
07/10 22:24:25.174 DEBUG| cros_disks:0133| Waiting for D-Bus signal "AsyncCallStatus"
07/10 22:24:25.777 DEBUG| cros_disks:0138| Received D-Bus signal "AsyncCallStatus({'return_status': dbus.Boolean(False), 'async_id': dbus.Int32(16), 'return_code': dbus.Int32(2)})"
07/10 22:24:25.781 DEBUG| test:0390| Test failed due to org.chromium.SessionManagerInterface.NoUserNssDb: Can't create session.. Exception log follows the after_iteration_hooks.
07/10 22:24:25.782 DEBUG| test:0393| starting after_iteration_hooks
07/10 22:24:25.782 DEBUG| utils:0212| Running 'mkdir -p /usr/local/autotest/results/default/login_MultipleSessions/sysinfo/iteration.1/var/spool'
07/10 22:24:25.793 DEBUG| utils:0212| Running 'rsync --no-perms --chmod=ugo+r -a --exclude=autoserv* --safe-links --exclude=*.core /var/spool/crash /usr/local/autotest/results/default/login_MultipleSessions/sysinfo/iteration.1/var/spool'
07/10 22:24:25.807 DEBUG| utils:0212| Running 'rm -rf /var/spool/crash/*'
07/10 22:24:25.821 DEBUG| utils:0212| Running 'logger "autotest finished iteration /usr/local/autotest/results/default/login_MultipleSessions/sysinfo/iteration.1"'
07/10 22:24:25.831 DEBUG| test:0396| after_iteration_hooks completed
07/10 22:24:25.835 WARNI| test:0616| The test failed with the following exception
Traceback (most recent call last):
File "/usr/local/autotest/common_lib/test.py", line 610, in _exec
_call_test_function(self.execute, *p_args, **p_dargs)
File "/usr/local/autotest/common_lib/test.py", line 824, in _call_test_function
raise error.UnhandledTestFail(e)
UnhandledTestFail: Unhandled DBusException: org.chromium.SessionManagerInterface.NoUserNssDb: Can't create session.
Traceback (most recent call last):
File "/usr/local/autotest/common_lib/test.py", line 818, in _call_test_function
return func(*args, **dargs)
File "/usr/local/autotest/common_lib/test.py", line 471, in execute
dargs)
File "/usr/local/autotest/common_lib/test.py", line 348, in _call_run_once_with_retry
postprocess_profiled_run, args, dargs)
File "/usr/local/autotest/common_lib/test.py", line 381, in _call_run_once
self.run_once(*args, **dargs)
File "/usr/local/autotest/tests/login_MultipleSessions/login_MultipleSessions.py", line 42, in run_once
self.__start_session_for(expected_owner)
File "/usr/local/autotest/tests/login_MultipleSessions/login_MultipleSessions.py", line 81, in __start_session_for
self._session_manager.StartSession(user, '')
File "/usr/local/lib64/python2.7/site-packages/dbus/proxies.py", line 145, in __call__
**keywords)
File "/usr/local/lib64/python2.7/site-packages/dbus/connection.py", line 651, in call_blocking
message, timeout)
DBusException: org.chromium.SessionManagerInterface.NoUserNssDb: Can't create session.
07/10 22:24:25.841 DEBUG| test:0621| Running cleanup for test.
07/10 22:24:25.841 DEBUG| utils:0212| Running 'stop ui'
07/10 22:24:27.212 DEBUG| utils:0280| [stdout] ui stop/waiting
07/10 22:24:27.212 DEBUG| utils:0212| Running 'bootstat_get_last login-prompt-visible'
07/10 22:24:27.221 DEBUG| utils:0212| Running 'start ui'
07/10 22:24:27.334 DEBUG| utils:0280| [stdout] ui start/running, process 22685
07/10 22:24:27.335 DEBUG| logging_manager:0627| Logging subprocess finished
07/10 22:24:27.338 DEBUG| logging_manager:0627| Logging subprocess finished
,
Jul 11 2017
Here's the failing code in SessionManagerImpl::CreateUserSession:
crypto::ScopedPK11Slot slot(nss_->OpenUserDB(GetUserPath(username)));
if (!slot) {
LOG(ERROR) << "Could not open the current user's NSS database.";
*error = CreateError(dbus_error::kNoUserNssDb, "Can't create session.");
return nullptr;
}
The test looks like it's just calls session_manager's StartSession D-Bus method for two different users, checks that the first one is the device owner, restarts session_manager, starts sessions in the opposite order, and then checks the owner again.
The test output isn't very good, but it looks like it's failing on the first StartSession call it makes. Here's the relevant portion of syslog:
2017-07-11T05:24:24.653013+00:00 INFO session_manager[22292]: [INFO:server_backed_state_key_generator.cc(131)] Stable device secret missing!
2017-07-11T05:24:25.005286+00:00 WARNING session_manager[22292]: [WARNING:exported_object.cc(214)] Unknown method: message_type: MESSAGE_METHOD_CALL#012destination: :1.262#012path: /org/chromium/SessionManager#012interface: org.freedesktop.DBus.Introspectable#012member: Introspect#012sender: :1.269#012serial: 3#012#012
2017-07-11T05:24:25.061887+00:00 NOTICE root[22515]: autotest starting iteration /usr/local/autotest/results/default/login_MultipleSessions/sysinfo/iteration.1 on link_1.8GHz_4GB
2017-07-11T05:24:25.173321+00:00 INFO cryptohomed[1475]: Asynced Mount() requested. Tracking request sequence id 16 for later PKCS#11 initialization.
2017-07-11T05:24:25.177676+00:00 WARNING cryptohomed[1475]: Could not load the device policy file.
2017-07-11T05:24:25.203209+00:00 INFO kernel: [ 1265.454620] tpm_tis tpm_tis: command 0x15 (size 14) returned code 0x0
2017-07-11T05:24:25.314244+00:00 INFO kernel: [ 1265.565679] tpm_tis tpm_tis: command 0x65 (size 18) returned code 0x0
2017-07-11T05:24:25.202848+00:00 WARNING cryptohomed[1475]: Could not load the device policy file.
2017-07-11T05:24:25.337716+00:00 ERR cryptohomed[1475]: TPM error 0x1 (Authentication failed): Error calling Tspi_Key_GetPubKey
2017-07-11T05:24:25.338227+00:00 INFO kernel: [ 1265.589584] tpm_tis tpm_tis: command 0x21 (size 14) returned code 0x1
2017-07-11T05:24:25.467212+00:00 INFO kernel: [ 1265.718539] tpm_tis tpm_tis: command 0x65 (size 18) returned code 0x0
2017-07-11T05:24:25.688679+00:00 ERR cryptohomed[1475]: TPM error 0x1 (Authentication failed): Error calling Tspi_Data_Unbind
2017-07-11T05:24:25.688934+00:00 ERR cryptohomed[1475]: The TPM failed to unwrap the intermediate key with the supplied credentials
2017-07-11T05:24:25.689248+00:00 INFO kernel: [ 1265.940472] tpm_tis tpm_tis: command 0x1e (size 274) returned code 0x1
2017-07-11T05:24:25.689286+00:00 ERR cryptohomed[1475]: Failed to decrypt any keysets for b02d33503f941632bdb4e3c9d220f6cf9e71eadb
2017-07-11T05:24:25.689984+00:00 INFO cryptohomed[1475]: An asynchronous mount request with sequence id: 16 finished; doing PKCS11 init...
2017-07-11T05:24:25.779539+00:00 INFO session_manager[22292]: [INFO:policy_key.cc(53)] No policy key on disk at /home/root/b02d33503f941632bdb4e3c9d220f6cf9e71eadb/session_manager/policy/key
2017-07-11T05:24:25.780623+00:00 ERR session_manager[22292]: [ERROR:nss_util.cc(129)] Error opening persistent database (configDir='sql:/home/user/b02d33503f941632bdb4e3c9d220f6cf9e71eadb/.pki/nssdb' tokenDescription='/home/user/b02d33503f941632bdb4e3c9d220f6cf9e71eadb'): -8018
2017-07-11T05:24:25.780653+00:00 ERR session_manager[22292]: [ERROR:session_manager_impl.cc(1261)] Could not open the current user's NSS database.
2017-07-11T05:24:25.780691+00:00 ERR session_manager[22292]: [ERROR:dbus_util.cc(14)] CreateError(...): Domain=dbus, Code=org.chromium.SessionManagerInterface.NoUserNssDb, Message=Can't create session.
Lots of warnings and errors. The TPM auth problems don't look good to me. Gwendal, can you help interpret them?
,
Jul 11 2017
https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/3154, vmtest failure due to: login_OwnershipApi timestamp=1499782148 localtime=Jul 11 09:09:08 Unhandled DBusException: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. Is it related to this bug?
,
Jul 11 2017
It seems unlikely. This bug is about an error returned in response to some (possibly TPM-related?) crypto problem. The login_OwnershipApi failures show something (session_manager? error message is useless) not responding to a D-Bus call, probably due to a hang or crash.
,
Jul 12 2017
Drive-by comment: I wouldn't be too surprised if most of these errors and warnings also show up on first login for a user. In particular, it's expected that there is no policy key on disk, and we'll also not be able to decrypt any keysets in cryptohome (although I don't know for sure whether these log messages are expected for that case). Regarding the NSS DB open errors, we might just look at a race condition. I *think* the NSS DB is created by Chrome, and session_manager then opens it. However, error code -8018 corresponds to SEC_ERROR_UNKNOWN_PKCS11_ERROR if I'm not mistaken (NSS error codes are a complex beast), so that might point at a TPM issue?
,
Jul 13 2017
I probably know less about NSS than anyone else on this thread (i.e. I know nothing, not even what it stands for), so I'm not a good choice for this. Mattias, I'll let you decide if you want to hand it off to someone else for the TPM angle. :-P
,
Jul 23 2017
Issue 709441 has been merged into this issue.
,
Jul 23 2017
Issue 721125 has been merged into this issue.
,
Jul 23 2017
Issue 728347 has been merged into this issue.
,
Jul 24 2017
[bvt-inline] login_MultipleSessions Failure on link-release/R62-9774.0.0 This report is automatically generated to track the following Failure: Test: login_MultipleSessions. Suite: bvt-inline. Chrome Version: 62.0.3165.0. Build: link-release/R62-9774.0.0. Reason: Unhandled DBusException: org.chromium.SessionManagerInterface.NoUserNssDb: Can't create session.. build artifacts: https://storage.cloud.google.com/?arg=chromeos-image-archive/link-release/R62-9774.0.0. results log: http://ubercautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/130071029-chromeos-test/chromeos4-row9-rack2-host1/debug/. status log: http://ubercautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/130071029-chromeos-test/chromeos4-row9-rack2-host1/status.log. buildbot stages: NA. job link: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=130071029. You may want to check the test history on wmatrix: https://wmatrix.googleplex.com/unfiltered?hide_missing=True&tests=login_MultipleSessions You may also want to check the test retry dashboard in case this is a flakey test: https://wmatrix.googleplex.com/retry_teststats/?days_back=30&tests=login_MultipleSessions ANCHOR TestFailure{bvt-inline,login_MultipleSessions,Unhandled DBusException: org.chromium.SessionManagerInterface.NoUserNssDb: Can't create session.}
,
Jul 29 2017
[bvt-inline] login_MultipleSessions Failure on link-release/R62-9789.0.0 This report is automatically generated to track the following Failure: Test: login_MultipleSessions. Suite: bvt-inline. Chrome Version: 62.0.3169.0. Build: link-release/R62-9789.0.0. Reason: Unhandled DBusException: org.chromium.SessionManagerInterface.NoUserNssDb: Can't create session.. build artifacts: https://storage.cloud.google.com/?arg=chromeos-image-archive/link-release/R62-9789.0.0. results log: http://ubercautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/131311818-chromeos-test/chromeos4-row5-rack13-host11/debug/. status log: http://ubercautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/131311818-chromeos-test/chromeos4-row5-rack13-host11/status.log. buildbot stages: NA. job link: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=131311818. You may want to check the test history on wmatrix: https://wmatrix.googleplex.com/unfiltered?hide_missing=True&tests=login_MultipleSessions You may also want to check the test retry dashboard in case this is a flakey test: https://wmatrix.googleplex.com/retry_teststats/?days_back=30&tests=login_MultipleSessions ANCHOR TestFailure{bvt-inline,login_MultipleSessions,Unhandled DBusException: org.chromium.SessionManagerInterface.NoUserNssDb: Can't create session.}
,
Aug 24 2017
Issue 758665 has been merged into this issue.
,
Sep 18 2017
Issue 755340 has been merged into this issue.
,
Sep 18 2017
,
Jun 8 2018
,
Oct 12
,
Dec 10
Checked in Stainless and looks like this is no longer flaky, marking fixed |
|||||||||||||
►
Sign in to add a comment |
|||||||||||||
Comment 1 by oka@chromium.org
, Jul 11 2017