New issue
Advanced search Search tips

Issue 906138 link

Starred by 2 users

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

Login tests failing due to trunksd not running: blocked syscall stat64 on arm

Project Member Reported by derat@chromium.org, Nov 16

Issue description

In the scarlet-paladin run at http://stainless/browse/chromeos-autotest-results/257744577-chromeos-test/, all Tast tests that require login failed due to what looks like a cryptohomed timeout. Here's the first failing test:

2018/11/15 21:29:46 Started test ui.ChromeCrashLoggedIn
2018/11/15 21:29:46 [21:29:45.241] Restarting ui job
2018/11/15 21:29:47 [21:29:46.496] Waiting for org.chromium.SessionManager D-Bus service
2018/11/15 21:29:47 [21:29:46.532] Asking session_manager to enable Chrome testing
2018/11/15 21:29:47 [21:29:46.537] Waiting for Chrome to write its debugging port to /home/chronos/DevToolsActivePort
2018/11/15 21:29:48 [21:29:47.733] Removing cryptohome for testuser@gmail.com
2018/11/15 21:30:12 [21:30:11.296] Finding OOBE DevTools target
2018/11/15 21:30:12 [21:30:11.305] Connecting to Chrome at ws://127.0.0.1:37284/devtools/page/CE9FD1FD3E3E9B7A27EC8B4AB9580591
2018/11/15 21:30:12 [21:30:11.321] Waiting for OOBE
2018/11/15 21:30:12 [21:30:11.328] Logging in as user "testuser@gmail.com"
2018/11/15 21:30:12 [21:30:11.387] Waiting for cryptohome for user "testuser@gmail.com"
2018/11/15 21:31:46 [21:31:45.243] Failed to get cryptohome status
2018/11/15 21:31:46 [21:31:45.243] External command timed out
2018/11/15 21:31:46 [21:31:45.243] Command: cryptohome --action=status
2018/11/15 21:31:46 [21:31:45.243] Uncaptured output:
2018/11/15 21:31:46 [21:31:45.261] /sys/class/tpm/tpm0/device/owned contains "1"
2018/11/15 21:31:46 [21:31:45.264] Error at chrome_crash_logged_in.go:26: Chrome login failed: not mounted for testuser@gmail.com: context deadline exceeded; last error follows: /home/user/897ff8920e92a4a1359ad3adea101deb7d6a6486 not found
...

So in addition to the directory not getting mounted, I think that "cryptohome --action=status" might've also timed out when we ran it later.

It looks like there are a lot of problems in the messages file:

2018-11-16T05:29:45.095322+00:00 WARNING attestationd[7973]: Failed to initialize the trunks IPC proxy; trunksd is not ready.
2018-11-16T05:29:45.095352+00:00 ERR attestationd[7973]: Failed to initialize trunks.
2018-11-16T05:29:45.095938+00:00 CRIT attestationd[7973]: Check failed: default_tpm_utility_->Initialize(). #012
2018-11-16T05:29:45.121762+00:00 INFO crash_reporter[8090]: libminijail[8090]: mount '/dev/log' -> '/dev/log' type '' flags 0x1001
2018-11-16T05:29:45.145791+00:00 WARNING crash_reporter[8090]: Could not load the device policy file.
2018-11-16T05:29:45.146679+00:00 WARNING crash_reporter[8090]: [user] Received crash notification for attestationd[7973] sig 6, user 247 group 247 (developer build - not testing - always dumping)
2018-11-16T05:29:45.150468+00:00 INFO crash_reporter[8090]: State of crashed process [7973]: D (disk sleep)
2018-11-16T05:29:45.151366+00:00 INFO crash_reporter[8090]: Accessing crash dir '/var/spool/crash' via symlinked handle '/proc/self/fd/5'
2018-11-16T05:29:45.152728+00:00 WARNING crash_reporter[8090]: Crash directory /var/spool/crash already full with 32 pending reports
2018-11-16T05:29:45.152836+00:00 ERR crash_reporter[8090]: Could not create crash directory
2018-11-16T05:29:45.152867+00:00 ERR crash_reporter[8090]: Unable to find/create process-specific crash path
...
2018-11-16T05:30:11.287556+00:00 ERR cryptohomed[1482]: AddDBusError(...): Domain=dbus, Code=org.freedesktop.DBus.Error.ServiceUnknown, Message=The name org.chromium.TpmManager was not provided by any .service files
2018-11-16T05:30:11.290059+00:00 WARNING cryptohomed[1482]: No valid keysets on disk for 897ff8920e92a4a1359ad3adea101deb7d6a6486
2018-11-16T05:30:11.333353+00:00 INFO tpm_managerd[8691]: Starting TPM Manager...
2018-11-16T05:30:11.343744+00:00 ERR cryptohomed[1482]: AddDBusError(...): Domain=dbus, Code=org.freedesktop.DBus.Error.ServiceUnknown, Message=The name org.chromium.TpmManager was not provided by any .service files
2018-11-16T05:30:11.346414+00:00 ERR cryptohomed[1482]: TpmIsReady: is not enabled.
2018-11-16T05:30:11.346493+00:00 ERR cryptohomed[1482]: HasAuthorization: TPM not ready.
2018-11-16T05:30:11.346532+00:00 ERR cryptohomed[1482]: Destroy() called with insufficient authorization.
2018-11-16T05:30:11.356548+00:00 WARNING cryptohomed[1482]: Could not load the device policy file.
2018-11-16T05:30:11.383314+00:00 WARNING cryptohomed[1482]: message repeated 2 times: [ Could not load the device policy file.]
2018-11-16T05:30:11.386020+00:00 INFO cryptohomed[1482]: Finalizing() 2 bytes.
2018-11-16T05:30:11.392612+00:00 INFO cryptohomed[1482]: InstallAttributes have been finalized.
...

There are a bunch of attestationd crashes. Here's one:

Crash reason:  SIGABRT
Crash address: 0x0
Process uptime: not available

Thread 0 (crashed)
 0  libc-2.27.so!__libc_do_syscall + 0x5
     r0 = 0x00000000    r1 = 0xf2f35fcc    r2 = 0x00000000    r3 = 0x00000008
     r4 = 0xf3a33ac0    r5 = 0x00000000    r6 = 0xf2f35fcc    r7 = 0x000000af
     r8 = 0xf2f36224    r9 = 0xf2f36704   r10 = 0xf2f36700   r12 = 0x000000af
     fp = 0xf3a33ac0    sp = 0xf2f35fc0    lr = 0xf337118f    pc = 0xf33622d6
    Found by: given as instruction pointer in context
 1  libc-2.27.so!raise [nptl-signals.h : 80 + 0xf]
     r4 = 0xf3a33ac0    r5 = 0x00000000    r6 = 0xf2f35fcc    r7 = 0x00000006
     r8 = 0xf2f36224    r9 = 0xf2f36704   r10 = 0xf2f36700    fp = 0xf3a33ac0
     sp = 0xf2f35fc8    pc = 0xf337118f
    Found by: call frame info
 2  libc-2.27.so!abort [abort.c : 79 + 0x5]
     r4 = 0xf3442f08    r5 = 0xf2f39450    r6 = 0x00000000    r7 = 0xf2f36208
     r8 = 0xf2f36224    r9 = 0xf2f36704   r10 = 0xf2f36700    fp = 0xf3a33ac0
     sp = 0xf2f360e8    pc = 0xf3371f13
    Found by: call frame info
 3  libbase-core-456626.so!base::debug::BreakDebugger() [debugger_posix.cc : 221 + 0x3]
     r4 = 0xf3442f08    r5 = 0xf2f39450    r6 = 0x00000000    r7 = 0xf2f36208
     r8 = 0xf2f36224    r9 = 0xf2f36704   r10 = 0xf2f36700    fp = 0xf3a33ac0
     sp = 0xf2f36208    pc = 0xf37713bb
    Found by: call frame info
 4  libbase-core-456626.so!logging::LogMessage::~LogMessage() [logging.cc : 774 + 0x3]
     r4 = 0xf3442f08    r5 = 0xf2f39450    r6 = 0x00000000    r7 = 0xf2f36650
     r8 = 0xf2f36224    r9 = 0xf2f36704   r10 = 0xf2f36700    fp = 0xf3a33ac0
     sp = 0xf2f36210    pc = 0xf3788aa9
    Found by: call frame info
 5  attestationd!attestation::AttestationService::InitializeTask() [attestation_service.cc : 419 + 0x1d]
     r4 = 0xffcae9d8    r5 = 0xf381ea28    r6 = 0x00000000    r7 = 0xf2f38720
     r8 = 0x00000000    r9 = 0xf2f387c8   r10 = 0xf3a33ac0    fp = 0xf3a33ac0
     sp = 0xf2f36658    pc = 0x0dbab43b
    Found by: call frame info
...

The bob-paladin run at http://stainless/browse/chromeos-autotest-results/257746405-chromeos-test/ looks like it had similar problems, but it also captured some tpm_managerd crashes:

Crash reason:  SIGABRT
Crash address: 0x0
Process uptime: not available

Thread 0 (crashed)
 0  libc-2.27.so!__libc_do_syscall + 0x5
     r0 = 0x00000000    r1 = 0xee418f04    r2 = 0x00000000    r3 = 0x00000008
     r4 = 0xeedc8ac0    r5 = 0x00000000    r6 = 0xee418f04    r7 = 0x000000af
     r8 = 0xee41915c    r9 = 0xee41959c   r10 = 0xee419598   r12 = 0x000000af
     fp = 0xeedc8ac0    sp = 0xee418ef8    lr = 0xee7e918f    pc = 0xee7da2d6
    Found by: given as instruction pointer in context
 1  libc-2.27.so!raise [nptl-signals.h : 80 + 0xf]
     r4 = 0xeedc8ac0    r5 = 0x00000000    r6 = 0xee418f04    r7 = 0x00000006
     r8 = 0xee41915c    r9 = 0xee41959c   r10 = 0xee419598    fp = 0xeedc8ac0
     sp = 0xee418f00    pc = 0xee7e918f
    Found by: call frame info
 2  libc-2.27.so!abort [abort.c : 79 + 0x5]
     r4 = 0xee8baf08    r5 = 0xee41a450    r6 = 0x00000000    r7 = 0xee419140
     r8 = 0xee41915c    r9 = 0xee41959c   r10 = 0xee419598    fp = 0xeedc8ac0
     sp = 0xee419020    pc = 0xee7e9f13
    Found by: call frame info
 3  libbase-core-456626.so!base::debug::BreakDebugger() [debugger_posix.cc : 221 + 0x3]
     r4 = 0xee8baf08    r5 = 0xee41a450    r6 = 0x00000000    r7 = 0xee419140
     r8 = 0xee41915c    r9 = 0xee41959c   r10 = 0xee419598    fp = 0xeedc8ac0
     sp = 0xee419140    pc = 0xeebb73bb
    Found by: call frame info
 4  libbase-core-456626.so!logging::LogMessage::~LogMessage() [logging.cc : 774 + 0x3]
     r4 = 0xee8baf08    r5 = 0xee41a450    r6 = 0x00000000    r7 = 0xee419588
     r8 = 0xee41915c    r9 = 0xee41959c   r10 = 0xee419598    fp = 0xeedc8ac0
     sp = 0xee419148    pc = 0xeebceaa9
    Found by: call frame info
 5  libtrunks.so!trunks::TpmStateImpl::IsPlatformHierarchyEnabled() [tpm_state_impl.cc : 97 + 0x19]
     r4 = 0xedb03b30    r5 = 0xff9a4cec    r6 = 0xedb01d20    r7 = 0xee419648
     r8 = 0xeedc8ac0    r9 = 0xedb03b30   r10 = 0xeedc8ac0    fp = 0xeedc8ac0
     sp = 0xee419590    pc = 0xeed76cb5
    Found by: call frame info
 6  libtrunks.so!trunks::TpmStateImpl::IsEnabled() [tpm_state_impl.cc : 115 + 0x7]
     r4 = 0xedb03b30    r5 = 0xff9a4cec    r6 = 0xedb01d20    r7 = 0xee419658
     r8 = 0xff9a4cd0    r9 = 0xee419668   r10 = 0xeedc8ac0    fp = 0xeedc8ac0
     sp = 0xee419650    pc = 0xeed76f23
    Found by: call frame info
 7  tpm_managerd!tpm_manager::TpmManagerService::InitializeTask() [tpm_manager_service.cc : 116 + 0x7]
     r4 = 0xff9a4d1c    r5 = 0xff9a4cec    r6 = 0xedb01d20    r7 = 0xee419720
     r8 = 0xff9a4cd0    r9 = 0xee419668   r10 = 0xeedc8ac0    fp = 0xeedc8ac0
     sp = 0xee419660    pc = 0x0cd4e303
    Found by: call frame info
...
 
As usual, the problem is with trunksd not running.
Other daemons -tpm_managerd, attestationd - crash by design when the one daemon that connects them to TPM is not there.

E.g., in scarlet case:
2018-11-16T05:33:47.634519+00:00 WARNING attestationd[9688]: Failed to initialize the trunks IPC proxy; trunksd is not ready.
2018-11-16T05:33:47.634582+00:00 ERR attestationd[9688]: Failed to initialize trunks.
2018-11-16T05:33:47.635629+00:00 CRIT attestationd[9688]: Check failed: default_tpm_utility_->Initialize(). #012

In bob case:
22018-11-16T05:50:23.146641+00:00 WARNING tpm_managerd[10691]: Failed to initialize the trunks IPC proxy; trunksd is not ready.
2018-11-16T05:50:23.147326+00:00 ERR tpm_managerd[10691]: Error TrunksDBusProxy cannot connect to trunksd.
2018-11-16T05:50:23.147363+00:00 ERR tpm_managerd[10691]: GetCapability: SAPI_RC_NO_CONNECTION
2018-11-16T05:50:23.147378+00:00 ERR tpm_managerd[10691]: Failed to query TPM properties: SAPI_RC_NO_CONNECTION
2018-11-16T05:50:23.147401+00:00 WARNING tpm_managerd[10691]: Error initializing trunks tpm state: SAPI_RC_NO_CONNECTION
2018-11-16T05:50:23.147998+00:00 CRIT tpm_managerd[10691]: Check failed: initialized_. #012

Now the real question is why trunksd is not running. The logs I saw trimmed the start of the /var/log/messages, so no info from the time when trunksd was supposed to be started.

Last time trunksd was crashing on start it was a missing syscall in seccomp allow-list:  issue 897541 .
Were there any recent libchrome/libbrillo uprevs?
Note: bob and scarlet paladins both had two green runs after that failed run.
The failed runs are:
 - scarlet: http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8929733737596891872
 - bob: http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8929733862480839104

In both cases the bvt-tast-cq issues unsurprisingly started with failed platform.TPMResponsive, and bvt-inline had even more login-related test failures.

Looking at one of the previous failed tests on scarlet: http://cautotest-prod/tko/retrieve_logs.cgi?job=/results/257746439-chromeos-test/, /var/log/messages contains the reason for trunksd failing:

2018-11-16T05:24:25.456544+00:00 INFO trunksd[1404]: TPM: /dev/tpm0 opened successfully
2018-11-16T05:24:25.457123+00:00 WARNING trunksd[1404]: libminijail[1404]: allowing syscall: socket
2018-11-16T05:24:25.457221+00:00 WARNING trunksd[1404]: libminijail[1404]: allowing syscall: connect
2018-11-16T05:24:25.457265+00:00 WARNING trunksd[1404]: libminijail[1404]: allowing syscall: gettimeofday
2018-11-16T05:24:25.457306+00:00 WARNING trunksd[1404]: libminijail[1404]: allowing syscall: send
2018-11-16T05:24:25.457350+00:00 WARNING trunksd[1404]: libminijail[1404]: allowing syscall: writev
2018-11-16T05:24:25.457972+00:00 WARNING trunksd[1404]: libminijail[1404]: logging seccomp filter failures
2018-11-16T05:24:25.458440+00:00 ERR trunksd[1404]: libminijail[1404]: blocked syscall: stat64

Both bob & scarlet are aarch64 with TPM2.0 (so, trunks), so something must have changed for that arch in the underlying libs. I wonder why subsequent runs succeeded - maybe fixed already?
Cc: ejcaruso@chromium.org
Status: WontFix (was: Untriaged)
http://stainless/search?view=list&first_date=2018-11-15&last_date=2018-11-16&hostname=%5Echromeos6-row6-rack8-host10%24&exclude_cts=false&exclude_not_run=false&exclude_non_release=false&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false lists all of the tests run on one of the DUTs; I've been able to use this before to find earlier snapshots of logs.

But even going back to http://stainless/browse/chromeos-autotest-results/257738490-chromeos-test/ , the logs are truncated. I at least see references to trunksd crash files there, but it doesn't look like any of them got saved.

No luck with the other host, either.

It's the CQ, so it's possible someone was trying to land a broken change. I scanned through the changes at http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8929733737596891872 but didn't see anything obvious, though.

I mostly just filed this to try to make sure this wasn't a problem that Tast can work around when running tests. WontFix-ing since it doesn't seem like there's anything actionable here. Thanks for the help looking at it!
Components: -Tests>Tast OS>Kernel>TPM
Labels: Hotlist-TreeCloser
Status: Assigned (was: WontFix)
I think that this is happening again, in both Autotest and Tast tests:

bob-paladin: http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8929315887814567024, http://stainless/browse/chromeos-autotest-results/259439666-chromeos-test/

scarlet-paladin: http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8929315749720015040, http://stainless/browse/chromeos-autotest-results/259439657-chromeos-test/

There are a bunch of attestationd and tpm_managerd crashes. I've attached some stack traces.

The platform.TPMResponsive Tast test, which just runs the "tpm_version" command and checks for "Version Info" in its output, is failing, so the problem is presumably lower-level than Chrome:

2018/11/20 12:51:28 Started test platform.TPMResponsive
2018/11/20 12:51:56 [12:51:56.418] External command failed: exit status 134
2018/11/20 12:51:56 [12:51:56.418] Command: tpm_version
2018/11/20 12:51:56 [12:51:56.419] Uncaptured output:
[WARNING:trunks_factory_impl.cc(158)] Failed to initialize the trunks IPC proxy; trunksd is not ready.
[FATAL:trunks_client.cc(353)] Check failed: factory.Initialize(). Failed to initialize trunks factory.

Aborted (core dumped)
2018/11/20 12:51:56 [12:51:56.419] Error at tpm_responsive.go:28: Failed to run tpm_version: exit status 134

---

I don't know why trunksd is unhappy, though.
Summary: Login tests failing due to trunksd not running (was: Login tests timed out while waiting for cryptohome mount with lots of attestationd and tpm_managerd crashes)
Cc: mnissler@chromium.org
From messages in http://stainless/browse/chromeos-autotest-results/259439666-chromeos-test/ :

2018-11-20T20:20:26.057619+00:00 INFO trunksd[1347]: TPM: /dev/tpm0 opened successfully
2018-11-20T20:20:26.058723+00:00 WARNING trunksd[1347]: libminijail[1347]: allowing syscall: socket
2018-11-20T20:20:26.058963+00:00 WARNING trunksd[1347]: libminijail[1347]: allowing syscall: connect
2018-11-20T20:20:26.059061+00:00 WARNING trunksd[1347]: libminijail[1347]: allowing syscall: gettimeofday
2018-11-20T20:20:26.059143+00:00 WARNING trunksd[1347]: libminijail[1347]: allowing syscall: send
2018-11-20T20:20:26.059227+00:00 WARNING trunksd[1347]: libminijail[1347]: allowing syscall: writev
2018-11-20T20:20:26.061391+00:00 WARNING trunksd[1347]: libminijail[1347]: logging seccomp filter failures
2018-11-20T20:20:26.062291+00:00 ERR trunksd[1347]: libminijail[1347]: blocked syscall: stat64
2018-11-20T20:20:26.072261+00:00 INFO tpm_managerd[1386]: Starting TPM Manager...
2018-11-20T20:20:26.076387+00:00 NOTICE u2fd[1401]: Exiting, no device policy
2018-11-20T20:20:26.089374+00:00 INFO crash_reporter[1395]: libminijail[1395]: mount '/dev/log' -> '/dev/log' type '' flags 0x1001
2018-11-20T20:20:26.113063+00:00 WARNING crash_reporter[1395]: Could not load the device policy file.
2018-11-20T20:20:26.113815+00:00 WARNING crash_reporter[1395]: [user] Received crash notification for trunksd[1347] sig 6, user 251 group 251 (developer build - not testing - always dumping)
...
2018-11-20T20:20:26.208656+00:00 WARNING session_manager[1437]: [WARNING:libpolicy.cc(38)] Could not load the device policy file.
2018-11-20T20:20:26.219063+00:00 INFO crash_reporter[1395]: Stored minidump to /var/spool/crash/trunksd.20181120.202026.1347.dmp
2018-11-20T20:20:26.220744+00:00 INFO crash_reporter[1395]: Leaving core file at /proc/self/fd/5/trunksd.20181120.202026.1347.core due to developer image
2018-11-20T20:20:26.231571+00:00 INFO minijail0[1497]: libminijail[1497]: not running init loop, exiting immediately
2018-11-20T20:20:26.234942+00:00 WARNING kernel: [   13.496069] init: trunksd main process (1347) killed by ABRT signal
2018-11-20T20:20:26.234964+00:00 WARNING kernel: [   13.496106] init: trunksd main process ended, respawning
...
2018-11-20T20:20:26.399440+00:00 WARNING kernel: [   13.661554] init: trunksd main process (1503) killed by ABRT signal
2018-11-20T20:20:26.399479+00:00 WARNING kernel: [   13.661654] init: trunksd main process ended, respawning
2018-11-20T20:20:26.428916+00:00 INFO trunksd[1548]: TPM: /dev/tpm0 opened successfully
2018-11-20T20:20:26.429318+00:00 WARNING trunksd[1548]: libminijail[1548]: allowing syscall: socket
2018-11-20T20:20:26.429342+00:00 WARNING trunksd[1548]: libminijail[1548]: allowing syscall: connect
2018-11-20T20:20:26.429352+00:00 WARNING trunksd[1548]: libminijail[1548]: allowing syscall: gettimeofday
2018-11-20T20:20:26.429360+00:00 WARNING trunksd[1548]: libminijail[1548]: allowing syscall: send
2018-11-20T20:20:26.429371+00:00 WARNING trunksd[1548]: libminijail[1548]: allowing syscall: writev
2018-11-20T20:20:26.429648+00:00 WARNING trunksd[1548]: libminijail[1548]: logging seccomp filter failures
2018-11-20T20:20:26.429884+00:00 INFO trunksd[1548]: Trunks service started.
2018-11-20T20:20:26.430269+00:00 ERR trunksd[1548]: libminijail[1548]: blocked syscall: stat64
...

I don't know much about anything, but does stat64 or fstat64 need to be added to the seccomp policy? I see this in platform2/trunks:

% grep stat64 *.policy
trunksd-seccomp-arm.policy:fstat64: 1
trunksd-seccomp-mips.policy:fstat64: 1
trunksd-seccomp-x86.policy:fstat64: 1
trunksd-seccomp-x86.policy:stat64: 1
Re #6: and this is bob (which uses arm.policy, as does scarlet). Looks like something has indeed changed, and arm boards now call stat64 in some cases. I will submit a CL to add stat64 to trunksd-seccomp-arm.policy. Not sure if I should do it for other architectures, besides x86 and arm. So far, only bob and scarlet are failing.
Status: Started (was: Assigned)
Thanks! I'm happy to LGTM.
Summary: Login tests failing due to trunksd not running: blocked syscall stat64 on arm (was: Login tests failing due to trunksd not running)
Submitted https://crrev.com/c/1344657 since we know stat64 is being called for arm, so at least this change is needed anyways.

But given that we don't know what change led to stat64 being occasionally called, and it only happens infrequently: 

Is there any pattern like "tests fail only after running after Repair job on the device" or "only if there were no reboots since powerwash"? I'd like to repro to make sure we don't miss other unexpected syscalls.
Re patterns, I'm not sure. cautotest's UI for displaying a host's history is user-hostile enough that I always give up before I'm able to find the information that I'm looking for. :-(

If you want to try, the failing test run on chromeos6-row4-rack13-host5 started at 2018/11/20 12:50:57 and the run on chromeos6-row2-rack7-host15 started at 2018/11/20 12:51:25.
Just a guess, nevertheless: Maybe this change in the set of employed syscalls is caused by the recent glibc update ( bug 834385 )?
> is caused by the recent glibc update

Yes, quite likely. But we seem to lack a test that'd hit the branch with stat64 from trunksd consistently :(

I don't see where we directly work with file paths in trunksd (besides /dev/tpm0), so if I had to guess, I'd say some code in libchrome/libbrillo - e.g. logger or some dbus handling, or even minijail->Enter - coupled with this glibc update...

If only there was a way for libs like libchrome to list syscalls it needs for various sub-APIs, so that the users can include those in their seccomp filters. 

Otherwise, we risk releasing a version that occasionally crashes like here (say, when it needs to switch to a new logfile or something) but we don't catch it in tests. Or, we should stop using those libs from daemons with seccomp filters or become intimately familiar with what exact system functions they call.
Cc: vapier@chromium.org
RE comment 12: Yes, this situation seems risky, but it looks like that's the official policy in Chrome OS - it's probably expected that the autotest coverage would be sufficient to catch such errors, and that developers are monitoring autotests results for any SIGSYS-related crashes; see Mike's response to our question here:

https://chromium-review.googlesource.com/c/chromiumos/platform2/+/1163715/1/diagnostics/init/diagnosticsd.conf#37

+vapier@
Project Member

Comment 14 by bugdroid1@chromium.org, Nov 21

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/9bb0baf5fc0f37ea5889e7cabaef8a80b72744b2

commit 9bb0baf5fc0f37ea5889e7cabaef8a80b72744b2
Author: Andrey Pronin <apronin@chromium.org>
Date: Wed Nov 21 13:52:58 2018

trunks: allow stat64 for arm and mips

Add stat64 to the list of allowed syscalls for arm and mips.

BUG=chromium:906138
TEST=emerge-bob trunks;
     check that trunksd-seccomp.policy now contains stat64

Change-Id: Ib84cf23b14776b718629efbe300f5332fc532e62
Reviewed-on: https://chromium-review.googlesource.com/1344657
Commit-Ready: Andrey Pronin <apronin@chromium.org>
Tested-by: Andrey Pronin <apronin@chromium.org>
Reviewed-by: Dan Erat <derat@chromium.org>

[modify] https://crrev.com/9bb0baf5fc0f37ea5889e7cabaef8a80b72744b2/trunks/trunksd-seccomp-mips.policy
[modify] https://crrev.com/9bb0baf5fc0f37ea5889e7cabaef8a80b72744b2/trunks/trunksd-seccomp-arm.policy

Sign in to add a comment