
Issue 896903

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Last visit > 30 days ago
Closed: Oct 24
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug
Build-Toolchain




security_SandboxLinuxUnittests failed on eve with glibc 2.27

Project Member Reported by yunlian@chromium.org, Oct 18

Issue description

The test fails on eve. When I run
/var/cache/chromeos-chrome/chrome-src-internal/src/out_eve/Release/sandbox_linux_unittests inside the chroot,
it prints:
#0 0x5557ce5c0d2f (/var/cache/chromeos-chrome/chrome-src-internal/src/out_eve/Release/sandbox_linux_unittests+0xcdd2e)

[  FAILED  ] SandboxBPF.SigBus (3 ms)
[249/249] SandboxBPF.SigBus (3 ms)
1 test failed:
    SandboxBPF.SigBus (../../../../../chromeos-cache/distfiles/target/chrome-src-internal/src/sandbox/linux/integration_tests/bpf_dsl_seccomp_unittest.cc:677)
Tests took 2 seconds.

 
Cc: michael....@intel.corp-partner.google.com
Cc: palmer@chromium.org jorgelo@chromium.org
Components: Internals>Sandbox
Cc: allenwebb@chromium.org
Labels: -Pri-3 Pri-2
Can you provide more information about what build it is failing on and any local changes that might be present on the system?

Did it only happen once or does it happen every time?

I don't see failures for eve on stainless for that test, so I am wondering whether you changed something locally that triggers the failure (for example, the kernel or libc).
Cc: mpdenton@chromium.org rsesek@chromium.org
+ some Linux experts.
This failure happens when I upgrade glibc from 2.23 to 2.27. So yes, we changed the libc.
Could you please suggest how to triage the problem? Alternatively, I can send you a document describing how to reproduce it locally. (It may take several hours to rebuild everything with glibc 2.27.)
Here is the strace output from running this test with glibc 2.27 and with glibc 2.23.

On glibc 2.27
[pid   343] fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
[pid   344] set_robust_list(0x7fed20e97320, 24 <unfinished ...>
[pid   343] poll([{fd=5, events=POLLIN|POLLRDHUP}], 1, 60000 <unfinished ...>
[pid   344] <... set_robust_list resumed> ) = 0
[pid   344] dup2(6, 2)                  = 2
[pid   344] close(5)                    = 0
[pid   344] close(6)                    = 0
[pid   344] rt_sigaction(SIGALRM, {sa_handler=0x55dc66bdae30, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fed20875540}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
[pid   344] rt_sigprocmask(SIG_UNBLOCK, [ALRM], NULL, 8) = 0
[pid   344] alarm(30)                   = 0
[pid   344] prlimit64(0, RLIMIT_CORE, {rlim_cur=0, rlim_max=0}, NULL) = 0
[pid   344] prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, NULL) = -1 EFAULT (Bad address)
[pid   344] openat(AT_FDCWD, "/proc/", O_RDONLY|O_CLOEXEC|O_DIRECTORY) = 5
[pid   344] seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, NULL) = -1 EFAULT (Bad address)
[pid   344] newfstatat(5, "self/task/", {st_mode=S_IFDIR|0555, st_size=0, ...}, 0) = 0
[pid   344] close(5)                    = 0
[pid   344] rt_sigaction(SIGSYS, {sa_handler=0x55dc66c71a20, sa_mask=[], sa_flags=SA_RESTORER|SA_NODEFER|SA_SIGINFO, sa_restorer=0x7fed20875540}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
[pid   344] rt_sigprocmask(SIG_UNBLOCK, [SYS], NULL, 8) = 0
[pid   344] prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) = 0
[pid   344] seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, {len=16, filter=0x7ffffd7fc5e0}) = 0
[pid   344] socketpair(AF_UNIX, SOCK_STREAM, 0, [1724843113, 21980]) = 53
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7fed1f6bc34a, si_syscall=__NR_socketpair, si_arch=AUDIT_ARCH_X86_64} ---
[pid   344] rt_sigprocmask(SIG_BLOCK, [BUS], NULL, 8) = 0
[pid   344] socketpair(AF_UNIX, SOCK_STREAM, 0, [5, 6]) = 0
[pid   344] rt_sigreturn({mask=[]})     = 0
[pid   344] rt_sigaction(SIGBUS, {sa_handler=0x55dc66bb8940, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x7fed20875540}, NULL, 8) = 13
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7fed2087565d, si_syscall=__NR_rt_sigaction, si_arch=AUDIT_ARCH_X86_64} ---
[pid   344] rt_sigprocmask(SIG_BLOCK, [BUS], NULL, 8) = 0
[pid   344] rt_sigaction(SIGBUS, {sa_handler=0x55dc66bb8940, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x7fed20875540}, NULL, 8) = 0
[pid   344] rt_sigreturn({mask=[]})     = 0
[pid   344] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
[pid   344] getpid()                    = 39
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7fed208753a9, si_syscall=__NR_getpid, si_arch=AUDIT_ARCH_X86_64} ---
[pid   343] <... poll resumed> )        = 1 ([{fd=5, revents=POLLHUP}])
[pid   344] +++ killed by SIGSYS +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=344, si_uid=144797, si_status=SIGSYS, si_utime=0, si_stime=0} ---


On glibc 2.23

[pid   343] close(6)                    = 0
[pid   343] fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
[pid   344] set_robust_list(0x7f0ac4376a20, 24 <unfinished ...>
[pid   343] poll([{fd=5, events=POLLIN|POLLRDHUP}], 1, 60000 <unfinished ...>
[pid   344] <... set_robust_list resumed> ) = 0
[pid   344] dup2(6, 2)                  = 2
[pid   344] close(5)                    = 0
[pid   344] close(6)                    = 0
[pid   344] rt_sigaction(SIGALRM, {sa_handler=0x55d0fd05d200, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f0ac3d5ae80}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
[pid   344] rt_sigprocmask(SIG_UNBLOCK, [ALRM], NULL, 8) = 0
[pid   344] alarm(30)                   = 0
[pid   344] setrlimit(RLIMIT_CORE, {rlim_cur=0, rlim_max=0}) = 0
[pid   344] prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, NULL) = -1 EFAULT (Bad address)
[pid   344] open("/proc/", O_RDONLY|O_CLOEXEC|O_DIRECTORY) = 5
[pid   344] seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, NULL) = -1 EFAULT (Bad address)
[pid   344] newfstatat(5, "self/task/", {st_mode=S_IFDIR|0555, st_size=0, ...}, 0) = 0
[pid   344] close(5)                    = 0
[pid   344] rt_sigaction(SIGSYS, {sa_handler=0x55d0fd0fc8f0, sa_mask=[], sa_flags=SA_RESTORER|SA_NODEFER|SA_SIGINFO, sa_restorer=0x7f0ac3d5ae80}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
[pid   344] rt_sigprocmask(SIG_UNBLOCK, [SYS], NULL, 8) = 0
[pid   344] prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) = 0
[pid   344] seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_TSYNC, {len=16, filter=0x7ffdeb110f90}) = 0
[pid   344] socketpair(AF_UNIX, SOCK_STREAM, 0, [-351202000, 32765]) = 53
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7f0ac2c4483a, si_syscall=__NR_socketpair, si_arch=AUDIT_ARCH_X86_64} ---
[pid   344] rt_sigprocmask(SIG_BLOCK, [BUS], NULL, 8) = 0
[pid   344] socketpair(AF_UNIX, SOCK_STREAM, 0, [5, 6]) = 0
[pid   344] rt_sigreturn({mask=[]})     = 0
[pid   344] rt_sigaction(SIGBUS, {sa_handler=0x55d0fd017200, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x7f0ac3d5ae80}, NULL, 8) = 13
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7f0ac3d5af81, si_syscall=__NR_rt_sigaction, si_arch=AUDIT_ARCH_X86_64} ---
[pid   344] rt_sigprocmask(SIG_BLOCK, [BUS], NULL, 8) = 0
[pid   344] rt_sigaction(SIGBUS, {sa_handler=0x55d0fd017200, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x7f0ac3d5ae80}, NULL, 8) = 0
[pid   344] rt_sigreturn({mask=[]})     = 0
[pid   344] tgkill(344, 344, SIGBUS)    = 234
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7f0ac3d5ad39, si_syscall=__NR_tgkill, si_arch=AUDIT_ARCH_X86_64} ---
[pid   344] rt_sigprocmask(SIG_BLOCK, [BUS], NULL, 8) = 0
[pid   344] tgkill(344, 344, SIGBUS)    = 0
[pid   344] rt_sigreturn({mask=[]})     = 0
[pid   344] --- SIGBUS {si_signo=SIGBUS, si_code=SI_TKILL, si_pid=344, si_uid=144797} ---
[pid   344] write(6, "U", 1)            = 1
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7f0ac3d59fd0, si_syscall=__NR_write, si_arch=AUDIT_ARCH_X86_64} ---
[pid   344] write(6, "U", 1)            = 1
[pid   344] rt_sigreturn({mask=[BUS]})  = 1
[pid   344] rt_sigreturn({mask=[]})     = 0
[pid   344] read(5, "", 1)              = 0
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7f0ac2c52a0c, si_syscall=__NR_read, si_arch=AUDIT_ARCH_X86_64} ---
[pid   344] rt_sigprocmask(SIG_BLOCK, [BUS], NULL, 8) = 0
[pid   344] read(5, "U", 1)             = 1
[pid   344] rt_sigreturn({mask=[]})     = 1
[pid   344] close(5)                    = 3
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7f0ac3d5a090, si_syscall=__NR_close, si_arch=AUDIT_ARCH_X86_64} ---
[pid   344] rt_sigprocmask(SIG_BLOCK, [BUS], NULL, 8) = 0
[pid   344] close(5)                    = 0
[pid   344] rt_sigreturn({mask=[]})     = 0
[pid   344] close(6)                    = 3
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7f0ac3d5a090, si_syscall=__NR_close, si_arch=AUDIT_ARCH_X86_64} ---
[pid   344] rt_sigprocmask(SIG_BLOCK, [BUS], NULL, 8) = 0
[pid   344] close(6)                    = 0
[pid   344] rt_sigreturn({mask=[]})     = 0
[pid   344] exit_group(42)              = 231
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7f0ac2c11cc8, si_syscall=__NR_exit_group, si_arch=AUDIT_ARCH_X86_64} ---
[pid   344] rt_sigprocmask(SIG_BLOCK, [BUS], NULL, 8) = 0
[pid   344] exit_group(42)              = ?
[pid   344] +++ exited with 42 +++



For glibc 2.27, the trace seems to point to a problem with getpid, which we have seen in other places:


[pid   344] getpid()                    = 39
[pid   344] --- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_errno=ENOENT, si_call_addr=0x7fed208753a9, si_syscall=__NR_getpid, si_arch=AUDIT_ARCH_X86_64} ---







Cc: -palmer@chromium.org
This test is a bit harder to debug than system daemons because it specifically tests sandboxing/seccomp logic, so many of the SIGSYS signals in the trace are expected. That means you have to figure out which SIGSYS *shouldn't* be there :). Luis most likely did that with the getpid failure.
Yes, the sandboxing logic here works as follows: when the test makes a system call (say, socketpair), seccomp first delivers a SIGSYS. The signal handler then marks SIGBUS as blocked and re-issues the system call (socketpair) from a whitelisted location that seccomp allows, so the system call succeeds. On return from the signal handler, rt_sigreturn is called.

I think the glibc bug in question is https://sourceware.org/bugzilla/show_bug.cgi?id=15368. It changes the implementation of raise() (a libc function, not a system call) to block all signals, call getpid/gettid, and then finally call tgkill. Then I suppose getpid triggers a SIGSYS that kills the program. I'm not sure why.
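
To make the raise() sequence described above concrete, here is a minimal sketch of it written against the public libc API. It is an approximation for illustration only, not glibc's actual source: the helper name RaiseLikeNewGlibc is hypothetical, and glibc uses internal syscall wrappers rather than sigprocmask() and syscall(). The point is that getpid is invoked while signals are blocked, which matches the rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1]) followed by getpid() in the glibc 2.27 trace.

```cpp
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: approximates the order of operations that newer glibc
// raise() performs. Returns 0 on success, -1 on failure.
int RaiseLikeNewGlibc(int signo) {
  sigset_t all_signals;
  sigset_t old_mask;
  sigfillset(&all_signals);

  // Step 1: block signals so that no handler can run between looking up the
  // IDs and sending the signal. (On Linux, sigprocmask affects only the
  // calling thread.)
  if (sigprocmask(SIG_BLOCK, &all_signals, &old_mask) != 0)
    return -1;

  // Step 2: look up the process and thread IDs with real system calls.
  // Under a seccomp policy that traps getpid, this is where SIGSYS is raised
  // while signals are still blocked.
  pid_t pid = static_cast<pid_t>(syscall(SYS_getpid));
  pid_t tid = static_cast<pid_t>(syscall(SYS_gettid));

  // Step 3: send the signal to exactly this thread.
  int result = static_cast<int>(syscall(SYS_tgkill, pid, tid, signo));

  // Step 4: restore the previous mask; a pending signo is delivered here.
  sigprocmask(SIG_SETMASK, &old_mask, nullptr);
  return result;
}
```

Outside a sandbox this behaves like raise(): the signal stays pending during steps 2 and 3 and is delivered when the mask is restored in step 4.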
Oh, silly me: raise() has blocked all signals by the time it calls getpid(), so the SIGSYS that seccomp generates for getpid() cannot be delivered to our handler, and the kernel kills the process instead. Since none of the problems listed in the glibc bug apply to us, we can probably switch to using pthread_kill(pthread_self(), SIGBUS), or kill(getpid(), SIGBUS), instead of raise(SIGBUS).

Can you let me know if that fixes the problem?
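
For reference, the kill(getpid(), ...) alternative suggested above could look roughly like the sketch below. The helper names SendSignalToSelf and DeliverAndRecord are hypothetical and are not taken from bpf_dsl_seccomp_unittest.cc; the sketch assumes a single-threaded process, where a signal sent to the process itself is delivered before kill() returns. Unlike raise(), this path never changes the signal mask, so a getpid or kill that seccomp traps can still reach the SIGSYS handler.

```cpp
#include <signal.h>
#include <unistd.h>

namespace {

// Last signal number observed by RecordSignal.
volatile sig_atomic_t g_last_signal = 0;

// Async-signal-safe handler: only stores the signal number.
void RecordSignal(int signo) {
  g_last_signal = signo;
}

}  // namespace

// Hypothetical helper: sends signo to the current process without going
// through raise(), so no signals are blocked around the getpid call.
bool SendSignalToSelf(int signo) {
  return kill(getpid(), signo) == 0;
}

// Hypothetical helper: installs RecordSignal for signo, sends the signal,
// restores the previous disposition, and returns the recorded signal number
// (or -1 on failure).
int DeliverAndRecord(int signo) {
  struct sigaction action = {};
  action.sa_handler = RecordSignal;
  sigemptyset(&action.sa_mask);

  struct sigaction previous;
  if (sigaction(signo, &action, &previous) != 0)
    return -1;

  g_last_signal = 0;
  const bool sent = SendSignalToSelf(signo);
  sigaction(signo, &previous, nullptr);
  return sent ? static_cast<int>(g_last_signal) : -1;
}
```

In the test itself the call would simply be SendSignalToSelf(SIGBUS) in place of raise(SIGBUS). In a multi-threaded process, kill() may deliver the signal to any thread that has it unblocked, which is where pthread_kill(pthread_self(), SIGBUS) would be the closer match.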
mpdenton@ thanks, changing raise(SIGBUS) to kill(getpid(), SIGBUS) works for glibc 2.27. I will test whether it also works with glibc 2.23.
It works with glibc 2.23 too, I will prepare a patch for that.
Project Member

Comment 17 by bugdroid1@chromium.org, Oct 23

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/76af09195e66f80dd60b9c4e356c4bf8aa2c8567

commit 76af09195e66f80dd60b9c4e356c4bf8aa2c8567
Author: Yunlian Jiang <yunlian@google.com>
Date: Tue Oct 23 22:15:14 2018

Fix security_SandboxLinuxUnittests with glibc 2.27

glibc 2.27 changed the implementation of the "raise" system call
to block all signals and call getpid/gettid, and then finally call raise.
This implementation breaks our test, so we use 'kill' instead.

BUG= chromium:896903 
TEST=sandbox_linux_unittests passes on host with glibc 2.23 and glibc 2.27.

Change-Id: I4a037989cddb29d7dff2741f3abd03947feef8b2
Reviewed-on: https://chromium-review.googlesource.com/c/1294845
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Commit-Queue: Yunlian Jiang <yunlian@chromium.org>
Cr-Commit-Position: refs/heads/master@{#602129}
[modify] https://crrev.com/76af09195e66f80dd60b9c4e356c4bf8aa2c8567/sandbox/linux/integration_tests/bpf_dsl_seccomp_unittest.cc

Status: Verified (was: Untriaged)
