
Issue 899484

Starred by 2 users

Issue metadata

Status: Duplicate
Merged: issue 888212
Owner:
Closed: Oct 30
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocking:
issue 897240




crosvm crash while doing a large amount of I/O in Crostini

Project Member Reported by cylee@chromium.org, Oct 27

Issue description

I encountered this problem while developing a disk I/O benchmark for Crostini. I tried to use fio to measure disk write speed, but it often crashes the container. It doesn't happen 100% of the time, but it's more likely when I run fio repeatedly or write a large amount of data. I eventually came up with steps that reproduce it with high probability.

1) Create a whole new container instance and get a container shell.
2) sudo apt-get install fio
3) fio -filename=fio_test_data -size=1G -bs=1m -readwrite=write -ioengine=libaio -iodepth=1 -direct=1 -name=test -loops=15
(It writes to a 1G file 15 times; if the -loops option is omitted, the error is unlikely to reproduce.)

Result:
Starting 1 process
test: Laying out IO file(s) (1 file(s) / 1024MB)
[ERROR:utils.cc(50)] Failed to read message size from socket: Resource temporarily unavailable
[ERROR:vsh_client.cc(181)] Failed to receive message from server: Resource temporarily unavailable

After that, the crosvm process is dead, so I can't get into the container anymore.

localhost /mnt/stateful_partition/fio # vsh --vm_name=termina --target_container=penguin --owner_id=7ca726520bacb99197e033e54850c6ea9f73a304
[ERROR:vsh.cc(214)] Failed to accept connection from daemon: Resource temporarily unavailable

To reproduce the problem, I made a simple Tast test:
https://chromium-review.googlesource.com/c/chromiumos/platform/tast-tests/+/1303774
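
For reference, running the test from the Chrome OS chroot looks roughly like this (the DUT address and the test name below are placeholders; the CL has the actual name):

tast run <dut-ip> crostini.FioCrash    # test name is a placeholder, see the CL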


FWIW, there are some other error messages that occasionally show up when running fio. For example, I once saw this error (a previous run of the same command had succeeded):
fio: io_u error on file fio_test_data: Read-only file system: write offset=297795584, buflen=4194304

After the VM crashed, I tried to restart it; the system seems to be left in an unstable state and may fail to restart. I once saw
"failed to mount stateful: Input/output error"
when trying to restart the VM.

Also, bash auto-completion in the container shell once gave me
testuser@penguin:~$ ls
-bash: cannot create temp file for here-document: Read-only file system
-bash: cannot create temp file for here-document: Read-only file system
after a failed fio run.
 
Blocking: 897240
Note that when I say "creating a new container", I mean using the Tast test to create a new VM/container. The problem mostly occurs while the Tast program is running. For example, after creating the container with Tast, the problem happens whether I run fio programmatically or get a vsh shell and run fio manually (leaving the Tast program running via time.Sleep()). If I get the vsh shell and run fio after Tast exits, the problem almost never happens. So it could also be some interaction with Tast that causes the problem.
Owner: dverkamp@chromium.org
Status: Assigned (was: Untriaged)
Labels: -Pri-2 M-72 Pri-1
FYI, here's the relevant log dumped from /var/log/messages

2018-10-29T19:13:41.624657+08:00 INFO vm_cicerone[30690]: Received LaunchVshd request
2018-10-29T19:14:27.290833+08:00 NOTICE temp_logger[31057]: 00:79 02:32 03:31 04:33 05:33 06:82 07:32
2018-10-29T19:14:50.893840+08:00 INFO crash_reporter[31086]: libminijail[31086]: mount '/dev/log' -> '/dev/log' type '' flags 0x1001
2018-10-29T19:14:50.908731+08:00 WARNING crash_reporter[31086]: [user] Received crash notification for crosvm[30702] sig 31, user 299 group 299 (developer build - not testing - always dumping)
2018-10-29T19:14:50.910029+08:00 INFO crash_reporter[31086]: State of crashed process [30702]: D (disk sleep)
2018-10-29T19:14:50.910459+08:00 INFO crash_reporter[31086]: Accessing crash dir '/var/spool/crash' via symlinked handle '/proc/self/fd/6'
2018-10-29T19:14:50.910760+08:00 WARNING crash_reporter[31086]: Crash directory /var/spool/crash already full with 32 pending reports
2018-10-29T19:14:50.910813+08:00 ERR crash_reporter[31086]: Could not create crash directory
2018-10-29T19:14:50.910842+08:00 ERR crash_reporter[31086]: Unable to find/create process-specific crash path
2018-10-29T19:14:50.982475+08:00 ERR localhos[30697]: crosvm[4]: [src/linux.rs:929] child 7 died: signo 17, status 31, code 2
2018-10-29T19:14:51.214267+08:00 ERR localhos[30697]: crosvm[4]: [devices/src/proxy.rs:247] failed to shutdown child device process: IO error configuring proxy device Connection refused (os error 111).
2018-10-29T19:14:51.214297+08:00 ERR localhos[30701]: crosvm[1]: [devices/src/proxy.rs:104] error: child device process failed send: Connection refused (os error 111)
2018-10-29T19:14:51.215631+08:00 INFO localhos[30697]: crosvm[4]: [src/main.rs:548] crosvm has exited normally
2018-10-29T19:14:51.224484+08:00 ERR localhos[30695]: 9s[2]: [9s/src/main.rs:149] error while handling client 214:1025: Connection reset by peer (os error 104)
2018-10-29T19:14:51.317012+08:00 WARNING localhos[30697]: crosvm[4]: [src/main.rs:708] not all child processes have exited; sending SIGKILL
2018-10-29T19:14:51.317514+08:00 INFO vm_concierge[30670]:  Process 4 killed by signal 9
2018-10-29T19:14:51.318204+08:00 INFO vm_concierge[30670]: process 5 did not exit from signal 9 in 0 seconds
2018-10-29T19:14:51.318944+08:00 INFO seneschal[30669]: Received request to stop server
2018-10-29T19:14:51.319402+08:00 INFO seneschal[30669]: Process 2 killed by signal 15
2018-10-29T19:14:51.328465+08:00 INFO vm_concierge[30670]:  Process 5 exited with status 0
2018-10-29T19:14:51.481617+08:00 INFO vm_concierge[30670]:  Process 12 exited with status 0

The crash looks like a seccomp failure; it's possible this is https://crbug.com/888212, which I saw some time ago but wasn't able to reproduce. I'm trying your test case now to see if it can trigger the issue. If you happen to reproduce it again, please see if you can get a backtrace from the coredump in /var/spool/crash so we can verify which syscall is failing and what is calling it.
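
For anyone following along: one way to get that backtrace is Breakpad's minidump_stackwalk (the dump filename and symbol directory here are illustrative; symbols would come from your local build):

minidump_stackwalk /var/spool/crash/crosvm.<timestamp>.dmp /path/to/breakpad-symbols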

The read-only filesystem issue may be  https://crbug.com/899273  - do you see anything in /var/log/messages after that happens?
I got plenty of core dumps at https://x20.corp.google.com/users/cy/cylee/crosvm_crash  :)
Not sure if they're all the same issue, but at least crosvm.20181030.015337.4564.* was generated by the same cause as the one reproduced by my CL above (a failed fio with exit status 240).
By running my CL above, I can actually reproduce it 100% of the time.

The read-only filesystem issue happens rarely. I don't have a deterministic way to reproduce it yet.
if you see crash_reporter saying the failure mode is "sig 31", then it's almost guaranteed to be a seccomp failure (sig 31 is SIGSYS).  if you run `minidump_dump` on one of those coredumps, you can find the syscall #.
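
a minimal sketch of that workflow (filename illustrative):

minidump_dump /var/spool/crash/crosvm.<timestamp>.dmp | less

then look for the MDException stream; the MDRawContextAMD64 record that follows it holds the registers at the time of the SIGSYS, including rax.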

looking at the rax register (which is how the syscall # is passed), i see 2, which is __NR_open on x86_64.  so i guess the next step is to find out what path the code is trying to open (assuming open() is not permitted in the seccomp filter).
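
to double-check the number-to-name mapping against the kernel UAPI header (header path varies by sysroot):

awk '$3 == 2' /usr/include/asm/unistd_64.h    # -> #define __NR_open 2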
cylee@, could you please try with this patch and see if you can still reproduce the crash? https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/1305756

I'm not sure this is the best fix, but it should at least work around the problem from https://bugs.chromium.org/p/chromium/issues/detail?id=888212 - if you could also verify that the backtrace looks like the one in that bug, that would be appreciated (I don't have the corresponding symbols for the minidumps).
Thanks for the explanation and the patch.

vapier@, I found that the path address is stored in rsi, but I don't yet know how to find the content at that address from the dump file (if that's possible).
 
dverkamp@,
  I applied the patch and rebuilt crosvm. It looks like it now fails on another system call (sys_madvise?):

MDRawContextAMD64
  p1_home       = 0x0
  p2_home       = 0x0
  p3_home       = 0x0
  p4_home       = 0x0
  p5_home       = 0x0
  p6_home       = 0x0
  context_flags = 0x10000f
  mx_csr        = 0x0
  cs            = 0x33
  ds            = 0x0
  es            = 0x0
  fs            = 0x0
  gs            = 0x0
  ss            = 0x2b
  eflags        = 0x246
  dr0           = 0x0
  dr1           = 0x0
  dr2           = 0x0
  dr3           = 0x0
  dr6           = 0x0
  dr7           = 0x0
  rax           = 0x1c
  rcx           = 0x7bbea3b89487
  rdx           = 0x4
  rbx           = 0x7bbbe40d2090
  rsp           = 0x7bbbe8b4ba08
  rbp           = 0x7bbbe8b4ba80
  rsi           = 0x5000
  rdi           = 0x7bbbe40f3000
  r8            = 0x0
  r9            = 0x9d120000
  r10           = 0x25f70
  r11           = 0x246
  r12           = 0x7bbbe4000000
  r13           = 0x5000
  r14           = 0x7bbbe4000020
  r15           = 0x7bbbe4000020
  rip           = 0x7bbea3b89487
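
Decoding this the same way (x86_64 passes the first three syscall arguments in rdi, rsi, rdx) seems consistent with the madvise guess:

printf '%d\n' 0x1c    # rax -> 28, which is __NR_madvise on x86_64
printf '%d\n' 0x4     # rdx (3rd arg, the advice) -> 4, i.e. MADV_DONTNEED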
Can you try adding 'madvise: 1' to the seccomp filter as well? If that works, I will update the patch to include it.
Just noticed that the x86_64 seccomp filter already includes a line for madvise - I've updated the patch to add MADV_DONTNEED as well, which is used in the same glibc malloc cleanup path as the open() call that was the original problem.
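
For readers unfamiliar with the policy format, you can see the existing rule in the crosvm tree (path below is illustrative; the patch above is authoritative for the exact change):

grep -n 'madvise' seccomp/x86_64/common_device.policy
# a conditional rule limiting which advice values are allowed has the shape:
#   madvise: arg2 == MADV_REMOVE || arg2 == MADV_DONTNEED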
i don't think that's the crashing thread -- rax is set to 0xfffffffffffffe00 there.  look at the MDRawContextAMD64 that comes after the MDException, not the per-thread context.

also, the syscall convention puts the string argument in rdi, not rsi.  so we have:
MDRawContextAMD64
  rax           = 0x2              __NR_open
  rdx           = 0x8000           mode (where it makes sense)
  rsi           = 0x80000          flags
  rdi           = 0x7ede64389b20   pathname
decoding flags says this is O_RDONLY|O_CLOEXEC, so mode is going to be ignored.

for the pathname, we have to look at what is mapped at that address.  the minidump tells us:
7ede64211000-7ede643b2000 r-xp 00000000 fd:00 16402                      /lib64/libc-2.23.so

so it's most likely a constant string that is +0x178b20 bytes into one of the C library's loaded ELF segments.  the minidump wouldn't have included that memory by design.  but if you load the coredump into gdb, you should be able to dump the memory at that address and get the string back.

wild guess it's "/dev/log".
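
concretely, something like this against the full coredump should print it (binary path and dump filename are illustrative, the address is the rdi value from the context above; a sketch, not tested here):

gdb /usr/bin/crosvm /var/spool/crash/crosvm.<timestamp>.core
(gdb) x/s 0x7ede64389b20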
Daniel, after your latest patch I can't reproduce the error anymore. Great!

Mike, Thanks for the help here.
Mergedinto: 888212
Status: Duplicate (was: Assigned)
Great, thanks for testing.

I am going to merge this into crbug/888212 since it is the same root cause; please also follow crbug/899273 for the other issue (read-only filesystem).
