New issue
Advanced search Search tips

Issue 921700 link

Starred by 4 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

Failure running 'pwd' in vm.CrostiniStartEverything on octopus/grunt

Project Member Reported by jkardatzke@chromium.org, Jan 14

Issue description

This is a new test failure that is showing up in 11578.0.0...however, it may have been introduced as early as 11559.0.0 because there was another problem causing failures between them.  The failure is only occurring on grunt and octopus.

The error message is:

Failed to run pwd: exit status 1, Failure dumping container log: exit status 1

It *may* be from this CL: https://chromium-review.googlesource.com/c/chromiumos/platform2/+/1403857/

The termina version did not change across these versions, so it won't be on that end.

@smbarber, do you think that CL could be the cause?

 
octopus and grunt are also the only boards on 4.14 host kernels, I believe.

https://chromium.googlesource.com/chromiumos/third_party/kernel/+/5ac122cffbbf5ba4228a4d670ddfdbbe22618bd5 looks potentially relevant.
Confirmed on grunt that reverting kernel/v4.14 5ac122cffbbf5ba4228a4d670ddfdbbe22618bd5 makes Terminal usable again and vm.CrostiniStartEverything pass.
Cc: lepton@chromium.org
Labels: -Pri-2 Pri-1
From some quick hacking on the __vsock_bind_stream() function in the kernel, it seems that port numbers greater than 2^16 are being truncated somewhere (changing the original code to 'static u32 port = 65536;' triggers the same apparent issue).  I'm guessing that the port gets truncated to 16 bits somewhere along the way, and the random port number is quite likely to be greater than 65536.

I also don't totally understand the rationale for the change - if I'm reading it correctly, the random starting port number is just chosen once per kernel boot (port is a static variable), then gets incremented once per bind(). This function gets called on the host kernel, at least in the way that Crostini uses it, so the randomization only happens once. (Maybe this is intended to randomize the port numbers inside the VM? I don't think we bind() in that case in Crostini.)
The original change is for fixing this case: vm guest crashed, it could leave some orphan sockets in host side, then if vm guest tried to make a connection to same host port, it will hit that orphan sockets and connection will fail. To make
the start port randomization, then next time vm guest would likely choose a different port and then the connection can pass through.

BTW, how this impact crostini? could you share more log so I can understand why this cause failure of crostini?
I found the bug - it's actually in Crostini-specific code. 
 src/platform2/vm_tools/vsh/vshd.cc casts the port number to uint16_t, which causes the truncation as mentioned above.

The kernel change just happens to tickle this because the random number generator will mostly pick ports outside of the 0..2^16 range, whereas previously the ports always started at 1024 and wouldn't normally exceed the 2^16 range that would cause problems.
dverkamp@ are you going to do a fix for that?
smbarber@ has a fix in the CQ: https://crrev.com/c/1410063
Also note that this will need a component update before the fix takes effect, since the change is in vshd, which is inside the Termina VM rootfs.
Project Member

Comment 9 by bugdroid1@chromium.org, Jan 16

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/15cbcb25169cb1839e8810c2c8be579acb03deb3

commit 15cbcb25169cb1839e8810c2c8be579acb03deb3
Author: Stephen Barber <smbarber@chromium.org>
Date: Wed Jan 16 03:50:14 2019

vm_tools: vsh: cast forwarding port to uint32_t

vsock ports are uint32_t, not uint16_t.

BUG=chromium:921700
TEST=vm.CrostiniStartEverything on octopus/grunt

Change-Id: I5b7150f3801d102bcfaed8609a0faa0b20a2ec7d
Reviewed-on: https://chromium-review.googlesource.com/1410063
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Daniel Verkamp <dverkamp@chromium.org>
Reviewed-by: Daniel Verkamp <dverkamp@chromium.org>

[modify] https://crrev.com/15cbcb25169cb1839e8810c2c8be579acb03deb3/vm_tools/vsh/vshd.cc

Sign in to add a comment