New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 715848 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: May 2017
Cc:
Components:
EstimatedDays: 2
NextAction: ----
OS: Linux
Pri: 2
Type: Bug

Blocking:
issue 699331



Sign in to add a comment

Xvfb doesn't play nice with a remote_run-managed TEMPDIR

Project Member Reported by dpranke@chromium.org, Apr 27 2017

Issue description

See bug 708681 for background. remote_run attempts to set TEMPDIR and TMPDIR in order to manage where the recipe-launched processes are writing to. Apparently Xvfb (one of those processes) wants to make sure that a file actually in /tmp exists, and so attempts to create a link from /tmp/foo to TEMPDIR/foo . Which works fine, as long as TEMPDIR is on the same filesystem. But, on some machines, it won't be, and that breaks.

See:

https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.fyi%2FSite_Isolation_Linux%2F16115%2F%2B%2Frecipes%2Fsteps%2Fwebkit_tests%2F0%2Fstdout

for the errors.
 

Comment 1 by iannu...@google.com, Apr 27 2017

Yep, this will be an issue on all the various swarming stuffs too.
Cc: tansell@chromium.org
Project Member

Comment 3 by bugdroid1@chromium.org, Apr 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/fb4b3222dc1a951b98720c877f8b622c71857481

commit fb4b3222dc1a951b98720c877f8b622c71857481
Author: Dirk Pranke <dpranke@chromium.org>
Date: Thu Apr 27 00:20:49 2017

Disable kitchen on Site Isolation Linux for now.

kitchen sets TEMPDIR and TMPDIR in order to attempt to make sure
that its subprocesses are only writing to directories it can manage.
However, if you run Xvfb, it appears to want to create something
that exists in a hard-coded /tmp/foo location, and so it seems to
try and create a hard link from /tmp/foo to TEMPDIR/foo. Which
doesn't work when TEMPDIR is on a different filesystem.

This CL disables kitchen for now, until we can come up with a fix.

TBR=dnj@chromium.org, tansell@chromium.org
BUG=708681,  715848 

Change-Id: I54700551de824c8c6fc8ec3d5e4d74ee24c260eb
Reviewed-on: https://chromium-review.googlesource.com/488133
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Commit-Queue: Dirk Pranke <dpranke@chromium.org>

[modify] https://crrev.com/fb4b3222dc1a951b98720c877f8b622c71857481/scripts/slave/remote_run.py

Comment 4 by iannu...@google.com, Apr 27 2017

Blocking: 699331

Comment 5 by d...@chromium.org, Apr 27 2017

https://bugs.launchpad.net/ubuntu/+source/xorg-server/+bug/1033179

Looks like it's known, and nobody cares. I am not sure upstream fix here will be timely or useful.
It looks like the bug is confirmed, but no one's patched it; they may just
take a patch.

Comment 7 by d...@chromium.org, Apr 27 2017

I'm more concerned about the larger process:
- Prepare / submit / defend a patch.
- Get new XVFB rolled into a package.
- Update the package on all of our testing bots.

This is before we can run Kitchen on any of them. Another bug: https://bugs.launchpad.net/ubuntu/+source/xorg-server/+bug/972324

Maybe there is an XVFB flag that we can explicitly set...

Comment 8 by d...@chromium.org, Apr 27 2017

Problem value is hard-coded here: https://cgit.freedesktop.org/xorg/xserver/tree/xkb/ddxLoad.c#n62

Throughout this, though, I see a lot of hardcoded "/var/run/..." etc. We're not going to get a good quarantine even if we fix the TEMPDIR problem. Perhaps we should approach this from the other side and have the hosted Xvfb use a hardcoded "/tmp"?

Or possibly probe "/tmp" from "xkb" or the native Xorg configuration?

https://chromium.googlesource.com/chromium/src/+/master/third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/port/linux.py#154

Comment 9 by no...@chromium.org, Apr 27 2017

Cc: friedman@chromium.org
can we mount /tmp to the same partition where we run bots? It might be a good change in general. We avoid using native temp dir because it is a on a small drive.
+friedman

Comment 10 by d...@chromium.org, Apr 27 2017

Perhaps something with "unionfs"? The problem is that we would need to bootstrap our build process in some sort of isolated container so the filesystem mount is limited to that process. We can do that with cgroups or Docker, but it's a far larger change :(

I'm not worried about /tmp size here. Xvfb is pretty versatile and works on any number of platforms, embedded and otherwise, so I expect it to use /tmp responsibly enough.
Thats not a bad idea, especially on VMs where sdb is the disk with all the churn, but it is a fairly large one-off.  What happens if you launch xvfb with a custom env var like: TEMPDIR=/tmp xvfb?  Can you give me some commands to replicate this failure that I can play with?

Comment 12 by d...@chromium.org, Apr 27 2017

The entire problem here is that it's ignoring those variables in some places.
I see.  I suppose puppet could mount over /tmp with something from sdb, but I'm concerned about the existing fds there... /tmp is important.  Is /tmp/$foo $foo predefined?  We could pre-mount that first.  There's also pivot_root which might help but ... yikes?  chroot?
Other options include;

 * Use xvnc instead of Xvfb (I actually suggest we should switch to using PyVirtualDisplay module which supports xvnc, xvfb and xeypher).
 * Using a chroot.
 * LD_PRELOAD


Comment 15 by no...@chromium.org, Apr 28 2017

from yesterday offline discussion between iannucci@ and dnj@: since xvfb hardcodes /tmp and /var, we could wrap invocation of xvfb with a wrapper that sets TEMPDIR back to hardcoded /tmp.

I find this solution least invasive.

Comment 16 by d...@chromium.org, Apr 28 2017

It's already wrapped in a wrapper (liked in #8), so we just need to tweak that.
Project Member

Comment 17 by bugdroid1@chromium.org, May 4 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9cc81c66fbe4e06f84a5d8647faf1a2fab5ecf38

commit 9cc81c66fbe4e06f84a5d8647faf1a2fab5ecf38
Author: dnj <dnj@chromium.org>
Date: Thu May 04 17:48:04 2017

webkitpy: Override TEMPDIR for Xvfb.

Hard-code TEMPDIR/TMPDIR for "Xvfb" to "/tmp" in webkitpy's Linux layout
test port. The Xvfb tool's "xkb" library compiles in "/tmp" in several
places, and hard-linking expectations get violated if TEMPDIR and "/tmp"
are on different filesystems.

BUG= chromium:715848 
TEST=local
  - Ran without patch and TEMPDIR/TMPDIR overrides, verified hardlink
    failure from bug.
  - Ran with patch, no more hardlink failure!

Review-Url: https://codereview.chromium.org/2859703002
Cr-Commit-Position: refs/heads/master@{#469378}

[modify] https://crrev.com/9cc81c66fbe4e06f84a5d8647faf1a2fab5ecf38/third_party/WebKit/Tools/Scripts/webkitpy/common/system/executive_mock.py
[modify] https://crrev.com/9cc81c66fbe4e06f84a5d8647faf1a2fab5ecf38/third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/breakpad/dump_reader_win_unittest.py
[modify] https://crrev.com/9cc81c66fbe4e06f84a5d8647faf1a2fab5ecf38/third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/port/linux.py
[modify] https://crrev.com/9cc81c66fbe4e06f84a5d8647faf1a2fab5ecf38/third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/port/linux_unittest.py
[modify] https://crrev.com/9cc81c66fbe4e06f84a5d8647faf1a2fab5ecf38/third_party/WebKit/Tools/Scripts/webkitpy/tool/commands/auto_rebaseline_unittest.py
[modify] https://crrev.com/9cc81c66fbe4e06f84a5d8647faf1a2fab5ecf38/third_party/WebKit/Tools/Scripts/webkitpy/tool/commands/rebaseline_unittest.py

Project Member

Comment 18 by bugdroid1@chromium.org, May 4 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/37b16786dee83e4e72e8be310c68d46d3da7b19c

commit 37b16786dee83e4e72e8be310c68d46d3da7b19c
Author: Daniel Jacques <dnj@chromium.org>
Date: Thu May 04 19:03:55 2017

Revert "Disable kitchen on Site Isolation Linux for now."

This reverts commit fb4b3222dc1a951b98720c877f8b622c71857481.

Reason for revert: Xvfb issue believed to be fixed.

Original change's description:
> Disable kitchen on Site Isolation Linux for now.
> 
> kitchen sets TEMPDIR and TMPDIR in order to attempt to make sure
> that its subprocesses are only writing to directories it can manage.
> However, if you run Xvfb, it appears to want to create something
> that exists in a hard-coded /tmp/foo location, and so it seems to
> try and create a hard link from /tmp/foo to TEMPDIR/foo. Which
> doesn't work when TEMPDIR is on a different filesystem.
> 
> This CL disables kitchen for now, until we can come up with a fix.
> 
> TBR=dnj@chromium.org, tansell@chromium.org
> BUG=708681,  715848 
> 
> Change-Id: I54700551de824c8c6fc8ec3d5e4d74ee24c260eb
> Reviewed-on: https://chromium-review.googlesource.com/488133
> Reviewed-by: Dirk Pranke <dpranke@chromium.org>
> Commit-Queue: Dirk Pranke <dpranke@chromium.org>
> 

TBR=iannucci@chromium.org,dpranke@chromium.org,dnj@chromium.org,tansell@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
BUG=708681,  715848 

Change-Id: Ie6e07e7044623c5bc4a7fb91f21f94d4c774b9f7
Reviewed-on: https://chromium-review.googlesource.com/495852
Reviewed-by: Daniel Jacques <dnj@chromium.org>
Commit-Queue: Daniel Jacques <dnj@chromium.org>

[modify] https://crrev.com/37b16786dee83e4e72e8be310c68d46d3da7b19c/scripts/slave/remote_run.py

Comment 19 by efoo@chromium.org, May 4 2017

Labels: LUCI-M1-S1 REQBY-LUCI-M3-ClosedBeta LUCI-M1-Dev1
Owner: d...@chromium.org
Status: Assigned (was: Available)
Assigned to dnj to update. 

Marked for LUCI-M1-S1

Comment 20 by d...@chromium.org, May 4 2017

Status: Fixed (was: Assigned)
After the landed CL, a Kitchen-run version of this build seems to be working!

https://luci-milo.appspot.com/buildbot/chromium.fyi/Site%20Isolation%20Linux/16342

Comment 21 by efoo@chromium.org, May 5 2017

EstimatedDays: 2

Comment 22 by efoo@chromium.org, May 6 2017

Labels: -LUCI-M1-Dev1 -REQBY-LUCI-M3-ClosedBeta -LUCI-M1-S1

Sign in to add a comment