
Issue 796626

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: Aug 30
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocked on:
issue 820664




Container pool host mounts are not getting cleaned up

Project Member Reported by kenobi@chromium.org, Dec 20 2017

Issue description

On server35.cbf.

Compare the outputs of the following commands:
$ ls -d /usr/local/autotest/containers/container.*
$ ls -d /usr/local/autotest/containers/host/container.??????

Directories in the host dir that have no corresponding directory in the containers dir are orphaned host mounts.
(e.g. /usr/local/autotest/containers/host/container.1xXjbW)
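The comparison above can be scripted. The following is a minimal sketch, assuming that a host dir and its container dir share the same basename (e.g. container.1xXjbW); the function name find_orphaned_host_dirs is hypothetical and is not part of autotest.

```python
import fnmatch
import os


def find_orphaned_host_dirs(containers_dir, host_dir):
    """Return sorted names of host dirs that have no matching container dir.

    Assumption: a host dir corresponds to a container dir when both have
    the same basename.
    """
    # Equivalent of: ls -d <containers_dir>/container.*
    containers = {
        name for name in os.listdir(containers_dir)
        if fnmatch.fnmatchcase(name, "container.*")
        and os.path.isdir(os.path.join(containers_dir, name))
    }
    # Equivalent of: ls -d <host_dir>/container.??????
    # The six-character pattern skips the "<name>.ro" directories.
    host_dirs = {
        name for name in os.listdir(host_dir)
        if fnmatch.fnmatchcase(name, "container.??????")
        and os.path.isdir(os.path.join(host_dir, name))
    }
    return sorted(host_dirs - containers)
```

With the paths from this report, it would be called as find_orphaned_host_dirs("/usr/local/autotest/containers", "/usr/local/autotest/containers/host").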

A quick check reveals that at least some of these mount points are still bound:
$ mount | grep 1xXjbW
/usr/local/autotest/results/shared on /usr/local/autotest/containers/host/container.1xXjbW/usr/local/autotest/results/shared type none (rw,bind)
/etc/chrome-infra on /usr/local/autotest/containers/host/container.1xXjbW.ro/etc/chrome-infra type none (ro,bind)
/usr/local/autotest/containers/host/container.1xXjbW.ro/etc/chrome-infra on /usr/local/autotest/containers/host/container.1xXjbW/etc/chrome-infra type none (ro,bind)
/creds/service_accounts on /usr/local/autotest/containers/host/container.1xXjbW.ro/creds/service_accounts type none (ro,bind)
/usr/local/autotest/containers/host/container.1xXjbW.ro/creds/service_accounts on /usr/local/autotest/containers/host/container.1xXjbW/creds/service_accounts type none (ro,bind)
/usr/local/autotest/site-packages on /usr/local/autotest/containers/host/container.1xXjbW.ro/usr/local/autotest/site-packages type none (ro,bind)
/usr/local/autotest/containers/host/container.1xXjbW.ro/usr/local/autotest/site-packages on /usr/local/autotest/containers/host/container.1xXjbW/usr/local/autotest/site-packages type none (ro,bind)
/usr/local/autotest/results/162294527-chromeos-test/chromeos2-row11-rack1-host6 on /usr/local/autotest/containers/host/container.1xXjbW/usr/local/autotest/results/162294527-chromeos-test type none (rw,bind)


This is bad. Container host mounts need to be cleaned up; otherwise orphaned host mounts linger and keep consuming resources.
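The mount check shown earlier can also be done programmatically. The sketch below parses the text printed by the mount command (lines of the form "SOURCE on TARGET type FSTYPE (OPTIONS)") and returns the entries whose target lies under a given host dir or its ".ro" sibling. The function name mounts_under is hypothetical, and pairing a host dir with a "<name>.ro" directory is an assumption taken from the output above.

```python
import re

# Matches one line of `mount` output: SOURCE on TARGET type FSTYPE (OPTIONS)
_MOUNT_LINE = re.compile(
    r"^(?P<source>.+?) on (?P<target>.+) type (?P<fstype>\S+) "
    r"\((?P<options>[^)]*)\)$"
)


def mounts_under(mount_output, host_dir):
    """Return (source, target, options) tuples mounted under host_dir.

    Assumption: the read-only staging mounts live in a sibling directory
    named host_dir + ".ro", as in the output quoted in this report.
    """
    base = host_dir.rstrip("/")
    roots = (base, base + ".ro")
    found = []
    for line in mount_output.splitlines():
        match = _MOUNT_LINE.match(line.strip())
        if match is None:
            continue  # Skip lines that are not in the expected format.
        target = match.group("target")
        if any(target == r or target.startswith(r + "/") for r in roots):
            options = match.group("options").split(",")
            found.append((match.group("source"), target, options))
    return found
```

A non-empty result for an orphaned host dir means its bind mounts are still active and would need to be unmounted before the directory can be removed.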
 

Comment 1 by kenobi@chromium.org, Dec 21 2017

Here is what I would recommend: containers are currently assigned an ID when the pool transfers ownership to a test process.  Modify that code to write the ContainerId not just into the container, but also into the host dir.  That way we'll be able to track each orphaned host dir back to the test job where it originated, and look up the logs.
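As an illustration of this recommendation only, here is a minimal sketch of recording an ID in the host dir and reading it back later. It is not the actual autotest change: the file name container_id.json, the function names, and the dict representation of the ID are all assumptions.

```python
import json
import os

# Hypothetical file name; the real change may store the ID differently.
ID_FILENAME = "container_id.json"


def record_container_id(host_dir, container_id):
    """Write container_id (a JSON-serializable dict) into host_dir."""
    path = os.path.join(host_dir, ID_FILENAME)
    tmp_path = path + ".tmp"
    # Write to a temporary file and rename it, so that a reader never
    # sees a partially written ID file.
    with open(tmp_path, "w") as f:
        json.dump(container_id, f, sort_keys=True)
    os.replace(tmp_path, path)
    return path


def read_container_id(host_dir):
    """Return the recorded ID dict, or None if host_dir has no ID file."""
    path = os.path.join(host_dir, ID_FILENAME)
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return None
```

When an orphaned host dir is found, read_container_id would give the originating job, provided the ID was recorded at the time the pool handed the container over.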
Status: Assigned (was: Untriaged)

Comment 3 by jkop@chromium.org, Mar 7 2018

Status: Started (was: Assigned)
Project Member

Comment 4 by bugdroid1@chromium.org, Mar 8 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/4d75dc4004c0a57b2f4868eb1b1c61972b54b763

commit 4d75dc4004c0a57b2f4868eb1b1c61972b54b763
Author: Jacob Kopczynski <jkop@google.com>
Date: Thu Mar 08 06:08:26 2018

lxc: track host dir for debugging purposes

Write the ContainerID into the host dir when it's handed to the test
job, to track sources of leftover host mounts which failed to be cleaned up.

BUG= chromium:796626 
TEST=tryjob

Change-Id: I7673e1d7b974b234f604b1e0c94d4b87a5620b91
Reviewed-on: https://chromium-review.googlesource.com/954064
Commit-Ready: Jacob Kopczynski <jkop@chromium.org>
Tested-by: Jacob Kopczynski <jkop@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>

[modify] https://crrev.com/4d75dc4004c0a57b2f4868eb1b1c61972b54b763/site_utils/lxc/container.py

Project Member

Comment 5 by bugdroid1@chromium.org, Mar 10 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/a5f91bcc5e440faa251c53eddf2d9934155f593d

commit a5f91bcc5e440faa251c53eddf2d9934155f593d
Author: Jacob Kopczynski <jkop@chromium.org>
Date: Sat Mar 10 00:53:57 2018

Revert "lxc: track host dir for debugging purposes"

This reverts commit 4d75dc4004c0a57b2f4868eb1b1c61972b54b763.

Reason for revert: Breaks cleanup on moblab

Original change's description:
> lxc: track host dir for debugging purposes
> 
> Write the ContainerID into the host dir when it's handed to the test
> job, to track sources of leftover host mounts which failed to be cleaned up.
> 
> BUG= chromium:796626 
> TEST=tryjob
> 
> Change-Id: I7673e1d7b974b234f604b1e0c94d4b87a5620b91
> Reviewed-on: https://chromium-review.googlesource.com/954064
> Commit-Ready: Jacob Kopczynski <jkop@chromium.org>
> Tested-by: Jacob Kopczynski <jkop@chromium.org>
> Reviewed-by: Ilja H. Friedel <ihf@chromium.org>

Bug:  chromium:796626 
Change-Id: I0ff6a37329000da51d3d7910c041de80bd42c7c8
Reviewed-on: https://chromium-review.googlesource.com/957982
Reviewed-by: Jacob Kopczynski <jkop@chromium.org>
Commit-Queue: Jacob Kopczynski <jkop@chromium.org>
Tested-by: Jacob Kopczynski <jkop@chromium.org>

[modify] https://crrev.com/a5f91bcc5e440faa251c53eddf2d9934155f593d/site_utils/lxc/container.py

Comment 6 by jkop@chromium.org, Mar 12 2018

Blockedon: 820664
Need to investigate the cause of a moblab failure to roll this out safely.
Components: Infra>Client>ChromeOS>Test
Components: -Infra>Client>ChromeOS
Status: WontFix (was: Started)
Obsolete
