
Issue 750860

Issue metadata

Status: WontFix
Owner:
Closed: Aug 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 3
Type: Bug

Blocking:
issue 707006



More aggressive cleaning of isolated_cache may be necessary

Project Member Reported by jeffcarp@chromium.org, Jul 31 2017

Issue description

Via alert "build14-b1 SlaveFreeDiskSpaceVeryLow"

On build14-b1, many files like the one below were taking up a lot of space:

/b/android_0748c4920059f1a7/swarming/isolated_cache/750b52cfe8539cf22444151fa378694e73d56fad

Many of these files hadn't been modified recently, most of them not since May.

The bot has 7 similar android_* directories in /b/:

drwxr-xr-x  5 root       root        4096 May 15 17:56 android_03849495003bfd06
drwxr-xr-x  5 root       root        4096 May 15 17:56 android_0627a579003b6718
drwxr-xr-x  5 root       root        4096 May 15 17:56 android_06a978f9003b6d2c
drwxr-xr-x  5 root       root        4096 May 15 17:56 android_0748c4920059f1a7
drwxr-xr-x  5 root       root        4096 May 15 17:56 android_0accf87325995bd0
drwxr-xr-x  5 root       root        4096 May 15 17:56 android_0d88d39743e4b8df
drwxr-xr-x  5 root       root        4096 May 15 17:56 android_0d88db4443e4b0d5

Cleaning out all files from /b/android_*/swarming/isolated_cache/* that were modified more than 7 days ago reduced the disk usage from 84% to 11%.
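
For reference, the cleanup described above can be sketched roughly as follows. This is a minimal illustration, not the command that was actually run on the bot: the helper names `find_stale_files` and `clean_stale_files` are hypothetical, and it ignores any bookkeeping the Swarming bot may keep about the cache contents. It defaults to a dry run so nothing is deleted unless asked.

```python
import glob
import os
import time


def find_stale_files(pattern, max_age_days=7, now=None):
    """Return regular files matching `pattern` last modified before the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    stale = []
    for path in glob.glob(pattern):
        # Skip directories and symlinks; only plain files are candidates.
        if os.path.islink(path) or not os.path.isfile(path):
            continue
        if os.path.getmtime(path) < cutoff:
            stale.append(path)
    return sorted(stale)


def clean_stale_files(pattern, max_age_days=7, dry_run=True, now=None):
    """Delete stale files (unless dry_run) and return the bytes reclaimed."""
    freed = 0
    for path in find_stale_files(pattern, max_age_days, now):
        freed += os.path.getsize(path)
        if not dry_run:
            os.remove(path)
    return freed


if __name__ == "__main__":
    # Glob taken from the report; dry run only, so this just reports a size.
    pattern = "/b/android_*/swarming/isolated_cache/*"
    print("would free %d bytes" % clean_stale_files(pattern))
```

Passing `dry_run=False` performs the deletion. Running the dry run first gives a chance to check the reclaimed size before removing anything.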

 
Components: -Infra Infra>Platform>Swarming
Status: Available (was: Untriaged)

Comment 3 by mar...@chromium.org, Aug 21 2017

Labels: OS-Android
Owner: bpastene@chromium.org
Status: Assigned (was: Available)
Looks like an issue with the disk space for dockerized bots? In practice, since the maximum size of each cache is 50 GB, the disk should be at least N*50 GB. So for N=7, this means 350 GB.

If the disk is smaller than ~400 GB, the simplest solution is to change the maximum size setting in the relevant bot_config:
https://github.com/luci/luci-py/blob/master/appengine/swarming/swarming_bot/config/bot_config.py#L108
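
The sizing arithmetic in this comment can be written out as a small sketch. The function names and the `reserve_gb` headroom are hypothetical, chosen only to mirror the gap between the 350 GB minimum and the ~400 GB threshold mentioned above; none of this is taken from bot_config itself.

```python
def required_disk_gb(num_caches, max_cache_gb=50):
    """Minimum disk size if every cache fills to its maximum."""
    return num_caches * max_cache_gb


def max_cache_gb_for_disk(disk_gb, num_caches, reserve_gb=50):
    """Largest per-cache limit (whole GB) that fits, keeping reserve_gb free.

    reserve_gb is an assumed headroom figure, not a documented setting.
    """
    return max(0, (disk_gb - reserve_gb) // num_caches)


print(required_disk_gb(7))            # 7 * 50 = 350
print(max_cache_gb_for_disk(450, 7))  # (450 - 50) // 7 = 57
print(max_cache_gb_for_disk(300, 7))  # (300 - 50) // 7 = 35
```

Under these assumptions, a disk below roughly 400 GB with 7 caches would need a per-cache limit lower than 50 GB.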

Assigning to Ben so he can decide the route to take.
Status: WontFix (was: Assigned)
The disks on our bots are ~450 GB, which should be plenty of space. It looks like an old slave + src/ checkout was taking up additional space and was cleared out. /b/README should mention this.

Comment 5 by mar...@chromium.org, Aug 22 2017

That's odd, normally bot_config looks for these and deletes them. Oh well.
It's outside the container, so the bot wouldn't be able to see it. Containerization: it works!
