Issue 724570

Starred by 3 users

Issue metadata

Status: Archived
Owner:
Closed: May 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




Mac builders running out of disk space on chromium.perf

Project Member Reported by martiniss@chromium.org, May 19 2017

Issue description

The mac builders are running out of disk space. They have 36GB of breakpad dumps on disk. I don't know where these are coming from.

The large dumps have filenames like this:

311M    Google Chrome Framework-60.0.3081.0.breakpad
311M    Google Chrome Framework-60.0.3082.0.breakpad
312M    Google Chrome Framework-60.0.3083.0.breakpad
312M    Google Chrome Framework-60.0.3084.0.breakpad
312M    Google Chrome Framework-60.0.3085.0.breakpad
312M    Google Chrome Framework-60.0.3086.0.breakpad
312M    Google Chrome Framework-60.0.3087.0.breakpad
312M    Google Chrome Framework-60.0.3088.0.breakpad
313M    Google Chrome Framework-60.0.3089.0.breakpad
313M    Google Chrome Framework-60.0.3090.0.breakpad
313M    Google Chrome Framework-60.0.3091.0.breakpad
313M    Google Chrome Framework-60.0.3092.0.breakpad
313M    Google Chrome Framework-60.0.3094.0.breakpad
313M    Google Chrome Framework-60.0.3095.0.breakpad

I'm not sure where these came from. I feel like I've seen this before though.
 
These files have been on the builder for a long time I think. See https://luci-milo.appspot.com/buildbot/chromium.perf/Mac%20Builder/85000 for an example of an old build. The files still show up in the 'package build' step.
If those files haven't been accessed for a long time, I think it's safe to just delete them?
Yeah, I agree. I think it should be safe to delete these files.

I don't know where they're coming from, though. And I don't know why they aren't automatically cleaned up.
Cc: -nednguyen@chromium.org nedngu...@google.com
I'm running this on the bots:

cd /b/c/b/Mac_Builder/src/out/Release && find . -type f -mtime +7 -name '*breakpad' -exec rm '{}' \;

This should free up space for a while on these bots, at least for the short term.
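For reference, here is a minimal Python sketch of roughly what that one-off command does, in case it is easier to fold into a cleanup script. The function name `remove_stale_breakpad` and the `dry_run` and `now` parameters are illustrative assumptions, not anything that exists on the bots. `find -mtime` rounds ages to whole 24-hour periods, so the exact boundary may differ slightly from the plain 7-day cutoff used here.

```python
import os
import time


def remove_stale_breakpad(out_dir, max_age_days=7, dry_run=True, now=None):
    """Remove regular files under out_dir whose names end in 'breakpad'
    and whose mtime is older than max_age_days.

    Returns the paths that were removed (or would be, when dry_run is True).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    removed = []
    for root, _dirs, files in os.walk(out_dir):
        for name in files:
            if not name.endswith("breakpad"):
                continue
            path = os.path.join(root, name)
            # Like `find -type f`: skip symlinks and non-regular files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.stat(path).st_mtime < cutoff:
                if not dry_run:
                    os.remove(path)
                removed.append(path)
    return removed
```

Calling it with the default `dry_run=True` first only lists the candidates, which makes it easy to check what would go before deleting anything.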
Ok, I should have stopped the bleeding. http://shortn/_HmWSOV7mcN (internal link) shows the disk usage going down.

Now to figure out where these files are coming from, and why they aren't getting deleted.
Cc: mark@chromium.org rsesek@chromium.org dpranke@chromium.org
mark@ or rsesek@, could you help us? We compile official builds on the perf bots to get more accurate real-world performance characteristics. It looks like official builds generate these breakpad files, which are filling up our disk. It looks like the official builders do clobber builds every time, so they don't run into this problem.

Why are these files generated? Is it safe to delete them after every build?
The files are being created during the build by this rule:

https://cs.chromium.org/chromium/src/chrome/BUILD.gn?l=1278

I don't know why we put the version number in the output file, but presumably it is safe to delete the files with older version numbers on the perf bots. 
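Since the version number is part of the filename, one way to act on this is to keep only the newest `.breakpad` file per name and treat the rest as deletable. A rough sketch, assuming filenames follow the `<name>-<a.b.c.d>.breakpad` shape seen in the listing in the issue description; the helper name `stale_breakpad_files` is hypothetical.

```python
import re
from collections import defaultdict

# Matches e.g. "Google Chrome Framework-60.0.3095.0.breakpad".
_BREAKPAD_RE = re.compile(
    r"^(?P<name>.+)-(?P<version>\d+(?:\.\d+){3})\.breakpad$")


def stale_breakpad_files(filenames):
    """Return the files that are not the highest version for their name."""
    by_name = defaultdict(list)
    for filename in filenames:
        m = _BREAKPAD_RE.match(filename)
        if m is None:
            continue  # Leave anything that does not fit the pattern alone.
        # Compare versions numerically, so 10.x sorts after 9.x.
        version = tuple(int(part) for part in m.group("version").split("."))
        by_name[m.group("name")].append((version, filename))
    stale = []
    for entries in by_name.values():
        entries.sort()
        # Everything except the last (newest) entry is considered stale.
        stale.extend(filename for _version, filename in entries[:-1])
    return sorted(stale)
```

The function only returns candidates; whether to delete them is left to the caller.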

Comment 9 by mark@chromium.org, May 19 2017

It's fine to delete these.
It's definitely safe to delete these. The version number is in the output file only because that's what GYP did; I don't know if there's a good reason for it.
Do we have any automatic cleanup step in the builder recipe? It would be great if we could just scan all the files in out/ and remove the ones that haven't been touched for a long time.
We do have scripts that clean the builders in a fairly ad hoc way. We could add rules for something like this.

The long term vision is to move to ephemeral builders that are recycled regularly (which we're already doing for linux swarming bots), but we don't have an ETA for this for the whole fleet.
+Dirk: do you have a pointer to the rules? I would want to add something like:

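import os, time
cutoff = time.time() - 3 * 24 * 60 * 60  # 3 days ago
for root, _, files in os.walk('out'):  # 'out' stands in for the out directory
  for name in files:
    if os.stat(os.path.join(root, name)).st_atime < cutoff:
      os.remove(os.path.join(root, name))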
Actually, there's already a rule which should be deleting these files, I think. So maybe there's a bug...
Cc: sergeybe...@chromium.org jeffcarp@chromium.org
cc-ing people. It looks like the 

{Path: `/b/c/b/*/src/out/Release*/*`, MaxAge: tenHours},

line in cleanup_disk isn't deleting files like:

/b/c/b/Mac_Builder/src/out/Release/Google Chrome Framework-60.0.3097.0.breakpad

Any ideas?
I was curious whether the glob pattern was bad. https://play.golang.org/p/8BWsZEICM8 seems to say it should find the file.
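The same sanity check can be sketched in Python. It uses `pathlib.PurePosixPath.match`, which matches `*` within a single path segment. That the cleanup tool applies comparable segment-wise glob semantics is an assumption here, and the helper name `rule_matches` is made up for illustration.

```python
from pathlib import PurePosixPath

# Pattern copied from the cleanup rule quoted in the thread.
PATTERN = "/b/c/b/*/src/out/Release*/*"


def rule_matches(path, pattern=PATTERN):
    """Return True if the absolute path matches the glob pattern.

    With an absolute pattern the whole path has to match, and `*` does
    not cross a '/' boundary.
    """
    return PurePosixPath(path).match(pattern)


breakpad_path = ("/b/c/b/Mac_Builder/src/out/Release/"
                 "Google Chrome Framework-60.0.3097.0.breakpad")
matched = rule_matches(breakpad_path)
```

Under these assumptions `matched` should come out true for the breakpad path, while a file one directory deeper (for example under `Release/obj/`) would not match, because the final `*` covers only one path segment.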
Owner: martiniss@chromium.org
Status: Started (was: Available)
Ok, I found the problem. The binary that's on the bots is old, because a manual pin hasn't been updated for several months. I'll work on doing that.
Project Member

Comment 19 by bugdroid1@chromium.org, May 23 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/infra_internal/+/a4b2cbe85459c24f5a2eb79f0724510e1053b172

commit a4b2cbe85459c24f5a2eb79f0724510e1053b172
Author: Stephen Martinis <martiniss@google.com>
Date: Tue May 23 21:18:59 2017

Ok, I landed a change to cleanup_disk. It's canaried, so we have to wait until it runs. As far as I can tell, it runs once a day, at 0:25 in the morning. So I'll see how the affected bots are tomorrow, and then push to stable.
For reference:

http://shortn/_OPxxw1j1fv is a graph showing all bots disk usage.
http://shortn/_FC4IFSUKli is a graph showing fyi bot disk usage.
Cc: martiniss@chromium.org
Issue 661661 has been merged into this issue.
Status: Fixed (was: Started)
Ok, I've landed a couple changes which should make it so the bots don't fill up their disk anymore. http://shortn/_V5YO0AURBa shows the graph where I manually deleted the big files. I'll monitor the bots going forward to see if the disk usage grows again.
Project Member

Comment 24 by bugdroid1@chromium.org, Jun 12 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/puppet/+/d2ca657a9356945f6cfc4f39dae2db30e30cbf4d

commit d2ca657a9356945f6cfc4f39dae2db30e30cbf4d
Author: Stephen Martinis <martiniss@google.com>
Date: Mon Jun 12 22:54:00 2017

Labels: VerifyIn-61

Comment 26 by dchan@chromium.org, Jan 22 2018

Status: Archived (was: Fixed)
