Mac builders running out of disk space on chromium.perf
Issue description

The mac builders are running out of disk space. They have 36GB of breakpad dumps on disk. I don't know where these are coming from. The large dumps have filenames like this:

311M Google Chrome Framework-60.0.3081.0.breakpad
311M Google Chrome Framework-60.0.3082.0.breakpad
312M Google Chrome Framework-60.0.3083.0.breakpad
312M Google Chrome Framework-60.0.3084.0.breakpad
312M Google Chrome Framework-60.0.3085.0.breakpad
312M Google Chrome Framework-60.0.3086.0.breakpad
312M Google Chrome Framework-60.0.3087.0.breakpad
312M Google Chrome Framework-60.0.3088.0.breakpad
313M Google Chrome Framework-60.0.3089.0.breakpad
313M Google Chrome Framework-60.0.3090.0.breakpad
313M Google Chrome Framework-60.0.3091.0.breakpad
313M Google Chrome Framework-60.0.3092.0.breakpad
313M Google Chrome Framework-60.0.3094.0.breakpad
313M Google Chrome Framework-60.0.3095.0.breakpad

I'm not sure where these came from. I feel like I've seen this before though.
,
May 19 2017
If those files haven't been accessed for a long time, I think it's safe to just delete them?
,
May 19 2017
Yeah, I agree. I think it should be safe to delete these files. I don't know where they're coming from, though. And I don't know why they aren't automatically cleaned up.
,
May 19 2017
,
May 19 2017
I'm running this on the bots:
cd /b/c/b/Mac_Builder/src/out/Release && find . -type f -mtime +7 -name '*breakpad' -exec rm '{}' \;
This should free up space for a while on these bots, at least for the short term.
,
May 19 2017
Ok, I should have stopped the bleeding. http://shortn/_HmWSOV7mcN (internal link) shows the disk usage going down. Now to figure out where these files are coming from, and why they aren't getting deleted.
,
May 19 2017
mark@ or rsesk@, could you help us? We compile official builds on the perf bots, to more accurately get real world characteristics. It looks like official builds generate these breakpad files, which is filling up our disk. It looks like the official builders do clobber builds every time, so they don't run into this problem. Why are these files generated? Is it safe to delete them after every build?
,
May 19 2017
The files are being created during the build by this rule: https://cs.chromium.org/chromium/src/chrome/BUILD.gn?l=1278 I don't know why we put the version number in the output file, but presumably it is safe to delete the files with older version numbers on the perf bots.
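To illustrate "delete the files with older version numbers", here is a minimal Python sketch. It assumes the filename shape seen in the listing in the issue description (`<name>-<four-part version>.breakpad`). The function name `stale_breakpad_files` and the regular expression are hypothetical and are not part of any existing bot script. It only selects candidates; deleting them is left to the caller.

```python
import re

# Assumed filename shape, e.g. "Google Chrome Framework-60.0.3095.0.breakpad".
BREAKPAD_RE = re.compile(r"^(?P<stem>.+)-(?P<version>\d+(?:\.\d+){3})\.breakpad$")


def stale_breakpad_files(filenames):
    """Return breakpad files older than the newest version sharing their stem."""
    newest = {}  # stem -> highest version tuple seen so far
    parsed = []  # (filename, stem, version tuple)
    for name in filenames:
        match = BREAKPAD_RE.match(name)
        if not match:
            continue  # not a versioned breakpad file, leave it alone
        stem = match.group("stem")
        version = tuple(int(part) for part in match.group("version").split("."))
        parsed.append((name, stem, version))
        if version > newest.get(stem, (-1,)):
            newest[stem] = version
    # Everything except the newest version of each stem is a deletion candidate.
    return sorted(name for name, stem, version in parsed if version < newest[stem])
```

Comparing versions as integer tuples avoids the string-ordering trap where "60.0.999.0" would sort after "60.0.3081.0".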
,
May 19 2017
It's fine to delete these.
,
May 20 2017
It's definitely safe to delete these. The version number is in the output file only because that's what GYP did; I don't know if there's a good reason for it.
,
May 22 2017
Do we have an automatic cleanup step in the builder recipe? It would be great if we could just scan all the files in out/ and remove the ones that haven't been touched for a long time.
,
May 22 2017
We do have scripts that clean the builders in a fairly ad-hoc way. We could add rules for something like this. The long-term vision is to move to ephemeral builders that are recycled regularly (which we're already doing for linux swarming bots), but we don't have an ETA for this for the whole fleet.
,
May 23 2017
+Dirk: do you have a pointer to the rules? I would want to add something like:
for each file_path in the out directory:
    if os.stat(file_path).st_atime < (now - 3 days):
        remove file_path
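The pseudocode above could be made runnable roughly as follows. This is a sketch only: the function name `remove_stale_files`, its parameters, and the dry-run default are assumptions for illustration and do not describe the existing cleanup tooling. Note that access times may not be updated on volumes mounted with options such as `noatime`, so an atime-based rule is only as reliable as the filesystem configuration.

```python
import os
import time


def remove_stale_files(out_dir, max_age_days=3, now=None, dry_run=True):
    """Find files under out_dir not accessed within max_age_days.

    Returns the list of stale paths; deletes them only when dry_run is False.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    stale = []
    for root, _dirs, files in os.walk(out_dir):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                if os.stat(file_path).st_atime < cutoff:
                    stale.append(file_path)
                    if not dry_run:
                        os.remove(file_path)
            except OSError:
                # The file vanished or is unreadable (e.g. a build is running); skip it.
                continue
    return stale
```

Running it once with `dry_run=True` and reviewing the returned paths before deleting anything is the safer way to try a rule like this on a live builder.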
,
May 23 2017
I think we could add something in here: https://chrome-internal.googlesource.com/infra/infra_internal.git/+/master/go/src/infra_internal/tools/cleanup_disk/cmd/cleanup_disk/main.go#31 I'm not an expert on that code though.
,
May 23 2017
Actually, there's already a rule which should be deleting these files, I think. So maybe there's a bug...
,
May 23 2017
cc-ing people. It looks like the
{Path: `/b/c/b/*/src/out/Release*/*`, MaxAge: tenHours},
line in cleanup_disk isn't deleting files like
/b/c/b/Mac_Builder/src/out/Release/Google Chrome Framework-60.0.3097.0.breakpad
Any ideas?
,
May 23 2017
I was curious if the regex was bad. https://play.golang.org/p/8BWsZEICM8 seems to say it should find the file.
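The same sanity check can be approximated in Python. This is a sketch under assumptions: the rule's `Path` reads like a shell-style glob rather than a regular expression, and Python's `glob` module is used here only as an analogue of whatever matcher the Go tool uses, so it says nothing definitive about cleanup_disk itself. The helper name `rule_matches` is hypothetical. It recreates the path under a temporary root and checks whether the pattern finds it.

```python
import glob
import os
import tempfile

# Pattern and sample path copied from the comment above, made relative to a temp root.
RULE_PATTERN = "b/c/b/*/src/out/Release*/*"
SAMPLE_FILE = "b/c/b/Mac_Builder/src/out/Release/Google Chrome Framework-60.0.3097.0.breakpad"


def rule_matches(pattern, relative_path):
    """Create relative_path under a temp directory and glob for it there."""
    with tempfile.TemporaryDirectory() as root:
        full_path = os.path.normpath(os.path.join(root, relative_path))
        os.makedirs(os.path.dirname(full_path))
        open(full_path, "w").close()
        # Escape the temp root so only the rule's own wildcards are interpreted.
        found = glob.glob(os.path.join(glob.escape(root), pattern))
        return full_path in [os.path.normpath(p) for p in found]
```

Calling `rule_matches(RULE_PATTERN, SAMPLE_FILE)` exercises the space-containing filename and the `Release*` component, the two parts of the pattern most likely to raise suspicion.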
,
May 23 2017
Ok, I found the problem. The binary that's on the bots is old, because a manual pin hasn't been updated for several months. I'll work on doing that.
,
May 23 2017
The following revision refers to this bug:
https://chrome-internal.googlesource.com/infra/infra_internal/+/a4b2cbe85459c24f5a2eb79f0724510e1053b172
commit a4b2cbe85459c24f5a2eb79f0724510e1053b172
Author: Stephen Martinis <martiniss@google.com>
Date: Tue May 23 21:18:59 2017
,
May 24 2017
Ok, I landed a change to cleanup_disk. It's canaried, so we have to wait until it runs. As far as I can tell, it runs once a day, at 0:25 in the morning. So I'll see how the affected bots are tomorrow, and then push to stable.
,
May 24 2017
For reference: http://shortn/_OPxxw1j1fv is a graph showing all bots disk usage. http://shortn/_FC4IFSUKli is a graph showing fyi bot disk usage.
,
May 25 2017
,
May 25 2017
Ok, I've landed a couple changes which should make it so the bots don't fill up their disk anymore. http://shortn/_V5YO0AURBa shows the graph where I manually deleted the big files. I'll monitor the bots going forward to see if the disk usage grows again.
,
Jun 12 2017
The following revision refers to this bug:
https://chrome-internal.googlesource.com/infra/puppet/+/d2ca657a9356945f6cfc4f39dae2db30e30cbf4d
commit d2ca657a9356945f6cfc4f39dae2db30e30cbf4d
Author: Stephen Martinis <martiniss@google.com>
Date: Mon Jun 12 22:54:00 2017
,
Aug 1 2017
,
Jan 22 2018
Comment 1 by martiniss@chromium.org
, May 19 2017