Mac builder bots aren't clearing versioned ".breakpad" files out of the build directory
Issue description
These builders aren't doing clobber builds. While extracting the builds linked from http://crbug.com/750176#c4 I noticed a bunch of
  inflating: full-build-mac/Google Chrome Framework-61.0.3143.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3144.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3145.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3146.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3147.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3148.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3149.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3150.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3151.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3152.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3153.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3154.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3155.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3156.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3157.0.breakpad
  inflating: full-build-mac/Google Chrome Framework-61.0.3158.0.breakpad
These aren't shipped or analysed, but each one is about 300MB. So it means, in this example, 8.7GB of the 11GB that the builder packages up in `full-build-mac.zip` is taken up by obsolete .breakpad files. (does this then get shipped around a bunch of perf/tester bots? That would probably slow things down a bunch).
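For anyone who wants to check how much of an archive is taken up by these files without extracting it, here is a minimal sketch using Python's standard `zipfile` module. The function name `breakpad_share` is made up for illustration and is not part of any builder tooling; it simply sums the uncompressed sizes recorded in the zip's directory.

```python
import zipfile


def breakpad_share(zip_path):
    """Return (breakpad_bytes, total_bytes) for an archive.

    Sizes are the uncompressed sizes stored in the zip directory, so the
    archive is never extracted. Hypothetical helper, for illustration only.
    """
    breakpad_bytes = 0
    total_bytes = 0
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            total_bytes += info.file_size
            # Count every entry whose name ends in ".breakpad".
            if info.filename.endswith(".breakpad"):
                breakpad_bytes += info.file_size
    return breakpad_bytes, total_bytes


if __name__ == "__main__":
    import sys

    bp, total = breakpad_share(sys.argv[1])
    print(f"{bp / 1e9:.1f}GB of {total / 1e9:.1f}GB is .breakpad files")
```

Run it with the path to a downloaded `full-build-mac.zip` as the only argument.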
,
Aug 3 2017
Annie/Ned -- does this bump up the priority of our discussions around crash stacks/better apis from chrome/solder this morning? Seems like a pretty decent storage hit...
,
Aug 3 2017
I don't think this is closely related. The fix here seems to be making sure that the builder can clean up the old files properly? I remember Stephen wrote a fix a while ago, so this bug may no longer be relevant.
,
Aug 3 2017
,
Aug 3 2017
Cleanup disk isn't running on these machines. There's a larger fix to this that agable@ was working on, which would fix cleanup disk so it doesn't fail when the machine reboots while it's running. I have a CL out to not reboot our builders after every build, since I don't think we need to do that.
,
Aug 8 2017
I had to disable my minidump unittests because there were breakpad files in there that shouldn't have been, and none there when I needed them. The tests were very flaky. This could be due to cleanup issues. It also makes it hard to determine which breakpad file to grab when trying to symbolize a crash dump. I feel like this is part of the same pattern of crash reports not being where we think they are and not being cleaned up properly. Maybe I should try the unittest again to see if things are in a better state.
,
Aug 25 2017
Stephen - assigning to you so we don't have to see this in the untriaged queue.
,
Nov 9 2017
This is still failing. The builders are breaking again, see https://uberchromegw.corp.google.com/i/chromium.perf/builders/Mac%20Builder/builds/136316
agable@, do you know if your fix was ever deployed?
,
Nov 10 2017
So... cleanup_disk hasn't run on these bots in a long time. I've manually run it on several machines, and it deleted a bunch of old log files, but none of the actual bad breakpad files. I don't understand that. I also don't understand why these files are being created in the first place. They really shouldn't be here....
,
Nov 10 2017
I found the problem. https://chrome-internal-review.googlesource.com/c/infra/infra_internal/+/434433 was landed a while ago, which stopped cleanup_disk from touching output directories. This is a problem for us. Although we really shouldn't be generating this many files.
,
Nov 10 2017
I ssh-ed to all the bots and removed the breakpad files. So we're safe for a few months. But we really should solve this problem of generating so many breakpad files.
,
Dec 27 2017
We can't really revert the CL in #10, because it was landed for valid reasons. I'm going to make a "small revert" of it, which only deletes breakpad files that are too old. https://chromium-review.googlesource.com/c/chromium/tools/build/+/845037
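To illustrate the idea of only deleting breakpad files that are too old, here is a rough sketch. It is not the code from the CL above: the function name `remove_stale_breakpad`, the 7-day default, and the use of file modification time as the age signal are all assumptions made for this example.

```python
import os
import time

SECONDS_PER_DAY = 86400


def remove_stale_breakpad(build_dir, max_age_days=7, now=None, dry_run=False):
    """Delete *.breakpad files in build_dir older than max_age_days.

    Hypothetical helper: age is judged by mtime, and only the top level of
    build_dir is scanned. Returns the sorted names of the files that were
    (or, with dry_run=True, would be) removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    removed = []
    for entry in os.scandir(build_dir):
        # Skip directories and symlinks; leave every non-breakpad file alone.
        if not entry.is_file(follow_symlinks=False):
            continue
        if not entry.name.endswith(".breakpad"):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            if not dry_run:
                os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Calling it with `dry_run=True` first lists what would be deleted without touching the output directory, which seems prudent given the reasons the original CL was landed.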
,
Jun 19 2018
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/infra_internal/+/0425201a1fe664bb693f0cda2da2fcb4e2129bd3 commit 0425201a1fe664bb693f0cda2da2fcb4e2129bd3 Author: Stephen Martinis <martiniss@google.com> Date: Tue Jun 19 00:08:14 2018
,
Jul 2
Issue 854950 has been merged into this issue.
,
Jul 2
Whoops, I forgot to land a CL to push this out to prod. Sorry!
,
Jul 2
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/68f997930e65b5ff112ed6f50cee17002b8a3ce5 commit 68f997930e65b5ff112ed6f50cee17002b8a3ce5 Author: Stephen Martinis <martiniss@google.com> Date: Mon Jul 02 23:43:09 2018
,
Jul 2
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/4d3806cce432cc6ce1b6a7a07636323127c04073 commit 4d3806cce432cc6ce1b6a7a07636323127c04073 Author: Stephen Martinis <martiniss@google.com> Date: Mon Jul 02 23:54:44 2018
,
Jul 3
#18 promotes this to stable. I manually ran cleanup disk on a few bots, and it deleted 26 GB of stuff. So it should help. IIRC it runs at about 2 AM PST, so by tomorrow we should be fine on these alerts.
,
Jul 3
Attached is a graph of machine disk usage on chromium.perf. Looks like it made a big dent!
,
Jul 10
Comment 1 by benhenry@chromium.org, Aug 3 2017
Components: Infra>Client>Perf
Labels: -Pri-3 -Performance-Sheriff Pri-2