CrOS VM logs are captured only some of the time (also clean up old logs)
Issue description

See https://chromium-review.googlesource.com/c/chromium/src/+/1193944/6 where a CL kept failing the VM sanity test. Chrome was crashing, and only one of the test runs had the VM logs: https://isolateserver.appspot.com/browse?namespace=default-gzip&hash=3eb20d0a20213f0d487b9c7ffcd34fd68c7219fd

From there you can see one of the Chrome logs with the source of the browser crash: https://isolateserver.appspot.com/browse?namespace=default-gzip&digest=36873fea20303b22607e48344098bec00e582b74&as=chrome_20180828-114804

Unfortunately, none of the other task runs had any logs, e.g.: https://isolateserver.appspot.com/browse?namespace=default-gzip&hash=adc857b53ab87de928795a68c13accef21a78d95

I should figure out why we seem to be capturing logs only some of the time. It might be a race condition when the test reaches its timeout. Cleaning up some of the old logs before starting the test also wouldn't be a bad idea.
,
Aug 28
> Are we re-using the VM between test runs? A fresh VM should have minimal logs.

Yeah. Until a new VM image gets published, we reuse the same one. (Though we start/stop it between tests, so it's just the files on disk that get persisted.)

> I can have the test runner delete logs before each run?

That would definitely be helpful. Not sure what the cleanest way to do that is, though. If there's no "flush_logs" utility, then maybe just an "rm *" in the log dirs?
,
Aug 28
Yup, I'll just rm the logs. We do something similar between test runs for other state files: https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/core/cros_interface.py?l=585-586
,
Sep 5
That sounds like what we want. Could you add that to cros_run_vm_test? I'm also seeing some VMs that have GBs worth of chrome crashes in /var/spool/crash/, causing the VM to run out of disk space. I bet some change that makes chrome crash gets tested and retried in the CQ, causing those crash dumps to pile up in the VMs.
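For illustration, here is a minimal sketch of the kind of pre-test cleanup being discussed. It is not the actual cros_run_vm_test change. The helper names, the /var/log/chrome path, and the localhost:9222 SSH endpoint are assumptions; only /var/spool/crash/ comes from this thread.

```python
import shlex

# /var/spool/crash is mentioned in this thread; /var/log/chrome is an
# assumed location for the Chrome logs, used here only for illustration.
LOG_DIRS = ["/var/log/chrome", "/var/spool/crash/"]


def build_cleanup_command(dirs):
    """Build a shell command that empties each directory but keeps the directory.

    The trailing /* is left unquoted on purpose so the VM's shell expands it.
    Note that a bare * does not match dotfiles.
    """
    parts = []
    for d in dirs:
        quoted = shlex.quote(d.rstrip("/"))
        # -f keeps rm quiet when the glob matches nothing.
        parts.append("rm -rf -- " + quoted + "/*")
    return " ; ".join(parts)


def build_ssh_argv(host, port, dirs):
    """Return an argv list (hypothetical helper) that runs the cleanup on the VM."""
    return ["ssh", "-p", str(port), "root@" + host, build_cleanup_command(dirs)]


if __name__ == "__main__":
    # Host and port are assumptions about how the VM's SSH port is forwarded.
    print(" ".join(build_ssh_argv("localhost", 9222, LOG_DIRS)))
```

The resulting argv could be handed to subprocess.run before the test starts, so that any logs found afterwards belong to the current run.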
,
Sep 27
I think this problem should be solved with the copy-on-write CL which we will use on the chrome bots. Devs are probably ok with logs accumulating.
,
Sep 27
,
Sep 28
Yep, log cleanup should be a non-issue (for Chrome at least) after bug 887753 is done.
Comment 1 by achuith@chromium.org
, Aug 28