Bob-paladin: logging_CrashSender fails in policy signature verification
Issue description

It happens in 3 successive builds:
https://uberchromegw.corp.google.com/i/chromeos/builders/bob-paladin/builds/3797
https://uberchromegw.corp.google.com/i/chromeos/builders/bob-paladin/builds/3798
https://uberchromegw.corp.google.com/i/chromeos/builders/bob-paladin/builds/3799

In 3798 & 3799, logging_CrashSender failed in both, and the second retry ran too long and was aborted.

Assigning to the sheriff to investigate whether recent changes are making it flaky. cc gardener & ARC Constable.

If logging_CrashSender continues to fail, bob will be moved to experimental.
Aug 13
Let's keep this bug about logging_CrashSender; the login_RetrieveActiveSessions failure may be something completely different.

For logging_CrashSender, we find this in the log when the test tries to mock sending a crash:

20:56:37 DEBUG| crash_sender stdout/stderr:
[0812/205607:ERROR:device_policy_impl.cc(712)] Signature does not match the data or can not be verified!
[0812/205607:ERROR:device_policy_impl.cc(752)] Policy signature verification failed!

There's also a ~30-50 sec delay every time, which the test doesn't normally have, and these delays ultimately add up until the suite times out.

I think I narrowed this down to crash_sender calling 'metrics_library -c', which tries to check device policy (via libpolicy) for consent to send stats. As far as I can tell, this message appears when /var/lib/whitelist/owner.key and /var/lib/whitelist/policy do not match. It looks like the test installs its own versions of these files in https://cs.corp.google.com/chromeos_public/src/third_party/autotest/files/client/cros/crash/crash_test.py?g=0&l=195 to make sure that doesn't happen. Apparently something in that setup is breaking in this case, but I found no recent changes in the test, metrics_library or libpolicy that would be an obvious culprit.

This policy stuff is really way out of my area, and I don't know how any of it works or who's working on it. Dan, it looks like you have been reviewing most libpolicy changes recently. Can you find a suitable owner for this?
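In case it helps anyone triaging similar logs, here is a minimal Python sketch that pulls the device_policy_impl.cc errors out of an autotest debug line like the one quoted above and estimates the delay from the two timestamps. The function name `policy_failures` and both regexes are hypothetical and only fitted to that one quoted line. Treating the gap between the crash_sender timestamp and the autotest timestamp as the delay is an assumption, not something confirmed by the test code.

```python
import re
from datetime import datetime

# Chromium-style entries, e.g.
# "[0812/205607:ERROR:device_policy_impl.cc(752)] Policy signature ...".
# Assumption: this layout is inferred from the single quoted log line.
CHROME_LOG = re.compile(
    r"\[(?P<date>\d{4})/(?P<time>\d{6}):(?P<level>[A-Z]+):"
    r"(?P<file>[\w.]+)\((?P<line>\d+)\)\]\s*"
    r"(?P<msg>.*?)(?=\s*\[\d{4}/\d{6}:|$)"
)
# Autotest prefix, e.g. "20:56:37 DEBUG| ...".
AUTOTEST_PREFIX = re.compile(r"^(?P<time>\d{2}:\d{2}:\d{2}) (?P<level>[A-Z]+)\|")

# The log line quoted in this comment, joined into one string.
SAMPLE = (
    "20:56:37 DEBUG| crash_sender stdout/stderr: "
    "[0812/205607:ERROR:device_policy_impl.cc(712)] "
    "Signature does not match the data or can not be verified! "
    "[0812/205607:ERROR:device_policy_impl.cc(752)] "
    "Policy signature verification failed!"
)


def policy_failures(log_line):
    """Return (delay_seconds, messages) for one autotest line, or None.

    delay_seconds is the autotest timestamp minus the timestamp of the
    first device_policy_impl.cc error. Midnight rollover is not handled.
    """
    prefix = AUTOTEST_PREFIX.match(log_line)
    entries = [
        m for m in CHROME_LOG.finditer(log_line)
        if m["level"] == "ERROR" and m["file"] == "device_policy_impl.cc"
    ]
    if not prefix or not entries:
        return None
    logged = datetime.strptime(prefix["time"], "%H:%M:%S")
    emitted = datetime.strptime(entries[0]["time"], "%H%M%S")
    delay = (logged - emitted).total_seconds()
    return delay, [m["msg"].strip() for m in entries]


if __name__ == "__main__":
    print(policy_failures(SAMPLE))
```

For the quoted line the two timestamps are 20:56:07 and 20:56:37, which is in line with the low end of the ~30-50 sec delay.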
Aug 13
Note that there also seems to be a disk space exhaustion issue right now, caused by a kernel warning firing 10 times a second on Bob (and Kevin, but Kevin has bigger disks). I think it's quite possible that these weird failures are fallout from that (e.g. some daemon not being able to write something it wants to and then freaking out about it...). I filed it as issue 873822, so if we can't get any further here, maybe we should wait and see if fixing that magically resolves things.
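A quick way to check whether a device is in that state is to look at the free space on the affected partition. This is only a rough sketch: the helper names, the default path and the 5% threshold are all assumptions chosen for illustration, not values taken from the test or from issue 873822.

```python
import shutil


def free_fraction(path="/"):
    """Return the fraction of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a total size of zero.
        return 0.0
    return usage.free / usage.total


def is_nearly_full(path="/", threshold=0.05):
    """True if less than `threshold` of the space is free.

    The 5% default is an arbitrary cutoff picked for this sketch.
    """
    return free_fraction(path) < threshold


if __name__ == "__main__":
    # Hypothetical usage: check the partition where the logs live.
    print(f"free: {free_fraction('/'):.1%}, nearly full: {is_nearly_full('/')}")
```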
Aug 13
Sorry, I don't know much about this code. Adding some people who probably do.
Aug 20
The failure turned into a timeout in the same test: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?id=2859620

However, I can see the symptoms of issue 873687 in /var/log/messages, so the two are possibly related.
Aug 20
^^ I mean the symptoms of issue 873822.
Sep 4
logging_CrashSender looks green right now, closing WontFix.
Comment 1 by xixuan@chromium.org, Aug 13