Undetected poison change passed CQ for Fuchsia x64
Issue description

Filed by sheriff-o-matic@appspot.gserviceaccount.com on behalf of petermayo@google.com

content_unittests failing on chromium.linux/Fuchsia x64

Unable to launch content_unittests.

Accused CL: https://chromium-review.googlesource.com/1113946
Passed try run here: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/fuchsia_x64/51652

Builders failed on:
- Fuchsia x64: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Fuchsia%20x64
- Fuchsia x64 try: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/fuchsia_x64/51735-51939 and more extending that range (obviously some pass by not running the test, and some fail compile too)

(sorry I don't recall the CQ team label)
,
Jun 27 2018
,
Jun 27 2018
Issue 857176 has been merged into this issue.
,
Jun 27 2018
+fdegans who was helpful in reproing the failure and testing
,
Jun 27 2018
Looks like the bot is green again after the revert: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Fuchsia%20x64

Do you have an idea why the bot was failing? From https://logs.chromium.org/v/?s=chromium%2Fbuildbucket%2Fcr-buildbucket.appspot.com%2F8942539095223463824%2F%2B%2Fsteps%2Fcontent_unittests%2F0%2Fstdout I can't see how my change could cause this error:

[ERROR:garnet/bin/appmgr/root_loader.cc(71)] Could not load url: file://content_unittests
,
Jun 27 2018
Nor do I, but:
- reverting makes the error go away
- landing it on db409e2 or later causes it to fail
- landing it on bb3da314544e64b99dbbfd68042649be3dbd13a1 doesn't

To find out, you could do a git bisect, or you could use the chromium infrastructure and git cl try -r <rev> for the revs above (please keep the number of tries reasonable, e.g. do n-ary division).

Good luck
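For anyone unfamiliar with the "n-ary division" idea, here is a minimal sketch, not the way the infrastructure actually does it. It assumes the revisions are ordered oldest to newest, that the newest one is known bad, and that every revision after the first bad one is also bad. The `is_bad` callback is a hypothetical stand-in for "run a try job at this revision and check whether it fails".

```python
from typing import Callable, List, Sequence, Tuple


def nary_first_bad(
    revs: Sequence[str],
    is_bad: Callable[[str], bool],
    fanout: int = 4,
) -> Tuple[str, int]:
    """Return (first bad revision, number of rounds of probes).

    Assumptions (not from the thread): revs is ordered oldest to newest,
    revs[-1] is bad, and badness is monotonic (good...good, bad...bad).
    """
    lo, hi = 0, len(revs) - 1  # invariant: first bad index is in [lo, hi]
    rounds = 0
    while lo < hi:
        span = hi - lo
        n = min(fanout, span)
        # Evenly spaced probe indices inside [lo, hi - 1].
        probes: List[int] = sorted(
            {lo + (span * (i + 1)) // (n + 1) for i in range(n)}
        )
        # All probes of a round are evaluated up front, modelling try jobs
        # that are issued together and run in parallel.
        results = [(p, is_bad(revs[p])) for p in probes]
        for p, bad in results:
            if bad:
                hi = p  # first bad revision is at or before this probe
                break
            lo = p + 1  # first bad revision is after this probe
        rounds += 1
    return revs[lo], rounds


if __name__ == "__main__":
    # Hypothetical range of 34 revisions where r20 is the first bad one.
    revisions = [f"r{i}" for i in range(34)]
    culprit, rounds = nary_first_bad(revisions, lambda r: int(r[1:]) >= 20)
    print(culprit, rounds)
```

With `fanout=1` this degenerates to an ordinary bisect; a larger fanout trades more jobs per round for fewer rounds.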
,
Jun 27 2018
I did some bisecting locally. It looks like landing this CL after https://chromium.googlesource.com/chromium/src/+/1a64677500027ed83a9585040feb8a46438e7c58 is the source of the failure. I'm still not sure why, though.
,
Jun 27 2018
,
Jun 27 2018
Sheriffs: if this was the result of two conflicting changes landing at roughly the same time, it's not clear what you expect troopers (or cq folks) to do. Please clarify.
,
Jun 28 2018
re #9: Ideally we would find interesting ways to detect and avert. In previous discussions at the start of CQ days the belief was that these problems were vanishingly small. That makes the study of instances of them happening in real life interesting.

When filed, it was unclear why the test would fail by hanging/aborting/failing to launch and whether there was a confounding issue. That's still a little interesting, but probably after the devs involved find out how the two CLs conflicted. I don't think there is any trooper input on that now.

This would also be an interesting case for findit to have been able to detect and revert. Issuing try jobs for empty CLs at the revisions in the failure range would have found the CL to revert. 34 is clearly(?) too many try jobs, but a few layers could have done it quickly.

I would like the people who implement and monitor the test infrastructure to see what happened here more than dictate action or implementation for them.
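To put a rough number on "a few layers", here is a back-of-the-envelope sketch. It assumes the 34 refers to candidate revisions in the failure range, that each layer issues a fixed number of try jobs in parallel, and that the jobs split the remaining candidates as evenly as possible. The function name and parameters are made up for illustration.

```python
def layers_needed(candidates: int, jobs_per_layer: int) -> int:
    """Worst-case number of layers to narrow `candidates` down to one.

    Each layer of `jobs_per_layer` probes splits the remaining range into
    jobs_per_layer + 1 segments; the worst case keeps the largest one.
    """
    layers = 0
    while candidates > 1:
        # Ceiling division: size of the largest remaining segment.
        candidates = -(-candidates // (jobs_per_layer + 1))
        layers += 1
    return layers


if __name__ == "__main__":
    for jobs in (1, 3, 5):
        layers = layers_needed(34, jobs)
        print(f"{jobs} jobs/layer: {layers} layers, {jobs * layers} jobs at most")
```

Under those assumptions, a plain bisect (1 job per layer) takes 6 layers, 3 jobs per layer takes 3 layers (9 jobs), and 5 jobs per layer takes 2 layers (10 jobs), all well under 34.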
,
Jun 28 2018
I'm not sure how or if this could turn into a change. More details of how the changes conflicted would be useful in figuring out how/if this sort of thing could be detected.
,
Jun 28 2018
Blocked on issue 858692, I think?
,
Jul 6
,
Jul 10
Can we close this? From issue 858692 it looks like the content_unittests failure was fixed; I relanded the original CL.
,
Jul 10
Re #14: I hope not. This issue is about making the infrastructure better able to detect and identify this type of failure. Perhaps you have details of how a flaky package manager caused content_unittests to pass on the CQ run, but reliably fail after landing, and then reliably succeed after the revert? I.e., did the flakiness violate a precondition on which the system design is based?
,
Jan 10
Downgrading P2s that haven't been modified in more than 6 months, which also do not have a component or owner.
Comment 1 by petermayo@google.com
, Jun 27 2018