Improve failure messages for "provision Failure" auto-generated failures
Issue description
Searching for "provision Failure" results in a very large number of autofiled issues: https://bugs.chromium.org/p/chromium/issues/list?can=2&q=%22provision%20Failure%20%22&sort=-status&colspec=ID%20Pri%20M%20Status%20Owner%20Summary%20Modified
We need to track down the root cause(s) of these failures and:
a) Change the description so that, if there is more than one cause, that is exposed in the summary.
b) File issues and fix the top root causes.
,
Mar 8 2016
,
Mar 8 2016
Linking to issue 591628 and removing the top-level Code-Yellow label.
,
Apr 4 2016
,
May 2 2016
Another provision failure today on the trick-pfq builder, due to an ssh timeout; it is not clear whether it's legit or just an infra flake: https://bugs.chromium.org/p/chromium/issues/detail?id=589367#c31
,
May 2 2016
I don't quite understand what this bug is for. The reason for the provision failure is already listed in the bug. The root cause is not always easy to detect. The bug already lists all the useful links for developers to find the root cause. For example, an ssh timeout error could be a network flake, or something wrong on the DUT side. I don't think it is feasible to expose the *root cause*; the purpose of the bug is to help developers find the root cause.
,
May 2 2016
Historically this symptom has been a major pain point for Gardeners, which is why it was identified as something we need to improve and/or triage better. 5 weeks ago we had >100 pfq failures with "provision Failure" in the Summary: https://bugs.chromium.org/p/chromium/issues/list?can=2&q=pfq+%22provision+Failure%22+opened%3Etoday-35+opened%3C%3Dtoday-28&sort=-modified+pri&colspec=ID+Pri+M+Status+Owner+Summary+OS+Modified+autofiled&x=m&y=releaseblock&cells=ids At that failure rate, "the root cause is not easily detected" is a huge problem. Last week we only had 8 pfq failures with that symptom, so I think we may have addressed much of this issue with other fixes. If that trend continues (i.e., the occurrence of the symptom remains low), we can lower the priority on this or resolve it WontFix.
,
May 10 2016
,
May 10 2016
Reassigning to xixuan@, since she is working on redesigning the devserver workflow, which will change the provision process. Xixuan, can you try to improve the provisioning error message when you implement your design for the devserver?
,
May 10 2016
Yep, it's in the plan and I'll try my best~
,
May 27 2016
,
Jun 3 2016
Comment 1 by akes...@chromium.org
, Mar 8 2016