explain high provision failure rate for samus on viceroy
Issue description
Compare the provision failure rate for cyan: https://viceroy.corp.google.com/chromeos/special_tasks?board=cyan&type=Provision&success=&topstreams=5&duration=870851#_VG_SNxSGuSj and samus: https://viceroy.corp.google.com/chromeos/special_tasks?board=parrot&type=Provision&success=&topstreams=20&duration=870851#_VG_SNxSGuSj
samus is showing more failures than successes. Why are there so many failures on samus?
,
Nov 9 2016
,
Nov 10 2016
Some scary links: https://wmatrix.googleplex.com/unfiltered?hide_missing=True&tests=provision_AutoUpdate.double&releases=55&days_back=100 https://wmatrix.googleplex.com/unfiltered?hide_missing=True&tests=provision_AutoUpdate.double&days_back=100&releases=54 https://wmatrix.googleplex.com/unfiltered?hide_missing=True&tests=provision_AutoUpdate.double&days_back=100&releases=56
,
Nov 10 2016
Nothing obvious in R55 https://crosland.corp.google.com/log/8872.7.0..8872.8.0
,
Nov 10 2016
Nothing obvious for R54 https://crosland.corp.google.com/log/8872.7.0..8872.8.0 Maybe a push to prod happened around October 12th?
,
Nov 10 2016
Charlene did a push to prod on October 12th, and Xixuan had 2 relevant changes in it (git log a22c4a8..f88507a): https://chromium-review.googlesource.com/393886 https://chromium-review.googlesource.com/394268 Notice that M56 is green, so we may be missing backintegrates for the client?
,
Nov 16 2016
Based on the comments, I am assuming this is specific to ChromeOS, hence labeling accordingly. Please remove the label if that's not the case.
,
Nov 17 2016
So, there really are two issues here:
1) Bad push to prod (#5).
2) Confusing graphs (original issue).
I apologise for mixing!
Luigi: 1) your second link is for parrot, not samus. The graph link for samus is https://viceroy.corp.google.com/chromeos/special_tasks?board=samus&type=Provision&success=&topstreams=20&utc_end=1478757404&duration=870851
I just (Wed Nov 16 22:08:44 PST 2016) locked 3 bad samus:
chromeos1-row1-rack1-host6 NO 2016-11-16 19:35:35 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row1-rack1-host6/58863118-repair/
chromeos4-row12-rack5-host21 NO 2016-11-16 21:36:40 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row12-rack5-host21/58863563-repair/
chromeos2-row4-rack1-host19 NO 2016-11-16 21:39:49 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack1-host19/554019-repair/
Let's see if the samus numbers look better tomorrow.
,
Nov 17 2016
Looking at the very last bit of the graph, it looks like the red graph (very bottom) dropped a little in the past few hours while the green graph went up: https://viceroy.corp.google.com/chromeos/special_tasks?board=samus&duration=870851&success=&topstreams=20&type=Provision
,
Nov 18 2016
Looking at the graph in the last comment again, the green spike at the end has fully reverted and red is winning.
,
Nov 18 2016
That said, we have freshly misbehaving samus:
chromeos2-row4-rack2-host11 NO 2016-11-16 17:15:00 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack2-host11/552538-repair/
chromeos1-row1-rack1-host2 NO 2016-11-17 16:07:34 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row1-rack1-host2/562382-repair/
chromeos1-row1-rack1-host6 NO 2016-11-16 19:35:35 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row1-rack1-host6/58863118-repair/
chromeos4-row12-rack5-host21 NO 2016-11-16 21:36:40 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row12-rack5-host21/58863563-repair/
chromeos2-row4-rack1-host19 NO 2016-11-17 16:07:26 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack1-host19/562381-repair/
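To tell which of these hosts were already on the earlier list and which are new, the two host lists posted in this thread can be compared mechanically. The following is only a rough sketch: the record layout (hostname, status, timestamp, log URL) is inferred from the lines pasted above, and the regular expression and function name are my own, not part of any lab tool.

```python
import re
from datetime import datetime

# Assumed record layout, inferred from the pasted lines:
#   <hostname> <status> <YYYY-MM-DD HH:MM:SS> <log URL>
RECORD = re.compile(
    r"(?P<host>chromeos\d+-row\d+-rack\d+-host\d+)\s+"
    r"(?P<status>\S+)\s+"
    r"(?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<url>\S+)"
)

def parse_records(text):
    """Return {host: (status, timestamp, url)} for each record found in text."""
    records = {}
    for m in RECORD.finditer(text):
        when = datetime.strptime(m.group("when"), "%Y-%m-%d %H:%M:%S")
        records[m.group("host")] = (m.group("status"), when, m.group("url"))
    return records

# The two lists exactly as pasted in this thread.
FIRST_LIST = """
chromeos1-row1-rack1-host6 NO 2016-11-16 19:35:35 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row1-rack1-host6/58863118-repair/
chromeos4-row12-rack5-host21 NO 2016-11-16 21:36:40 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row12-rack5-host21/58863563-repair/
chromeos2-row4-rack1-host19 NO 2016-11-16 21:39:49 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack1-host19/554019-repair/
"""
SECOND_LIST = """
chromeos2-row4-rack2-host11 NO 2016-11-16 17:15:00 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack2-host11/552538-repair/
chromeos1-row1-rack1-host2 NO 2016-11-17 16:07:34 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row1-rack1-host2/562382-repair/
chromeos1-row1-rack1-host6 NO 2016-11-16 19:35:35 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row1-rack1-host6/58863118-repair/
chromeos4-row12-rack5-host21 NO 2016-11-16 21:36:40 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row12-rack5-host21/58863563-repair/
chromeos2-row4-rack1-host19 NO 2016-11-17 16:07:26 http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row4-rack1-host19/562381-repair/
"""

earlier = parse_records(FIRST_LIST)
later = parse_records(SECOND_LIST)

# Hosts on both lists versus hosts that only show up on the second one.
repeat_hosts = sorted(set(earlier) & set(later))
new_hosts = sorted(set(later) - set(earlier))

print("repeat:", repeat_hosts)
print("new:   ", new_hosts)
```

With the two lists above, the hosts that appear only on the second list are chromeos1-row1-rack1-host2 and chromeos2-row4-rack2-host11.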
,
Nov 18 2016
The 2 FAFT DUTs from yesterday got unlocked somehow. I left them unlocked. I locked chromeos1-row1-rack1-host2 and chromeos2-row4-rack2-host11.
,
Nov 18 2016
The misbehaving DUTs are the same as yesterday. The provision graph came back up. I would love to see documentation explaining what each graph means; I find them hard to interpret. In this case, maybe the faft pools (say chromeos2-row4-rack2-host11) dominate everything else? At this stage we don't know what the counts mean or where they come from. Until we figure out how to convert the graphs into action, I will not look at this any further.
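To make the "a few DUTs dominate everything else" hypothesis concrete, here is a small sketch of how a couple of hosts stuck in a provision-fail loop could push a board-wide graph to "more failures than successes" even when every other host is mostly healthy. All host names and counts below are invented purely for illustration; they are not taken from viceroy or from any real samus data, and we still don't know how the real counts are computed.

```python
from collections import Counter

# HYPOTHETICAL provision outcomes as (host, succeeded) pairs.
# Two hosts fail repeatedly; eight others succeed 10 times and fail once each.
events = (
    [("dut-bad-1", False)] * 40
    + [("dut-bad-2", False)] * 30
    + [(f"dut-ok-{i}", True) for i in range(8) for _ in range(10)]
    + [(f"dut-ok-{i}", False) for i in range(8)]
)

def failure_rate(evts):
    """Fraction of provision attempts that failed."""
    failures = sum(1 for _, ok in evts if not ok)
    return failures / len(evts)

# Rank hosts by how many failures each one contributes.
failures_by_host = Counter(host for host, ok in events if not ok)
top_offenders = [host for host, _ in failures_by_host.most_common(2)]

# Recompute the rate with the two worst hosts excluded.
without_top = [(h, ok) for h, ok in events if h not in top_offenders]

print("top offenders:", top_offenders)
print("overall failure rate: %.2f" % failure_rate(events))
print("rate without top offenders: %.2f" % failure_rate(without_top))
```

If the real per-host counts behind the graph could be exported, the same breakdown would show whether locking a handful of DUTs should be expected to move the red line at all.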
,
Dec 8 2016
Based on the new metrics (http://shortn/_xFgDfticRt), I don't see that the cyan provision failure rate is much lower than samus's, so I will close this bug for now.
Comment 1 by semenzato@chromium.org, Nov 9 2016
Attachment: 69.5 KB