Need better UI presentation for infra failing builds due to lack of capacity in isolated tests
Issue description

As per go/top-cq-flakes: https://datastudio.google.com/c/reporting/12dYEpcepJ5_6ZOhprbd5GpDNooiUJONV/page/AYfX

Looking at a sample build: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/cast_shell_linux/178938

We see many errors of the form:

"""
Invalid Swarming task state: NO_RESOURCE
"""

Is this due to insufficient swarming capacity? If so I would expect a clearer message, and perhaps even a different presentation color just so we can distinguish that from other problems we need to investigate.

+ sergeyberezin, current CCI trooper
+ jbudorick, maruel, stgao
Nov 13
I have a strong suspicion that it was due to the network maintenance at that time: https://groups.google.com/a/google.com/forum/#!topic/chrome-infrastructure-announce/d_4zm3cY6ls It affected GCE bots and may therefore have affected large pools of bots after they were respawned by the Machine Provider (which happens every 24h).
Nov 13
The actual outage should now be over, and I don't see any suspicious failures in the recent history of the builder. Since the bug talks about UX more than the actual outage, I'll keep it open and rephrase the title. It's not really a trooper issue - adding it to chrome-client-infra-backlog list for tracking.
Nov 13
#2 is correct. See go/chops-pm-105 (internal) for context. I would definitely agree about the UI needing to be better.
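For illustration only, here is a rough sketch of the distinction asked for in the issue description: turning a raw Swarming task state into a category and a clearer message, so capacity problems can be shown differently from other failures. The function name `describe_task_state`, the category labels, and the message wording are all hypothetical and are not taken from the actual build UI or recipe code. The grouping of `BOT_DIED` and `EXPIRED` as generic infra failures is likewise an assumption.

```python
from typing import Tuple


def describe_task_state(state: str) -> Tuple[str, str]:
    """Map a Swarming task state to a (category, message) pair.

    Hypothetical helper: the categories and messages are made up for
    this sketch and do not reflect any existing UI code.
    """
    if state == "NO_RESOURCE":
        # The case from this bug: the task was refused because no bot
        # capacity was available to run it.
        return (
            "capacity",
            "No bot capacity was available for this Swarming task "
            "(state: NO_RESOURCE)",
        )
    if state in ("BOT_DIED", "EXPIRED"):
        # Assumption: treat these as generic infrastructure failures.
        return ("infra", "Infrastructure failure (Swarming task state: %s)" % state)
    # Fall back to the message format currently seen in the builds.
    return ("other", "Invalid Swarming task state: %s" % state)
```

A UI could then pick a distinct presentation color per category, for example a separate color for "capacity" so it stands out from failures that need investigation.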
Comment 1 by sergeybe...@chromium.org, Nov 13