[Flake] Flake Detection: Include bot_id in flake occurrence
Issue description
Per a user's request, bot_id is a useful piece of information for finding flakes caused by hardware misconfiguration. We should include bot_id in the flake occurrence model and display it in the UI.
,
Oct 10
,
Oct 11
There is no way to get the bot ID from the SQL query, since that information is missing there. We need to plumb it through on the Findit side. I could take this, since I have already done some investigation.
,
Oct 11
liaoyuke@ and I gave it a try earlier this week, and it looks like we can get bot_id from build.swarming_dimensions or something similar?
,
Oct 12
On a per-shard basis, failures like this one: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20FYI%20Release%20%28Intel%20HD%20630%29/2342 have the bot ID under "bot assigned to task": https://chromium-swarm.appspot.com/task?id=407460a820aedf10&refresh=10&show_raw=1
Thinking about this more, if multiple shards failed, then potentially multiple bot IDs will need to be displayed. This information will still be very useful, at least for the tests that run on physical hardware. A significant percentage of the flakes seen nowadays come from misconfigured machines, so all of the failures will show up on a small number of bots, which makes it faster to identify the cause.
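To illustrate why the bot ID helps, here is a minimal sketch of grouping flake occurrences by bot. It assumes each occurrence is a plain dict with a `bot_id` key; that shape, the function names, and the `min_share` threshold are all hypothetical and are not Findit's actual model.

```python
from collections import Counter


def count_occurrences_by_bot(occurrences):
    """Count flake occurrences per bot, skipping entries without a bot_id."""
    return Counter(o["bot_id"] for o in occurrences if o.get("bot_id"))


def suspicious_bots(occurrences, min_share=0.5):
    """Return bots that account for at least `min_share` of all occurrences.

    `min_share` is an arbitrary illustrative threshold, not a tuned value.
    """
    counts = count_occurrences_by_bot(occurrences)
    total = sum(counts.values())
    if total == 0:
        return []
    return [bot for bot, n in counts.most_common() if n / total >= min_share]


# Hypothetical occurrences: three on one bot, one on another.
example = [
    {"test": "SomeTest.Case", "bot_id": "bot-a"},
    {"test": "SomeTest.Case", "bot_id": "bot-a"},
    {"test": "OtherTest.Case", "bot_id": "bot-a"},
    {"test": "SomeTest.Case", "bot_id": "bot-b"},
]
```

If most occurrences of a flake cluster on one or two bots, as in `example`, that points toward a machine problem rather than a test problem.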
,
Oct 12
A test will only be run in one shard, so for each flake occurrence there will be only one bot ID. We can tell which test ran in which shard, since the output.json of each shard has that information. The task result of each shard includes a bot ID, and the Swarming API provides it. Hopefully this isn't urgent, so I will pick up this task in a week or two.
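The lookup described above could be sketched roughly as follows. This assumes the per-shard data has already been fetched and reduced to dicts holding a `bot_id` (from the shard's task result) and a list of `tests` (from the shard's output.json). That input shape and the function name are hypothetical, and the actual Swarming API calls are left out.

```python
def map_tests_to_bot_ids(shards):
    """Build a test name -> bot_id mapping from per-shard summaries.

    Each shard is assumed to look like:
        {"bot_id": "<bot that ran the shard>", "tests": ["<test name>", ...]}
    Since a test runs in only one shard, each test maps to exactly one bot.
    """
    test_to_bot = {}
    for shard in shards:
        for test_name in shard["tests"]:
            test_to_bot[test_name] = shard["bot_id"]
    return test_to_bot


# Hypothetical two-shard build.
shards = [
    {"bot_id": "bot-a", "tests": ["Suite.Test1", "Suite.Test2"]},
    {"bot_id": "bot-b", "tests": ["Suite.Test3"]},
]
test_to_bot = map_tests_to_bot_ids(shards)
```

With such a mapping, the bot ID for a flake occurrence is a single dictionary lookup on the flaky test's name.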
Comment 1 by chanli@chromium.org, Oct 10