Swarming: store "device_status" in task output
Issue description
Before swarming there used to be a "device_status" step on Android perf bots which included a lot of useful information to figure out the details of devices running on that bot, e.g.:
{
"adb_status": "device",
"battery": {
[...]
},
"blacklisted": false,
"imei_slice": "",
"ro.build.description": "aosp_bullhead-userdebug 6.0.1 MOB30K 2787339 test-keys",
"ro.build.fingerprint": "Android/aosp_bullhead/bullhead:6.0.1/MOB30K/2787339:userdebug/test-keys",
"ro.build.id": "MOB30K",
"ro.build.product": "bullhead",
"serial": "01fc0cb218c2a26c",
"usb_status": true,
"wifi_ip": ""
},
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus5X_WebView_Perf__1_%2F3981%2F%2B%2Frecipes%2Fsteps%2Fdevice_status%2F0%2Flogs%2Fjson.output%2F0
I've seen that tests on swarmed bots have a "swarming.summary" log with some of the information, e.g.:
"bot_dimensions": {
"android_devices": [
"1"
],
"device_os": [
"L",
"LMY47W"
],
"device_type": [
"sprout"
],
"id": [
"build18-b1--device2"
],
"os": [
"Android"
],
"pool": [
"Chrome-perf"
]
},
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_One_Perf%2F85%2F%2B%2Frecipes%2Fsteps%2Fblink_perf.dom.reference_on_Android%2F0%2Flogs%2Fswarming.summary%2F0
But is there some log where we could find the rest of the details? I'm particularly interested in the "ro.build.description" and "ro.build.fingerprint".
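For reference, a minimal sketch of reading the fields that the excerpt above does expose. It assumes the "bot_dimensions" mapping sits at the top level of the JSON text, which is a simplification for illustration (the real swarming.summary may nest it deeper), and the function name is hypothetical.

```python
import json


def summarize_bot_dimensions(summary_text):
    """Flatten a bot_dimensions mapping (name -> list of values) to strings."""
    data = json.loads(summary_text)
    # Assumption: "bot_dimensions" is a top-level key in the given text.
    dims = data.get("bot_dimensions", {})
    return {name: ",".join(values) for name, values in sorted(dims.items())}


# The excerpt from the issue description, wrapped in an object for parsing.
example = """
{"bot_dimensions": {
  "android_devices": ["1"],
  "device_os": ["L", "LMY47W"],
  "device_type": ["sprout"],
  "id": ["build18-b1--device2"],
  "os": ["Android"],
  "pool": ["Chrome-perf"]
}}
"""

flat = summarize_bot_dimensions(example)
print(flat["device_os"])  # prints: L,LMY47W
```

Note that nothing in these dimensions carries "ro.build.description" or "ro.build.fingerprint", which is the gap this issue is about.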
,
May 22 2017
I would love to see if there is a way to make this more generic, so we can check on non-Telemetry tests too. But if logging this from the test itself is the more reasonable solution in swarming world, I'm happy to make those Telemetry changes too.
,
May 23 2017
I really miss having all the info (especially OS version) right in the buildbot status page too.
,
May 23 2017
So, there aren't any devices attached to the bot running the buildbot process. It triggers swarming tasks, which actually have the devices. The swarming.summary is about all we have right now. We could maybe produce a summary of all the devices that the tests in that build ran on, if that makes sense. That would generalize better to other bots. We do technically have this: https://chromium-swarm.appspot.com/bot?id=build10-b1--device4&show_state=true&sort_stats=total%3Adesc -- This is the state for a bot. We could theoretically grab that and dump it somewhere.
,
May 24 2017
> We do technically have this:
> https://chromium-swarm.appspot.com/bot?id=build10-b1--device4&show_state=true&sort_stats=total%3Adesc -- This is the state for a bot. We could theoretically grab that and dump it somewhere.
That could be very useful. Just putting links to that page from the buildbot status page would be an improvement. Even better if we can grab that status info and dump it to logdog.
Aside: I see it has "ro.build.fingerprint"; could we also add "ro.build.description"? For some reason, on some devices we can only use the latter to identify a build as "svelte" (e.g. go/ithwy).
,
Mar 30 2018
This could be useful; not really my purview anymore.
,
Apr 3 2018
+jbudorick maybe someone from your team could help with this?
,
Apr 3 2018
This looks like it'd require some work both in swarming and in the recipes if we want to surface info on the build page.
,
May 17 2018
Another example where this could have come in handy: issue 843516. It would be great if even just the full "ro.build.description" of the device could be shown somewhere in the swarming interface.
,
May 17 2018
Click the "+" button next to "State" on a bot's page. It shows some properties from the device, ro.build.fingerprint being one of them.
,
May 18 2018
Hmm.. I can't seem to find it. For example, I would like to know the fingerprint of the device that ran this task: https://chrome-swarming.appspot.com/task?id=3d8518b8959d6a10&refresh=10&show_raw=1 From there I was able to find this: https://chrome-swarming.appspot.com/bot?id=build204-b7--device7&show_state=true&sort_stats=total%3Adesc But I don't see the "ro.build.fingerprint" there.
,
May 18 2018
That's because it's unauthorized.
,
May 21 2018
But the device *was* authorized at the time that the task ran. Right? What I want is that historical record of what the device was when the test ran.
,
May 21 2018
That's correct. The only thing persisted at the time of the task is the dimensions of the bot. If there are other aspects of the device that your tasks depend on but that aren't adequately described by its dimensions, I suggest adding them. If you describe the things you're looking for, I can look into adding them to the dimensions.
,
May 22 2018
Well, I want "all of it". Ideally we would like to keep all sorts of device information (e.g. fingerprint, build description, serial no, temperature, who knows what else ...). It's not that the tasks *depend* on this information; it's more like a set of useful things to know about a device when trying to diagnose or troubleshoot an issue.
I appreciate that bot dimensions, or even the swarming UI, might not be the right place for this information. Other possible options are:
1. On one extreme, this could be added as debug output from the test itself (Telemetry in our case).
2. Somehow in the chromium.perf recipe: before running the test, also run some "device_status" script to gather info on the available device and expose it on the build page, e.g. somewhere on: https://ci.chromium.org/buildbot/chromium.perf/Android%20Nexus5X%20Perf/1782
3. Somewhere even more generic, so it applies not only to chromium.perf, but also to other waterfalls (internal swarmed bots, pool of bisect bots, etc.)
Option 1 is the easiest for me to implement. But, although the technical details of what is needed for 2-3 are hazy to me, I believe something more general along those lines might be desirable?
+eyaich FYI - as you might be more familiar with what is needed for something like option 2.
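Options 1 and 2 above both come down to querying the device for its properties around the time the test runs. A minimal sketch, assuming `adb` is on the PATH of the machine running the task; the function names and the chosen property list are hypothetical, and the `getprop` callable is injectable so the collection logic can be exercised without a device.

```python
import subprocess

# Properties mentioned in this issue; extend as needed.
PROPERTIES = (
    "ro.build.description",
    "ro.build.fingerprint",
    "ro.build.id",
    "ro.build.product",
)


def adb_getprop(serial, prop):
    """Read one system property from the device with the given serial."""
    out = subprocess.check_output(
        ["adb", "-s", serial, "shell", "getprop", prop])
    return out.decode("utf-8", "replace").strip()


def collect_device_status(serial, getprop=adb_getprop):
    """Return a dict shaped loosely like the old "device_status" output."""
    status = {"serial": serial}
    for prop in PROPERTIES:
        status[prop] = getprop(serial, prop)
    return status
```

With a device attached, `collect_device_status("01fc0cb218c2a26c")` would return the serial plus the four build properties; the result could then be logged as debug output or written to a file.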
,
May 22 2018
I don't think bot dimensions are the right place for this, since this is debugging information and not information you will target when triggering the test. I think this is best done as part of the test run (either before or after); otherwise, as Juan's use case points out, you might lose critical information about the state the bot was in while the test was running. I think you have two options here:
a) Add it to swarming, so that this information is output into the summary json. This would give you lots of options for display once the swarming task returns.
b) Have Telemetry, or more likely the script that runs on the bot and runs Telemetry, output this debugging information. A likely place for this would be an artifact in the json test results format that can get plumbed through to SOM. Not sure if this is where you would prefer it to surface, but this would be very specific to the perf use case.
Otherwise, any solution you implement that queries swarming in the recipe after the tasks return (i.e. when you know what devices they ran on) will be subject to stale info.
,
May 23 2018
By option "a", do you mean adding that info to a swarming.summary like this one? https://logs.chromium.org/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus5X_Perf%2F1782%2F%2B%2Frecipes%2Fsteps%2Fangle_perftests_on_Android%2F0%2Flogs%2Fswarming.summary%2F0 That would be fantastic! If possible I would strongly prefer that option.
,
May 23 2018
Since swarming.summary is already surfaced as a link on the buildbot page, this might be sufficient. Adding Marc-Antoine to see if he has insight on the swarming side for adding the state of the bot to the summary.
,
May 23 2018
My 2 cents: besides using these for debugging, it would also be great if we could add assertions on them to make tests fail in case of misconfigured devices. It is also possible that this "assertion" should be achieved by some other mechanism, like swarming dimensions or quarantine; I am not sure.
,
May 23 2018
> My 2 cents: besides using these for debugging, it would also be great if we could add assertions on them to make tests fail in case of misconfigured devices.
I think this "assertion" is a separate use case, and that *is* what bot dimensions are for. If there is a value that a test should *expect* to see on a device, then that info should be included in the bot dimensions when scheduling the task.
As I mentioned in #15, this is about a different issue: having a place to dump useful information about the *state* that a device had at the time when the task ran.
,
May 23 2018
I see two options:
- Snapshot the bot's state into the task result. It would be the state of the bot at the beginning of the task, not at the end.
- Store metadata in the task's output, in a well-known file.
They are not mutually exclusive; both can be done. The bot's dimensions are already snapshotted. Ref: https://cs.chromium.org/chromium/infra/luci/appengine/swarming/server/task_result.py?l=414
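The second option could look roughly like the sketch below. It assumes the task is given an output directory whose contents are kept with the task result; the file name `device_status.json` and both function names are hypothetical, since no well-known name has been agreed on in this thread.

```python
import json
import os
import tempfile

# Hypothetical well-known file name; not an established convention.
METADATA_FILENAME = "device_status.json"


def write_task_metadata(output_dir, metadata):
    """Write device metadata as JSON into the task's output directory."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, METADATA_FILENAME)
    with open(path, "w") as f:
        json.dump(metadata, f, indent=2, sort_keys=True)
    return path


def read_task_metadata(output_dir):
    """Read the metadata back, e.g. from a downloaded copy of the output."""
    with open(os.path.join(output_dir, METADATA_FILENAME)) as f:
        return json.load(f)


# Round trip through a temporary directory standing in for the output dir.
out_dir = tempfile.mkdtemp()
write_task_metadata(out_dir, {"serial": "01fc0cb218c2a26c",
                              "ro.build.id": "MOB30K"})
print(read_task_metadata(out_dir)["ro.build.id"])  # prints: MOB30K
```

A fixed file name lets any consumer (recipe, dashboard, or a person debugging) find the data without knowing which test produced it.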
,
Jun 7 2018
I would be very happy with either of those.
,
Jun 7 2018
Filed the snapshot idea to issue 850560. Let's focus this issue on the second part, which is storing metadata in the task output.
,
Jan 16
,
Jan 16
Comment 1 by nedngu...@google.com, May 22 2017
Components: Speed>Telemetry