
Issue 724977

Starred by 1 user

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 3
Type: Bug




Swarming: store "device_status" in task output

Project Member Reported by perezju@chromium.org, May 22 2017

Issue description

Before swarming there used to be a "device_status" step on Android perf bots which included a lot of useful information to figure out the details of devices running on that bot, e.g.:

  {
    "adb_status": "device",
    "battery": {
      [...]
    },
    "blacklisted": false,
    "imei_slice": "",
    "ro.build.description": "aosp_bullhead-userdebug 6.0.1 MOB30K 2787339 test-keys",
    "ro.build.fingerprint": "Android/aosp_bullhead/bullhead:6.0.1/MOB30K/2787339:userdebug/test-keys",
    "ro.build.id": "MOB30K",
    "ro.build.product": "bullhead",
    "serial": "01fc0cb218c2a26c",
    "usb_status": true,
    "wifi_ip": ""
  },
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus5X_WebView_Perf__1_%2F3981%2F%2B%2Frecipes%2Fsteps%2Fdevice_status%2F0%2Flogs%2Fjson.output%2F0


I've seen that tests on swarmed bots have a "swarming.summary" log with some of the information, e.g.:

      "bot_dimensions": {
        "android_devices": [
          "1"
        ],
        "device_os": [
          "L",
          "LMY47W"
        ],
        "device_type": [
          "sprout"
        ],
        "id": [
          "build18-b1--device2"
        ],
        "os": [
          "Android"
        ],
        "pool": [
          "Chrome-perf"
        ]
      },
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_One_Perf%2F85%2F%2B%2Frecipes%2Fsteps%2Fblink_perf.dom.reference_on_Android%2F0%2Flogs%2Fswarming.summary%2F0

But is there some log where we could find the rest of the details? I'm particularly interested in the "ro.build.description" and "ro.build.fingerprint".
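As a rough illustration of the gap described above, the sketch below loads a fragment shaped like the "bot_dimensions" excerpt and reports which of the wanted properties are absent. The fragment is wrapped in braces so that it parses as a complete JSON object, and the helper name missing_device_details is hypothetical; neither is taken from swarming itself.

```python
import json

# Fragment shaped like the "bot_dimensions" excerpt above, wrapped in braces
# so that it parses as a complete JSON object (an assumption for this sketch).
SUMMARY_FRAGMENT = """
{
  "bot_dimensions": {
    "android_devices": ["1"],
    "device_os": ["L", "LMY47W"],
    "device_type": ["sprout"],
    "id": ["build18-b1--device2"],
    "os": ["Android"],
    "pool": ["Chrome-perf"]
  }
}
"""

# Properties that the old "device_status" step reported and that are wanted here.
WANTED_KEYS = ("ro.build.description", "ro.build.fingerprint")


def missing_device_details(summary, wanted=WANTED_KEYS):
    """Return the wanted keys that are absent from the bot dimensions."""
    dimensions = summary.get("bot_dimensions", {})
    return [key for key in wanted if key not in dimensions]


summary = json.loads(SUMMARY_FRAGMENT)
print(missing_device_details(summary))
# prints ['ro.build.description', 'ro.build.fingerprint']
```

With the dimensions shown in the excerpt, both properties come back as missing, which is the situation this issue asks about.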

 
Cc: kbr@chromium.org
Components: Speed>Telemetry
This could also be a telemetry feature.
I would love to see if there is a way to make this more generic, so we can check on non-Telemetry tests too.

But if logging this from the test itself is the more reasonable solution in swarming world, I'm happy to make those Telemetry changes too.
I really miss having all the info (especially OS version) right in the buildbot status page too.
So, there aren't any devices attached to the bot running the buildbot process. It triggers swarming tasks, which actually have the devices.

The swarming.summary is about all we have right now. 

We could maybe get a summary of all the devices that the tests in that build ran on, if that makes sense. That would generalize better to other bots.

We do technically have this:
https://chromium-swarm.appspot.com/bot?id=build10-b1--device4&show_state=true&sort_stats=total%3Adesc -- This is the state for a bot. We could theoretically grab that and dump it somewhere. 
> We do technically have this:
> https://chromium-swarm.appspot.com/bot?id=build10-b1--device4&show_state=true&sort_stats=total%3Adesc -- This is the state for a bot. We could theoretically grab that and dump it somewhere.

That could be very useful. Just putting links to that page from the buildbot status page would be an improvement. Even better if we can grab that status info and dump it to a logdog.

Aside, I see it has "ro.build.fingerprint", could we also add "ro.build.description"? For some reason on some devices we can only use the latter to identify a build as "svelte" (e.g. go/ithwy)
Owner: ----
Status: Available (was: Assigned)
This could be useful; not really my purview anymore.
Cc: jbudorick@chromium.org
+jbudorick maybe someone from your team could help with this?
Cc: bpastene@chromium.org
Components: Infra>Client>Chrome
This looks like it'd require some work both in swarming and in the recipes if we want to surface info on the build page.
Cc: nednguyen@chromium.org
Another example where this could have come in handy: issue 843516

It would be great even if just the full "ro.build.description" of the device could be shown somewhere in the swarming interface.
Click the "+" button next to "State" on a bot's page. It shows some properties from the device, ro.build.fingerprint being one of them.
Hmm.. I can't seem to find it.

For example, I would like to know the fingerprint of the device that ran this task:
https://chrome-swarming.appspot.com/task?id=3d8518b8959d6a10&refresh=10&show_raw=1

From there I was able to find this:
https://chrome-swarming.appspot.com/bot?id=build204-b7--device7&show_state=true&sort_stats=total%3Adesc

But I don't see the "ro.build.fingerprint" there.
That's because it's unauthorized.
But the device *was* authorized at the time that the task ran. Right?

What I want is that historical record of what the device was when the test ran.
That's correct. The only thing persisted at the time of the task is the dimensions of the bot. If there are other aspects of the device that your tasks depend on but that aren't adequately described by its dimensions, I suggest adding them.

If you describe the things you're looking for, I can look into adding them to the dimensions.
Cc: eyaich@chromium.org
Well, I want "all of it".

Ideally we would like to keep all sorts of device information (e.g. fingerprint, build description, serial no, temperature, who knows what else ...). It's not that the tasks *depend* on this information, it's more like a set of useful things to know about a device when trying to diagnose or troubleshoot an issue.

I appreciate that bot dimensions, or even the swarming UI, might not be the right place for this information. Other possible options are:

1. On one extreme this could be added as debug output from the test itself (Telemetry in our case).

2. Somehow on the chromium.perf recipe; before running the test, also run some "device_status" script to gather info on the available device and expose it on the build page; e.g. somewhere on: https://ci.chromium.org/buildbot/chromium.perf/Android%20Nexus5X%20Perf/1782

3. Somewhere even more generic, so it applies not only to chromium.perf, but also other waterfalls (internal swarmed bots, pool of bisect bots, etc.)

Option 1 is the easiest for me to implement. But, although the technical details on what is needed for 2-3 are hazy to me, I believe something more general along those lines might be desirable?

+eyaich FYI - as you might be more familiar with what is needed for something like option 2.
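A minimal sketch of what option 1 could look like, assuming the test harness can run `adb shell getprop` against the device and that its output uses the usual `[key]: [value]` line format. The function names and the set of wanted keys are hypothetical, and the sample text reuses the values quoted in the issue description; this is not how Telemetry actually gathers device information.

```python
import re
import subprocess

# One line of `adb shell getprop` output looks like: [ro.build.id]: [MOB30K]
_GETPROP_LINE = re.compile(r"^\[(?P<key>[^\]]+)\]: \[(?P<value>.*)\]$")

# Hypothetical selection of properties worth logging for debugging.
WANTED_PROPS = frozenset([
    "ro.build.description",
    "ro.build.fingerprint",
    "ro.build.id",
    "ro.build.product",
])


def parse_getprop(output, wanted=WANTED_PROPS):
    """Parse getprop text into a dict, keeping only the wanted keys.

    Lines that do not match the single-line format (for example values
    that span several lines) are skipped in this sketch.
    """
    props = {}
    for line in output.splitlines():
        match = _GETPROP_LINE.match(line.strip())
        if match and match.group("key") in wanted:
            props[match.group("key")] = match.group("value")
    return props


def read_device_props(serial):
    """Run getprop on one device; requires adb on PATH (not called below)."""
    output = subprocess.check_output(["adb", "-s", serial, "shell", "getprop"])
    return parse_getprop(output.decode("utf-8", errors="replace"))


SAMPLE_OUTPUT = """\
[ro.build.description]: [aosp_bullhead-userdebug 6.0.1 MOB30K 2787339 test-keys]
[ro.build.fingerprint]: [Android/aosp_bullhead/bullhead:6.0.1/MOB30K/2787339:userdebug/test-keys]
[ro.build.id]: [MOB30K]
[ro.build.product]: [bullhead]
[ro.unrelated.property]: [ignored]
"""

props = parse_getprop(SAMPLE_OUTPUT)
print(props["ro.build.description"])
```

Keeping the parsing separate from the adb call means the same helper could be reused by a more generic script (options 2 and 3) without depending on Telemetry.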
I don't think bot dimensions are the right place for this since this is debugging information and not information you will target when triggering the test.

I think this is best done when the test runs (either just before or just after it); otherwise, as Juan's use case points out, you might lose critical information about the state the bot was in while the test was running.

I think you have two options here: 

a) Have swarming output this information into the summary JSON. This would give you lots of options for display once the swarming task returns.

b) Have telemetry, or more likely the script that runs on the bot that runs telemetry, output this debugging information. A likely place for this would be an artifact in the JSON test results format that can get plumbed through to SOM. Not sure if this is where you prefer it to surface, but this would be very specific to the perf use case.

Otherwise, any solution you implement that queries swarming in the recipe after the tasks return (i.e. when you know what devices they ran on) will be subject to stale info.
On option "a" you mean adding that info to a swarming.summary like this one?
https://logs.chromium.org/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus5X_Perf%2F1782%2F%2B%2Frecipes%2Fsteps%2Fangle_perftests_on_Android%2F0%2Flogs%2Fswarming.summary%2F0

That would be fantastic!

If possible I would strongly prefer that option.
Cc: mar...@chromium.org
Since swarming.summary is already surfaced as a link in the buildbot page this might be sufficient.  

Adding Marc-Antoine to see if he has insight on the swarming side for adding the state of the bot to the summary.
My 2 cents: besides using these for debugging, it would also be great if we could add assertions on them to make tests fail in case of misconfigured devices.

It is also possible that the "assertion" should be achieved by another mechanism, like swarming dimensions or quarantine; I am not sure.
> My 2 cents: besides using these for debugging, it would also be great if we could add assertions on them to make tests fail in case of misconfigured devices.

I think this "assertion" is a separate use case, and that *is* what bot dimensions are for. If there is a value that a test should *expect* to see on a device, then that info should be included in the bot dimensions when scheduling the task.

As I mentioned in #15, this issue is about something different: having a place to dump useful information about the *state* that a device had at the time the task ran.
I see two options:
- Snapshot the bot's state into the task result. It would be the state of the bot at the beginning of the task, not at the end.
- Storing metadata into the task's output in a well known file.

They are not mutually exclusive, both can be done. The bot's dimensions are already snapshotted.
Ref: https://cs.chromium.org/chromium/infra/luci/appengine/swarming/server/task_result.py?l=414
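The second option (metadata in a well-known file in the task's output) could be sketched roughly as follows. The file name device_status.json, the helper names, and the way the task learns its output directory are all assumptions made for illustration; nothing here is an existing swarming convention.

```python
import json
import os
import tempfile

# Hypothetical well-known file name; the thread does not specify one.
DEVICE_STATUS_FILENAME = "device_status.json"


def write_device_status(output_dir, devices):
    """Write a list of per-device status dicts into the task output dir."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, DEVICE_STATUS_FILENAME)
    with open(path, "w") as f:
        json.dump(devices, f, indent=2, sort_keys=True)
    return path


def read_device_status(output_dir):
    """Read the status back, e.g. from a recipe after the task returns."""
    with open(os.path.join(output_dir, DEVICE_STATUS_FILENAME)) as f:
        return json.load(f)


# Example values taken from the "device_status" excerpt in the description.
status = [{
    "adb_status": "device",
    "blacklisted": False,
    "ro.build.id": "MOB30K",
    "ro.build.product": "bullhead",
    "serial": "01fc0cb218c2a26c",
    "usb_status": True,
}]

# A temporary directory stands in for the task's real output directory.
with tempfile.TemporaryDirectory() as out_dir:
    path = write_device_status(out_dir, status)
    round_tripped = read_device_status(out_dir)

print(os.path.basename(path))
```

Because the file is written by the task itself, it records the device state at the time the task ran, which avoids the stale-info problem of querying the bot page afterwards.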
I would be very happy with either of those.
Summary: Swarming: store "device_status" in task output (was: "device_status" equivalent after swarming?)
Filed the snapshot idea to issue 850560. Let's focus this issue on the second part, which is storing metadata in the task output.
Components: Test>Telemetry
Components: -Speed>Telemetry