
Issue 608427

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
gone, assign your bugs elsewhere :)
Closed: Jul 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




Keep track of expected swarming android devices

Project Member Reported by bpastene@chromium.org, May 2 2016

Issue description

Currently, when a device drops offline, the swarming host loses all track of it and can no longer report it as being 'unavailable.'

We should either stream stats from these devices into monarch, or do what the old buildbot setup did and create a file on the host itself to keep track of the expected phones.
 

Comment 1 by stip@chromium.org, May 2 2016

I'd prefer a centralized solution instead of writing local device files.

Can you elaborate more on your monarch idea? I guess one solution would be to just have a 'check-in' ping whenever we see the device, and do a monarch query for devices that haven't checked in for a while. We would have to report the host the device is on as well. This has the disadvantage that decommissioned devices would keep showing up unless we had some way to remove them from the data.
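To make the check-in idea concrete, here is a minimal sketch of the logic only. It does not use monarch or ts_mon; `CheckinTracker` and its methods are hypothetical names, and the in-memory dict merely stands in for whatever the monitoring backend would store. It also shows why decommissioned devices need an explicit removal step.

```python
import time


class CheckinTracker:
    """Hypothetical in-memory stand-in for the metrics store."""

    def __init__(self):
        # Maps device serial -> (host name, last check-in time in seconds).
        self._last_seen = {}

    def check_in(self, serial, host, now=None):
        """Records that `host` currently sees the device `serial`."""
        self._last_seen[serial] = (host, time.time() if now is None else now)

    def decommission(self, serial):
        """Drops a retired device so it stops showing up as missing."""
        self._last_seen.pop(serial, None)

    def stale_devices(self, max_age_secs, now=None):
        """Returns sorted (serial, host) pairs with no recent check-in."""
        now = time.time() if now is None else now
        return sorted(
            (serial, host)
            for serial, (host, seen) in self._last_seen.items()
            if now - seen > max_age_secs
        )
```

With a threshold of, say, ten minutes, any device whose last ping is older than that would be listed along with the host it was last seen on.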

Ultimately it would be awesome to have a machine database to handle this kind of stuff, but that seems a bit far off at the moment.
Regarding the monarch solution: that's correct. Any device that's stopped reporting its status would be assumed offline.

I agree a centralized database of this information would be best, but that's still a ways off from being usable and we're running into issues with missing devices today. Ditto with the ts_mon stuff.

I vote for adding a local file to the host. It'll be a quick and small hack that can easily be undone when one of these other solutions becomes available. 
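A rough sketch of what the local-file hack could look like, under assumptions: the JSON format, the file path, and the function names below are made up for illustration and are not what the old buildbot setup actually did. The host remembers every serial it has ever seen and reports the ones that are no longer connected.

```python
import json
import os


def load_expected(path):
    """Reads the set of expected serials; a missing file means none known yet."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(json.load(f))


def update_and_find_missing(path, connected):
    """Remembers every serial seen so far and returns those no longer connected."""
    connected = set(connected)
    expected = load_expected(path) | connected
    with open(path, "w") as f:
        # Stored as a sorted list so the file is stable and human-readable.
        json.dump(sorted(expected), f)
    return sorted(expected - connected)
```

Removing a decommissioned phone would then just mean deleting its serial from the file on that host.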
It's important to distinguish between genuinely disconnected devices and dead devices (e.g. powered off).

What I'd like to know is what lsusb reports for unpowered / hung devices, so that ghost devices (e.g. down but still connected via USB) could still be reported appropriately.

Then the problem is trivial to fix as the swarming bot can report itself as Android properly.
IIRC lsusb reports nothing for unpowered / hung devices. (We see a separate class of issues in which lsusb reports the device but adb doesn't.)
Oh, that's sad to hear. :(
I was specifically thinking about the "lsusb reports the device but adb doesn't" case, which could be surfaced in the swarming bot code. Thinking about it more, though, python-adb probably uses the same underlying mechanism as lsusb, so it (probably) already gives the same output for unavailable devices.
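For reference, the three cases discussed above could be told apart roughly like this. This is only a sketch: it assumes the serials visible over USB and over adb have already been collected elsewhere, and `classify_devices` and the state names are hypothetical, not part of the swarming bot.

```python
def classify_devices(expected, usb_serials, adb_serials):
    """Labels each known serial by where it is currently visible."""
    usb, adb = set(usb_serials), set(adb_serials)
    result = {}
    for serial in sorted(set(expected) | usb | adb):
        if serial in adb:
            result[serial] = "available"  # adb can talk to it
        elif serial in usb:
            result[serial] = "usb_only"   # enumerated on USB but not in adb
        else:
            result[serial] = "missing"    # unpowered, hung, or unplugged
    return result
```

Note that the "missing" state still depends on having a list of expected devices from somewhere, which is the original problem in this bug.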

Comment 6 by stip@chromium.org, May 3 2016

Cc: bpastene@chromium.org
Owner: ----
So I now realize that in comment 1 bpastene@ was referring to the existing monitoring / ticket filing pipeline built in https://bugs.chromium.org/p/chromium/issues/detail?id=519884. The advantage there is that the query-monarch-and-file-a-ticket logic is already implemented in https://cs.corp.google.com/piper///depot/google3/googleclient/chrome/infra/device_ticket_filer/device_ticket_filer.py.

I'll take ownership of this bug for now.

Comment 7 by stip@chromium.org, May 3 2016

Cc: -stip@chromium.org
Owner: stip@chromium.org

Comment 8 by bugdroid1@chromium.org, May 12 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/infra_internal.git/+/9b3180106d2a000dc2bdb1da173de965ea6437fb

commit 9b3180106d2a000dc2bdb1da173de965ea6437fb
Author: stip <stip@google.com>
Date: Thu May 12 21:23:54 2016


Comment 10 by bugdroid1@chromium.org, May 20 2016

Comment 11 by stip@chromium.org, May 23 2016

We're now getting data from the swarming devices. The only thing left to do is get the monarch queries in place to locate devices that have dropped off.

Comment 12 by bugdroid1@chromium.org, Jun 1 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/puppet/+/07a8e221f3b07bbbd93c97952ea3e06ce21e0dd4

commit 07a8e221f3b07bbbd93c97952ea3e06ce21e0dd4
Author: Mike Stipicevic <stip@chromium.org>
Date: Fri May 20 23:39:25 2016


Comment 13 by bugdroid1@chromium.org, Jun 1 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/puppet/+/e53317158e1a8c50ddedee33f2d2025544f26794

commit e53317158e1a8c50ddedee33f2d2025544f26794
Author: Mike Stipicevic <stip@chromium.org>
Date: Sat May 21 00:06:57 2016

Comment 15 by stip@chromium.org, Jul 21 2016

Status: Fixed (was: Assigned)
