
Issue 655782


Issue metadata

Status: Archived
Owner:
Closed: Nov 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




devserver field for special_task_count / special_task_duration metrics (in particular, Provision)

Project Member Reported by akes...@chromium.org, Oct 13 2016

Issue description

Per meeting earlier today, it would make it easier to understand how per-devserver load affects provision time if the provision metrics were labelled with devserver.

This is also rather related to crbug.com/650481 and may as well be implemented at the same time.
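
As a rough illustration of the labelling requested above, the sketch below counts provision outcomes keyed by a devserver field. The `FieldCounter` class, the metric name, and the field names are all hypothetical stand-ins, not the autotest or monitoring library API; the only point is that carrying devserver as a field lets per-devserver failure rates be read off directly.

```python
import collections


class FieldCounter(object):
    """Hypothetical in-memory stand-in for a field-labelled counter metric."""

    def __init__(self, name):
        self.name = name
        self._counts = collections.Counter()

    def increment(self, fields):
        # Sort the items so the same fields always map to the same key.
        self._counts[tuple(sorted(fields.items()))] += 1

    def get(self, fields):
        return self._counts[tuple(sorted(fields.items()))]


# Assumed metric name, for illustration only.
provision_count = FieldCounter('provision_count')


def record_provision(devserver, success):
    """Record one provision attempt, labelled with the devserver used."""
    provision_count.increment({'devserver': devserver, 'success': success})


def failure_rate(devserver):
    """Fraction of recorded provisions on `devserver` that failed."""
    ok = provision_count.get({'devserver': devserver, 'success': True})
    bad = provision_count.get({'devserver': devserver, 'success': False})
    total = ok + bad
    return float(bad) / total if total else 0.0
```

With a field like this in place, comparing `failure_rate()` across devservers would show whether failures cluster on particular hosts.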
 
Cc: jrbarnette@chromium.org
Does the relevant code know which devserver was used?
Cc: keta...@chromium.org
Devservers are resolved in autoserv.  Getting that back to the scheduler is painful, in the same way that getting provision error information back is painful.
Maybe we can try to think of other ways.

The final goal is figuring out if our scheduling is overcommitting devservers.  In theory this should not happen, because of the "health check" done on the devserver before deciding to have it provision each DUT.  In practice, these problems may occur:

1. the health check may be inaccurate
2. many health checks may happen close to each other, when the system is still lightly loaded, leading to overcommits.
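
The second failure mode can be sketched with a toy model. Everything here is an assumption made for illustration (the `Devserver` class, the capacity-based health check, and the `start_delay` parameter); it is not how the real health check works. It only shows that when admitted provisions take a while to register as load, a burst of checks can all pass.

```python
class Devserver(object):
    """Toy devserver: `load` counts provisions that have actually started."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.load = 0

    def is_healthy(self):
        # The check only sees provisions that have already started.
        return self.load < self.capacity


def schedule(devserver, num_duts, start_delay):
    """Run one health check per DUT and return how many were admitted.

    `start_delay` is the number of further checks that happen before an
    admitted provision begins to count as load on the devserver.
    """
    pending = []  # countdowns for admitted-but-not-yet-started provisions
    admitted = 0
    for _ in range(num_duts):
        # Provisions whose delay has elapsed start loading the devserver now.
        devserver.load += pending.count(0)
        pending = [p - 1 for p in pending if p > 0]
        if devserver.is_healthy():
            admitted += 1
            pending.append(start_delay)
    return admitted
```

In this model, `schedule(Devserver(2), 5, 0)` admits only 2 DUTs, while `schedule(Devserver(2), 5, 10)` admits all 5 because every check runs before any load is visible.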

Maybe this information can be reliably collected on the devservers?  Maybe we already have it in the devserver logs?

Maybe have the drones emit a provision metric?  It will overlap with the current special_task_count (or maybe it won't and thus reveal a new issue).

Comment 6 by autumn@chromium.org, Oct 17 2016

Labels: -current-issue

Comment 7 by bugdroid1@chromium.org, Oct 27 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/f93754802bb4de87aee96d67ba3045ef4d223cce

commit f93754802bb4de87aee96d67ba3045ef4d223cce
Author: Aviv Keshet <akeshet@chromium.org>
Date: Wed Oct 26 09:09:03 2016

[autotest] add metrics to provision-by-devserver code

We already collect some information about provision jobs, in the form of
special task metrics. However, special tasks are a scheduler concept,
and a fair amount of useful information about provision jobs is
invisible to them (in particular the devserver used, and details about
failures that occurred).

Since provision is so uniquely important in infra, it makes sense to have more
instrumentation internal to the provision job. This CL starts adding
some.

BUG= chromium:655782 
TEST=None

Change-Id: Icc4eb6eea2600adb61490b34b63c48f3d2718ca7
Reviewed-on: https://chromium-review.googlesource.com/403549
Reviewed-by: Aviv Keshet <akeshet@chromium.org>
Tested-by: Aviv Keshet <akeshet@chromium.org>
Commit-Queue: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/f93754802bb4de87aee96d67ba3045ef4d223cce/server/hosts/cros_host.py
[modify] https://crrev.com/f93754802bb4de87aee96d67ba3045ef4d223cce/client/common_lib/cros/dev_server.py

Status: Fixed (was: Assigned)

Comment 9 by bugdroid1@chromium.org, Nov 10 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/16b8df37317c901e03bba2ced92ff0f91cfa3315

commit 16b8df37317c901e03bba2ced92ff0f91cfa3315
Author: Shelley Chen <shchen@chromium.org>
Date: Thu Oct 27 23:24:21 2016

[autotest] Add metric collection during autoupdate

Adding in collection of metrics when autoupdate
command completes.  Currently only for devserver
and success/failure.

BUG= chromium:655782 
BRANCH=None
TEST=None

Change-Id: If1270a7d5c9eec2ee05d81bce900a9c47c23ab40
Signed-off-by: Shelley Chen <shchen@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/404554
Reviewed-by: Aseda Aboagye <aaboagye@chromium.org>

[modify] https://crrev.com/16b8df37317c901e03bba2ced92ff0f91cfa3315/client/common_lib/cros/autoupdater.py


Comment 10 by bugdroid1@chromium.org, Nov 16 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/61d289840e3482df2ef0cf74832a6d707def2ac2

commit 61d289840e3482df2ef0cf74832a6d707def2ac2
Author: Shelley Chen <shchen@chromium.org>
Date: Fri Oct 28 16:40:20 2016

[autotest] adding in board,build_type,milestone in autoupdater

BUG= chromium:655782 
BRANCH=None
TEST=None

Change-Id: I40828ff3a43befd7863a09f8171fb282080e68ea
Signed-off-by: Shelley Chen <shchen@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/404868
Reviewed-by: Aseda Aboagye <aaboagye@chromium.org>

[modify] https://crrev.com/61d289840e3482df2ef0cf74832a6d707def2ac2/client/common_lib/cros/autoupdater.py


Comment 11 by bugdroid1@chromium.org, Nov 20 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/a223ce029cd3b9cfdb29b9d2709eda5618b05c9b

commit a223ce029cd3b9cfdb29b9d2709eda5618b05c9b
Author: xixuan <xixuan@chromium.org>
Date: Thu Nov 17 00:43:01 2016

autotest: add raised_error into metrics for CrOS auto-update by devserver

This CL parses the raised_error for every failed round of auto-update by
devserver, and adds it to provision metrics.

BUG= chromium:655782 
TEST=Run auto_update locally with local devserver to test function
|_parse_raised_error_for_auto_update|.

Change-Id: Ifb42154056d22f1959d0b083d4edd1beda00ef2f
Reviewed-on: https://chromium-review.googlesource.com/412125
Commit-Ready: Aviv Keshet <akeshet@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/a223ce029cd3b9cfdb29b9d2709eda5618b05c9b/client/common_lib/cros/dev_server.py
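
For readers unfamiliar with the approach described in this commit, a minimal sketch of mapping a raised error message to a short label follows. The patterns, labels, and the `classify_raised_error` name are hypothetical; the actual pattern list in dev_server.py is not reproduced in this thread. Mapping free-form messages to a small fixed set of labels keeps the number of distinct metric field values bounded.

```python
import re

# Hypothetical patterns, for illustration only. Order matters: the first
# matching pattern wins.
_ERROR_PATTERNS = [
    (re.compile(r'timed? ?out', re.IGNORECASE), 'timeout'),
    (re.compile(r'ssh|connection (refused|reset)', re.IGNORECASE),
     'connection'),
    (re.compile(r'no space left', re.IGNORECASE), 'disk_full'),
]


def classify_raised_error(error_message):
    """Return a short label for `error_message`, or 'unknown' if none match."""
    for pattern, label in _ERROR_PATTERNS:
        if pattern.search(error_message):
            return label
    return 'unknown'
```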


Comment 12 by bugdroid1@chromium.org, Dec 7 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/6a09e1cf031a4772e72cc8214e9f749e4eecf4fe

commit 6a09e1cf031a4772e72cc8214e9f749e4eecf4fe
Author: xixuan <xixuan@chromium.org>
Date: Mon Dec 05 21:18:10 2016

autotest: Add two more auto-update error patterns.

This CL adds 2 more CrOS auto-update error patterns based on the parsed error
information in the tko database.

BUG= chromium:655782 
TEST=Run auto_update locally with local devserver to test new patterns.

Change-Id: Ibce2af29659dfb09566a8d92c37da209ba212564
Reviewed-on: https://chromium-review.googlesource.com/416501
Commit-Ready: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/6a09e1cf031a4772e72cc8214e9f749e4eecf4fe/client/common_lib/cros/dev_server.py

Comment 13 by dchan@google.com, Jan 21 2017

Labels: VerifyIn-57

Comment 14 by dchan@google.com, Mar 4 2017

Labels: VerifyIn-58

Comment 15 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 16 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61

Comment 18 by dchan@chromium.org, Oct 14 2017

Status: Archived (was: Fixed)
