Issue 835941

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocking:
issue 804625



Alert on inventory run failures.

Reported by jrbarnette@chromium.org, Apr 23 2018

Issue description

We have new metrics and a new dashboard that depend on the inventory
runs completing more-or-less reliably once every 8 hours:
    https://viceroy.corp.google.com/chromeos/untestable?duration=8d

If a run fails to complete, the data for that dashboard will show
empty tables.  We need an alert that will fire when an inventory run
fails to produce metrics.
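
A minimal sketch of the kind of staleness check such an alert could encode, assuming the alerting system can see the timestamp of the last run that reported metrics. The function name, the constants, and the one-hour grace period are hypothetical; only the 8-hour cadence comes from the description above.

```python
import time

# Runs are expected roughly every 8 hours (from the issue description).
EXPECTED_INTERVAL_SECS = 8 * 60 * 60
# Hypothetical slack before alerting, since runs are only "more-or-less" on time.
GRACE_SECS = 60 * 60


def inventory_run_is_stale(last_success_ts, now=None):
    """Return True if no run has reported metrics within the allowed window.

    last_success_ts: Unix timestamp of the last run that produced metrics.
    now: Unix timestamp to compare against; defaults to the current time.
    """
    if now is None:
        now = time.time()
    return (now - last_success_ts) > (EXPECTED_INTERVAL_SECS + GRACE_SECS)
```

An alert built this way fires on missing data rather than on an explicit failure signal, so it also covers runs that never start.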

 
Blocking: 804625
Components: -Infra>Client>ChromeOS Infra>Client>ChromeOS>Test
Owner: jrbarnette@chromium.org
Owner: ----
Possibly mine, but the assignment needs ratification.

Owner: jrbarnette@chromium.org
Status: Assigned (was: Untriaged)
Sounds directly related to your other work.
Labels: -Chase-Pending Chase
Owner: pprabhu@chromium.org
Cc: jrbarnette@chromium.org
Status: Started (was: Assigned)
Add service liveness metrics to lab_inventory (CL stack): https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/1048608
Project Member

Comment 7 by bugdroid1@chromium.org, May 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/51ad14e50d98a2b5d608cb10b8e1a6a38ac05ea8

commit 51ad14e50d98a2b5d608cb10b8e1a6a38ac05ea8
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Wed May 09 21:39:56 2018

lab_inventory: Let exceptions escape main()

Exceptions from the lab_inventory script should be allowed to escape
main so that callers can correctly handle the failure case.

BUG= chromium:835941 
TEST=None

Change-Id: Ia75cda1c032cca31a5827cf56aeff2e564513515
Reviewed-on: https://chromium-review.googlesource.com/1048605
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Richard Barnette <jrbarnette@google.com>

[modify] https://crrev.com/51ad14e50d98a2b5d608cb10b8e1a6a38ac05ea8/site_utils/lab_inventory.py
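
The pattern this commit describes can be sketched as follows. This is an illustration only, not the actual lab_inventory.py code: `main`, `run_and_report`, and the `--fail` flag are hypothetical names used to show a caller handling a failure that main() does not swallow.

```python
def main(argv):
    """Hypothetical entry point that does not catch its own errors."""
    if "--fail" in argv:
        # Stand-in for any error raised while taking inventory.
        raise RuntimeError("inventory failed")
    return 0


def run_and_report(argv):
    """Hypothetical caller that decides how to handle a failed run."""
    try:
        main(argv)
    except Exception as e:
        # Because main() lets the exception escape, the caller can see it.
        return "failure: %s" % e
    return "success"
```

If main() caught and logged the exception itself, `run_and_report` would have no way to tell a failed run from a successful one.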

Comment 8 by bugdroid1@chromium.org, May 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/6b48edefd74f028d5ffba82007cd4aaa743af476

commit 6b48edefd74f028d5ffba82007cd4aaa743af476
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Wed May 09 21:39:57 2018

lab_inventory: Let --debug imply --debug-metrics

It is natural to not actually report metrics when lab_inventory is run
with --debug. This reduces some complexity in the script that was only
needed to support the weird use case where someone wants to run with
--debug, but still report metrics.

BUG= chromium:835941 
TEST=None

Change-Id: Ieb3752d917737627173faee99d9b4087de68ea59
Reviewed-on: https://chromium-review.googlesource.com/1048606
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Richard Barnette <jrbarnette@google.com>

[modify] https://crrev.com/6b48edefd74f028d5ffba82007cd4aaa743af476/site_utils/lab_inventory.py
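
A rough sketch of how one flag can imply another with argparse, taking --debug-metrics to mean that metrics are not actually reported, as the commit message suggests. The `parse_args` helper and the flag handling are assumptions for illustration; the real script's parser is not shown in this thread.

```python
import argparse


def parse_args(argv):
    """Parse the two flags and make --debug imply --debug-metrics."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--debug", action="store_true")
    parser.add_argument("--debug-metrics", action="store_true")
    args = parser.parse_args(argv)
    # Normalize once here so the rest of the script only checks debug_metrics.
    if args.debug:
        args.debug_metrics = True
    return args
```

Normalizing at parse time removes the need to handle the "debug but still report metrics" combination anywhere else.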

Comment 9 by bugdroid1@chromium.org, May 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/58728f40fe2a952416efb40012ccafb87c755742

commit 58728f40fe2a952416efb40012ccafb87c755742
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Wed May 09 21:39:57 2018

lab_inventory: Flush metrics even in case of errors

This ensures that we will not drop metrics on the floor when exceptions
happen.

BUG= chromium:835941 
TEST=None

Change-Id: Icbcb5e52e48b3eed4e5122906aab3772b844932f
Reviewed-on: https://chromium-review.googlesource.com/1048607
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Richard Barnette <jrbarnette@google.com>

[modify] https://crrev.com/58728f40fe2a952416efb40012ccafb87c755742/site_utils/lab_inventory.py
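
One common way to get this behavior is a try/finally around the work, sketched below. `FakeMetricsBuffer` and `run_inventory` are hypothetical stand-ins, not the metrics library or the code used by lab_inventory.

```python
class FakeMetricsBuffer(object):
    """Hypothetical stand-in for a metrics client that buffers until flush."""

    def __init__(self):
        self.pending = []
        self.flushed = []

    def record(self, name, value):
        self.pending.append((name, value))

    def flush(self):
        self.flushed.extend(self.pending)
        self.pending = []


def run_inventory(metrics, fail=False):
    """Record metrics while working; flush them even if the work raises."""
    try:
        metrics.record("started", 1)
        if fail:
            raise RuntimeError("inventory failed")
        metrics.record("completed", 1)
    finally:
        # Runs on both the success and the error path; the exception,
        # if any, still propagates to the caller afterwards.
        metrics.flush()
```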

Comment 10 by bugdroid1@chromium.org, May 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/b69a6cc8cedebdc2f1eaca6c45c8d71668aae694

commit b69a6cc8cedebdc2f1eaca6c45c8d71668aae694
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Wed May 09 21:39:58 2018

lab_inventory: Report service liveness and duration metrics

BUG= chromium:835941 
TEST=Run with --debug

Change-Id: I9f925584facbe5e55ecb5268b47ddbfbe63bcdc9
Reviewed-on: https://chromium-review.googlesource.com/1048608
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Richard Barnette <jrbarnette@google.com>

[modify] https://crrev.com/b69a6cc8cedebdc2f1eaca6c45c8d71668aae694/site_utils/lab_inventory.py
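
As a sketch of what liveness and duration reporting can look like: emit one "tick" per run plus the elapsed time, each tagged with whether the run succeeded. The `run_with_liveness` helper, the `emit` callback, and the metric names are hypothetical; the actual metric names and client used by lab_inventory are not given in this thread.

```python
import time


def run_with_liveness(task, emit, clock=time.time):
    """Run task(), then emit a tick and a duration tagged with success.

    emit: callable taking (metric_name, value, fields).
    clock: callable returning the current time in seconds.
    """
    start = clock()
    success = False
    try:
        result = task()
        success = True
        return result
    finally:
        # Emitted on both success and failure, so an alert can tell a
        # failed run apart from a run that never happened at all.
        emit("tick", 1, {"success": success})
        emit("duration_secs", clock() - start, {"success": success})
```

An alert can then watch for the absence of successful ticks over the expected run interval.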

Dashboard created with the tick metrics: cr/196286013
The alerts are pending final review.
Status: Fixed (was: Started)
Alerts landed in staging. I'll promote to prod once they fire a few times.