Really alert on shard_client tick failures
Issue description

See issue 793532, where every shard's shard_client died at noon and we only noticed by accident at ~5:30.

The reason is that we don't allow the shard_client tick to fail in most conditions:
https://cs.corp.google.com/chromeos_public/src/third_party/autotest/files/scheduler/shard/shard_client.py?l=334

And we then don't report success/failure with the tick:
https://cs.corp.google.com/chromeos_public/src/third_party/autotest/files/scheduler/shard/shard_client.py?l=367

So the tick doesn't reflect shard_client health.

Ask:
[1] Report success/failure with the shard_client tick.
[2] Add / verify alerting on shard_client tick failure.
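The two asks could be sketched roughly as follows. This is an illustration only, not the code in shard_client.py: `TickReporter`, `run_tick`, and `is_stale` are hypothetical names, and the in-memory counters stand in for whatever metrics library the scheduler actually reports through.

```python
import time


class TickReporter(object):
    """Hypothetical helper that records the outcome of every tick."""

    def __init__(self, clock=time.time):
        # clock is injectable so the staleness check can be tested.
        self._clock = clock
        self.counts = {'success': 0, 'failure': 0}
        self.last_success = None

    def run_tick(self, tick_fn):
        """Run one tick; record and return whether it succeeded."""
        try:
            tick_fn()
        except Exception:
            # Ask [1]: a failed tick is counted as a failure instead of
            # being indistinguishable from a healthy one.
            self.counts['failure'] += 1
            return False
        self.counts['success'] += 1
        self.last_success = self._clock()
        return True

    def is_stale(self, max_age_seconds):
        """Ask [2]: True if no tick succeeded within max_age_seconds."""
        if self.last_success is None:
            return True
        return self._clock() - self.last_success > max_age_seconds
```

An alert keyed on `is_stale` (or on the absence of new successes) would fire when the client keeps running but every tick fails, and also when the process has died and no ticks are reported at all.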
Dec 10 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/04d40db2543cbcc343f22771865d602d1e0ed19b

commit 04d40db2543cbcc343f22771865d602d1e0ed19b
Author: Aviv Keshet <akeshet@chromium.org>
Date: Sun Dec 10 00:03:29 2017

autotest: shard_client: only count successful ticks

BUG= chromium:793538
TEST=None

Change-Id: I6757355b3b32f0a1cae3677514ff26c1b7915f6f
Reviewed-on: https://chromium-review.googlesource.com/818566
Commit-Ready: Aviv Keshet <akeshet@chromium.org>
Tested-by: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/04d40db2543cbcc343f22771865d602d1e0ed19b/scheduler/shard/shard_client.py
Dec 11 2017
Pending push to prod
Dec 11 2017
Dec 18 2017
Comment 1 by akes...@chromium.org, Dec 9 2017
Status: Started (was: Untriaged)