MonitoringDecreasingValueError: Monotonically increasing metric "chromeos/cbuildbot/git/command_durations" was given value "-0.079006", which is not greater than or equal to "None".
Issue description

Getting a bunch of these errors, causing cq completion to fail: https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/18587

Traceback (most recent call last):
  File "/b/c/cbuild/repository/chromite/bin/cros_mark_as_stable", line 169, in <module>
    DoMain()
  File "/b/c/cbuild/repository/chromite/bin/cros_mark_as_stable", line 165, in DoMain
    commandline.ScriptWrapperMain(FindTarget)
  File "/b/c/cbuild/repository/chromite/lib/commandline.py", line 911, in ScriptWrapperMain
    ret = target(argv[1:])
  File "/b/c/cbuild/repository/chromite/scripts/cros_mark_as_stable.py", line 281, in main
    git_project_overlays, manifest, package_list)
  File "/b/c/cbuild/repository/chromite/scripts/cros_mark_as_stable.py", line 342, in _WorkOnCommit
    parallel.RunTasksInProcessPool(_CommitOverlays, inputs)
  File "/b/c/cbuild/repository/chromite/lib/parallel.py", line 810, in RunTasksInProcessPool
    queue.put((idx, input_args))
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/b/c/cbuild/repository/chromite/lib/parallel.py", line 751, in BackgroundTaskRunner
    queue.put(_AllTasksComplete())
  File "/b/c/cbuild/repository/chromite/lib/parallel.py", line 751, in BackgroundTaskRunner
    queue.put(_AllTasksComplete())
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/b/c/cbuild/repository/chromite/lib/parallel.py", line 562, in ParallelTasks
    raise BackgroundFailure(exc_infos=errors)
chromite.lib.parallel.BackgroundFailure: <class 'infra_libs.ts_mon.common.errors.MonitoringDecreasingValueError'>: Monotonically increasing metric "chromeos/cbuildbot/git/command_durations" was given value "-0.079006", which is not greater than or equal to "None".
May 14 2018
Direct link to the log: https://logs.chromium.org/v/?s=chromeos%2Fbb%2Fchromeos%2Fnyan-full-compile-paladin%2F12614%2F%2B%2Frecipes%2Fsteps%2FUprev%2F0%2Fstdout shapiroc, can you take a look? This affected one builder.
May 14 2018
The root of the issue is that 1) SecondsTimer is a distribution, and therefore a cumulative metric, and 2) the timings are literally saying that you gained time by running the git commands. =) Realistically, this is due to the relatively high cardinality of metrics caused by the lack of identifying fields at command time. The proper solution, for now, is probably to take out the timings for git commands and find a way to inject identifying fields at the timing level. -- Mike
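To illustrate the first point, here is a minimal sketch of how a cumulative distribution metric might reject a negative sample. This is not the infra_libs.ts_mon implementation: `ToyCumulativeDistribution`, `DecreasingValueError`, and `timed` are hypothetical names. Defaulting to `time.monotonic()` is an assumption about one way to keep measured durations non-negative. The thread does not establish where the negative value came from.

```python
import time
from contextlib import contextmanager


class DecreasingValueError(Exception):
    """Hypothetical stand-in for a 'value went backwards' monitoring error."""


class ToyCumulativeDistribution:
    """Toy cumulative metric: its sample count and sum may only grow."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.total = 0.0

    def add(self, value):
        # A negative sample would shrink the running sum, which a
        # cumulative metric cannot represent, so it is rejected.
        if value < 0:
            raise DecreasingValueError(
                'metric "%s" was given negative value "%f"' % (self.name, value))
        self.count += 1
        self.total += value


@contextmanager
def timed(metric, clock=time.monotonic):
    """Record the elapsed time of the with-block into `metric`."""
    start = clock()
    try:
        yield
    finally:
        # If `clock` ever steps backwards, this difference is negative
        # and the metric raises while the context manager is exiting.
        metric.add(clock() - start)
```

With the default monotonic clock, `with timed(metric): ...` records a non-negative duration. Passing a clock that steps backwards makes the error surface from the context manager's exit, which is the same place the traceback above shows it.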
May 14 2018
https://chromium-review.googlesource.com/c/chromiumos/chromite/+/1058207 is out to roll back the timing changes. -- Mike
May 21 2018
The git timing changes have been rolled back and this error no longer occurs. I will take a second pass at implementing timings for git commands with better cardinality. As an aside, durations are not ideal for pinpointing issues with specific builds/targets. -- Mike
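As a rough sketch of what timings with bounded cardinality could look like, the example below reduces each git command line to a single field drawn from a small fixed set before recording its duration. Everything here is hypothetical and not taken from chromite or ts_mon: the `KNOWN_SUBCOMMANDS` allow-list, the `subcommand_field` and `timed_git` helpers, and the in-memory `durations` store are illustrative assumptions only.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical allow-list. Anything else is bucketed as "other" so the
# number of distinct field values stays bounded.
KNOWN_SUBCOMMANDS = frozenset(["clone", "fetch", "push", "commit", "rebase"])

# Maps a field value to the list of durations (in seconds) recorded for it.
durations = defaultdict(list)


def subcommand_field(argv):
    """Reduce an argv such as ['git', 'fetch', 'origin'] to one bounded field."""
    sub = argv[1] if len(argv) > 1 else ""
    return sub if sub in KNOWN_SUBCOMMANDS else "other"


@contextmanager
def timed_git(argv, clock=time.monotonic):
    """Time the with-block and file the duration under the command's field."""
    start = clock()
    try:
        yield
    finally:
        durations[subcommand_field(argv)].append(clock() - start)
```

For example, `with timed_git(["git", "push", "origin", "HEAD"]): ...` files its duration under `"push"`, while an unrecognized subcommand lands under `"other"`.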
Comment 1 by ayatane@chromium.org, May 14 2018