Moblab has to wait for ts_mon to flush even though it never users it |
||||||||||
Issue descriptionWe see these messages in logs with a number of seconds wait associated with the waiting INFO:root:Waiting for ts_mon flushing process to finish... ts_mon is not used on moblab and we should not wait for any time for it to be flushed.
,
Jun 20 2017
What process is emitting these logs? Could we pass a flag to the process to tell it that it should skip sending metrics?
,
Jun 20 2017
I believe several processes emit it - but this specific example was gs_offloader
,
Jun 20 2017
Is there an easy way to tell what environment we're in (moblab or not) from Python? A config file or environmental variable? I don't like the idea of adding a flag to every process which happens to use ts_mon.
,
Jun 20 2017
Can we use the same logic as the startup seems to use - no config file present ? I see this error all the time also: https://cs.corp.google.com/chromeos_public/chromite/third_party/infra_libs/ts_mon/config.py?rcl=568068e60abdcd2036842a0866e53127248f2b6e&l=258
,
Jun 21 2017
Hm, so there are two things to improve here: 1. The ts_mon subprocess (from chromite/lib/ts_mon_config.py) hangs for ~30s on exit, even though ts_mon wasn't set up properly. When the subprocess sets up ts_mon (here https://cs.corp.google.com/chromeos_public/chromite/lib/ts_mon_config.py?q=ts_mon_config.py&dr&l=254), it should short-circuit and quit if ts mon wasn't set up properly. 2. There are spurious WARNINGs when the config file is missing. Do we actually want to remove this? It might still be useful in other contexts where we actually expect ts mon to be configured. For 2, maybe we could put a dummy empty ts-mon config file for Moblab, which signifies we intentionally don't want any metrics, and we shouldn't warn about it. This way, we can keep WARNING in other contexts when ts-mon should be set up, but Moblab logs are kept clean of the WARNINGs.
,
Jun 21 2017
,
Jun 21 2017
Not too worried about #2 we get lots of spurious logging in moblab so we are somewhat used to the noise. Someday it can be cleaned up. #1 us an issue - I would like to make sure moblab never waits around for 30 seconds for something that can never be there.
,
Jun 21 2017
,
Jun 26 2017
,
Jun 28 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/ce8aa196899f884989e3bd0d08358f3dd7916bd3 commit ce8aa196899f884989e3bd0d08358f3dd7916bd3 Author: Paul Hobbs <phobbs@google.com> Date: Wed Jun 28 00:56:15 2017 ts_mon_config: Quit if ts-mon is not setup This change will prevent us from wasting ~30s waiting for ts-mon to flush in cases where we have no metrics set up, like developer desktops or moblab. BUG= chromium:734657 TEST=Added a unit test, which passes Change-Id: Ib7001916fe69db9eef5e9fdd1c5ba42b8ffafc58 Reviewed-on: https://chromium-review.googlesource.com/545100 Commit-Ready: Paul Hobbs <phobbs@google.com> Tested-by: Paul Hobbs <phobbs@google.com> Reviewed-by: Paul Hobbs <phobbs@google.com> Reviewed-by: Keith Haddow <haddowk@chromium.org> Reviewed-by: Dan Shi <dshi@google.com> [modify] https://crrev.com/ce8aa196899f884989e3bd0d08358f3dd7916bd3/lib/ts_mon_config.py [modify] https://crrev.com/ce8aa196899f884989e3bd0d08358f3dd7916bd3/lib/ts_mon_config_unittest.py
,
Jun 28 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/e28ef52459aeefe61b92cad937c5abbdf8e79b5c commit e28ef52459aeefe61b92cad937c5abbdf8e79b5c Author: Paul Hobbs <phobbs@google.com> Date: Wed Jun 28 22:29:46 2017 ts_mon_config: Quit if ts-mon is not setup This change will prevent us from wasting ~30s waiting for ts-mon to flush in cases where we have no metrics set up, like developer desktops or moblab. BUG= chromium:734657 TEST=Added a unit test, which passes Change-Id: Ib7001916fe69db9eef5e9fdd1c5ba42b8ffafc58 Reviewed-on: https://chromium-review.googlesource.com/545100 Commit-Ready: Paul Hobbs <phobbs@google.com> Tested-by: Paul Hobbs <phobbs@google.com> Reviewed-by: Paul Hobbs <phobbs@google.com> Reviewed-by: Keith Haddow <haddowk@chromium.org> Reviewed-by: Dan Shi <dshi@google.com> (cherry picked from commit ce8aa196899f884989e3bd0d08358f3dd7916bd3) Reviewed-on: https://chromium-review.googlesource.com/551107 Commit-Queue: Keith Haddow <haddowk@chromium.org> Tested-by: Keith Haddow <haddowk@chromium.org> Trybot-Ready: Keith Haddow <haddowk@chromium.org> [modify] https://crrev.com/e28ef52459aeefe61b92cad937c5abbdf8e79b5c/lib/ts_mon_config.py [modify] https://crrev.com/e28ef52459aeefe61b92cad937c5abbdf8e79b5c/lib/ts_mon_config_unittest.py
,
Jun 28 2017
,
Jun 29 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/73a00deaea807968deee0b903c291a8312000a87 commit 73a00deaea807968deee0b903c291a8312000a87 Author: Paul Hobbs <phobbs@google.com> Date: Thu Jun 29 21:16:36 2017 ts_mon_config: Refactor _ConsumeMessages Refactor _ConsumeMessages loop into a class. It's more convenient to keep the state in an object. (The state is the reset_after_flush list, whether we have metrics pending, and when our last flush was) This also improves the flush loop's behavior to not wait *indefinitely* for a new metric - instead, it will wait up to FLUSH_INTERVAL. This forces a flush of existing metrics every FLUSH_INTERVAL, which fixes the behavior of metrics which depend on repeated flushes, such as the chromite.lib.metrics.Presence context manager. BUG= chromium:735698 , chromium:734657 TEST=Unit tests, which pass. Change-Id: If96df3e4d7d0d9673efb5377f045c8e45b848695 Reviewed-on: https://chromium-review.googlesource.com/545101 Commit-Ready: Paul Hobbs <phobbs@google.com> Tested-by: Paul Hobbs <phobbs@google.com> Reviewed-by: Paul Hobbs <phobbs@google.com> [modify] https://crrev.com/73a00deaea807968deee0b903c291a8312000a87/lib/ts_mon_config.py [modify] https://crrev.com/73a00deaea807968deee0b903c291a8312000a87/lib/ts_mon_config_unittest.py
,
Jan 22 2018
|
||||||||||
►
Sign in to add a comment |
||||||||||
Comment 1 by dshi@chromium.org
, Jun 19 2017