mysql_stats.py eats up lots of CPU on servers
Issue description

We have lots of CPU consumption from mysql_stats.py, around 80% of one CPU core per server:

1616 chromeo+ 20 0 170184 11400 4612 R 35.3 0.0 555:20.04 /usr/bin/python /usr/local/autotest/site_utils/stats/mysql_stats.py
3504 chromeo+ 20 0 246220 21628 1756 S 25.0 0.1 417:46.80 /usr/bin/python /usr/local/autotest/site_utils/stats/mysql_stats.py
3508 chromeo+ 20 0 179340 26120 3616 S 19.1 0.1 306:05.36 /usr/bin/python /usr/local/autotest/site_utils/stats/mysql_stats.py

I think there is a regression caused by:

commit e86ccd8c5bccf09a9f28fc713786dd499d0eca3d
Author: Paul Hobbs <phobbs@google.com>
Date: Fri Mar 24 19:39:24 2017 -0700

    [mysql_stats] Add retry to mysql_stats queries

    Currently the mysql_stats.py script fails very quickly, and even with an upstart job to retry running it, it will give up quickly. This change introduces a retry loop (using common_lib.cros.retry).

    BUG= chromium:705188
    TEST=retry_unittest.py and mysql_stats_unittest.py pass.

    Change-Id: I3e78a34f0f89cf9e134049f0b55bb7d9ab2ed2e3
    Reviewed-on: https://chromium-review.googlesource.com/459718
    Commit-Ready: Paul Hobbs <phobbs@google.com>
    Tested-by: Paul Hobbs <phobbs@google.com>
    Reviewed-by: Aviv Keshet <akeshet@chromium.org>

Look at this part of the diff:

-  while True:
-    now = time.time()
-    QueryAndEmit(baselines, cursor)
-    time_spent = time.time() - now
-    sleep_duration = LOOP_INTERVAL - time_spent
-    time.sleep(max(0, sleep_duration))
[...]
+  while True:
+    now = time.time()
+    QueryAndEmit(baselines, conn)
+    time_spent = time.time() - now
+    sleep_duration = LOOP_INTERVAL - time_spent

So this patch removed the sleep() call: sleep_duration is still computed but never used, the loop runs back to back, and mysql_stats.py eats more CPU.

The first step for this bug is to fix that regression. Then we should look at the CPU usage again; if it is still too high, we can increase LOOP_INTERVAL.
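For reference, here is a minimal sketch of the loop shape with the sleep restored. It is not the actual fix that landed: the LOOP_INTERVAL value, the run_loop and sleep_duration helpers, and the injectable clock/sleep parameters are assumptions made so the pacing logic can be run and checked on its own. query_and_emit stands in for the real QueryAndEmit(baselines, conn) call.

```python
import time

# Assumed value for illustration; the real constant is not shown in this bug.
LOOP_INTERVAL = 60  # seconds


def sleep_duration(started, finished, interval=LOOP_INTERVAL):
    """Return how long to sleep so one iteration lasts about `interval` seconds.

    Never negative: if the work took longer than the interval, do not sleep.
    """
    return max(0, interval - (finished - started))


def run_loop(query_and_emit, iterations, interval=LOOP_INTERVAL,
             clock=time.time, sleep=time.sleep):
    """Call query_and_emit `iterations` times, sleeping between calls.

    query_and_emit is a hypothetical stand-in for QueryAndEmit(baselines, conn).
    The real script loops forever; a bounded count keeps this sketch testable.
    """
    for _ in range(iterations):
        started = clock()
        query_and_emit()
        # This is the call the regression dropped. Without it the loop
        # re-runs the query immediately and pins a CPU core.
        sleep(sleep_duration(started, clock(), interval))
```

Passing clock and sleep as parameters is only a convenience for testing the pacing without waiting; the original code calls time.time() and time.sleep() directly.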
May 11 2017
The change has been deployed, and CPU usage is improving on server45: https://viceroy.corp.google.com/chromeos/machines?duration=1d&hostname=chromeos-server45&refresh=-1

Sadly, the fix needs a manual restart of the mysql_stats service. I have restarted it on server45 and server100. If the other servers don't eventually restart it by themselves, I will do it by hand. I am keeping this bug open until then.
May 11 2017
Also, I don't think we'll need to change LOOP_INTERVAL: mysql_stats doesn't even show up in top now.
May 11 2017
I'm rolling out a small db migration, so I'm dealing with fabric anyway. I'll restart the stats service as a side effect and update here when done.
May 11 2017
Should be done (on shards; not yet on cautotest + afes).
May 26 2017
Comment 1 by bugdroid1@chromium.org, May 10 2017