automatically kill slow autotest database queries
Issue description
For background on slow queries, see:
https://bugs.chromium.org/p/chromium/issues/detail?id=625536
https://bugs.chromium.org/p/chromium/issues/detail?id=592704
https://bugs.chromium.org/p/chromium/issues/detail?id=626198
Summary: Sometimes we get a backlog of very weird and very slow (multi-day or invincible) queries. They can accumulate and slow down the db significantly. MySQL didn't (until a very recent version) have the concept of a server-side query timeout. However, we could implement such a thing ourselves with a cron job that periodically kills all queries older than X.
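A minimal sketch of the selection step such a cron job might run, for illustration only. It assumes the job has already fetched rows from MySQL's process list (for example via SHOW FULL PROCESSLIST) and only needs to decide which statements to kill. The names ProcessRow and build_kill_statements are hypothetical, and using KILL QUERY (which ends the statement but keeps the connection) rather than KILL is an assumption of this sketch, not something decided in this bug.

```python
from typing import Iterable, List, NamedTuple, Optional


class ProcessRow(NamedTuple):
    """The process-list columns this sketch needs (hypothetical helper type)."""
    id: int               # connection/thread id
    command: str          # e.g. "Query" or "Sleep"
    time: int             # seconds spent in the current state
    info: Optional[str]   # statement text, or None when idle


def build_kill_statements(rows: Iterable[ProcessRow],
                          max_seconds: int) -> List[str]:
    """Return one KILL QUERY statement per query older than max_seconds.

    Only rows whose command is "Query" are considered, so idle ("Sleep")
    connections are left alone however long they have been open.
    """
    return [
        "KILL QUERY %d" % row.id
        for row in rows
        if row.command == "Query" and row.time > max_seconds
    ]


if __name__ == "__main__":
    # Illustrative rows only; a real job would read these from the server.
    sample = [
        ProcessRow(101, "Query", 172800, "SELECT ..."),  # multi-day query
        ProcessRow(102, "Sleep", 90000, None),           # idle connection
        ProcessRow(103, "Query", 12, "SELECT ..."),      # fast query
    ]
    for statement in build_kill_statements(sample, max_seconds=1800):
        print(statement)
```

The threshold X is passed in as max_seconds so it can be tuned without touching the selection logic. Executing the returned statements against the server is left out of the sketch.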
,
Jul 7 2016
We should maybe do this on our shards as well.
,
Jul 7 2016
Charlene, can you take a look at this?
,
Jul 7 2016
sure, will take a look
,
Jul 12 2016
,
Jul 15 2016
Does someone understand why these queries can take so long? Surely, our DB isn't so big that queries should take hours to execute.
,
Jul 15 2016
The database is several hundred GB in size. The afe table itself is on job id # 65 million. I think we cleaned out the old jobs (i.e. older than 50-ish million) a few months ago, but it accumulates quickly, and can't be cleaned out in a footprint-reducing way without downtime (yay MySQL).
,
Jul 16 2016
Issue 599267 has been merged into this issue.
,
Jul 16 2016
Issue 592704 has been merged into this issue.
,
Jul 16 2016
Merged a few of the old bugs, with examples of super slow queries, into this bug.
,
Jul 28 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/bbf1daaadfcf1c13c4b2a8c2e723908daefdf1e3
commit bbf1daaadfcf1c13c4b2a8c2e723908daefdf1e3
Author: Shuqian Zhao <shuqianz@chromium.org>
Date: Tue Jul 26 21:54:04 2016
[autotest] Automatically kill the slow autotest db queries over 1800s
Sometimes we get a backlog of very weird and very slow (multi-day or invincible) queries. They can accumulate and slow down the db significantly. This script will Automatically kill the slow db queries over 1800s. Future work is to set it as a cron job that runs every 30 mins.
BUG= chromium:626424
TEST=Test on a test master and also on product backup database server.
Change-Id: I011a142ba1c2502977abac0e49bdffcf82ef7c78
Reviewed-on: https://chromium-review.googlesource.com/363524
Reviewed-by: Shuqian Zhao <shuqianz@chromium.org>
Tested-by: Shuqian Zhao <shuqianz@chromium.org>
[add] https://crrev.com/bbf1daaadfcf1c13c4b2a8c2e723908daefdf1e3/site_utils/kill_slow_queries.py
,
Aug 2 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/750d12bfd8fb1c406e6df623d46cf6a6a6fb42fe
commit 750d12bfd8fb1c406e6df623d46cf6a6a6fb42fe
Author: Shuqian Zhao <shuqianz@chromium.org>
Date: Tue Aug 02 17:37:57 2016
[autotest] set kill_slow_queries.py as an executable file.
In order to set kill_slow_queries as a cronjob, it must an executable file.
BUG= chromium:626424
TEST=None
Change-Id: I319968d8baed404af4b8a07c8cdd9cbb4cce0978
Reviewed-on: https://chromium-review.googlesource.com/365422
Tested-by: Shuqian Zhao <shuqianz@chromium.org>
Reviewed-by: Dan Shi <dshi@google.com>
[modify] https://crrev.com/750d12bfd8fb1c406e6df623d46cf6a6a6fb42fe/site_utils/kill_slow_queries.py
,
Aug 8 2016
,
Aug 31 2016
I didn't see anything in chromeos_admin or any puppet automation that sets up the cron job. I guess the cron job was just set up manually on the mysql server? We should make this a puppet-controlled thing so we can, for instance, change the parameters from chromeos_admin without needing to ssh into the server.
,
Aug 31 2016
I will first manually change the script's query lifetime to 5 mins, and then add it to puppet.
,
Aug 31 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/c6352c0ef82002833090e6d38bfdf27a1b6110b5
commit c6352c0ef82002833090e6d38bfdf27a1b6110b5
Author: Shuqian Zhao <shuqianz@chromium.org>
Date: Wed Aug 31 20:39:39 2016
[autotest] change query lifetime from 1800s to 300s (5mins)
Change the db query lifetime from 1800s (30mins) to 300s (5mins).
BUG= chromium:626424
TEST=None
Change-Id: I6c8be555961a25a90e16e601bcb0ffcf741f43a0
Reviewed-on: https://chromium-review.googlesource.com/379158
Reviewed-by: Dan Shi <dshi@google.com>
Tested-by: Shuqian Zhao <shuqianz@chromium.org>
[modify] https://crrev.com/c6352c0ef82002833090e6d38bfdf27a1b6110b5/site_utils/kill_slow_queries.py
,
Aug 31 2016
The following revision refers to this bug:
https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/9742c834a8356b2c6c38f785299eff20131be2b0
commit 9742c834a8356b2c6c38f785299eff20131be2b0
Author: Shuqian Zhao <shuqianz@chromium.org>
Date: Wed Aug 31 21:23:51 2016
,
Sep 27 2016
,
Oct 7 2016
,
Oct 10 2016
,
Nov 19 2016
,
Jan 21 2017
,
Mar 4 2017
,
Apr 17 2017
,
May 30 2017
,
Aug 1 2017
,
Oct 14 2017
Comment 1 by akes...@chromium.org
, Jul 7 2016