cautotest: Search for job by name still has the potential to bring down DB
Issue description

Today, jeffvansay@ tried searching for jobs matching android_wifi_dynamic on the AFE. A few of these searches must have timed out on cautotest/, and he retried repeatedly. As a result, 47 really slow queries accumulated (plus more that must have finished), each taking 10+ minutes and bringing the AFE DB to a grinding halt.

These graphs speak for themselves.

Slow queries shot through the roof: https://viceroy.corp.google.com/chromeos/afe_db?host_name=chromeos-server25&duration=6h&hostname=cros-autotest-shard4&refresh=90#_VG_bj0oe_1J

DB threads shot through the roof: https://viceroy.corp.google.com/chromeos/afe_db?host_name=chromeos-server25&duration=6h&hostname=cros-autotest-shard4&refresh=90#_VG_tfl7ygNc

And the DB itself became unresponsive to other queries: https://viceroy.corp.google.com/chromeos/afe_db?host_name=chromeos-server25&duration=6h&hostname=cros-autotest-shard4&refresh=90#_VG_m4322s0o
(Notice there is no network traffic: the DB is busy computing the queries and not accepting new requests.)

This is unacceptable. We need to make sure it's impossible to bring cautotest down by searching for a job.
Comment 1 by pprabhu@chromium.org, Jun 21 2017
Status: Duplicate (was: Untriaged)