Issue 788888

Issue metadata

Status: Fixed
Owner:
Closed: Aug 17
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Scheduler resource leak triggered DB slowness

Project Member Reported by pprabhu@chromium.org, Nov 27 2017

Issue description

We think we have a resource leak in the scheduler that causes DB slowness. Things get better after every push: http://shortn/_3MzbHZGVY0
 
Chase: Just reset the scheduler every 8 hours. We still get to see this pattern, but it doesn't grow large enough to affect the DB too badly.
Status: Started (was: Assigned)
CL in review: https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/806743

Needs a prod push and a puppet change to enforce this in prod.
Project Member

Comment 3 by bugdroid1@chromium.org, Dec 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/74d96d7a8a803828c7096761fe22b8da2d860889

commit 74d96d7a8a803828c7096761fe22b8da2d860889
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Fri Dec 08 11:59:23 2017

autotest: Add a process lifetime argument to monitor_db

BUG= chromium:788888 
TEST=Manual run with / without new argument.

Change-Id: I9dd48771620b2143ae9be384f4bbb7820bff9fba
Reviewed-on: https://chromium-review.googlesource.com/806743
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/74d96d7a8a803828c7096761fe22b8da2d860889/scheduler/monitor_db.py
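For readers without access to the CL, here is a rough sketch of what a process lifetime argument could look like. This is not the actual monitor_db.py change: the flag name `--lifetime-hours` and the helpers `parse_args`, `lifetime_expired` and `run` are assumptions made for illustration, and the sketch assumes some external supervisor restarts the process after it exits.

```python
import argparse
import time


def parse_args(argv=None):
    """Parse the (hypothetical) lifetime flag; None means run forever."""
    parser = argparse.ArgumentParser(
        description="Toy scheduler loop with a bounded process lifetime.")
    parser.add_argument(
        "--lifetime-hours", type=float, default=None,
        help="Exit cleanly after this many hours so a supervisor can "
             "restart the process with fresh resources.")
    return parser.parse_args(argv)


def lifetime_expired(start, lifetime_hours, now):
    """Return True once lifetime_hours have elapsed since start (seconds)."""
    if lifetime_hours is None:
        return False
    return now - start >= lifetime_hours * 3600


def run(lifetime_hours, tick, clock=time.monotonic, sleep=time.sleep,
        interval=1.0):
    """Call tick() repeatedly until the lifetime expires.

    clock and sleep are injectable so the loop can be exercised without
    waiting in real time. Returns the number of ticks executed.
    """
    start = clock()
    ticks = 0
    while not lifetime_expired(start, lifetime_hours, clock()):
        tick()
        ticks += 1
        sleep(interval)
    return ticks
```

With something like this, the 8 hour reset would amount to calling `run(parse_args(["--lifetime-hours", "8"]).lifetime_hours, tick=...)` and letting the supervisor start a new process when it returns. Exiting between ticks rather than killing the process from outside keeps each scheduler cycle intact.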

Pending a push to prod and a shadow config change.
Project Member

Comment 5 by bugdroid1@chromium.org, Dec 13 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/8159b9dd94660662507d910d0ad5edb473335128

commit 8159b9dd94660662507d910d0ad5edb473335128
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Wed Dec 13 05:37:08 2017

Status: Fixed (was: Started)
Status: Started (was: Fixed)
Keeping this open to determine whether the scheduler restart was enough to smooth out the peaks seen here: https://viceroy.corp.google.com/chromeos/capacity_health?duration=90d&hostname=chromeos-server2&refresh=-1
Labels: -Chase
Any update on this?
Status: Fixed (was: Started)
All services are now restarted daily, except the MySQL server on the master DB.
We haven't seen the weekly cyclic issues since.
