
Issue 895493

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Swarming: continuous stream of memory exhaustion on the backend

Project Member Reported by mar...@chromium.org, Oct 15

Issue description

The memory exhaustion started a few months ago. This issue tracks diagnosing the root cause and fixing it. As part of that, I'll split the main.py script in two.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Oct 15

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/6d0a4be256392e549f34cb4873ac13de9bec1ca2

commit 6d0a4be256392e549f34cb4873ac13de9bec1ca2
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Mon Oct 15 18:31:53 2018

[swarming] fix frontend / backend split

There is a memory leak issue, and I'm starting with reducing the number
of variables by properly splitting the apps.

- Removes an old /internal/.+ redirect on the frontend module to the
  backend handlers.
- Improves comments in app.yaml and module-backend.yaml

Bug: 895493
Change-Id: Ica28f415dd9aa8d76561a19f0e4ff96487ce592d
Reviewed-on: https://chromium-review.googlesource.com/c/1280745
Reviewed-by: Jao-ke Chin-Lee <jchinlee@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/6d0a4be256392e549f34cb4873ac13de9bec1ca2/appengine/swarming/app.yaml
[add] https://crrev.com/6d0a4be256392e549f34cb4873ac13de9bec1ca2/appengine/swarming/main_backend.py
[rename] https://crrev.com/6d0a4be256392e549f34cb4873ac13de9bec1ca2/appengine/swarming/main_frontend.py
[modify] https://crrev.com/6d0a4be256392e549f34cb4873ac13de9bec1ca2/appengine/swarming/module-backend.yaml
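
For readers unfamiliar with the split, here is a minimal sketch of the idea, not the actual contents of main_frontend.py or main_backend.py. It assumes plain WSGI apps with exact-match routes. The route paths, make_app and call are hypothetical names chosen for illustration. The point is that each module owns its own route table, so the frontend no longer answers /internal/ paths at all.

```python
# Hypothetical sketch: route paths and handlers are illustrative only and are
# not taken from the real Swarming code.


def make_app(routes):
    """Builds a tiny WSGI app that serves only the given exact-match routes."""
    def app(environ, start_response):
        handler = routes.get(environ.get('PATH_INFO', ''))
        if handler is None:
            start_response('404 Not Found', [('Content-Type', 'text/plain')])
            return [b'not found']
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [handler()]
    return app


# Frontend module: user-facing routes only, no /internal/ paths.
frontend_app = make_app({
    '/': lambda: b'frontend index',
})

# Backend module: cron and task queue style routes only.
backend_app = make_app({
    '/internal/cron/example': lambda: b'backend cron ran',
})


def call(app, path):
    """Invokes a WSGI app in-process and returns (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured['status'] = status

    body = b''.join(app({'PATH_INFO': path}, start_response))
    return captured['status'], body
```

With separate tables, a memory regression seen on backend instances can only come from code reachable through the backend routes, which is the "reducing the number of variables" goal stated in the commit message.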

Project Member

Comment 2 by bugdroid1@chromium.org, Oct 17

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/c9107453932bd9cb1df0c32e6de93664bf175394

commit c9107453932bd9cb1df0c32e6de93664bf175394
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Oct 17 20:42:22 2018

[components] reorder utils.py

No functional change. Only code moved around.

I want to add new code, but I felt this file was due for a reordering
first. The new code will be in the Handler section.

R=jchinlee@chromium.org

Bug: 895493
Change-Id: I074027fea7948515b1b1aece6f464b78e93985a7
Reviewed-on: https://chromium-review.googlesource.com/c/1286911
Reviewed-by: Jao-ke Chin-Lee <jchinlee@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/c9107453932bd9cb1df0c32e6de93664bf175394/appengine/components/components/utils.py

Project Member

Comment 3 by bugdroid1@chromium.org, Oct 18

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/d63fdb9a27d0e89fb2dc11424b6e47533b82d542

commit d63fdb9a27d0e89fb2dc11424b6e47533b82d542
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Thu Oct 18 19:14:29 2018

[swarming] Report memory usage after each handler

Add this to config service too.

This will help diagnose leaks as I'll be able to look at all logs for one
specific GAE instance and see the difference in memory usage across
requests.

I will need to add it to gae_ts_mon too, but I'll need to tweak upstream
first.

Bug: 895493
Change-Id: I231fd282846a2736475d08609444ddbbecd8a1db
Reviewed-on: https://chromium-review.googlesource.com/c/1281543
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>
Reviewed-by: Jao-ke Chin-Lee <jchinlee@chromium.org>

[modify] https://crrev.com/d63fdb9a27d0e89fb2dc11424b6e47533b82d542/appengine/components/components/config/apps.py
[modify] https://crrev.com/d63fdb9a27d0e89fb2dc11424b6e47533b82d542/appengine/components/components/ereporter2/main.py
[modify] https://crrev.com/d63fdb9a27d0e89fb2dc11424b6e47533b82d542/appengine/components/components/utils.py
[modify] https://crrev.com/d63fdb9a27d0e89fb2dc11424b6e47533b82d542/appengine/swarming/main_backend.py
[modify] https://crrev.com/d63fdb9a27d0e89fb2dc11424b6e47533b82d542/appengine/swarming/main_frontend.py
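
A rough sketch of reporting memory usage after each handler follows. This is not the code from utils.py. It assumes Python 3 and uses the standard tracemalloc module as a stand-in memory probe, which only counts Python-level allocations while tracing is on. The names current_memory_mb and report_memory are hypothetical. The probe and the logger are injectable so another source, such as a process-level reading, can be swapped in.

```python
import functools
import logging
import tracemalloc


def current_memory_mb():
    """Returns memory traced by tracemalloc in MiB (0.0 if tracing is off)."""
    current, _peak = tracemalloc.get_traced_memory()
    return current / (1024.0 * 1024.0)


def report_memory(handler, get_memory=current_memory_mb, log=logging.info):
    """Wraps a handler so memory usage is logged after every call.

    Hypothetical helper: get_memory and log are injectable so the probe can be
    replaced by whatever the hosting environment provides.
    """
    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        before = get_memory()
        try:
            return handler(*args, **kwargs)
        finally:
            # Runs even if the handler raised, so failing requests are
            # reported as well.
            after = get_memory()
            log('%s: %.1f MiB -> %.1f MiB (delta %+.1f MiB)' % (
                handler.__name__, before, after, after - before))
    return wrapper
```

Filtering these log lines by a single instance gives the per-request memory differences mentioned in the commit message.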

Project Member

Comment 4 by bugdroid1@chromium.org, Oct 31

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/b544336328f8805c17b98aa0a011b938237a82cf

commit b544336328f8805c17b98aa0a011b938237a82cf
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Oct 31 20:32:24 2018

Fix a small error in d63fdb9a27d0e89

The hook for memory usage must be disabled in unit tests, not when running in
the dev server.

Bug: 895493
Change-Id: I4389101ef7723cecfb020df80ac5a257d9ebf085
Reviewed-on: https://chromium-review.googlesource.com/c/1289351
Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/b544336328f8805c17b98aa0a011b938237a82cf/appengine/components/components/utils.py
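
The corrected condition can be sketched as below. The function name and its two boolean parameters are hypothetical, and how the real code detects each environment is not shown here.

```python
def should_report_memory(in_unit_test, on_dev_server):
    """Returns True when the memory-reporting hook should be active.

    Only unit tests turn the hook off. The dev server flag is accepted to make
    the point explicit: it does not influence the decision.
    """
    del on_dev_server  # Intentionally unused.
    return not in_unit_test
```

The earlier mistake corresponds to keying the decision on on_dev_server instead, which would leave the hook running under unit tests.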

Project Member

Comment 5 by bugdroid1@chromium.org, Dec 5

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/d62dd4dc8474fced5bc4fe850ee80b44e57dfae9

commit d62dd4dc8474fced5bc4fe850ee80b44e57dfae9
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Dec 05 20:00:47 2018

[gae_ts_mon] report memory usage if it increased significantly

I suspect this may be causing issues. We'll have to look at traces to
determine if so.

R=jeffcarp@chromium.org

Bug: 895493
Change-Id: Ic33677a5b7c89c734a8cae6d1f13c61cb6689443
Reviewed-on: https://chromium-review.googlesource.com/c/1286545
Reviewed-by: Jeff Carpenter <jeffcarp@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Cr-Commit-Position: refs/heads/master@{#19367}

[modify] https://crrev.com/d62dd4dc8474fced5bc4fe850ee80b44e57dfae9/appengine_module/gae_ts_mon/handlers.py
[modify] https://crrev.com/d62dd4dc8474fced5bc4fe850ee80b44e57dfae9/appengine_module/gae_ts_mon/test/handlers_test.py
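
One way to report memory only when it has increased significantly is sketched below. This is not the gae_ts_mon implementation. The class name, the threshold parameter and the report callback are all hypothetical. It compares each reading against the last reported value, so a slow leak that stays under the threshold per request still gets reported once it accumulates.

```python
class MemoryGrowthReporter(object):
    """Reports a memory reading only when it grew past a threshold.

    Hypothetical sketch: threshold_mb and the report callback are assumptions.
    """

    def __init__(self, threshold_mb, report):
        self._threshold_mb = threshold_mb
        self._report = report
        self._last_reported_mb = None

    def observe(self, current_mb):
        """Records a reading; returns True if a report was emitted."""
        if self._last_reported_mb is None:
            # The first reading only establishes the baseline.
            self._last_reported_mb = current_mb
            return False
        growth = current_mb - self._last_reported_mb
        if growth < self._threshold_mb:
            # Keep the old baseline so small increases accumulate.
            return False
        self._report(current_mb, growth)
        self._last_reported_mb = current_mb
        return True
```

Usage would be one reporter per process, with observe() called at the end of each request using whatever memory reading the environment exposes.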
