
Issue 873736

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Sep 12
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Feature

Blocking:
issue 872545




Swarming: use named cache size hints

Project Member Reported by mar...@chromium.org, Aug 13

Issue description

Do regression based on current caches

The bots already report their named caches via state. Leverage this by having the Swarming server look in its bot states to find a hot cache size from another bot and pass this hint (size) down to the bot.
Ref: go/swarming-bot-cache

Once this is implemented, we can start lowering the free disk space values in bot_config.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Aug 13

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/2fefe5b2ea53afd03727a7815b260b7085c299e0

commit 2fefe5b2ea53afd03727a7815b260b7085c299e0
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Mon Aug 13 20:41:11 2018

[swarming] add DB index and task queue for named cache size cache

Change 1 out of 3.

The code will be added in a follow up, but these two files need to be
deployed first, otherwise this will cause exceptions on the server.

R=qyearsley@chromium.org

Bug:  873736 
Change-Id: I4d27ff4f927fcb40d66e32be8a93b8991ac52914
Reviewed-on: https://chromium-review.googlesource.com/1172929
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/2fefe5b2ea53afd03727a7815b260b7085c299e0/appengine/swarming/index.yaml
[modify] https://crrev.com/2fefe5b2ea53afd03727a7815b260b7085c299e0/appengine/swarming/queue.yaml

Project Member

Comment 2 by bugdroid1@chromium.org, Aug 13

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/d4d65c898922a10d96efff33ddb175ea69ac06d8

commit d4d65c898922a10d96efff33ddb175ea69ac06d8
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Mon Aug 13 22:31:06 2018

[swarming] Add server side cache implementation

Change 2 out of 3.

This adds all the server side code to precalculate the named cache
hints. The cron job is not yet enabled, as otherwise this would cause
server-side exceptions during deployment. This means that until the
next CL is deployed, the hints shall all be -1.

It doesn't try to do anything fuzzy for now, just look at P(95).

R=qyearsley@chromium.org

Bug:  873736 
Change-Id: I5222effbfdad2b5a940be8718b69869ad858a803
Reviewed-on: https://chromium-review.googlesource.com/1172930
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/d4d65c898922a10d96efff33ddb175ea69ac06d8/appengine/swarming/handlers_backend.py
[modify] https://crrev.com/d4d65c898922a10d96efff33ddb175ea69ac06d8/appengine/swarming/handlers_bot.py
[modify] https://crrev.com/d4d65c898922a10d96efff33ddb175ea69ac06d8/appengine/swarming/handlers_bot_test.py
[modify] https://crrev.com/d4d65c898922a10d96efff33ddb175ea69ac06d8/appengine/swarming/handlers_test.py
[add] https://crrev.com/d4d65c898922a10d96efff33ddb175ea69ac06d8/appengine/swarming/server/named_caches.py
[add] https://crrev.com/d4d65c898922a10d96efff33ddb175ea69ac06d8/appengine/swarming/server/named_caches_test.py
[modify] https://crrev.com/d4d65c898922a10d96efff33ddb175ea69ac06d8/appengine/swarming/server/pools_config.py
[modify] https://crrev.com/d4d65c898922a10d96efff33ddb175ea69ac06d8/appengine/swarming/server/pools_config_test.py
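
The P(95) lookup described in the commit above can be illustrated with a minimal sketch. This is not the code from named_caches.py: the function name `percentile_hint`, the nearest-rank method, and the sample sizes are all assumptions made for illustration. The -1 return value for "no data" follows the convention mentioned in the commit message.

```python
def percentile_hint(sizes, percentile=95):
    """Returns the nearest-rank percentile of sizes, or -1 when there is no data.

    Hypothetical helper; the real server code may compute the hint differently.
    """
    if not sizes:
        return -1
    ordered = sorted(sizes)
    # Nearest-rank method using integer ceiling division: pick the smallest
    # value such that at least `percentile` percent of samples are <= it.
    rank = (percentile * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical cache sizes (in bytes) reported by bots for one named cache.
reported = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
print(percentile_hint(reported))
print(percentile_hint([]))
```

Using a high percentile rather than the maximum keeps a single outlier bot from inflating the hint for everyone else.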

Project Member

Comment 3 by bugdroid1@chromium.org, Aug 13

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/dd977b101cf63b24d8e4feae100ae9c67c6d88cb

commit dd977b101cf63b24d8e4feae100ae9c67c6d88cb
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Mon Aug 13 23:31:57 2018

[swarming] Doc follow up for f4353ade23550586b095685

I forgot to upload the last patchset updating doc and comments before
sending to CQ. Oops.

No functional change.

R=qyearsley@chromium.org

Bug:  873736 
Change-Id: I9a289f97077c148907c947c63a2d9c72c6a6f231
Reviewed-on: https://chromium-review.googlesource.com/1173346
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/dd977b101cf63b24d8e4feae100ae9c67c6d88cb/appengine/swarming/server/named_caches.py

Project Member

Comment 4 by bugdroid1@chromium.org, Aug 14

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/6b52b41378c8a4e77c184772032321ccd0299908

commit 6b52b41378c8a4e77c184772032321ccd0299908
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Tue Aug 14 13:49:57 2018

Revert "[swarming] Add server side cache implementation"

This reverts commits d4d65c898922a10d96efff33ddb175ea69ac06d8 and
dd977b101cf63b24d8e4feae100ae9c67c6d88cb.

https://chromium-review.googlesource.com/1173346
https://chromium-review.googlesource.com/1172930

Reason: I had forgotten that I wanted to key the named cache hints on the
OS. And there's been a MP failure introduced in the last roll that
forces me to roll forward.

TBR=qyearsley@chromium.org

Bug:  873736 , 873951
Change-Id: Iabf43497936844b38485d537f7227be7a752fe92
Reviewed-on: https://chromium-review.googlesource.com/1174591
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/6b52b41378c8a4e77c184772032321ccd0299908/appengine/swarming/handlers_backend.py
[modify] https://crrev.com/6b52b41378c8a4e77c184772032321ccd0299908/appengine/swarming/handlers_bot.py
[modify] https://crrev.com/6b52b41378c8a4e77c184772032321ccd0299908/appengine/swarming/handlers_bot_test.py
[modify] https://crrev.com/6b52b41378c8a4e77c184772032321ccd0299908/appengine/swarming/handlers_test.py
[delete] https://crrev.com/dd977b101cf63b24d8e4feae100ae9c67c6d88cb/appengine/swarming/server/named_caches.py
[delete] https://crrev.com/dd977b101cf63b24d8e4feae100ae9c67c6d88cb/appengine/swarming/server/named_caches_test.py
[modify] https://crrev.com/6b52b41378c8a4e77c184772032321ccd0299908/appengine/swarming/server/pools_config.py
[modify] https://crrev.com/6b52b41378c8a4e77c184772032321ccd0299908/appengine/swarming/server/pools_config_test.py

Project Member

Comment 5 by bugdroid1@chromium.org, Aug 14

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/4d64e9dbf039eb4d7dfa8c09529c3172d0c44a7a

commit 4d64e9dbf039eb4d7dfa8c09529c3172d0c44a7a
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Tue Aug 14 17:38:49 2018

Reapply "[swarming] Add server side cache implementation"

This reverts commit 6b52b41378c8a4e77c184772032321ccd0299908.

Includes changes to key the named caches on the OS and adds fuzzy querying for
other OSes when one is not found for the bot's OS.

Bug:  873736 
Change-Id: I64cbc4ca9734761f6129d334e8909eb594b56c2a
Reviewed-on: https://chromium-review.googlesource.com/1174557
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/4d64e9dbf039eb4d7dfa8c09529c3172d0c44a7a/appengine/swarming/handlers_backend.py
[modify] https://crrev.com/4d64e9dbf039eb4d7dfa8c09529c3172d0c44a7a/appengine/swarming/handlers_bot.py
[modify] https://crrev.com/4d64e9dbf039eb4d7dfa8c09529c3172d0c44a7a/appengine/swarming/handlers_bot_test.py
[modify] https://crrev.com/4d64e9dbf039eb4d7dfa8c09529c3172d0c44a7a/appengine/swarming/handlers_test.py
[add] https://crrev.com/4d64e9dbf039eb4d7dfa8c09529c3172d0c44a7a/appengine/swarming/server/named_caches.py
[add] https://crrev.com/4d64e9dbf039eb4d7dfa8c09529c3172d0c44a7a/appengine/swarming/server/named_caches_test.py
[modify] https://crrev.com/4d64e9dbf039eb4d7dfa8c09529c3172d0c44a7a/appengine/swarming/server/pools_config.py
[modify] https://crrev.com/4d64e9dbf039eb4d7dfa8c09529c3172d0c44a7a/appengine/swarming/server/pools_config_test.py
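
A rough sketch of keying hints on the OS with a fallback to other OSes, as the commit above describes. The commit does not spell out the fallback policy, so this example assumes one: when no hint exists for the bot's OS, use the largest hint recorded for the same cache on any other OS in the pool. The `get_hint` name and the dict-based storage are hypothetical stand-ins for the datastore entities.

```python
def get_hint(hints, pool, os_name, cache_name):
    """Looks up a size hint keyed on (pool, os, cache name).

    hints: dict mapping (pool, os, cache_name) -> size in bytes.
    Returns -1 when nothing is known about the cache.
    """
    exact = hints.get((pool, os_name, cache_name))
    if exact is not None:
        return exact
    # Assumed fallback policy (not confirmed by the source): take the largest
    # hint recorded for the same cache on any other OS in the same pool.
    others = [
        size for (p, _os, name), size in hints.items()
        if p == pool and name == cache_name
    ]
    return max(others) if others else -1


# Hypothetical precalculated hints.
hints = {
    ('default', 'Linux', 'git'): 4096,
    ('default', 'Windows', 'git'): 8192,
}
print(get_hint(hints, 'default', 'Linux', 'git'))
print(get_hint(hints, 'default', 'Mac', 'git'))
print(get_hint(hints, 'default', 'Mac', 'unknown'))
```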

Project Member

Comment 6 by bugdroid1@chromium.org, Aug 15

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/03bc5d4159bad2732edf69678f84a665d7fbc375

commit 03bc5d4159bad2732edf69678f84a665d7fbc375
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Aug 15 15:59:22 2018

[swarming] Enable the cron job to precalculate named cache hints

Change 3 out of 3.

At this point, the bots will start receiving named cache hints.

A follow up will add the bot side handling, which will take the hint into
account to make free space before starting the task.

Bug:  873736 
Change-Id: I257562c35a5012d780466b6858be1eef79af4c45
Reviewed-on: https://chromium-review.googlesource.com/1172916
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/03bc5d4159bad2732edf69678f84a665d7fbc375/appengine/swarming/cron.yaml

Project Member

Comment 7 by bugdroid1@chromium.org, Aug 15

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/0cfbd6f84ca2686c7b38cd7c778956c445d50261

commit 0cfbd6f84ca2686c7b38cd7c778956c445d50261
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Aug 15 17:08:16 2018

[swarming] Fix errors for named cache cache

- Incorrect encoding when triggering the task queue
- Handle bot with no 'os' dimension.

Add more logging.

Bug:  873736 
Change-Id: I93896ce53b874180a230b41d66b3a45ff3e588ac
Reviewed-on: https://chromium-review.googlesource.com/1175931
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/0cfbd6f84ca2686c7b38cd7c778956c445d50261/appengine/swarming/handlers_backend.py
[modify] https://crrev.com/0cfbd6f84ca2686c7b38cd7c778956c445d50261/appengine/swarming/server/named_caches.py
[modify] https://crrev.com/0cfbd6f84ca2686c7b38cd7c778956c445d50261/appengine/swarming/server/named_caches_test.py

Project Member

Comment 8 by bugdroid1@chromium.org, Aug 29

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/28faa3511cd71826ef43e06553f5176ee27f33f2

commit 28faa3511cd71826ef43e06553f5176ee27f33f2
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Aug 29 19:02:23 2018

[client] Implement the bot side of named cache size hint

The trimming code itself is still disabled, so there is no behavior change yet.

This is a breaking change for run_isolated.py: its --named-cache argument now
accepts three parameters instead of two; the third one is the named cache size
hint, or -1.

When run_isolated starts and it detects that some named caches are missing, it
makes sure that the free disk space is increased by the hint requested. This
code is not yet activated.

This means that a trim may occur before the task starts, if run_isolated.py
determines that more free disk space is needed.

Change the API to pass strings, so that JSON encoding doesn't get in the way.

Bug:  873736 
Change-Id: I9b8a8837dba7ae4e12b3761e32c1cddcfb4a9ee8
Reviewed-on: https://chromium-review.googlesource.com/1177902
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/28faa3511cd71826ef43e06553f5176ee27f33f2/appengine/swarming/handlers_bot.py
[modify] https://crrev.com/28faa3511cd71826ef43e06553f5176ee27f33f2/appengine/swarming/handlers_bot_test.py
[modify] https://crrev.com/28faa3511cd71826ef43e06553f5176ee27f33f2/appengine/swarming/swarming_bot/bot_code/task_runner.py
[modify] https://crrev.com/28faa3511cd71826ef43e06553f5176ee27f33f2/appengine/swarming/swarming_bot/bot_code/task_runner_test.py
[modify] https://crrev.com/28faa3511cd71826ef43e06553f5176ee27f33f2/client/run_isolated.py
[modify] https://crrev.com/28faa3511cd71826ef43e06553f5176ee27f33f2/client/tests/run_isolated_smoke_test.py
[modify] https://crrev.com/28faa3511cd71826ef43e06553f5176ee27f33f2/client/tests/run_isolated_test.py
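
To make the three-parameter --named-cache form and the free-space adjustment from the commit above concrete, here is a small sketch. It is not the run_isolated.py implementation: it uses argparse for brevity, and the helper names `parse_named_caches` and `required_free_space`, the NAME/PATH/HINT ordering, and the rule that hints of -1 add nothing are assumptions for illustration.

```python
import argparse


def parse_named_caches(argv):
    """Parses repeated --named-cache NAME PATH HINT triples (assumed order)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--named-cache', nargs=3, action='append', default=[],
        metavar=('NAME', 'PATH', 'HINT'))
    args = parser.parse_args(argv)
    # Values arrive as strings; the hint is converted to an int here.
    return [(name, path, int(hint)) for name, path, hint in args.named_cache]


def required_free_space(min_free_space, caches, present):
    """Returns the free space to ensure before the task starts.

    Only caches that are missing locally and have a known hint (not -1)
    increase the requirement.
    """
    extra = sum(
        hint for name, _path, hint in caches
        if name not in present and hint > 0)
    return min_free_space + extra


caches = parse_named_caches([
    '--named-cache', 'git', 'cache/git', '1024',
    '--named-cache', 'goma', 'cache/goma', '-1',
])
# 'goma' is already on disk, 'git' is missing, so its hint is added.
print(required_free_space(2048, caches, present={'goma'}))
```

If the resulting requirement exceeds the current free space, a trim of older caches would be needed before the task starts, which is the behavior the commit message describes.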

Project Member

Comment 9 by bugdroid1@chromium.org, Sep 5

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/a3738d4338b910c0ad615d1ac41c09762dd977e2

commit a3738d4338b910c0ad615d1ac41c09762dd977e2
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Tue Sep 04 17:19:59 2018

[client] fix condition for object

The shorthand 'if foo:' would not work for empty caches, as when
__nonzero__ is not implemented, Python calls __len__.

This means that an empty cache is False, but that's not the expected behavior.

Bug:  873736 
Change-Id: I4e9558fe5d199a85acbc79edfac44f8be64edf0d
Reviewed-on: https://chromium-review.googlesource.com/1203632
Reviewed-by: Jao-ke Chin-Lee <jchinlee@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/a3738d4338b910c0ad615d1ac41c09762dd977e2/client/local_caching.py
[modify] https://crrev.com/a3738d4338b910c0ad615d1ac41c09762dd977e2/client/tests/local_caching_test.py
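
The truthiness pitfall described in the commit above is easy to reproduce. The sketch below uses a toy `Cache` class, not the one in local_caching.py. In Python 3 the hook is named `__bool__` rather than `__nonzero__`, but the fallback to `__len__` works the same way.

```python
class Cache(object):
    """Toy container that defines __len__ but no __bool__/__nonzero__."""

    def __init__(self):
        self._items = {}

    def __len__(self):
        return len(self._items)

    def add(self, key, value):
        self._items[key] = value


def describe(cache):
    # Compare against None explicitly so that an empty cache still counts
    # as "a cache was provided".
    if cache is not None:
        return 'cache with %d items' % len(cache)
    return 'no cache'


empty = Cache()
print(bool(empty))       # False, because len(empty) == 0
print(describe(empty))   # the explicit None check still sees the object
print(describe(None))
```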

Project Member

Comment 10 by bugdroid1@chromium.org, Sep 6

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/9a74523450059db52a8007178bc1997ab03c3abb

commit 9a74523450059db52a8007178bc1997ab03c3abb
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Thu Sep 06 12:22:03 2018

[swarming] fix named cache hint calculation

It was looking at the wrong key. Yay for untyped data structures.

R=jchinlee@chromium.org

Bug:  873736 
Change-Id: I9a6444232eec1364d8c29d036efb36c4b8676068
Reviewed-on: https://chromium-review.googlesource.com/1208472
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Jao-ke Chin-Lee <jchinlee@chromium.org>

[modify] https://crrev.com/9a74523450059db52a8007178bc1997ab03c3abb/appengine/swarming/server/named_caches.py
[modify] https://crrev.com/9a74523450059db52a8007178bc1997ab03c3abb/appengine/swarming/server/named_caches_test.py

Project Member

Comment 11 by bugdroid1@chromium.org, Sep 6

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/3c22b45c06209673eb2da00f0d985fb541e68f0f

commit 3c22b45c06209673eb2da00f0d985fb541e68f0f
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Thu Sep 06 23:23:20 2018

[swarming] Rename max_size to hint; index os

- max_size was confusing, it's really a size hint, not the max size. How the
  hint is calculated is not part of the entity itself.
- Make os indexed, since I realized this is actually useful by using the
  datastore entity browser.

R=jchinlee@chromium.org

Bug:  873736 
Change-Id: I793f2e0a666346d2a50593ee7359eda6359e3d0d
Reviewed-on: https://chromium-review.googlesource.com/1210022
Reviewed-by: Jao-ke Chin-Lee <jchinlee@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/3c22b45c06209673eb2da00f0d985fb541e68f0f/appengine/swarming/server/named_caches.py

Project Member

Comment 12 by bugdroid1@chromium.org, Sep 10

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/2abfc04c5e1dcb846e1fbb7ae882a8a11f01220f

commit 2abfc04c5e1dcb846e1fbb7ae882a8a11f01220f
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Mon Sep 10 14:40:35 2018

[client] Enforce missing named cache hints

This will impact the free disk space created on the bots, but the end goal is to
reduce the values in bot_config for minimum free disk space, which should enable
keeping more caches locally.

Bug:  873736 
Change-Id: I4434225e823ddb4a66ad99d7ad853b4f1657deeb
Reviewed-on: https://chromium-review.googlesource.com/1178325
Reviewed-by: Jao-ke Chin-Lee <jchinlee@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/2abfc04c5e1dcb846e1fbb7ae882a8a11f01220f/client/run_isolated.py
[modify] https://crrev.com/2abfc04c5e1dcb846e1fbb7ae882a8a11f01220f/client/tests/run_isolated_test.py

Project Member

Comment 13 by bugdroid1@chromium.org, Sep 12

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/5e4738ce73fb439be5e95464d797bfa5ba3738b7

commit 5e4738ce73fb439be5e95464d797bfa5ba3738b7
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Sep 12 13:24:40 2018

Project Member

Comment 14 by bugdroid1@chromium.org, Sep 12

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/4f645fa93f95dcb47621c885748351193b2db66b

commit 4f645fa93f95dcb47621c885748351193b2db66b
Author: Marc-Antoine Ruel <maruel@chromium.org>
Date: Wed Sep 12 16:55:32 2018

Status: Fixed (was: Started)
