
Issue 898216


Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




swarming recipe_module: download a subset of outputs files from the task's isolated tree

Project Member Reported by robert...@chromium.org, Oct 23

Issue description

The use case is getting a binary from an isolated test, specifically when the binary has been built with code coverage instrumentation.
 
Status: Available (was: Untriaged)
Cc: joshuaseaton@google.com erikc...@chromium.org tikuta@chromium.org
Summary: swarming recipe_module: download a subset of outputs files from the task's isolated tree (was: Provide a way to have a recipe download a file (or a subset of files) from an isolate.)
- Updated the topic to be more specific.
- I think this should be added to the isolated Go tool. Joshua, what do you think? That said, it's fairly orthogonal to this issue.

Use case:
- A task has a lot of outputs. There are multiple cases of this, especially in failure code paths where a lot of logs are dumped.
- The recipe retrieves the task results, but only outputs.json (or equivalent) is used; the rest is not needed.

Expected:
Only actually needed output is retrieved.

Actual:
A lot of I/O is done with the data discarded right after. This is particularly bad for webkit layout tests.

What Roberto is really looking for is for swarming collect to support this.
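
The expected behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea of filtering before fetching: the function name `select_outputs`, the file listing, and the sizes are all hypothetical, and it does not reflect how the swarming client is actually structured.

```python
import re


def select_outputs(sizes_by_path, pattern):
    """Split an output listing into paths to fetch and bytes skipped."""
    matcher = re.compile(pattern)
    wanted = [path for path in sizes_by_path if matcher.match(path)]
    skipped_bytes = sum(
        size for path, size in sizes_by_path.items() if path not in wanted
    )
    return wanted, skipped_bytes


# Hypothetical output tree of one task: relative path -> size in bytes.
task_outputs = {
    "output.json": 2_000,
    "logs/stderr.txt": 40_000_000,
    "layout-test-results/full_results.json": 5_000_000,
}

wanted, skipped_bytes = select_outputs(task_outputs, r"^output\.json$")
print(wanted, skipped_bytes)
```

With a listing like this one, almost all of the I/O is avoided because the large log and result files are never requested.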
Owner: tikuta@chromium.org
Status: Started (was: Available)
What do you think about adding a filtering option to the swarming client?

https://chromium-review.googlesource.com/c/infra/luci/client-py/+/1309276/
Unless I'm misinterpreting, I believe fuchsia's swarming module can already do this.

We have the user build up a `TaskRequest` object, for which `outputs` are explicitly enumerated:
https://fuchsia.googlesource.com/infra/recipes/+/master/recipe_modules/swarming/api.py#106
We pass the corresponding JSON to the swarming go client's `spawn_tasks` command - and, on collect()-ing, only those registered outputs are returned.

robertocn@, would that give what you want?

Now that we've finished upstreaming our isolated recipe module to recipes-py, I'll start iterating this week with iannucci@ to do the same with our swarming module.
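
As a rough sketch of the "explicitly enumerated outputs" approach described in this comment: the class `TaskRequestSketch`, its fields, and `collect_declared` are hypothetical stand-ins invented for illustration. They are not the Fuchsia recipe module's API or the JSON schema that the Go client accepts.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TaskRequestSketch:
    """Hypothetical stand-in for a task request; not the Fuchsia API."""

    name: str
    command: list
    outputs: list = field(default_factory=list)

    def to_json(self):
        # Serialize the request so it could be handed to a client tool.
        return json.dumps(asdict(self), sort_keys=True)


def collect_declared(request, produced):
    """Keep only the produced files that the request declared as outputs."""
    declared = set(request.outputs)
    return {path: data for path, data in produced.items() if path in declared}


request = TaskRequestSketch(
    name="coverage-test",
    command=["./run_tests"],
    outputs=["output.json"],
)
# Hypothetical files the task wrote; only the declared one is returned.
produced = {"output.json": b"{}", "logs/debug.log": b"..."}
returned = collect_declared(request, produced)
print(request.to_json())
print(sorted(returned))
```

The design difference from a collect-time filter is that the set of interesting files is fixed when the task is requested, so nothing needs to be specified again when collecting.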

https://chromium-review.googlesource.com/c/infra/luci/luci-py/+/1309855 is good; I'm waiting for the comments to be addressed.

As Joshua noted, work on issue 894045 is likely to subsume this change. However, I am not investing time in it at the moment, and it is "some" amount of work.

If someone wants to take on the migration, I think it's the best way forward. But I'm also fine with adding this right now to the Python client, as the benefit for the Chrome use case is large.
Project Member

Comment 6 by bugdroid1@chromium.org, Nov 6

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-py.git/+/8c72886a52713f1b46f6e672be984d6d7411ef75

commit 8c72886a52713f1b46f6e672be984d6d7411ef75
Author: Takuto Ikuta <tikuta@chromium.org>
Date: Tue Nov 06 20:42:43 2018

[client] Add --filepath-filter option to choose download files from isolate

If I use this option like below, downloading subset files of
webkit_layout_tests become 3 seconds.
That is much faster than 9 min to download all isolate outputs.

$ time tools/swarming_client/swarming.py collect -S
https://chromium-swarm.appspot.com --json task.json
--task-output-dir=tmp --task-summary-json=summary.json
--task-output-stdout=none --filepath-filter='^output.json$'
vm78-m9: 40e16cc5f27bcf10 0
vm222-m9: 40e16cb0049b4510 0
vm419-m1: 40e16cabf8539210 0
vm1405-m4: 40e16cae4b766f10 0
vm1456-m4: 40e16cb5e008a610 0
vm311-m4: 40e16cba51a68b10 0
vm180-m9: 40e16caa5ed5e210 0
vm131-m4: 40e16cb895348610 0
vm458-m4: 40e16cbda53f2910 0
vm91-m4: 40e16cc0a9655a10 0
vm449-m4: 40e16cc986923010 0
vm49-m4: 40e16cc43c95f010 0

real    0m3.497s
user    0m1.549s
sys     0m0.592s

Bug:  868878 , 898216
Change-Id: I8f264bd263373241e48438426b623e89b3cf544e
Reviewed-on: https://chromium-review.googlesource.com/c/1309855
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
Commit-Queue: Takuto Ikuta <tikuta@chromium.org>

[modify] https://crrev.com/8c72886a52713f1b46f6e672be984d6d7411ef75/client/isolateserver.py
[modify] https://crrev.com/8c72886a52713f1b46f6e672be984d6d7411ef75/client/swarming.py
[modify] https://crrev.com/8c72886a52713f1b46f6e672be984d6d7411ef75/client/tests/swarming_test.py

robertocn@, you will be able to specify a regexp filter for isolated files after the swarming client is rolled. Does this fulfill your requirements?
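
When writing such a filter, note that the pattern is a regular expression, so an unescaped `.` matches any character. The sketch below uses Python's standard `re.match` to compare a few patterns against hypothetical paths. Whether the client applies the filter with `re.match` or another function is an assumption here, so anchoring with `^` and `$` is the safer choice.

```python
import re

# Hypothetical relative paths inside a task's output tree.
paths = ["output.json", "outputXjson", "logs/output.json", "output.json.bak"]


def matching(pattern, candidates):
    """Return the candidates accepted by re.match for the given pattern."""
    return [p for p in candidates if re.match(pattern, p)]


loose = matching(r"^output.json$", paths)      # '.' matches any character
strict = matching(r"^output\.json$", paths)    # literal dot, fully anchored
unanchored = matching(r"output\.json", paths)  # no '$', so prefixes match

print(loose)
print(strict)
print(unanchored)
```

Escaping the dot, as in `^output\.json$`, selects exactly the one file at the root of the output tree.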
