swarming recipe_module: download a subset of output files from the task's isolated tree
Issue description
The use case is getting a binary from an isolated test, specifically when the binary has been built with code coverage instrumentation.
Oct 29
- Updated the topic to be more specific.
- I think this should be added to the isolated Go tool. Joshua, what do you think about it? That said, it's fairly orthogonal to this issue.

Use case:
- A task has a lot of outputs. There are multiple cases of this, especially in failure code paths where a lot of logs are dumped.
- The recipe retrieves the task results, but only outputs.json (or equivalent) is used; the rest is not needed.

Expected: Only the output that is actually needed is retrieved.
Actual: A lot of I/O is done, with the data discarded right after. This is particularly bad for webkit layout tests.

What Roberto is really looking for is for the swarming collect to support this.
Oct 31
What do you think about adding a filtering option to the swarming client? https://chromium-review.googlesource.com/c/infra/luci/client-py/+/1309276/
Nov 3
Unless I'm misinterpreting, I believe fuchsia's swarming module can already do this. We have the user build up a `TaskRequest` object, for which `outputs` are explicitly enumerated: https://fuchsia.googlesource.com/infra/recipes/+/master/recipe_modules/swarming/api.py#106

We pass the corresponding JSON to the swarming go client's `spawn_tasks` command and, on collect()-ing, only those registered outputs are returned. robertocn@, would that give you what you want?

Now that we've finished upstreaming our isolated recipe module to recipes-py, I'll start iterating this week with iannucci@ to do the same with our swarming module.
Nov 5
https://chromium-review.googlesource.com/c/infra/luci/luci-py/+/1309855 is good; I'm waiting for the comments to be addressed. As Joshua noted, work on issue 894045 is likely to subsume this change, but I am not investing time in it at the moment, and it is "some" amount of work. If someone wants to take on the migration, I think it's the best way forward. I'm also fine with adding this to the Python client right now, as the benefit for the Chrome use case is large.
Nov 6
The following revision refers to this bug:
https://chromium.googlesource.com/infra/luci/luci-py.git/+/8c72886a52713f1b46f6e672be984d6d7411ef75

commit 8c72886a52713f1b46f6e672be984d6d7411ef75
Author: Takuto Ikuta <tikuta@chromium.org>
Date: Tue Nov 06 20:42:43 2018

[client] Add --filepath-filter option to choose download files from isolate

If I use this option like below, downloading subset files of webkit_layout_tests become 3 seconds.
That is much faster than 9 min to download all isolate outputs.

$ time tools/swarming_client/swarming.py collect -S https://chromium-swarm.appspot.com --json task.json --task-output-dir=tmp --task-summary-json=summary.json --task-output-stdout=none --filepath-filter='^output.json$'
vm78-m9: 40e16cc5f27bcf10 0
vm222-m9: 40e16cb0049b4510 0
vm419-m1: 40e16cabf8539210 0
vm1405-m4: 40e16cae4b766f10 0
vm1456-m4: 40e16cb5e008a610 0
vm311-m4: 40e16cba51a68b10 0
vm180-m9: 40e16caa5ed5e210 0
vm131-m4: 40e16cb895348610 0
vm458-m4: 40e16cbda53f2910 0
vm91-m4: 40e16cc0a9655a10 0
vm449-m4: 40e16cc986923010 0
vm49-m4: 40e16cc43c95f010 0

real 0m3.497s
user 0m1.549s
sys 0m0.592s

Bug: 868878, 898216
Change-Id: I8f264bd263373241e48438426b623e89b3cf544e
Reviewed-on: https://chromium-review.googlesource.com/c/1309855
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
Commit-Queue: Takuto Ikuta <tikuta@chromium.org>

[modify] https://crrev.com/8c72886a52713f1b46f6e672be984d6d7411ef75/client/isolateserver.py
[modify] https://crrev.com/8c72886a52713f1b46f6e672be984d6d7411ef75/client/swarming.py
[modify] https://crrev.com/8c72886a52713f1b46f6e672be984d6d7411ef75/client/tests/swarming_test.py
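
To illustrate what a regexp path filter like the one in the commit above selects, here is a minimal sketch. It is not the code from the change: `filter_paths` and the file list are hypothetical, and the use of `re.search` is an assumption (the real client may match differently, although an anchored pattern such as '^output.json$' behaves the same either way).

```python
import re


def filter_paths(paths, filepath_filter):
    """Return only the paths matched by the regular expression.

    Hypothetical helper for illustration; not part of the swarming client.
    """
    pattern = re.compile(filepath_filter)
    return [p for p in paths if pattern.search(p)]


# Hypothetical file list standing in for one task's isolated output tree.
outputs = [
    'output.json',
    'layout-test-results/full_results.json',
    'logs/output.json.bak',
]

# Only the top-level output.json survives the anchored filter.
print(filter_paths(outputs, '^output.json$'))
```

Note that the unescaped '.' in '^output.json$' matches any single character, so a stricter filter would be '^output\.json$'.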
Nov 6
robertocn@, you will be able to specify a regexp filter for isolated files once the swarming client is rolled. Does this fulfill your requirements?
Comment 1 by estaab@chromium.org, Oct 25