
Issue 854610

Starred by 3 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




isolate: the tar file archiver should bucket more

Project Member Reported by mar...@chromium.org, Jun 20 2018

Issue description

Currently, the tar archiver tars all files below 100kb.
Bucketing is done per input entry listed in the isolate file: if "test/data/" is listed as an input in the isolate file, only files under this directory are grouped together. If there were a second entry "out/Release/", its files would be tarred independently.

This generally works great. However, when "test/data/" is several GiB in size and a single file in this directory is updated often, the whole thing is likely to be tarred and uploaded again, which is suboptimal.
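
For illustration, the grouping described above can be sketched roughly as follows. This is not the luci-go code: the `file` type, `groupByEntry`, and `smallFileThreshold` are hypothetical names, and reading "100kb" as 100 * 1024 bytes is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// file is a hypothetical stand-in for an input file; it is not a luci-go type.
type file struct {
	Path string
	Size int64
}

// smallFileThreshold stands for the "100kb" limit from the description.
// Whether the real limit is 100*1024 or 100*1000 bytes is an assumption.
const smallFileThreshold = 100 * 1024

// groupByEntry puts each small file into the group of the first isolate
// input entry that is a prefix of its path. Files at or above the
// threshold, and files matching no entry, are returned separately.
func groupByEntry(entries []string, files []file) (map[string][]file, []file) {
	groups := make(map[string][]file)
	var individual []file
	for _, f := range files {
		if f.Size >= smallFileThreshold {
			individual = append(individual, f)
			continue
		}
		matched := false
		for _, e := range entries {
			if strings.HasPrefix(f.Path, e) {
				groups[e] = append(groups[e], f)
				matched = true
				break
			}
		}
		if !matched {
			individual = append(individual, f)
		}
	}
	return groups, individual
}

func main() {
	entries := []string{"test/data/", "out/Release/"}
	files := []file{
		{"test/data/a.txt", 10},
		{"test/data/sub/b.txt", 20},
		{"out/Release/gen.h", 30},
		{"out/Release/chrome", 200 * 1024},
	}
	groups, individual := groupByEntry(entries, files)
	// Number of small files grouped under each entry, then number of files
	// left to be handled individually.
	fmt.Println(len(groups["test/data/"]), len(groups["out/Release/"]), len(individual))
}
```

With this kind of grouping, every small file under "test/data/" lands in the same group, so a change to any one of them changes the content of that group's tar.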


Goal:
Improve the efficiency of incremental uploads of large input trees with a low (but non-zero) churn rate. This matters especially for layout tests.


AIs:
- Make the bucketing algorithm trade off in favor of making more tar files, grouped by subdirectory in a relatively deterministic way.
- Detect single-item tar files and do not tar them. The current code may degenerate in this situation. In fact, groups of fewer than 4 (?) items should probably never be tarred; it's likely not worth it.
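
One possible shape for these two action items, as a minimal sketch only. `bucketByDir` and `minTarItems` are hypothetical names, grouping by immediate parent directory is just one candidate policy, and the value 4 is the unconfirmed guess from the item above.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// minTarItems is the "4 (?)" guess from the action items, not a tuned value.
const minTarItems = 4

// bucketByDir groups slash-separated file paths by their immediate parent
// directory. Buckets with fewer than minItems files are not tarred; their
// files are returned in loose. Paths are sorted inside each bucket so that
// the result does not depend on the order of the input.
func bucketByDir(paths []string, minItems int) (tars map[string][]string, loose []string) {
	byDir := make(map[string][]string)
	for _, p := range paths {
		d := path.Dir(p)
		byDir[d] = append(byDir[d], p)
	}
	tars = make(map[string][]string)
	for d, ps := range byDir {
		if len(ps) < minItems {
			loose = append(loose, ps...)
			continue
		}
		sort.Strings(ps)
		tars[d] = ps
	}
	sort.Strings(loose)
	return tars, loose
}

func main() {
	paths := []string{
		"test/data/a/1.html", "test/data/a/2.html", "test/data/a/3.html", "test/data/a/4.html",
		"test/data/b/1.html", "test/data/b/2.html", "test/data/b/3.html", "test/data/b/4.html",
		"test/data/c/only.html",
	}
	tars, loose := bucketByDir(paths, minTarItems)
	// Number of tar buckets, then the files that stay untarred.
	fmt.Println(len(tars), loose)
}
```

Under this policy, a change to "test/data/a/1.html" would only affect the bucket for "test/data/a", and the single file in "test/data/c" is never tarred. A real implementation would likely also need to bound the number of buckets and handle deep or very small directories.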


References:

Selection of files to tar:
https://chromium.googlesource.com/infra/luci/luci-go/+/44ec31d1076c4f57e848afd199a8cf9f0ab30af5/client/archiver/tarring_archiver.go#140

Bucketing algorithm:
https://chromium.googlesource.com/infra/luci/luci-go/+/44ec31d1076c4f57e848afd199a8cf9f0ab30af5/client/archiver/tar_archiver.go#41

Uploader of the tarred files:
https://chromium.googlesource.com/infra/luci/luci-go/+/44ec31d1076c4f57e848afd199a8cf9f0ab30af5/client/archiver/upload_tracker.go#137
 
