Querying swarming for webview vs non webview bots is indistinguishable
Issue description

Currently, the soft device affinity algorithm for perf assumes that the dimensions for a specific configuration are unique within the swarming pool. The basic algorithm is as follows:

1) Query swarming for all bots that match the configuration dimensions <os, gpu, pool, (device_type), (device_os)>.
2) For each shard, query swarming for the last task that ran that shard with the given dimensions + the shard tag. Note: perf includes the shard as a tag on each job we trigger.
3) If that bot is alive, trigger on that bot. If it is dead, try to find an alive bot that doesn't already have a task triggered on it, based on the set that came back in #1 and the set found in #2. If the shard hasn't been triggered yet, choose any bot that isn't allocated.

The problem is that for webview the device itself is identical, meaning the dimensions are identical; only the isolate that we trigger on the bots is different. Therefore the set of devices that comes back in #1 will include the bots allocated to both webview and non-webview, and they will appear as "alive available" bots to our algorithm since they won't get allocated in this run. We therefore have to figure out a way to distinguish which bots are for webview vs non-webview.

Potential solutions:

1) Create a new swarming pool for just the bots allocated to webview so the dimensions are the
2) Before we allocate to a "new" alive bot, query for ANY tasks that have been run on that bot and make sure they are from the same configuration (the configuration comes in as a tag from the swarming recipe). If the bot was previously allocated to a different configuration at any time in the past, don't choose it.

#1 is very straightforward, and there are no heuristics that we could be missing since it is a hard cutoff. #2 will add latency to the job, and there might be corner cases with new bots that get added with no task history, as well as the potential for one configuration to get starved.
I think we are leaning towards #1 for a first pass.
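The algorithm and the effect of solution #1 can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the actual perf trigger code: the `Bot` class, `matching_bots`, `assign_shards`, `make_bots`, the bot IDs, the `"perf"` pool name, and the "lowest bot ID first" tie-break are all hypothetical, and the swarming queries are replaced by plain Python data. Only the pool name `chromium.tests.perf-webview` comes from the commit referenced in this bug.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Bot:
    bot_id: str
    dimensions: Dict[str, str]
    alive: bool = True


def matching_bots(bots: List[Bot], dimensions: Dict[str, str]) -> List[Bot]:
    """Step 1: bots whose dimensions contain every requested key/value."""
    return [b for b in bots
            if all(b.dimensions.get(k) == v for k, v in dimensions.items())]


def assign_shards(bots: List[Bot], dimensions: Dict[str, str],
                  last_bot_for_shard: Dict[int, str],
                  num_shards: int) -> Dict[int, str]:
    """Steps 2-3: keep each shard on its previous bot when that bot is alive.

    last_bot_for_shard stands in for the per-shard "last task" query.
    """
    candidates = {b.bot_id: b for b in matching_bots(bots, dimensions)}
    assignment: Dict[int, str] = {}
    # Shards whose previous bot is alive (and still matches) stay where they were.
    for shard in range(num_shards):
        prev = last_bot_for_shard.get(shard)
        if prev in candidates and candidates[prev].alive:
            assignment[shard] = prev
    # Remaining shards take any alive candidate that is not already allocated.
    for shard in range(num_shards):
        if shard in assignment:
            continue
        used = set(assignment.values())
        free = sorted(i for i, b in candidates.items()
                      if b.alive and i not in used)
        if not free:
            raise RuntimeError("no free bot for shard %d" % shard)
        assignment[shard] = free[0]  # assumed tie-break: lowest bot ID
    return assignment


def make_bots(webview_pool: str) -> List[Bot]:
    """build3 is the bot reserved for webview; only its pool name varies."""
    return [
        Bot("build1", {"os": "Android", "pool": "perf"}),
        Bot("build2", {"os": "Android", "pool": "perf"}, alive=False),
        Bot("build3", {"os": "Android", "pool": webview_pool}),
        Bot("build4", {"os": "Android", "pool": "perf"}),
    ]


dims = {"os": "Android", "pool": "perf"}   # non-webview configuration
last = {0: "build1", 1: "build2"}          # shard 1 last ran on a dead bot

# Shared pool: the webview bot looks like a free non-webview bot.
shared = assign_shards(make_bots("perf"), dims, last, 2)
# Separate pool (solution #1): the webview bot never matches in step 1.
separated = assign_shards(make_bots("chromium.tests.perf-webview"), dims, last, 2)

print(shared)     # {0: 'build1', 1: 'build3'}
print(separated)  # {0: 'build1', 1: 'build4'}
```

In the shared-pool case, shard 1 lands on `build3` even though that bot is meant for webview. Once the webview bot has its own pool dimension, it is filtered out in step 1, with no extra queries and no task-history heuristics.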
May 2 2018
The following revision refers to this bug:
https://chrome-internal.googlesource.com/infradata/config/+/30f53a79b78aac4b93a2a448b0f3929a01cd3c96

commit 30f53a79b78aac4b93a2a448b0f3929a01cd3c96
Author: Emily Hanley <eyaich@google.com>
Date: Wed May 02 12:50:28 2018
May 2 2018
The following revision refers to this bug:
https://chrome-internal.googlesource.com/infradata/config/+/803bf6caa757a4cc4319b6820c95a2cd2b392cf6

commit 803bf6caa757a4cc4319b6820c95a2cd2b392cf6
Author: Emily Hanley <eyaich@google.com>
Date: Wed May 02 13:06:21 2018
May 2 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/bf2a489726c105eda1f64f5cc282b32c38d63c69

commit bf2a489726c105eda1f64f5cc282b32c38d63c69
Author: Emily Hanley <eyaich@google.com>
Date: Wed May 02 14:57:53 2018

Updating weview bots to use chromium.tests.perf-webview

Bug: 838302
Change-Id: I41a1640a1415c19e1b48ce442ee309632b7f4c90
Reviewed-on: https://chromium-review.googlesource.com/1036935
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Emily Hanley <eyaich@chromium.org>
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Cr-Commit-Position: refs/heads/master@{#555387}

[modify] https://crrev.com/bf2a489726c105eda1f64f5cc282b32c38d63c69/testing/buildbot/chromium.perf.json
[modify] https://crrev.com/bf2a489726c105eda1f64f5cc282b32c38d63c69/tools/perf/core/perf_data_generator.py
Aug 3
This bug has an owner, thus, it's been triaged. Changing status to "assigned".
Comment 1 by eyaich@chromium.org, Apr 30 2018