Soft device affinity for perf bots
Issue description
In perf, we need tests to run on exactly the same device from run to run to get accurate results (device affinity). We want to implement a slightly less brittle solution that chooses bots from a lame duck pool when a bot dies, so tests don't have to stop running while we wait for bots to come back up. See the soft device affinity section of this doc for more info: https://docs.google.com/document/d/1E3TsQRZeWGnYYAGCrB4kFTPPmfHVjuUJU0mKhMey8fY/edit#
Apr 10 2018
Apr 11 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/5e0e8dd9ae252135ba51c987ce275744a82713ec

commit 5e0e8dd9ae252135ba51c987ce275744a82713ec
Author: Emily Hanley <eyaich@google.com>
Date: Wed Apr 11 18:01:49 2018

Adding shard as a tag to perf swarming devices.

This is needed for soft device affinity so we have a way to know which bot each shard ran on last.

Bug: 831252
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: Idfde1b5f85816fdc9f496de2e9bbac4ef8298693
Reviewed-on: https://chromium-review.googlesource.com/1005721
Commit-Queue: Emily Hanley <eyaich@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#549929}

[modify] https://crrev.com/5e0e8dd9ae252135ba51c987ce275744a82713ec/testing/trigger_scripts/base_test_triggerer.py
[modify] https://crrev.com/5e0e8dd9ae252135ba51c987ce275744a82713ec/testing/trigger_scripts/perf_device_trigger.py
[modify] https://crrev.com/5e0e8dd9ae252135ba51c987ce275744a82713ec/testing/trigger_scripts/perf_device_trigger_unittest.py
Apr 24 2018
Apr 30 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3

commit 681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3
Author: Emily Hanley <eyaich@google.com>
Date: Mon Apr 30 17:36:21 2018

Soft device affinity implementation.

Now that shards as tags are being sent down on perf jobs, this code can stop sending specific device ids and instead smartly try to allocate perf jobs to bots based on the following:
1) what bot we last triggered it on, by querying swarming for a list of tasks based on the dimensions and shard
2) what bots are currently alive, by querying swarming for all bots with the given dimensions and checking that they are not quarantined and not is_dead.

This requires that the dimensions for perf hardware are unique, which they will be with the new set of devices that will run this recipe.

Bug: 831252
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I2bfb708a1bb65dbdf85c85b976c463605fb28335
Reviewed-on: https://chromium-review.googlesource.com/1017304
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Emily Hanley <eyaich@chromium.org>
Cr-Commit-Position: refs/heads/master@{#554807}

[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/buildbot/chromium.perf.fyi.json
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/buildbot/chromium.perf.json
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/trigger_scripts/base_test_triggerer.py
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/trigger_scripts/perf_device_trigger.py
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/trigger_scripts/perf_device_trigger_unittest.py
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/trigger_scripts/trigger_multiple_dimensions.py
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/tools/perf/core/perf_data_generator.py
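For readers following along, the allocation rule described in the commit message above can be sketched roughly as below. This is not the code in perf_device_trigger.py; it is a minimal illustration that takes the two swarming query results as plain Python data. The function name `select_bots`, the `last_bot_by_shard` mapping, and the exact dict keys (`bot_id`, `quarantined`, `is_dead`) are assumptions made for the example.

```python
def select_bots(last_bot_by_shard, bots, num_shards):
    """Pick one bot per shard, preferring the bot the shard last ran on.

    last_bot_by_shard: hypothetical dict of shard index -> bot id from the
        previous run (what the tasks query would tell us).
    bots: hypothetical list of dicts describing every bot that matches the
        dimensions, each with "bot_id" and optional "quarantined"/"is_dead".
    """
    # A bot is usable only if it is neither quarantined nor dead.
    healthy = [b["bot_id"] for b in bots
               if not b.get("quarantined") and not b.get("is_dead")]
    healthy_set = set(healthy)

    assignment = {}
    taken = set()

    # First pass: keep affinity wherever the previous bot is still healthy.
    for shard in range(num_shards):
        previous = last_bot_by_shard.get(shard)
        if previous in healthy_set and previous not in taken:
            assignment[shard] = previous
            taken.add(previous)

    # Second pass: shards whose bot died fall back to any unused healthy bot.
    spare = [bot_id for bot_id in healthy if bot_id not in taken]
    for shard in range(num_shards):
        if shard not in assignment:
            if not spare:
                raise RuntimeError("not enough healthy bots for all shards")
            assignment[shard] = spare.pop(0)

    return assignment
```

Running the affinity pass for all shards before handing out spares means a shard with a dead bot can never take over a bot that another shard still has affinity for.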
Apr 30 2018
May 7 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/8ce1edf89f8e3acd4d2540c165553d7262cc3e9f

commit 8ce1edf89f8e3acd4d2540c165553d7262cc3e9f
Author: Emily Hanley <eyaich@google.com>
Date: Mon May 07 15:07:10 2018

Using bot_id from tasks/list query response instead of parsing for id

From what I can see, bot_id is a guaranteed field in the query response, so this is safer than assuming the id comes back as a tag.

Bug: 831252
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I6cb0c48a1dce8ffcea35ce3ab65404366c450c9a
Reviewed-on: https://chromium-review.googlesource.com/1047065
Commit-Queue: Emily Hanley <eyaich@chromium.org>
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
Cr-Commit-Position: refs/heads/master@{#556440}

[modify] https://crrev.com/8ce1edf89f8e3acd4d2540c165553d7262cc3e9f/testing/trigger_scripts/perf_device_trigger.py
[modify] https://crrev.com/8ce1edf89f8e3acd4d2540c165553d7262cc3e9f/testing/trigger_scripts/perf_device_trigger_unittest.py
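As a rough illustration of the change above, reading the bot id straight from the task entry might look like the sketch below. The response shape (a dict with an `items` list whose entries carry `bot_id`, most recent task first) and the helper name `last_bot_id` are assumptions for this example, not taken from the actual script.

```python
def last_bot_id(tasks_list_response):
    """Return the bot id of the first listed task, or None if unavailable.

    tasks_list_response: assumed to be the decoded JSON of a tasks/list
    query, shaped like {"items": [{"bot_id": "...", ...}, ...]} with the
    most recent task first (both points are assumptions of this sketch).
    """
    items = tasks_list_response.get("items") or []
    for task in items:
        # Read the dedicated field rather than searching the task's tags
        # for an id entry, which may or may not be present.
        bot_id = task.get("bot_id")
        if bot_id:
            return bot_id
    return None
```

Returning None when no task carries a bot id lets the caller treat the shard as having no previous bot and fall back to any healthy one.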
May 16 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/8661ce9e0bc0f5e47f2c8914d6d7b3bf4e44d133

commit 8661ce9e0bc0f5e47f2c8914d6d7b3bf4e44d133
Author: Emily Hanley <eyaich@google.com>
Date: Wed May 16 17:27:18 2018

turning on soft device affinity for low end mac and linux on chromium.perf

Bug: 831252
Change-Id: Ib447d09769d5ea4de8079a494cfc2e6ceaad5423
Reviewed-on: https://chromium-review.googlesource.com/1061581
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Emily Hanley <eyaich@chromium.org>
Cr-Commit-Position: refs/heads/master@{#559164}

[modify] https://crrev.com/8661ce9e0bc0f5e47f2c8914d6d7b3bf4e44d133/testing/buildbot/chromium.perf.json
[modify] https://crrev.com/8661ce9e0bc0f5e47f2c8914d6d7b3bf4e44d133/tools/perf/core/perf_data_generator.py
Sep 3
Comment 1 by nednguyen@chromium.org, Apr 10 2018