
Issue 831252

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Sep 3
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on:
issue 838302
issue 781021




Soft device affinity for perf bots

Project Member Reported by eyaich@chromium.org, Apr 10 2018

Issue description

Perf tests need to run on exactly the same device from run to run (device affinity) to produce accurate results.

We want to implement a slightly less brittle solution: when a bot dies, choose a replacement from a lame duck pool so tests don't have to stop running while we wait for the bot to come back up.

See the soft device affinity section of this doc for more info: https://docs.google.com/document/d/1E3TsQRZeWGnYYAGCrB4kFTPPmfHVjuUJU0mKhMey8fY/edit#


 
Cc: mar...@chromium.org

Comment 2 by mar...@chromium.org, Apr 10 2018

Blockedon: 781021
Project Member

Comment 3 by bugdroid1@chromium.org, Apr 11 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5e0e8dd9ae252135ba51c987ce275744a82713ec

commit 5e0e8dd9ae252135ba51c987ce275744a82713ec
Author: Emily Hanley <eyaich@google.com>
Date: Wed Apr 11 18:01:49 2018

Adding shard as a tag to perf swarming devices.

This is needed for soft device affinity so we have a way to know
which bot each shard ran on last.

Bug:  831252 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: Idfde1b5f85816fdc9f496de2e9bbac4ef8298693
Reviewed-on: https://chromium-review.googlesource.com/1005721
Commit-Queue: Emily Hanley <eyaich@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#549929}
[modify] https://crrev.com/5e0e8dd9ae252135ba51c987ce275744a82713ec/testing/trigger_scripts/base_test_triggerer.py
[modify] https://crrev.com/5e0e8dd9ae252135ba51c987ce275744a82713ec/testing/trigger_scripts/perf_device_trigger.py
[modify] https://crrev.com/5e0e8dd9ae252135ba51c987ce275744a82713ec/testing/trigger_scripts/perf_device_trigger_unittest.py
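
For readers following along, here is a minimal sketch of what tagging a triggered task with its shard number could look like. It is not the code from the commit above: the helper name `append_shard_tag`, the `shard:N` tag format, the `--tags` flag name, and the handling of a `--` separator are all assumptions made for illustration.

```python
def append_shard_tag(args, shard_index, tag_flag='--tags'):
    """Return a copy of args with a shard tag added.

    Hypothetical helper. The flag name and the 'shard:N' format are
    assumptions, not confirmed by the commit message.
    """
    shard_tag = [tag_flag, 'shard:%d' % shard_index]
    # Assumption: anything after '--' is passed to the test itself, so the
    # tag has to be placed before the separator to reach the trigger tool.
    if '--' in args:
        split = args.index('--')
        return args[:split] + shard_tag + args[split:]
    return args + shard_tag


if __name__ == '__main__':
    print(append_shard_tag(['trigger', '--', 'some_test'], 2))
```

With a tag like this on every task, a later query filtered on the same dimensions and shard tag can find which bot ran that shard most recently.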

Status: Started (was: Untriaged)
Project Member

Comment 5 by bugdroid1@chromium.org, Apr 30 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3

commit 681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3
Author: Emily Hanley <eyaich@google.com>
Date: Mon Apr 30 17:36:21 2018

Soft device affinity implementation.

Now that shards are being sent down as tags on perf jobs, this code can stop sending
specific device ids and instead tries to allocate perf jobs to bots based on
the following:

1) what bot we last triggered it on by querying swarming for a list of tasks based
on the dimensions and shard
2) what bots are currently alive by querying swarming for all bots with the given
dimensions and checking that they are not quarantined and not is_dead.

This requires that the dimensions for perf hardware are unique, which they will be with
the new set of devices that will run this recipe.

Bug:  831252 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I2bfb708a1bb65dbdf85c85b976c463605fb28335
Reviewed-on: https://chromium-review.googlesource.com/1017304
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Emily Hanley <eyaich@chromium.org>
Cr-Commit-Position: refs/heads/master@{#554807}
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/buildbot/chromium.perf.fyi.json
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/buildbot/chromium.perf.json
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/trigger_scripts/base_test_triggerer.py
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/trigger_scripts/perf_device_trigger.py
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/trigger_scripts/perf_device_trigger_unittest.py
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/testing/trigger_scripts/trigger_multiple_dimensions.py
[modify] https://crrev.com/681d1d4e820f5399f19b8bcb0e4b1e45a1abe0d3/tools/perf/core/perf_data_generator.py
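
The two-step allocation described in the commit message above can be sketched roughly as follows. This is an illustrative sketch only, not the code in perf_device_trigger.py: the function name `assign_bots`, the input shapes, and the tie-breaking order are assumptions. The swarming queries are left out; the sketch takes their results as plain Python data, using the `bot_id`, `is_dead` and `quarantined` field names mentioned in this thread.

```python
def assign_bots(num_shards, last_bot_by_shard, bots):
    """Map each shard index to a bot id, preferring the bot it last ran on.

    last_bot_by_shard: dict of shard index -> bot id from the previous run
        (hypothetical shape; would come from a tasks query per shard).
    bots: list of dicts with 'bot_id' and optional 'is_dead' and
        'quarantined' flags (hypothetical shape; would come from a bots query).
    """
    # Step 2 from the description: only bots that are alive and not
    # quarantined are eligible.
    healthy = [b['bot_id'] for b in bots
               if not b.get('is_dead') and not b.get('quarantined')]
    healthy_set = set(healthy)

    assignment = {}
    used = set()
    # Step 1 from the description: keep a shard on its previous bot when
    # that bot is still healthy and not already claimed by another shard.
    for shard in range(num_shards):
        prev = last_bot_by_shard.get(shard)
        if prev in healthy_set and prev not in used:
            assignment[shard] = prev
            used.add(prev)

    # Remaining shards fall back to any spare healthy bot.
    spare = [bot_id for bot_id in healthy if bot_id not in used]
    for shard in range(num_shards):
        if shard not in assignment:
            if not spare:
                raise RuntimeError('not enough healthy bots for shard %d' % shard)
            assignment[shard] = spare.pop(0)
    return assignment


if __name__ == '__main__':
    bots = [
        {'bot_id': 'a'},
        {'bot_id': 'b', 'is_dead': True},
        {'bot_id': 'c'},
        {'bot_id': 'd', 'quarantined': True},
        {'bot_id': 'e'},
    ]
    print(assign_bots(3, {0: 'a', 1: 'b', 2: 'c'}, bots))
```

In this example shards 0 and 2 stay on their previous bots, while shard 1 moves to a spare bot because its previous bot is dead. This only works if the dimensions select perf hardware exclusively, which is the uniqueness requirement the commit message calls out.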

Comment 6 by eyaich@chromium.org, Apr 30 2018

Blockedon: 838302
Project Member

Comment 7 by bugdroid1@chromium.org, May 7 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8ce1edf89f8e3acd4d2540c165553d7262cc3e9f

commit 8ce1edf89f8e3acd4d2540c165553d7262cc3e9f
Author: Emily Hanley <eyaich@google.com>
Date: Mon May 07 15:07:10 2018

Using bot_id from tasks/list query response instead of parsing for id

From what I can see, bot_id is a guaranteed field in the query response
vs assuming it comes back as a tag.

Bug:  831252 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I6cb0c48a1dce8ffcea35ce3ab65404366c450c9a
Reviewed-on: https://chromium-review.googlesource.com/1047065
Commit-Queue: Emily Hanley <eyaich@chromium.org>
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
Cr-Commit-Position: refs/heads/master@{#556440}
[modify] https://crrev.com/8ce1edf89f8e3acd4d2540c165553d7262cc3e9f/testing/trigger_scripts/perf_device_trigger.py
[modify] https://crrev.com/8ce1edf89f8e3acd4d2540c165553d7262cc3e9f/testing/trigger_scripts/perf_device_trigger_unittest.py
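
As a rough illustration of the change described above, the sketch below reads `bot_id` directly from each task entry instead of searching the entry's tags for an id. The response shape is an assumption: an `items` list of task dicts, newest first, each with a `bot_id` field and a `tags` list of `key:value` strings. The helper names are hypothetical, and the tag-based variant is included only for contrast.

```python
import json


def last_bot_from_field(response_text):
    """Return bot_id of the first task in the response, or None if empty."""
    items = json.loads(response_text).get('items') or []
    if not items:
        return None
    return items[0].get('bot_id')


def last_bot_from_tags(response_text):
    """Older style for contrast: look for an 'id:<bot>' tag on the task.

    Returns None when the tag is missing, which is the fragility the
    commit message refers to.
    """
    items = json.loads(response_text).get('items') or []
    if not items:
        return None
    for tag in items[0].get('tags', []):
        key, _, value = tag.partition(':')
        if key == 'id':
            return value
    return None


if __name__ == '__main__':
    sample = json.dumps({
        'items': [{'bot_id': 'build1-a1', 'tags': ['shard:0']}],
    })
    print(last_bot_from_field(sample))
    print(last_bot_from_tags(sample))
```

In the sample, the task carries no `id:` tag, so the tag-based lookup finds nothing while the field-based lookup still returns the bot.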

Project Member

Comment 8 by bugdroid1@chromium.org, May 16 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8661ce9e0bc0f5e47f2c8914d6d7b3bf4e44d133

commit 8661ce9e0bc0f5e47f2c8914d6d7b3bf4e44d133
Author: Emily Hanley <eyaich@google.com>
Date: Wed May 16 17:27:18 2018

turning on soft device affinity for low end mac and linux on chromium.perf

Bug:  831252 
Change-Id: Ib447d09769d5ea4de8079a494cfc2e6ceaad5423
Reviewed-on: https://chromium-review.googlesource.com/1061581
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Emily Hanley <eyaich@chromium.org>
Cr-Commit-Position: refs/heads/master@{#559164}
[modify] https://crrev.com/8661ce9e0bc0f5e47f2c8914d6d7b3bf4e44d133/testing/buildbot/chromium.perf.json
[modify] https://crrev.com/8661ce9e0bc0f5e47f2c8914d6d7b3bf4e44d133/tools/perf/core/perf_data_generator.py

Status: Fixed (was: Started)
