
Issue 665529

Starred by 1 user

Issue metadata

Status: Duplicate
Merged: issue 665142
Owner: ----
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 3
Type: Bug

Blocked on:
issue 673341

Blocking:
issue 667736
issue 667772




Perf S5 CQ bots take too long

Project Member Reported by fmea...@chromium.org, Nov 15 2016

Issue description

Cc: jbudorick@chromium.org
From John: using Samsung S5s is not a great idea because they're flaky and the Android infra team doesn't manage them well. I think we should consider switching this to Nexus 5X.
We're talking about S5s (among other things) later today. Hold off on doing device switches for now.
Cc: dtu@chromium.org robert...@chromium.org
Thanks, John!

Ned, do you think if we were able to get more devices + more reliable devices it would definitely be worth keeping these on the CQ? Any idea how many devices we'd need?

+robertocn, dtu: do the CQ bots require 1 host per device?
From a stability point of view, we regularly run our smoke tests on Android Nexus 5 and Nexus 5X on the CQ, but not on Samsung devices. Given that we are not going to have android_s5_rel_ng anytime soon, and given the high failure rate we have seen so far, I don't think it's worth keeping the Samsung S5 on CQ_EXTRA_TRYBOT.
Removing the S5s sgtm.
To echo John's comment in #5, we are going to remove the S5s from the perf waterfall until all the nexus devices we have there are more stable, and then re-evaluate adding more non-nexus devices.

So we should replace the S5 CQ bots we have with a more stable device, and get some redundancy. Does N5X seem like a good pick to everyone? Ned, how many? Roberto, do we need 1:1 host:device or could we have multiple devices used at the same time on one host?
Cc: eyaich@chromium.org
In the short term, I would say a single configuration of 1 host + 7 devices is more than enough. These only get triggered when people create new benchmarks/change benchmarks which is not super often.
In the longer run: once we have swarming everywhere, I would advocate using the same swarming pool we use for bisect.
Roberto, Dave, can we run CQ jobs in parallel if a bot has multiple devices? I don't think we can, which would mean we'd need to set up multiple hosts. If so, Ned, what is the minimum redundancy you think we need?
Annie: I would base it on how often several people happen to change a benchmark at the same time.

"git log --since=9/16/2016 --oneline -- tools/perf/benchmarks/ | wc -l" shows that we have 56 commits to the benchmarks/ folder in the last 2 months. That works out to about 1 commit to the benchmarks/ folder per day. Assuming 1 host + 7 devices lets us run any benchmark module in a reasonable amount of time [1], I think 2 hosts is good enough.

[1] only holds for benchmark modules that do not contain too many benchmarks. The smoothness.py module, for example, contains 29 smoothness benchmarks; running all of them takes a long time and increases the failure rate. The team should consider either (1) a better way of inferring which benchmarks should be smoke-tested upon a change or (2) splitting smoothness.py into multiple files.
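
The commit-rate estimate in the comment above can be checked with a few lines of Python. This is only a sketch: the count of 56 and the start date 9/16/2016 come from the comment, while the end date (the day this issue was reported) and the helper name commits_per_day are assumptions made for illustration.

```python
from datetime import date


def commits_per_day(commit_count: int, since: date, until: date) -> float:
    """Return the average number of commits per calendar day between two dates."""
    days = (until - since).days
    if days <= 0:
        raise ValueError("until must be later than since")
    return commit_count / days


# 56 is the count quoted from the git log command in the comment.
# The end date is an assumption: the day the issue was reported.
rate = commits_per_day(56, date(2016, 9, 16), date(2016, 11, 15))
print(f"{rate:.2f} commits/day")
```

Under those assumptions the window is 60 days, so the rate comes out slightly below one commit per day, which matches the rough figure used in the comment.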
Owner: eyaich@chromium.org
Emily is going to work on this.
Blocking: 667772
Blocking: 667736
Project Member

Comment 13 by bugdroid1@chromium.org, Nov 22 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d5dd88f528645f6413aab414894d3eada2135614

commit d5dd88f528645f6413aab414894d3eada2135614
Author: charliea <charliea@chromium.org>
Date: Tue Nov 22 17:33:12 2016

Remove android_s5_perf_cq from tools/perf presubmits

eyaich@ is currently working on a more thorough dismantling of the
Android S5 perfbots. In the meanwhile, we can stop requiring tools/perf
changes to run through the (flaky) CQ.

BUG= 665529 

Review-Url: https://codereview.chromium.org/2520353003
Cr-Commit-Position: refs/heads/master@{#433904}

[modify] https://crrev.com/d5dd88f528645f6413aab414894d3eada2135614/tools/perf/PRESUBMIT.py

Cc: simonhatch@chromium.org
Owner: ----
Emily is getting pretty swamped before her leave. Simon or Dave, would one of you be able to take a look at setting up more stable CQ bots? We should also talk about how this would work in pinpoint.
Is the next step for this bug to remove the Samsung S5 trybot?
My understanding is that Emily removed the Samsung S5 trybot and the next steps are:

* To add an N5X one, ideally with some redundancy so that we can have multiple jobs in CQ at once
* To refactor the code to only run a maximum number of benchmarks so that a large refactor doesn't need to run for hours
* To consider switching the desktop CQ bots to VMs to get more parallelism as well
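
The second step in the list above (capping how many benchmarks a single change can trigger) could look roughly like the sketch below. It is not the actual tools/perf code: the function select_smoke_benchmarks, the module-to-benchmark mapping, and the benchmark names are all hypothetical and only illustrate the idea of a hard upper bound.

```python
from typing import Dict, Iterable, List


def select_smoke_benchmarks(
    changed_modules: Iterable[str],
    benchmarks_by_module: Dict[str, List[str]],
    max_benchmarks: int,
) -> List[str]:
    """Pick benchmarks from the changed modules, stopping at max_benchmarks."""
    selected: List[str] = []
    for module in changed_modules:
        for name in benchmarks_by_module.get(module, []):
            if len(selected) >= max_benchmarks:
                return selected
            selected.append(name)
    return selected


# Hypothetical data: a large module with 29 entries and a small one with 2.
benchmarks_by_module = {
    "smoothness.py": [f"smoothness.case_{i}" for i in range(29)],
    "example.py": ["example.first", "example.second"],
}

capped = select_smoke_benchmarks(
    ["smoothness.py", "example.py"], benchmarks_by_module, max_benchmarks=5
)
print(capped)
```

With a simple first-come cap like this, a large module listed first can use up the whole budget, so a real implementation might prefer to interleave modules or sample from each one.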
So no, I never got started on this bug, but I just sent out a CL to actually remove them from master.tryserver.chromium.perf: https://chromium-review.googlesource.com/c/415521/

After that a restart will take it off the waterfall and then I will file a ticket with labs to actually remove it from the lab.  
Project Member

Comment 18 by bugdroid1@chromium.org, Dec 2 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build.git/+/00e9a1c210ed9aa63f5c5ebbb6073f289a21ea01

commit 00e9a1c210ed9aa63f5c5ebbb6073f289a21ea01
Author: Emily Hanley <eyaich@google.com>
Date: Fri Dec 02 16:37:55 2016

Removing android_s5_perf_cq bot from master.tryserver.chromium.perf

BUG= chromium:665529 

Change-Id: I19fd87702c839eeb9e16fe19be66566ae0ac0e21
Reviewed-on: https://chromium-review.googlesource.com/415521
Commit-Queue: Emily Hanley <eyaich@chromium.org>
Reviewed-by: Mike Stipicevic <stip@chromium.org>

[modify] https://crrev.com/00e9a1c210ed9aa63f5c5ebbb6073f289a21ea01/masters/master.tryserver.chromium.perf/master.cfg
[modify] https://crrev.com/00e9a1c210ed9aa63f5c5ebbb6073f289a21ea01/masters/master.tryserver.chromium.perf/slaves.cfg
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect.expected/basic_perf_tryjob_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect.expected/basic_perf_tryjob_with_metric_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect.expected/basic_perf_tryjob_with_revisions_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect.expected/basic_recipe_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect.expected/perf_cq_no_benchmark_to_run_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect.expected/perf_cq_no_changes_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect.expected/perf_cq_run_benchmark_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect.expected/perf_tryjob_config_error_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect.expected/perf_tryjob_failed_test_android_s5_perf_cq.json
[modify] https://crrev.com/00e9a1c210ed9aa63f5c5ebbb6073f289a21ea01/scripts/slave/recipes/bisection/android_bisect.py
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect_staging.expected/basic_perf_tryjob_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect_staging.expected/basic_perf_tryjob_with_metric_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect_staging.expected/basic_perf_tryjob_with_revisions_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect_staging.expected/basic_recipe_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect_staging.expected/perf_cq_no_benchmark_to_run_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect_staging.expected/perf_cq_no_changes_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect_staging.expected/perf_cq_run_benchmark_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect_staging.expected/perf_tryjob_config_error_android_s5_perf_cq.json
[delete] https://crrev.com/bfafef31482c11bfdf10fc56a00ce85623e020de/scripts/slave/recipes/bisection/android_bisect_staging.expected/perf_tryjob_failed_test_android_s5_perf_cq.json
[modify] https://crrev.com/00e9a1c210ed9aa63f5c5ebbb6073f289a21ea01/scripts/slave/recipes/bisection/android_bisect_staging.py

Project Member

Comment 19 by bugdroid1@chromium.org, Dec 6 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6f9637e257b44580340ea9c0b8b96920c093db7c

commit 6f9637e257b44580340ea9c0b8b96920c093db7c
Author: eyaich <eyaich@chromium.org>
Date: Tue Dec 06 14:14:45 2016

Removing android_s5_perf_cq bot from mb config map

BUG= chromium:665529 

Review-Url: https://codereview.chromium.org/2550713002
Cr-Commit-Position: refs/heads/master@{#436586}

[modify] https://crrev.com/6f9637e257b44580340ea9c0b8b96920c093db7c/tools/mb/mb_config.pyl

Blockedon: 673341
This bot is officially removed from the configuration, so someone is free to take on the next steps that Annie outlined.
Mergedinto: 665142
Status: Duplicate (was: Untriaged)
This is now a dup.
Project Member

Comment 23 by bugdroid1@chromium.org, Feb 2 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome-golo/chrome-golo/+/8043883b68b06bde93d81f864b708a4c00ad6e60

commit 8043883b68b06bde93d81f864b708a4c00ad6e60
Author: Peter Schmidt <pschmidt@google.com>
Date: Thu Feb 02 18:54:39 2017
