
Issue 697444 link

Starred by 1 user

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug

Blocked on:
issue 719631
issue 719629




Understand why benchmarks were dropped during team's move to generate_perf_json.py

Project Member Reported by sullivan@chromium.org, Mar 1 2017

Issue description

From https://bugs.chromium.org/p/chromium/issues/detail?id=697195#c2:

"the Mac bots on the chromium.perf waterfall that used to run [performance_browser_tests] were removed at least six months ago. Consequently during the team's move to src/tools/perf/generate_perf_json.py the Mac entries were dropped."

Looking at an old version of the file:
https://chromium.googlesource.com/chromium/src/+/6aacb58a448d503715acff53cea8abe86f51aea2/testing/buildbot/chromium.perf.json

We used to run performance_browser_tests on 10.8 and 10.9, but when these bots were upgraded to 10.10 and above, it looks like we just forgot to add the tests to the new configurations. How will we avoid this in the future when updating to 10.13 and beyond?
 
Cc: nednguyen@chromium.org
Owner: ----
Hmhh, using manual bisect, I find that https://codereview.chromium.org/2277003004 is when the benchmark got dropped. This tells me that, with the huge number of benchmarks & bots that we have, chromium.perf.json is not a suitable form for humans to understand what happens to benchmarks. This is especially risky given that it's generated by the generate_perf_json.py script, which can produce a huge diff.

Not sure what to do about this yet. One idea is that we could generate a smaller validation file that lists benchmarks by the bots they run on, to make it easier for reviewers to look at the diff. The file would look something like:

benchmark A: 'Android Nexus 5...' , 'Mac ...', '....'
benchmark B: '...
......
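
(For illustration only, a minimal sketch of how such a summary could be produced by inverting the generated chromium.perf.json -- this assumes each builder entry maps test-type keys to lists of test dicts carrying a 'name' or 'test' field, which is only an approximation of the real schema:)

  import collections
  import json

  def benchmark_to_bots(perf_json_path):
    # Build a benchmark -> set-of-bots mapping from the generated config.
    summary = collections.defaultdict(set)
    with open(perf_json_path) as f:
      config = json.load(f)
    for bot, tests_by_type in config.items():
      if not isinstance(tests_by_type, dict):
        continue
      for test_list in tests_by_type.values():
        if not isinstance(test_list, list):
          continue
        for test in test_list:
          name = test.get('name') or test.get('test')
          if name:
            summary[name].add(bot)
    return summary

  for benchmark, bots in sorted(benchmark_to_bots('chromium.perf.json').items()):
    print('%s: %s' % (benchmark, ', '.join(sorted(bots))))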

Martniss, Annie: thoughts?
Cc: dpranke@chromium.org
+Dirk for opinions 
Note that it's a little more subtle than the test getting dropped in https://codereview.chromium.org/2277003004 -- the bots got upgraded from 10.8->10.10 and 10.9->10.11 and we did not add new entries for these tests. That CL just removed the already-turned-down bots from the JSON file.

I also find the huge diffs generated by the script hard to read. I know Dirk has talked about pulling the telemetry tests into a single step; it seems like having simpler steps would help.
Cc: dtu@chromium.org
Annie, Dave: do either of you recall how the process of upgrading from 10.8 --> 10.10 and 10.9 --> 10.11 happened? If it was a mere "s/10.8/10.10" kind of CL, then I'd expect things would still work after the update.
Yeah, I think the big problem is that when replacing a bot there are a lot of configs to update after requesting that labs update the machine:

* The hostnames for slaves on chromium.perf and tryserver.chromium.perf
* The bisect bot mapping from bot to tryserver
* This file
* Likely some more things I'm forgetting

It's just too easy to forget one of these steps. Telemetry has the high-level concept "this test should run on Mac", but that doesn't translate to the C++ perf tests.

Comment 6 by dtu@chromium.org, Mar 1 2017

It should be possible to add a high-level concept like that ("this test should run on Mac") to generate_perf_json.py. Would that be insufficient for the C++ perf tests?
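
(Purely as a sketch of the idea -- none of these names exist in generate_perf_json.py today; a platform-level spec might look something like this, so that a new Mac bot picks up the tests automatically:)

  # Hypothetical platform-level spec; names and structure are made up.
  GTEST_PERF_TESTS = {
    'performance_browser_tests': {'platforms': ['mac', 'win', 'linux']},
  }

  def gtest_tests_for_bot(bot_platform):
    # Expand the spec for one bot; upgrading 10.10 -> 10.13 would not
    # require touching the table above at all.
    return [{'test': name}
            for name, spec in sorted(GTEST_PERF_TESTS.items())
            if bot_platform in spec['platforms']]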
Cc: phajdan.jr@chromium.org
I consider the per-builder/per-step JSON configurations to be virtually unmanageable in their current form for all of the larger builders. Fewer steps would certainly help, but they don't solve the problem by themselves.

The GPU builders, for example, have a separate python script that generates the files from a more manageable form. I thought Perf did as well, but maybe I'm misremembering.

At any rate, I've asked Paweł to work on coming up with a more manageable format; see bug 662541.
Perf does have a script to generate the json files. The problem is that the diffs it creates are too large to be fully understood, so they're just blindly LGTMed, I believe (much like recipe expectations).
Yes :). The same thing is mostly true for the GPU expectations.
Dirk: should we expect that resolving bug 662541 will give us a format that makes the diff on the JSON config consumable by code reviewers, or should we invest more in making a more declarative form with generate_perf_json.py? (suggestion in #6)
Yes, my goals for bug 662541 would be to meet your needs so that we can get rid of generate_perf_json.py altogether and changes to the new files would be as reviewable as, say, changes to GN files.

We don't have a timeline or a design for that work, though, and so we'd have to evaluate whether it'd be better to wait for that, help work on that, or try to make some improvements to generate_perf_json in the meantime.

Comment 12 by a...@chromium.org, Mar 3 2017

Owner: dpranke@chromium.org
Status: Assigned (was: Untriaged)
avi@ - why did you assign this to me?
Components: Infra>Client
Owner: ----
Status: Available (was: Assigned)
Components: -Infra>Client Infra>Client>Perf

Comment 16 by m...@chromium.org, May 6 2017

BTW, bug 697195 is still open. So, this is also an "unbounded in time" problem. ;-)

What confuses me is why the data stoppage alerts didn't fire. Or maybe they did, and we ignored them because we knew the old bots were being taken down?

It seems the straightforward solution here is a tool that shows which tests were running on the old bots but not on the new ones. If there were even just a way to dump a list of the tests that run on each bot, you could do some simple shell commands like:

  (list_tests.py --bot='Mac 10.8' | sort | uniq) > /tmp/tests_that_run_on_10_8

  (list_tests.py --bot='Mac Pro 10.11' | sort | uniq) > /tmp/tests_that_run_on_10_11

  diff /tmp/tests_that_run_on_10_*
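
(A tool like that could be a short script over the generated JSON. Sketch only -- list_tests.py does not exist today, and this assumes, as before, that each builder maps test-type keys to lists of test dicts with a 'name'/'test' field:)

  # list_tests.py (hypothetical): print the test names configured for one
  # bot in chromium.perf.json.
  import argparse
  import json

  def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--bot', required=True)
    parser.add_argument('--config',
                        default='testing/buildbot/chromium.perf.json')
    args = parser.parse_args()
    with open(args.config) as f:
      bot_config = json.load(f).get(args.bot, {})
    for test_list in bot_config.values():
      if isinstance(test_list, list):
        for test in test_list:
          print(test.get('name') or test.get('test'))

  if __name__ == '__main__':
    main()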

Going back to some of the earlier comments, I think the history of the perf lab is adding a lot of benchmark and builder configs, but rarely updating or turning things down. We do not currently have good docs for these operations. When we want to update the OS version on a bot, what are all of the things we need to update before we can close the bug? When we remove a specific builder config, what needs to happen to the swarming and buildbot configs, and what magic scripts need to be run before we can close the bug? I'm extremely unsure when I have to do this; if I don't add the correct reviewer, I will miss something.

Maybe the answer here is just good docs? If that's the case, I can help work on that.
I definitely think better documentation & process would help a lot here. For one, I am pretty confused about which bot configs we even have.

Another core issue is that we have too many benchmarks & too many bot configs to create a static spec file that is understandable by a human. Right now no one really reviews changes to the chromium.perf.json files because they are too big to look at.
Blockedon: 719629 719631
Project Member

Comment 20 by bugdroid1@chromium.org, May 31 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8e0e80c3608ac2a08e87ead7d103578f6751f2f6

commit 8e0e80c3608ac2a08e87ead7d103578f6751f2f6
Author: miu <miu@chromium.org>
Date: Wed May 31 03:35:57 2017

Fix breakages while performance_browser_tests was not running on Mac.

In our vain attempt to re-enable performance_browser_tests on the Mac
bots, a number of breakages seem to have happened over the months where
we were not even aware the tests weren't running.

This change fixes all of the following:

1. Adds a BUILD dependency on //chrome:chrome_app. Without this, the
test binary would be missing many runtime dependencies (e.g., the whole
"Chrome Framework" bundle).

2. Migrate MachPortsTest to use BrowserTestBase::embedded_test_server()
instead of its own server of "fake content" for its browser tests.

3. Add base::ScopedAllowIO for a test utility class that creates and
tears down its own set of threads. It's not clear why the thread
restriction is only checked on Mac (perhaps it's in the way the
browser test's main thread is set up?). Nevertheless, it's perfectly
reasonable for the main thread to block on stopping these other threads:
After the test has run, the threads are shut down before the data
analysis begins, since we want to be sure the data sets are no longer
mutating. Also, added PRESUBMIT.py exception for this use case.

BUG=697444, 722367 
TBR=nduca

Review-Url: https://codereview.chromium.org/2901113003
Cr-Commit-Position: refs/heads/master@{#475773}

[modify] https://crrev.com/8e0e80c3608ac2a08e87ead7d103578f6751f2f6/PRESUBMIT.py
[modify] https://crrev.com/8e0e80c3608ac2a08e87ead7d103578f6751f2f6/chrome/test/BUILD.gn
[modify] https://crrev.com/8e0e80c3608ac2a08e87ead7d103578f6751f2f6/chrome/test/perf/mach_ports_performancetest.cc
[modify] https://crrev.com/8e0e80c3608ac2a08e87ead7d103578f6751f2f6/media/cast/test/utility/standalone_cast_environment.cc

Project Member

Comment 21 by sheriffbot@chromium.org, May 31 2018

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
