Understand why benchmarks were dropped during the team's move to generate_perf_json.py
Issue description

From https://bugs.chromium.org/p/chromium/issues/detail?id=697195#c2: "the Mac bots on the chromium.perf waterfall that used to run [performance_browser_tests] were removed at least six months ago. Consequently during the team's move to src/tools/perf/generate_perf_json.py the Mac entries were dropped."

Looking at an old version of the file: https://chromium.googlesource.com/chromium/src/+/6aacb58a448d503715acff53cea8abe86f51aea2/testing/buildbot/chromium.perf.json

We used to run performance_browser_tests on 10.8 and 10.9, but when these bots were upgraded to 10.10 and above it looks like we just forgot to add them. How will we avoid this in the future when updating to 10.13 and beyond?
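For context, each builder in that file carries its own copy of the test list, so retiring a builder silently retires everything listed under it unless a matching entry is written for its replacement. Roughly speaking (builder and field names below are illustrative, not copied from the real file; see the linked old revision for the exact shape), the situation looked like this:

  # Approximate sketch of the per-builder layout in chromium.perf.json.
  OLD_SPEC = {
      'Mac 10.8 Perf (1)': {
          'gtest_tests': [
              {'test': 'performance_browser_tests'},
          ],
      },
      # When the bots moved to 10.10/10.11, no replacement Mac entry with the
      # same gtest was added, so the test quietly stopped running anywhere on Mac.
  }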
Mar 1 2017
+Dirk for opinions
Mar 1 2017
Note that it's a little more subtle than the test getting dropped in https://codereview.chromium.org/2277003004 -- the bots got upgraded from 10.8->10.10 and 10.9->10.11 and we did not add new entries for these tests. That CL just removed the already-turned-down bots from the JSON file. I also find the huge diffs generated by the script hard to read. I know Dirk has talked about pulling the telemetry tests into a single step; it seems like having simpler steps would help.
Mar 1 2017
Annie, Dave: do either of you recall how the upgrade from 10.8 --> 10.10 and 10.9 --> 10.11 happened? If it was a mere "s/10.8/10.10" kind of CL, then I'd expect things to still work after the update.
Mar 1 2017
Yeah, I think the big problem is that when replacing a bot there are a lot of configs to update after requesting that labs update the machine:
* The hostnames for slaves on chromium.perf and tryserver.chromium.perf
* The bisect bot mapping from bot to tryserver
* This file
* Likely some more things I'm forgetting
It's just too easy to forget one of these steps. Telemetry has a high-level concept of "this test should run on Mac", but that doesn't carry over to the C++ perf tests.
Mar 1 2017
It should be possible to add a high-level concept like that ("this test should run on Mac") to generate_perf_json.py. Or would that be insufficient for the C++ perf tests?
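As a rough illustration of what that could look like (names and structure here are hypothetical, not how generate_perf_json.py is actually written): declare the intent once per test, keep a single builder-to-OS table, and let the generator expand the cross product, so swapping 10.8 hostnames for 10.10 ones cannot silently drop a test:

  import json

  # Hypothetical sketch, not the real generate_perf_json.py.
  BUILDERS = {                      # builder name -> OS; updated when bots are swapped
      'Mac 10.10 Perf': 'mac',
      'Mac Pro 10.11 Perf': 'mac',
      'Win 7 Perf': 'win',
  }

  GTESTS = {                        # test name -> OSes it must run on
      'performance_browser_tests': {'mac', 'win'},
  }

  def generate_spec():
      spec = {}
      for builder, os_name in BUILDERS.items():
          tests = sorted(t for t, oses in GTESTS.items() if os_name in oses)
          spec[builder] = {'gtest_tests': [{'test': t} for t in tests]}
      return spec

  print(json.dumps(generate_spec(), indent=2, sort_keys=True))

With that shape, a bot upgrade is a one-line change to the builder table and the test coverage follows automatically.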
Mar 1 2017
I consider the per-builder/per-step JSON configurations to be virtually unmanageable in their current form for all of the larger builders. Fewer steps would certainly help, but don't solve the problem by themselves. The GPU builders, for example, have a separate python script that generates the files from a more manageable form. I thought Perf did as well, but maybe I'm misremembering. At any rate, I've asked Paweł to work on coming up with a more manageable format; see bug 662541.
Mar 1 2017
Perf does have a script to generate the json files. The problem is that the diffs it creates are too large to be fully understood, so they're just blindly LGTMed, I believe (much like recipe expectations).
Mar 1 2017
Yes :). The same thing is mostly true for the GPU expectations.
Mar 1 2017
Dirk: should we expect that resolving bug 662541 will give us a format that makes diffs to the JSON config reviewable by code reviewers, or should we invest more in a more declarative form for generate_perf_json.py? (suggestion in #6)
Mar 1 2017
Yes, my goals for bug 662541 would be to meet your needs so that we can get rid of generate_perf_json.py altogether and changes to the new files would be as reviewable as, say, changes to GN files. We don't have a timeline or a design for that work, though, and so we'd have to evaluate whether it'd be better to wait for that, help work on that, or try to make some improvements to generate_perf_json in the meantime.
Mar 3 2017
avi@ - why did you assign this to me?
May 6 2017
BTW, bug 697195 is still open, so this is also an "unbounded in time" problem. ;-)

What confuses me is why the data-stoppage alerts didn't fire. Or maybe they did, and we ignored them because we knew the old bots were being taken down?

It seems the simple solution here is a tool that shows which tests were running on the old bots but not on the new ones. If there were even just a way to dump a list of the tests that run on each bot, you could do some simple shell commands like:

  list_tests.py --bot='Mac 10.8' | sort | uniq > /tmp/tests_that_run_on_10_8
  list_tests.py --bot='Mac Pro 10.11' | sort | uniq > /tmp/tests_that_run_on_10_11
  diff /tmp/tests_that_run_on_10_*
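A minimal sketch of such a list_tests.py, assuming chromium.perf.json keeps its builder-name -> test-list layout (the 'gtest_tests'/'isolated_scripts' field names below are assumptions and may need adjusting to match the real file):

  import argparse
  import json

  def tests_for_bot(spec, bot):
      # Collect test names from the entry for one builder; field names assumed.
      entry = spec.get(bot, {})
      names = set()
      for gtest in entry.get('gtest_tests', []):
          names.add(gtest.get('test') or gtest.get('name'))
      for script in entry.get('isolated_scripts', []):
          names.add(script.get('name'))
      return sorted(n for n in names if n)

  def main():
      parser = argparse.ArgumentParser()
      parser.add_argument('--spec', default='testing/buildbot/chromium.perf.json')
      parser.add_argument('--bot', required=True)
      args = parser.parse_args()
      with open(args.spec) as f:
          spec = json.load(f)
      print('\n'.join(tests_for_bot(spec, args.bot)))

  if __name__ == '__main__':
      main()

With something like that in place, the diff above should have flagged performance_browser_tests as present on the old Mac bots but absent on their replacements.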
May 8 2017
Going back to some of the earlier comments, I think the history of the perf lab is one of adding a lot of benchmark and builder configs, but rarely updating or turning things down. We do not currently have good docs for these operations. When we want to update an OS version, what are all the things we need to update before we can close the bug? When we remove a specific builder config, what needs to happen to the swarming and buildbot configs, and what magic scripts need to be run, before we can close the bug? Right now, if I don't happen to add the correct reviewer when I do this, I will miss something. Maybe the answer here is just good docs? If that's the case, I can help work on that.
May 8 2017
I definitely think better documentation & process would help a lot here. For one, I am pretty confused about which bot configs we even have. Another core issue is that we have too many benchmarks & too many bot configs to create a static spec file that is understandable by a human. Right now no one really reviews the changes to the chromium.perf.json files because they are too big to look at.
May 31 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/8e0e80c3608ac2a08e87ead7d103578f6751f2f6

commit 8e0e80c3608ac2a08e87ead7d103578f6751f2f6
Author: miu <miu@chromium.org>
Date: Wed May 31 03:35:57 2017

Fix breakages while performance_browser_tests was not running on Mac.

In our vain attempt to re-enable performance_browser_tests on the Mac bots, a number of breakages seem to have happened over the months where we were not even aware the tests weren't running. This change fixes all of the following:

1. Adds a BUILD dependency on //chrome:chrome_app. Without this, the test binary would be missing many runtime dependencies (e.g., the whole "Chrome Framework" bundle).

2. Migrate MachPortsTest to use BrowserTestBase::embedded_test_server() instead of its own server of "fake content" for its browser tests.

3. Add base::ScopedAllowIO for a test utility class that creates and tears down its own set of threads. It's not clear why the thread restriction is only checked on Mac (perhaps it's in the way the browser test's main thread is set up?). Nevertheless, it's perfectly reasonable for the main thread to block on stopping these other threads: After the test has run, the threads are shut down before the data analysis begins, since we want to be sure the data sets are no longer mutating. Also, added PRESUBMIT.py exception for this use case.

BUG=697444, 722367
TBR=nduca

Review-Url: https://codereview.chromium.org/2901113003
Cr-Commit-Position: refs/heads/master@{#475773}

[modify] https://crrev.com/8e0e80c3608ac2a08e87ead7d103578f6751f2f6/PRESUBMIT.py
[modify] https://crrev.com/8e0e80c3608ac2a08e87ead7d103578f6751f2f6/chrome/test/BUILD.gn
[modify] https://crrev.com/8e0e80c3608ac2a08e87ead7d103578f6751f2f6/chrome/test/perf/mach_ports_performancetest.cc
[modify] https://crrev.com/8e0e80c3608ac2a08e87ead7d103578f6751f2f6/media/cast/test/utility/standalone_cast_environment.cc
May 31 2018
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot