linux asan tests appear to be significantly slower than regular tests
Issue description
Migrating from bug 793993 ...
Some tests run under ASan/LSan on the waterfalls appear to be *significantly* slower than the tests run under a regular release build. Compare, for example:
https://ci.chromium.org/buildbot/chromium.linux/Linux%20Tests/65487
https://ci.chromium.org/buildbot/chromium.memory/Linux%20ASan%20LSan%20Tests%20%281%29/40807
blink_heap_unittests takes 5s on release, 37s under ASan, a 7x slowdown:
https://chromium-swarm.appspot.com/user/task/3a60c4d6b2f11f10
https://chromium-swarm.appspot.com/user/task/3a60df2818889e10
net_unittests goes from 8m to 121m (split across 4 shards), a 15x slowdown.
webkit_unit_tests goes from 30s to 2460s, an 80x slowdown:
https://chromium-swarm.appspot.com/user/task/3a60c5666b162910
https://chromium-swarm.appspot.com/user/task/3a60df87d19cde10
Here's a plx query to show the breakdowns for all of the tests:
SELECT
  tags_master as master,
  tags_buildername as builder,
  tags_stepname as stepname,
  sum(completed_ts - started_ts) as dur,
  sum(cost_usd) as cost,
  (sum(cost_usd) / count (distinct tags_build_id)) as cost_per_build
FROM FLATTEN(FLATTEN(FLATTEN(FLATTEN(FLATTEN(chrome_infra.swarming_tasks.yesterday, tags_project), tags_master), tags_buildername), tags_stepname), tags_build_id)
WHERE state = 'COMPLETED'
  and tags_project = 'chromium'
  and ((tags_master = 'chromium.linux' and tags_buildername = 'Linux Tests' and tags_build_id = '65487')
    or (tags_master = 'chromium.memory' and tags_buildername = 'Linux ASan LSan Tests (1)' and tags_build_id = '40806'))
  and completed_ts > (PARSE_UTC_USEC('2017-12-11') / 1000000)
  and completed_ts < (PARSE_UTC_USEC('2017-12-11') / 1000000) + 86400
GROUP BY master, builder, stepname
ORDER BY master asc, builder asc, stepname asc
with the results attached.
kcc@, can you help us find people to dig into what's going on here? Is it possible the bots are overly resource constrained and thrashing, or something?
We are sorely tempted to turn off some of the worst offenders here, because the slowdown simply isn't acceptable even though we catch a lot of bugs w/ ASan and LSan. But, I'm optimistic we can figure out what's going on fairly easily.
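The slowdown factors quoted above follow directly from the paired timings. A minimal sketch of that arithmetic, using only the numbers in this description (the dictionary layout and function name are illustrative, not part of any Chromium tooling):

```python
# (release_seconds, asan_seconds), taken from the issue description.
timings = {
    "blink_heap_unittests": (5, 37),
    "net_unittests": (8 * 60, 121 * 60),  # 8 min vs 121 min (split across 4 shards)
    "webkit_unit_tests": (30, 2460),
}


def slowdown(release_s, asan_s):
    """Return how many times slower the ASan run is than the release run."""
    return asan_s / release_s


slowdowns = {name: slowdown(*pair) for name, pair in timings.items()}

for name, factor in slowdowns.items():
    print(f"{name}: {factor:.1f}x")
```

The rounded factors in the description (7x, 15x, 80x) are these ratios with the fractional part dropped or rounded.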
,
Dec 13 2017
Potentially related to bug 791698
,
Dec 13 2017
#0: I glanced at the resource metrics from a bot graph from viceroy during a run of browser_tests on ASAN and didn't notice anything too thrashy. IIRC, memory usage was <50%. (CPU usage was high but not 100%.) Admittedly anecdata.
,
Dec 13 2017
I believe I saw significant slowdowns for things like net_unittests back in May, though I might be getting that confused w/ MSan. I don't recall the 80x differences, though.
,
Dec 13 2017
Is this something I can reproduce locally? If not, I'm afraid I don't want to own it :)
,
Dec 13 2017
I did see at least 8x slowdowns locally, so at least some of it. You should be able to easily build locally and test it on the bots using swarming, as well. We can help if need be.
,
Dec 13 2017
> You should be able to easily build locally
That I can do. What is the exact build config?
is_asan = true is_debug = false ?
> test it on the bots using swarming
Sorry, I won't go there (and I have no one on the team to help, sorry again)
,
Dec 13 2017
Kostya, I can take a look if it requires any special Chrome-related actions (but can also ask questions regarding LLVM side involved here :) ).
,
Dec 13 2017
This is what I've done in a fresh chromium checkout:
gn gen out/opt '--args=is_debug=false' --check
gn gen out/asan '--args=is_asan=true is_debug=false' --check
ninja -C out/asan net_unittests
ninja -C out/opt net_unittests
for f in asan opt ; do ./out/$f/net_unittests --single-process-tests > $f.log 2>&1; done
asan passes in 10m:
[==========] 24028 tests from 838 test cases ran. (608418 ms total)
opt got stuck:
[ RUN ] ProxyConfigServiceLinuxTest.KDEFileChanged
[221565:222383:1213/073132.067866:2475260036086:ERROR:proxy_config_service_linux.cc(1277)] Unable to set up proxy configuration change notifications
opt.log lines 10203-10257/10257 (END)
<hangs>
So, single-process net_unittests runs for 10 minutes, not even close to 121m mentioned above.
I failed to build blink_heap_unittests and webkit_unit_tests
../../ui/accessibility/ax_node_data.h:17:10: fatal error: 'ui/accessibility/ax_enums.h' file not found
#include "ui/accessibility/ax_enums.h"
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please advise on a local experiment that can demonstrate the slowdown.
If the slowdown reproduces only on the bots, then it's most likely an infra problem
(wrong build flags, not enough RAM, ...) and I am not ready to own it
,
Dec 13 2017
base_unittests runs ~ the same time with and w/o asan (11 vs 13 seconds)
,
Dec 13 2017
I think machine configuration (or the environment) must be part of the problem. It's relatively easy to download the binaries used to run a given task and run it locally, and also to build locally and then trigger the task under swarming, so that should allow us to do some better troubleshooting even if we can't repro the extreme slowdowns locally. I'll post more instructions later this morning.
,
Dec 19 2017
Un-owning the bug to reflect the fact that I am not working on it. If/when there is a local reproducer feel free to assign it back. Again, I am sorry, but I don't have capacity to debug problems on unfamiliar infra (i.e. other than locally)
,
Dec 19 2017
@kcc - no problem. I hadn't updated the bug because I was actually working on things that would make it easier for you to reproduce issues. That work has landed, and so I'm going to see if I can come up w/ a repro case and will bounce it back if I can.
,
Jan 19 2018
I looked at this some yesterday and, as I noted over in https://bugs.chromium.org/p/chromium/issues/detail?id=736521#c17, am somewhat suspicious of a couple of runtime flags, --test-launcher-batch-limit=1 and --test-launcher-print-test-stdio=always. I'm planning to move them from the deep dark recipe hole from which they currently get added to the tests over in to the src-side spec, and then we can look at removing them.
,
Jan 23 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/edfe7f871492931def83874a0f4023df7232dc58 commit edfe7f871492931def83874a0f4023df7232dc58 Author: John Budorick <jbudorick@chromium.org> Date: Tue Jan 23 15:27:22 2018 chromium.memory: Surface test launcher args used by the Linux ASAN bot. This also adds the ability to specify args on a per-bot basis in waterfalls.pyl. Bug: 736521, 794372 Change-Id: I83af8884fccbe3937e4a46773389b4a0aebf2267 Reviewed-on: https://chromium-review.googlesource.com/876531 Commit-Queue: John Budorick <jbudorick@chromium.org> Reviewed-by: Kenneth Russell <kbr@chromium.org> Reviewed-by: Dirk Pranke <dpranke@chromium.org> Cr-Commit-Position: refs/heads/master@{#531240} [modify] https://crrev.com/edfe7f871492931def83874a0f4023df7232dc58/testing/buildbot/chromium.memory.json [modify] https://crrev.com/edfe7f871492931def83874a0f4023df7232dc58/testing/buildbot/generate_buildbot_json.py [modify] https://crrev.com/edfe7f871492931def83874a0f4023df7232dc58/testing/buildbot/generate_buildbot_json_unittest.py [modify] https://crrev.com/edfe7f871492931def83874a0f4023df7232dc58/testing/buildbot/waterfalls.pyl
,
Feb 1 2018
I'm not sure what the state of this is at this point, and I'm probably not the person who is going to work on it further. Punting to jbudorick to own or reassign.
,
Feb 2 2018
I'm still working on argument relocation & hopefully removal, per #14. Got distracted by trooper things.
,
Feb 14 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/ee4e4b57de01b92f165eaff9c8b2268bf0d5da90 commit ee4e4b57de01b92f165eaff9c8b2268bf0d5da90 Author: John Budorick <jbudorick@chromium.org> Date: Wed Feb 14 22:56:12 2018 Move linux ASAN bot test launcher args to src. Pair w/ https://chromium-review.googlesource.com/c/chromium/src/+/876531 Bug: 736521, 794372 Change-Id: I74d96d27de4ce025688297585ddb21abe132e7c2 Reviewed-on: https://chromium-review.googlesource.com/876532 Commit-Queue: John Budorick <jbudorick@chromium.org> Reviewed-by: Dirk Pranke <dpranke@chromium.org> Reviewed-by: Stephen Martinis <martiniss@chromium.org> [modify] https://crrev.com/ee4e4b57de01b92f165eaff9c8b2268bf0d5da90/scripts/slave/recipe_modules/chromium_tests/chromium_memory.py [modify] https://crrev.com/ee4e4b57de01b92f165eaff9c8b2268bf0d5da90/scripts/slave/recipe_modules/chromium/config.py [modify] https://crrev.com/ee4e4b57de01b92f165eaff9c8b2268bf0d5da90/scripts/slave/recipe_modules/chromium/tests/configs.py
,
Apr 17 2018
My theory is that ASan has some process startup overhead. Reserving 20TB of shadow memory virtual address space is not free, and perhaps it is more expensive on GCE VMs. I only have one data point to support this theory, though, and it is that local net_unittests runs take 519s with the normal parallel test launcher, but they take 713s when I run with --single-process-tests, disabling the parallelism. In other words, using 56-way parallelism on a Z840 only gets a 37% speedup. Is there a way to make it so the test launcher reuses the same process for more test runs? That would test the theory, and if it is correct, reduce the total amount of work to run these tests. We'd still probably want to shard net_unittests up more.
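The 37% figure can be checked from the two timings above. The sketch below also fits Amdahl's law to those timings as a rough way to express how little of the work parallelizes; that model is an added assumption for illustration, not something measured in this bug:

```python
SERIAL_S = 713    # --single-process-tests run
PARALLEL_S = 519  # normal parallel test launcher
JOBS = 56         # parallelism on the Z840

speedup = SERIAL_S / PARALLEL_S
efficiency = speedup / JOBS

# Amdahl's law: speedup = 1 / (s + (1 - s) / N), solved for the
# fraction s of the work that behaves as if it were serial.
serial_fraction = (1 / speedup - 1 / JOBS) / (1 - 1 / JOBS)

print(f"speedup: {speedup:.2f}x")
print(f"parallel efficiency: {efficiency:.1%}")
print(f"implied serial fraction: {serial_fraction:.1%}")
```

Under this simple model most of the runtime does not shrink with more jobs. That would be consistent with a large fixed per-process cost, but it does not by itself identify the cause.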
,
Apr 18 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/6b83624e92eb2229a678a087fffce57d939248e4 commit 6b83624e92eb2229a678a087fffce57d939248e4 Author: Reid Kleckner <rnk@google.com> Date: Wed Apr 18 17:37:33 2018 Shard net_unittests on ToTLinuxASan 16 ways to match memory bot BUG=chromium:794372 R=thakis@chromium.org NOTRY=True Change-Id: Iaef5d1b0e4fcb155d90de714187e16496fdfdee1 Reviewed-on: https://chromium-review.googlesource.com/1015847 Commit-Queue: Reid Kleckner <rnk@chromium.org> Reviewed-by: Nico Weber <thakis@chromium.org> Cr-Commit-Position: refs/heads/master@{#551732} [modify] https://crrev.com/6b83624e92eb2229a678a087fffce57d939248e4/testing/buildbot/chromium.clang.json [modify] https://crrev.com/6b83624e92eb2229a678a087fffce57d939248e4/testing/buildbot/test_suite_exceptions.pyl
,
May 9 2018
Issue 814491 has been merged into this issue.
,
Jun 11 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4ef47d5f79c6dcbc52f8cf29ead1491f146bc827 commit 4ef47d5f79c6dcbc52f8cf29ead1491f146bc827 Author: Takuto Ikuta <tikuta@chromium.org> Date: Mon Jun 11 13:15:36 2018 Increase swarming shards for some asan tests viz_browser_tests, viz_content_browsertests and content_browsertest take more than 20 minutes on linux asan bot. This patch increases the shard for such slow tests. * viz_browser_tests, 10 -> 20: max shard duration become 14 mins from > 25 mins * viz_content_browsertests, 2 -> 8: max shard duration become 15 mins from > 50 mins * content_browsertests, 4 -> 8: max shard duration become 16 mins from > 25 mins linux_chromium_asan_rel_ng has the slowest average CQ cycle time, so reducing test execution time is effective for overall CQ cycle time. 1 week stats of average cycle time of each try builder: http://shortn/_bEvlzGZaH9 Recent history of tests on linux_chromium_asan_rel_ng: * viz_browser_tests https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=duration&c=pending_time&c=pool&c=bot&et=1528691160000&f=state%3ACOMPLETED_SUCCESS&f=name-tag%3Aviz_browser_tests&f=buildername-tag%3Alinux_chromium_asan_rel_ng&l=50&s=duration%3Adesc&st=1528604760000 * viz_content_browsertests https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=duration&c=pending_time&c=pool&c=bot&et=1528691160000&f=state%3ACOMPLETED_SUCCESS&f=buildername-tag%3Alinux_chromium_asan_rel_ng&f=name-tag%3Aviz_content_browsertests&l=50&s=duration%3Adesc&st=1528604760000 * content_browsertests https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=duration&c=pending_time&c=pool&c=bot&et=1528691160000&f=state%3ACOMPLETED_SUCCESS&f=buildername-tag%3Alinux_chromium_asan_rel_ng&f=name-tag%3Acontent_browsertests&l=50&s=duration%3Adesc&st=1528604760000 Bug: 794372 Change-Id: Ic5a4daae81890e702b9090055ed30e65bea92231 Reviewed-on: 
https://chromium-review.googlesource.com/1094816 Reviewed-by: Nico Weber <thakis@chromium.org> Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org> Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org> Cr-Commit-Position: refs/heads/master@{#565984} [modify] https://crrev.com/4ef47d5f79c6dcbc52f8cf29ead1491f146bc827/testing/buildbot/chromium.memory.json [modify] https://crrev.com/4ef47d5f79c6dcbc52f8cf29ead1491f146bc827/testing/buildbot/test_suite_exceptions.pyl
,
Jun 12 2018
I noticed that googletest initialization consumes a lot of CPU in ASan net_unittests. According to a perf profile, testing::internal::UnitTestImpl::GetTestCase took nearly 40% of CPU time.
https://cs.chromium.org/chromium/src/third_party/mesa/src/src/gtest/src/gtest.cc?l=4096&rcl=9d9b0710470f581cb5485b02b6acd8415cc093e8
The actual slowness appears to come from the ASan-specific strcmp function used inside std::find_if.
ref: https://github.com/llvm-project/llvm-project-20170507/commit/8df65c1de2e1f82d828b7e8314ddcfaabacb94b9
So I tried changing gtest to not use strcmp there.
https://chromium-review.googlesource.com/c/chromium/src/+/1096591
In my tries, the max shard duration of net_unittests became
18:40 in https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_asan_rel_ng/32387
18:34 in https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_asan_rel_ng/32445
This is roughly 1.5x faster than the usual duration of net_unittests, e.g.
28:08 in https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_asan_rel_ng/32409
29:11 in https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_asan_rel_ng/32406
27:33 in https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_asan_rel_ng/32252
Sadly, the change does not seem to reduce the time of other tests. I will send a PR to googletest. But I hope the ASan compile will do more optimization for injected functions.
,
Jun 12 2018
Made PR. https://github.com/google/googletest/pull/1627 This will make net_unittest 2x faster on asan builder.
,
Jun 12 2018
Nice!
,
Jun 12 2018
Cool! The way GTest registers all tests with dynamic initialization is really inefficient. It caused huge startup time problems for us back when we ran valgrind and dr. memory (Google internal bug about it: b/6304344). It might be worth it to push some kind of static section concatenation registration scheme upstream. It would probably improve startup time for all chrome test binaries, both with ASan and without. It might be worth it...
,
Jun 13 2018
I don't understand why https://github.com/google/googletest/pull/1627/files helps. Is std::find_if with forward iterators transformed to a call to strcmp() but with reverse iterators isn't? And instrumented strcmp() is much slower than a regular search loop with asan? If so, maybe we shouldn't do the strcmp transform in llvm when asan is enabled? Or is there a different reason why that patch helps?
,
Jun 13 2018
In net_unittests, there are nearly 40k tests, and GetTestCase is called from AddTestInfo:
https://cs.chromium.org/chromium/src/third_party/googletest/src/googletest/src/gtest-internal-inl.h?l=657&rcl=9077ec7efe5b652468ab051e93c67589d5cb8f85
AddTestInfo is called from MakeAndRegisterTestInfo:
https://cs.chromium.org/chromium/src/third_party/googletest/src/googletest/src/gtest.cc?l=2573&rcl=9077ec7efe5b652468ab051e93c67589d5cb8f85
MakeAndRegisterTestInfo is called for each test, with its test case name, to initialize a global static variable. That means GetTestCase is almost always called with the same test_case_name as the previous call, so the TestCase is either the last element of test_case_ or not present at all. With a forward iterator, we always need to apply strcmp to every element of test_case_; with a reverse iterator, we either find the test case after a single strcmp or fail to find it. The not-found case does not happen frequently.
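The effect of scanning from the back can be illustrated with a small model that counts name comparisons during registration. This is a sketch in Python, not googletest's code: `count_comparisons` and the workload shape (200 test cases of 50 tests each) are made up for illustration.

```python
def count_comparisons(test_case_names, reverse):
    """Register tests one by one and count name comparisons.

    Models a linear search over the list of known test cases, scanning
    either from the front (forward) or from the back (reverse).
    """
    known = []  # test case names in registration order
    comparisons = 0
    for name in test_case_names:
        order = reversed(known) if reverse else known
        found = False
        for candidate in order:
            comparisons += 1
            if candidate == name:
                found = True
                break
        if not found:
            known.append(name)
    return comparisons


# Hypothetical workload: tests are registered grouped by test case,
# so consecutive registrations usually share a test case name.
names = [f"Case{i}" for i in range(200) for _ in range(50)]

forward = count_comparisons(names, reverse=False)
backward = count_comparisons(names, reverse=True)
print(f"forward: {forward}, reverse: {backward}")
```

For this made-up workload the reverse scan does dozens of times fewer comparisons, because every lookup after the first in a group hits on the first element checked. The forward scan grows with the number of test cases already registered.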
,
Jun 13 2018
> And instrumented strcmp() is much slower than a regular search loop with asan?
Maybe.
> If so, maybe we shouldn't do the strcmp transform in llvm when asan is enabled? Or is there a different reason why that patch helps?
Not sure. Basically, if we use std::string, the comparison is pruned by a size check before strcmp. If a program uses strcmp heavily, it might be better to find a way to optimize that (like in gtest).
,
Jun 14 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a8353999c7b25aa2d09c6b4dfe3a91347637b4ce commit a8353999c7b25aa2d09c6b4dfe3a91347637b4ce Author: Takuto Ikuta <tikuta@chromium.org> Date: Thu Jun 14 13:02:46 2018 Roll src/third_party/googletest/src/ 9077ec7ef..ce468a17c (5 commits) Reduces the max shard duration of net_unittest on the asan/tsan builders by about 50%. https://chromium.googlesource.com/external/github.com/google/googletest.git/+log/9077ec7efe5b..ce468a17c434 $ git log 9077ec7ef..ce468a17c --date=short --no-merges --format='%ad %ae %s' 2018-06-13 misterg Docs sync/internal 2018-06-13 misterg Doc sync/internal 2018-06-12 tikuta Reduce the number of strcmp calling while initialization 2018-06-11 misterg Sync with internal docs 2018-06-11 misterg Sync with internal docs Created with: roll-dep src/third_party/googletest/src R=dpranke@chromium.org,thakis@chromium.org BUG=794372 Change-Id: I704490e983697784fcc73c6fa7462bfb35a0694e Reviewed-on: https://chromium-review.googlesource.com/1100670 Commit-Queue: Nico Weber <thakis@chromium.org> Reviewed-by: Nico Weber <thakis@chromium.org> Cr-Commit-Position: refs/heads/master@{#567237} [modify] https://crrev.com/a8353999c7b25aa2d09c6b4dfe3a91347637b4ce/DEPS
,
Jun 18 2018
Comparing:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20ASan%20LSan%20Tests%20%281%29/46826 (a few days before tikuta's roll)
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20ASan%20LSan%20Tests%20%281%29/47025 (a few days after)
net_unittests is now about 3x as fast. Before, 16 shards at about 1500 s each, for unsharded 400 minutes / 6.7 hours total runtime. Now, 16 shards at about 500 s each, for about 133 min / 2.2 h total runtime.
Meanwhile, on https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20ASan%20LSan%20Tests%20%281%29/47025 without asan, net_unittests runs on a single shard in a bit under 4 minutes.
So that helped a lot, but comment 0 had a 15x slowdown from asan for net_unittests, while that change got us from a 100x slowdown to merely a 33x slowdown -- it's better than before, but less good than where we were when this bug got filed :-/
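The totals and slowdown factors in this comparison are shard count times per-shard duration, divided by the non-ASan runtime. A short sketch of that calculation, rounding "a bit under 4 minutes" up to 240 s (the helper name is illustrative):

```python
def total_runtime_s(shards, seconds_per_shard):
    """Total machine time as if all shards ran back to back."""
    return shards * seconds_per_shard


before_s = total_runtime_s(16, 1500)  # before the googletest roll
after_s = total_runtime_s(16, 500)    # after the googletest roll
release_s = 4 * 60                    # non-ASan run, rounded up to 4 minutes

print(f"before: {before_s / 60:.0f} min, {before_s / release_s:.0f}x slower")
print(f"after:  {after_s / 60:.0f} min, {after_s / release_s:.0f}x slower")
```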
,
Jun 18 2018
Yeah, googletest initialization still takes a lot of time in the tests: 42% out of 52% CPU consumption. I'm not sure why the situation got worse; I guess some changes in the sanitizer made things worse.
,
Jun 18 2018
https://plx.corp.google.com/scripts2/script_5b._2451af_0000_29d7_8f88_94eb2c14e7be Hmm, there seems to be gradual increase from end of January to the end of March.
,
Jun 25 2018
I did further optimization for googletest. The more memory allocations I removed from net_unittests, the more the execution speed improved. This patch runs around 2 times faster.
https://chromium-review.googlesource.com/c/chromium/src/+/1112880/22
But why does memory allocation under ASan become so slow in this test? In net_unittests, many of the memory allocations seem to come from parameterized tests. Is the allocation pattern of parameterized tests very bad for the ASan allocator?
When I look at perf's results for net_unittests on master, I see large time consumption in __sanitizer::StackDepotBase<__sanitizer::StackDepotNode, 1, 20>::Put, and the hotspot seems to be around
https://github.com/llvm-project/llvm-project-20170507/blob/135c4a0f43501987991610a289cf36758dbc5f8a/compiler-rt/lib/sanitizer_common/sanitizer_stackdepotbase.h#L105
But I'm not sure whether it can be optimized further.
,
Jun 25 2018
tikuta, were you able to repro this locally? What was your local setup? kcc said he'd investigate if he gets explicit steps on how to repro locally.
,
Jun 28 2018
#35, I forgot to add myself to cc and star the bug. The instructions below can be used. I used the following args.gn to build net_unittests.
```
dcheck_always_on = true
is_asan = true
is_component_build = false
is_debug = false
is_lsan = true
strip_absolute_paths_from_debug_symbols = true
symbol_level = 1
use_goma = true
```
And I take perf stats with the following script.
```
#!/bin/bash
set -x
# Kill net_unittests after a timeout; some tests in net_unittests on a Linux
# corp machine seem to become very slow due to a corp-specific daemon.
time perf record --call-graph lbr timeout 10 ./net_unittests --test-launcher-batch-limit=1 --test-launcher-print-test-stdio=always
# Remove temporary files.
rm -rf /tmp/.org.chromium*
```
Also, I noticed that is_asan is not the only config causing this slowness. dcheck_always_on adds significant CPU usage to some tests. When I disabled dcheck, the max shard duration of browser_tests dropped from around 8 mins to 5 mins.
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_asan_rel_ng/44567
And most of the browser_tests slowness seems to come from around
https://cs.chromium.org/chromium/src/base/memory/weak_ptr.cc?l=44&rcl=fb40db1204f14ad0e420652787c4a70ce374c4a4
,
Jun 28 2018
So I'm considering disabling dcheck_always_on on the CQ linux_chromium_asan_rel_ng bots. I think enabling DCHECK only on the waterfall ASan bot is reasonable, because browser_tests from linux_chromium_asan_rel_ng is the largest resource consumer among swarming tasks.
https://screenshot.googleplex.com/twWBBGp11x6
Taken from the query below:
SELECT
  buildername,
  stepname,
  COUNT(1) AS cnt,
  SUM(started_ts - created_ts) AS pending_sum_s,
  SUM(completed_ts - started_ts) AS sum_s,
  AVG(completed_ts - started_ts) AS avg_s,
  MAX(completed_ts - started_ts) AS max_s
FROM chrome_infra.swarming_tasks.last7days,
  UNNEST(tags_stepname) as stepname,
  UNNEST(tags_buildername) as buildername,
  UNNEST(tags_master) as mastername,
  UNNEST(tags_patch_project) as patch_project
WHERE completed_ts is not null AND started_ts is not null
GROUP BY buildername, stepname
ORDER BY sum_s DESC
LIMIT 100;
,
Jun 28 2018
Waterfall bots and its matching cq bot must share a config (for good reasons), so we can't have the cq bot not use dchecks without the main waterfall bot losing them too. But if it saves lots of resources, then disabling in both places is probably fine since the non-asan bots have dchecks enabled. Sounds like that's less than a 50% speedup, and asan/no-dchecks is still 10x slower than no-asan/dchecks. kcc, can you check why asan builds are so much slower here with the steps from comment 36?
,
Jun 28 2018
#38, for dcheck, we currently don't use the same config on the CQ and the waterfall, so that we can detect misuse of DCHECK. See https://groups.google.com/a/google.com/d/msg/chrome-client-infra/wny0cZELz6s/w032I7D7BwAJ
Here, I don't mind whether dcheck is enabled on the waterfall or not, as long as we can disable dcheck on the CQ bot.
,
Jun 28 2018
Huh, I thought that was impossible. Thanks for teaching me :-) Anyways, disabling dchecks on the asan bot sounds fine to me, but it seems a bit like a workaround. The root issue is that asan is much slower than advertised here for some reason.
,
Jul 19
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/54671d97894b47be18a43c542538cdac934cf40c commit 54671d97894b47be18a43c542538cdac934cf40c Author: Takuto Ikuta <tikuta@chromium.org> Date: Thu Jul 19 19:06:35 2018 Increase the number of shards for some tests This is to make CQ check faster when update clang like below. https://chromium-review.googlesource.com/c/chromium/src/+/1143094 This CL increases the number of shards so that max shard duration reduced around 30 minutes. I use shard duration of following builds to adjust the number of shards. * https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_chromeos_asan_rel_ng/15713 * https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_chromeos_msan_rel_ng/831 Bug: 794372, 865455 Change-Id: Iee4e55af62219ba6d4f36d3ff5bcf53e5b7e9725 Reviewed-on: https://chromium-review.googlesource.com/1143435 Commit-Queue: Takuto Ikuta <tikuta@chromium.org> Reviewed-by: Nico Weber <thakis@chromium.org> Cr-Commit-Position: refs/heads/master@{#576584} [modify] https://crrev.com/54671d97894b47be18a43c542538cdac934cf40c/testing/buildbot/chromium.memory.json [modify] https://crrev.com/54671d97894b47be18a43c542538cdac934cf40c/testing/buildbot/test_suite_exceptions.pyl
,
Jul 27
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/412ad67c04bb591f9f9f7dafaeefe42b2c7201b7 commit 412ad67c04bb591f9f9f7dafaeefe42b2c7201b7 Author: Scott Violet <sky@chromium.org> Date: Fri Jul 27 20:54:06 2018 chromeos: increase number of shards for mash_browsertests The number of shards for browser_tests on "Linux Chromium OS ASan LSan Tests (1)" was increased to 30 a while back, but not mash_browser_tests. This means mash_browser_tests often timeout on the bot. I'm upping the number of shards to be the same for the two. BUG=794372 TEST=none Change-Id: I24dd03bb0c61a65d34dfb290af1269026f9563a3 Reviewed-on: https://chromium-review.googlesource.com/1153549 Reviewed-by: Dirk Pranke <dpranke@chromium.org> Reviewed-by: John Budorick <jbudorick@chromium.org> Commit-Queue: Scott Violet <sky@chromium.org> Cr-Commit-Position: refs/heads/master@{#578768} [modify] https://crrev.com/412ad67c04bb591f9f9f7dafaeefe42b2c7201b7/testing/buildbot/chromium.memory.json [modify] https://crrev.com/412ad67c04bb591f9f9f7dafaeefe42b2c7201b7/testing/buildbot/test_suite_exceptions.pyl
,
Jul 30
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/98759ece03864c3241a7d46225c4ee98e289d2ce commit 98759ece03864c3241a7d46225c4ee98e289d2ce Author: James Cook <jamescook@chromium.org> Date: Mon Jul 30 22:02:13 2018 chromeos: increase shards count for LSAN viz_browser_tests The number of shards for browser_tests on "Linux Chromium OS ASan LSan Tests (1)" was increased to 30 a while back, but not viz_browser_tests. This means viz_browser_tests often timeout on the bot. I'm upping the number of shards to be the same for the two. (sky@ recently did something similar for mash_browser_tests) BUG=794372 TEST=none Change-Id: Ibbdb086bc37e8a4722a0e5ab0eeb6312a8db4741 Reviewed-on: https://chromium-review.googlesource.com/1155634 Reviewed-by: Dirk Pranke <dpranke@chromium.org> Commit-Queue: James Cook <jamescook@chromium.org> Cr-Commit-Position: refs/heads/master@{#579186} [modify] https://crrev.com/98759ece03864c3241a7d46225c4ee98e289d2ce/testing/buildbot/chromium.memory.json [modify] https://crrev.com/98759ece03864c3241a7d46225c4ee98e289d2ce/testing/buildbot/test_suite_exceptions.pyl
,
Sep 20
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b86f625a6dacfbc4588a05d7c312a9e2de26a5e8 commit b86f625a6dacfbc4588a05d7c312a9e2de26a5e8 Author: John Budorick <jbudorick@chromium.org> Date: Thu Sep 20 21:42:04 2018 Move test-launcher-* args for sanitizer tests src-side. Bug: 794372 Change-Id: I6b6d81ab0c72fd1b781d03e242f78ddef24fef95 Reviewed-on: https://chromium-review.googlesource.com/1235015 Reviewed-by: Stephen Martinis <martiniss@chromium.org> Commit-Queue: John Budorick <jbudorick@chromium.org> Cr-Commit-Position: refs/heads/master@{#592954} [modify] https://crrev.com/b86f625a6dacfbc4588a05d7c312a9e2de26a5e8/testing/buildbot/chromium.clang.json [modify] https://crrev.com/b86f625a6dacfbc4588a05d7c312a9e2de26a5e8/testing/buildbot/chromium.memory.json [modify] https://crrev.com/b86f625a6dacfbc4588a05d7c312a9e2de26a5e8/testing/buildbot/waterfalls.pyl
,
Sep 20
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/e61705ca04e5fceffee9019eae57dc97fc02a17a commit e61705ca04e5fceffee9019eae57dc97fc02a17a Author: John Budorick <jbudorick@chromium.org> Date: Thu Sep 20 23:36:06 2018 chromium: clean up ASAN configs & move test launcher args to src. Bug: 794372 Change-Id: I049b74c368b393d7b2b2ffc8cc1af13c5d2d280c Reviewed-on: https://chromium-review.googlesource.com/1236539 Reviewed-by: Stephen Martinis <martiniss@chromium.org> Commit-Queue: John Budorick <jbudorick@chromium.org> [modify] https://crrev.com/e61705ca04e5fceffee9019eae57dc97fc02a17a/scripts/slave/recipes/chromium.expected/msan.json [modify] https://crrev.com/e61705ca04e5fceffee9019eae57dc97fc02a17a/scripts/slave/recipe_modules/chromium_tests/chromium_memory.py [modify] https://crrev.com/e61705ca04e5fceffee9019eae57dc97fc02a17a/scripts/slave/recipe_modules/chromium_tests/client_v8_fyi.py [modify] https://crrev.com/e61705ca04e5fceffee9019eae57dc97fc02a17a/scripts/slave/recipes/chromium.expected/dynamic_gtest_memory_asan_no_lsan.json [modify] https://crrev.com/e61705ca04e5fceffee9019eae57dc97fc02a17a/scripts/slave/recipe_modules/chromium/tests/runtest.expected/msan.json [modify] https://crrev.com/e61705ca04e5fceffee9019eae57dc97fc02a17a/scripts/slave/recipe_modules/chromium_tests/chromium_lkgr.py [modify] https://crrev.com/e61705ca04e5fceffee9019eae57dc97fc02a17a/scripts/slave/recipe_modules/chromium/config.py [modify] https://crrev.com/e61705ca04e5fceffee9019eae57dc97fc02a17a/scripts/slave/recipe_modules/chromium/tests/configs.py [modify] https://crrev.com/e61705ca04e5fceffee9019eae57dc97fc02a17a/scripts/slave/recipes/chromium.expected/dynamic_gtest_memory_mac64.json [modify] https://crrev.com/e61705ca04e5fceffee9019eae57dc97fc02a17a/scripts/slave/recipes/chromium.expected/tsan.json
,
Sep 21
I finally got around to c#14. The CL removing --test-launcher-batch-limit=1 is here: https://chromium-review.googlesource.com/c/chromium/src/+/1237409
Its try run of linux_chromium_asan_rel_ng (https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_asan_rel_ng/103915) has some interesting results:
- blink_heap_unittests: <10s, compared to a current median ~50s
- net_unittests: ~60s per shard x 15 shards = 900s, compared to a current median >6000s
- webkit_unit_tests: ~60s per shard x 5 shards = 300s, compared to a current median >2000s
(current values from http://shortn/_ywjdqcSlxq)
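One way to see why the batch limit matters is a toy cost model. It assumes the launcher starts one child process per batch of at most `batch_limit` tests and that each process pays a fixed startup cost. The startup and per-test costs below are hypothetical values picked for illustration, not measurements from the bots, and `estimated_runtime_s` is not a Chromium function.

```python
import math


def estimated_runtime_s(num_tests, batch_limit, startup_s, per_test_s):
    """Estimate serial runtime when each batch runs in a fresh process."""
    launches = math.ceil(num_tests / batch_limit)
    return launches * startup_s + num_tests * per_test_s


NUM_TESTS = 24000   # roughly the net_unittests count reported earlier in the thread
STARTUP_S = 0.25    # hypothetical per-process startup cost under ASan
PER_TEST_S = 0.01   # hypothetical cost of running one test body

one_per_process = estimated_runtime_s(NUM_TESTS, 1, STARTUP_S, PER_TEST_S)
ten_per_process = estimated_runtime_s(NUM_TESTS, 10, STARTUP_S, PER_TEST_S)

print(f"batch limit 1:  {one_per_process:.0f} s")
print(f"batch limit 10: {ten_per_process:.0f} s")
```

In this model the startup term dominates whenever the batch limit is 1, so anything that makes process startup slower (such as per-process test registration) is multiplied by the number of tests.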
,
Sep 21
Looks like a great improvement! Will you remove --test-launcher-batch-limit=1 entirely, or only remove it from the ASan CQ bots? I'm a bit concerned about the case where multiple tests running in the same process affect each other's behavior.
,
Sep 21
Planning to remove all current uses of the flag from the bots. That is a concern, though I don't think maintaining --test-launcher-batch-limit=1 is a good long-term way to handle it.
,
Sep 21
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ff1be83fd7a40f2c96e08d7f448f6907b014294b commit ff1be83fd7a40f2c96e08d7f448f6907b014294b Author: John Budorick <jbudorick@chromium.org> Date: Fri Sep 21 15:28:21 2018 Remove --test-launcher-batch-limit=1 from sanitizer tests. Bug: 794372 Change-Id: Ic68afc2f66f340cf807406c3446567f886cd005e Reviewed-on: https://chromium-review.googlesource.com/1237409 Reviewed-by: Stephen Martinis <martiniss@chromium.org> Commit-Queue: John Budorick <jbudorick@chromium.org> Cr-Commit-Position: refs/heads/master@{#593192} [modify] https://crrev.com/ff1be83fd7a40f2c96e08d7f448f6907b014294b/testing/buildbot/chromium.clang.json [modify] https://crrev.com/ff1be83fd7a40f2c96e08d7f448f6907b014294b/testing/buildbot/chromium.memory.json [modify] https://crrev.com/ff1be83fd7a40f2c96e08d7f448f6907b014294b/testing/buildbot/waterfalls.pyl
,
Sep 21
Linux ASan LSan Tests (1)
before: https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8934761588602796048/+/steps/Tests_statistics/0/logs/detailed_stats/0
after: https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8934760219412324608/+/steps/Tests_statistics/0/logs/detailed_stats/0
Comment 1 by martiniss@chromium.org
, Dec 13 2017