latest clang asserts on internal iOS bots
Issue description: e.g. https://uberchromegw.corp.google.com/i/internal.bling.main/builders/trunk-autoroller/builds/6970/steps/compile/logs/stdio Filed https://llvm.org/bugs/show_bug.cgi?id=31374, delta'ing a reduction atm. The background here is that the clang roll before that broke tests on these bots (bug 673495) and I said that we could revert that time, but I wouldn't revert next time due to this having 0 bot coverage on the tot waterfall. We then fixed that miscompile, rolled forward, and that immediately triggered the next breakage on internal ios bots. Bug 673621 covers getting at least somewhat better test coverage. I'll try to get a repro, revert the bad clang change, and roll forward, but the next next time I won't do that either unless there are tot bots.
,
Dec 14 2016
But the reproducer I have is with -O0, so debug should be fine.
,
Dec 14 2016
The clang crash was in gtm_http_fetcher, which is currently not a dependency we have in the public tree. I'll see if I can get it there. (In general, I'll work on getting as much code into the public tree as I can.)
,
Dec 14 2016
current status: upstream regression identified and reverted, trying to build binaries here https://codereview.chromium.org/2576093002/ (but upstream buildbots look kinda unhappy, so that might fail)
,
Dec 14 2016
Additional attempts: https://codereview.chromium.org/2577833002/ https://codereview.chromium.org/2575203002/
,
Dec 14 2016
Issue 674297 has been merged into this issue.
,
Dec 14 2016
,
Dec 14 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b98c5b040ba9993be101449622a250819f0b5a34 commit b98c5b040ba9993be101449622a250819f0b5a34 Author: Nico Weber <thakis@chromium.org> Date: Wed Dec 14 23:50:07 2016 clean up stale LLD checkouts on mac and linux bots currently it's impossible to build clang packages due to this. BUG= 674274 R=hans@chromium.org Review-Url: https://codereview.chromium.org/2579573002 . Cr-Commit-Position: refs/heads/master@{#438681} [modify] https://crrev.com/b98c5b040ba9993be101449622a250819f0b5a34/tools/clang/scripts/update.py
,
Dec 15 2016
More: https://codereview.chromium.org/2580663002 (also https://codereview.chromium.org/2579543002 https://codereview.chromium.org/2573353002 but those didn't work out)
,
Dec 15 2016
https://codereview.chromium.org/2580663002 made it through. I pushed it to goma; in ~2h it should be in all the data centers. I'll then kick off try runs.
,
Dec 15 2016
Status: Upload to goma has completed, I kicked off try jobs on https://codereview.chromium.org/2580663002/ with the command described in src/docs/updating_clang.md . The bots have a timeout after a total run time of 2h, and since a clang roll means goma's cache is completely empty, some of the runs usually time out. So in a bit over 2h that command needs to be run again, and then 2h from then hopefully all the bots will be fairly green and the roll can land.
,
Dec 15 2016
,
Dec 15 2016
Most try results are back. Everything looks great except Mac ASan. Filed bug 674435 , the roll has failed. (Also visible on https://build.chromium.org/p/chromium.fyi/builders/ClangToTMacASan%20tester but that hadn't cycled yet when I kicked things off.) Guess we'll try to fix / revert that and then try again tomorrow (Thu). But since this took until 3am today (Wed), I'll come in later tomorrow and I won't try as long tomorrow, so who knows if I'll finish a roll attempt tomorrow.
,
Dec 15 2016
Issue 674449 has been merged into this issue.
,
Dec 15 2016
Note that since #3 I removed gtm_http_fetcher as it was unnecessary. But there still is a similar error with the library that replaces it: gtm_session_fetcher (see issue 674449 for stack trace).
,
Dec 15 2016
Maybe the roll can be saved after all, see other bug.
,
Dec 15 2016
In case this helps, here is the preprocessed source, associated run script, and crash backtrace.
,
Dec 15 2016
The crash is already fixed upstream, see comment 4 and llvm.org bug in comment 0. We need to push out a new clang binary with the fix. That ran into a whole host of other issues. We thought we had fixed them all and built new packages, but someone broke asan on mac while we weren't looking (see blocking bug). Not 100% clear yet if there's a workaround for that, or if that needs another upstream change and new packages pushed everywhere yet again.
,
Dec 15 2016
,
Dec 15 2016
,
Dec 16 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ecd7aefbeb1138d1bff9302f8061e28a058d8ac1 commit ecd7aefbeb1138d1bff9302f8061e28a058d8ac1 Author: hans <hans@chromium.org> Date: Fri Dec 16 06:38:43 2016 Roll clang 289575:289864. Ran `tools/clang/scripts/upload_revision.py 289864`. BUG= 674274 Review-Url: https://codereview.chromium.org/2585473003 Cr-Commit-Position: refs/heads/master@{#439049} [modify] https://crrev.com/ecd7aefbeb1138d1bff9302f8061e28a058d8ac1/tools/clang/scripts/update.py
,
Dec 16 2016
Current status: We just landed a roll that fixed that compiler crash. However, in the meantime upstream added another compiler crash. We fixed that too, and built yet another compiler that we pushed to goma. It's now on all the bots, and https://codereview.chromium.org/2582763002/ is running try jobs. Since the last roll landed a few hours ago, we're hoping that the bots will be merciful. There's now an iOS tot bot at https://uberchromegw.corp.google.com/i/internal.bling.fyi/builders/clang-tot-device whose compile cycled green after today's fix, so compile should actually work after this roll. (some of the tests fail, but they look mostly harmless to me, not like compiler bugs. In base_unittests a partitionalloc test fails, but partitionalloc just got moved to base and probably is just failing on ios and needs to be ifdef'd out. And then a few "eg" tests, which look like ui integration tests, which usually are a bit flaky)
,
Dec 16 2016
Ok, I'll sleep a bit. I started a bash loop [1] that reruns tryjobs on the roll every 45 min for a while, so that goma caches get warmed up (the first few jobs on linux_rel and linux_asan usually time out 'cause compile caches aren't warm enough yet). If anyone thinks that boxes on https://codereview.chromium.org/2582763002/ look sufficiently green, feel free to hit cq. The only thing I'm a tiny bit worried about is msan since r289878 touched the runtime. Hopefully it'll just work, we'll see. I'll be back in 5.5 hours. 1: $ for i in {1..6}; do sleep 2700; git cl try && git cl try -m tryserver.chromium.mac -b mac_chromium_asan_rel_ng && git cl try -m tryserver.chromium.linux -b linux_chromium_chromeos_dbg_ng -b linux_chromium_chromeos_asan_rel_ng -b linux_chromium_msan_rel_ng && git cl try -m tryserver.blink -b linux_trusty_blink_rel ; done
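The loop in [1] can be sketched as a small reusable script. This is only a sketch: the function names retry_tryjobs and trigger_roll_tryjobs are hypothetical, while the `git cl try` invocations and builder names are copied from the loop above. As in the original loop, a failed round does not stop later rounds.

```shell
#!/usr/bin/env bash
# Sketch of the tryjob retry loop. The function names are hypothetical;
# the git cl try commands and builder names come from the comment above.

# Usage: retry_tryjobs ROUNDS DELAY_SECONDS COMMAND [ARGS...]
# Sleeps DELAY_SECONDS, then runs COMMAND, once per round.
# A failing run is reported on stderr but does not stop later rounds.
retry_tryjobs() {
  local rounds="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$rounds" ]; do
    sleep "$delay"
    echo "round $i/$rounds"
    "$@" || echo "round $i failed, continuing" >&2
    i=$((i + 1))
  done
}

# All tryjob requests from the original loop, grouped into one function.
trigger_roll_tryjobs() {
  git cl try &&
  git cl try -m tryserver.chromium.mac -b mac_chromium_asan_rel_ng &&
  git cl try -m tryserver.chromium.linux \
    -b linux_chromium_chromeos_dbg_ng \
    -b linux_chromium_chromeos_asan_rel_ng \
    -b linux_chromium_msan_rel_ng &&
  git cl try -m tryserver.blink -b linux_trusty_blink_rel
}

# Example (not executed here): six rounds, 45 minutes apart.
# retry_tryjobs 6 2700 trigger_roll_tryjobs
```

Passing the command as arguments keeps the timing logic separate from the list of builders, so the loop can be dry-run with a harmless command such as `echo` before pointing it at the real tryjobs.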
,
Dec 16 2016
(link to msan change: https://reviews.llvm.org/D27791)
,
Dec 16 2016
Looks like linux_chromium_rel_ng is now consistently failing while running telemetry_perf_unittests (see for example https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng/builds/357983). One of the tests (core.stacktrace_unittest.TabStackTraceTest.testCrashSymbols) makes the application crash and then inspects the callstack to check that it is correctly symbolicated, but it is not. I see the following in the dumped callstack: "0x7fb445401000 - 0x7fb44e962fff chrome ??? (main) (WARNING: Corrupt symbols, chrome, 1A8ECE9F3396DE54D4501AAFB73B08180)". I don't know whether this is related to the test failure, though.
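For anyone checking other runs for the same symptom, a minimal sketch, assuming the dumped callstack has been saved to a text file. The function name find_corrupt_symbols and the file path in the example are hypothetical; the warning text is the one quoted above.

```shell
# Prints every line of a saved stack dump that carries the
# "Corrupt symbols" warning. Exits non-zero when no such line exists.
# Assumption: the dump was saved to a plain text file beforehand.
find_corrupt_symbols() {
  grep -F 'WARNING: Corrupt symbols' "$1"
}

# Example (hypothetical path, not executed here):
# find_corrupt_symbols /tmp/stack_dump.txt
```

The `-F` flag makes grep treat the warning as a fixed string rather than a regular expression.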
,
Dec 16 2016
Do you know how to run that locally?
,
Dec 16 2016
Filling back from the roll cl: """ Wait, that bug is about telemetry_unittests, while the failures are telemetry_perf_unittests with these failures: failures: core.stacktrace_unittest.TabStackTraceTest.testCrashSymbols core.stacktrace_unittest.TabStackTraceTest.testBadBreakpadFileIgnored core.stacktrace_unittest.TabStackTraceTest.testCrashMinimalSymbols Literally the one commit before I was done fixing upstream bugs (289925, http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161212/413488.html) changed .debug_line so maybe this is real. Looking :-( """ I've tried debugging from afar, but `out/gn/chrome about:crash` requires UI and I only have a text ssh session at hand. So I'll shower and drive to the office now to look more. In parallel, I've speculatively reverted that debug info change in 289944 and I'm building binaries with that revert at https://codereview.chromium.org/2577403002
,
Dec 16 2016
The following revision refers to this bug: https://chrome-internal.googlesource.com/chrome/ios_internal.git/+/0a098540ff45a3e30b973ecbce56f48baf844c2d commit 0a098540ff45a3e30b973ecbce56f48baf844c2d Author: Sylvain Defresne <sdefresne@google.com> Date: Fri Dec 16 15:31:58 2016
,
Dec 16 2016
Updates from the ios_internal side: Last night's clang roll fixed the x86 (simulator) compiler crash, so we've rolled it into the internal tree. The arm (device 32bit) crash still exists, but we've temporarily switched to use xcode's clang for that bot. The internal tree is no longer blocked by the clang issues. I'd like to switch back to chromium's clang as soon as I can, but this is not as much of a fire on our end now.
,
Dec 16 2016
Thank you very much for that update!
,
Dec 16 2016
What's with the web_shell_egtest failures on https://uberchromegw.corp.google.com/i/internal.bling.fyi/builders/clang-tot-device , are those expected? (All other tests have cycled green by now)
,
Dec 16 2016
I doubt that the web_shell_egtest failures are caused by a compiler bug. baxley@, do you know why WebShellEGTests are not feeling well?
,
Dec 16 2016
Please ignore the web_shell_egtest failures, I'm pretty confident they are unrelated to the clang roll. The web_shell_egtests are failing on our main waterfall too. I think we added those tests to device bots before they were ready.
,
Dec 16 2016
I think rohitrao@ is correct regarding the web shell tests on devices. I landed a CL upstream to not run them on devices, with a bug to investigate. Sorry if this added confusion.
,
Dec 16 2016
The following revision refers to this bug: https://chrome-internal.googlesource.com/chrome/ios_internal.git/+/256d8ec640d7c1955865cca0c1bcb39a402b1dd3 commit 256d8ec640d7c1955865cca0c1bcb39a402b1dd3 Author: Rohit Rao <rohitrao@google.com> Date: Fri Dec 16 21:53:42 2016
,
Dec 17 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4cbe831867596ba640a3de6bc9318707734be569 commit 4cbe831867596ba640a3de6bc9318707734be569 Author: thakis <thakis@chromium.org> Date: Sat Dec 17 05:12:50 2016 Roll clang 289864:289944. Ran `tools/clang/scripts/upload_revision.py 289944`. BUG= 674274 Review-Url: https://codereview.chromium.org/2577403002 Cr-Commit-Position: refs/heads/master@{#439325} [modify] https://crrev.com/4cbe831867596ba640a3de6bc9318707734be569/tools/clang/scripts/update.py
,
Dec 17 2016
The following revision refers to this bug: https://chrome-internal.googlesource.com/chrome/ios_internal.git/+/e297c1176d7e06479dae14dd8bf89bc45a4639a6 commit e297c1176d7e06479dae14dd8bf89bc45a4639a6 Author: sdefresne <sdefresne@google.com> Date: Sat Dec 17 15:09:15 2016
,
Dec 17 2016
The following revision refers to this bug: https://chrome-internal.googlesource.com/chrome/ios_internal.git/+/80433a45b1aa650a6c7e62e5c99c5128ed56845e commit 80433a45b1aa650a6c7e62e5c99c5128ed56845e Author: sdefresne <sdefresne@google.com> Date: Sat Dec 17 15:11:30 2016
,
Dec 19 2016
,
Dec 19 2016
Thank you for quickly fixing the issue even though it was happening on downstream only bots.
Comment 1 by thakis@chromium.org
, Dec 14 2016