
Issue 674274

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: iOS , Mac
Pri: 0
Type: Bug

Blocked on:
issue 674435




latest clang asserts on internal iOS bots

Project Member Reported by thakis@chromium.org, Dec 14 2016

Issue description

e.g. https://uberchromegw.corp.google.com/i/internal.bling.main/builders/trunk-autoroller/builds/6970/steps/compile/logs/stdio

Filed https://llvm.org/bugs/show_bug.cgi?id=31374 , delta'ing a reduction atm.

The background here is that the clang roll before that broke tests on these bots (bug 673495) and I said that we could revert that time, but I wouldn't revert next time due to this having 0 bot coverage on the tot waterfall. We then fixed that miscompile, rolled forward, and that immediately triggered the next breakage on internal ios bots. Bug 673621 covers getting at least somewhat better test coverage.

I'll try to get a repro, revert the bad clang change, and roll forward, but the next next time I won't do that either unless there are tot bots.
 

Comment 1 by thakis@chromium.org, Dec 14 2016

(https://build.chromium.org/p/chromium.fyi/builders/ClangToTiOS/builds/9578 is happy. Maybe it should do release bots to find this? And Rohit says GTM is on that bot and just needs hooking up, so that should definitely happen)

Comment 2 by thakis@chromium.org, Dec 14 2016

But the reproducer I have is with -O0, so debug should be fine.
The clang crash was in gtm_http_fetcher, which is currently not a dependency we have in the public tree.  I'll see if I can get it there.  (In general, I'll work on getting as much code into the public tree as I can.)

Comment 4 by thakis@chromium.org, Dec 14 2016

current status: upstream regression identified and reverted, trying to build binaries here https://codereview.chromium.org/2576093002/ (but upstream buildbots look kinda unhappy, so that might fail)

Comment 6 by thakis@chromium.org, Dec 14 2016

Issue 674297 has been merged into this issue.
Cc: eugene...@chromium.org
Project Member

Comment 8 by bugdroid1@chromium.org, Dec 14 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b98c5b040ba9993be101449622a250819f0b5a34

commit b98c5b040ba9993be101449622a250819f0b5a34
Author: Nico Weber <thakis@chromium.org>
Date: Wed Dec 14 23:50:07 2016

clean up stale LLD checkouts on mac and linux bots

currently it's impossible to build clang packages due to this.

BUG= 674274 
R=hans@chromium.org

Review-Url: https://codereview.chromium.org/2579573002 .
Cr-Commit-Position: refs/heads/master@{#438681}

[modify] https://crrev.com/b98c5b040ba9993be101449622a250819f0b5a34/tools/clang/scripts/update.py

https://codereview.chromium.org/2580663002 made it through. I pushed it to goma; in ~2h it should be in all the data centers. I'll then kick off try runs.
Status: Upload to goma has completed, and I kicked off try jobs on https://codereview.chromium.org/2580663002/ with the command described in src/docs/updating_clang.md. The bots time out after a total run time of 2h, and since a clang roll means goma's cache is completely empty, some of the runs usually time out. So in a bit over 2h that command needs to be run again, and then 2h from then hopefully all the bots will be fairly green and the roll can land.
Blockedon: 674435
Most try results are back. Everything looks great except Mac ASan. Filed bug 674435; the roll has failed. (Also visible on https://build.chromium.org/p/chromium.fyi/builders/ClangToTMacASan%20tester but that hadn't cycled yet when I kicked things off.)

Guess we'll try to fix / revert that and then try again tomorrow (Thu). But since this took until 3am today (Wed), I'll come in later tomorrow and I won't try as long tomorrow, so who knows if I'll finish a roll attempt tomorrow.

Comment 14 by jif@chromium.org, Dec 15 2016

Issue 674449 has been merged into this issue.
Note that since #3 I removed gtm_http_fetcher as it was unnecessary. But there still is a similar error with the library that replaces it: gtm_session_fetcher (see issue 674449 for stack trace).
Maybe the roll can be saved after all; see the other bug.

Comment 17 by jif@chromium.org, Dec 15 2016

In case this helps, here is the preprocessed source, associated run script, and crash backtrace.
Attachments:
GTLRURITemplate-73aff2.m (6.2 MB)
GTLRURITemplate-73aff2.sh (7.5 KB)
clang_2016-12-15-180225_jeanfrancoisg-macpro.crash (12.2 KB)
The crash is already fixed upstream, see comment 4 and llvm.org bug in comment 0. We need to push out a new clang binary with the fix. That ran into a whole host of other issues. We thought we had fixed them all and built new packages, but someone broke asan on mac while we weren't looking (see blocking bug). Not 100% clear yet if there's a workaround for that, or if that needs another upstream change and new packages pushed everywhere yet again.
Blockedon: 674665
Blockedon: -674665
Project Member

Comment 21 by bugdroid1@chromium.org, Dec 16 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ecd7aefbeb1138d1bff9302f8061e28a058d8ac1

commit ecd7aefbeb1138d1bff9302f8061e28a058d8ac1
Author: hans <hans@chromium.org>
Date: Fri Dec 16 06:38:43 2016

Roll clang 289575:289864.

Ran `tools/clang/scripts/upload_revision.py 289864`.

BUG= 674274 

Review-Url: https://codereview.chromium.org/2585473003
Cr-Commit-Position: refs/heads/master@{#439049}

[modify] https://crrev.com/ecd7aefbeb1138d1bff9302f8061e28a058d8ac1/tools/clang/scripts/update.py

Current status: We just landed a roll that fixed that compiler crash. However, in the meantime upstream added another compiler crash. We fixed that too, and built yet another compiler that we pushed to goma. It's now on all the bots, and https://codereview.chromium.org/2582763002/ is running try jobs. Since the last roll landed a few hours ago, we're hoping that the bots will be merciful. There's now an iOS tot bot at https://uberchromegw.corp.google.com/i/internal.bling.fyi/builders/clang-tot-device whose compile cycled green after today's fix, so compile should actually work after this roll. (some of the tests fail, but they look mostly harmless to me, not like compiler bugs. In base_unittests a partitionalloc test fails, but partitionalloc just got moved to base and probably is just failing on ios and needs to be ifdef'd out. And then a few "eg" tests, which look like ui integration tests, which usually are a bit flaky)
Ok, I'll sleep a bit. I started a bash loop [1] that reruns tryjobs on the roll every 45 min for a while, so that goma caches get warmed up (the first few jobs on linux_rel and linux_asan usually time out 'cause compile caches aren't warm enough yet). If anyone thinks that boxes on https://codereview.chromium.org/2582763002/ look sufficiently green, feel free to hit cq. The only thing I'm a tiny bit worried about is msan since r289878 touched the runtime. Hopefully it'll just work, we'll see. I'll be back in 5.5 hours.


1: $ for i in {1..6}; do sleep 2700; git cl try && \
       git cl try -m tryserver.chromium.mac -b mac_chromium_asan_rel_ng && \
       git cl try -m tryserver.chromium.linux -b linux_chromium_chromeos_dbg_ng \
         -b linux_chromium_chromeos_asan_rel_ng -b linux_chromium_msan_rel_ng && \
       git cl try -m tryserver.blink -b linux_trusty_blink_rel ; done
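
For anyone who wants to rerun the same schedule with different timing, here is a minimal sketch of the loop above as a shell function. The function name `rerun_tryjobs`, its two positional arguments, and the `DRY_RUN` variable are made up for illustration and are not part of depot_tools; the `git cl try` invocations are copied from the loop above.

```shell
# Hypothetical helper (not part of depot_tools): repeat the try-job commands
# from the loop above a fixed number of times so goma caches can warm up.
# The body runs in a subshell "( ... )" so its variables do not leak out.
rerun_tryjobs() (
  rounds="${1:-6}"        # number of rounds; the original loop uses 6
  interval="${2:-2700}"   # seconds to wait before each round; 2700 s = 45 min
  run=""
  # DRY_RUN is an illustrative switch: when set to 1, print each command
  # instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then run="echo"; fi

  i=1
  while [ "$i" -le "$rounds" ]; do
    sleep "$interval"
    # Same && chain as the original: a failing command skips the rest of
    # this round, and the next round still runs.
    $run git cl try &&
      $run git cl try -m tryserver.chromium.mac -b mac_chromium_asan_rel_ng &&
      $run git cl try -m tryserver.chromium.linux \
        -b linux_chromium_chromeos_dbg_ng \
        -b linux_chromium_chromeos_asan_rel_ng \
        -b linux_chromium_msan_rel_ng &&
      $run git cl try -m tryserver.blink -b linux_trusty_blink_rel
    i=$((i + 1))
  done
)
```

Running `DRY_RUN=1 rerun_tryjobs 1 0` prints the four commands of a single round without sending anything, which is a cheap way to check the builder names before starting the real loop with `rerun_tryjobs`.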
(link to msan change: https://reviews.llvm.org/D27791)
Looks like linux_chromium_rel_ng is now consistently failing while running telemetry_perf_unittests (see for example https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng/builds/357983).

One of the tests (core.stacktrace_unittest.TabStackTraceTest.testCrashSymbols) makes the application crash and then inspects the call stack to check that it is correctly symbolicated, but it is not. I see the following in the dumped call stack: "0x7fb445401000 - 0x7fb44e962fff  chrome  ???  (main)  (WARNING: Corrupt symbols, chrome, 1A8ECE9F3396DE54D4501AAFB73B08180)". I don't know whether this is related to the test failure, though.
Do you know how to run that locally?
Filing this back from the roll CL:

"""
Wait, that bug is about telemetry_unittests, while the failures are
telemetry_perf_unittests with these failures:

failures:
core.stacktrace_unittest.TabStackTraceTest.testCrashSymbols
core.stacktrace_unittest.TabStackTraceTest.testBadBreakpadFileIgnored
core.stacktrace_unittest.TabStackTraceTest.testCrashMinimalSymbols
 

Literally the one commit before I was done fixing upstream bugs (289925,
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161212/413488.html)
changed .debug_line so maybe this is real. Looking :-(
"""

I've tried debugging from afar, but `out/gn/chrome about:crash` requires UI and I only have a text ssh session at hand. So I'll shower and drive to the office now to look more. In parallel, I've speculatively reverted that debug info change in 289944 and I'm building binaries with that revert at https://codereview.chromium.org/2577403002
Project Member

Comment 28 by bugdroid1@chromium.org, Dec 16 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome/ios_internal.git/+/0a098540ff45a3e30b973ecbce56f48baf844c2d

commit 0a098540ff45a3e30b973ecbce56f48baf844c2d
Author: Sylvain Defresne <sdefresne@google.com>
Date: Fri Dec 16 15:31:58 2016

Updates from the ios_internal side:

Last night's clang roll fixed the x86 (simulator) compiler crash, so we've rolled it into the internal tree.  The arm (device 32bit) crash still exists, but we've temporarily switched to use xcode's clang for that bot.

The internal tree is no longer blocked by the clang issues.  I'd like to switch back to chromium's clang as soon as I can, but this is not as much of a fire on our end now.

Thank you very much for that update!
What's with the web_shell_egtest failures on https://uberchromegw.corp.google.com/i/internal.bling.fyi/builders/clang-tot-device , are those expected? (All other tests have cycled green by now)
Cc: baxley@chromium.org
I doubt that the web_shell_egtest failures are caused by a compiler bug. baxley@, do you know why WebShellEGTests are not feeling well?
Please ignore the web_shell_egtest failures, I'm pretty confident they are unrelated to the clang roll.

The web_shell_egtests are failing on our main waterfall too.  I think we added those tests to device bots before they were ready.
I think rohitrao@ is correct regarding the web shell tests on devices. I landed a CL upstream to not run them on devices, with a bug to investigate. Sorry if this added confusion.
Project Member

Comment 35 by bugdroid1@chromium.org, Dec 16 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome/ios_internal.git/+/256d8ec640d7c1955865cca0c1bcb39a402b1dd3

commit 256d8ec640d7c1955865cca0c1bcb39a402b1dd3
Author: Rohit Rao <rohitrao@google.com>
Date: Fri Dec 16 21:53:42 2016

Project Member

Comment 36 by bugdroid1@chromium.org, Dec 17 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4cbe831867596ba640a3de6bc9318707734be569

commit 4cbe831867596ba640a3de6bc9318707734be569
Author: thakis <thakis@chromium.org>
Date: Sat Dec 17 05:12:50 2016

Roll clang 289864:289944.

Ran `tools/clang/scripts/upload_revision.py 289944`.

BUG= 674274 

Review-Url: https://codereview.chromium.org/2577403002
Cr-Commit-Position: refs/heads/master@{#439325}

[modify] https://crrev.com/4cbe831867596ba640a3de6bc9318707734be569/tools/clang/scripts/update.py

Project Member

Comment 37 by bugdroid1@chromium.org, Dec 17 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome/ios_internal.git/+/e297c1176d7e06479dae14dd8bf89bc45a4639a6

commit e297c1176d7e06479dae14dd8bf89bc45a4639a6
Author: sdefresne <sdefresne@google.com>
Date: Sat Dec 17 15:09:15 2016

Project Member

Comment 38 by bugdroid1@chromium.org, Dec 17 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome/ios_internal.git/+/80433a45b1aa650a6c7e62e5c99c5128ed56845e

commit 80433a45b1aa650a6c7e62e5c99c5128ed56845e
Author: sdefresne <sdefresne@google.com>
Date: Sat Dec 17 15:11:30 2016

Status: Fixed (was: Assigned)
Thank you for quickly fixing the issue even though it was happening on downstream only bots.
