LLDB core dumps when debugging net_unittests
Issue description
Chrome Version: 51.0.2704.7
OS Version: OS X 10.11.4

What steps will reproduce the problem?
In a regular Chromium checkout on Mac:
1. git checkout 77d41139d261342a429d2775c59d8e8a386d4c81 (Note: It happens at ToT as well; the above is the first bad commit by my bisect.)
2. gclient sync -j32 -D
3. ninja -C out/Debug -j8 net_unittests
4. cd out/Debug
5. lldb net_unittests
6. breakpoint set -f http_stream_factory_impl_job.cc -l 358
7. process launch -- --gtest_filter=*HttpStreamFactoryImplRequestTest.SetPriority*
8. [After breakpoint hit] print priority_

What is the expected result?
I expect to see the value of the priority_ data member.

What happens instead of that?
Boom. (I.e. the debugger crashes with a segfault.)

Please provide any additional information below. Attach a screenshot if possible.
The lldb being used above is on my path at /usr/bin/lldb. The version of that binary is:
$ lldb -version
lldb-350.0.21.7
I believe that is from the installation of Xcode 7.3.1 (which is the Xcode on my machine).

UserAgentString: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.7 Safari/537.36
,
Apr 27 2016
+tzik who wrote that patch +zturner who knows lldb well. Zach, do you know if this still repros at LLVM trunk?
,
Apr 27 2016
[+zturner@ for c#2]
,
Apr 27 2016
I'm not set up to test on OSX, but if there's a way to get a core dump of LLDB itself made available on an external link, I can talk to the people at Apple to see if they can help diagnose the problem. I looked at the source code of the line triggering the crash, and it looks like you are trying to print the value of a class member variable. So could you try printing the value of `this` first? Perhaps it is null, or pointing to garbage memory, and LLDB triggers an access violation trying to read from a protected address or something.
,
Apr 27 2016
I've uploaded the core file to Google Drive; globally sharable on the web link from my personal account: https://drive.google.com/open?id=0B0bQbesna1UTMmxpV1M4dzVtMjQ. Please let me know when it's no longer being used, as I'd like to reclaim the 5GB of space it's using :-}. I tried the "print this" experiment and it segfaulted right then. Having said that, I don't think this is null, since this is a test that gets run all the time in regular chrome and my repro is on an unmodified checkout.
Log:
rdsmith-macbookpro:../out/Debug [master] $ lldb net_unittests
(lldb) target create "net_unittests"
Current executable set to 'net_unittests' (x86_64).
(lldb) breakpoint set -f http_stream_factory_impl_job.cc -l 356
Breakpoint 1: where = net_unittests`net::HttpStreamFactoryImpl::Job::SetPriority(net::RequestPriority) + 33 at http_stream_factory_impl_job.cc:356, address = 0x0000000103a00be1
(lldb) process launch -- --gtest_filter=*HttpStreamFactoryImplRequestTest.SetPriority*
Process 33176 launched: '/Users/rdsmith/Sandboxen/chrome/src/out/Debug/net_unittests' (x86_64)
Debugger detected, switching to single process mode.
Pass --test-launcher-debug-launcher to debug the launcher itself.
Detected presence of a debugger, running without test timeouts.
Note: Google Test filter = *HttpStreamFactoryImplRequestTest.SetPriority*
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from NextProto/HttpStreamFactoryImplRequestTest
[ RUN ] NextProto/HttpStreamFactoryImplRequestTest.SetPriority/0
Process 33176 stopped
* thread #1: tid = 0x6a00b2, 0x0000000103a00be1 net_unittests`net::HttpStreamFactoryImpl::Job::SetPriority(this=0x000000010b804e00, priority=MEDIUM) + 33 at http_stream_factory_impl_job.cc:356, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000103a00be1 net_unittests`net::HttpStreamFactoryImpl::Job::SetPriority(this=0x000000010b804e00, priority=MEDIUM) + 33 at http_stream_factory_impl_job.cc:356
353 }
354
355 void HttpStreamFactoryImpl::Job::SetPriority(RequestPriority priority) {
-> 356 priority_ = priority;
357 // TODO(akalin): Propagate this to |connection_| and maybe the
358 // preconnect state.
359 }
(lldb) print this
Segmentation fault: 11 (core dumped)
rdsmith-macbookpro:../out/Debug [master] $ ls -alg /cores
total 9293952
drwxrwxr-t@ 3 admin 102 Apr 27 13:20 .
drwxr-xr-x 33 wheel 1190 Mar 28 16:59 ..
-r-------- 1 admin 4758503424 Apr 27 13:21 core.33175
rdsmith-macbookpro:../out/Debug [master] $
,
Apr 27 2016
Can you confirm what compiler is being used? And if it is clang, is -gmodules being used? I'm pretty sure the answer is no, just want to be sure.
,
Apr 27 2016
I did a ninja -v, then cut & pasted the command line for the compilation unit in question and suffixed --version. The output is below, but I believe the answers to your questions are: clang version 3.9.0, no -gmodules.
rdsmith-macbookpro:../out/Debug [master] $ ../../third_party/llvm-build/Release+Asserts/bin/clang++ -MMD -MF obj/net/http/net.http_stream_factory_impl_job.o.d -DV8_DEPRECATION_WARNINGS -D__ASSERT_MACROS_DEFINE_VERSIONS_WITHOUT_UNDERSCORE=0 -DCHROMIUM_BUILD -DCR_CLANG_REVISION=267383-1 -DUSE_LIBJPEG_TURBO=1 -DENABLE_WEBRTC=1 -DENABLE_MEDIA_ROUTER=1 -DENABLE_PEPPER_CDMS -DENABLE_NOTIFICATIONS -DENABLE_TOPCHROME_MD=1 -DFIELDTRIAL_TESTING_ENABLED -DENABLE_TASK_MANAGER=1 -DENABLE_EXTENSIONS=1 -DENABLE_PDF=1 -DENABLE_PLUGIN_INSTALLATION=1 -DENABLE_PLUGINS=1 -DENABLE_SESSION_SERVICE=1 -DENABLE_THEMES=1 -DENABLE_AUTOFILL_DIALOG=1 -DENABLE_PRINTING=1 -DENABLE_BASIC_PRINTING=1 -DENABLE_PRINT_PREVIEW=1 -DENABLE_SPELLCHECK=1 -DUSE_BROWSER_SPELLCHECKER=1 -DENABLE_CAPTIVE_PORTAL_DETECTION=1 -DENABLE_APP_LIST=1 -DENABLE_SETTINGS_APP=1 -DENABLE_SUPERVISED_USERS=1 -DENABLE_SERVICE_DISCOVERY=1 -DV8_USE_EXTERNAL_STARTUP_DATA -DFULL_SAFE_BROWSING -DSAFE_BROWSING_CSD -DSAFE_BROWSING_DB_LOCAL -DNET_IMPLEMENTATION -DUSE_KERBEROS -DDLOPEN_KERBEROS -DENABLE_BUILT_IN_DNS -DPROTOBUF_USE_DLLS -DGOOGLE_PROTOBUF_NO_RTTI -DGOOGLE_PROTOBUF_NO_STATIC_INITIALIZER -DU_USING_ICU_NAMESPACE=0 -DU_ENABLE_DYLOAD=0 -DU_NOEXCEPT= -DU_STATIC_IMPLEMENTATION -DUSE_LIBPCI=1 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DWTF_USE_DYNAMIC_ANNOTATIONS=1 -Igen -I../.. -I../../sdch/open-vcdiff/src -I../../third_party/boringssl/src/include -I../../third_party/protobuf/src -I../../third_party/zlib -Igen/protoc_out -I../../third_party/icu/source/i18n -I../../third_party/icu/source/common -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -O0 -gdwarf-2 -fvisibility=hidden -Werror -mmacosx-version-min=10.7 -arch x86_64 -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wno-selector-type-mismatch -Wpartial-availability -Wheader-hygiene -Wno-char-subscripts -Wno-unneeded-internal-declaration -Wno-covered-switch-default -Wstring-conversion -Wno-c++11-narrowing -Wno-deprecated-register -Wno-inconsistent-missing-override -Wno-shift-negative-value -Wno-undefined-var-template -Wexit-time-destructors -std=c++11 -stdlib=libc++ -fno-rtti -fno-exceptions -fvisibility-inlines-hidden -fno-threadsafe-statics -Xclang -load -Xclang /Users/rdsmith/Sandboxen/chrome/src/third_party/llvm-build/Release+Asserts/lib/libFindBadConstructs.dylib -Xclang -add-plugin -Xclang find-bad-constructs -Xclang -plugin-arg-find-bad-constructs -Xclang check-templates -Xclang -plugin-arg-find-bad-constructs -Xclang follow-macro-expansion -Xclang -plugin-arg-find-bad-constructs -Xclang check-implicit-copy-ctors -fcolor-diagnostics -fno-strict-aliasing -fstack-protector-all -c ../../net/http/http_stream_factory_impl_job.cc -o obj/net/http/net.http_stream_factory_impl_job.o --version
clang version 3.9.0 (trunk 267383)
Target: x86_64-apple-darwin15.4.0
Thread model: posix
InstalledDir: /Users/rdsmith/Sandboxen/chrome/src/out/Debug/../../third_party/llvm-build/Release+Asserts/bin
rdsmith-macbookpro:../out/Debug [master] $
,
Apr 27 2016
Do you have any files in ~/Library/Logs/DiagnosticReports that start with lldb and have a time stamp matching the time of one of the crashes? If so, could you attach one (or email me offline if it might contain sensitive information)?
,
Apr 27 2016
I believe this is the diagnostic report that matches the crash I put on Google Drive. It being an OS distributor's binary of an open source program debugging another open source program, I *believe* I'm safe from the sensitive information problem :-}.
,
Apr 27 2016
So apparently there's an infinite recursion bug, and unfortunately not enough stack frames are captured to see where the recursion is originating from. You can try updating to the Xcode 7.3.1 developer preview to see if the bug has been fixed in that version of LLDB. As a workaround, you can try using the 'frame variable' command instead of the 'print' command. 'print' is a little more powerful because it can evaluate arbitrary expressions, but 'frame variable' doesn't use the expression evaluator, so if 'frame variable' works and 'print' does not, the bug is somewhere in the expression evaluator. I think the way to get a better bug report (assuming the latest LLDB doesn't fix it) is to build LLDB from source and see if you can get a better stack trace.
,
Apr 27 2016
So "frame variable priority_" does work, which suggests that the bug is indeed in the expression evaluator. When you suggest trying the developer preview of 7.3.1, is that a different binary than the normal 7.3.1? I'm currently running 7.3.1 (7D1012). If there's some other 7.3.1 I should look at, I'll give that a try, but could you give me a pointer to it? If not, I'll try to build lldb from source and get a better dump sometime soon, but that may not happen instantly.
,
Apr 27 2016
That one should be the latest I think. You might also try "log enable -f ~/lldb.log lldb all" and then reproduce the crash. Best way I can think of to get better information on the events leading up to the recursion.
,
Apr 27 2016
Log from "log enable -f ~/lldb.log lldb all" attached.
,
Apr 29 2016
You can remove the google drive link to free up space if you need to. Any chance you could upload a net_unittests binary and a copy of the matching source for that one file?
,
May 3 2016
Removed. Source file @ https://drive.google.com/open?id=0B0bQbesna1UTWno3bXJsSDR6cW8 . net_unittests @ https://drive.google.com/open?id=0B0bQbesna1UTcklDcDNzckhvbDg . Both should be viewable by anyone on the web. Is there a reasonable chance that you (sic?) can make progress with those uploads? I spent some time trying to get a source compile of lldb on my laptop last night, and ran into the usual problem of a series of required dependencies. I've slogged through several of them, but still need to do SWIG and PCRE. If I can save myself the hassle, I wouldn't mind :-}. (My old buildable lldb checkout was on Linux.)
,
May 3 2016
Hopefully this is sufficient. I'll let you know. Thanks!
,
May 3 2016
Thank *you*!
,
May 3 2016
Here is the response I got from Apple folks (sorry for all the trouble we have to go through to get this figured out!):
--------------------------------
We will need the dSYM file as well in order to be able to debug this. If there is no dSYM file next to the executable, then you can easily make it:
% dsymutil net_unittests
The UUID of the "net_unittests" binary needs to match, so you might end up having to attach both the executable (net_unittests) and the dSYM (net_unittests.dSYM) again because the one that was already uploaded might be different. "dwarfdump --uuid <file>" can be used to test and verify:
% dwarfdump --uuid ~/Downloads/net_unittests
UUID: 4A891057-9747-333F-92F5-547E76703D2A (x86_64) /Volumes/work/<username>/Downloads/net_unittests
This can also be run on the dSYM bundle:
% dwarfdump --uuid ~/Downloads/net_unittests.dSYM
UUID: 4A891057-9747-333F-92F5-547E76703D2A (x86_64) /Volumes/work/<username>/Downloads/net_unittests
The dSYM file is a bundle so it should probably be zipped up before attaching it. If we can get the exe + dSYM we should be able to reproduce this issue. You might also quickly retest to ensure the bug still happens when a dSYM file is available. Some bugs are related to LLDB and its ability to track down types in other files when using "-gmodules", and many of those bugs go away when you use a dSYM file, since "dsymutil", which is a symlink to "llvm-dsymutil", will link all of the modules into one big DWARF file.
,
May 4 2016
Bundle available @ https://drive.google.com/open?id=0B0bQbesna1UTOFBaVXhzMVV2Ukk. I created it with "tar -czf ..." since that's the bundling format I'm familiar with and I didn't want to guess whether my use of gzip was compatible with their use of zip; I know they have access to "tar" :-}. It has both net_unittests and the net_unittests.dSYM directory in it. I repro'd the bug after creating the .dSYM file, but didn't do anything special in lldb to read the .dSYM file in, so if that's necessary for the test they requested you should guide me more precisely. And yeah, this is a lot of rounds. OTOH, I both want a debugger that works on Mac, and want to support LLDB development, so it's all worth it :-}.
,
May 4 2016
Just a heads up, it's reproducing with debug info, so this should be enough info to figure out the problem. I'll keep you posted when there's a fix available.
,
May 4 2016
Awesome; thank you! If you think it's likely I'll need to build lldb from scratch to get the fix, that's information I wouldn't have ahead of time, just so I can work on faulting in dependencies in the background.
,
May 6 2016
[mac triage]
,
Jun 22 2016
Zach: Any word on the status of this? Also still curious on c#22 (likelihood of needing a local build of lldb to get the fix in finite time) just so I know if that's worthwhile working on in the background.
,
Jun 22 2016
I pinged them today and we managed to get a minimal repro. The bug is caused by having a template class which is specialized and inherits from a different specialization of the same class. Whenever a fix happens (hopefully soon; I will keep you posted), it will make it into Xcode eventually, but you will likely need to build your own if you want it right away. If you need help building, I'd suggest asking on the lldb-dev mailing list, as I don't normally use OS X.
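For readers following along, here is a minimal sketch of the C++ shape this comment describes: an explicit specialization of a class template that derives from a different specialization of the same template. The names Holder and SumMembers are hypothetical and chosen only for illustration; this is not the actual minimal repro, nor code from Chromium or LLDB.

```cpp
#include <cassert>
#include <type_traits>

// All names here are hypothetical; this is not Chromium or LLDB code.

// Primary template: the general case.
template <typename T>
struct Holder {
  T value{};
};

// Explicit specialization for int that inherits from Holder<long>,
// which is a different specialization of the same template.
template <>
struct Holder<int> : Holder<long> {
  int extra = 0;
};

static_assert(std::is_base_of<Holder<long>, Holder<int>>::value,
              "Holder<int> derives from Holder<long>");

// Touches both the inherited member and the specialization's own member.
inline long SumMembers() {
  Holder<int> h;
  assert(h.value == 0 && h.extra == 0);  // both members start zeroed
  h.value = 40;  // inherited from Holder<long>
  h.extra = 2;   // declared in Holder<int>
  return h.value + h.extra;
}
```

The code itself is ordinary, valid C++; the point of the sketch is only the shape of the class hierarchy a debugger has to reconstruct from the debug info.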
,
Jun 23 2016
Thanks! That's about what I figured. I'll start up building on OSX as a background task (it's sorta a pity, but it looks noticeably more complicated to build on OSX than on Linux).
,
Jun 23 2016
Supposedly on osx you just open up the Xcode workspace and hit build, but it's been a while since I've tried it
,
Jun 25 2016
This should be fixed in ToT. Apparently it wasn't related to the inheritance issue, but rather affected any template whose parameter was an enum.
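The comment does not say whether "parameter was an enum" means a type parameter instantiated with an enum type or a non-type parameter of enum type, so the sketch below shows both readings. All names (DemoPriority, Box, Tagged, CheckEnumTemplates) are hypothetical; DemoPriority merely stands in for an enum such as net::RequestPriority, and nothing here is taken from the actual fix.

```cpp
#include <cassert>

// Hypothetical stand-in for an enum such as net::RequestPriority.
enum DemoPriority { DEMO_LOWEST, DEMO_MEDIUM, DEMO_HIGHEST };

// Reading 1: a type parameter that is instantiated with an enum type.
template <typename T>
struct Box {
  T value;
};

// Reading 2: a non-type parameter whose type is an enum.
template <DemoPriority P>
struct Tagged {
  static constexpr DemoPriority priority() { return P; }
};

static_assert(Tagged<DEMO_HIGHEST>::priority() == DEMO_HIGHEST,
              "the enum value is carried in the type itself");

// Instantiates both templates so each appears in the debug info.
inline void CheckEnumTemplates() {
  Box<DemoPriority> box{DEMO_MEDIUM};
  assert(box.value == DEMO_MEDIUM);
  assert(Tagged<DEMO_LOWEST>::priority() == DEMO_LOWEST);
}
```

Either form is common in large codebases, which would be consistent with the crash showing up on an unmodified checkout.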
,
Jul 18 2016
This fix should be present in any official version of LLDB with a version number greater than or equal to lldb-360.1.32. Let me know if you're still seeing this after being on that version.
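A minimal sketch of how the "greater than or equal to lldb-360.1.32" check could be done by hand, assuming the version strings compare component by component as integers (an assumption about Apple's numbering, not something stated in this thread). ParseLldbVersion, HasFix, and CheckKnownVersions are hypothetical helpers, not part of LLDB.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits "lldb-360.1.50" into {360, 1, 50}. The "lldb-" prefix is optional.
// std::stoi throws if a field is empty or not numeric.
inline std::vector<int> ParseLldbVersion(const std::string& text) {
  const std::string prefix = "lldb-";
  const std::string digits = text.compare(0, prefix.size(), prefix) == 0
                                 ? text.substr(prefix.size())
                                 : text;
  std::vector<int> parts;
  std::istringstream stream(digits);
  std::string field;
  while (std::getline(stream, field, '.')) {
    parts.push_back(std::stoi(field));
  }
  return parts;
}

// True if `reported` is at least lldb-360.1.32, assuming numeric,
// component-wise ordering. Missing trailing components count as 0.
inline bool HasFix(const std::string& reported) {
  std::vector<int> have = ParseLldbVersion(reported);
  std::vector<int> need = ParseLldbVersion("lldb-360.1.32");
  const std::size_t width = std::max(have.size(), need.size());
  have.resize(width, 0);
  need.resize(width, 0);
  return have >= need;  // std::vector compares lexicographically
}

// Checks the version strings quoted in this thread.
inline void CheckKnownVersions() {
  assert(!HasFix("lldb-350.0.21.7"));  // version from the original report
  assert(HasFix("lldb-360.1.32"));
  assert(HasFix("lldb-360.1.50"));
}
```

Comparing the components as integers rather than as text avoids misordering versions such as 360.1.9 and 360.1.32.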
,
Jul 18 2016
For those following along, lldb-360 is part of Xcode 8 (currently in beta).
,
Oct 14 2016
Just curious, if anyone is trying Xcode 8, can you confirm if this is fixed? If you run "lldb -version" it should report "lldb-360.1.32" or later if it contains the fix.
,
Oct 18 2016
Looks good!
(lldb) p priority_
(net::RequestPriority) $0 = LOWEST
My lldb version appears to be higher, though:
$ lldb -version
lldb-360.1.50
,
Oct 18 2016
I believe it works fine on lldb-360.1.43 (this is from memory--I haven't gotten back into the original problem space to try it again since you asked, but I did since the fix was integrated).
,
Oct 18 2016
Comment 1 by rdsmith@chromium.org, Apr 26 2016