V8 tests are failing when compiled with -fsanitize=safe-stack |
||||||||
Issue descriptionsafe-stack was enabled in clang on Fuchsia by default. Following tests break in that configuration: V8ValueConverterImplTest.StripNullFromObjects V8ValueConverterImplTest.WeirdProperties ProxyServiceMojoTest.Error UserMediaClientImplTest.MediaVideoSourceFailToStart V8ValueConverterImplTest.KeysWithDots URLRequestContextBuilderMojoTest.MojoProxyResolver V8ValueConverterImplTest.ArrayPrototypeSetter ProxyServiceMojoTest.DnsResolution V8ValueConverterImplTest.WeirdTypes Need to investigate and fix these failures, then enable safe-stack on Fuchsia.
,
Mar 16 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/6705a9b1d17e8f56c9a8229c971a7286ce1fbe42 commit 6705a9b1d17e8f56c9a8229c971a7286ce1fbe42 Author: Hans Wennborg <hans@chromium.org> Date: Fri Mar 16 17:23:38 2018 Roll Clang 325667:327688 Clang r326867 turned on SafeStack when targeting Fuchsia by default. Turn it off in our build config until all the tests pass. Bug: 817298 , 821951 Change-Id: Id2d9f911eb62fe6823365baeba55a4ccc0b484ce Reviewed-on: https://chromium-review.googlesource.com/959003 Reviewed-by: Nico Weber <thakis@chromium.org> Commit-Queue: Hans Wennborg <hans@chromium.org> Cr-Commit-Position: refs/heads/master@{#543745} [modify] https://crrev.com/6705a9b1d17e8f56c9a8229c971a7286ce1fbe42/build/config/fuchsia/BUILD.gn [modify] https://crrev.com/6705a9b1d17e8f56c9a8229c971a7286ce1fbe42/tools/clang/scripts/update.py
,
Mar 27 2018
Yang, is this on your radar or on whom's radar should it be?
,
Mar 27 2018
If there's someone with a thorough understanding of v8's GC root collection who can talk to me for a bit then I can give advice on how to handle safestack (and can add APIs to Fuchsia to help if that turns out to be necessary).
,
Mar 27 2018
Can you give some background on what exactly safe-stack does? V8 implements it's own stack Walker, for GC, debugging, profiling, and more. I could imagine that changing the C++ stack layout could affect some things.
,
Mar 27 2018
Safe stack essentially is a second stack that's used for code pointers, see https://clang.llvm.org/docs/SafeStack.html +kcc
,
Mar 28 2018
It seems test cases where V8 heavily relies on stack walking, e.g. deoptimization, stack traces, profiling, etc. are not failing. In fact, most if not all seem to be unit tests for V8ValueConverter[0], which is implemented outside of V8. I'm not convinced this is a V8 issue. I'm adding some people who have touched V8ValueConverterImpl recently. [0] https://cs.chromium.org/chromium/src/content/renderer/v8_value_converter_impl.h
,
Mar 28 2018
The bugs caused by failing to handle the secondary stack block (called the "unsafe stack") in GC logic are likely to manifest in subtle and nonobvious ways. Please find someone who groks all of v8's use of stack inspection who can talk with me about the details.
,
Mar 29 2018
I'm just trying to point out that even though we have thousands of LayoutTests that involve V8, none of them are among the failed tests. The ones that do fail are a small set of inter-related unit tests. That suggests to me that this issue is not really a V8 one. I assume that safe-stack only affects the C++ stack. V8's GC only modifies stack frames introduced by V8 itself, and should not be affected by build time instrumentation by Clang. Do you have more information on how to reproduce these failures or additional information on each failure beyond a list of failed tests?
,
Mar 30 2018
Running these tests locally, it seems that the crash is non-deterministic. e.g. running the StripNullFromObjects test the first time, it crashed, but on subsequent runs it'll crash, with the following assertion failure in v8::Context::Enter(): [00005.045] 01688.01779> Exception thrown during bootstrapping [00005.045] 01688.01779> Error installing extension 'v8/gc'. [00005.045] 01688.01779> [00005.045] 01688.01779> [00005.045] 01688.01779> # [00005.045] 01688.01779> # Fatal error in ../../v8/src/api.h, line 362 [00005.045] 01688.01779> # Debug check failed: allow_empty_handle || that != nullptr. It seems that this v8::Context::OpenHandle() is sometimes being passed nullptr during V8::Context::Enter(), implying that |this| is null. That's happening inside the v8::Context setup in the test fixture's SetUp(). I notice that somewhere inside that call some work is being posted to the V8 worker thread, so might the non-determinism be threading-related?
,
Mar 30 2018
If I interpret the log correctly what happened here is that during bootstrapping (creating a new V8 context), setting up the extension called V8/gc for some reason failed. This causes the whole context creation to fail and therefore we end up with an empty V8::Local<V8::Context>. When trying to use this context later on, we inevitably run into a crash. It's interesting that the GC extension is installed. That would happen if --expose-gc is passed to V8. is that done by the test runner somehow? Installing an extension would only fail without crashing if we ran into an exception when running the JS definition of the extension. Weird somehow. Could you try again with --js-flags=--single_threaded to see whether it's a threading issue?
,
Apr 26 2018
Since I'm sitting round the corner from yangguo@ right now, I dug spent some time investigation this some more and found: - StackOverflow() is being detected by ParserBase::Next() (which is inlined, so you can't see it in these traces previously reported). - The failing check is |GetCurrentStackPosition() < stack_limit_|, which is comparing the data stack position against |stack_limit_|. - |stack_limit_| is set during bootstrap, based on GetCurrentStackPosition() with a |stack_size| applied to it. - Printing the current vs limit values we see: - Good run: current - limit = 1006736 - Bad run: current - limit = 1006728 - Notice that in neither case is current < limit. O_o - In the bad runs, we're hitting the exception the very first time we hit Next(). Also note that: - It is easy to repro a failure with --gtest-repeat=10; it will often then fail on the first run, while running without --gtest_repeat will typically not fail. - We are able to repro failures intermittently even with the V8 --predictable option set (disables parallel compilation & GC, and a few other things).
,
Apr 26 2018
Aaaand I just had a bad run with current - limit = 1006736, so that's a red herring.
,
Apr 26 2018
Based on the GetCurrentStackPosition() implementation, I'd expect it to return a pointer into the unsafe stack. That could be a problem if something were comparing those pointers to pointers that are actually to the safe stack (e.g. the actual machine SP). But I don't see off hand how that could be an issue if all the pointers in question came from GetCurrentStackPosition() in modules that are compiled with safe-stack enabled. It's probably worthwhile to check the disassembly of the initializing sample and the asserting sample to verify they are both using the same stack. In non-safe-stack code, they should refer to the machine SP (%rsp on x886, sp on arm) whereas in safe-stack code they should refer to the unsafe SP (%fs:0x18 on x86, extracted via `msr TPIDR_EL0, xN` and some arithmetic on arm).
,
Apr 26 2018
Re #16: Yes, I wondered if we might be mixing the pointers, but I was fprintf()ing the GetCurrentStackPosition() from within the same function as does the comparison - so even if we were mixing safe-stack and normal stack versions of that function, the two calls at this one location should be consistent. Checking the assembly for the functions does seem a logical next step, though I've never had to do that for a Fuchsia binary before. :P
,
Apr 28 2018
I want to point out again that I find it very suspicious that only these unit tests fail. So unless these are the only tests that run with safe-stack, it seems really weird that V8 would not fail anywhere else. I assume that the V8 binary used for these unit tests are the same one used in e.g. layout tests? Is the test harness somehow interfering with the stack check in the parser?
,
Aug 3
,
Aug 21
Hello, I just ran across this bug and this sounds similar to a failure I was hitting when trying to enable SafeStack on Linux that I fixed with https://chromium-review.googlesource.com/c/v8/v8/+/1162669 It might be worth checking if SafeStack still needs to be disabled for Fuschia. |
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by machenb...@chromium.org
, Mar 15 2018Components: -Infra>Client>V8 Blink>JavaScript