Issue metadata
Sign in to add a comment
|
Heap-use-after-free in v8::internal::Isolate::UnregisterFromReleaseAtTeardown |
||||||||||||||||||||||
Issue descriptionDetailed report: https://clusterfuzz.com/testcase?key=6413699569680384 Fuzzer: ochang_js_fuzzer_win Job Type: windows_asan_d8 Platform Id: windows Crash Type: Heap-use-after-free READ 8 Crash Address: 0x11b9df903418 Crash State: v8::internal::Isolate::UnregisterFromReleaseAtTeardown v8::internal::Managed<struct v8::internal::wasm::WasmModule>::GCDelete v8::internal::GlobalHandles::DispatchPendingPhantomCallbacks Sanitizer: address (ASAN) Recommended Security Severity: High Regressed: https://clusterfuzz.com/revisions?job=windows_asan_d8&range=52309:52310 Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=6413699569680384 Issue filed automatically. See https://github.com/google/clusterfuzz-tools for more information.
,
Apr 2 2018
eholk@, your CL seems to be the only culprit: https://chromium.googlesource.com/v8/v8/+log/e1e870a38c10fc61fb28e44735ee577e25eba9e7..0cd7468b86d0958fdc6ae1d7f69da1569e05b5f8?pretty=fuller&n=10000
,
Apr 3 2018
* Ulan - can you let me know from the GC standpoint whether my diagnosis and any of the suggested fixes makes sense? Is there a better way to do this? Thanks!
I was able to figure out what's going on. Basically, we end up trying to initiate a GC while a GC is already ongoing. I think the potential GC cycle already existed, but my CL made it much easier to hit.
Here's a stack trace:
* frame #0: 0x00007ffff7fe1f38 libv8_libbase.so`v8::base::OS::Abort() at platform-posix.cc:381
frame #1: 0x00007ffff7fd30a2 libv8_libbase.so`::V8_Fatal() at logging.cc:170
frame #2: 0x00007ffff7fd27cf libv8_libbase.so`v8::base::(anonymous namespace)::DefaultDcheckHandler(char const*, int, char const*) at logging.cc:56
frame #3: 0x00007ffff5f01e73 libv8.so`::DispatchPendingPhantomCallbacks() [inlined] Invoke at global-handles.cc:871
frame #4: 0x00007ffff5f01e4f libv8.so`::DispatchPendingPhantomCallbacks() at global-handles.cc:843
frame #5: 0x00007ffff5f02538 libv8.so`::PostGarbageCollectionProcessing() at global-handles.cc:903
frame #6: 0x00007ffff5f6e3f8 libv8.so`::PerformGarbageCollection() at heap.cc:1710
frame #7: 0x00007ffff5f67c07 libv8.so`::CollectGarbage() at heap.cc:1329
frame #8: 0x00007ffff5fb5b81 libv8.so`::CollectGarbageOnMemoryPressure() [inlined] CollectAllGarbage at heap.cc:1099
frame #9: 0x00007ffff5fb5b3f libv8.so`::CollectGarbageOnMemoryPressure() at heap.cc:4671
frame #10: 0x00007ffff5f63556 libv8.so`::CheckMemoryPressure() at heap.cc:4649
frame #11: 0x00007ffff723f961 libv8.so`::FreeNativeModuleMemories() [inlined] AdjustAmountOfExternalAllocatedMemory at v8.h:10295
frame #12: 0x00007ffff723f8fb libv8.so`::FreeNativeModuleMemories() at wasm-code-manager.cc:1084
frame #13: 0x00007ffff723e936 libv8.so`::~NativeModule() at wasm-code-manager.cc:870
frame #14: 0x00007ffff7339ad6 libv8.so`::GCDelete() [inlined] NativeDelete at managed.h:83
frame #15: 0x00007ffff7339aaa libv8.so`::GCDelete() at managed.h:78
frame #16: 0x00007ffff5f01c50 libv8.so`::DispatchPendingPhantomCallbacks() [inlined] Invoke at global-handles.cc:878
frame #17: 0x00007ffff5f01b2e libv8.so`::DispatchPendingPhantomCallbacks() at global-handles.cc:843
frame #18: 0x00007ffff5f02538 libv8.so`::PostGarbageCollectionProcessing() at global-handles.cc:903
frame #19: 0x00007ffff5f6e3f8 libv8.so`::PerformGarbageCollection() at heap.cc:1710
frame #20: 0x00007ffff5f67c07 libv8.so`::CollectGarbage() at heap.cc:1329
frame #21: 0x00007ffff5fb5b81 libv8.so`::CollectGarbageOnMemoryPressure() [inlined] CollectAllGarbage at heap.cc:1099
frame #22: 0x00007ffff5fb5b3f libv8.so`::CollectGarbageOnMemoryPressure() at heap.cc:4671
frame #23: 0x00007ffff5f63556 libv8.so`::CheckMemoryPressure() at heap.cc:4649
frame #24: 0x00007ffff72da07b libv8.so`::TryAllocateBackingStore() at wasm-memory.cc:36
frame #25: 0x00007ffff72db25e libv8.so`::NewArrayBuffer() at wasm-memory.cc:239
frame #26: 0x00007ffff71a1ba5 libv8.so`::Build() [inlined] AllocateMemory at module-compiler.cc:2388
frame #27: 0x00007ffff71a1b73 libv8.so`::Build() at module-compiler.cc:1748
frame #28: 0x00007ffff719a5a3 libv8.so`::InstantiateToInstanceObject() at module-compiler.cc:403
frame #29: 0x00007ffff72d664c libv8.so`::WebAssemblyInstantiateImpl() at wasm-js.cc:340
frame #30: 0x00007ffff72cb344 libv8.so`::WebAssemblyInstance() at wasm-js.cc:428
frame #31: 0x00007ffff4a9c7f3 libv8.so`::Call() at api-arguments.cc:29
frame #32: 0x00007ffff4e8ef9e libv8.so`::HandleApiCallHelper<true>() at builtins-api.cc:107
frame #33: 0x00007ffff4e89b27 libv8.so`::Builtin_Impl_HandleApiCall() at builtins-api.cc:133
frame #34: 0x00007ffff4e888d0 libv8.so`::Builtin_HandleApiCall() at builtins-api.cc:125
What happens is that during a collection, we clean up a global handle wrapping a NativeModule object. This calls FreeNativeModuleMemories, which calls AdjustAmountOfExternalAllocatedMemory. That in turn calls CheckMemoryPressure, which checks if we are running a collection and decides we aren't, and then proceeds with another collection which tries to collect the NativeModule that was in the middle of being collected.
One easy workaround is to stop calling AdjustAmountOfExternalAllocatedMemory in WasmCodeManager. I have a strawman CL at https://crrev.com/c/991193/1. This prevents the issue here, but I don't think it's a good idea because it means the GC will then lose visibility into a lot of allocated memory.
I noticed if I run with --stress-incremental-marking, the bug does not repro. This is because the collection happens asynchronously, so we don't end up with the collection loop.
A better fix might be to have CheckMemoryPressure set a flag upon entry and return early if that flag is already set. The second patchset on my CL (https://crrev.com/c/991193/2) shows a quick stab at this fix. It also makes the problem go away, and it let's the GC still be aware of what WasmCodeManager is allocating, so it seems like a much better fix.
,
Apr 3 2018
Eric, there are many paths that can trigger nested GCs (not only CheckMemoryPressure). I think it is better to work under the assumption that nested GCs can happen and make clearing of the handles robust. I propose this fix: https://chromium-review.googlesource.com/c/v8/v8/+/992093
,
Apr 3 2018
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/68be89090bf98e5e6d1f7fd33533ba632aff10e2 commit 68be89090bf98e5e6d1f7fd33533ba632aff10e2 Author: Ulan Degenbaev <ulan@chromium.org> Date: Tue Apr 03 11:49:31 2018 [wasm] Fix phantom handle clearing in destructors. The destructor of the owner of a phantom handle must clear the phantom handle first before calling any function that can trigger GC. Bug: chromium:827806 Change-Id: I20141d0d710c486aec3d92e729d76a53069e16fd Reviewed-on: https://chromium-review.googlesource.com/992093 Reviewed-by: Michael Starzinger <mstarzinger@chromium.org> Commit-Queue: Ulan Degenbaev <ulan@chromium.org> Cr-Commit-Position: refs/heads/master@{#52325} [modify] https://crrev.com/68be89090bf98e5e6d1f7fd33533ba632aff10e2/src/wasm/module-compiler.cc [modify] https://crrev.com/68be89090bf98e5e6d1f7fd33533ba632aff10e2/src/wasm/wasm-code-manager.cc
,
Apr 3 2018
This is a serious security regression. If you are not able to fix this quickly, please revert the change that introduced it. If this doesn't affect a release branch, or has not been properly classified for severity, please update the Security_Impact or Security_Severity labels, and remove the ReleaseBlock label. To disable this altogether, apply ReleaseBlock-NA. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Apr 3 2018
Thanks for the fix, Ulan!
,
Apr 3 2018
,
Apr 3 2018
Oh, hmm. I added a regression test for this issue, and it looks like it is stil failing: https://logs.chromium.org/v/?s=v8%2Fbuildbucket%2Fcr-buildbucket.appspot.com%2F8950243452788230576%2F%2B%2Fsteps%2FCheck_-_extra%2F0%2Flogs%2Fregress-827806%2F0 Regression test at https://crrev.com/c/993237 I've since reverted the change that exposed the issue, https://crrev.com/c/985142 because of this failure and other ones. I'll reland once we can get everything straightened out. See https://crbug.com/v8/7619
,
Apr 3 2018
Thanks a lot for the regression test, Eric! I don't see the bug by looking at the code on laptop. I'll debug this on desktop first thing tomorrow morning.
,
Apr 3 2018
Awesome, thanks! If you add my regression test to tip-of-tree, it probably won't repro anymore. You'll need to reapply https://crrev.com/c/985142 to expose the bug.
,
Apr 4 2018
ClusterFuzz has detected this issue as fixed in range 52333:52334. Detailed report: https://clusterfuzz.com/testcase?key=6413699569680384 Fuzzer: ochang_js_fuzzer_win Job Type: windows_asan_d8 Platform Id: windows Crash Type: Heap-use-after-free READ 8 Crash Address: 0x11b9df903418 Crash State: v8::internal::Isolate::UnregisterFromReleaseAtTeardown v8::internal::Managed<struct v8::internal::wasm::WasmModule>::GCDelete v8::internal::GlobalHandles::DispatchPendingPhantomCallbacks Sanitizer: address (ASAN) Recommended Security Severity: High Regressed: https://clusterfuzz.com/revisions?job=windows_asan_d8&range=52309:52310 Fixed: https://clusterfuzz.com/revisions?job=windows_asan_d8&range=52333:52334 Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=6413699569680384 See https://github.com/google/clusterfuzz-tools for more information. If you suspect that the result above is incorrect, try re-doing that job on the test case report page.
,
Apr 4 2018
ClusterFuzz testcase 6413699569680384 is verified as fixed, so closing issue as verified. If this is incorrect, please add ClusterFuzz-Wrong label and re-open the issue.
,
Apr 4 2018
,
Apr 4 2018
Fix is in flight: https://chromium-review.googlesource.com/c/v8/v8/+/995796 Looks like we would need to merge that back.
,
Apr 4 2018
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/96e83b78d42b08cd9761447a7c79867aeab1e83a commit 96e83b78d42b08cd9761447a7c79867aeab1e83a Author: Ulan Degenbaev <ulan@chromium.org> Date: Wed Apr 04 18:11:21 2018 [wasm] Use two-pass phantom callbacks for managed objects. The phantom handle API requires that the first pass callback does not invoke any V8 API. The current code breaks this requirement by invoking AdjustAmountOfExternalAllocatedMemory, which can cause GC. This patch splits the existing callback into two parts. The first part only resets the handle and the second part performs native delete. Bug: chromium:827806 Change-Id: I01eed09f94f5499cb9d13397066f4f908a0aa668 Reviewed-on: https://chromium-review.googlesource.com/995796 Commit-Queue: Ulan Degenbaev <ulan@chromium.org> Reviewed-by: Clemens Hammacher <clemensh@chromium.org> Cr-Commit-Position: refs/heads/master@{#52366} [modify] https://crrev.com/96e83b78d42b08cd9761447a7c79867aeab1e83a/src/managed.h
,
Apr 5 2018
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/ccde64615c3b5d8a942d61ac9db029e81c739944 commit ccde64615c3b5d8a942d61ac9db029e81c739944 Author: Eric Holk <eholk@chromium.org> Date: Thu Apr 05 18:49:23 2018 [wasm] Add regression test for chromium:827806 The bug was fixed in https://crrev.com/c/995796, but this CL adds a regression test to make sure it stays fixed. Bug: chromium:827806 Change-Id: I9f4aed364bbd310af4253da457887a8b8015533a Reviewed-on: https://chromium-review.googlesource.com/993237 Reviewed-by: Clemens Hammacher <clemensh@chromium.org> Commit-Queue: Eric Holk <eholk@chromium.org> Cr-Commit-Position: refs/heads/master@{#52409} [add] https://crrev.com/ccde64615c3b5d8a942d61ac9db029e81c739944/test/mjsunit/regress/wasm/regress-827806.js
,
Apr 30 2018
,
May 7 2018
Issue 840328 has been merged into this issue.
,
May 7 2018
The crash also happens in M66 (Issue 840328). Should we merge the fix back to M66?
,
May 7 2018
,
May 7 2018
+ awhalley@ for review
,
May 8 2018
abdulsyed@ - good for 66
,
May 8 2018
Approving this for M66. Branch:3359
,
May 8 2018
Before merging, can you please confirm this will be a safe merge overall? Seems like it's well tested, but won't introduce any new regressions on M66 branch?
,
May 8 2018
Ulan, you're better qualified than me to tell whether this is safe to merge. Could you share your assessment?
,
May 9 2018
Confirming that it should be safe as it has ~1 month of coverage.
,
May 9 2018
Thanks ulan@, would you be able to do the merge to 66?
,
May 9 2018
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/d2db87c786b2dc08b99365d8de4a9b0d97fb61ab commit d2db87c786b2dc08b99365d8de4a9b0d97fb61ab Author: Ulan Degenbaev <ulan@chromium.org> Date: Wed May 09 17:26:10 2018 Merged: [wasm] Use two-pass phantom callbacks for managed objects. Revision: 96e83b78d42b08cd9761447a7c79867aeab1e83a BUG= chromium:827806 LOG=N NOTRY=true NOPRESUBMIT=true NOTREECHECKS=true R=clemensh@chromium.org Change-Id: Ib349eac56466634b818a7896cb1192be8414ee1e Reviewed-on: https://chromium-review.googlesource.com/1052509 Reviewed-by: Clemens Hammacher <clemensh@chromium.org> Cr-Commit-Position: refs/branch-heads/6.6@{#63} Cr-Branched-From: d500271571b92cb18dcd7b15885b51e8f437d640-refs/heads/6.6.346@{#1} Cr-Branched-From: 265ef0b635f8761df7c89eb4e8ec9c1a6ebee184-refs/heads/master@{#51624} [modify] https://crrev.com/d2db87c786b2dc08b99365d8de4a9b0d97fb61ab/src/managed.h
,
May 9 2018
,
May 9 2018
,
May 10 2018
This bug is a regression and does not impact stable. Removing incorrectly added Release- labels. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
May 10 2018
Thanks Sheriffbot - though in this case it was the Impact label that was wrong
,
Jul 11
This bug has been closed for more than 14 weeks. Removing security view restrictions. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by sheriffbot@chromium.org
, Apr 1 2018