[wasm] WebAssembly.compile broken by histogram scopes |
|||||||
Issue descriptionIn chrome repo, then: 1) apply attached patch 2) build blink_tests: ninja -C out/Debug blink_tests 3) start httpd: third_party/WebKit/Tools/Scripts/run-blink-httpd 4) run any of the wasm tests. For example: out/Debug/content_shell --run-layout-test http://localhost:8000/wasm/wasm_serialization_tests.html #0 0x7f20543a07bb base::debug::StackTrace::StackTrace() #1 0x7f205439f4bc base::debug::StackTrace::StackTrace() #2 0x7f20544088bf logging::LogMessage::~LogMessage() #3 0x7f2054592db0 base::ThreadRestrictions::AssertSingletonAllowed() #4 0x7f205745d3d9 base::LazyInstance<>::Pointer() #5 0x7f205744fa80 content::RenderThreadImpl::current() #6 0x7f205744e425 content::(anonymous namespace)::CreateHistogram() #7 0x7f204dcf484e v8::internal::StatsTable::CreateHistogram() #8 0x7f204dce9bea v8::internal::Histogram::CreateHistogram() #9 0x7f204dcf544e v8::internal::Histogram::GetHistogram() #10 0x7f204dcf4785 v8::internal::Histogram::Enabled() #11 0x7f204dce9b6c v8::internal::Histogram::AddSample() #12 0x7f204e36a2a6 v8::internal::wasm::DecodeWasmModule() #13 0x7f204e3ef3fe AsyncCompileJob::DecodeModule() #14 0x7f204e3ef63a AsyncCompileJob::Task::RunInternal() #15 0x7f204d944d59 v8::internal::CancelableTask::Run() #16 0x7f204ec53485 _ZN4base8internal13FunctorTraitsIMNS_5TimerEFvvEvE6InvokeIPS2_JEEEvS4_OT_DpOT0_ #17 0x7f204ec533a1 _ZN4base8internal12InvokeHelperILb0EvE8MakeItSoIRKMNS_5TimerEFvvEJPS4_EEEvOT_DpOT0_ #18 0x7f204ec6bf97 _ZN4base8internal7InvokerINS0_9BindStateIMN2v84TaskEFvvEJNS0_12OwnedWrapperIS4_EEEEEFvvEE7RunImplIRKS6_RKSt5tupleIJS8_EEJLm0EEEEvOT_OT0_NS_13IndexSequenceIJXspT1_EEEE #19 0x7f204ec6bedc _ZN4base8internal7InvokerINS0_9BindStateIMN2v84TaskEFvvEJNS0_12OwnedWrapperIS4_EEEEEFvvEE3RunEPNS0_13BindStateBaseE #20 0x7f20543a6cce _ZNO4base8CallbackIFvvELNS_8internal8CopyModeE0ELNS2_10RepeatModeE0EE3RunEv #21 0x7f205459723a base::(anonymous namespace)::WorkerThread::ThreadMain() #22 0x7f205456ff8a base::(anonymous namespace)::ThreadFunc() #23 0x7f2059a67184 start_thread #24 0x7f20450a5bed clone
,
Apr 7 2017
+ahaas in case somehow this is related to async-compile
,
Apr 8 2017
I'm guessing the problem is that there is a test that is passing "nullptr" as an "Isolate" or a "Module". If so, the test is calling the code with a malformed test case. This already happened in one unit test when I was running tests. BTW, I won't be able to look at this till Tuesday since I am on vacation.
,
Apr 8 2017
I take this back. I don't know why we are getting this. However, one possibility is CL https://codereview.chromium.org/2795763002. That CL added collecting histogram information about array buffer allocations. It changes v8/src/objects.cc. My understanding is that an "Isolate" is thread safe. If it isn't then the change to method JSArrayBuffer::SetupAllocatingData is probably the problem.
,
Apr 10 2017
The Isolate is not thread safe. Only the main thread is allowed to modify the Isolate.
,
Apr 10 2017
See https://codereview.chromium.org/2812543002/. It appears to be due to attempting to lazy initialize histograms on a worker thread, which is not permitted.
,
Apr 10 2017
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/022e7ddf232b0696eb595769fa702463c417db5e commit 022e7ddf232b0696eb595769fa702463c417db5e Author: mtrofin <mtrofin@chromium.org> Date: Mon Apr 10 13:11:02 2017 Ensure counters are initialized, to avoid init on non-joinable threads. This occurs in the wasm scenario described in the referenced bug. DecodeWasmModule collects statistics. Blink inserts a CreateHistogramCallback that can't instantiate a histogram on non-joinable threads. Turns out, DecodeWasmModule is scheduled on such a thread, now that we have async compilation. This fix pre-initializes histograms when the callback is applied, which is assumed to be in a context that can carry out the instantiation. In Blink, this happens on the main thread. BUG= chromium:709684 Review-Url: https://codereview.chromium.org/2812543002 Cr-Commit-Position: refs/heads/master@{#44522} [modify] https://crrev.com/022e7ddf232b0696eb595769fa702463c417db5e/src/api.cc [modify] https://crrev.com/022e7ddf232b0696eb595769fa702463c417db5e/src/counters.cc [modify] https://crrev.com/022e7ddf232b0696eb595769fa702463c417db5e/src/counters.h
,
Apr 10 2017
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/022e7ddf232b0696eb595769fa702463c417db5e commit 022e7ddf232b0696eb595769fa702463c417db5e Author: mtrofin <mtrofin@chromium.org> Date: Mon Apr 10 13:11:02 2017 Ensure counters are initialized, to avoid init on non-joinable threads. This occurs in the wasm scenario described in the referenced bug. DecodeWasmModule collects statistics. Blink inserts a CreateHistogramCallback that can't instantiate a histogram on non-joinable threads. Turns out, DecodeWasmModule is scheduled on such a thread, now that we have async compilation. This fix pre-initializes histograms when the callback is applied, which is assumed to be in a context that can carry out the instantiation. In Blink, this happens on the main thread. BUG= chromium:709684 Review-Url: https://codereview.chromium.org/2812543002 Cr-Commit-Position: refs/heads/master@{#44522} [modify] https://crrev.com/022e7ddf232b0696eb595769fa702463c417db5e/src/api.cc [modify] https://crrev.com/022e7ddf232b0696eb595769fa702463c417db5e/src/counters.cc [modify] https://crrev.com/022e7ddf232b0696eb595769fa702463c417db5e/src/counters.h
,
Apr 10 2017
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/022e7ddf232b0696eb595769fa702463c417db5e commit 022e7ddf232b0696eb595769fa702463c417db5e Author: mtrofin <mtrofin@chromium.org> Date: Mon Apr 10 13:11:02 2017 Ensure counters are initialized, to avoid init on non-joinable threads. This occurs in the wasm scenario described in the referenced bug. DecodeWasmModule collects statistics. Blink inserts a CreateHistogramCallback that can't instantiate a histogram on non-joinable threads. Turns out, DecodeWasmModule is scheduled on such a thread, now that we have async compilation. This fix pre-initializes histograms when the callback is applied, which is assumed to be in a context that can carry out the instantiation. In Blink, this happens on the main thread. BUG= chromium:709684 Review-Url: https://codereview.chromium.org/2812543002 Cr-Commit-Position: refs/heads/master@{#44522} [modify] https://crrev.com/022e7ddf232b0696eb595769fa702463c417db5e/src/api.cc [modify] https://crrev.com/022e7ddf232b0696eb595769fa702463c417db5e/src/counters.cc [modify] https://crrev.com/022e7ddf232b0696eb595769fa702463c417db5e/src/counters.h
,
Apr 10 2017
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/038bafcb8c0435bc4130228237124331295d4713 commit 038bafcb8c0435bc4130228237124331295d4713 Author: jgruber <jgruber@chromium.org> Date: Mon Apr 10 14:56:44 2017 Revert of Ensure counters are initialized, to avoid init on non-joinable threads. (patchset #1 id:1 of https://codereview.chromium.org/2812543002/ ) Reason for revert: https://build.chromium.org/p/client.v8/builders/V8%20Linux64%20-%20gyp/builds/5221 Original issue's description: > Ensure counters are initialized, to avoid init on non-joinable threads. > > This occurs in the wasm scenario described in the referenced bug. > DecodeWasmModule collects statistics. Blink inserts a CreateHistogramCallback that > can't instantiate a histogram on non-joinable threads. Turns out, DecodeWasmModule > is scheduled on such a thread, now that we have async compilation. > > This fix pre-initializes histograms when the callback is applied, which is assumed to > be in a context that can carry out the instantiation. In Blink, this happens on the main > thread. > > BUG= chromium:709684 > > Review-Url: https://codereview.chromium.org/2812543002 > Cr-Commit-Position: refs/heads/master@{#44522} > Committed: https://chromium.googlesource.com/v8/v8/+/022e7ddf232b0696eb595769fa702463c417db5e TBR=jochen@chromium.org,mtrofin@chromium.org # Skipping CQ checks because original CL landed less than 1 days ago. NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG= chromium:709684 Review-Url: https://codereview.chromium.org/2812653002 Cr-Commit-Position: refs/heads/master@{#44527} [modify] https://crrev.com/038bafcb8c0435bc4130228237124331295d4713/src/api.cc [modify] https://crrev.com/038bafcb8c0435bc4130228237124331295d4713/src/counters.cc [modify] https://crrev.com/038bafcb8c0435bc4130228237124331295d4713/src/counters.h
,
Apr 10 2017
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/1ee2a998fa4c8842690467c96286a97b506f1424 commit 1ee2a998fa4c8842690467c96286a97b506f1424 Author: mtrofin <mtrofin@chromium.org> Date: Mon Apr 10 15:15:40 2017 Reland of Ensure counters are initialized, to avoid init on non-joinable threads. (patchset #1 id:1 of https://codereview.chromium.org/2812653002/ ) Reason for revert: Appears to be a flake. Both jgruber and I tried to repro locally and failed. Also change has little change of having had caused those failures. Original issue's description: > Revert of Ensure counters are initialized, to avoid init on non-joinable threads. (patchset #1 id:1 of https://codereview.chromium.org/2812543002/ ) > > Reason for revert: > https://build.chromium.org/p/client.v8/builders/V8%20Linux64%20-%20gyp/builds/5221 > > Original issue's description: > > Ensure counters are initialized, to avoid init on non-joinable threads. > > > > This occurs in the wasm scenario described in the referenced bug. > > DecodeWasmModule collects statistics. Blink inserts a CreateHistogramCallback that > > can't instantiate a histogram on non-joinable threads. Turns out, DecodeWasmModule > > is scheduled on such a thread, now that we have async compilation. > > > > This fix pre-initializes histograms when the callback is applied, which is assumed to > > be in a context that can carry out the instantiation. In Blink, this happens on the main > > thread. > > > > BUG= chromium:709684 > > > > Review-Url: https://codereview.chromium.org/2812543002 > > Cr-Commit-Position: refs/heads/master@{#44522} > > Committed: https://chromium.googlesource.com/v8/v8/+/022e7ddf232b0696eb595769fa702463c417db5e > > TBR=jochen@chromium.org,mtrofin@chromium.org > # Skipping CQ checks because original CL landed less than 1 days ago. > NOPRESUBMIT=true > NOTREECHECKS=true > NOTRY=true > BUG= chromium:709684 > > Review-Url: https://codereview.chromium.org/2812653002 > Cr-Commit-Position: refs/heads/master@{#44527} > Committed: https://chromium.googlesource.com/v8/v8/+/038bafcb8c0435bc4130228237124331295d4713 TBR=jochen@chromium.org,jgruber@chromium.org # Skipping CQ checks because original CL landed less than 1 days ago. NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG= chromium:709684 Review-Url: https://codereview.chromium.org/2813673002 Cr-Commit-Position: refs/heads/master@{#44529} [modify] https://crrev.com/1ee2a998fa4c8842690467c96286a97b506f1424/src/api.cc [modify] https://crrev.com/1ee2a998fa4c8842690467c96286a97b506f1424/src/counters.cc [modify] https://crrev.com/1ee2a998fa4c8842690467c96286a97b506f1424/src/counters.h
,
Apr 11 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a9ec6cb4b268d78bd6efc1327758e40067e4a304 commit a9ec6cb4b268d78bd6efc1327758e40067e4a304 Author: mtrofin <mtrofin@chromium.org> Date: Tue Apr 11 12:53:10 2017 [wasm] Compile closer to how customers will. Ensure we don't regress from https://codereview.chromium.org/2812543002/ Although the fix is on the v8 side, the issue requires Blink-specifics. Regression testing on V8's side, in this case, would be a bit artificial. BUG= chromium:709684 Review-Url: https://codereview.chromium.org/2808743005 Cr-Commit-Position: refs/heads/master@{#463606} [modify] https://crrev.com/a9ec6cb4b268d78bd6efc1327758e40067e4a304/third_party/WebKit/LayoutTests/fast/wasm/wasm-limits-test.html [modify] https://crrev.com/a9ec6cb4b268d78bd6efc1327758e40067e4a304/third_party/WebKit/LayoutTests/fast/wasm/wasm-limits-tests.js [modify] https://crrev.com/a9ec6cb4b268d78bd6efc1327758e40067e4a304/third_party/WebKit/LayoutTests/http/tests/wasm/resources/load_wasm.js
,
Apr 24 2017
,
Apr 25 2017
Could this change cause the flakes in https://build.chromium.org/p/client.v8/builders/V8%20Linux64%20TSAN/builds/15032/steps/Check%20%28flakes%29/logs/regress-709684 ?
,
Apr 25 2017
I doubt https://codereview.chromium.org/2812543002 causes the flakiness, because that change involves code executing on the main thread; however, the original enabling of histograms may be of issue. I think this deserves a separate bug to track. Probably Karl would be the first investigator.
,
Apr 25 2017
Looking at the race conditions raised in comment 14, it does not look like https://codereview.chromium.org/2812543002 is the cause of the races. Rather, it looks like a different CL I submitted. However, I don't know why this is now causing races. In v8/src/wasm/module-decoder.cc:1143, I replaced: HistogramTimerScope wasm_decode_module_time_scope( isolate->counters()->wasm_decode_module_time()); with: HistogramTimerScope wasm_decode_module_time_scope( IsWasm(origin) ? isolate->counters()->wasm_decode_wasm_module_time() : isolate->counters()->wasm_decode_asm_module_time()); If the modified code is causing races, the old code should have also.
,
Apr 25 2017
Note: There were other (similar) changes that are now causing races, that I didn't document in comment 16. However, each of those changes should not have effected race conditions.
,
Apr 25 2017
From the TSan report it looks like several (at least two) modules are decoded in background threads at the same time, or at least without synchronization. This can easily happen, and if you look at the code in test/mjsunit/regress/wasm/regress-709684.js, it is very likely triggering exactly that. Now the problem is that updates (and reads) of the histograms are not synchronized. We could either fix this by 1) adding explicit synchronization (e.g. "SynchronizedHistogramTimerScope"), or 2) making histogram updates lock-free (CAS), or 3) just not updating the histogram from background tasks, and instead buffering the updates in thread-local state, and doing the update later in a foreground task. Number 3 is probably the easiest and cleanest solution for now (and it's ahaas' favorite ;) ).
,
May 1 2017
,
May 4 2017
Created https://bugs.chromium.org/p/chromium/issues/detail?id=709684 to track problem in the V8 issue tracker.
,
May 15 2017
V8 bug is https://bugs.chromium.org/p/v8/issues/detail?id=6361. Downgrading to a P2, since dangerous behavior has been fixed. Current code just doesn't record stats when asynchronous (WASM) compiles occur.
,
Oct 18 2017
Done. |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by mtrofin@chromium.org
, Apr 7 2017