Wasm async compilation might never finish |
|||||||||||||||||||
Issue description
We have a potential deadlock in async compilation.
The finisher checks in a loop whether a new compilation unit is executed and needs to be finished. If none is available, it sets the finisher_is_running_ flag to false and exits.
Background tasks concurrently execute compilation units and put them in the queue to be finished. At this time they check whether a finisher is running, and start one if not.
Now it might happen that
1) the finisher detects that the queue is empty
2) a background task puts a unit in the finishing queue, but does not start a finisher since the flag is still set
3) the finisher clears the flag and exits.
Now in most cases this still is no problem, since later another unit will be put into the queue, and at that point the finisher is restarted.
If there are no more units though, this is a deadlock (async compilation never finishes, hence the promise will never be resolved or rejected).
Reproducer:
================================================================
load('test/mjsunit/wasm/wasm-constants.js');
load('test/mjsunit/wasm/wasm-module-builder.js');
let chain = Promise.resolve();
const builder = new WasmModuleBuilder();
for (let i = 0; i < 100; ++i) {
builder.addFunction('fun' + i, kSig_i_v)
.addBody([...wasmI32Const(i)])
.exportFunc();
}
const buffer = builder.toBuffer();
for (let i = 0; i < 2000; ++i) {
chain = chain.then(() => builder.asyncInstantiate());
}
chain.then(({module, instance}) => instance.exports.fun1155())
.then(res => print('Result of executing fun1155: ' + res));
================================================================
This bug exists at least since M65.
,
Mar 22 2018
Fixed. The CL in #1 is not backmergable, but we should fix the same issue on M66 (this fix is very similar).
,
Mar 22 2018
Please add affected OSs.
,
Mar 22 2018
This affects all OSes.
,
Mar 22 2018
This bug requires manual review: Less than 22 days to go before AppStore submit on M66 Please contact the milestone owner if you have questions. Owners: cmasso@(Android), cmasso@(iOS), josafat@(ChromeOS), abdulsyed@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Mar 22 2018
Removing merge request for now. It seems that it still sometimes hangs. Andreas, can you look into that? See e.g. https://build.chromium.org/p/client.v8/builders/V8%20Linux/builds/23898
,
Apr 4 2018
Clemens, I give this issue back to you. I was not able to reproduce this issue locally.
,
Apr 9 2018
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/040a0ab4d446de69aac29958537881212aa9799b commit 040a0ab4d446de69aac29958537881212aa9799b Author: Clemens Hammacher <clemensh@chromium.org> Date: Mon Apr 09 15:45:26 2018 [wasm] Clean up mutexes in CompilationState CompilationState had three different mutexes, plus two atomic fields. Not holding the right mutexes at the right time has already led to failures. Hence, only use a single mutex to protect all shared state of the CompilationState. R=ahaas@chromium.org Bug: chromium:824681 Change-Id: I2c414f3ddb75e82944621590493fadcbbdfb781c Reviewed-on: https://chromium-review.googlesource.com/1000783 Reviewed-by: Andreas Haas <ahaas@chromium.org> Commit-Queue: Clemens Hammacher <clemensh@chromium.org> Cr-Commit-Position: refs/heads/master@{#52481} [modify] https://crrev.com/040a0ab4d446de69aac29958537881212aa9799b/src/wasm/module-compiler.cc
,
Apr 9 2018
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/231a96bb7237a5491ad1f59ef7e5a8f969cfac61 commit 231a96bb7237a5491ad1f59ef7e5a8f969cfac61 Author: Clemens Hammacher <clemensh@chromium.org> Date: Mon Apr 09 16:42:40 2018 [wasm] Fix deadlock in async compilation This fixes a deadlock related to throttling: It can happen that all background tasks detect that they should not produce more work because of throttling (!CanAcceptWork()). Reducing the number of running background tasks is done in a later step (OnBackgroundTaskStopped). If the finisher task finishes all outstanding units between these two calls, it will not schedule another background compilation task, but all background compilation tasks will quit, hence compilation will never finish. Fixing this should allow us to reenable the 'wasm-finish-compilation' test: https://crrev.com/c/999632 R=ahaas@chromium.org Bug: chromium:824681 Change-Id: I967e4d6b2917d369dd49bb80ce4bef552d10b371 Reviewed-on: https://chromium-review.googlesource.com/1002174 Commit-Queue: Clemens Hammacher <clemensh@chromium.org> Reviewed-by: Andreas Haas <ahaas@chromium.org> Cr-Commit-Position: refs/heads/master@{#52483} [modify] https://crrev.com/231a96bb7237a5491ad1f59ef7e5a8f969cfac61/src/wasm/module-compiler.cc
,
Apr 10 2018
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/bbb26b5f7532adba82621b25f11eb5df8f15f8d2 commit bbb26b5f7532adba82621b25f11eb5df8f15f8d2 Author: Clemens Hammacher <clemensh@chromium.org> Date: Tue Apr 10 08:24:27 2018 Reland "Reland "[d8][wasm] Test wasm compilation completion"" The deadlock should be fixed with https://crrev.com/c/1002174. This is a reland of 4d1c2907d365465b71e5a836d73377d4f900d421 Original change's description: > Reland "[d8][wasm] Test wasm compilation completion" > > This is a reland of ed2605f04095b9dcc0550e5f00757f4f34881c27 > > Original change's description: > > [d8][wasm] Test wasm compilation completion > > > > d8 was recently changed to keep running until wasm compilation has > > completed. This adds a message test to test that. > > > > R=ahaas@chromium.org > > > > Change-Id: I73af53b6df4ee5f9a6afd26cf2d71a269140465f > > Reviewed-on: https://chromium-review.googlesource.com/966184 > > Reviewed-by: Andreas Haas <ahaas@chromium.org> > > Commit-Queue: Clemens Hammacher <clemensh@chromium.org> > > Cr-Commit-Position: refs/heads/master@{#52008} > > Change-Id: Iadbd5056dfa58da454956c4e89369af8b0455b35 > Reviewed-on: https://chromium-review.googlesource.com/975242 > Reviewed-by: Andreas Haas <ahaas@chromium.org> > Commit-Queue: Clemens Hammacher <clemensh@chromium.org> > Cr-Commit-Position: refs/heads/master@{#52154} Bug: chromium:824681 Change-Id: I4077645bcfcb2320f6573bb779027add36feee3f Reviewed-on: https://chromium-review.googlesource.com/999632 Commit-Queue: Clemens Hammacher <clemensh@chromium.org> Reviewed-by: Andreas Haas <ahaas@chromium.org> Cr-Commit-Position: refs/heads/master@{#52505} [add] https://crrev.com/bbb26b5f7532adba82621b25f11eb5df8f15f8d2/test/message/wasm-finish-compilation.js [add] https://crrev.com/bbb26b5f7532adba82621b25f11eb5df8f15f8d2/test/message/wasm-finish-compilation.out [modify] https://crrev.com/bbb26b5f7532adba82621b25f11eb5df8f15f8d2/test/mjsunit/wasm/wasm-module-builder.js
,
Apr 11 2018
Fixed. Nothing pops up in canary, but given that the latest canary was quite broken, we should wait one more day. Requesting backmerge to 66, as this is a severe problem (wasm compilation just never finishes if you are unlucky). We will have to manually merge #1, #8 and #9.
,
Apr 11 2018
This bug requires manual review: Less than 2 days to go before AppStore submit on M66 Please contact the milestone owner if you have questions. Owners: cmasso@(Android), cmasso@(iOS), josafat@(ChromeOS), abdulsyed@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Apr 11 2018
Can this wait for M67?
,
Apr 12 2018
It's not a security bug, but it causes webpages to misbehave. WebAssembly compilation will never finish, hence the promise will never be resolved. It's likely that customers will hit this.
,
Apr 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/a05c7d51b18a8c2e06e575e13afe6a355e559ce5 commit a05c7d51b18a8c2e06e575e13afe6a355e559ce5 Author: Clemens Hammacher <clemensh@chromium.org> Date: Thu Apr 12 08:40:48 2018 [wasm] Fix data race on failed_ field R=ahaas@chromium.org Bug: chromium:831989 , chromium:824681 Change-Id: I0a8b2cc9f80af5f954bd358c30a3c6d84b6adeae Reviewed-on: https://chromium-review.googlesource.com/1009603 Reviewed-by: Andreas Haas <ahaas@chromium.org> Commit-Queue: Clemens Hammacher <clemensh@chromium.org> Cr-Commit-Position: refs/heads/master@{#52561} [modify] https://crrev.com/a05c7d51b18a8c2e06e575e13afe6a355e559ce5/src/wasm/module-compiler.cc
,
Apr 12 2018
Please merge after this has sufficient Canary coverage (next week)
,
Apr 12 2018
,
Apr 12 2018
M66 stable release in next Tuesday and Android already has the release candidate.
,
Apr 13 2018
How urgent is this? My recommendation is to target this for M67.
,
Apr 13 2018
Given that we won't hit the M66 stable release anyway, that would be fine for me as well. Even though I am a bit uncomfortable with this bug in the stable release for another six weeks.
,
Apr 16 2018
The NextAction date has arrived: 2018-04-16
,
Apr 16 2018
,
Apr 17 2018
#15 fixes a TSan issue, and did not make it for the M67 branch. Requesting backmerge of that single CL for M67.
,
Apr 18 2018
Your change meets the bar and is auto-approved for M67. Please go ahead and merge the CL to branch 3396 manually. Please contact milestone owner if you have questions. Owners: cmasso@(Android), cmasso@(iOS), kbleicher@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Apr 18 2018
The following revision refers to this bug: https://chromium.googlesource.com/v8/v8.git/+/a8c50f4fe88de0e3340bd7bad6ffdb340692f872 commit a8c50f4fe88de0e3340bd7bad6ffdb340692f872 Author: Clemens Hammacher <clemensh@chromium.org> Date: Wed Apr 18 08:25:56 2018 Merged: [wasm] Fix data race on failed_ field R=ahaas@chromium.org Change-Id: I6cfd5812675ef7700a1af2bf00ed4cceca0919d5 No-Try: true No-Presubmit: true No-Treechecks: true Bug: chromium:831989 , chromium:824681 Originally-reviewed-on: https://chromium-review.googlesource.com/1009603 Reviewed-on: https://chromium-review.googlesource.com/1016280 Reviewed-by: Andreas Haas <ahaas@chromium.org> Commit-Queue: Clemens Hammacher <clemensh@chromium.org> Cr-Commit-Position: refs/branch-heads/6.7@{#22} Cr-Branched-From: 8457e810efd34381448d51d93f50079cf1f6a812-refs/heads/6.7.288@{#2} Cr-Branched-From: e921be5c4f2c6407936bde750992dedbf47c1016-refs/heads/master@{#52547} [modify] https://crrev.com/a8c50f4fe88de0e3340bd7bad6ffdb340692f872/src/wasm/module-compiler.cc
,
Apr 18 2018
,
Apr 19 2018
Issue 829463 has been merged into this issue.
,
Apr 23 2018
I re-open the merge request because it turned out that this issue is blocking an important customer, see https://crbug.com/829463 . The important CLs are in Comment #1, #8, and #9. Ben, could you please comment on this merge request?
,
Apr 23 2018
This bug requires manual review: Request affecting a post-stable build Please contact the milestone owner if you have questions. Owners: cmasso@(Android), cmasso@(iOS), josafat@(ChromeOS), abdulsyed@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Apr 23 2018
How safe is this merge overall? How well tested is this?
,
Apr 24 2018
This has been in canary for 14 days, and in dev for 6 days. We didn't see any regressions. Even though there might not be another respin of M66, we should merge it back just in case.
,
Apr 24 2018
,
Apr 25 2018
The fixes cannot easily be backmerged, so this will only be fixed in M67. |
|||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||
Comment 1 by bugdroid1@chromium.org
, Mar 22 2018