New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 824681 link

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: 2018-04-16
OS: Linux , Android , Windows , iOS , Chrome , Mac , Fuchsia
Pri: 1
Type: Bug



Sign in to add a comment

Wasm async compilation might never finish

Project Member Reported by clemensh@chromium.org, Mar 22 2018

Issue description

We have a potential deadlock in async compilation.

The finisher checks in a loop whether a new compilation unit is executed and needs to be finished. If none is available, it sets the finisher_is_running_ flag to false and exits.
Background tasks concurrently execute compilation units and put them in the queue to be finished. At this time they check whether a finisher is running, and start one if not.

Now it might happen that
1) the finisher detects that the queue is empty
2) a background task puts a unit in the finishing queue, but does not start a finisher since the flag is still set
3) the finisher clears the flag and exits.

Now in most cases this still is no problem, since later another unit will be put into the queue, and at that point the finisher is restarted.
If there are no more units though, this is a deadlock (async compilation never finishes, hence the promise will never be resolved or rejected).

Reproducer:
================================================================
load('test/mjsunit/wasm/wasm-constants.js');
load('test/mjsunit/wasm/wasm-module-builder.js');

let chain = Promise.resolve();
const builder = new WasmModuleBuilder();
for (let i = 0; i < 100; ++i) {
  builder.addFunction('fun' + i, kSig_i_v)
      .addBody([...wasmI32Const(i)])
      .exportFunc();
}
const buffer = builder.toBuffer();
for (let i = 0; i < 2000; ++i) {
  chain = chain.then(() => builder.asyncInstantiate());
}
chain.then(({module, instance}) => instance.exports.fun1155())
    .then(res => print('Result of executing fun1155: ' + res));
================================================================

This bug exists at least since M65.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Mar 22 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/be1b2d66c088da3e191956bd49ac07f55fb4e929

commit be1b2d66c088da3e191956bd49ac07f55fb4e929
Author: Clemens Hammacher <clemensh@chromium.org>
Date: Thu Mar 22 11:57:21 2018

[wasm] Fix deadlock on async compilation

See referenced bug: Async compilation can deadlock if a background task
queues the last compilation unit to be finished while the finisher
is already exiting because there was no more work.
This CL fixes this by making the finisher task check for new work after
setting the finisher_is_running_ flag to false.

R=ahaas@chromium.org
CC=kimanh@google.com

Bug:  chromium:824681 
Change-Id: If1f5700a9fdd5d150b36e37a5d14b692c2b0f3fb
Reviewed-on: https://chromium-review.googlesource.com/975301
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Cr-Commit-Position: refs/heads/master@{#52139}
[modify] https://crrev.com/be1b2d66c088da3e191956bd49ac07f55fb4e929/src/wasm/module-compiler.cc
[add] https://crrev.com/be1b2d66c088da3e191956bd49ac07f55fb4e929/test/mjsunit/regress/wasm/regress-824681.js

Labels: Merge-Request-66
Status: Fixed (was: Started)
Fixed. The CL in #1 is not backmergable, but we should fix the same issue on M66 (this fix is very similar).
Please add affected OSs.
Labels: OS-Android OS-Chrome OS-Fuchsia OS-iOS OS-Linux OS-Mac OS-Windows
This affects all OSes.
Project Member

Comment 5 by sheriffbot@chromium.org, Mar 22 2018

Labels: -Merge-Request-66 Merge-Review-66 Hotlist-Merge-Review
This bug requires manual review: Less than 22 days to go before AppStore submit on M66
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), josafat@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: clemensh@chromium.org
Labels: -Hotlist-Merge-Review -Merge-Review-66
Owner: ahaas@chromium.org
Status: Assigned (was: Fixed)
Removing merge request for now. It seems that it still sometimes hangs.
Andreas, can you look into that?

See e.g. https://build.chromium.org/p/client.v8/builders/V8%20Linux/builds/23898

Comment 7 by ahaas@chromium.org, Apr 4 2018

Owner: clemensh@chromium.org
Clemens, I give this issue back to you. I was not able to reproduce this issue locally.
Project Member

Comment 8 by bugdroid1@chromium.org, Apr 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/040a0ab4d446de69aac29958537881212aa9799b

commit 040a0ab4d446de69aac29958537881212aa9799b
Author: Clemens Hammacher <clemensh@chromium.org>
Date: Mon Apr 09 15:45:26 2018

[wasm] Clean up mutexes in CompilationState

CompilationState had three different mutexes, plus two atomic fields.
Not holding the right mutexes at the right time has already led to
failures. Hence, only use a single mutex to protect all shared state of
the CompilationState.

R=ahaas@chromium.org

Bug:  chromium:824681 
Change-Id: I2c414f3ddb75e82944621590493fadcbbdfb781c
Reviewed-on: https://chromium-review.googlesource.com/1000783
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Cr-Commit-Position: refs/heads/master@{#52481}
[modify] https://crrev.com/040a0ab4d446de69aac29958537881212aa9799b/src/wasm/module-compiler.cc

Project Member

Comment 9 by bugdroid1@chromium.org, Apr 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/231a96bb7237a5491ad1f59ef7e5a8f969cfac61

commit 231a96bb7237a5491ad1f59ef7e5a8f969cfac61
Author: Clemens Hammacher <clemensh@chromium.org>
Date: Mon Apr 09 16:42:40 2018

[wasm] Fix deadlock in async compilation

This fixes a deadlock related to throttling: It can happen that all
background tasks detect that they should not produce more work because
of throttling (!CanAcceptWork()). Reducing the number of running
background tasks is done in a later step (OnBackgroundTaskStopped).
If the finisher task finishes all outstanding units between these two
calls, it will not schedule another background compilation task, but
all background compilation tasks will quit, hence compilation will
never finish.

Fixing this should allow us to reenable the 'wasm-finish-compilation'
test: https://crrev.com/c/999632

R=ahaas@chromium.org

Bug:  chromium:824681 
Change-Id: I967e4d6b2917d369dd49bb80ce4bef552d10b371
Reviewed-on: https://chromium-review.googlesource.com/1002174
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Cr-Commit-Position: refs/heads/master@{#52483}
[modify] https://crrev.com/231a96bb7237a5491ad1f59ef7e5a8f969cfac61/src/wasm/module-compiler.cc

Project Member

Comment 10 by bugdroid1@chromium.org, Apr 10 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/bbb26b5f7532adba82621b25f11eb5df8f15f8d2

commit bbb26b5f7532adba82621b25f11eb5df8f15f8d2
Author: Clemens Hammacher <clemensh@chromium.org>
Date: Tue Apr 10 08:24:27 2018

Reland "Reland "[d8][wasm] Test wasm compilation completion""

The deadlock should be fixed with https://crrev.com/c/1002174.

This is a reland of 4d1c2907d365465b71e5a836d73377d4f900d421

Original change's description:
> Reland "[d8][wasm] Test wasm compilation completion"
>
> This is a reland of ed2605f04095b9dcc0550e5f00757f4f34881c27
>
> Original change's description:
> > [d8][wasm] Test wasm compilation completion
> >
> > d8 was recently changed to keep running until wasm compilation has
> > completed. This adds a message test to test that.
> >
> > R=ahaas@chromium.org
> >
> > Change-Id: I73af53b6df4ee5f9a6afd26cf2d71a269140465f
> > Reviewed-on: https://chromium-review.googlesource.com/966184
> > Reviewed-by: Andreas Haas <ahaas@chromium.org>
> > Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
> > Cr-Commit-Position: refs/heads/master@{#52008}
>
> Change-Id: Iadbd5056dfa58da454956c4e89369af8b0455b35
> Reviewed-on: https://chromium-review.googlesource.com/975242
> Reviewed-by: Andreas Haas <ahaas@chromium.org>
> Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#52154}

Bug:  chromium:824681 
Change-Id: I4077645bcfcb2320f6573bb779027add36feee3f
Reviewed-on: https://chromium-review.googlesource.com/999632
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Cr-Commit-Position: refs/heads/master@{#52505}
[add] https://crrev.com/bbb26b5f7532adba82621b25f11eb5df8f15f8d2/test/message/wasm-finish-compilation.js
[add] https://crrev.com/bbb26b5f7532adba82621b25f11eb5df8f15f8d2/test/message/wasm-finish-compilation.out
[modify] https://crrev.com/bbb26b5f7532adba82621b25f11eb5df8f15f8d2/test/mjsunit/wasm/wasm-module-builder.js

Labels: M-67 Merge-Request-66
Status: Fixed (was: Assigned)
Fixed. Nothing pops up in canary, but given that the latest canary was quite broken, we should wait one more day.

Requesting backmerge to 66, as this is a severe problem (wasm compilation just never finishes if you are unlucky). We will have to manually merge #1, #8 and #9.
Project Member

Comment 12 by sheriffbot@chromium.org, Apr 11 2018

Labels: -Merge-Request-66 Merge-Review-66 Hotlist-Merge-Review
This bug requires manual review: Less than 2 days to go before AppStore submit on M66
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), josafat@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Can this wait for M67?
It's not a security bug, but it causes webpages to misbehave. WebAssembly compilation will never finish, hence the promise will never be resolved. It's likely that customers will hit this.
Project Member

Comment 15 by bugdroid1@chromium.org, Apr 12 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/a05c7d51b18a8c2e06e575e13afe6a355e559ce5

commit a05c7d51b18a8c2e06e575e13afe6a355e559ce5
Author: Clemens Hammacher <clemensh@chromium.org>
Date: Thu Apr 12 08:40:48 2018

[wasm] Fix data race on failed_ field

R=ahaas@chromium.org

Bug:  chromium:831989 ,  chromium:824681 
Change-Id: I0a8b2cc9f80af5f954bd358c30a3c6d84b6adeae
Reviewed-on: https://chromium-review.googlesource.com/1009603
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Cr-Commit-Position: refs/heads/master@{#52561}
[modify] https://crrev.com/a05c7d51b18a8c2e06e575e13afe6a355e559ce5/src/wasm/module-compiler.cc

Please merge after this has sufficient Canary coverage (next week)
NextAction: 2018-04-16
M66 stable release in next Tuesday and Android already has the release candidate.
How urgent is this? My recommendation is to target this for M67. 
Given that we won't hit the M66 stable release anyway, that would be fine for me as well. Even though I am a bit uncomfortable with this bug in the stable release for another six weeks.
The NextAction date has arrived: 2018-04-16
Labels: -Merge-Review-66 Merge-Rejected-66
Labels: Merge-Request-67
#15 fixes a TSan issue, and did not make it for the M67 branch. Requesting backmerge of that single CL for M67.
Project Member

Comment 24 by sheriffbot@chromium.org, Apr 18 2018

Labels: -Merge-Request-67 Merge-Approved-67 Hotlist-Merge-Approved
Your change meets the bar and is auto-approved for M67. Please go ahead and merge the CL to branch 3396 manually. Please contact milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), kbleicher@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 25 by bugdroid1@chromium.org, Apr 18 2018

Labels: merge-merged-6.7
The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/a8c50f4fe88de0e3340bd7bad6ffdb340692f872

commit a8c50f4fe88de0e3340bd7bad6ffdb340692f872
Author: Clemens Hammacher <clemensh@chromium.org>
Date: Wed Apr 18 08:25:56 2018

Merged: [wasm] Fix data race on failed_ field

R=​ahaas@chromium.org

Change-Id: I6cfd5812675ef7700a1af2bf00ed4cceca0919d5
No-Try: true
No-Presubmit: true
No-Treechecks: true
Bug:  chromium:831989 ,  chromium:824681 
Originally-reviewed-on: https://chromium-review.googlesource.com/1009603
Reviewed-on: https://chromium-review.googlesource.com/1016280
Reviewed-by: Andreas Haas <ahaas@chromium.org>
Commit-Queue: Clemens Hammacher <clemensh@chromium.org>
Cr-Commit-Position: refs/branch-heads/6.7@{#22}
Cr-Branched-From: 8457e810efd34381448d51d93f50079cf1f6a812-refs/heads/6.7.288@{#2}
Cr-Branched-From: e921be5c4f2c6407936bde750992dedbf47c1016-refs/heads/master@{#52547}
[modify] https://crrev.com/a8c50f4fe88de0e3340bd7bad6ffdb340692f872/src/wasm/module-compiler.cc

Labels: -Merge-Approved-67

Comment 27 by ahaas@chromium.org, Apr 19 2018

 Issue 829463  has been merged into this issue.

Comment 28 by ahaas@chromium.org, Apr 23 2018

Labels: -Merge-Rejected-66 Merge-Request-66
I re-open the merge request because it turned out that this issue is blocking an important customer, see  https://crbug.com/829463 . The important CLs are in Comment #1, #8, and #9.

Ben, could you please comment on this merge request?
Project Member

Comment 29 by sheriffbot@chromium.org, Apr 23 2018

Labels: -Merge-Request-66 Merge-Review-66
This bug requires manual review: Request affecting a post-stable build
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), josafat@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
How safe is this merge overall? How well tested is this?
Cc: nattestad@chromium.org
This has been in canary for 14 days, and in dev for 6 days. We didn't see any regressions.
Even though there might not be another respin of M66, we should merge it back just in case.
Labels: -Merge-Review-66 Merge-Approved-66
Labels: -Merge-Approved-66
The fixes cannot easily be backmerged, so this will only be fixed in M67.

Sign in to add a comment