New issue
Advanced search Search tips

Issue 850183 link

Starred by 4 users

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows
Pri: 2
Type: Bug



Sign in to add a comment

Crash in WebSocketStreamRequestImpl::PerformUpgrade (public bug)

Project Member Reported by b...@chromium.org, Jun 6 2018

Issue description

There were a lot of crashes reported in WebSocketStreamRequestImpl::PerformUpgrade().  There is a bug specifically to address that, issue 842575, which is restricted as crasher bugs usually are.  The crash has been suppressed by https://crrev.com/c/1070327.

However, the underlying issue is not solved yet.  https://crrev.com/c/1070327 needs to be reverted, and the problem actually solved.  I am opening this public bug to track that effort.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Jun 8 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ca227d1ebcec554ebb88064873137420957cc678

commit ca227d1ebcec554ebb88064873137420957cc678
Author: Bence Béky <bnc@chromium.org>
Date: Fri Jun 08 05:58:44 2018

Do not fail gracefully in PerformUpgrade().

Do not fail gracefully in WebSocketStreamRequestImpl::PerformUpgrade().
This reverts https://crrev.com/c/1070327, which was only landed so that
crash is temporarily suppressed on Beta and Stable.

This CL will cause crashes in
WebSocketStreamRequestImpl::PerformUpgrade() or in other places, which
then can be investigated by the usual means (peppering more CHECKs in
the code, code inspections, etc.) and the underlying cause can be
solved.

Bug: 850183
Change-Id: I7b7b8a1d1b37d0881cc68bfac278c45f38bbaae8
Reviewed-on: https://chromium-review.googlesource.com/1091134
Commit-Queue: Adam Rice <ricea@chromium.org>
Reviewed-by: Adam Rice <ricea@chromium.org>
Cr-Commit-Position: refs/heads/master@{#565564}
[modify] https://crrev.com/ca227d1ebcec554ebb88064873137420957cc678/net/websockets/websocket_stream.cc

Comment 2 by b...@chromium.org, Jun 12 2018

69.0.3455.0 picked up the change above, I'll keep an eye out for new crashes on that version.
Is there some way we can fuzz this code to find a repro?
Project Member

Comment 4 by bugdroid1@chromium.org, Jul 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f06fd249f75f6d40dc50840222ad3e2f5c472505

commit f06fd249f75f6d40dc50840222ad3e2f5c472505
Author: Bence Béky <bnc@chromium.org>
Date: Thu Jul 19 15:52:30 2018

[WS] Handle no handshake stream in PerformUpgrade().

The vast majority of recent crash reports point to line 190:
  CHECK(on_handshake_stream_created_has_been_called_);

Next branch cut is around the corner and this is still number one //net
crasher on M69.  https://crrev.com/c/940463 has been clearly identified
as the culprit.  Much pondering was still not enough to understand the
real underlying cause of this crash since the last branch point.  I
propose to be defensive in WebsocketStream.  While this is not the ideal
solution, I am not hopeful that we will find anything better any time
soon.  This approach allows the request to fail safely, while still
retaining the new WebSocket over HTTP/2 functionality.

Bug: 850183
Change-Id: I21a7c3919466c4339bebce41ee8ad1342fb9c179
Reviewed-on: https://chromium-review.googlesource.com/1140916
Commit-Queue: Bence Béky <bnc@chromium.org>
Reviewed-by: Adam Rice <ricea@chromium.org>
Cr-Commit-Position: refs/heads/master@{#576510}
[modify] https://crrev.com/f06fd249f75f6d40dc50840222ad3e2f5c472505/net/websockets/websocket_stream.cc

https://crrev.com/c/1140916 was picked up by 69.0.3497.0.  The last crash with this signature is from 69.0.3496.0.  I conclude that this crash is fixed (though it would be nice to prevent calls to PerformUpgrade() before handshake stream creating, but at least now we have a graceful failure).

Re comment #3: that's a great idea.

Comment 6 Deleted

Cc: benmason@chromium.org
Labels: ReleaseBlock-Stable M-72 M-71 OS-Android
I see considerable spike in current 'Android' stable# 70.0.3538.80, crashes from >17K unique clients so far.

Ref: https://goto.google.com/ljbji

bnc@, not very sure if this crash is originated from the same root cause as 'OS=Linux', however would be really great if you could take a look and confirm.

Thank you!
This is very strange. It looks like handshake_stream is null in the line

std::unique_ptr<WebSocketStream> stream = handshake_stream->Upgrade();

even though it is copied from handshake_stream_, which has a null check immediately above. This should be impossible unless

a) The line number is wrong
b) Upgrade() is being inlined
c) threads are involved
d) we hit a toolchain bug
e) we hit a CPU bug
f) This version was somehow compiled with an older version of this source file

Upgrade() being inlined would be more plausible if it wasn't a virtual function with two implementations. However, it's still more plausible than any other of my potential explanations.

There also seems to be an uptick in crashes in WebSocketBasicHandshakeStream::Upgrade() and WebSocketHttp2HandshakeStream::Upgrade(). Maybe someone is rolling out an HTTP/2 WebSocket server and the apparent correlation with version 70.0.3538.80 is an illusion?

The crash is not Android-only, which I think rules out d) e) or f).
I missed a possibility: handshake_stream could be a non-null but invalid value. In other words, this could be either heap corruption or a use-after-free situation.
Chrome's CFI (https://www.chromium.org/developers/testing/control-flow-integrity) turns these crashes into "Illegal operation" on Linux, adding extra support to the UAF theory.
Reminder M71 Stable is approaching VERY soon. Please review this bug and assess if this is indeed a RBS. If not, please remove the RBS label. If so, please make sure any planned work will be tested in Beta and verified before the Stable date. Thank you.

Requesting to take a look at M71 blockers ASAP due to upcoming Thanksgiving holidays next week.

Labels: -ReleaseBlock-Stable
Since this is already spiking in M70, I don't think we can consider it a release blocker for M71.

Comment 13 Deleted

Project Member

Comment 14 by bugdroid1@chromium.org, Dec 7

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/68506d8510f51e1b785e1291ced9438df306ea83

commit 68506d8510f51e1b785e1291ced9438df306ea83
Author: Bence Béky <bnc@chromium.org>
Date: Fri Dec 07 16:18:12 2018

Remove WebSocketHandshakeStreamCreateHelper::set_stream_request().

Move creation of WebSocketHandshakeStreamCreateHelper to
WebSocketStreamRequestAPI derived class constructor, making it possible
to eliminate WebSocketHandshakeStreamCreateHelper::set_stream_request().

This is not directly related to the crasher bug, but while I was digging
into the ownership dependencies and lifetime issues I stumbled upon this
low-hanging fruit for simplification.

Bug: 850183
Change-Id: Iaf73ec9d70fa4bfe7aac08314282b31208b9a207
Reviewed-on: https://chromium-review.googlesource.com/c/1365657
Commit-Queue: Bence Béky <bnc@chromium.org>
Reviewed-by: Adam Rice <ricea@chromium.org>
Cr-Commit-Position: refs/heads/master@{#614714}
[modify] https://crrev.com/68506d8510f51e1b785e1291ced9438df306ea83/net/websockets/websocket_channel.cc
[modify] https://crrev.com/68506d8510f51e1b785e1291ced9438df306ea83/net/websockets/websocket_channel.h
[modify] https://crrev.com/68506d8510f51e1b785e1291ced9438df306ea83/net/websockets/websocket_channel_test.cc
[modify] https://crrev.com/68506d8510f51e1b785e1291ced9438df306ea83/net/websockets/websocket_handshake_stream_create_helper.cc
[modify] https://crrev.com/68506d8510f51e1b785e1291ced9438df306ea83/net/websockets/websocket_handshake_stream_create_helper.h
[modify] https://crrev.com/68506d8510f51e1b785e1291ced9438df306ea83/net/websockets/websocket_handshake_stream_create_helper_test.cc
[modify] https://crrev.com/68506d8510f51e1b785e1291ced9438df306ea83/net/websockets/websocket_stream.cc
[modify] https://crrev.com/68506d8510f51e1b785e1291ced9438df306ea83/net/websockets/websocket_stream.h
[modify] https://crrev.com/68506d8510f51e1b785e1291ced9438df306ea83/net/websockets/websocket_stream_create_test_base.cc
[modify] https://crrev.com/68506d8510f51e1b785e1291ced9438df306ea83/net/websockets/websocket_test_util.h

Cc: gov...@chromium.org
bnc@, Just wondering do we need the above fix to M72 branch: 3626? (Seeing your c#14 as 'This is not directly related to the crasher bug', hence the ask).

Coming to the crash, so far we are seeing ~53 unique crashes on current M72 beta# 72.0.3626.14 (Android).

Ref: https://goto.google.com/wtbvr

Thank you!
73.0.3636.2 already has https://crrev.com/c/1365657 (the CL from comment #14) and there are crashes since then.  So this CL indeed does not fix the crash, no reason to merge.
Thank you bnc@.
Issue 908734 has been merged into this issue.

Sign in to add a comment