Issue 829831

Starred by 68 users

Issue metadata

Status: Duplicate
Merged: issue 812137
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug




problem with Chromium WebRTC stack when it is being used for the first time.

Reported by vliag...@gmail.com, Apr 6 2018

Issue description

UserAgent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36

Steps to reproduce the problem:
We have reports from our customers that Chrome intermittently freezes for a few seconds, and sometimes minutes, when starting a WebRTC call for the first time after a PC reboot. After some seconds or minutes it recovers by itself and the WebRTC call is finally established. The second call and all subsequent calls work fine.

The problem occurs with different WebRTC apps, such as Circuit and appr.tc, and only in Chrome. The same test works fine in Firefox.

Steps to reproduce the problem:
1. Reboot the PC.
2. Start a WebRTC audio call for the first time. It doesn't matter whether the call takes place immediately after the PC reboot or hours later; it just needs to be the first call after startup.

The problem is that Chrome is frozen for up to 4 minutes until the call is finally established.

It looks like the issue is with the Chromium WebRTC stack when it is being used for the first time.  There must be some initialization which causes the freeze.

Tests took place on 06/04/2018 (6 April 2018), CET.

- The user rebooted his PC.

- At 13:10 he started a WebRTC call using the Circuit app in Firefox, and it worked.

- At 13:15 he started a WebRTC call using https://appr.tc/ in Chrome. Chrome froze for about one minute before the call was finally established.

- At 13:19 he started a WebRTC call using the Circuit app in Chrome, and it worked immediately. This succeeded because it came after the first WebRTC call with https://appr.tc.

---- Here the user rebooted his PC again.

- At 13:34 he started a WebRTC call using the Circuit app in Chrome, and it failed. Chrome froze completely.

- At 13:43 he started a WebRTC call using the Circuit app in Chrome, and it worked. This was the second call!

Please find attached chrome_debug logs and wireshark logs for each test. 

What is the expected behavior?
The browser should not freeze.

What went wrong?
It looks like the issue is with the Chromium WebRTC stack when it is being used for the first time. 

Did this work before? N/A 

Chrome version: 65.0.3325.181  Channel: stable
OS Version: 6.1 (Windows 7, Windows Server 2008 R2)
Flash Version:
 

Comment 1 by vliag...@gmail.com, Apr 6 2018

Chrome_logs.zip
31.6 MB Download
Just some additional comments.

The same problem happens both with the latest Chrome browser (v65) and with our desktop application, which is based on Electron 1.8.3 (which uses Chrome v59).

As far as we can tell, the freeze seems to happen after the application invokes the RTCPeerConnection.setLocalDescription() API. 
Components: Blink>WebRTC
Cc: pbomm...@chromium.org guidou@chromium.org abdulsyed@chromium.org tommi@chromium.org gov...@chromium.org
Labels: Needs-Triage-M65
Owner: hbos@chromium.org
Status: Assigned (was: Unconfirmed)
hbos@: Can you take a look? This could be a duplicate of one of the freezes you recently fixed.
Labels: ReleaseBlock-Stable M-66
Tagging as M66 stable blocker just in case merge is needed.
Reminder: Please note that M66 Stable is only 7 days away. This bug has been marked as ReleaseBlock Stable for M66. So please take a look and appropriately address this bug. 

Comment 8 by hbos@chromium.org, Apr 10 2018

Labels: Needs-Feedback
The freeze issues I've dealt with are unrecoverable deadlocks. Any issue about taking a minute to establish a call would be a separate issue.

Having to reboot the PC sounds really weird, and this being in stable with as simple repro instructions as "start a WebRTC call" sounds even weirder. There must be something special about the setup or else WebRTC would be very much broken?

vliagkou@ can you check if this still happens in Chrome Canary? Not sure I can repro.

Comment 9 by hbos@chromium.org, Apr 10 2018

When you say it freezes, are you just talking about the web application / the tab freezing, or the entire browser window freezing? Maybe this is related to https://crbug.com/817314.

Comment 10 by olka@chromium.org, Apr 10 2018

Cc: olka@chromium.org
Could someone who can repro it do a trace recording [1]: start it right before launching apprtc, select all categories, record until full?
Thanks!

[1] https://www.chromium.org/developers/how-tos/trace-event-profiling-tool/recording-tracing-runs

Comment 11 by vliag...@gmail.com, Apr 10 2018

hbos@ The entire browser window is freezing. 

Also, the user tested with Chrome Canary and the problem is still the same. 

Comment 12 by hbos@chromium.org, Apr 10 2018

Thanks. vliagkou@, can you try what olka@ suggested in #10?
1. Prepare to set up a call in one tab, and go to chrome://tracing in another tab.
2. Start recording in chrome://tracing.
3. Press call so that it freezes.
4. As soon as the browser is responding again, go to the tracing tab and finish the recording.
5. Save the tracing as a file and upload it to this issue by attaching a file to your comment.
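
Once a trace has been saved this way, long-running slices can be listed offline. The following is a minimal sketch, assuming the export follows the Trace Event Format (a bare list of events or an object with a "traceEvents" key, with "dur" in microseconds on complete "X" events). The helper names and the sample events are hypothetical and are not taken from any trace attached to this issue. Tasks recorded as separate begin/end ("B"/"E") events are not handled here.

```python
import gzip
import json


def load_trace_events(path):
    """Load events from a chrome://tracing export (.json or .json.gz)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        data = json.load(f)
    # Exports are either a bare list of events or an object with "traceEvents".
    return data["traceEvents"] if isinstance(data, dict) else data


def long_slices(events, min_seconds=1.0):
    """Return (seconds, name, pid, tid) for complete events ("ph" == "X")
    lasting at least min_seconds, longest first."""
    threshold_us = min_seconds * 1_000_000  # "dur" is in microseconds
    hits = [
        (e["dur"] / 1_000_000, e.get("name", "?"), e.get("pid"), e.get("tid"))
        for e in events
        if e.get("ph") == "X" and e.get("dur", 0) >= threshold_us
    ]
    return sorted(hits, key=lambda h: h[0], reverse=True)


# Synthetic events for illustration only (hypothetical names and durations).
sample = [
    {"ph": "X", "name": "ShortTask", "pid": 1, "tid": 10, "ts": 0, "dur": 2_000},
    {"ph": "X", "name": "LongTask", "pid": 1, "tid": 10, "ts": 5_000, "dur": 86_000_000},
    {"ph": "B", "name": "Unpaired", "pid": 1, "tid": 11, "ts": 0},
]
for seconds, name, pid, tid in long_slices(sample):
    print(f"{seconds:8.1f}s  {name}  pid={pid} tid={tid}")
```

For a real file, `long_slices(load_trace_events("trace.json.gz"))` would be the equivalent call, where the file name is a placeholder.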

Comment 13 by vliag...@gmail.com, Apr 10 2018

hbos@ Please find the requested trace attached. 
trace_Tue_Apr_10_2018_15.08.48.json.gz
6.8 MB Download
Cc: manoranj...@chromium.org
Any updates on this bug? 
Labels: -M-66 M-67
Given M66 stable timeline, punting this to M67.

Comment 16 by hbos@chromium.org, Apr 17 2018

Cc: roc...@chromium.org sergeyu@chromium.org hbos@chromium.org
Components: -Blink>WebRTC Blink>WebRTC>Network Internals>GPU>Internals
Owner: ----
Status: Untriaged (was: Assigned)
Looks like it is stuck at...

MessageLoop::RunTask > IPC Channel > ChannelMojo::OnMessageReceived > P2PHostMsg_SetOption for 86s,

SingleThreadTaskGraphRunner::RunTaskWithLockAcquired > RasterizerTaskImpl::RunOnWorkerThread > RasterTask > OneCopyRasterBuffer::Playback > BrowserGpuMemoryBufferManager::AllocateGpuMemoryBufferForSurface for 86s, and

MessageLoop::RunTask > GpuChannelHost::CreateViewCommandBuffer > CommandBufferProxyImpl::Initialize > GpuChannelHost::Send for 52s

I'm not sure what to do with this one, can someone re-triage?

Comment 17 by vliag...@gmail.com, Apr 20 2018

Is there any feedback for this problem? Thank you
M67 Stable promotion is coming soon. Your bug is labelled as Stable ReleaseBlock, pls make sure to land the fix and request a merge into the release branch ASAP. Thank you.


I tried to reproduce a few times to no avail, following the steps:

1. Restart PC.
2. Open Chrome dev build (67.0.3396.18)
3. Open appr.tc/r/roomname
4. Click "Join"
5. Open in another tab, click "Join"

Though after some investigation, I believe this may be a duplicate of another "freeze" bug ( crbug.com/812137 ) caused by the Windows QOSCreateHandle API hanging (root cause still unknown; possibly a Windows bug). This is indeed called from P2PHostMsg_SetOption (with OPT_DSCP).

Initially I was confused about how this could happen in AppRTC, since it doesn't use DSCP by default (it's enabled via a "googDscp" constraint when constructing the PeerConnection). But then I found that even when DSCP is disabled, we still call "SetOption(OPT_DSCP, DSCP_DEFAULT)", so we would potentially hit this bug regardless of whether DSCP is actually enabled.
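
The reasoning above can be illustrated with a toy model. This is only a sketch and not Chromium code: `ToySocket`, `start_call`, and the numeric values are hypothetical, and it assumes (as the comment suggests) that any OPT_DSCP call reaches the QoS handle creation regardless of the value passed.

```python
OPT_DSCP = "OPT_DSCP"
DSCP_DEFAULT = 0   # placeholder value for this toy model
DSCP_AF41 = 34     # example non-default code point


class ToySocket:
    """Hypothetical stand-in for the socket host; not Chromium code."""

    def __init__(self):
        self.qos_handle_calls = 0

    def _create_qos_handle(self):
        # Stand-in for the Windows QOSCreateHandle call that can hang.
        self.qos_handle_calls += 1

    def set_option(self, option, value):
        # Assumption: the QoS path is entered for any OPT_DSCP call,
        # whatever the value is.
        if option == OPT_DSCP:
            self._create_qos_handle()


def start_call(sock, dscp_enabled):
    # The option is always set; only the value differs.
    value = DSCP_AF41 if dscp_enabled else DSCP_DEFAULT
    sock.set_option(OPT_DSCP, value)


sock = ToySocket()
start_call(sock, dscp_enabled=False)
print("QoS handle calls with DSCP disabled:", sock.qos_handle_calls)
```

In this model the potentially hanging call is reached even with DSCP disabled, which is why an app that never sets "googDscp" could still be affected.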

To temporarily work around this hang, we've disabled calls to QOSCreateHandle (in M68, not M67). So if you can reproduce, can you verify the freeze doesn't occur in a Canary (M68) build? I'll request that this change be merged to M67, since it has a larger impact than we thought.

Also, for future reference: I think a crash ID (from chrome://crashes) or minidump (see https://www.chromium.org/for-testers/bug-reporting-guidelines/hanging-tabs) would be more helpful for debugging than a performance trace. 
Components: -Internals>GPU>Internals
Labels: -Needs-Feedback
Owner: sergeyu@chromium.org
Status: Assigned (was: Untriaged)
#16 I don't think this is a GPU issue. The GPU calls are stuck because the IO thread is stuck inside P2PHostMsg_SetOption.
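
As a generic illustration of this point (not Chromium code), the sketch below queues a quick task behind a blocking one on a single worker thread. The function names are hypothetical, and the single-worker executor is only an analogy for one thread serving several kinds of messages.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_set_option(seconds):
    # Stand-in for a call that blocks its thread (e.g. a hanging OS API).
    time.sleep(seconds)
    return "set_option done"


def quick_reply():
    # Stand-in for unrelated work queued on the same thread.
    return "gpu reply"


def measure_wait(block_seconds):
    """Queue a quick task behind a blocking one on a single worker thread
    and return how long the quick task took to complete, in seconds."""
    with ThreadPoolExecutor(max_workers=1) as io_thread:
        start = time.perf_counter()
        io_thread.submit(slow_set_option, block_seconds)
        quick = io_thread.submit(quick_reply)
        quick.result()  # cannot finish until the blocking task returns
        return time.perf_counter() - start


waited = measure_wait(0.2)
print(f"quick task waited {waited:.2f}s behind the blocked one")
```

The quick task itself does almost no work, yet its completion time is dominated by the task ahead of it, which matches unrelated slices in a trace showing similar long durations.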

sergeyu@ I'm assigning this to you because you're a content/renderer/p2p OWNER. Please help us in triaging this further.
See my previous comment; I'm 99% sure this is  crbug.com/812137 , just need someone who can reproduce to confirm.
*** Bulk Edit ***
M67 Stable promotion is coming soon. Your bug is labelled as Stable ReleaseBlock, pls make sure to land the fix and request a merge into the release branch ASAP. 

If fix is already merged to M67 and nothing else is pending, pls mark the bug as fixed. Thank you.
Note that a fix for  crbug.com/812137  was already merged to M67 (https://chromium-review.googlesource.com/1005961), but I'm waiting for someone who can reproduce this to verify that  crbug.com/812137  is really the root cause.
@deadbeed, we have asked the user who can reproduce the problem to test with the latest Canary version. We are waiting for feedback from him.
*** Bulk Edit ***
M67 Stable promotion is coming VERY soon. Your bug is labelled as Stable ReleaseBlock, pls make sure to land the fix and request a merge into the release branch ASAP. 

If fix is already merged to M67 and nothing else is pending, pls mark the bug as fixed. Thank you.
Labels: -M-67 M-68
As this regressed in M65 and M67 stable promotion is coming soon, punting this to M68.
Labels: -M-68 M-67
Mergedinto: 812137
Status: Duplicate (was: Assigned)
Going to go ahead and tentatively mark this as a duplicate to reduce noise.
Cc: zstein@chromium.org
