
Issue 796469

Starred by 17 users

Issue metadata

Status: WontFix
Merged: issue 736725
Owner: ----
Closed: Nov 8
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug-Regression




WebRTC app freezes Chrome tab with subsequent "Aw, Snap"

Reported by teih...@gmail.com, Dec 20 2017

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36

Steps to reproduce the problem:
1. Make an outgoing WebRTC audio call (connected) from Chrome
2. Hang up the above-mentioned call
3. Make an incoming WebRTC audio call to Chrome

It's reproducible on another machine and on another OS (Linux).

What is the expected behavior?
Tab shouldn't freeze

What went wrong?
The tab freezes and can't even be closed by clicking the close cross (X). The last record in the console shows that we set the remote offer and that the getUserMedia success callback was called.

After that, a "page unresponsive" window sometimes (but not always) appears.

Crashed report ID: no

How much crashed? Just one tab

Is it a problem with a plugin? N/A 

Did this work before? Yes 62

Chrome version: 63.0.3239.108  Channel: stable
OS Version: 10.0
Flash Version: Shockwave Flash 28.0 r0

We use a JavaScript library (a sipml5 flavour) for SIP signaling. Everything worked fine until the upgrade to Chrome 63; in other browsers (Firefox, Opera) everything still works fine.

I think it's connected with our JS code, which worked before but is now deprecated in some way. I can't figure out what exactly goes wrong, because when I set a breakpoint in Chrome DevTools and step through the code, everything works fine again (a race, I think).

How can I nail down the problem?
 

Comment 1 Deleted

Comment 2 by teih...@gmail.com, Dec 21 2017

I found that the trouble is connected with reusing a media stream between subsequent connections.
Cc: rbasuvula@chromium.org
Components: UI>Browser
Labels: Needs-Feedback Needs-Triage-M63
@teihrib: Thanks for filing the issue! Could you please provide the crash ID from chrome://crashes and, if possible, create a new profile without extensions and apps? Please re-check and let us know your observations, which would help us triage the issue further.

Thanks in Advance.

Comment 4 by teih...@gmail.com, Jan 18 2018

No, I can't, because now it just freezes and no longer shows "Aw, Snap".
Earlier I thought the trouble had gone away, but it has appeared again, less frequently, and I can't reproduce it (though our customers sometimes hit it).

How can I debug it?
Also, it's strongly connected with WebRTC: the tab freezes after setting the remote SDP for an incoming call.
Project Member

Comment 5 by sheriffbot@chromium.org, Jan 18 2018

Labels: -Needs-Feedback
Thank you for providing more feedback. Adding requester "rbasuvula@chromium.org" to the cc list and removing "Needs-Feedback" label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 6 by ben...@gmail.com, Jan 19 2018

I can confirm that this issue is affecting my whole team when using WebRTC-based conferencing with service from Vidyo. We can somewhat reliably reproduce the issue so if it is helpful to work directly with someone on this, feel free to contact me.

Comment 7 by teih...@gmail.com, Jan 23 2018

We need help from somebody involved in WebRTC development to investigate the issue, as this is not the usual "Aw, Snap". Fortunately, Ben and his team can reproduce it reliably. @rbasuvula, please help.
Cc: marchuk@google.com
Labels: Hotlist-Enterprise
teihrib@, benark@: so there is nothing in chrome://crashes after the issue happens? (Sometimes it might not visibly crash but may still generate a crash ID, which we are interested in.)

If you can consistently reproduce it, can you please provide test credentials?
Labels: Needs-Feedback
Tested in Chrome 63.0.3239.108, Stable 65.0.3325.181, and Canary 67.0.3393.0 on Windows 7 and 10 and was not able to reproduce the issue. Please find the screenshots for your reference.
Note: Tried with Hangouts.

@teihrib: Could you please respond to comment #8, which would help us triage the issue further?

Thanks in Advance.
796469.mp4 (7.1 MB)

Comment 10 by gee...@tokbox.com, Apr 10 2018

We are able to reproduce the issue. Please see below for details.

Problem:

Establishing a WebRTC session with vendor Tokbox (opentokdemo.tokbox.com), Vidyo, and Truclinic applications intermittently fails in the Chrome browser. The browser prompts for permission to use A/V devices, displays the local A/V feed, creates an SDP offer, and freezes after sending the SDP offer.



After a long pause the browser appears unresponsive, and Chrome shows a prompt to wait or exit. During this phase, other elements of the browser may hang or be unresponsive: clicking on menus, closing the tab, or opening developer tools may hang. We may see some CPU activity during this time, but it is not typically pegged at 100%. Other tabs and web page access in Chrome are also often unresponsive. Eventually the session times out completely.



After refreshing Chrome once, the application successfully negotiates and maintains a WebRTC connection. After the first refresh, all subsequent video sharing attempts work for the duration of that device boot. The device tends to go back to the failure state upon the next reboot.



Other browsers such as Firefox remain responsive and do not produce this failure at all.



Reproduction Steps: 

Boot the computer >> Log in >> Launch Chrome Browser >> Go to vendor web application URL >>  Select/start a webRTC session >> Wait for hang


 

Browser versions:

We were able to reproduce on multiple Chrome browser versions, from v61 through the current latest v65, both 32-bit and 64-bit.

 

Devices:

We have witnessed this error across several device models but seem to have higher success in reproducing it on our older systems, such as the HP Compaq Pro 6300. Those devices were deployed around 2013-2014. In addition, we have seen it fail on newer devices purchased in 2016-2017, such as HP ProDesk models.

 

***

Specs on a common sample machine:   

 

OS Name: Microsoft Windows 7 Professional
Version: 6.1.7601 Service Pack 1 Build 7601
OS Manufacturer: Microsoft Corporation
System Name: IS1408015
System Manufacturer: Hewlett-Packard
System Model: HP Compaq Pro 6300 SFF
System Type: x64-based PC
Processor: Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz, 3201 Mhz, 4 Core(s), 4 Logical Processor(s)
BIOS Version/Date: Hewlett-Packard K01 v02.90, 7/16/2013
SMBIOS Version: 2.7



***

 

Operating System:

Windows 7 64-bit predominantly. There may have been reports of failures on Windows 10, but we have not been able to confirm them or collect logs from any recent Windows 10 devices.

 
Components: Blink>WebRTC

Comment 12 by teih...@gmail.com, Apr 11 2018

For me it was fixed in chrome 64 and I think it was https://bugs.chromium.org/p/chromium/issues/detail?id=736725
Project Member

Comment 13 by sheriffbot@chromium.org, Apr 11 2018

Labels: -Needs-Feedback
Thank you for providing more feedback. Adding the requester to the cc list.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Owner: hbos@chromium.org
Status: Assigned (was: Unconfirmed)
hbos@: This is almost certainly a duplicate as described in #12. Please close if this is the case.

Comment 15 by hbos@chromium.org, Apr 12 2018

Mergedinto: 736725
Status: Duplicate (was: Assigned)
Sounds like a duplicate. Sorry for the delay teihrib@ I'm glad it is working again.

Comment 16 by hbos@chromium.org, Apr 12 2018

 Issue 736725  merged its fix into M64. But Comment 10 says it was reproducible in M65 as well... There is a very similar issue 813574 also that was fixed and merged into M66.

However I think #10 is a different issue. These deadlock issues will not cause the entire browser to be unresponsive or sluggish, only the single tab.
I've seen https://crbug.com/817314 and https://crbug.com/829573, which make everything laggy, but I don't know if they are related.
Cc: huib@chromium.org huib@google.com
++logfiles for #10 scenario (RVG)

https://drive.google.com/drive/folders/1PqGZdr34F0m3fhKghJ4_uz3K6kIrYSBe?usp=sharing

We are collecting a trace log to find out whether it's a deadlock or not.

However, we collected WebRTC logs and found that when it fails, it sends STUN pings
but does not receive responses.

In chrome-net-export-log.json (ID 1658: UDP_SOCKET) there are numerous records without a response:
t=1087353 [st=826240]    UDP_BYTES_SENT
                         --> address = "74.201.205.8:32595"
                         --> byte_count = 108


The WebRTC log file sharp-failed.log shows the same numerous STUN pings
without responses:

[12472:11428:0413/155804.973:VERBOSE1:port.cc(1304)] Jingle:Conn[09D881C0:audio:x3V6ldG6:1:0:local:udp:172.16.201.x:52986->7a+Qpy0k:1:2130706431:local:udp:74.201.205.x:57027|C--W|0|0|9079290933605826558|-]: Sending STUN ping , id=70446c4d386e32676e51556a, nomination=0
...
[12472:11428:0413/155809.972:VERBOSE1:port.cc(1235)] Jingle:Conn[09D881C0:audio:x3V6ldG6:1:0:local:udp:172.16.201.x:52986->7a+Qpy0k:1:2130706431:local:udp:74.201.205.x:57027|C--I|0|0|9079290933605826558|-]: UpdateState(), ms since last received response=48099206, ms since last received data=48099206, rtt=6000, pings_since_last_response=70446c4d386e32676e51556a 7256486f6f496d39356b3251 4d7566756245697433574d55 6d366b545a74315578496a35 61514e5145455033326d5453 ... 97 more
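
As a rough aid for reading UpdateState() lines like the one above, the following Python sketch extracts the timing fields and totals the outstanding pings. It assumes that the trailing "... 97 more" is the log's own elision of further ping IDs. `summarize_update_state` is a hypothetical helper name, not part of WebRTC, and the connection description in the sample is abbreviated to `Conn[...]`.

```python
import re

# Sample UpdateState() line from sharp-failed.log, with the Conn[...] part abbreviated.
LINE = (
    "[12472:11428:0413/155809.972:VERBOSE1:port.cc(1235)] Jingle:Conn[...]: "
    "UpdateState(), ms since last received response=48099206, "
    "ms since last received data=48099206, rtt=6000, "
    "pings_since_last_response=70446c4d386e32676e51556a 7256486f6f496d39356b3251 "
    "4d7566756245697433574d55 6d366b545a74315578496a35 61514e5145455033326d5453 "
    "... 97 more"
)


def summarize_update_state(line):
    """Return the timing fields and the number of unanswered pings in one line."""

    def grab(pattern):
        return int(re.search(pattern, line).group(1))

    summary = {
        "ms_since_response": grab(r"ms since last received response=(\d+)"),
        "ms_since_data": grab(r"ms since last received data=(\d+)"),
        "rtt": grab(r"rtt=(\d+)"),
    }
    # Everything after the key is the list of pending ping transaction IDs.
    pending = line.split("pings_since_last_response=", 1)[1]
    listed = re.findall(r"[0-9a-f]{16,}", pending)
    # Assumption: "... N more" means N additional IDs were elided by the logger.
    more = re.search(r"(\d+) more", pending)
    summary["pending_pings"] = len(listed) + (int(more.group(1)) if more else 0)
    return summary


print(summarize_update_state(LINE))
```

Under that reading, the sample line reports 5 listed IDs plus 97 elided ones, so 102 pings are outstanding on this connection.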

Another observation: the IPv6 interface any:0:0:0:x:x:x:x:x is added each time it fails per #10:

[12472:11428:0413/155805.110:INFO:tcpport.cc(176)] Jingle:Port[09D8C978:audio:1:0:local:Net[any:0:0:0:x:x:x:x:x/0:Unknown]]: Not listening due to firewall restrictions.
[12472:11428:0413/155805.110:INFO:basicportallocator.cc(810)] Jingle:Port[09D8C978:audio:1:0:local:Net[any:0:0:0:x:x:x:x:x/0:Unknown]]: Gathered candidate: Cand[:2654133475:1:tcp:1509949951:[0:0:0:x:x:x:x:x]:9:local::0:nQct:xode1ggiiasEnGbiiBQPX6FG:0:50:0]
[12472:11428:0413/155805.110:INFO:basicportallocator.cc(837)] Jingle:Port[09D8C978:audio:1:0:local:Net[any:0:0:0:x:x:x:x:x/0:Unknown]]: Port ready.
[12472:11428:0413/155805.111:WARNING:p2ptransportchannel.cc(569)] Jingle:Port[09D8C978:audio:1:0:local:Net[any:0:0:0:x:x:x:x:x/0:Unknown]]: SetOption(1, 65536) failed: 0
[12472:11428:0413/155805.111:WARNING:p2ptransportchannel.cc(569)] Jingle:Port[09D8C978:audio:1:0:local:Net[any:0:0:0:x:x:x:x:x/0:Unknown]]: SetOption(2, 65536) failed: 0
[12472:11428:0413/155805.111:WARNING:p2ptransportchannel.cc(569)] Jingle:Port[09D8C978:audio:1:0:local:Net[any:0:0:0:x:x:x:x:x/0:Unknown]]: SetOption(5, 0) failed: 0

which is not present in the working log sharp-working.log.
74.201.205.x is a working tokbox server, and in the failed scenario it works fine after a refresh.
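
One way to double-check the "only present in the failing log" observation is to collect every `Net[...]` name from both logs and diff the sets. The Python sketch below does that. The helper names are hypothetical, and it assumes both logs use the same `Net[...]` notation as the excerpt above.

```python
import re

# Matches the network description inside Net[...], e.g.
# "any:0:0:0:x:x:x:x:x/0:Unknown" in the excerpt above.
NET_RE = re.compile(r"Net\[([^\]]*)\]")


def networks_in(log_text):
    """Return the set of distinct Net[...] descriptions found in a log."""
    return set(NET_RE.findall(log_text))


def only_in_failed(failed_text, working_text):
    """Return networks that appear in the failing log but not in the working one."""
    return networks_in(failed_text) - networks_in(working_text)


def compare_files(failed_path, working_path):
    """Read two log files and return the networks unique to the failing one."""
    with open(failed_path, encoding="utf-8", errors="replace") as failed:
        failed_text = failed.read()
    with open(working_path, encoding="utf-8", errors="replace") as working:
        working_text = working.read()
    return only_in_failed(failed_text, working_text)
```

Calling `compare_files("sharp-failed.log", "sharp-working.log")` would then list the interfaces that were gathered only in the failing run.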

Need WebRTC expertise here to assist.

Comment 18 by huib@chromium.org, Apr 16 2018

@hbos, can you take a look whether this is webrtc, otherwise reassign?

Comment 20 by kotah@chromium.org, Apr 17 2018

Cc: kotah@chromium.org
Status: Assigned (was: Duplicate)
This issue is not a dup of  crbug.com/736725  - Changing status back to assigned.

Comment 21 by kotah@chromium.org, Apr 17 2018

Read recent updates again...marchuk@, Great find:) Not just STUN pings, but STUN requests timed out too:

[12472:11428:0413/155804.973:VERBOSE1:stunrequest.cc(252)] Sent STUN request 1; resend delay = 250
[12472:11428:0413/155805.023:VERBOSE1:stunrequest.cc(252)] Sent STUN request 1; resend delay = 250
[12472:11428:0413/155805.224:VERBOSE1:stunrequest.cc(252)] Sent STUN request 2; resend delay = 500
[12472:11428:0413/155805.274:VERBOSE1:stunrequest.cc(252)] Sent STUN request 2; resend delay = 500
...
[12472:11428:0413/155836.732:VERBOSE1:stunrequest.cc(252)] Sent STUN request 9; resend delay = 8000
[12472:11428:0413/155836.782:VERBOSE1:stunrequest.cc(252)] Sent STUN request 9; resend delay = 8000

But we can't tell why the STUN requests didn't receive responses. Have they investigated from a network perspective? They can check a network capture, or, more easily, test on an open network.
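
The logged resend delays (250, 500, ... 8000) look like a doubling backoff capped at 8000 ms. The Python sketch below reconstructs the send schedule under that assumption. The doubling rule, the cap, and the request count are inferred from the log lines above, not taken from the WebRTC source, and `send_offsets` is a hypothetical helper.

```python
INITIAL_DELAY_MS = 250  # first "resend delay" seen in the log
MAX_DELAY_MS = 8000     # last "resend delay" seen in the log
REQUEST_COUNT = 9       # highest request number seen in the log


def send_offsets(initial=INITIAL_DELAY_MS, cap=MAX_DELAY_MS, count=REQUEST_COUNT):
    """Return (request_number, offset_ms, resend_delay_ms) for each send.

    Assumes each resend delay doubles until it reaches the cap.
    """
    schedule = []
    offset, delay = 0, initial
    for number in range(1, count + 1):
        schedule.append((number, offset, delay))
        offset += delay
        delay = min(delay * 2, cap)
    return schedule


for number, offset, delay in send_offsets():
    print(f"request {number}: sent at +{offset} ms, resend delay = {delay}")
```

Under these assumptions request 9 goes out 31750 ms after request 1, which lines up with the logged timestamps 155804.973 and 155836.732 (about 31.76 s apart). So the log is consistent with the full retry sequence running to its end without any response.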

Comment 22 by hbos@chromium.org, Apr 17 2018

Cc: hbos@chromium.org
Components: -Blink>WebRTC Blink>WebRTC>Network
Owner: ----
Status: Untriaged (was: Assigned)
I'm not sure what to make of this or where to reassign it. I'm putting the WebRTC network component on it because of the network-related logs, but those could just be a symptom of another thread hanging. If the tab is unresponsive, that is the renderer thread hanging, which is a different thread from the one sending/receiving STUN messages.

What do you mean "in failed scenario working fine after refresh"? Is this not reproducible if you refresh the page?  https://crbug.com/829831  describes a problem where it is only broken the first time you use webrtc, then you have to reboot the PC before it is reproducible again... But the tracing is different there.

With this one I'm not even sure where it is stuck.
Components: -Blink>WebRTC>Network
If it's deadlocked, wouldn't it be possible to make a debug build, get a stack trace and figure out exactly where the deadlock is occurring?

https://www.chromium.org/for-testers/bug-reporting-guidelines/hanging-tabs
Can the engineering team please provide a recent debug build image for Windows?

Comment 25 by kotah@chromium.org, Apr 19 2018

Owner: deadbeef@chromium.org
@deadbeef, do you know who or which team can answer your question in #c23?

@hbos, correct: according to the customer, the issue disappears after a page refresh.

@marchuk, Do we have a screencast or recording of the issue? It would help to make sure all of us are on the same page in understanding the issue.

Owner: ----
@kotah: Ideally whoever can reproduce the issue can provide a stack trace (sounds like marchuk@ can?). And actually, it sounds like you can get a stack trace from a release build minidump as described here: http://www.chromium.org/developers/how-tos/debugging-on-windows

Though this may not even be necessary; due to this CL, chromium may start reporting more hangs as crashes: https://chromium-review.googlesource.com/c/chromium/src/+/878825

Which takes effect in M66. Meaning it would show up in chrome://crashes.
Re: #25 Video of the issue
https://drive.google.com/open?id=1FnnDIyfXhcFTt2kcGPsIsG34G6ni7zri

Re: #26 It's only reproduced in the customer's environment.
I have built Chromium with is_debug = true and symbol_level = 2, and they are trying to create a minidump while running it.
Also they are checking if there are any crashes while running M66
Issue was never reproduced on Chromium version 68, which I built in #27.
Customers are trying to use Canary.

Still, to find where it freezes, we'd like to build a debug Chromium 67 or 66.
Per https://www.chromium.org/developers/how-tos/get-the-code/working-with-release-branches, however,
I don't have access to go/ChromeReleaseBranches and other places, so it looks like I cannot build it. I therefore request engineering to build 66 or 67 with
gn args as:

is_debug = true
symbol_level = 2

Comment 29 by hbos@chromium.org, May 4 2018

Do you not have access to https://g3doc.corp.google.com/company/teams/chrome/chrome_build_instructions.md ?

> Issue was never reproduced on Chromium version 68, which I built in #27.
> Customers are trying to use Canary.

Is this issue not reproducible in Canary? If that is the case, I would suggest closing this as WontFix (not reproducible), since debugging/bisecting would be difficult if it only reproduces in the customer environment.
No access; I wish I had it.

Customer is still checking on canary.
We got mini and full dumps of chrome.exe (version 66.0.3359.139, stable) from them, taken during the issue.
Added them to https://drive.google.com/corp/drive/u/0/folders/1PqGZdr34F0m3fhKghJ4_uz3K6kIrYSBe (RVG)

While debugging them, I see that all of them lead to line 215 of
https://cs.chromium.org/chromium/src/base/message_loop/message_pump_win.cc?sq=package:chromium

We are still waiting for results of Canary.

dump.png (105 KB)
Components: Internals
marchuk@: #c30 suggests Chrome is not in a deadlock between WebRTC threads, but is simply waiting. Did they get the dumps from all Chrome processes? Another way to get a stack trace would be chrome://crash (not chrome://crashes), if the user can still open another tab when the issue occurs. This forces Chrome to crash and generate a crash report.

Going back to your previous comment #c17, did they confirm from application viewpoint that STUN timeout is not related to the application hang?

If customer sees the issue resolved with Canary, we should also check with Dev channel to confirm M68 resolves the issue. If it does, we should determine if the issue resolution comes from a change in Chrome and if we can merge the change to M67. M68 is still >2 months away for Stable channel.
Components: -Internals
Labels: Hotlist-DesktopUIChecked
Status: WontFix (was: Untriaged)
****Mass UI Triage***

We were unable to reproduce this bug. If this bug still reproduces for you, 
please reopen or file a new issue. Thanks!
