New issue
Advanced search Search tips

Issue 905312 link

Starred by 1 user

Issue metadata

Status: Duplicate
Merged: issue 906690
Owner:
Closed: Nov 19
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

video.WebRTCPeerConnectionWithCameraH264 timed out due to Chrome hang

Project Member Reported by derat@chromium.org, Nov 14

Issue description

In the grunt-paladin run at https://ci.chromium.org/p/chromeos/builders/luci.chromeos.general/CQ/b8929889351962845168, video.WebRTCPeerConnectionWithCameraH264 and video.WebRTCPeerConnectionWithCameraVP8 both failed in bvt-tast-cq. Logs are available at https://stainless.corp.google.com/browse/chromeos-autotest-results/257116764-chromeos-test/ .

2018/11/14 06:01:57 Started test video.WebRTCPeerConnectionWithCameraH264
2018/11/14 06:01:57 [06:01:56.712] Restarting ui job
2018/11/14 06:02:00 [06:01:59.765] Waiting for org.chromium.SessionManager D-Bus service
2018/11/14 06:02:00 [06:01:59.811] Asking session_manager to enable Chrome testing
2018/11/14 06:02:00 [06:01:59.818] Waiting for Chrome to write its debugging port to /home/chronos/DevToolsActivePort
2018/11/14 06:02:00 [06:02:00.548] Removing cryptohome for testuser@gmail.com
2018/11/14 06:02:01 [06:02:00.709] Finding OOBE DevTools target
2018/11/14 06:02:01 [06:02:00.843] Connecting to Chrome at ws://127.0.0.1:38543/devtools/page/D19BA01114ECA95D1727AEEFD09825F7
2018/11/14 06:02:02 [06:02:01.686] Waiting for OOBE
2018/11/14 06:02:06 [06:02:06.145] Logging in as user "testuser@gmail.com"
2018/11/14 06:02:06 [06:02:06.422] Waiting for cryptohome for user "testuser@gmail.com"
2018/11/14 06:02:08 [06:02:07.920] Waiting for OOBE to be dismissed
2018/11/14 06:02:16 [06:02:16.339] Creating new page with URL http://127.0.0.1:37575/loopback.html
2018/11/14 06:02:16 [06:02:16.508] Connecting to Chrome at ws://127.0.0.1:38543/devtools/page/052E98C4DDE276A9D85F1A8F232057EA
2018/11/14 06:03:57 [06:03:56.711] Error at webrtc.go:140: Timed out waiting for test completed: context deadline exceeded; last error follows: "!!(isTestDone)" is false
...
2018/11/14 06:04:15 Completed test video.WebRTCPeerConnectionWithCameraH264 in 2m18.004s with 2 error(s)

I see a bunch of video-related warnings and errors in https://storage.cloud.google.com/chromeos-autotest-results/257116764-chromeos-test/chromeos2-row3-rack11-host5/tast/results/system_logs/chrome/chrome_20181114-060159 that suggest that there was actually a problem.

The messages log file also shows that Chrome was hanging. session_manager eventually killed it, which I think produced the attached stack trace. After Chrome was restarted, it seems to have immediately started hanging again, which I think made the next test fail:

2018/11/14 06:04:15 Started test video.WebRTCPeerConnectionWithCameraVP8
2018/11/14 06:04:15 [06:04:14.718] Restarting ui job
2018/11/14 06:04:19 [06:04:18.947] Waiting for org.chromium.SessionManager D-Bus service
2018/11/14 06:04:19 [06:04:18.976] Asking session_manager to enable Chrome testing
2018/11/14 06:04:19 [06:04:18.982] Waiting for Chrome to write its debugging port to /home/chronos/DevToolsActivePort
2018/11/14 06:06:15 [06:06:14.717] Error at webrtc.go:99: Failed to connect to Chrome: failed to read Chrome debugging port: context deadline exceeded; last error follows: open /home/chronos/DevToolsActivePort: no such file or directory
...
2018/11/14 06:06:33 Completed test video.WebRTCPeerConnectionWithCameraVP8 in 2m18.005s with 2 error(s)

So I suspect that this is a flaky Chrome bug, but the Chrome OS Commit Queue is in bad enough shape right now that I'll probably need to set the "informational" attribute on the test if it can't be fixed soon -- we can't tolerate failures right now that aren't caused by the changes that the CQ is testing.
 
aa8540bf-bcea-4ea5-6690309f-986a54f2.dmp.txt
138 KB View Download
Thank you for reporting this problem.

The two video.WebRTCPeerConnectionWithCamera* tests are NOT in bvt-tast-cq yet.
The grunt-paladin run actually tested my CL (crrev.com/c/1312537), which will add these tests to bvt-tast-cq.

I marked Verified-1 at the CL for now so that this error won't block other CLs.
We would need some investigation before sending the CL to CQ again.
Ah, thanks for the clarification!
Cc: keiichiw@chromium.org zwisler@chromium.org
Labels: Hotlist-CrOS-Gardener
Owner: weidongg@chromium.org
The ui.ChromeCrashNotLoggedIn failure in the nyan_blaze-paladin run at http://cros-goldeneye/chromeos/healthmonitoring/buildDetails?buildbucketId=8929889316962419728 looks like it was also caused by a Chrome hang.

See http://stainless/browse/chromeos-autotest-results/swarming-412b0bee1c808a11/ -- it looks like Chrome failed to exit when the test sent SIGSEGV to it.

This probably needs the gardener's attention -- hopefully we didn't pull in a buggy build of Chrome.
Re #3, it looks like the latest run of nyan_blaze-paladin is green. Maybe it's a flaky issue. If it is a chrome hang, do you know where should I start to investigate?
#4: I haven't seen this again. The crash file from #3 is a SIGSEGV, not a SIGABRT, so I'm not sure that it's related. The Chrome log from the failure in #3 just contains this:

[10093:10093:1114/044908.199692:WARNING:install_attributes.cc(163)] Install attributes missing, first sign in
[10093:10093:1114/044908.247594:VERBOSE1:drm_display_host_manager.cc(88)] Failed to get DRM resources for '/dev/dri/card0': Invalid argument (22)
[10093:10093:1114/044908.247872:VERBOSE1:drm_device_handle.cc(90)] Succeeded authenticating /dev/dri/card1 in 0 ms with 1 attempt(s)
[10093:10159:1114/044908.335225:WARNING:accelerometer_reader.cc(246)] Accelerometer device directory is empty at /dev/cros-ec-accel
[10167:10210:1114/044908.609270:VERBOSE1:drm_device_manager.cc(58)] Primary DRM device added: /sys/devices/platform/tegra-drm/drm/card1
[10093:10093:1114/044908.751767:ERROR:input_method_manager_impl.cc(1114)] IMEEngine for "jkghodnilhceideoidjikpgommlajknk" is not registered
[10093:10093:1114/044908.754432:VERBOSE1:chrome_session_manager.cc(234)] Starting Chrome with login/oobe screen.
[10093:10093:1114/044908.755238:VERBOSE1:login_display_host_webui.cc(1190)] Showing OOBE screen: unknown
[10093:10093:1114/044908.772105:VERBOSE1:login_display_host_webui.cc(1262)] Current locale: en-US
[10093:10093:1114/044908.772221:VERBOSE1:input_events_blocker.cc(19)] InputEventsBlocker 0xb972fe48 created.
[10093:10093:1114/044908.824371:VERBOSE1:login_display_host_common.cc(97)] Mark device registered because there are remembered users: 1
[10093:10093:1114/044908.826662:ERROR:login_display_host_mojo.cc(157)] Not implemented reached in virtual void chromeos::LoginDisplayHostMojo::SetStatusAreaVisible(bool)
[10093:10093:1114/044908.826785:VERBOSE1:input_events_blocker.cc(28)] InputEventsBlocker 0xb972fe48 destroyed.
EOF

Looking at the log from a successful test, the next expected messages would probably be these:

[10552:10552:1114/045112.229069:VERBOSE1:drm_display_host_manager.cc(248)] Got display event ADD for /dev/dri/card1
[10552:10552:1114/045112.230138:VERBOSE1:drm_display_host_manager.cc(248)] Got display event ADD for /dev/dri/card0
[10552:10606:1114/045112.230377:VERBOSE1:drm_device_handle.cc(90)] Succeeded authenticating /dev/dri/card0 in 0 ms with 1 attempt(s)

So maybe there was a graphics issue. I don't see the same short log in the original failure, so maybe it's unrelated.

I'm not sure how to dig any deeper here until we have more failures. :-/
Status: WontFix (was: Assigned)
Recent several runs are all green, so close it.
Mergedinto: 906690
Status: Duplicate (was: WontFix)
Chrome is still hanging during testing -- see  issue 906690 .

Sign in to add a comment