
Issue 715318

Starred by 3 users

Issue metadata

Status: WontFix
Owner: ----
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocked on:
issue 718300




"Linux Debug (Intel HD 530)" occasionally fails to bring up browser

Project Member Reported by kbr@chromium.org, Apr 25 2017

Issue description

https://build.chromium.org/p/chromium.gpu.fyi/builders/Linux%20Debug%20%28Intel%20HD%20530%29?numbuilds=200 occasionally fails to start the browser at all for some test runs. It's strange because it's 100% repeatable for that particular invocation, but previous and subsequent steps -- which are all run with exactly the same binary -- pass successfully.

Attached are two logs, one from this passing run:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Linux%20Debug%20%28Intel%20HD%20530%29/builds/1513

and one from this failing run:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Linux%20Debug%20%28Intel%20HD%20530%29/builds/1514

Looking more deeply into the logs: one thing that's consistent across all of the attempts to bring up the browser is the port that's chosen for Telemetry's TsProxy, e.g.:
  --proxy-server=socks://localhost:47699

It looks like TsProxy attempts to find an ephemeral port when it starts up, so that it shouldn't collide with other services, but I still wonder whether failed attempts to bring up the browser should also restart TsProxy.

Ned, do you think that's a plausible reason for these failures?
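
For reference, the ephemeral-port selection mentioned above is commonly done by binding to port 0 and letting the OS pick a free port. The sketch below only illustrates that general technique; it is not TsProxy's actual code, and the helper name `pick_ephemeral_port` is hypothetical.

```python
import socket


def pick_ephemeral_port(host="127.0.0.1"):
    """Ask the OS for a currently free TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 tells the kernel to assign any free ephemeral port.
        sock.bind((host, 0))
        # getsockname() returns (address, port) for AF_INET sockets.
        return sock.getsockname()[1]


if __name__ == "__main__":
    port = pick_ephemeral_port()
    # Same flag shape as seen in the logs; the port differs per run.
    print("--proxy-server=socks://localhost:%d" % port)
```

Note that the port is released as soon as the socket closes, so another process could claim it before the proxy rebinds. A proxy that keeps the listening socket open for its whole lifetime avoids that race.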

 
Attachments: passing.txt (153 KB), failing.txt (69.8 KB)

Comment 1 by kbr@chromium.org, Apr 25 2017

Labels: Hotlist-PixelWrangler
Cc: pmeenan@chromium.org
Ken: I don't think there is any need to restart the ts_proxy server when the browser needs to be restarted.

Comment 3 by kbr@chromium.org, Apr 26 2017

Cc: mar...@chromium.org vadimsh@chromium.org
Do you have any other guesses for why the browser would reliably hang during startup? There's a fresh user-data-dir every time. In other successful steps on the same machine, the same binaries are being used.

Unless something got corrupted during the isolate extraction, which I think is very unlikely (CC'ing Swarming folks just in case), I'm not sure what other state could be preserved that would cause the browser to reliably fail to start.

My guess is that it could be some system dialog that requires manual human intervention. For example, on Mac, we usually see the keychain dialog on the bots, which hangs the benchmark.

The best way to diagnose this type of bug is enabling screenshot capture upon failure, I think.
Status: Available (was: Untriaged)
Another thing we can try here is to implement system-level logging to see what happened when the browser failed to start up on Linux.

(This method needs to be implemented in linux_platform_backend: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/platform/platform_backend.py#L94)
CL to add logging: https://codereview.chromium.org/2869483003/
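
As a rough illustration of the system-level logging idea, one option is to capture the tail of a system log file when browser startup fails. This is a minimal sketch under stated assumptions and not the code in the CL above: the helper name `tail_log` and the default path `/var/log/syslog` are hypothetical, and the log location varies by distribution.

```python
import collections


def tail_log(path="/var/log/syslog", max_lines=50):
    """Return the last max_lines lines of a log file (hypothetical helper)."""
    try:
        with open(path, "r", errors="replace") as log_file:
            # A bounded deque keeps only the most recent lines in memory.
            return list(collections.deque(log_file, maxlen=max_lines))
    except OSError as error:
        # Report the problem instead of raising, so a failure to read
        # the log does not mask the original browser startup failure.
        return ["<could not read %s: %s>\n" % (path, error)]


if __name__ == "__main__":
    for line in tail_log(max_lines=20):
        print(line, end="")
```

Calling something like this from the startup-failure path would attach recent system messages to the test output alongside the browser's own logs.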

Comment 8 by kbr@chromium.org, May 7 2017

Blockedon: 718300

Comment 9 by sheriffbot@chromium.org, May 8 2018

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 10 by kbr@chromium.org, May 9 2018

Status: WontFix (was: Untriaged)
This bot doesn't exist any more. It's been replaced by https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20FYI%20Release%20(Intel%20HD%20630) which seems more reliable.
