
Issue 638174

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Aug 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocking:
issue 634399




WPR "failed to start" in bisect

Project Member Reported by petrcermak@chromium.org, Aug 16 2016

Issue description

Regression: https://bugs.chromium.org/p/chromium/issues/detail?id=634399#c12
Bisect: https://build.chromium.org/p/tryserver.chromium.perf/builders/mac_hdd_perf_bisect/builds/731
Failing steps: 
https://build.chromium.org/p/tryserver.chromium.perf/builders/mac_hdd_perf_bisect/builds/731/steps/Working%20on%20revision%20chromium%40408944.Performance%20Test%201%20of%205/logs/stdio
https://build.chromium.org/p/tryserver.chromium.perf/builders/mac_hdd_perf_bisect/builds/731/steps/Working%20on%20revision%20chromium%40408944.Performance%20Test%202%20of%205/logs/stdio

Error:

@@@STEP_LOG_LINE@Failure Output@@@@
@@@STEP_LOG_LINE@Failure Output@Traceback (most recent call last):@@@
@@@STEP_LOG_LINE@Failure Output@  RunBenchmark at /b/c/b/mac_hdd_perf_bisect/src/third_party/catapult/telemetry/telemetry/internal/story_runner.py:317@@@
@@@STEP_LOG_LINE@Failure Output@    benchmark.ShouldTearDownStateAfterEachStorySetRun())@@@
@@@STEP_LOG_LINE@Failure Output@  Run at /b/c/b/mac_hdd_perf_bisect/src/third_party/catapult/telemetry/telemetry/internal/story_runner.py:226@@@
@@@STEP_LOG_LINE@Failure Output@    _RunStoryAndProcessErrorIfNeeded(story, results, state, test)@@@
@@@STEP_LOG_LINE@Failure Output@  _RunStoryAndProcessErrorIfNeeded at /b/c/b/mac_hdd_perf_bisect/src/third_party/catapult/telemetry/telemetry/internal/story_runner.py:78@@@
@@@STEP_LOG_LINE@Failure Output@    state.WillRunStory(story)@@@
@@@STEP_LOG_LINE@Failure Output@  WillRunStory at /b/c/b/mac_hdd_perf_bisect/src/third_party/catapult/telemetry/telemetry/page/shared_page_state.py:236@@@
@@@STEP_LOG_LINE@Failure Output@    archive_path, page.make_javascript_deterministic)@@@
@@@STEP_LOG_LINE@Failure Output@  StartReplay at /b/c/b/mac_hdd_perf_bisect/src/third_party/catapult/telemetry/telemetry/core/network_controller.py:27@@@
@@@STEP_LOG_LINE@Failure Output@    archive_path, make_javascript_deterministic)@@@
@@@STEP_LOG_LINE@Failure Output@  StartReplay at /b/c/b/mac_hdd_perf_bisect/src/third_party/catapult/telemetry/telemetry/internal/platform/network_controller_backend.py:191@@@
@@@STEP_LOG_LINE@Failure Output@    local_ports = self._StartReplayServer()@@@
@@@STEP_LOG_LINE@Failure Output@  _StartReplayServer at /b/c/b/mac_hdd_perf_bisect/src/third_party/catapult/telemetry/telemetry/internal/platform/network_controller_backend.py:215@@@
@@@STEP_LOG_LINE@Failure Output@    return self._wpr_server.StartServer()@@@
@@@STEP_LOG_LINE@Failure Output@  StartServer at /b/c/b/mac_hdd_perf_bisect/src/third_party/catapult/telemetry/telemetry/internal/util/webpagereplay.py:217@@@
@@@STEP_LOG_LINE@Failure Output@    ''.join(self._LogLines()))@@@
@@@STEP_LOG_LINE@Failure Output@ReplayNotStartedError: Web Page Replay failed to start. Log output:@@@
@@@STEP_LOG_LINE@Failure Output@@@@
@@@STEP_LOG_LINE@Failure Output@Locals:@@@
@@@STEP_LOG_LINE@Failure Output@  is_posix : True@@@
@@@STEP_LOG_LINE@Failure Output@  log_fh   : <closed file '/tmp/tmpE6wmG9', mode 'w' at 0x115472030>@@@
@@@STEP_LOG_LINE@Failure Output@@@@

Ned: Can you please investigate this?
 
Labels: -Pri-3 Pri-1
Petr: Is this a consistent failure, or does it only happen flakily?
I bet this is a flaky failure caused by the way Telemetry currently manages WPR's life cycle.

The way it works is:

WPR is first started with an ephemeral-port specification. WPR finds http_port=a, https_port=b.
The browser is started with --testing-fixed-http-port=a, --testing-fixed-https-port=b.
We switch the WPR archive for a new story, so WPR needs to be restarted. This time it is started with http_port=a, https_port=b instead of ephemeral ports, to avoid restarting the browser.

Every time we start WPR with fixed HTTP/HTTPS ports, there is a risk that the ports are already in use by another program on the bot, which leads to flaky failures.
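
To illustrate the race described above, here is a minimal sketch using plain Python sockets. It is not Telemetry or WPR code; `bind_ephemeral` and `try_bind_fixed` are hypothetical helper names, and a second socket stands in for "some other program on the bot". Binding to port 0 lets the OS pick a free port, while re-binding a specific port can fail if anything took it in the meantime.

```python
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind to port 0 so the OS picks a free port; return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen(1)
    return sock, sock.getsockname()[1]


def try_bind_fixed(port, host="127.0.0.1"):
    """Try to bind a specific port; return the socket, or None if it is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        sock.close()
        return None
    return sock


# First start: an ephemeral port, so the bind cannot collide.
first, port_a = bind_ephemeral()

# The replay server stops so the archive can be swapped.
first.close()

# Stand-in for another program grabbing the port before the restart.
squatter = try_bind_fixed(port_a)

# Restart on the same fixed port: fails because the port is now in use.
restarted = try_bind_fixed(port_a)
print("restart succeeded" if restarted else "restart failed: port in use")

if squatter is not None:
    squatter.close()
```

In the real failure the window is the time between stopping WPR and starting it again on the same ports, which is why the error only shows up occasionally.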

The new architecture of using WPR through ts_proxy will address this, since the browser then only needs to talk to ts_proxy and we can switch its outbound ports to WPR's ports dynamically.
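
The idea can be sketched as a small mapping from fixed inbound ports to swappable outbound ports. This is only a conceptual model, not ts_proxy's actual interface: the `PortForwarder` class, its methods, and all port numbers are made up for illustration.

```python
class PortForwarder:
    """Hypothetical proxy model: fixed inbound ports, swappable outbound ports."""

    def __init__(self, inbound_http, inbound_https):
        # The browser is configured once with these ports and never restarted.
        self.inbound_http = inbound_http
        self.inbound_https = inbound_https
        self._outbound = {}

    def set_outbound(self, http_port, https_port):
        """Point the proxy at a freshly started replay server."""
        self._outbound = {
            self.inbound_http: http_port,
            self.inbound_https: https_port,
        }

    def resolve(self, inbound_port):
        """Return the replay-server port that inbound_port forwards to."""
        return self._outbound[inbound_port]


proxy = PortForwarder(inbound_http=8080, inbound_https=8443)

# First story: replay server came up on ephemeral ports (made-up values).
proxy.set_outbound(51001, 51002)
first_target = proxy.resolve(8080)

# Next story: replay server restarted on different ephemeral ports.
proxy.set_outbound(52001, 52002)
second_target = proxy.resolve(8080)
```

Because the replay server can always ask for ephemeral ports in this model, it never has to reclaim a specific port, and the browser's view of the ports does not change.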

This bug can be considered merged into https://github.com/catapult-project/catapult/issues/2584
#2: I've seen it once so far (in one bisect).
Status: WontFix (was: Assigned)
https://github.com/catapult-project/catapult/issues/2584 should fix the issue. If people encounter this during a bisect, they just need to rerun it.
