
Issue 873616


Issue metadata

Status: Fixed
Owner:
Closed: Aug 19
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocking:
issue 875881




Failing to start ts_proxy makes the system_health benchmark fail completely on 'Android Nexus6 WebView Perf'

Project Member Reported by nednguyen@chromium.org, Aug 13

Issue description

https://chrome-swarming.appspot.com/task?id=3f4b26da58b20a10&refresh=10&show_raw=1

Traceback (most recent call last):
  RunBenchmark at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/story_runner.py:368
    expectations=expectations, max_num_values=benchmark.MAX_NUM_VALUES)
  Run at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/story_runner.py:216
    test, finder_options.Copy(), story_set)
  traced_function at /b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py:52
    return func(*args, **kwargs)
  __init__ at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/page/shared_page_state.py:86
    self.platform.network_controller.Open(wpr_mode)
  traced_function at /b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py:52
    return func(*args, **kwargs)
  Open at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/core/network_controller.py:28
    self._network_controller_backend.Open(wpr_mode)
  Open at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/platform/network_controller_backend.py:70
    local_port = self._StartTsProxyServer()
  _StartTsProxyServer at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/platform/network_controller_backend.py:206
    self._ts_proxy_server.StartServer()
  StartServer at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/util/ts_proxy_server.py:101
    'Error starting tsproxy: %s' % err)
RuntimeError: Error starting tsproxy: None

Locals:
  cmd_line : ['/b/swarming/w/ir/.swarming_module_cache/vpython/09eff0/bin/python', '/b/swarming/w/ir/third_party/catapult/telemetry/third_party/tsproxy/tsproxy.py', '--port=0', '--desthost=127.0.0.1']
  err      : None
  fd       : 8
  fl       : 0
  timeout  : 10

https://ci.chromium.org/buildbot/chromium.perf/Android%20Nexus6%20WebView%20Perf/2331

This leads to all system_health stories being skipped.

I think we may want to do two things:
1) Add some sort of retry for starting ts_proxy to stamp out the flake.
2) Make ts_proxy failures a recoverable error type.
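
For illustration only, the two proposals above could look roughly like the sketch below. Every name in it (TsProxyStartError, retry_on, FlakyServer) is hypothetical and not taken from Telemetry; the actual changes are in the CLs linked in the comments that follow and may be structured differently.

```python
import functools
import time


class TsProxyStartError(Exception):
    """Hypothetical stand-in for a recoverable error type (not Telemetry's)."""


def retry_on(exc_type, retries=3, delay_s=0.0):
    """Retry the wrapped call up to `retries` extra times when exc_type is raised."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(retries + 1):
                try:
                    return func(*args, **kwargs)
                except exc_type:
                    if attempt == retries:
                        raise  # Out of retries: let the caller handle it.
                    time.sleep(delay_s)
        return wrapper
    return decorator


class FlakyServer(object):
    """Toy server whose start() fails a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success):
        self.attempts = 0
        self._failures = failures_before_success

    @retry_on(TsProxyStartError, retries=3)
    def start(self):
        self.attempts += 1
        if self.attempts <= self._failures:
            raise TsProxyStartError('Error starting tsproxy: None')
        return 'started'
```

With this sketch, FlakyServer(2).start() succeeds on the third attempt, while a server that never comes up still raises TsProxyStartError once the retries are exhausted, so the caller can treat it as a recoverable failure.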
 
Labels: -Pri-2 Pri-1
Owner: nednguyen@chromium.org
Status: Started (was: Untriaged)
Project Member

Comment 3 by bugdroid1@chromium.org, Aug 16

The following revision refers to this bug:
  https://chromium.googlesource.com/catapult/+/0c8ee7bea22dcb4b96ef928a147ebe17e6cf3812

commit 0c8ee7bea22dcb4b96ef928a147ebe17e6cf3812
Author: Nghia Nguyen <nednguyen@google.com>
Date: Thu Aug 16 15:25:35 2018

Make ts_proxy error handleable by Telemetry story_runner

Previously, ts_proxy failures to startup or set a ts_proxy command will crash
the whole benchmark run. This CL makes such error recoverable by changing their
exception types to telemetry.core.exceptions.Error

Bug:  chromium:873616 
Change-Id: I6474d03fdee187eb0f046a6606002a32553a2699
Reviewed-on: https://chromium-review.googlesource.com/1177686
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org>

[modify] https://crrev.com/0c8ee7bea22dcb4b96ef928a147ebe17e6cf3812/telemetry/telemetry/internal/util/ts_proxy_server.py
[modify] https://crrev.com/0c8ee7bea22dcb4b96ef928a147ebe17e6cf3812/telemetry/telemetry/internal/story_runner_unittest.py
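
As a rough illustration of why the exception type matters, the sketch below shows a runner loop that records a failure for one story and keeps going when a recoverable error is raised. RecoverableError is a hypothetical stand-in for telemetry.core.exceptions.Error, and run_stories and fake_run are made up for this example; the real story_runner logic is not shown in this thread and may differ.

```python
class RecoverableError(Exception):
    """Hypothetical stand-in for telemetry.core.exceptions.Error."""


def run_stories(stories, run_one):
    """Run each story; record recoverable failures and continue with the rest."""
    results = {}
    for name in stories:
        try:
            run_one(name)
            results[name] = 'pass'
        except RecoverableError as e:
            # A recoverable error fails this story only. Any other exception
            # type (e.g. RuntimeError) would propagate and abort the whole run.
            results[name] = 'fail: %s' % e
    return results


def fake_run(name):
    """Toy story body: one story hits a ts_proxy startup failure."""
    if name == 'story_a':
        raise RecoverableError('Error starting tsproxy')
```

Here run_stories(['story_a', 'story_b'], fake_run) marks story_a as failed and still runs story_b, whereas a RuntimeError raised from the same place would end the run.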

Project Member

Comment 4 by bugdroid1@chromium.org, Aug 16

The following revision refers to this bug:
  https://chromium.googlesource.com/catapult/+/ed63b1319414a36b099cad8443d497bda8f085a2

commit ed63b1319414a36b099cad8443d497bda8f085a2
Author: Nghia Nguyen <nednguyen@google.com>
Date: Thu Aug 16 16:47:46 2018

Add retry for ts_proxy commands upon raised exception

Bug:  chromium:873616 
Change-Id: I4d05a3414f4592d2e7b3030f6df7d746bcfa32e5

NOTRY=true # CQ blocked by dashboard test, run tests locally to verify

Change-Id: I4d05a3414f4592d2e7b3030f6df7d746bcfa32e5
Reviewed-on: https://chromium-review.googlesource.com/1177687
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org>

[modify] https://crrev.com/ed63b1319414a36b099cad8443d497bda8f085a2/telemetry/telemetry/internal/util/ts_proxy_server.py

Project Member

Comment 5 by bugdroid1@chromium.org, Aug 16

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c0cd880d09098de5f6569b6cd4ce98b13ba7048d

commit c0cd880d09098de5f6569b6cd4ce98b13ba7048d
Author: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Thu Aug 16 18:21:41 2018

Roll src/third_party/catapult 922ba81b497b..ed63b1319414 (4 commits)

https://chromium.googlesource.com/catapult.git/+log/922ba81b497b..ed63b1319414


git log 922ba81b497b..ed63b1319414 --date=short --no-merges --format='%ad %ae %s'
2018-08-16 nednguyen@google.com Add retry for ts_proxy commands upon raised exception
2018-08-16 perezju@chromium.org [Telemetry] Add story expectations for Pixel 2 bots
2018-08-16 etienneb@chromium.org Add windows performance counters processing.
2018-08-16 nednguyen@google.com Make ts_proxy error handleable by Telemetry story_runner


Created with:
  gclient setdep -r src/third_party/catapult@ed63b1319414

The AutoRoll server is located here: https://catapult-roll.skia.org

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel

BUG= chromium:873616 , chromium:874391 , chromium:872900 , chromium:873616 
TBR=sullivan@chromium.org

Change-Id: Ia5d6e592153afef2fee1316ec25f3d09c44b0b90
Reviewed-on: https://chromium-review.googlesource.com/1178241
Reviewed-by: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: catapult-chromium-autoroll <catapult-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#583741}
[modify] https://crrev.com/c0cd880d09098de5f6569b6cd4ce98b13ba7048d/DEPS

Status: Fixed (was: Started)
We no longer have a massive number of failures.
Blocking: 875881

Comment 8 by benhenry@google.com, Jan 16

Components: Test>Telemetry

Comment 9 by benhenry@google.com, Jan 16

Components: -Speed>Telemetry
