New issue
Advanced search Search tips

Issue 737593 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Aug 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: ----
Type: ----



Sign in to add a comment

v8.runtimestats.browsing_mobile is flaky chromium.perf/Android Nexus6 WebView Perf (2)

Project Member Reported by altimin@chromium.org, Jun 28 2017

Issue description

Several stories seem to time out randomly (mostly facebook).

v8.runtimestats.browsing_mobile failing on chromium.perf/Android Nexus6 WebView Perf (2)

Builders failed on: 
- Android Nexus6 WebView Perf (2): 
  https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus6%20WebView%20Perf%20%282%29



 
Assigning to mythria@ as an owner, please route as appropriate.
Cc: perezju@chromium.org altimin@chromium.org
Did something change here? May be the bots moved to swarming?  I can't open the link and the closest bot I can see is this: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus6%20WebView%20Perf which has only two builds so far. 
Cc: martiniss@chromium.org
True, those bots recently moved to swarming.

We'll have to wait until the new webview swarming bots stabilize.

+martiniss FYI
I looked at the buildss again. There are about 5 failures (not counting the exceptions) over the last 50 runs. I am not sure what the problem is. The stack trace looks like this for all failures:

Traceback (most recent call last):
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/story_runner.py", line 109, in _RunStoryAndProcessErrorIfNeeded
    test.Measure(state.platform, results)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/web_perf/timeline_based_measurement.py", line 284, in Measure
    trace_result = platform.tracing_controller.StopTracing()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/core/tracing_controller.py", line 47, in StopTracing
    return self._tracing_controller_backend.StopTracing()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/platform/tracing_controller_backend.py", line 140, in StopTracing
    '\n'.join(raised_exception_messages))
TracingException: Exceptions raised when trying to stop tracing:
Traceback (most recent call last):
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/platform/tracing_controller_backend.py", line 128, in StopTracing
    agent.CollectAgentTraceData(builder)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/platform/tracing_agent/chrome_tracing_agent.py", line 248, in CollectAgentTraceData
    '\n'.join(raised_exception_messages))
ChromeTracingStoppedError: Exceptions raised when trying to collect Chrome devtool tracing:
Error when collecting Chrome tracing on devtools at port localabstract:webview_devtools_remote_14311:
Traceback (most recent call last):
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/platform/tracing_agent/chrome_tracing_agent.py", line 237, in CollectAgentTraceData
    client.CollectChromeTracingData(trace_data_builder)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/devtools_client_backend.py", line 382, in CollectChromeTracingData
    self._tracing_backend.CollectTraceData(trace_data_builder, timeout)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/tracing_backend.py", line 230, in CollectTraceData
    self._CollectTracingData(trace_data_builder, timeout)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/tracing_backend.py", line 258, in _CollectTracingData
    traceback.format_exc())
TracingUnrecoverableException: Exception raised while collecting tracing data:
Traceback (most recent call last):
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/tracing_backend.py", line 251, in _CollectTracingData
    self._inspector_websocket.DispatchNotifications(timeout)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_websocket.py", line 134, in DispatchNotifications
    self._Receive(timeout)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_websocket.py", line 149, in _Receive
    data = self._socket.recv()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_core.py", line 293, in recv
    opcode, data = self.recv_data()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_core.py", line 310, in recv_data
    opcode, frame = self.recv_data_frame(control_frame)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_core.py", line 323, in recv_data_frame
    frame = self.recv_frame()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_core.py", line 357, in recv_frame
    return self.frame_buffer.recv_frame()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_abnf.py", line 336, in recv_frame
    self.recv_header()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_abnf.py", line 286, in recv_header
    header = self.recv_strict(2)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_abnf.py", line 371, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_core.py", line 427, in _recv
    return recv(self.sock, bufsize)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_socket.py", line 93, in recv
    "Connection is already closed.")
WebSocketConnectionClosedException: Connection is already closed.
There are also failures on v8.browsing and system_health benchmarks, but their stack trace is different. For example stack trace from this log: 
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus6_WebView_Perf%2F31%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.memory_mobile_on_Android%2F0%2Fstdout

WARNING:root:Exception when trying to capture screenshot: DeviceUnreachableError('ZX1G428H3P',)
Traceback (most recent call last):
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/story_runner.py", line 107, in _RunStoryAndProcessErrorIfNeeded
    state.RunStory(results)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/page/shared_page_state.py", line 305, in RunStory
    self._current_page.Run(self)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/page/__init__.py", line 112, in Run
    self.RunPageInteractions(action_runner)
  File "/b/swarming/w/ir/tools/perf/page_sets/system_health/system_health_story.py", line 142, in RunPageInteractions
    self._DidLoadDocument(action_runner)
  File "/b/swarming/w/ir/tools/perf/page_sets/system_health/browsing_stories.py", line 85, in _DidLoadDocument
    self._ScrollMainPage(action_runner)
  File "/b/swarming/w/ir/tools/perf/page_sets/system_health/browsing_stories.py", line 101, in _ScrollMainPage
    repeat_count=self.MAIN_PAGE_SCROLL_REPEAT)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/actions/action_runner.py", line 492, in RepeatableBrowserDrivenScroll
    prevent_fling=prevent_fling, speed=speed))
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/actions/action_runner.py", line 55, in _RunAction
    action.WillRunAction(self._tab)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/actions/repeatable_scroll.py", line 28, in WillRunAction
    utils.InjectJavaScript(tab, 'gesture_common.js')
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/actions/utils.py", line 11, in InjectJavaScript
    tab.ExecuteJavaScript(js)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 144, in ExecuteJavaScript
    return self._inspector_backend.ExecuteJavaScript(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 39, in inner
    inspector_backend._ConvertExceptionFromInspectorWebsocket(e)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 36, in inner
    return func(inspector_backend, *args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 210, in ExecuteJavaScript
    self._runtime.Execute(statement, context_id, timeout)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_runtime.py", line 21, in Execute
    self.Evaluate(expr + '; 0;', context_id, timeout)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_runtime.py", line 45, in Evaluate
    res = self._inspector_websocket.SyncRequest(request, timeout)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_websocket.py", line 110, in SyncRequest
    res = self._Receive(timeout)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_websocket.py", line 149, in _Receive
    data = self._socket.recv()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_core.py", line 293, in recv
    opcode, data = self.recv_data()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_core.py", line 310, in recv_data
    opcode, frame = self.recv_data_frame(control_frame)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_core.py", line 323, in recv_data_frame
    frame = self.recv_frame()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_core.py", line 357, in recv_frame
    return self.frame_buffer.recv_frame()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_abnf.py", line 336, in recv_frame
    self.recv_header()
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_abnf.py", line 286, in recv_header
    header = self.recv_strict(2)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_abnf.py", line 371, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_core.py", line 427, in _recv
    return recv(self.sock, bufsize)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_socket.py", line 93, in recv
    "Connection is already closed.")
DevtoolsTargetCrashException: Devtools target crashed

Cc: nedngu...@google.com
The cause of failures like #5 is the device going dead in the middle of a test, the telltale sign is:

  "DeviceUnreachableError: ZX1G428H3P"

And we're discussion a solution for that on issue 739833.

The failure in #4 is more strange. Indeed I would have expected a browser/device log under those circumstances. I'll double check why it's not happening. Ned, any ideas?

Note, though, that 5 failures in 50 runs is a pretty decent ratio for system health standards. For what I've seen those are single story failures (the rest of the ~20 stories run fine), so only about %0.1 of the story runs are failing.
I am not sure why StackTrace is empty either. Maybe add more logging to https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/platform/android_platform_backend.py#L710 to debug this? 

It's possible that the device is disconnected from the host at this point?
I don't think so, the log shows that we continued communicating with the device with no problems, e.g.:

    return recv(self.sock, bufsize)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/third_party/websocket-client/websocket/_socket.py", line 93, in recv
    "Connection is already closed.")
WebSocketConnectionClosedException: Connection is already closed.
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb -s ZX1G4238BJ shell '( ps | grep -F org.chromium.webview_shell; echo "PIPESTATUS: ${PIPESTATUS[@]}" );echo %$?'
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb -s ZX1G4238BJ shell '( ps | grep -F org.chromium.webview_shell; echo "PIPESTATUS: ${PIPESTATUS[@]}" );echo %$?'
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb -s ZX1G4238BJ forward --remove tcp:56698
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/forwarder_host --adb=/b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb --serial-id=ZX1G4238BJ --unmap 46991
[  FAILED  ] browse:social:instagram@{'case': 'browse', 'group': 'social'} (79413 ms)
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb devices
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb devices
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb -s ZX1G4238BJ shell '( pm list packages );echo %$?'
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus6_WebView_Perf%2F45%2F%2B%2Frecipes%2Fsteps%2Fv8.runtimestats.browsing_mobile_on_Android%2F0%2Fstdout

And the rest of the stories ran fine and succeeded.
Status: Fixed (was: Available)
From comment 6, I think this is Ok now. I am closing this bug. Please feel free to reopen if any further investigation is necessary.

Sign in to add a comment