
Issue 737565

Issue metadata

Status: Fixed
Owner:
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




"connection was forcibly closed" requesting memory dump on Windows bots

Project Member Reported by perezju@chromium.org, Jun 28 2017

Issue description

The following error has started appearing frequently on Windows bots:

TracingUnrecoverableException: Exception raised while sending a Tracing.requestMemoryDump request:
Traceback (most recent call last):
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\tracing_backend.py", line 204, in DumpMemory
    response = self._inspector_websocket.SyncRequest(request, timeout)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\inspector_websocket.py", line 110, in SyncRequest
    res = self._Receive(timeout)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\inspector_websocket.py", line 149, in _Receive
    data = self._socket.recv()
  File "c:\b\s\w\ir\third_party\catapult\telemetry\third_party\websocket-client\websocket\_core.py", line 293, in recv
    opcode, data = self.recv_data()
  File "c:\b\s\w\ir\third_party\catapult\telemetry\third_party\websocket-client\websocket\_core.py", line 310, in recv_data
    opcode, frame = self.recv_data_frame(control_frame)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\third_party\websocket-client\websocket\_core.py", line 323, in recv_data_frame
    frame = self.recv_frame()
  File "c:\b\s\w\ir\third_party\catapult\telemetry\third_party\websocket-client\websocket\_core.py", line 357, in recv_frame
    return self.frame_buffer.recv_frame()
  File "c:\b\s\w\ir\third_party\catapult\telemetry\third_party\websocket-client\websocket\_abnf.py", line 336, in recv_frame
    self.recv_header()
  File "c:\b\s\w\ir\third_party\catapult\telemetry\third_party\websocket-client\websocket\_abnf.py", line 286, in recv_header
    header = self.recv_strict(2)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\third_party\websocket-client\websocket\_abnf.py", line 371, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
  File "c:\b\s\w\ir\third_party\catapult\telemetry\third_party\websocket-client\websocket\_core.py", line 427, in _recv
    return recv(self.sock, bufsize)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\third_party\websocket-client\websocket\_socket.py", line 80, in recv
    bytes_ = sock.recv(bufsize)
error: [Errno 10054] An existing connection was forcibly closed by the remote host
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FWin_10_Perf%2F1005%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.memory_desktop_on__102b__GPU_on_Windows_on_Windows-10-10240%2F0%2Fstdout

The error is most frequent on win-10 (15 of the latest 20 builds), often but not always on "load:search:yahoo"; it has also been seen a few times on win-7, win-7-x64, and win-8.

Furthermore, the error is treated as "fatal", interrupting the rest of the execution of the benchmark.
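
For anyone triaging similar logs, here is a rough Python 3 sketch of how a caller could tell a connection reset apart from a plain timeout before deciding whether a failure should be fatal. This is not Telemetry code: `classify_socket_error` and the category strings are hypothetical names made up for the example, and the traceback above comes from Python 2, where the exception is a plain `socket.error`.

```python
import errno
import socket

# Winsock code seen in the traceback above ("Errno 10054").
WSAECONNRESET = 10054


def classify_socket_error(exc):
    """Hypothetical helper: return 'timeout', 'reset', or 'other'."""
    # Check for timeouts first; socket.timeout carries no errno.
    if isinstance(exc, socket.timeout):
        return "timeout"
    if isinstance(exc, ConnectionResetError):
        return "reset"
    # Fall back to the numeric code (POSIX ECONNRESET or Winsock 10054).
    if getattr(exc, "errno", None) in (errno.ECONNRESET, WSAECONNRESET):
        return "reset"
    return "other"
```

A harness could use a classification like this to skip only the affected story rather than abort the whole benchmark; whether that is the right policy here is a separate question.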

I'll see if I can find the build where the error started and kick off a bisect from there.
 
The first failure I could find was:
https://luci-milo.appspot.com/buildbot/chromium.perf/Win%2010%20Perf/948

With CL range:
http://test-results.appspot.com/revision_range?start=479745&end=479841

Crossing fingers for a return code bisect ...
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Jun 28 2017


=== BISECT JOB RESULTS ===
NO Test failure found

Bisect Details
  Configuration: winx64_10_perf_bisect
  Benchmark    : system_health.memory_desktop
  Metric       : memory:chrome:all_processes:reported_by_chrome:effective_size_avg/load_search/load_search_yahoo

Revision             Exit Code      N
chromium@479744      0 +- N/A       20      good
chromium@479932      0 +- N/A       20      bad

Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.search.yahoo system_health.memory_desktop

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8975536285219265696

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=4521967824142336


| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection.  Thank you!
Project Member

Comment 5 by 42576172...@developer.gserviceaccount.com, Jun 28 2017


=== BISECT JOB RESULTS ===
NO Test failure found

Bisect Details
  Configuration: winx64_10_perf_bisect
  Benchmark    : system_health.memory_desktop
  Metric       : memory:chrome:all_processes:reported_by_chrome:effective_size_avg/load_search/load_search_yahoo

Revision             Exit Code      N
chromium@479744      0 +- N/A       20      good
chromium@480744      0 +- N/A       20      bad

Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.search.yahoo system_health.memory_desktop

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8975532312681599456

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=4584612874944512


Components: Speed>Benchmarks
I suspect this is something wrong with the device in the lab. Do we see other benchmark failures on this bot as well?
Note that the error has shown up on all of win-10, win-7, win-7-x64, and win-8.

I'm now wondering whether the benchmark needs to run for longer before the error appears.
Cc: erikc...@chromium.org etienneb@chromium.org
I think we may need to try to reproduce this locally. I am very swamped at the moment, so cc'ing Erik & Etienne in case they are interested in helping out with this.
Oh, I bet this is hitting a local timeout:
"""
error: [Errno 10054] An existing connection was forcibly closed by the remote host
Locals:
  request : {'method': 'Tracing.requestMemoryDump', 'id': 0}
  timeout : 90
"""
Memory dumps can require over 90 seconds.
Cc: -nednguyen@chromium.org nedngu...@google.com
Owner: perezju@chromium.org
Status: Assigned (was: Untriaged)
Ah, got it. Will try to increase that timeout.
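
To illustrate the timeout hypothesis, here is a minimal Python 3 sketch (not Telemetry code) of a synchronous request whose reply takes longer than the client's receive timeout. The names `sync_request` and `slow_responder`, the in-process `socketpair`, and the reply payload are all assumptions made for the example; only the request dict and the 20-minute default are taken from this thread. It shows the client giving up with a timeout; it does not reproduce the Errno 10054 reset itself.

```python
import json
import socket
import threading
import time

# Mirrors the "20 minutes" default named in the fix; value is in seconds.
DEFAULT_DUMP_TIMEOUT = 20 * 60


def slow_responder(conn, delay):
    """Hypothetical stand-in for the browser: replies after `delay` seconds."""
    request = json.loads(conn.recv(4096).decode("utf-8"))
    time.sleep(delay)  # pretend the memory dump takes this long
    reply = {"id": request["id"], "result": {"success": True}}  # made-up payload
    try:
        conn.sendall(json.dumps(reply).encode("utf-8"))
    except OSError:
        pass  # the client already gave up and closed its end
    finally:
        conn.close()


def sync_request(request, timeout=DEFAULT_DUMP_TIMEOUT, delay=0.0):
    """Send `request` and block for the reply, up to `timeout` seconds."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=slow_responder, args=(server, delay))
    worker.start()
    try:
        client.settimeout(timeout)
        client.sendall(json.dumps(request).encode("utf-8"))
        return json.loads(client.recv(4096).decode("utf-8"))
    finally:
        client.close()
        worker.join()


if __name__ == "__main__":
    req = {"method": "Tracing.requestMemoryDump", "id": 0}
    try:
        sync_request(req, timeout=0.05, delay=0.3)
    except socket.timeout:
        print("gave up before the dump finished")
    print(sync_request(req, timeout=2, delay=0.1))
```

With the short timeout the caller fails even though the responder would eventually have answered; with the generous one the same request succeeds.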
Project Member

Comment 12 by bugdroid1@chromium.org, Jul 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b3121dcc73abfee3cd11148cd9c43cf02db3d1a3

commit b3121dcc73abfee3cd11148cd9c43cf02db3d1a3
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Mon Jul 03 11:46:34 2017

Roll src/third_party/catapult/ 3b0c0e04d..68c788088 (1 commit)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/3b0c0e04db0b..68c788088273

$ git log 3b0c0e04d..68c788088 --date=short --no-merges --format='%ad %ae %s'
2017-07-03 perezju [Telemetry] Default DumpMemory timeout to 20 minutes

Created with:
  roll-dep src/third_party/catapult
BUG= 737565 


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I8309233ab2e3e893f1ab2346423a06e6433dacd8
Reviewed-on: https://chromium-review.googlesource.com/558674
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#483990}
[modify] https://crrev.com/b3121dcc73abfee3cd11148cd9c43cf02db3d1a3/DEPS

Status: Fixed (was: Assigned)
