Issue 921731

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




provision failure on peppy-chrome-pfq DUT chromeos4-row6-rack12-host3

Project Member Reported by afakhry@chromium.org, Jan 14

Issue description

- peppy-chrome-pfq: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8924338765317138192

provision     ABORT: Autotest client terminated unexpectedly: DUT is pingable, SSHable and did NOT restart un-expectedly. We probably lost connectivity during the test.

The logs contain several errors:

- From autoserv.DEBUG:

   * rsync: connection unexpectedly closed (0 bytes received so far)
   ...
   * 01/14 11:27:12.565 ERROR|             utils:0287| [stderr] /usr/local/autotest/result_tools/utils.py: line 65: syntax error near unexpected token `('
01/14 11:27:12.565 ERROR|             utils:0287| [stderr] /usr/local/autotest/result_tools/utils.py: line 65: `FILES_TO_IGNORE = set(['
01/14 11:27:12.567 ERROR|            runner:0121| Non-critical failure: Failed to create directory summary for /tmp/sysinfo/autoserv-LQsScx/results/default.
Traceback (most recent call last):
  File "/usr/local/autotest/client/bin/result_tools/runner.py", line 114, in run_on_client
    timeout=_BUILD_DIR_SUMMARY_TIMEOUT)
  File "/usr/local/autotest/server/hosts/ssh_host.py", line 335, in run
    return self.run_very_slowly(*args, **kwargs)
  File "/usr/local/autotest/server/hosts/ssh_host.py", line 324, in run_very_slowly
    ssh_failure_retry_ok)
  File "/usr/local/autotest/server/hosts/ssh_host.py", line 268, in _run
    raise error.AutoservRunError("command execution error", result)
AutoservRunError: command execution error
* Command: 
    /usr/bin/ssh -a -x  -o ControlPath=/tmp/_autotmp_HPhyqassh-master/socket
    -o Protocol=2 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null
    -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o
    ServerAliveCountMax=3 -o ConnectionAttempts=4 -l root -p 22
    chromeos4-row6-rack12-host3 "export LIBC_FATAL_STDERR_=1; if type
    \"logger\" > /dev/null 2>&1; then logger -tag \"autotest\"
    \"server[stack::collect_client_job_results|run_on_client|run] ->
    ssh_run(/usr/local/autotest/result_tools/utils.py -p /tmp/sysinfo
    /autoserv-LQsScx/results/default -m 20000)\";fi;
    /usr/local/autotest/result_tools/utils.py -p /tmp/sysinfo/autoserv-
    LQsScx/results/default -m 20000"
Exit status: 2
Duration: 0.217199087143

stderr:
/usr/local/autotest/result_tools/utils.py: line 28: 
This is a utility to build a summary of the given directory. and save to a json
file.

usage: utils.py [-h] [-p PATH] [-m MAX_SIZE_KB]

optional arguments:
  -p PATH         Path to build directory summary.
  -m MAX_SIZE_KB  Maximum result size in KB. Set to 0 to disable result
                  throttling.

The content of the json file looks like:
{'default': {'/D': [{'control': {'/S': 734}},
                    {'debug': {'/D': [{'client.0.DEBUG': {'/S': 5698}},
                                       {'client.0.ERROR': {'/S': 254}},
                                       {'client.0.INFO': {'/S': 1020}},
                                       {'client.0.WARNING': {'/S': 242}}],
                               '/S': 7214}}
                      ],
              '/S': 7948
            }
}
: File name too long


   * AutotestRunError: Aborting - unexpected final status message from client on chromeos4-row6-rack12-host3
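
The `<script>: line N: <message>` prefix on the utils.py stderr lines above is the form a POSIX shell uses for its own errors. One possible reading (an assumption, nothing in these logs confirms it) is that utils.py was run by a shell instead of a Python interpreter on the DUT, which would also explain the docstring being reported as a "File name too long". A minimal sketch for pulling such lines out of an autoserv log; the helper name `find_shell_style_errors` is hypothetical and not part of autotest:

```python
import re

# "<script>: line <N>: <message>" is how a POSIX shell reports its own errors.
# The pattern is an illustration, not something taken from autotest.
SHELL_ERROR_RE = re.compile(
    r"(?P<script>/\S+?): line (?P<line>\d+):\s*(?P<message>.*)"
)


def find_shell_style_errors(log_lines):
    """Return (script, line_number, message) for each shell-style error line."""
    hits = []
    for raw in log_lines:
        match = SHELL_ERROR_RE.search(raw)
        if match:
            hits.append(
                (match.group("script"), int(match.group("line")), match.group("message"))
            )
    return hits


# Lines copied from the autoserv.DEBUG excerpt in this report.
sample = [
    "01/14 11:27:12.565 ERROR|             utils:0287| [stderr] "
    "/usr/local/autotest/result_tools/utils.py: line 65: "
    "syntax error near unexpected token `('",
    "/usr/local/autotest/result_tools/utils.py: line 28: ",
    "rsync: connection unexpectedly closed (0 bytes received so far)",
]

for script, line_no, message in find_shell_style_errors(sample):
    print(script, line_no, repr(message))
```

If a `.py` file shows up in the output, checking its first line and its permissions on the DUT would be a reasonable next step.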


 
Cc: vapier@chromium.org lndmrk@chromium.org jclinton@chromium.org semenzato@chromium.org achuith@chromium.org
The same failure happened again on the same builder: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8924262164639604432

Is something wrong with this DUT that makes it lose its connection during the test? Or is the "File name too long" error related to the failures?
Components: -Infra>Client>ChromeOS Infra>Client>ChromeOS>Test
Chrome seems to crash:

- From the "messages" log:
2019-01-15T07:41:54.928061-08:00 INFO kernel: [   21.934468] chrome[2179]: segfault at 0 ip 00007fc222fbf742 sp 00007ffe0c9bc050 error 4 in chrome[7fc2212b8000+8f19000]
2019-01-15T07:41:54.940853-08:00 INFO crash_reporter[390]: libminijail[2677]: mount '/dev/log' -> '/dev/log' type '' flags 0x1001
2019-01-15T07:41:54.959552-08:00 WARNING crash_reporter[390]: Could not load the device policy file.
2019-01-15T07:41:54.959851-08:00 WARNING crash_reporter[390]: [user] Received crash notification for chrome[2179] sig 11, user 1000 group 1000 (ignoring call by kernel - chrome crash; waiting for chrome to call us directly)

I wonder if it's the same crash mentioned in issue 922114. There are no crash dumps in the logs.
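
With no crash dump available, the kernel segfault line is the main lead. A rough sketch of decoding it, assuming the usual x86 Linux format `segfault at <addr> ip <ip> sp <sp> error <code> in <module>[<base>+<size>]` with hex fields; `parse_segfault` is a hypothetical helper, not an existing tool:

```python
import re

# Kernel segfault report as seen in the "messages" log; all numbers are hex.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
    r"(?: in (?P<module>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_segfault(line):
    """Decode one kernel segfault line into a dict, or return None if no match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    error = int(match.group("error"), 16)
    ip = int(match.group("ip"), 16)
    info = {
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
        "fault_addr": int(match.group("addr"), 16),
        "ip": ip,
        # x86 page-fault error code bits.
        "page_not_present": not (error & 0x1),
        "write_access": bool(error & 0x2),
        "user_mode": bool(error & 0x4),
        "module": match.group("module"),
        "module_offset": None,
    }
    if match.group("base"):
        # Offset of the faulting instruction inside the mapped module.
        info["module_offset"] = ip - int(match.group("base"), 16)
    return info


line = (
    "2019-01-15T07:41:54.928061-08:00 INFO kernel: [   21.934468] "
    "chrome[2179]: segfault at 0 ip 00007fc222fbf742 sp 00007ffe0c9bc050 "
    "error 4 in chrome[7fc2212b8000+8f19000]"
)
info = parse_segfault(line)
print(info["comm"], info["pid"], hex(info["fault_addr"]), hex(info["module_offset"]))
```

For the line above this gives a user-mode read of address 0 (a not-present page) at offset 0x1d07742 into the chrome mapping. That offset could be compared with the one from issue 922114, or symbolized if matching debug symbols are available.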
The rest of the logs for comment #3:

2019-01-15T07:41:55.068679-08:00 ERR cras_server[390]: Got error from client: rc: -104
2019-01-15T07:41:55.068795-08:00 ERR cras_server[390]: fetch err: -32 for 100000
2019-01-15T07:41:55.069972-08:00 INFO session_manager[390]: [INFO:session_manager_service.cc(306)] Browser process 2179 exited with signal 11 (Segmentation fault)
2019-01-15T07:41:55.070100-08:00 INFO session_manager[390]: [INFO:browser_job.cc(167)] Terminating process group for browser 2179 with signal 9: Ensuring browser processes are gone.
2019-01-15T07:41:55.070156-08:00 INFO session_manager[390]: [INFO:system_utils_impl.cc(125)] Sending 9 to -2179 as 1000
2019-01-15T07:41:55.070240-08:00 INFO session_manager[390]: [INFO:browser_job.cc(187)] Waiting up to 3 seconds for 2179's process group to exit
2019-01-15T07:41:56.053683-08:00 INFO session_manager[390]: [INFO:browser_job.cc(197)] Cleaned up browser process 2179


It's a SIGSEGV, so it could be the same crash as in issue 922114.
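
For anyone scanning other runs for the same pattern, a small sketch that pulls the browser exit signal out of the session_manager lines and maps it to a name. The regex is based only on the log line quoted above, and `browser_exit_signal` is a hypothetical helper:

```python
import re
import signal

# Matches the session_manager message quoted above.
EXIT_RE = re.compile(
    r"Browser process (?P<pid>\d+) exited with signal (?P<signo>\d+)"
)


def browser_exit_signal(line):
    """Return (pid, signal_number, signal_name), or None if the line does not match."""
    match = EXIT_RE.search(line)
    if match is None:
        return None
    signo = int(match.group("signo"))
    try:
        name = signal.Signals(signo).name
    except ValueError:
        # Signal number not known on the machine running this script.
        name = "UNKNOWN"
    return int(match.group("pid")), signo, name


line = (
    "2019-01-15T07:41:55.069972-08:00 INFO session_manager[390]: "
    "[INFO:session_manager_service.cc(306)] Browser process 2179 exited "
    "with signal 11 (Segmentation fault)"
)
print(browser_exit_signal(line))
```

The pid returned here (2179) matches the pid in the kernel segfault line, so both messages describe the same browser process.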
