Issue 900790

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 15
Cc:
Components:
EstimatedDays: ----
NextAction: 2018-11-12
OS: Fuchsia
Pri: 1
Type: Bug

Blocking:
issue 881334




Nothing runs on Fuchsia/ARM64, under QEMU

Project Member Reported by w...@chromium.org, Nov 1

Issue description

We haven't had any successful Fuchsia/ARM64 runs on our FYI bots since https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/fuchsia-fyi-arm64-rel/2381

Since https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/fuchsia-fyi-arm64-rel/2382 all attempts to run test binaries fail with output like:


 
2018-10-18 22:26:28,522:INFO:root:Connecting to Fuchsia using SSH.
2018-10-18 22:26:35,711:INFO:root:Connected!
2018-10-18 22:26:35,711:INFO:root:Attaching kernel logger.
2018-10-18 22:26:35,721:INFO:root:Installing base_unittests.far.
lost connection
2018-10-18 22:26:48,073:INFO:root:Terminating kernel log reader.
2018-10-18 22:26:48,073:INFO:root:Shutting down QEMU.
[00000.000] PMM: boot reserve add [0x48530000, 0x486b9fff]
[00000.000] mem_arena.base 0x40000000 size 0x8000000
[00000.000] overriding mem arena 0 base from FDT: 0x40000000
[00000.000] overriding mem arena 0 size from FDT: 0x80000000
[00000.000] detected GICv2
[00000.000] GICv2m 0: base spi 80 count 64
[00000.000] arm generic timer freq 62500000 Hz
[00000.017] cntpct_per_ns: 00000000.1000000000000000
[00000.017] ns_per_cntpct: 00000010.0000000000000000
[00000.017] test_time_conversion_check_result:243: FAIL, off by 72057594037927936
[00000.017] reserving ramdisk phys range [0x48000000, 0x48410fff]
[00000.017] PMM: boot reserve add [0x48000000, 0x48410fff]
[00000.018] memory limit lib returned an error (-2), falling back to default arena
[00000.059] PMM: boot reserve marking WIRED [0x48000000, 0x48410fff]
[00000.060] PMM: boot reserve marking WIRED [0x48530000, 0x486b9fff]
[00000.060]
[00000.060] welcome to Zircon
[00000.060]
Traceback (most recent call last):
  File "/b/s/w/ir/build/fuchsia/test_runner.py", line 117, in <module>
    sys.exit(main())
  File "/b/s/w/ir/build/fuchsia/test_runner.py", line 104, in main
    args.install_only, args.package_manifest)
  File "/b/s/w/ir/build/fuchsia/run_package.py", line 109, in RunPackage
    target.PutFile(next_package_path, install_path)
  File "/b/s/w/ir/build/fuchsia/target.py", line 94, in PutFile
    self.PutFiles([source], dest, recursive)
  File "/b/s/w/ir/build/fuchsia/target.py", line 109, in PutFiles
    recursive)
  File "/b/s/w/ir/build/fuchsia/remote_cmd.py", line 109, in RunScp
    subprocess.check_call(scp_command, stdout=open(os.devnull, 'w'))
  File "/b/s/w/ir/.swarming_module/lib/python2.7/subprocess.py", line 186, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['scp', '-C', '-F', '/b/s/w/ir/out/Release/ssh_config', '-P', '41136', '/b/s/w/ir/out/Release/gen/base/base_unittests/base_unittests.far', 'localhost:/tmp/base_unittests.far']' returned non-zero exit status 1
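The traceback above ends in subprocess.check_call, which raises CalledProcessError whenever the child process exits with a non-zero status. The following is a minimal sketch of that pattern, not the contents of remote_cmd.py: run_quietly is a hypothetical helper, and a short Python child process stands in for the failing scp so the example runs without a Fuchsia target.

```python
import os
import subprocess
import sys


def run_quietly(command):
    """Run command with stdout discarded and return its exit status.

    Uses the same check_call-with-devnull pattern seen in the traceback,
    but catches CalledProcessError so the caller can inspect the status
    instead of aborting. (Hypothetical helper, for illustration only.)
    """
    with open(os.devnull, 'w') as devnull:
        try:
            subprocess.check_call(command, stdout=devnull)
        except subprocess.CalledProcessError as error:
            return error.returncode
    return 0


# Stand-ins for scp: one child that exits with status 1, one that succeeds.
failing = [sys.executable, '-c', 'import sys; sys.exit(1)']
succeeding = [sys.executable, '-c', 'print("copied")']

failing_status = run_quietly(failing)
succeeding_status = run_quietly(succeeding)
```

In the failing run reported here, scp itself returned status 1 after printing "lost connection", so the exception reflects the dropped SSH session rather than a fault in the Python wrapper.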


Blocking: 881334
Cc: sergeyu@chromium.org jam...@chromium.org
Owner: w...@chromium.org
Status: Assigned (was: Untriaged)
I'll take a look, since these tests have been down for ~too long at this point...
Status: Started (was: Assigned)
It appears that the problem is the long-running "dlog" that we kick off to capture the kernel log; if I run with --exclude-system-log, the run at least gets considerably further.
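As a rough illustration of the workaround described in this comment, the sketch below shows how a flag such as --exclude-system-log could gate the kernel-log step. Only the flag name comes from this thread; build_parser, plan_steps, and the step descriptions are assumptions made for the example and do not reproduce test_runner.py.

```python
import argparse


def build_parser():
    """Build a parser with a flag that switches off kernel log capture."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--exclude-system-log', action='store_false',
                        dest='include_system_log', default=True,
                        help='Do not start the long-running kernel log reader.')
    return parser


def plan_steps(args):
    """Return the ordered steps a run would take (hypothetical outline)."""
    steps = ['connect over SSH']
    if args.include_system_log:
        # This is the long-running channel suspected of triggering the bug.
        steps.append('attach kernel logger (dlog)')
    steps.append('install package')
    steps.append('run tests')
    return steps


default_steps = plan_steps(build_parser().parse_args([]))
workaround_steps = plan_steps(build_parser().parse_args(['--exclude-system-log']))
```

With the flag set, the run goes straight from connecting to installing the package, which matches the observation that the upload step gets further without the extra SSH channel.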
NextAction: 2018-11-09
Status: ExternalDependency (was: Started)
Filed Fuchsia bug NET-1847 for this.
Project Member

Comment 6 by bugdroid1@chromium.org, Nov 8

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9061920339041df91d0cc1825caa2d1248b3049f

commit 9061920339041df91d0cc1825caa2d1248b3049f
Author: Wez <wez@chromium.org>
Date: Thu Nov 08 17:42:22 2018

[Fuchsia] Disable capture of system logs via SSH'd 'dlog' on ARM64.

The 'dlog' SSH channel appears to trigger a bug that causes the package-
upload step to fail on ARM64 under QEMU.

Bug:  900790 
Change-Id: Ie5871cef98e94deb936ae591c87a5c90a960b092
Reviewed-on: https://chromium-review.googlesource.com/c/1325529
Reviewed-by: Sergey Ulanov <sergeyu@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#606522}
[modify] https://crrev.com/9061920339041df91d0cc1825caa2d1248b3049f/testing/buildbot/chromium.fyi.json
[modify] https://crrev.com/9061920339041df91d0cc1825caa2d1248b3049f/testing/buildbot/waterfalls.pyl

NextAction: 2018-11-12
Leaving this open to track resolution of the underlying issue with virtio.
Project Member

Comment 8 by bugdroid1@chromium.org, Nov 8

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a5f8582f43da7589fbc7cddd470d13bd36f80a11

commit a5f8582f43da7589fbc7cddd470d13bd36f80a11
Author: Wez <wez@chromium.org>
Date: Thu Nov 08 19:07:37 2018

Revert "[Fuchsia] Disable capture of system logs via SSH'd 'dlog' on ARM64."

This reverts commit 9061920339041df91d0cc1825caa2d1248b3049f.

Reason for revert: This work-around simply delays the inevitable; test runs still tend to have their SSH connection fail eventually, and so fail.

Original change's description:
> [Fuchsia] Disable capture of system logs via SSH'd 'dlog' on ARM64.
> 
> The 'dlog' SSH channel appears to trigger a bug that causes the package-
> upload step to fail on ARM64 under QEMU.
> 
> Bug:  900790 
> Change-Id: Ie5871cef98e94deb936ae591c87a5c90a960b092
> Reviewed-on: https://chromium-review.googlesource.com/c/1325529
> Reviewed-by: Sergey Ulanov <sergeyu@chromium.org>
> Reviewed-by: John Budorick <jbudorick@chromium.org>
> Commit-Queue: Wez <wez@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#606522}

TBR=wez@chromium.org,sergeyu@chromium.org,jbudorick@chromium.org

Change-Id: I3f9912cb004e64219238c24af95d6084097e768c
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  900790 
Reviewed-on: https://chromium-review.googlesource.com/c/1327412
Reviewed-by: Wez <wez@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#606560}
[modify] https://crrev.com/a5f8582f43da7589fbc7cddd470d13bd36f80a11/testing/buildbot/chromium.fyi.json
[modify] https://crrev.com/a5f8582f43da7589fbc7cddd470d13bd36f80a11/testing/buildbot/waterfalls.pyl

The NextAction date has arrived: 2018-11-12
See also  issue 909936 .
Project Member

Comment 11 by bugdroid1@chromium.org, Dec 4

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3da02f9523de5ddb2134e13f95657ba474a5cabe

commit 3da02f9523de5ddb2134e13f95657ba474a5cabe
Author: Wez <wez@chromium.org>
Date: Tue Dec 04 01:29:37 2018

[Fuchsia] Disable use of SSH connection multiplexing.

Connection multiplexing results in unexpected hangs or resets when used
by our runner scripts, so disable it until the root cause is found and
resolved upstream.

Bug:  900790 
Change-Id: I5025a5d652cdcbfe0c206fc4bf42366efbee07be
Reviewed-on: https://chromium-review.googlesource.com/c/1359489
Reviewed-by: Sergey Ulanov <sergeyu@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#613377}
[modify] https://crrev.com/3da02f9523de5ddb2134e13f95657ba474a5cabe/build/fuchsia/boot_data.py
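For readers unfamiliar with SSH connection multiplexing, the sketch below shows one way a generated ssh_config could toggle it. This is an assumption-laden illustration, not the actual change to boot_data.py: render_ssh_config, the key path, and the control path are hypothetical, while ControlMaster, ControlPath, ControlPersist, IdentityFile, and ServerAliveInterval are standard OpenSSH client options.

```python
def render_ssh_config(identity_file, multiplex=True,
                      control_path='/tmp/ssh-mux-%r@%h:%p'):
    """Render a minimal ssh_config, with or without connection sharing."""
    lines = [
        'Host *',
        '  IdentityFile ' + identity_file,
        '  ServerAliveInterval 1',
    ]
    if multiplex:
        # One master connection is shared by later ssh/scp invocations.
        lines += [
            '  ControlMaster auto',
            '  ControlPath ' + control_path,
            '  ControlPersist 1m',
        ]
    else:
        # No ControlPath is written, so each command opens its own connection.
        lines.append('  ControlMaster no')
    return '\n'.join(lines) + '\n'


shared = render_ssh_config('/path/to/key')
unshared = render_ssh_config('/path/to/key', multiplex=False)
```

Without sharing, every ssh and scp invocation pays for its own handshake, which is the trade-off accepted by this change in exchange for avoiding the hangs and resets described in the commit message.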

ARM64 test suites get a lot further, but still reach a point at which the SSH connection "times out" and we can't get any results back.
Project Member

Comment 13 by bugdroid1@chromium.org, Dec 4

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6f3411e38c3ca50250de5ad353faf713abebfc1c

commit 6f3411e38c3ca50250de5ad353faf713abebfc1c
Author: Wez <wez@chromium.org>
Date: Tue Dec 04 18:37:33 2018

Revert "[Fuchsia] Disable use of SSH connection multiplexing."

This reverts commit 3da02f9523de5ddb2134e13f95657ba474a5cabe.

Reason for revert: This change improved things, but did not fully resolve the issue, and broke reverse port-forwarding e.g. for net_unittests.

Original change's description:
> [Fuchsia] Disable use of SSH connection multiplexing.
> 
> Connection multiplexing results in unexpected hangs or resets when used
> by our runner scripts, so disable it until the root cause is found and
> resolved upstream.
> 
> Bug:  900790 
> Change-Id: I5025a5d652cdcbfe0c206fc4bf42366efbee07be
> Reviewed-on: https://chromium-review.googlesource.com/c/1359489
> Reviewed-by: Sergey Ulanov <sergeyu@chromium.org>
> Commit-Queue: Wez <wez@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#613377}

TBR=wez@chromium.org,sergeyu@chromium.org

Change-Id: Ieeebbadcfcf4509bd8a6af0fc6bd914d02ca5c7a
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  900790 
Reviewed-on: https://chromium-review.googlesource.com/c/1361649
Reviewed-by: Wez <wez@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#613612}
[modify] https://crrev.com/6f3411e38c3ca50250de5ad353faf713abebfc1c/build/fuchsia/boot_data.py

Status: Fixed (was: ExternalDependency)
Fixed upstream; now we just have legit test failures on that bot! :D
