Issue 877194

Issue metadata

Status: Fixed
Owner:
Closed: Aug 29
Cc:
Components:
EstimatedDays: ----
NextAction: 2018-08-29
OS: Fuchsia
Pri: 1
Type: Bug-Regression



All Fuchsia/ARM64 FYI-bot test runs fail with "Failed to connect to SSH"

Project Member Reported by w...@chromium.org, Aug 23

Issue description

All test suites have been failing since https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/fuchsia-fyi-arm64-rel/936 with output like:

2018-08-15 23:38:10,214:INFO:root:Connecting to Fuchsia using SSH.
2018-08-15 23:39:11,353:ERROR:root:Timeout limit reached.
Traceback (most recent call last):
  File "/b/s/w/ir/build/fuchsia/test_runner.py", line 117, in <module>
    sys.exit(main())
  File "/b/s/w/ir/build/fuchsia/test_runner.py", line 91, in main
    target.Start()
  File "/b/s/w/ir/build/fuchsia/qemu_target.py", line 155, in Start
    self._WaitUntilReady();
  File "/b/s/w/ir/build/fuchsia/target.py", line 158, in _WaitUntilReady
    raise FuchsiaTargetException('Couldn\'t connect using SSH.')
target.FuchsiaTargetException: Couldn't connect using SSH.
 
Owner: w...@chromium.org
Status: Assigned (was: Untriaged)
Assigning to self, as today's Gardener.
Cc: sergeyu@chromium.org stephanstross@google.com
Running the QEMU command outside the runner script, I see it fail immediately with:

qemu-system-aarch64: -netdev user,id=net0,net=192.168.3.0/24,dhcpstart=192.168.3.9,host=192.168.3.2,hostfwd=tcp::54655-:22: Could not set up host forwarding rule 'tcp::54655-:22'

Note that running with --system-log-file=- does _not_ show this logging, so that has also regressed again. :-/ 
Correction to #2: It looks like QEMU is actually just sitting there _hanging_ in this failure mode, so there is not much that the script can do. There is no output if you run the command line directly.

It does seem that the runner script fails to tear down the QEMU process when it decides that things are doomed, so we should at least fix that.
Cc: pylaligand@google.com
This regressed with https://chromium-review.googlesource.com/c/chromium/src/+/1176437 - it looks like the new zircon.bin for ARM is non-functional.

Looking at the third_party/fuchsia-sdk/sdk/target/* directories from the SDK at the time that the CL landed:

$ ls -al third_party/fuchsia-sdk/sdk/target/*/zircon.bin
1298848 Aug 23 15:01 third_party/fuchsia-sdk/sdk/target/aarch64/zircon.bin
1233312 Aug 23 15:01 third_party/fuchsia-sdk/sdk/target/arm64/zircon.bin
1310512 Aug 23 15:01 third_party/fuchsia-sdk/sdk/target/x64/zircon.bin
1310512 Aug 23 15:01 third_party/fuchsia-sdk/sdk/target/x86_64/zircon.bin

The arm64 and aarch64 paths differ in size, and also in permissions (the arm64 and both x64 versions are executable).
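
For anyone repeating the comparison above without eyeballing `ls -al` output, here is a minimal sketch that reports which attributes differ between two image files. The helper names `describe_image` and `compare_images` are hypothetical and are not part of the runner scripts; the sketch only looks at file size and the executable bits, which are the two differences noted above.

```python
import os
import stat


def describe_image(path):
    """Returns the size and executable flag of the file at `path`."""
    info = os.stat(path)
    any_exec_bit = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    return {
        'size': info.st_size,
        'executable': bool(info.st_mode & any_exec_bit),
    }


def compare_images(path_a, path_b):
    """Returns {field: (value_a, value_b)} for every field that differs."""
    a = describe_image(path_a)
    b = describe_image(path_b)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}
```

An empty result means the two files match on both attributes; it does not prove that the contents are identical, so a checksum would still be needed for that.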
Status: Started (was: Assigned)
Status: ExternalDependency (was: Started)
Project Member

Comment 7 by bugdroid1@chromium.org, Aug 24

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7defe3c711b2ee4faf668b912dd1a7e1ff6925ce

commit 7defe3c711b2ee4faf668b912dd1a7e1ff6925ce
Author: Wez <wez@chromium.org>
Date: Fri Aug 24 05:54:21 2018

Fix qemu_target.py to kill QEMU process on failure.

Previously if QEMU ran but neither exited prematurely, nor did the
guest ever become connectible via SSH, then the QEMU sub-process would
be left running after the runner script exited.

Bug:  877194 
Change-Id: Id50fa44e11987f69afc469d73adc445b5066b7a5
Reviewed-on: https://chromium-review.googlesource.com/1187543
Commit-Queue: Kevin Marshall <kmarshall@chromium.org>
Reviewed-by: Kevin Marshall <kmarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#585709}
[modify] https://crrev.com/7defe3c711b2ee4faf668b912dd1a7e1ff6925ce/build/fuchsia/qemu_target.py
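
The failure mode described in this commit message (the emulator neither exits nor becomes reachable) can be illustrated with a small sketch. This is not the code from the CL: `start_with_cleanup`, `TargetStartError`, and the `is_ready` callback are hypothetical names, and the readiness check is left to the caller rather than performing a real SSH connection.

```python
import subprocess
import time


class TargetStartError(Exception):
    """Hypothetical stand-in for the runner's own exception type."""


def start_with_cleanup(command, is_ready, timeout_secs=60.0, poll_secs=0.1):
    """Starts `command` and waits until is_ready() returns True.

    If the child exits early, the timeout expires, or anything else goes
    wrong while waiting, the child is killed before the error propagates.
    """
    process = subprocess.Popen(command)
    try:
        deadline = time.monotonic() + timeout_secs
        while time.monotonic() < deadline:
            if process.poll() is not None:
                raise TargetStartError('Process exited prematurely.')
            if is_ready():
                return process
            time.sleep(poll_secs)
        raise TargetStartError('Timed out waiting for the target.')
    except BaseException:
        # Never leave the child running if startup did not complete.
        if process.poll() is None:
            process.kill()
        process.wait()
        raise
```

Catching `BaseException` is deliberate in this sketch, so that a Ctrl-C during the wait also tears the child down. On success the caller owns the returned process and is responsible for stopping it.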

NextAction: 2018-08-29
Status: Started (was: ExternalDependency)
The SDK fix has landed; we just need to fix the runner scripts.
The NextAction date has arrived: 2018-08-29
Project Member

Comment 11 by bugdroid1@chromium.org, Aug 29

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9aa404d6e3b8ac31238a4af25ae387407f1094bd

commit 9aa404d6e3b8ac31238a4af25ae387407f1094bd
Author: Wez <wez@chromium.org>
Date: Wed Aug 29 17:19:48 2018

Use correct Zircon image when running ARM64 build under QEMU.

Zircon for ARM64 has different kernel images for QEMU versus physical
devices, at present.  Fix the runner script to use the correct kernel
image for QEMU, when targeting ARM64.

Bug:  877194 
Change-Id: I5131177e4299bcb4f505a90a356c0b2cdf14d283
Reviewed-on: https://chromium-review.googlesource.com/1195102
Commit-Queue: Kevin Marshall <kmarshall@chromium.org>
Reviewed-by: Kevin Marshall <kmarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#587169}
[modify] https://crrev.com/9aa404d6e3b8ac31238a4af25ae387407f1094bd/build/fuchsia/qemu_target.py
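
As a rough illustration of the selection logic this commit message describes, the sketch below picks a kernel image based on the target CPU and on whether the build runs under QEMU. It is not the code from the CL. The function name `kernel_image_path`, the lookup table, and in particular the file name `qemu-kernel.bin` are assumptions made for the example; the real file names are whatever the SDK ships.

```python
import os

# Assumed, illustrative file names keyed by (target_cpu, under_qemu).
# Only the idea that ARM64 needs a different image under QEMU comes from
# the commit message; the names themselves are placeholders.
_KERNEL_IMAGES = {
    ('arm64', True): 'qemu-kernel.bin',
    ('arm64', False): 'zircon.bin',
    ('x64', True): 'zircon.bin',
    ('x64', False): 'zircon.bin',
}


def kernel_image_path(sdk_target_dir, target_cpu, under_qemu):
    """Returns the kernel image path for the given CPU and run mode."""
    key = (target_cpu, bool(under_qemu))
    if key not in _KERNEL_IMAGES:
        raise ValueError('Unsupported target CPU: %r' % (target_cpu,))
    return os.path.join(sdk_target_dir, target_cpu, _KERNEL_IMAGES[key])
```

Keeping the mapping in one table makes it easy to delete the ARM64 special case if the QEMU and physical-device images are ever unified.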

Status: Fixed (was: Started)
Confirmed that build https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/fuchsia-fyi-arm64-rel/1336 is running tests successfully.

Thanks pylaligand@ for the fixes!
