CQ failure: whirlwind provision failure
Issue description
Tracking bug for failure: https://luci-milo.appspot.com/buildbot/chromeos/whirlwind-paladin/8436
1 DUT failed to reboot after provision: chromeos4-row10-jetstream-host8
Jun 30 2017
Two successive whirlwind-paladin CQ runs last night (builds 8435 and 8436) failed, with every DUT failing to provision. The provisions generally failed because system-services was not running after provisioning (verify.cros legacy host verification failed). The services most likely did start eventually, because verify.jetstream, which checks that a jetstream service is up and running, passed after multiple retries. That check typically does not need to retry, but in this case it retried for close to 1 minute before succeeding.

In the case of chromeos4-row10-jetstream-host8, the host was not reachable after provisioning. Possibly whatever was slowing down boot on the other DUTs made this DUT boot slowly enough to time out during SSH setup.

This view is useful for seeing the provisioning failures around 5:30 PM and 7:00 PM yesterday: https://viceroy.corp.google.com/chromeos/suite_details?build_id=1633110

Provisioning appears normal today.
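The retry behaviour described above (a check that normally passes on the first attempt but here needed close to 1 minute) can be sketched as a poll-with-timeout loop. This is only an illustration, not the actual verify.jetstream code: `poll_until`, its parameters, and the 60 s / 5 s defaults are hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Optional


def poll_until(check: Callable[[], bool],
               timeout_s: float = 60.0,
               interval_s: float = 5.0,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> Optional[float]:
    """Call check() until it returns True or the timeout would be exceeded.

    Returns the elapsed seconds at the first success, or None on timeout.
    `clock` and `sleep` are injectable so the loop can be tested without
    real waiting.
    """
    start = clock()
    while True:
        if check():
            return clock() - start
        # Give up if another sleep would take us past the deadline.
        if clock() - start + interval_s > timeout_s:
            return None
        sleep(interval_s)
```

Logging the returned elapsed time, not just pass/fail, would make a slow-boot regression like this one visible even while the check still passes.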
Jul 1 2017
Looking through logs from chromeos4-row10-jetstream-host8: the host became SSHable in the subsequent repair, after a repair.servoreset. In the logs pulled after the reset I noticed some bluetooth issues; it is unclear whether they are related:

1970-01-01T00:01:08.750285+00:00 WARNING kernel: [ 68.773063] udevd[144]: seq 1024 '/devices/soc.2/usb30.5/10000000.dwc3/xhci-hcd.1.auto/usb3/3-1/3-1:1.0/bluetooth/hci0' is taking a long time
2017-06-27T14:16:13.282770+00:00 ERR kernel: [ 188.773128] udevd[144]: seq 1024 '/devices/soc.2/usb30.5/10000000.dwc3/xhci-hcd.1.auto/usb3/3-1/3-1:1.0/bluetooth/hci0' killed
2017-06-27T14:16:13.282823+00:00 ERR kernel: [ 188.774317] udevd[144]: worker [713] failed while handling '/devices/soc.2/usb30.5/10000000.dwc3/xhci-hcd.1.auto/usb3/3-1/3-1:1.0/bluetooth/hci0'
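To pull udevd stalls like these out of a longer log, something along the lines of the following sketch may help. It assumes the syslog layout seen in the lines quoted above; the regexes and the `find_slow_udev_events` helper are hypothetical, written for this example, and are not part of any lab tooling.

```python
import re

# Matches the udevd "taking a long time" / "killed" messages quoted above.
UDEVD_EVENT = re.compile(
    r"udevd\[(?P<pid>\d+)\]: seq (?P<seq>\d+) '(?P<device>[^']+)' "
    r"(?P<state>is taking a long time|killed)"
)
# Kernel uptime stamp, e.g. "kernel: [ 68.773063]".
KERNEL_UPTIME = re.compile(r"kernel: \[\s*(?P<uptime>\d+\.\d+)\]")

SAMPLE_LOG = [
    "1970-01-01T00:01:08.750285+00:00 WARNING kernel: [ 68.773063] udevd[144]: seq 1024 '/devices/soc.2/usb30.5/10000000.dwc3/xhci-hcd.1.auto/usb3/3-1/3-1:1.0/bluetooth/hci0' is taking a long time",
    "2017-06-27T14:16:13.282770+00:00 ERR kernel: [ 188.773128] udevd[144]: seq 1024 '/devices/soc.2/usb30.5/10000000.dwc3/xhci-hcd.1.auto/usb3/3-1/3-1:1.0/bluetooth/hci0' killed",
    "2017-06-27T14:16:13.282823+00:00 ERR kernel: [ 188.774317] udevd[144]: worker [713] failed while handling '/devices/soc.2/usb30.5/10000000.dwc3/xhci-hcd.1.auto/usb3/3-1/3-1:1.0/bluetooth/hci0'",
]


def find_slow_udev_events(lines):
    """Return one dict per udevd 'taking a long time' or 'killed' line."""
    events = []
    for line in lines:
        event = UDEVD_EVENT.search(line)
        if event is None:
            continue  # e.g. the "worker ... failed" line is skipped
        uptime = KERNEL_UPTIME.search(line)
        events.append({
            "uptime_s": float(uptime.group("uptime")) if uptime else None,
            "seq": int(event.group("seq")),
            "device": event.group("device"),
            "state": event.group("state"),
        })
    return events
```

On the three lines quoted above this reports two events for seq 1024, with kernel uptimes about 120 s apart (68.77 s and 188.77 s).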
Jul 5 2017
Maybe flake, +current deputy fyi
Jul 10 2017
I don't believe this was a flake: something went into R61-9718.0.0 that greatly increased the time for whirlwinds to become fully operational. This in turn has caused whirlwind host verification to fail 100% of the time since R61-9718. See crbug/739583.
Jul 12 2017
This was due to a real failure related to https://chromium-review.googlesource.com/c/437525/
Comment 1 by jrbarnette@chromium.org, Jun 30 2017
Labels: -Pri-2 Pri-1
Owner: ayatane@chromium.org
Status: Assigned (was: Untriaged)