New issue
Advanced search Search tips

Issue 865984 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Jul 24
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Sign in to add a comment

bob-release: Failed to install device image

Project Member Reported by rrangel@chromium.org, Jul 20

Issue description

The last three bob-release builds have failed at bvt-installer:

https://cros-goldeneye.corp.google.com/chromeos/legoland/builderHistory?buildConfig=bob-release&buildBranch=master


Running (ssh) '/usr/bin/update_engine_client --update --omaha_url=http://100.115.185.228:8082/update/bob-release/R69-10893.0.0'

[0719/224549:ERROR:update_engine_client.cc(232)] Update failed, current operation is UPDATE_STATUS_IDLE, last error code is ErrorCode::kDownloadTransferError(9)

Sounds like the DUT can't download the update?

 
Confirmation from DUT's udpate_engine.log

[0720/073604:INFO:libcurl_http_fetcher.cc(433)] HTTP response code: 206
[0720/073604:INFO:libcurl_http_fetcher.cc(490)] Transfer interrupted after downloading 3496612 of 1010362975 bytes. 1006866363 bytes remaining after 1 attempt(s)
[0720/073604:INFO:libcurl_http_fetcher.cc(502)] Restarting transfer to download the remaining bytes
[0720/073624:INFO:libcurl_http_fetcher.cc(146)] Starting/Resuming transfer
[0720/073624:INFO:libcurl_http_fetcher.cc(165)] Using proxy: no
[0720/073624:INFO:libcurl_http_fetcher.cc(286)] Not setting http(s) curl options because we are running a dev/test image
[0720/073624:ERROR:libcurl_http_fetcher.cc(436)] Unable to get http response code.
[0720/073624:INFO:libcurl_http_fetcher.cc(490)] Transfer interrupted after downloading 3496612 of 1010362975 bytes. 1006866363 bytes remaining after 2 attempt(s)
[0720/073624:INFO:libcurl_http_fetcher.cc(502)] Restarting transfer to download the remaining bytes
[0720/073644:INFO:libcurl_http_fetcher.cc(146)] Starting/Resuming transfer
[0720/073644:INFO:libcurl_http_fetcher.cc(165)] Using proxy: no
[0720/073644:INFO:libcurl_http_fetcher.cc(286)] Not setting http(s) curl options because we are running a dev/test image
[0720/073644:ERROR:libcurl_http_fetcher.cc(436)] Unable to get http response code.
[0720/073644:INFO:libcurl_http_fetcher.cc(490)] Transfer interrupted after downloading 3496612 of 1010362975 bytes. 1006866363 bytes remaining after 3 attempt(s)
[0720/073644:INFO:libcurl_http_fetcher.cc(502)] Restarting transfer to download the remaining bytes
[0720/073704:INFO:libcurl_http_fetcher.cc(146)] Starting/Resuming transfer
[0720/073704:INFO:libcurl_http_fetcher.cc(165)] Using proxy: no
[0720/073704:INFO:libcurl_http_fetcher.cc(286)] Not setting http(s) curl options because we are running a dev/test image
[0720/073704:ERROR:libcurl_http_fetcher.cc(436)] Unable to get http response code.
[0720/073704:INFO:libcurl_http_fetcher.cc(490)] Transfer interrupted after downloading 3496612 of 1010362975 bytes. 1006866363 bytes remaining after 4 attempt(s)
[0720/073704:INFO:libcurl_http_fetcher.cc(496)] Reached max attempts (4)
[0720/073704:INFO:multi_range_http_fetcher.cc(173)] Received transfer complete.
[0720/073704:INFO:multi_range_http_fetcher.cc(130)] TransferEnded w/ code 0
[0720/073704:INFO:multi_range_http_fetcher.cc(145)] Didn't get enough bytes. Ending w/ failure.
[0720/073704:INFO:delta_performer.cc(313)] Discarding 588865 unused downloaded bytes
[0720/073704:WARNING:download_action.cc(363)] Error closing the writer.
[0720/073704:INFO:action_processor.cc(116)] ActionProcessor: finished DownloadAction with code ErrorCode::kDownloadTransferError
[0720/073704:INFO:action_processor.cc(121)] ActionProcessor: Aborting processing due to failure.
[0720/073704:INFO:update_attempter.cc(941)] Processing Done.
[0720/073704:INFO:cpu_limiter.cc(71)] Setting cgroup cpu shares to  1024
[0720/073704:INFO:cpu_limiter.cc(79)] CPU shares = 1024
[0720/073704:ERROR:update_attempter.cc(1379)] Update failed.
[0720/073704:INFO:payload_state.cc(257)] Updating payload state for error code: 9 (ErrorCode::kDownloadTransferError)
The latest failure mode is different from the others:

https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8940550172709264480

(SSH to the DUT dropped before test could even begin)

The other two used different devservers, which reduces the probability this is a bad devserver.
The other two failed while installing different images, which reduces the probability this is a bad image (though these were ToT images, so if there is a bug in ToT causing this, it will continue to fail on all images)


All these failures were on DUT chromeos6-row4-rack13-host20


It's also strange that it fails with the same number of bytes. Could we be running out of disk space? Where are the images betting stored?
I do see this in dmesg repeated a lot of times:

2018-07-20T05:46:13.884949+00:00 INFO kernel: [ 1528.163532] usb 2-1.3.1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
2018-07-20T05:46:13.884955+00:00 INFO kernel: [ 1528.163540] usb 2-1.3.1: Product: USB 10/100/1000 LAN
2018-07-20T05:46:13.884960+00:00 INFO kernel: [ 1528.163547] usb 2-1.3.1: Manufacturer: Realtek
2018-07-20T05:46:13.884965+00:00 INFO kernel: [ 1528.163554] usb 2-1.3.1: SerialNumber: 000000000000
2018-07-20T05:46:13.948897+00:00 INFO kernel: [ 1528.227252] usb 2-1.3.4: new full-speed USB device number 91 using ehci-platform
2018-07-20T05:46:14.011917+00:00 ERR kernel: [ 1528.290235] usb 2-1.3.4: device descriptor read/64, error -32
2018-07-20T05:46:14.175926+00:00 ERR kernel: [ 1528.454271] usb 2-1.3.4: device descriptor read/64, error -32
2018-07-20T05:46:14.338886+00:00 INFO kernel: [ 1528.617382] usb 2-1.3.4: new full-speed USB device number 92 using ehci-platform
2018-07-20T05:46:14.401878+00:00 ERR kernel: [ 1528.680341] usb 2-1.3.4: device descriptor read/64, error -32
2018-07-20T05:46:14.565874+00:00 ERR kernel: [ 1528.844374] usb 2-1.3.4: device descriptor read/64, error -32
2018-07-20T05:46:14.666881+00:00 INFO kernel: [ 1528.945629] usb 2-1.3-port4: attempt power cycle
2018-07-20T05:46:14.728886+00:00 INFO kernel: [ 1529.007447] usb 2-1.3.1: reset high-speed USB device number 90 using ehci-platform
2018-07-20T05:46:14.849880+00:00 INFO kernel: [ 1529.128653] r8152 2-1.3.1:1.0 eth0: v1.09.9
2018-07-20T05:46:14.876714+00:00 INFO kernel: [ 1529.154854] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready

Maybe the servo ethernet is bad?
.4 is the wrong device. .1 shows transaction errors:

018-07-20T05:42:00.070703+00:00 WARNING kernel: [ 1274.349906] r8152 2-1.3.1:1.0 eth0: Rx status -71
2018-07-20T05:42:00.074797+00:00 WARNING kernel: [ 1274.354108] r8152 2-1.3.1:1.0 eth0: Rx status -71
2018-07-20T05:42:00.079029+00:00 WARNING kernel: [ 1274.358350] r8152 2-1.3.1:1.0 eth0: Rx status -71
2018-07-20T05:42:00.083259+00:00 WARNING kernel: [ 1274.362591] r8152 2-1.3.1:1.0 eth0: Rx status -71
2018-07-20T05:42:00.087620+00:00 WARNING kernel: [ 1274.366875] r8152 2-1.3.1:1.0 eth0: Rx status -71
2018-07-20T05:42:00.091873+00:00 WARNING kernel: [ 1274.371118] r8152 2-1.3.1:1.0 eth0: Rx status -71
2018-07-20T05:42:00.096070+00:00 WARNING kernel: [ 1274.375362] r8152 2-1.3.1:1.0 eth0: Rx status -71
2018-07-20T05:42:00.100272+00:00 WARNING kernel: [ 1274.379601] r8152 2-1.3.1:1.0 eth0: Rx status -71
2018-07-20T05:42:00.104598+00:00 WARNING kernel: [ 1274.383867] r8152 2-1.3.1:1.0 eth0: Rx status -71
2018-07-20T05:42:00.108861+00:00 WARNING kernel: [ 1274.388122] r8152 2-1.3.1:1.0 eth0: Rx status -71
2018-07-20T05:42:00.126867+00:00 INFO kernel: [ 1274.406116] usb 2-1.3: USB disconnect, device number 8
2018-07-20T05:42:00.126923+00:00 INFO kernel: [ 1274.406146] usb 2-1.3.1: USB disconnect, device number 9
Re #4/#5: These look bad, but I don't think are related to the current issue -- update_enging actually reported getting an HTTP 206, so devserver did return partial content, and then refused to send more / DUT lost connectivity so couldn't get the rest.

I haven't seen partial content response before from devserver in this context. And those errors seem to be present throughout (including while the first partial download was in progress)
Status: Fixed (was: Untriaged)
Had a bob-release build pass bvt-installer.  Skimming, I don't see any reason to spend more time investigating this at the moment.

Sign in to add a comment