Reks-release failed due to hwtest timed out |
||||||||||||||||
Issue descriptionReks has been failing for about a week: 1/06 01:28:34.965 ERROR| logging_manager:0626| tko parser: The following lines were ignored: 11/06 01:28:34.965 ERROR| logging_manager:0626| tko parser: , 1) SSHConnectionError: ssh: connect to host 100.115.202.98 port 22: Connection timed out 11/06 01:28:34.966 ERROR| logging_manager:0626| 11/06 01:28:34.966 ERROR| logging_manager:0626| tko parser: , 11/06 01:28:34.966 ERROR| logging_manager:0626| 11/06 01:28:34.967 ERROR| logging_manager:0626| tko parser: Traceback (most recent call last): 11/06 01:28:34.967 ERROR| logging_manager:0626| 11/06 01:28:34.967 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/client/common_lib/test.py", line 806, in _call_test_function 11/06 01:28:34.968 ERROR| logging_manager:0626| 11/06 01:28:34.968 ERROR| logging_manager:0626| tko parser: return func(*args, **dargs) 11/06 01:28:34.968 ERROR| logging_manager:0626| 11/06 01:28:34.969 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/client/common_lib/test.py", line 470, in execute 11/06 01:28:34.969 ERROR| logging_manager:0626| 11/06 01:28:34.970 ERROR| logging_manager:0626| tko parser: dargs) 11/06 01:28:34.970 ERROR| logging_manager:0626| 11/06 01:28:34.970 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/client/common_lib/test.py", line 347, in _call_run_once_with_retry 11/06 01:28:34.971 ERROR| logging_manager:0626| 11/06 01:28:34.971 ERROR| logging_manager:0626| tko parser: postprocess_profiled_run, args, dargs) 11/06 01:28:34.971 ERROR| logging_manager:0626| 11/06 01:28:34.972 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/client/common_lib/test.py", line 380, in _call_run_once 11/06 01:28:34.972 ERROR| logging_manager:0626| 11/06 01:28:34.973 ERROR| logging_manager:0626| tko parser: self.run_once(*args, **dargs) 11/06 01:28:34.973 ERROR| logging_manager:0626| 11/06 01:28:34.973 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/server/site_tests/autoupdate_EndToEndTest/autoupdate_EndToEndTest.py", line 423, in run_once 11/06 01:28:34.974 ERROR| logging_manager:0626| 11/06 01:28:34.974 ERROR| logging_manager:0626| tko parser: cros_device.install_source_image(test_conf['source_payload_uri']) 11/06 01:28:34.974 ERROR| logging_manager:0626| 11/06 01:28:34.975 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/server/cros/update_engine/chromiumos_test_platform.py", line 104, in install_source_image 11/06 01:28:34.975 ERROR| logging_manager:0626| 11/06 01:28:34.975 ERROR| logging_manager:0626| tko parser: self._install_version(source_payload_uri, clobber_stateful=True) 11/06 01:28:34.976 ERROR| logging_manager:0626| 11/06 01:28:34.976 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/server/cros/update_engine/chromiumos_test_platform.py", line 53, in _install_version 11/06 01:28:34.977 ERROR| logging_manager:0626| 11/06 01:28:34.977 ERROR| logging_manager:0626| tko parser: clobber_stateful=clobber_stateful) 11/06 01:28:34.978 ERROR| logging_manager:0626| 11/06 01:28:34.978 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/client/common_lib/cros/dev_server.py", line 2380, in auto_update 11/06 01:28:34.978 ERROR| logging_manager:0626| 11/06 01:28:34.979 ERROR| logging_manager:0626| tko parser: error_msg % (host_name, real_error)) 11/06 01:28:34.979 ERROR| logging_manager:0626| 11/06 01:28:34.979 ERROR| logging_manager:0626| tko parser: DevServerException: CrOS auto-update failed for host chromeos4-row11-rack7-host2: 0) SSHConnectionError: ssh: connect to host chromeos4-row11-rack7-host2 port 22: Connection timed out 11/06 01:28:34.980 ERROR| logging_manager:0626| 11/06 01:28:34.980 ERROR| logging_manager:0626| tko parser: , 1) SSHConnectionError: ssh: connect to host 100.115.202.98 port 22: Connection timed out 11/06 01:28:34.981 ERROR| logging_manager:0626| 11/06 01:28:34.981 ERROR| logging_manager:0626| tko parser: , Link to builds page: https://uberchromegw.corp.google.com/i/chromeos_release/builders/reks-release%20release-R63-10032.B
,
Nov 7 2017
Issue 780745 has been merged into this issue.
,
Nov 7 2017
,
Nov 7 2017
Still an issue? If so, please reassign to me.
,
Nov 8 2017
Reks' last run was green. Thanks
,
Nov 9 2017
Issue 780745 still remains same in M64. Could you have a look?
,
Nov 9 2017
Please link to recent occurrence.
,
Nov 10 2017
w.r.t. Issue 780745, 10113.0.0 was the most recent failure(Build status FAIL) https://cros-goldeneye.corp.google.com/chromeos/console/listBuild?boards=reks&milestone=64&chromeOsVersion=&chromeVersion=&startTimeFrom=&startTimeTo=#/ Issue 780745 covers the rest.
,
Nov 10 2017
,
Nov 10 2017
It's more helpful to link to the *specific failure*. E.g. in this case I believe https://luci-milo.appspot.com/buildbot/chromeos/reks-release/1660 ?
,
Nov 10 2017
I don't see evidence that ^ is an infrastructure failure. At this point it's also hard to associate a particular event with a particular repair log.
,
Nov 13 2017
@akeshet Yes true. it is always a good practice to provide enough info from testing side to triage the bug. Due to restricted access, TVC's/I are/am refrained from accessing the link you referenced in # 10 and build status "FAIL" link in Golden eye. That said, it would be helpful if you share some insight on the gap that make it fruitful( preliminary check and logs of interest from dev perspective for such issues) Rohit, As per our guideline, Reks continues to have "FAIL" build status in M64 and M63 and the same have been reported via Issue 780745 which then duplicated to the current bug . This is Red(FAIL build status) most of the time and remains same till now. This affects our cts monitoring . Could you please guide us on the approach to take it forward?
,
Nov 14 2017
+ deputy
,
Nov 14 2017
"/tmp/tmp60f18J"'
11/14 05:01:34.158 DEBUG| test:0183| before_hook ends running.
11/14 05:01:34.215 DEBUG| test:0362| starting before_iteration_hooks
11/14 05:01:34.216 DEBUG| test:0365| before_iteration_hooks completed
11/14 05:01:34.217 DEBUG| test:0379| starting test(run_once()), test details follow
()
11/14 05:01:34.217 DEBUG|provision_AutoUpda:0061| Start provisioning <remote host: chromeos4-row11-rack7-host5> to reks-release/R64-10127.0.0.
11/14 05:01:34.218 INFO | dev_server:0707| Getting devservers for host: chromeos4-row11-rack7-host5
11/14 05:01:34.278 DEBUG| dev_server:0728| The host chromeos4-row11-rack7-host5 (100.115.202.100) is in a restricted subnet. Try to locate a devserver inside subnet 100.115.192.0:19.
11/14 05:01:34.282 DEBUG| dev_server:0679| Pick one healthy devserver from ['http://100.115.219.129:8082', 'http://100.115.219.130:8082', 'http://100.115.219.132:8082', 'http://100.115.219.133:8082', 'http://100.115.219.134:8082', 'http://100.115.219.136:8082', 'http://100.115.219.139:8082', 'http://100.115.219.135:8082']
11/14 05:01:34.283 DEBUG| dev_server:0683| Check health for http://100.115.219.134:8082
11/14 05:01:34.284 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/check_health?"''
11/14 05:01:34.397 DEBUG| dev_server:0936| Error occurred with exit_code 255 when executing the ssh call: ssh_exchange_identification: Connection closed by remote host
.
11/14 05:01:34.400 WARNI| retry:0228| <class 'autotest_lib.client.common_lib.error.CmdError'>(Command <ssh 100.115.219.134 'curl "http://100.115.219.134:8082/check_health?"'> failed, rc=255, Command returned non-zero exit status
* Command:
ssh 100.115.219.134 'curl "http://100.115.219.134:8082/check_health?"'
Exit status: 255
Duration: 0.0241811275482
stderr:
ssh_exchange_identification: Connection closed by remote host)
11/14 05:01:34.404 WARNI| retry:0183| Retrying in 2.741414 seconds...
11/14 05:01:37.160 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/check_health?"''
11/14 05:01:45.915 DEBUG| dev_server:0688| Pick http://100.115.219.134:8082
11/14 05:01:45.916 INFO | dev_server:1153| Staging artifacts on devserver http://100.115.219.134:8082: build=reks-release/R64-10127.0.0, artifacts=['full_payload', 'stateful', 'autotest_packages'], files=, archive_url=gs://chromeos-image-archive/reks-release/R64-10127.0.0
11/14 05:01:45.918 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/stage?artifacts=full_payload,stateful,autotest_packages&files=&async=True&archive_url=gs://chromeos-image-archive/reks-release/R64-10127.0.0"''
11/14 05:01:54.459 DEBUG| dev_server:1095| response for RPC: 'Success'
11/14 05:01:54.460 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/is_staged?artifacts=full_payload,stateful,autotest_packages&files=&archive_url=gs://chromeos-image-archive/reks-release/R64-10127.0.0"''
11/14 05:02:02.993 DEBUG| dev_server:1051| whether artifact is staged: 'True'
11/14 05:02:02.998 INFO | dev_server:1171| Finished staging artifacts: build=reks-release/R64-10127.0.0, artifacts=['full_payload', 'stateful', 'autotest_packages'], files=, archive_url=gs://chromeos-image-archive/reks-release/R64-10127.0.0
11/14 05:02:03.004 INFO | dev_server:1489| Requesting contents from devserver http://100.115.219.134:8082 for image reks-release/R64-10127.0.0
11/14 05:02:03.005 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/list_image_dir?archive_url=gs://chromeos-image-archive/reks-release/R64-10127.0.0"''
11/14 05:02:03.166 DEBUG| dev_server:0936| Error occurred with exit_code 255 when executing the ssh call: ssh_exchange_identification: Connection closed by remote host
.
11/14 05:02:03.168 WARNI| retry:0228| <class 'autotest_lib.client.common_lib.error.CmdError'>(Command <ssh 100.115.219.134 'curl "http://100.115.219.134:8082/list_image_dir?archive_url=gs://chromeos-image-archive/reks-release/R64-10127.0.0"'> failed, rc=255, Command returned non-zero exit status
* Command:
ssh 100.115.219.134 'curl
"http://100.115.219.134:8082/list_image_dir?archive_url=gs://chromeos-
image-archive/reks-release/R64-10127.0.0"'
Exit status: 255
Duration: 0.0392081737518
stderr:
ssh_exchange_identification: Connection closed by remote host)
I find this on the recent logging of the failed HWTest.
,
Nov 14 2017
,
Nov 20 2017
Pass to current deputy.
,
Nov 22 2017
https://uberchromegw.corp.google.com/i/chromeos_release/builders/reks-release%20release-R63-10032.B/builds/41 https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/157829732-chromeos-test/chromeos4-row11-rack6-host20/autoupdate_EndToEndTest.paygen_au_dev_full/debug/ The latest build failed at Traceback (most recent call last): File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 269, in TriggerAU clobber_stateful=self.clobber_stateful) File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 1020, in __init__ payload_filename=payload_filename) File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 252, in __init__ self.device_dev_dir = os.path.join(self.device.work_dir, 'src') File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 677, in work_dir capture_output=True).output.strip() File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 959, in BaseRunCommand return self.GetAgent().RemoteSh(cmd, **kwargs) File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 349, in RemoteSh raise SSHConnectionError(e.result.error) SSHConnectionError: ssh: connect to host 100.115.202.94 port 22: Connection timed out 11/22 01:13:33.747 DEBUG| dev_server:2337| Please see error details in log /usr/local/autotest/results/157829732-chromeos-test/chromeos4-row11-rack6-host20/autoupdate_logs/CrOS_update_100.115.202.94_18149.log 11/22 01:13:33.763 CRITI|chromiumos_test_pl:0055| ERROR: Failed to install image on the DUT. 11/22 01:13:33.765 DEBUG| test:0410| Test failed due to CrOS auto-update failed for host chromeos4-row11-rack6-host20: 0) Could not copy /home/chromeos-test/images/dev-channel/reks/10032.50.0/payloads/chromeos_10032.50.0_reks_dev-channel_full_test.bin-a8b9705409820a6bcec6b1d4787b0c49 to device., 1) SSHConnectionError: ssh: connect to host 100.115.202.94 port 22: Connection timed out, . Exception log follows the after_iteration_hooks.
,
Nov 22 2017
But the dut chromeos4-row11-rack6-host20 seems able to run tests for R64 branch, only failed at provision jobs when the previous run was for R63.
,
Nov 22 2017
Going to lock the dut, will see if it helps
,
Nov 27 2017
reks was passing in the past few days, the last failed one was because the lab was closed. passing this to the current deputy to monitor.
,
Dec 7 2017
Pass to current deputy & secondary, recently failed builds are due to hwtest timed out. Should we add more DUTs to bvt?
,
Dec 7 2017
,
Dec 13 2017
M65: https://cros-goldeneye.corp.google.com/chromeos/console/listBuild?boards=reks&milestone=65&chromeOsVersion=&chromeVersion=&startTimeFrom=&startTimeTo=#/ https://luci-milo.appspot.com/buildbot/chromeos/reks-release/1753 https://luci-milo.appspot.com/buildbot/chromeos/reks-release/ reks continue to have red build status staring M65 also expect few green.
,
Jan 16 2018
|
||||||||||||||||
►
Sign in to add a comment |
||||||||||||||||
Comment 1 by akes...@chromium.org
, Nov 7 2017The DUT chromeos4-row11-rack7-host2 looks problematic, I'm locking it temporarily. $ dut-status -f chromeos4-row11-rack7-host2 chromeos4-row11-rack7-host2 2017-11-06 22:19:06 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2025203-repair/ 2017-11-06 22:18:24 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2025198-reset/ 2017-11-06 21:54:44 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154476506-chromeos-test/ 2017-11-06 21:53:50 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2025082-reset/ 2017-11-06 21:32:33 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2025015-repair/ 2017-11-06 21:31:47 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2025009-reset/ 2017-11-06 16:51:27 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154437049-chromeos-test/ 2017-11-06 16:50:34 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024292-reset/ 2017-11-06 16:24:17 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024222-repair/ 2017-11-06 16:23:32 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024219-reset/ 2017-11-06 15:42:47 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154419533-chromeos-test/ 2017-11-06 15:41:59 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024116-reset/ 2017-11-06 15:39:55 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410902-chromeos-test/ 2017-11-06 15:39:02 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024103-reset/ 2017-11-06 15:37:44 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410896-chromeos-test/ 2017-11-06 15:36:52 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024095-reset/ 2017-11-06 15:35:15 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410889-chromeos-test/ 2017-11-06 15:34:29 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024089-reset/ 2017-11-06 15:33:14 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410883-chromeos-test/ 2017-11-06 15:32:29 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024083-reset/ 2017-11-06 15:31:23 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410873-chromeos-test/ 2017-11-06 15:30:33 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024077-reset/ 2017-11-06 15:29:00 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410860-chromeos-test/ 2017-11-06 15:28:17 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024070-reset/ 2017-11-06 15:27:02 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410831-chromeos-test/ 2017-11-06 15:26:15 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024063-reset/ 2017-11-06 15:25:07 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410829-chromeos-test/ 2017-11-06 15:24:21 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024060-reset/ 2017-11-06 15:22:54 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410820-chromeos-test/ 2017-11-06 15:21:57 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024051-reset/ 2017-11-06 15:20:35 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410812-chromeos-test/ 2017-11-06 15:19:43 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024034-reset/ 2017-11-06 15:18:19 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410782-chromeos-test/ 2017-11-06 15:17:12 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024015-reset/ 2017-11-06 15:15:40 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410770-chromeos-test/ 2017-11-06 15:14:48 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2024003-reset/ 2017-11-06 15:13:10 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410758-chromeos-test/ 2017-11-06 15:11:44 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2023979-reset/ 2017-11-06 15:08:45 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154410827-chromeos-test/ 2017-11-06 14:51:14 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2023873-provision/ 2017-11-06 14:29:58 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2023799-repair/ 2017-11-06 14:11:28 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2023735-provision/ 2017-11-06 11:34:57 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2023327-cleanup/ 2017-11-06 11:31:05 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154395968-chromeos-test/ 2017-11-06 11:30:19 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2023318-reset/ 2017-11-06 11:29:10 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row11-rack7-host2/2023305-cleanup/ 2017-11-06 11:27:36 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/154367496-chromeos-test/