master-paladin build 16475 failed due to DNS failures
Issue description

https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/16475

All failures are in HWTest [bvt-inline] stage:
---------------------
veyron_mighty-paladin: The HWTest [bvt-inline] stage failed: ** HWTest failed (code 1) **@https://luci-milo.appspot.com/buildbot/chromeos/veyron_mighty-paladin/6806
reef-paladin: The HWTest [bvt-inline] stage failed: ** HWTest failed (code 1) **@https://luci-milo.appspot.com/buildbot/chromeos/reef-paladin/3842
elm-paladin: The HWTest [bvt-inline] stage failed: ** HWTest failed (code 1) **@https://luci-milo.appspot.com/buildbot/chromeos/elm-paladin/4261
wolf-paladin: The HWTest [bvt-inline] stage failed: ** HWTest failed (code 1) **@https://luci-milo.appspot.com/buildbot/chromeos/wolf-paladin/15852
winky-paladin: The HWTest [bvt-inline] stage failed: ** HWTest failed (code 1) **@https://luci-milo.appspot.com/buildbot/chromeos/winky-paladin/3150
cave-paladin: The HWTest [bvt-inline] stage failed: ** HWTest did not complete due to infrastructure issues (code 3) **@https://luci-milo.appspot.com/buildbot/chromeos/cave-paladin/1771
nyan_big-paladin: The HWTest [bvt-inline] stage failed: ** HWTest failed (code 1) **@https://luci-milo.appspot.com/buildbot/chromeos/nyan_big-paladin/3151
kevin-paladin: The HWTest [bvt-inline] stage failed: ** HWTest did not complete due to infrastructure issues (code 3) **@https://luci-milo.appspot.com/buildbot/chromeos/kevin-paladin/2612
lumpy-paladin: The HWTest [bvt-inline] stage failed: ** HWTest failed (code 1) **@https://luci-milo.appspot.com/buildbot/chromeos/lumpy-paladin/29812
reef-uni-paladin: The HWTest [bvt-inline] [snappy] stage failed: ** HWTest did not complete due to infrastructure issues (code 3) **@https://luci-milo.appspot.com/buildbot/chromeos/reef-uni-paladin/582
peach_pit-paladin: The HWTest [bvt-inline] stage failed: ** HWTest failed (code 1) **@https://luci-milo.appspot.com/buildbot/chromeos/peach_pit-paladin/17253
link-paladin: The HWTest [bvt-inline] stage failed: ** HWTest failed (code 1) **@https://luci-milo.appspot.com/buildbot/chromeos/link-paladin/29843
---------------------

All build failures are in provision_AutoUpdate.double:

veyron_mighty-paladin/6806
** HWTest failed (code 1) **
provision_AutoUpdate.double: retry_count: 2, FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row6-rack11-host17: SSHConnectionError: ssh: Could not resolve hostname chromeos4-row6-rack11-host17: Name or service not known

reef-paladin/3842
** HWTest failed (code 1) **
[Test-Logs]: provision_AutoUpdate.double: retry_count: 2, FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos2-row7-rack6-host21: SSHConnectionError: ssh: Could not resolve hostname chromeos2-row7-rack6-host21: Name or service not known

elm-paladin/4261
** HWTest failed (code 1) **
provision_AutoUpdate.double: retry_count: 1, FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos2-row7-rack7-host1: SSHConnectionError: ssh: Could not resolve hostname chromeos2-row7-rack7-host1: Name or service not known

wolf-paladin/15852
** HWTest failed (code 1) **
provision_AutoUpdate.double: retry_count: 2, FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row1-rack4-host1: SSHConnectionError: ssh: Could not resolve hostname chromeos4-row1-rack4-host1: Name or service not known

winky-paladin/3150
** HWTest failed (code 1) **
[Test-Logs]: provision_AutoUpdate.double: retry_count: 2, FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row3-rack13-host5: SSHConnectionError: ssh: Could not resolve hostname chromeos4-row3-rack13-host5: Name or service not known

cave-paladin/1771
** HWTest did not complete due to infrastructure issues (code 3) **
[Test-Logs]: provision_AutoUpdate.double: retry_count: 2, ABORT: None

nyan_big-paladin/3151
** HWTest failed (code 1) **
[Test-Logs]: provision_AutoUpdate.double: retry_count: 2, FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row5-rack10-host5: SSHConnectionError: ssh: Could not resolve hostname chromeos4-row5-rack10-host5: Name or service not known

kevin-paladin/2612
** HWTest did not complete due to infrastructure issues (code 3) **
[Test-Logs]: provision_AutoUpdate.double: retry_count: 2, ABORT: None

lumpy-paladin/29812
[Test-Logs]: provision_AutoUpdate.double: retry_count: 1, FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos6-row2-rack8-host10: SSHConnectionError: ssh: Could not resolve hostname chromeos6-row2-rack8-host10: Name or service not known

reef-uni-paladin/582
** HWTest did not complete due to infrastructure issues (code 3) **
[Test-Logs]: provision_AutoUpdate.double: retry_count: 2, ABORT: None

peach_pit-paladin/17253
** HWTest failed (code 1) **
[Test-Logs]: provision_AutoUpdate.double: retry_count: 2, FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos6-row2-rack10-host8: SSHConnectionError: ssh: Could not resolve hostname chromeos6-row2-rack10-host8: Name or service not known

link-paladin/29843
** HWTest failed (code 1) **
[Test-Logs]: provision_AutoUpdate.double: retry_count: 2, FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row5-rack13-host9: SSHConnectionError: ssh: Could not resolve hostname chromeos4-row5-rack13-host9: Name or service not known
---------------------

Details from veyron_mighty-paladin/6806

autoserv.DEBUG: https://storage.cloud.google.com/chromeos-autotest-results/146703469-chromeos-test/chromeos4-row6-rack11-host17/debug/autoserv.DEBUG?_ga=2.31926264.-734044362.1501703718

10/03 14:02:57.897 DEBUG| dev_server:2124| Start CrOS auto-update for host chromeos4-row6-rack11-host17 at 1 time(s).
10/03 14:02:57.900 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/cros_au?full_update=True&force_update=True&build_name=veyron_mighty-paladin/R63-9999.0.0-rc1&host_name=chromeos4-row6-rack11-host17&async=True&clobber_stateful=True"''
10/03 14:03:06.426 INFO | dev_server:1852| Received response from devserver for cros_au call: '[true, 31124]'
10/03 14:03:06.428 DEBUG| dev_server:1975| start process 31124 for auto_update in devserver
10/03 14:03:06.429 DEBUG| dev_server:1873| Check the progress for auto-update process 31124
10/03 14:03:06.430 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/get_au_status?full_update=True&force_update=True&pid=31124&build_name=veyron_mighty-paladin/R63-9999.0.0-rc1&host_name=chromeos4-row6-rack11-host17&clobber_stateful=True"''
10/03 14:03:14.952 DEBUG| dev_server:1909| Current CrOS auto-update status: CrOS update is just started.
10/03 14:03:25.001 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/get_au_status?full_update=True&force_update=True&pid=31124&build_name=veyron_mighty-paladin/R63-9999.0.0-rc1&host_name=chromeos4-row6-rack11-host17&clobber_stateful=True"''
10/03 14:03:33.546 DEBUG| dev_server:1978| Failed to trigger auto-update process on devserver
10/03 14:03:33.549 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/collect_cros_au_log?pid=31124&host_name=chromeos4-row6-rack11-host17"''
10/03 14:03:42.045 DEBUG| dev_server:1789| Saving auto-update logs into /usr/local/autotest/results/146703469-chromeos-test/autoupdate_logs/CrOS_update_chromeos4-row6-rack11-host17_31124.log
10/03 14:03:42.049 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/handler_cleanup?pid=31124&host_name=chromeos4-row6-rack11-host17"''
10/03 14:03:42.096 DEBUG| dev_server:0936| Error occurred with exit_code 255 when executing the ssh call: ssh_exchange_identification: Connection closed by remote host
...
10/03 14:04:55.712 DEBUG| utils:0212| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/kill_au_proc?pid=31124&host_name=chromeos4-row6-rack11-host17"''
10/03 14:05:04.189 DEBUG| dev_server:2189| Exception raised on auto_update attempt #1:
Traceback (most recent call last):
  File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 219, in TriggerAU
    clobber_stateful=self.clobber_stateful)
  File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 1001, in __init__
    payload_filename=payload_filename)
  File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 244, in __init__
    self.device_dev_dir = os.path.join(self.device.work_dir, 'src')
  File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 674, in work_dir
    capture_output=True).output.strip()
  File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 904, in BaseRunCommand
    return self.GetAgent().RemoteSh(cmd, **kwargs)
  File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 345, in RemoteSh
    raise SSHConnectionError(e.result.error)
SSHConnectionError: ssh: Could not resolve hostname chromeos4-row6-rack11-host17: Name or service not known

Johndhong - is this the same DNS failure that you just discovered on https://crbug.com/770632#c12 ?
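For anyone re-checking this later, here is a rough sketch of how the failing DUT hostnames could be pulled out of the failure text above and re-tested for DNS resolution. It is not part of autotest or chromite; the function names `unresolved_hosts` and `resolves` are made up for illustration, and port 22 is only assumed because the failing call was ssh. The result depends on the machine it runs on, so to reproduce the failure it would presumably have to run on the affected devserver, where the traceback was raised.

```python
import re
import socket

# Matches the hostname in messages such as:
#   "ssh: Could not resolve hostname chromeos4-row6-rack11-host17: Name or service not known"
HOST_RE = re.compile(r"Could not resolve hostname (\S+?):")


def unresolved_hosts(log_text):
    """Return the unique hostnames reported as unresolvable, in first-seen order."""
    hosts = []
    for host in HOST_RE.findall(log_text):
        if host not in hosts:
            hosts.append(host)
    return hosts


def resolves(host, resolver=socket.getaddrinfo):
    """Return True if `host` resolves on this machine, False on a lookup error.

    `resolver` is injectable (hypothetical convenience) so the check can be
    exercised without touching real DNS.
    """
    try:
        resolver(host, 22)  # port 22 assumed, since the failing call was ssh
        return True
    except socket.gaierror:
        return False
```

Feeding the pasted failure summary to `unresolved_hosts` and calling `resolves` on each result gives a quick yes/no per DUT. Short names like these only resolve where the lab search domain is configured, so a False result from a workstation says nothing about the devserver.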
,
Oct 3 2017
Likely, considering I fixed the devservers just now: https://b.corp.google.com/issues/67379413
,
Oct 3 2017
So in terms of timeline: from Sept. 28-29 until Oct. 3 15:40 there was a high probability of devserver DNS issues. Not sure if I should be the owner now, since I did the fix, or someone who should monitor the next run, so I'm kicking it over to the Infra deputy.
,
Oct 4 2017
So dnsmasq was just coincidental? I've been seeing failures since, and I thought before too ...
,
Oct 4 2017
Is the DNS fixed?
,
Oct 4 2017
Seems to still be happening: https://luci-milo.appspot.com/buildbot/chromeos/lumpy-chrome-pfq/10688
,
Oct 4 2017
There was one DNS error in the latest run (https://luci-milo.appspot.com/buildbot/chromeos/elm-paladin/4270)
,
Dec 1 2017
Comment 1 by djkurtz@chromium.org
, Oct 3 2017