Weird 'call devserver timeout' error
Issue description

This bug tracks a weird RPC error: host chromeos2-row6-rack6-host11 fails due to an RPC timeout error.

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/73508914-chromeos-test/chromeos2-row6-rack6-host11/debug/

The weird part is in the log:

08/16 06:23:20.477 DEBUG| base_utils:0185| Running 'ssh 100.115.245.200 'curl "http://100.115.245.200:8082/check_health?"''
08/16 06:24:20.478 ERROR| dev_server:0305| Devserver call failed: "http://100.115.245.200:8082/check_health?", timeout: 60 seconds, Error: Call is timed out.
08/16 06:24:20.481 DEBUG| autoupdater:0495| Error happens in connection to devserver: ChromiumOSError('Update server at http://100.115.245.200:8082 not healthy',)
08/16 06:24:20.482 WARNI| cros_host:0839| Autoupdate did not complete.
08/16 06:24:20.483 ERROR|provision_AutoUpda:0149| Update server at http://100.115.245.200:8082 not available
08/16 06:24:20.484 WARNI| test:0606| Autotest caught exception when running test: ...

The devserver's log is normal:

"::ffff:127.0.0.1 - - [16/Aug/2016:06:24:20] "GET /check_health HTTP/1.1" 200 450 "" "curl/7.35.0""

So there is no retry logging, and no 60s appear to have passed, yet the error is reported as 'Call is timed out'.

I can't reproduce it from a local autotest run, and I can't tell from the code flow why the logging looks like this. If more examples turn up in the future, it may be worth investigating.
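As a rough aid for reading the timestamps above, the following Python sketch computes how long the client-side call took and where the devserver's logged request falls relative to the start of that call. It assumes the autotest host and the devserver have synchronized clocks and that the autotest log is from 2016 (the autotest log lines omit the year; it is taken from the devserver entry). The constant and function names are illustrative only and do not come from autotest.

```python
from datetime import datetime

# Timestamps copied from the log excerpts in the issue description.
# Assumption: the autotest log omits the year, so 2016 is borrowed
# from the devserver log entry.
CALL_STARTED = "2016 08/16 06:23:20.477"   # base_utils: Running 'ssh ... curl ...'
CALL_FAILED = "2016 08/16 06:24:20.478"    # dev_server: Devserver call failed
DEVSERVER_REQUEST = "16/Aug/2016:06:24:20"  # devserver access log (1s resolution)

AUTOTEST_FORMAT = "%Y %m/%d %H:%M:%S.%f"
DEVSERVER_FORMAT = "%d/%b/%Y:%H:%M:%S"


def seconds_between(start, end):
    """Return the number of seconds from one datetime to another."""
    return (end - start).total_seconds()


started_at = datetime.strptime(CALL_STARTED, AUTOTEST_FORMAT)
failed_at = datetime.strptime(CALL_FAILED, AUTOTEST_FORMAT)
served_at = datetime.strptime(DEVSERVER_REQUEST, DEVSERVER_FORMAT)

# Time between issuing the ssh/curl command and the timeout error.
client_elapsed = seconds_between(started_at, failed_at)

# Offset of the devserver's logged request from the start of the call.
# The devserver log only has whole seconds, so this is accurate to ~1s.
server_offset = seconds_between(started_at, served_at)

print("client-side elapsed: %.3f s" % client_elapsed)
print("devserver request logged %.3f s after the call started" % server_offset)
```

Under those assumptions, the client-side log does show the full 60-second timeout elapsing, and the devserver's request is logged at roughly the moment the timeout fired rather than when the call was issued.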
Aug 18 2016
elm-paladin failed because of devserver 4-1, a server that syslab plans to replace. Please remove 100.107.160.1 from the shadow config. chromeos2-devserver8 seems to be functioning now.
Nov 9 2016
Re #2: the devserver seems to have been functioning normally at the time of the call, but it may still be worth understanding the failure. It seems that the first check_health call failed after 60s. A second one may have been made right after the first failure, and that might be the one shown in the devserver log (assuming the clocks are synchronized). But as Xixuan says, the code flow may not allow this explanation.
Nov 10 2017
Issue has not been modified or commented on in the last 365 days, please re-open or file a new bug if this is still an issue. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Comment 1 by xixuan@chromium.org, Aug 18 2016