
Issue 638450

Starred by 1 user

Issue metadata

Status: Archived
Owner: ----
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: ----




Weird 'call devserver timeout' error

Project Member Reported by xixuan@chromium.org, Aug 17 2016

Issue description

This bug is used for tracking a weird RPC error:


Host chromeos2-row6-rack6-host11 fails due to a timeout RPC error.

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/73508914-chromeos-test/chromeos2-row6-rack6-host11/debug/

The weird part is in the log:

08/16 06:23:20.477 DEBUG|        base_utils:0185| Running 'ssh 100.115.245.200 'curl "http://100.115.245.200:8082/check_health?"''
08/16 06:24:20.478 ERROR|        dev_server:0305| Devserver call failed: "http://100.115.245.200:8082/check_health?", timeout: 60 seconds, Error: Call is timed out.
08/16 06:24:20.481 DEBUG|       autoupdater:0495| Error happens in connection to devserver: ChromiumOSError('Update server at http://100.115.245.200:8082 not healthy',)
08/16 06:24:20.482 WARNI|         cros_host:0839| Autoupdate did not complete.
08/16 06:24:20.483 ERROR|provision_AutoUpda:0149| Update server at http://100.115.245.200:8082 not available
08/16 06:24:20.484 WARNI|              test:0606| Autotest caught exception when running test:
 ...

The devserver's log is normal: "::ffff:127.0.0.1 - - [16/Aug/2016:06:24:20] "GET /check_health HTTP/1.1" 200 450 "" "curl/7.35.0"". 

So the devserver log shows no retries, and the request did not take 60s on the devserver side. But the error is reported as 'Call is timed out'.

I can't reproduce it from a local autotest run, and I can't tell from the code flow why the logging looks like this. If more examples show up in the future, it may be worth investigating.
 

Comment 2 by dshi@chromium.org, Aug 18 2016

elm-paladin failed because of devserver 4-1, which is a server that syslab plans to replace. Please remove 100.107.160.1 from the shadow config.

chromeos2-devserver8 seems to be functioning now.
#2: The devserver seems to have been functioning normally at the time of the call. It may still be worth understanding the failure.

It seems that the first check_health call failed after 60s.  A second one may have been made right after the first failure, and that might be the one shown in the devserver log (assuming the clocks are synchronized).  But as Xixuan says, the code flow may not allow this explanation.
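To make the timing argument concrete, here is a minimal sketch that lines up the three timestamps quoted in the issue description. It assumes that both logs use the same timezone and that the clocks are synchronized, and it assumes the year 2016 for the autotest lines, since they carry no year. The helper names are hypothetical and are not part of autotest.

```python
from datetime import datetime


def parse_autotest_ts(stamp, year):
    """Parse an autotest log stamp such as '08/16 06:23:20.477'.

    The log line has no year, so the caller supplies one (an assumption).
    """
    return datetime.strptime(f"{year}/{stamp}", "%Y/%m/%d %H:%M:%S.%f")


def parse_access_log_ts(stamp):
    """Parse a devserver access-log stamp such as '16/Aug/2016:06:24:20'."""
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")


# Timestamps copied from the logs quoted in the issue description.
call_started = parse_autotest_ts("08/16 06:23:20.477", 2016)
call_failed = parse_autotest_ts("08/16 06:24:20.478", 2016)
served = parse_access_log_ts("16/Aug/2016:06:24:20")

# Time between issuing the ssh/curl command and the timeout error.
client_wait = (call_failed - call_started).total_seconds()
# Time between issuing the command and the devserver logging a 200.
start_to_served = (served - call_started).total_seconds()
# Gap between the devserver log entry and the timeout error.
served_to_failure = (call_failed - served).total_seconds()

print(client_wait, start_to_served, served_to_failure)
```

Under those assumptions, the devserver's 200 entry lands within the same second as the client-side timeout rather than near the start of the call. The access log has only one-second resolution, so this cannot tell whether the logged request is the original call or a second one made right after the failure.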

Comment 4 by sheriffbot@chromium.org, Nov 10 2017

Status: Archived (was: Unconfirmed)
Issue has not been modified or commented on in the last 365 days, please re-open or file a new bug if this is still an issue.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
