
Issue 621405

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: Nov 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




run_suite rpc timeout when fetching suite json summary

Project Member Reported by akes...@chromium.org, Jun 20 2016

Issue description

Build: https://uberchromegw.corp.google.com/i/chromeos/builders/wolf-paladin/builds/11384/steps/HWTest%20%5Bbvt-inline%5D/logs/stdio


************************************************************
** Start Stage HWTest [bvt-inline] - Sun, 19 Jun 2016 19:26:25 -0700 (PDT)
** 
** Stage that runs tests in the Autotest lab.
************************************************************
19:26:25: INFO: Waiting up to forever for payloads ...
19:29:16: INFO: RunCommand: /b/cbuild/internal_master/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /tmp/cbuildbot-tmpMhDn4l/tmp42rJhe/temp_summary.json --raw-cmd --task-name wolf-paladin/R53-8476.0.0-rc1-bvt-inline --dimension os Ubuntu-14.04 --dimension pool default --print-status-updates --timeout 9000 --io-timeout 9000 --hard-timeout 9000 --expiration 1200 -- /usr/local/autotest/site_utils/run_suite.py --build wolf-paladin/R53-8476.0.0-rc1 --board wolf --suite_name suite_attr_wrapper --pool cq --num 6 --file_bugs False --priority CQ --timeout_mins 90 --retry True --max_retries 10 --minimum_duts 4 --offload_failures_only True --suite_args "{'attr_filter': '(suite:bvt-inline) and (subsystem:default)'}" -c
Autotest instance: cautotest
06-19-2016 [19:29:21] Submitted create_suite_job rpc
06-19-2016 [19:30:34] Created suite job: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=67270482

...
... everything passes
...
...
login_GuestAndActualSession             [ PASSED ]
login_SameSessionTwice                  [ PASSED ]

Suite timings:
Downloads started at 2016-06-19 19:30:24
Payload downloads ended at 2016-06-19 19:30:31
Suite started at 2016-06-19 19:33:36
Artifact downloads ended (at latest) at 2016-06-19 19:33:44
Testing started at 2016-06-19 19:43:40
Testing ended at 2016-06-19 19:56:52


...


Output below this line is for buildbot consumption:
Will return from run_suite with status: OK
20:04:09: INFO: RunCommand: /b/cbuild/internal_master/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /tmp/cbuildbot-tmpMhDn4l/tmpDz3pcq/temp_summary.json --raw-cmd --task-name wolf-paladin/R53-8476.0.0-rc1-bvt-inline --dimension os Ubuntu-14.04 --dimension pool default --print-status-updates --timeout 9000 --io-timeout 9000 --hard-timeout 9000 --expiration 1200 -- /usr/local/autotest/site_utils/run_suite.py --build wolf-paladin/R53-8476.0.0-rc1 --board wolf --suite_name suite_attr_wrapper --pool cq --num 6 --file_bugs False --priority CQ --timeout_mins 90 --retry True --max_retries 10 --minimum_duts 4 --offload_failures_only True --suite_args "{'attr_filter': '(suite:bvt-inline) and (subsystem:default)'}" --json_dump -m 67270482
#JSON_START#{"return_code": 3, "return_message": "Unhandled run_suite exception: Call is timed out."}#JSON_END#
20:34:33: INFO: pass_subsystems: set([]), fail_subsystems: set([])

@@@STEP_FAILURE@@@
20:34:33: ERROR: ** HWTest did not complete due to infrastructure issues (code 3) **
 
Cc: fdeng@chromium.org dshi@chromium.org shuqianz@chromium.org
It looks like the suite completed fine and on time (run_suite returned status OK). However, the follow-up run_suite call that fetches the JSON summary (--json_dump) timed out and returned code 3.

Can we add a retry here?

Any idea why the rpc timed out?
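
For reference, the failure shows up in the `#JSON_START#...#JSON_END#` line of the log above. Below is a minimal sketch of how a caller might pull the summary out of that output. It assumes only the marker format visible in the log; `extract_suite_summary` is a hypothetical helper name and is not taken from chromite or autotest.

```python
import json
import re

# Marker format copied from the log line in this report.
_JSON_RE = re.compile(r"#JSON_START#(.*?)#JSON_END#", re.DOTALL)


def extract_suite_summary(output):
    """Return the dict found between the JSON markers, or None if absent.

    Hypothetical helper for illustration only.
    """
    match = _JSON_RE.search(output)
    if match is None:
        return None
    return json.loads(match.group(1))


# The exact line reported in the build log.
line = ('#JSON_START#{"return_code": 3, "return_message": '
        '"Unhandled run_suite exception: Call is timed out."}#JSON_END#')
summary = extract_suite_summary(line)
print(summary["return_code"], summary["return_message"])
```

A non-zero `return_code` with a timeout message, as here, says nothing about test results; it only says the summary fetch itself failed.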

Comment 2 by fdeng@chromium.org, Jun 20 2016

It timed out when run_suite tried to contact cautotest.
I believe it already retried: judging by the timestamps, there is a 30-minute delay between the command starting (20:04:09) and erroring out (20:34:33).

Whenever --json_dump is used, we disable all standard output, which would otherwise have shown that it kept retrying.

This looks like an AFE/RPC server problem.
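
One way to keep retries visible while stdout is reserved for the JSON summary would be to log retry attempts to stderr. The sketch below is only an illustration of that idea under stated assumptions: `call_with_retry` and its parameters are hypothetical names, and this is not the retry code that run_suite actually uses.

```python
import logging
import sys
import time

# Send log records to stderr so stdout can carry only the JSON summary.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
log = logging.getLogger("rpc_retry")


def call_with_retry(func, timeout_s, delay_s, exceptions=(Exception,),
                    clock=time.monotonic, sleep=time.sleep):
    """Call func() until it succeeds or the overall deadline would be exceeded.

    Hypothetical helper: retries on the given exception types, waits
    delay_s seconds between attempts, and re-raises the last error once
    another wait would pass the deadline.
    """
    deadline = clock() + timeout_s
    attempt = 0
    while True:
        attempt += 1
        try:
            return func()
        except exceptions as err:
            if clock() + delay_s > deadline:
                log.error("attempt %d failed (%s); giving up", attempt, err)
                raise
            log.warning("attempt %d failed (%s); retrying in %ss",
                        attempt, err, delay_s)
            sleep(delay_s)


# Usage: a stand-in RPC that fails twice before succeeding.
calls = {"n": 0}


def flaky_rpc():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("Call is timed out.")
    return "ok"


result = call_with_retry(flaky_rpc, timeout_s=60, delay_s=1,
                         sleep=lambda s: None)  # skip real sleeping in the demo
```

With something along these lines, a 30-minute retry window would leave a trail of warnings in the step log instead of silence.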


Comment 3 by autumn@chromium.org, Jun 21 2016

Status: Unconfirmed (was: Untriaged)

Comment 4 by autumn@chromium.org, Jul 12 2016

Labels: -current-issue
Owner: shuqianz@chromium.org
Status: WontFix (was: Unconfirmed)
It seems this has not happened again, so I am closing it for now. Feel free to reopen it if it comes back.
