Issue 730929

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Jan 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



[guado_moblab-paladin] Flaky errors of moblab_RunSuite FAIL: Unhandled AutoservRunError: command execution error

Project Member Reported by mojahsu@chromium.org, Jun 8 2017

Issue description

There have been flaky errors since build 6182.
It looks like the DUT (chromeos2-row2-rack8-host11) sometimes fails to come back from the reboot after provisioning.

Link to build or pfq page.
https://uberchromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin/builds/6182
https://uberchromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin/builds/6184
https://uberchromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin/builds/6185
https://uberchromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin/builds/6189

build # for that buildbot.
6182
6184
6185
6189

Snippet of log that contains the failure.

-------
from moblab_RunSuite.tgz

START moblab_RunSuite moblab_RunSuite timestamp=1496843442  localtime=Jun 07 06:50:42 
  FAIL  moblab_RunSuite moblab_RunSuite timestamp=1496845772  localtime=Jun 07 07:29:32 Unhandled AutoservRunError: command execution error
  * Command:                                                                        
      /usr/bin/ssh -a -x     -o StrictHostKeyChecking=no -o                         
      UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o         
      ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4   
      -o Protocol=2 -l root -p 22 chromeos2-row2-rack8-host11 "export               
      LIBC_FATAL_STDERR_=1; if type \"logger\" > /dev/null 2>&1; then logger        
      -tag \"autotest\" \"server[stack::run_once|run_as_moblab|wrapper] ->          
      ssh_run(su - moblab -c '/usr/local/autotest/site_utils/run_suite.py           
      --pool='' --board=cyan --build=cyan-release/R57-9202.66.0                     
      --suite_name=dummy_server')\";fi; su - moblab -c                              
      '/usr/local/autotest/site_utils/run_suite.py --pool='' --board=cyan           
      --build=cyan-release/R57-9202.66.0 --suite_name=dummy_server'"                
  Exit status: 3                                                                    
  Duration: 2193.00971985                                                           
                                                                                    
  stdout:                                                                           
  ^[[?25h^[[?0c                                                                     
  stderr:                                                                           
  Autotest instance: localhost                                                      
  06-07-2017 [06:52:04] Submitted create_suite_job rpc                              
  06-07-2017 [06:52:10] Created suite job: http://localhost/afe/#tab_id=view_job&object_id=1
  @@@STEP_LINK@Link to suite@http://localhost/afe/#tab_id=view_job&object_id=1@@@
  The suite job has another 23:29:59.019311 till timeout.                           
  06-07-2017 [07:28:30] Suite job is finished.                                      
  06-07-2017 [07:28:30] Start collecting test results and dump them to json.        
  Suite job          [ PASSED ]                                                     
  dummy_PassServer   [ PASSED ]                                                     
  provision          [ FAILED ]                                                     
  provision            ABORT: Host did not return from reboot                       
  dummy_PassServer   [ PASSED ]                                                     
                                                                                    
  Suite timings:                                                                    
  Downloads started at 2017-06-07 06:52:04                                          
  Payload downloads ended at 2017-06-07 06:52:09                                    
  Suite started at 2017-06-07 06:52:16                                              
  Artifact downloads ended (at latest) at 2017-06-07 06:52:17                       
  Testing started at 2017-06-07 06:52:28                                            
  Testing ended at 2017-06-07 07:24:52                      
-----------------------------------------
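For triaging logs like the snippet above, here is a minimal sketch of pulling the failed entries and their reasons out of the run_suite summary. It assumes only the line layout visible in this log ("name [ STATUS ]", optionally followed by a "name KIND: reason" line). The function name `failed_tests` and both regular expressions are hypothetical and not part of autotest.

```python
import re

# Summary lines, e.g. "provision          [ FAILED ]" (layout assumed from the log above).
STATUS_LINE = re.compile(r"^\s*(?P<name>\S.*?)\s+\[\s*(?P<status>[A-Z]+)\s*\]\s*$")
# Reason lines, e.g. "provision            ABORT: Host did not return from reboot".
REASON_LINE = re.compile(r"^\s*(?P<name>\S+)\s+(?P<kind>[A-Z]+):\s+(?P<reason>.+?)\s*$")


def failed_tests(log_text):
    """Return (name, reason) for every summary entry marked FAILED.

    reason is None when the next line does not carry a reason for that entry.
    """
    failures = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        status = STATUS_LINE.match(line)
        if not status or status.group("status") != "FAILED":
            continue
        reason = None
        if i + 1 < len(lines):
            detail = REASON_LINE.match(lines[i + 1])
            if detail and detail.group("name") == status.group("name"):
                reason = "{}: {}".format(detail.group("kind"), detail.group("reason"))
        failures.append((status.group("name"), reason))
    return failures


# Sample copied from the stderr summary in this report.
SAMPLE = """\
  Suite job          [ PASSED ]
  dummy_PassServer   [ PASSED ]
  provision          [ FAILED ]
  provision            ABORT: Host did not return from reboot
  dummy_PassServer   [ PASSED ]
"""

if __name__ == "__main__":
    for name, reason in failed_tests(SAMPLE):
        print(name, "->", reason)
```

Running it on the summary above singles out the provision step, which matches the observation that the host did not return from reboot while the dummy_PassServer tests themselves passed.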


---------------------------------
From HWTest stdio

chromeos-server22-282: 369b93bde2e76810 1
  Autotest instance: cautotest
  06-07-2017 [06:33:02] Created suite job: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=122020013
  @@@STEP_LINK@Link to suite@http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=122020013@@@
  The suite job has another 0:59:50.019471 till timeout.
  
  06-07-2017 [07:18:03] printing summary of incomplete jobs (1):
  
  moblab_DummyServerSuite: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=122020038
  06-07-2017 [07:31:47] Suite job is finished.
  06-07-2017 [07:31:47] Start collecting test results and dump them to json.
  Suite job         [ PASSED ]
  moblab_RunSuite   [ FAILED ]
  moblab_RunSuite     FAIL: Unhandled AutoservRunError: command execution error
  
  Suite timings:
  Downloads started at 2017-06-07 06:32:51
  Payload downloads ended at 2017-06-07 06:32:59
  Suite started at 2017-06-07 06:33:23
  Artifact downloads ended (at latest) at 2017-06-07 06:33:27
  Testing started at 2017-06-07 06:50:42
  Testing ended at 2017-06-07 07:29:32
  
  
  Links to test logs:
  Suite job http://cautotest/tko/retrieve_logs.cgi?job=/results/122020013-chromeos-test/
  moblab_RunSuite http://cautotest/tko/retrieve_logs.cgi?job=/results/122020038-chromeos-test/
..
..
************************************************************
** Finished Stage HWTest [moblab_quick] - Wed, 07 Jun 2017 07:32:18 -0700 (PDT)
************************************************************
07:32:18: ERROR: BaseException in _RunParallelStages <class 'chromite.lib.failures_lib.StepFailure'>: 
Traceback (most recent call last):
  File "/b/cbuild/chromite/lib/parallel.py", line 441, in _Run
    self._task(*self._task_args, **self._task_kwargs)
  File "/b/cbuild/chromite/cbuildbot/stages/generic_stages.py", line 649, in Run
    raise failures_lib.StepFailure()
StepFailure
Traceback (most recent call last):
  File "/b/cbuild/chromite/cbuildbot/builders/generic_builders.py", line 119, in _RunParallelStages
    parallel.RunParallelSteps(steps)
  File "/b/cbuild/chromite/lib/parallel.py", line 678, in RunParallelSteps
    return [queue.get_nowait() for queue in queues]
  File "/b/cbuild/chromite/lib/parallel.py", line 675, in RunParallelSteps
    pass
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/b/cbuild/chromite/lib/parallel.py", line 561, in ParallelTasks
    raise BackgroundFailure(exc_infos=errors)
BackgroundFailure: <class 'chromite.lib.failures_lib.StepFailure'>: 
Traceback (most recent call last):
  File "/b/cbuild/chromite/lib/parallel.py", line 441, in _Run
    self._task(*self._task_args, **self._task_kwargs)
  File "/b/cbuild/chromite/cbuildbot/stages/generic_stages.py", line 649, in Run
    raise failures_lib.StepFailure()
StepFailure

07:32:18: INFO: Running cidb query on pid 20237, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fe1b23ac350; Select object>
07:32:18: INFO: Running cidb query on pid 20237, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fe1b23ac490; Select object>
06:26:46: INFO: Created cidb engine bot@130.211.191.11 for pid 20255
06:26:46: INFO: Running cidb query on pid 20255, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x7fe1b26ad890>
-----------------------------------------------------------------------------------------------------------------------
  

 
Labels: OS-Chrome
Cc: dgarr...@chromium.org
Labels: -Pri-3 Pri-2
Builder: guado_moblab-paladin

Builds:  https://chromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin/builds/6206
         https://chromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin/builds/6205

Logs from https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/122431308-chromeos-test/chromeos2-row2-rack8-host11/:
@@@STEP_LINK@[Test-Logs]: dummy_PassServer: retry_count: 1, GOOD: completed successfully@http://localhost/tko/retrieve_logs.cgi?job=/results/5-moblab/@@@
@@@STEP_LINK@[Flake-Dashboard]: dummy_PassServer@https://wmatrix.googleplex.com/retry_teststats/?days_back=30&tests=dummy_PassServer@@@
Will return from run_suite with status: WARNING
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 818, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 471, in execute
    dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 348, in _call_run_once_with_retry
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 381, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/server/site_tests/moblab_RunSuite/moblab_RunSuite.py", line 65, in run_once
    raise e
AutoservRunError: command execution error
* Command: 
    /usr/bin/ssh -a -x     -o StrictHostKeyChecking=no -o
    UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o
    ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4
    -o Protocol=2 -l root -p 22 chromeos2-row2-rack8-host11 "export
    LIBC_FATAL_STDERR_=1; if type \"logger\" > /dev/null 2>&1; then logger
    -tag \"autotest\" \"server[stack::run_once|run_as_moblab|wrapper] ->
    ssh_run(su - moblab -c '/usr/local/autotest/site_utils/run_suite.py
    --pool='' --board=cyan --build=cyan-release/R59-9460.60.0
    --suite_name=dummy_server --retry=True --max_retries=1')\";fi; su - moblab
    -c '/usr/local/autotest/site_utils/run_suite.py --pool='' --board=cyan
    --build=cyan-release/R59-9460.60.0 --suite_name=dummy_server --retry=True
    --max_retries=1'"
Exit status: 2
Duration: 2246.62287402
Cc: sbasi@chromium.org
Owner: dhadd...@chromium.org
This is the same reason it's marked experimental. My vague memory is that dhaddock@ has been trying to resolve that.
Nope, not me. Don't know about this test 
Owner: ----

Comment 6 by sbasi@chromium.org, Jun 27 2017

Owner: haddowk@chromium.org
I believe you meant to assign to Keith Haddow. Though this bug may be old news by now.
Status: WontFix (was: Untriaged)
There are various versions of this bug. We have been mostly stable; moblab is experimental at this time, but not because of anything related to this issue.

Marking as obsolete, since there are newer tracking bugs, mostly related to networking issues.
