Issue 720000

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: May 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



provision consistently fail in testing push due to RootfsUpdateError

Project Member Reported by shuqianz@chromium.org, May 9 2017

Issue description

The dummy suite consistently fails at provision in testing push because the beaglebone_servo image does not exist.

http://chromeos-shard2-staging.hot.corp.google.com/afe/#tab_id=view_host&object_id=1

Using devserver url: http://100.115.219.133:8082/update/beaglebone_servo-release/R54-8743.44.0 to trigger update on servo host chromeos4-row10-rack9-host15-servo, from 8866.0.0 to 8743.44.0
05/09 05:21:10.993 INFO |        dev_server:1094| Staging artifacts on devserver http://100.115.219.133:8082: build=beaglebone_servo-release/R54-8743.44.0, artifacts=['full_payload'], files=, archive_url=gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0
05/09 05:21:10.995 DEBUG|             utils:0202| Running 'ssh 100.115.219.133 'curl "http://100.115.219.133:8082/stage?artifacts=full_payload&files=&async=True&archive_url=gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0"''
05/09 05:21:12.134 DEBUG|        dev_server:1036| response for RPC: 'Success'
05/09 05:21:12.135 DEBUG|             utils:0202| Running 'ssh 100.115.219.133 'curl "http://100.115.219.133:8082/is_staged?artifacts=full_payload&files=&archive_url=gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0"''
05/09 05:21:13.266 DEBUG|        dev_server:0294| RPC call call_and_wait has timed out on devserver 100.115.219.133.
05/09 05:21:13.268 DEBUG|        dev_server:0294| RPC call stage_artifacts has timed out on devserver 100.115.219.133.
05/09 05:21:13.268 ERROR|        servo_host:0574| Staging artifacts failed: 


    
500 Internal Server Error
The server encountered an unexpected condition which prevented it from fulfilling the request.
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
  File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 34, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/home/chromeos-test/chromiumos/src/platform/dev/devserver.py", line 788, in is_staged
    response = str(dl.IsStaged(factory))
  File "/home/chromeos-test/chromiumos/src/platform/dev/downloader.py", line 216, in IsStaged
    raise DownloaderException(exceptions)
DownloaderException: return code: 1; command: /home/chromeos-test/chromeos-cache/common/gsutil_4.19.tar.gz/gsutil/gsutil -o 'Boto:num_retries=10' ls -- gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0
CommandException: One or more URLs matched no objects.

cmd=['/home/chromeos-test/chromeos-cache/common/gsutil_4.19.tar.gz/gsutil/gsutil', '-o', 'Boto:num_retries=10', 'ls', '--', u'gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0'], extra env={'BOTO_CONFIG': '/home/chromeos-test/.boto'}
Traceback (most recent call last):
  File "/home/chromeos-test/chromiumos/src/platform/dev/build_artifact.py", line 337, in Process
    self.name, self.is_regex_name, timeout)
  File "/home/chromeos-test/chromiumos/src/platform/dev/downloader.py", line 329, in Wait
    is_regex_pattern=is_regex_name)
  File "/home/chromeos-test/chromiumos/chromite/lib/gs.py", line 1257, in GetGsNamesWithWait
    lambda x: not x, _GetGsName, timeout=timeout, period=period)
  File "/home/chromeos-test/chromiumos/chromite/lib/timeout_util.py", line 292, in WaitForSuccess
    return retry()
  File "/home/chromeos-test/chromiumos/chromite/lib/timeout_util.py", line 270, in retry
    value = func(*func_args, **func_kwargs)
  File "/home/chromeos-test/chromiumos/chromite/lib/gs.py", line 1235, in _GetGsName
    uploaded_list = [os.path.basename(p.url) for p in self.List(url)]
  File "/home/chromeos-test/chromiumos/chromite/lib/gs.py", line 938, in List
    lines = self.DoCommand(cmd, **kwargs).output.splitlines()
  File "/home/chromeos-test/chromiumos/chromite/lib/gs.py", line 781, in DoCommand
    extra_env=extra_env, **kwargs)
  File "/home/chromeos-test/chromiumos/chromite/lib/retry_stats.py", line 180, in RetryWithStats
    *args, **kwargs)
  File "/home/chromeos-test/chromiumos/chromite/lib/retry_util.py", line 128, in GenericRetry
    if not handler(e):
  File "/home/chromeos-test/chromiumos/chromite/lib/gs.py", line 661, in _RetryFilter
    raise GSNoSuchKey(e)
GSNoSuchKey: return code: 1; command: /home/chromeos-test/chromeos-cache/common/gsutil_4.19.tar.gz/gsutil/gsutil -o 'Boto:num_retries=10' ls -- gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0
CommandException: One or more URLs matched no objects.

cmd=['/home/chromeos-test/chromeos-cache/common/gsutil_4.19.tar.gz/gsutil/gsutil', '-o', 'Boto:num_retries=10', 'ls', '--', u'gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0'], extra env={'BOTO_CONFIG': '/home/chromeos-test/.boto'}


    
Powered by CherryPy 3.2.2

The beaglebone_servo image beaglebone_servo-release/R54-8743.44.0 has already been wiped from the image bucket.

If I run:
chromeos-test@chromeos-autotest:~$ curl "http://100.115.219.133:8082/is_staged?artifacts=full_payload&files=&archive_url=gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0"
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
    <title>500 Internal Server Error</title>
    <style type="text/css">
    #powered_by {
        margin-top: 20px;
        border-top: 2px solid black;
        font-style: italic;
    }

    #traceback {
        color: red;
    }
    </style>
</head>
    <body>
        <h2>500 Internal Server Error</h2>
        <p>The server encountered an unexpected condition which prevented it from fulfilling the request.</p>
        <pre id="traceback">Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
  File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 34, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/home/chromeos-test/chromiumos/src/platform/dev/devserver.py", line 788, in is_staged
    response = str(dl.IsStaged(factory))
  File "/home/chromeos-test/chromiumos/src/platform/dev/downloader.py", line 216, in IsStaged
    raise DownloaderException(exceptions)
DownloaderException: return code: 1; command: /home/chromeos-test/chromeos-cache/common/gsutil_4.19.tar.gz/gsutil/gsutil -o 'Boto:num_retries=10' ls -- gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0
CommandException: One or more URLs matched no objects.

cmd=['/home/chromeos-test/chromeos-cache/common/gsutil_4.19.tar.gz/gsutil/gsutil', '-o', 'Boto:num_retries=10', 'ls', '--', u'gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0'], extra env={'BOTO_CONFIG': '/home/chromeos-test/.boto'}
Traceback (most recent call last):
  File "/home/chromeos-test/chromiumos/src/platform/dev/build_artifact.py", line 337, in Process
    self.name, self.is_regex_name, timeout)
  File "/home/chromeos-test/chromiumos/src/platform/dev/downloader.py", line 329, in Wait
    is_regex_pattern=is_regex_name)
  File "/home/chromeos-test/chromiumos/chromite/lib/gs.py", line 1257, in GetGsNamesWithWait
    lambda x: not x, _GetGsName, timeout=timeout, period=period)
  File "/home/chromeos-test/chromiumos/chromite/lib/timeout_util.py", line 292, in WaitForSuccess
    return retry()
  File "/home/chromeos-test/chromiumos/chromite/lib/timeout_util.py", line 270, in retry
    value = func(*func_args, **func_kwargs)
  File "/home/chromeos-test/chromiumos/chromite/lib/gs.py", line 1235, in _GetGsName
    uploaded_list = [os.path.basename(p.url) for p in self.List(url)]
  File "/home/chromeos-test/chromiumos/chromite/lib/gs.py", line 938, in List
    lines = self.DoCommand(cmd, **kwargs).output.splitlines()
  File "/home/chromeos-test/chromiumos/chromite/lib/gs.py", line 781, in DoCommand
    extra_env=extra_env, **kwargs)
  File "/home/chromeos-test/chromiumos/chromite/lib/retry_stats.py", line 180, in RetryWithStats
    *args, **kwargs)
  File "/home/chromeos-test/chromiumos/chromite/lib/retry_util.py", line 128, in GenericRetry
    if not handler(e):
  File "/home/chromeos-test/chromiumos/chromite/lib/gs.py", line 661, in _RetryFilter
    raise GSNoSuchKey(e)
GSNoSuchKey: return code: 1; command: /home/chromeos-test/chromeos-cache/common/gsutil_4.19.tar.gz/gsutil/gsutil -o 'Boto:num_retries=10' ls -- gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0
CommandException: One or more URLs matched no objects.

cmd=['/home/chromeos-test/chromeos-cache/common/gsutil_4.19.tar.gz/gsutil/gsutil', '-o', 'Boto:num_retries=10', 'ls', '--', u'gs://chromeos-image-archive/beaglebone_servo-release/R54-8743.44.0'], extra env={'BOTO_CONFIG': '/home/chromeos-test/.boto'}

</pre>
    <div id="powered_by">
    <span>Powered by <a href="http://www.cherrypy.org">CherryPy 3.2.2</a></span>
    </div>
    </body>
</html>

It complains that the URL matched no objects. How does the suite use that image? Where do we set the servo image for the test to use?
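To check other builds for the same problem, the devserver response can be classified from a script. This is a minimal sketch, assuming the `is_staged` URL layout shown in the curl command above; the helper names `is_staged_url` and `missing_in_gs` are hypothetical and are not part of autotest or devserver.

```python
from urllib.parse import urlencode

# Bucket name taken from the archive_url in the log above.
GS_ARCHIVE = "gs://chromeos-image-archive"


def is_staged_url(devserver, build, artifacts="full_payload"):
    """Build an is_staged query URL in the shape seen in the log (hypothetical helper)."""
    query = urlencode(
        {
            "artifacts": artifacts,
            "files": "",
            "archive_url": "%s/%s" % (GS_ARCHIVE, build),
        },
        safe=":/",  # keep gs:// unescaped, matching the logged URL
    )
    return "%s/is_staged?%s" % (devserver, query)


def missing_in_gs(response_body):
    """Return True if the response carries gsutil's 'no objects' error text."""
    return "One or more URLs matched no objects" in response_body
```

Fetching the URL is left out on purpose: the devserver answers with an HTTP 500 in this case, so a caller would need to read the error body rather than treat the request as a plain failure.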
 
I think the servo image error is a red herring.

05/09 05:21:13.268 ERROR|        servo_host:0575| Abandoning update for this cycle.

Everything else passes:

START	----	provision	timestamp=1494332461	localtime=May 09 05:21:01	
	GOOD	----	verify.servo_ssh	timestamp=1494332465	localtime=May 09 05:21:05	
	GOOD	----	verify.update	timestamp=1494332473	localtime=May 09 05:21:13	
	GOOD	----	verify.brd_config	timestamp=1494332473	localtime=May 09 05:21:13	
	GOOD	----	verify.ser_config	timestamp=1494332474	localtime=May 09 05:21:14	
	GOOD	----	verify.job	timestamp=1494332474	localtime=May 09 05:21:14	
	GOOD	----	verify.servod	timestamp=1494332477	localtime=May 09 05:21:17	
	GOOD	----	verify.pwr_button	timestamp=1494332477	localtime=May 09 05:21:17	
	GOOD	----	verify.lid_open	timestamp=1494332478	localtime=May 09 05:21:18	
	GOOD	----	verify.PASS	timestamp=1494332478	localtime=May 09 05:21:18	
	START	provision_AutoUpdate	provision_AutoUpdate	timestamp=1494332478	localtime=May 09 05:21:18	
		START	----	----	timestamp=1494332492	localtime=May 09 05:21:32	
			GOOD	----	sysinfo.before	timestamp=1494332493	localtime=May 09 05:21:33	
		END GOOD	----	----	timestamp=1494332493	localtime=May 09 05:21:33	
		FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1494333162	localtime=May 09 05:32:42	Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row10-rack9-host15: RootfsUpdateError: Failed to perform rootfs update: DevServerStartupError('Timeout (30) waiting for remote devserver port_file',)
  Traceback (most recent call last):
    File "/usr/local/autotest/client/common_lib/test.py", line 817, in _call_test_function
      return func(*args, **dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 470, in execute
      dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 347, in _call_run_once_with_retry
      postprocess_profiled_run, args, dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 380, in _call_run_once
      self.run_once(*args, **dargs)
    File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 113, in run_once
      force_full_update=force)
    File "/usr/local/autotest/server/afe_utils.py", line 206, in machine_install_and_update_labels
      *args, **dargs)
    File "/usr/local/autotest/server/hosts/cros_host.py", line 748, in machine_install_by_devserver
      full_update=force_full_update)
    File "/usr/local/autotest/client/common_lib/cros/dev_server.py", line 2141, in auto_update
      raise DevServerException(error_msg % (host_name, error_list[0]))
  DevServerException: CrOS auto-update failed for host chromeos4-row10-rack9-host15: RootfsUpdateError: Failed to perform rootfs update: DevServerStartupError('Timeout (30) waiting for remote devserver port_file',)
	END FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1494333162	localtime=May 09 05:32:42	
END FAIL	----	provision	timestamp=1494333162	localtime=May 09 05:32:42	
INFO	----	----	timestamp=1494333162	job_abort_reason=	localtime=May 09 05:32:42	

The real error is the RootfsUpdateError that has been coming up in provision everywhere (cyan-paladin, for example).
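The status log above can be scanned mechanically to confirm that only provision_AutoUpdate fails. This is a rough sketch, assuming the fields are tab-separated in the order seen above (status, subdir, test name, timestamp, localtime, optional reason); `failed_steps` is a hypothetical helper, not an autotest function.

```python
def failed_steps(status_log):
    """Return (test_name, reason) for each FAIL entry in a status log."""
    failures = []
    for line in status_log.splitlines():
        # Nesting is expressed with leading tabs; strip them before splitting.
        fields = line.strip().split("\t")
        if fields[0] != "FAIL":
            continue  # skips GOOD, START, "END FAIL" and traceback lines
        name = fields[2] if len(fields) > 2 else ""
        reason = fields[5] if len(fields) > 5 else ""
        failures.append((name, reason))
    return failures
```

Run against the log above, this should report the provision_AutoUpdate entry with its DevServerException text and none of the verify.* steps, provided the field layout assumption holds.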
Summary: provision consistently fail in testing push due to RootfsUpdateError (was: provision consistently fail in testing push due to not existing beaglebone_servo image)
Blockedon: 719786
Add a link for now
Cc: xixuan@chromium.org
I think the process is this:
1. Provision test started to run
2. Ask devserver 100.115.219.133 to stage a servo image, which is /beaglebone_servo-release/R54-8743.44.0. The image is not there. Failed.
3. Staging artifacts failed
4. Abandoned update for this cycle
5. au-update failed all three attempts. 

I think the staging failure may be related to the au-update failure.
Blockedon: -719786
The servo staging failure is not related to this provision failure, and this failure is different from the cyan-paladin provision failure shown in Issue 719786.

This provision failure should be fixed by CL 495651, which requires a devserver push. So keep deputy as the owner.
> 2. Ask devserver 100.115.219.133 to stage a servo image, which is
> /beaglebone_servo-release/R54-8743.44.0. The image is not there. Failed.

The immediate fix is to set the beaglebone stable image in the staging
instance to an extant build.

The long term fix is to copy the setting from the prod instance prior to
running push to prod test.
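As an illustration of the long-term fix, the copy step could also skip builds that no longer exist in the image archive. This is only a sketch under stated assumptions: settings are modelled as plain dicts mapping a board to a build name, and `sync_stable_versions` and the injected `build_exists` callable (for example, a wrapper around `gsutil ls`) are hypothetical names, not the actual AFE interface.

```python
def sync_stable_versions(prod, staging, build_exists):
    """Copy prod's stable-version settings into staging, skipping dead builds.

    prod, staging: dicts mapping board name -> build name (assumed shape).
    build_exists: callable(build) -> bool, supplied by the caller.
    Returns (updated_staging, skipped_boards); the inputs are not modified.
    """
    updated = dict(staging)
    skipped = []
    for board, build in prod.items():
        if build_exists(build):
            updated[board] = build
        else:
            # Leave the staging value untouched and report the board instead.
            skipped.append(board)
    return updated, skipped
```

Returning the skipped boards rather than raising lets the push continue while still surfacing settings that point at a deleted image.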

Labels: -Pri-1 Pri-3
The servo image staging isn't failing test push.
Owner: shuqianz@chromium.org
+shuqianz for low pri test push fix
Status: WontFix (was: Untriaged)
