Status: WontFix
Owner: ----
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Caroline canary: Failed to perform stateful update
Project Member Reported by waihong@chromium.org, Feb 9 2017
The Caroline canary failed in the PaygenTestDev phase:
https://uberchromegw.corp.google.com/i/chromeos/builders/caroline-release/builds/390

The log of the failed test autoupdate_EndToEndTest_paygen_au_dev_delta_9000.77.0:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/100321114-chromeos-test/chromeos2-row8-rack1-host13/debug

"""
02/09 09:02:21.305 DEBUG|      abstract_ssh:0357| Using Rsync.
02/09 09:02:21.306 DEBUG|        base_utils:0185| Running 'rsync -L  --timeout=1800 --rsh='/usr/bin/ssh -a -x   -o ControlPath=/tmp/_autotmp_MWvSF6ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g root@chromeos2-row8-rack1-host13:"/tmp/autoserv-0FIIfb/sysinfo.pickle" "/tmp/tmplaySVO"'
02/09 09:02:21.727 WARNI|              test:0606| Autotest caught exception when running test:
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 600, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 804, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 461, in execute
    dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 347, in _call_run_once_with_retry
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 376, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/server/site_tests/autoupdate_EndToEndTest/autoupdate_EndToEndTest.py", line 1816, in run_once
    self.run_update_test(test_platform, test_conf)
  File "/usr/local/autotest/server/site_tests/autoupdate_EndToEndTest/autoupdate_EndToEndTest.py", line 1697, in run_update_test
    test_platform.finalize_update()
  File "/usr/local/autotest/server/site_tests/autoupdate_EndToEndTest/autoupdate_EndToEndTest.py", line 1173, in finalize_update
    None, None, self._staged_urls.target_stateful_url, False)
  File "/usr/local/autotest/server/site_tests/autoupdate_EndToEndTest/autoupdate_EndToEndTest.py", line 940, in _update_via_test_payloads
    perform_update(stateful_url, True)
  File "/usr/local/autotest/server/site_tests/autoupdate_EndToEndTest/autoupdate_EndToEndTest.py", line 923, in perform_update
    updater.update_stateful(clobber=clobber)
  File "/usr/local/autotest/client/common_lib/cros/autoupdater.py", line 521, in update_stateful
    raise update_error
StatefulUpdateError: Failed to perform stateful update on chromeos2-row8-rack1-host13
"""

More detail from the execution of the stateful_update script:

"""
02/09 08:51:55.450 INFO |       autoupdater:0504| Updating stateful partition...
02/09 08:51:55.451 DEBUG|      abstract_ssh:0448| send_file. source: /usr/local/google/chromeos/src/platform/dev/stateful_update, dest: /tmp/stateful_update, delete_dest: True,preserve_symlinks:False
02/09 08:51:55.452 DEBUG|      abstract_ssh:0465| Using Rsync.
02/09 08:51:55.453 DEBUG|        base_utils:0185| Running 'rsync -L --delete --timeout=1800 --rsh='/usr/bin/ssh -a -x   -o ControlPath=/tmp/_autotmp_MW3YNBssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g "/usr/local/google/chromeos/src/platform/dev/stateful_update" "root@chromeos2-row8-rack1-host13:"/tmp/stateful_update""'
02/09 08:51:56.071 DEBUG|          ssh_host:0285| Running (ssh) '/tmp/stateful_update http://100.115.245.200:8082/static/caroline-release/R56-9000.84.0 2>&1'
02/09 08:51:56.636 DEBUG|        base_utils:0280| [stdout] Downloading stateful payload from http://100.115.245.200:8082/static/caroline-release/R56-9000.84.0/stateful.tgz
02/09 08:51:56.679 DEBUG|        base_utils:0280| [stdout]   HTTP/1.1 200 OK
02/09 08:51:56.680 DEBUG|        base_utils:0280| [stdout]   Date: Thu, 09 Feb 2017 16:51:56 GMT
02/09 08:51:56.680 DEBUG|        base_utils:0280| [stdout]   Server: Apache/2.4.7 (Ubuntu)
02/09 08:51:56.680 DEBUG|        base_utils:0280| [stdout]   Last-Modified: Thu, 09 Feb 2017 09:49:46 GMT
02/09 08:51:56.680 DEBUG|        base_utils:0280| [stdout]   ETag: "10dd2f81-54815e6c3bc0f"
02/09 08:51:56.680 DEBUG|        base_utils:0280| [stdout]   Accept-Ranges: bytes
02/09 08:51:56.681 DEBUG|        base_utils:0280| [stdout]   Content-Length: 282931073
02/09 08:51:56.681 DEBUG|        base_utils:0280| [stdout]   Keep-Alive: timeout=60, max=1000
02/09 08:51:56.681 DEBUG|        base_utils:0280| [stdout]   Connection: Keep-Alive
02/09 08:51:56.681 DEBUG|        base_utils:0280| [stdout]   Content-Type: application/x-gzip
02/09 09:01:55.594 DEBUG|        base_utils:0280| [stdout] 
02/09 09:01:55.595 DEBUG|        base_utils:0280| [stdout] gzip: stdin: unexpected end of file
02/09 09:01:55.595 DEBUG|        base_utils:0280| [stdout] tar: Unexpected EOF in archive
02/09 09:01:55.595 DEBUG|        base_utils:0280| [stdout] tar: Unexpected EOF in archive
02/09 09:01:55.595 DEBUG|        base_utils:0280| [stdout] tar: Error is not recoverable: exiting now
02/09 09:01:55.596 DEBUG|        base_utils:0280| [stdout] Downloading command returns code 2.
02/09 09:01:55.596 DEBUG|        base_utils:0280| [stdout] Downloading failed, retrying.
02/09 09:01:55.596 DEBUG|        base_utils:0280| [stdout] 
02/09 09:01:55.596 DEBUG|        base_utils:0280| [stdout] gzip: stdin: unexpected end of file
02/09 09:01:55.597 DEBUG|        base_utils:0280| [stdout] tar: Child returned status 1
02/09 09:01:55.597 DEBUG|        base_utils:0280| [stdout] tar: Error is not recoverable: exiting now
02/09 09:01:55.597 DEBUG|        base_utils:0280| [stdout] Downloading command returns code 2.
"""

The script reported that the download failed.
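
For reference, the gap between the last HTTP header line and the first gzip error in the log above can be computed directly from the two timestamps. This is only a rough sketch: the helper name `elapsed_seconds` is made up for illustration, and the year 2017 is taken from the report date because the log lines omit it. The number says how long the download ran before it ended, not why it ended.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above.
LAST_HEADER_LINE = "02/09 08:51:56.681"   # "Content-Type: application/x-gzip"
FIRST_ERROR_LINE = "02/09 09:01:55.594"   # first line logged after the headers


def elapsed_seconds(start, end, year=2017):
    """Return the seconds between two autotest-style 'MM/DD HH:MM:SS.mmm' stamps.

    The year is an assumption supplied by the caller; the log does not record it.
    """
    fmt = "%Y/%m/%d %H:%M:%S.%f"
    t0 = datetime.strptime(f"{year}/{start}", fmt)
    t1 = datetime.strptime(f"{year}/{end}", fmt)
    return (t1 - t0).total_seconds()


if __name__ == "__main__":
    print(elapsed_seconds(LAST_HEADER_LINE, FIRST_ERROR_LINE))
```

So the transfer ran for roughly ten minutes before gzip hit the unexpected end of file.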
 
Comment 1 by aut...@google.com, Feb 9 2017
Owner: sbasi@chromium.org
Summary: Caroline canary: Failed to perform stateful update (was: Coreline canary: Failed to perform stateful update)
Cc: -shuqianz@chromium.org mnissler@chromium.org ejcaruso@chromium.org adurbin@chromium.org mqg@chromium.org
Owner: shuqianz@chromium.org
Assigned to this week's infra deputy.

Added this week's sheriffs as an FYI.
Cc: -mqg@chromium.org xixuan@chromium.org
Owner: mqg@chromium.org
This is a release build, so passing to the sheriff. cc xixuan@ to take a look.
Comment 4 by mqg@chromium.org, Feb 13 2017
Owner: shuqianz@chromium.org
Hi, I have no idea how this failure happened, except that it seems to be infrastructure-related, so I'm passing it on to the deputy. Sorry I can't be more helpful, but this is my first time sheriffing. What am I supposed to do?
Cc: ayatane@chromium.org
Owner: xixuan@chromium.org
From the log, it seems the tarball itself is broken? Passing to xixuan@.
Comment 6 by xixuan@chromium.org, Feb 13 2017
I keep wondering (and of course get no answer :( ) who owns the autoupdate_EndToEndTest test. Whoever that is should be the right owner of this bug.

From the log, the error is that update-engine fails to perform the stateful update with the old provision framework. I agree with shuqianz@ that this is an unzip error:

02/09 08:51:56.636 DEBUG|        base_utils:0280| [stdout] Downloading stateful payload from http://100.115.245.200:8082/static/caroline-release/R56-9000.84.0/stateful.tgz
02/09 08:51:56.679 DEBUG|        base_utils:0280| [stdout]   HTTP/1.1 200 OK
02/09 08:51:56.680 DEBUG|        base_utils:0280| [stdout]   Date: Thu, 09 Feb 2017 16:51:56 GMT
02/09 08:51:56.680 DEBUG|        base_utils:0280| [stdout]   Server: Apache/2.4.7 (Ubuntu)
02/09 08:51:56.680 DEBUG|        base_utils:0280| [stdout]   Last-Modified: Thu, 09 Feb 2017 09:49:46 GMT
02/09 08:51:56.680 DEBUG|        base_utils:0280| [stdout]   ETag: "10dd2f81-54815e6c3bc0f"
02/09 08:51:56.680 DEBUG|        base_utils:0280| [stdout]   Accept-Ranges: bytes
02/09 08:51:56.681 DEBUG|        base_utils:0280| [stdout]   Content-Length: 282931073
02/09 08:51:56.681 DEBUG|        base_utils:0280| [stdout]   Keep-Alive: timeout=60, max=1000
02/09 08:51:56.681 DEBUG|        base_utils:0280| [stdout]   Connection: Keep-Alive
02/09 08:51:56.681 DEBUG|        base_utils:0280| [stdout]   Content-Type: application/x-gzip
02/09 09:01:55.594 DEBUG|        base_utils:0280| [stdout] 
02/09 09:01:55.595 DEBUG|        base_utils:0280| [stdout] gzip: stdin: unexpected end of file
02/09 09:01:55.595 DEBUG|        base_utils:0280| [stdout] tar: Unexpected EOF in archive
02/09 09:01:55.595 DEBUG|        base_utils:0280| [stdout] tar: Unexpected EOF in archive
02/09 09:01:55.595 DEBUG|        base_utils:0280| [stdout] tar: Error is not recoverable: exiting now
02/09 09:01:55.596 DEBUG|        base_utils:0280| [stdout] Downloading command returns code 2.
02/09 09:01:55.596 DEBUG|        base_utils:0280| [stdout] Downloading failed, retrying.
02/09 09:01:55.596 DEBUG|        base_utils:0280| [stdout] 
02/09 09:01:55.596 DEBUG|        base_utils:0280| [stdout] gzip: stdin: unexpected end of file
02/09 09:01:55.597 DEBUG|        base_utils:0280| [stdout] tar: Child returned status 1
02/09 09:01:55.597 DEBUG|        base_utils:0280| [stdout] tar: Error is not recoverable: exiting now
02/09 09:01:55.597 DEBUG|        base_utils:0280| [stdout] Downloading command returns code 2.

My guess is that the download did not complete, which left a truncated tgz file. tar could not extract it and raised the errors above.
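
One way to test that guess on a saved copy of the payload is to compare its size on disk with the Content-Length from the response headers (282931073 bytes in the log) and to read the gzip stream through to its end. The following is a minimal sketch rather than anything from the autotest or stateful_update code; the function names and the local path are hypothetical.

```python
import gzip
import os
import zlib


def gzip_stream_is_complete(path, chunk_size=1 << 20):
    """Return True if the gzip file at `path` decompresses to its end without error.

    A truncated file raises EOFError; a corrupt one raises OSError or zlib.error.
    """
    try:
        with gzip.open(path, "rb") as stream:
            while stream.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        return False
    return True


def download_is_complete(path, expected_size):
    """Check the on-disk size against the expected size, then the gzip stream."""
    if os.path.getsize(path) != expected_size:
        return False
    return gzip_stream_is_complete(path)


if __name__ == "__main__":
    # Hypothetical local copy of stateful.tgz; the size is the Content-Length above.
    candidate = "/tmp/stateful.tgz"
    if os.path.exists(candidate):
        print(download_is_complete(candidate, 282931073))
```

If the copy on the devserver passes both checks, the truncation most likely happened in transit rather than in the stored file.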
Comment 7 by xixuan@chromium.org, Feb 13 2017
Owner: ----
I don't have a solution for this error; the most likely suspect is a network issue. So I'm moving myself to the cc list.

It would help a lot if someone who is an expert on update_engine could check update_engine.log; I checked it but found "nothing useful"...
Status: WontFix
Based on the discussion, I think this was caused by a network flake. Marking it as WontFix for now. Feel free to reopen it if it occurs again.