guado_moblab has been failing provisioning
Issue description

Since the end of last week, guado moblab has been failing the provision step. The errors that I see look like "bash: /tmp/stateful_update: Permission denied". It looks like the same machine: chromeos2-row5-rack10-host1. Seems like it might be similar to chromium:591965?

https://uberchromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin

Here is a snippet of the debug logs:

04/04 04:25:56.279 DEBUG| ssh_host:0153| Running (ssh) 'sudo stop ap-update-manager'
04/04 04:25:56.442 DEBUG| base_utils:0268| [stderr] stop: Unknown job: ap-update-manager
04/04 04:25:56.445 DEBUG| ssh_host:0153| Running (ssh) 'cat "/etc/lsb-release"'
04/04 04:25:56.597 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_APPID={BCC6E30A-D921-18E8-73D3-60177AD4AF86}
04/04 04:25:56.598 DEBUG| base_utils:0268| [stdout] CHROMEOS_BOARD_APPID={BCC6E30A-D921-18E8-73D3-60177AD4AF86}
04/04 04:25:56.599 DEBUG| base_utils:0268| [stdout] CHROMEOS_CANARY_APPID={90F229CE-83E2-4FAF-8479-E368A34938B1}
04/04 04:25:56.599 DEBUG| base_utils:0268| [stdout] DEVICETYPE=CHROMEBOX
04/04 04:25:56.600 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_BOARD=guado_moblab
04/04 04:25:56.600 DEBUG| base_utils:0268| [stdout] CHROMEOS_DEVSERVER=http://build116-m2.golo.chromium.org:8080
04/04 04:25:56.601 DEBUG| base_utils:0268| [stdout] GOOGLE_RELEASE=8006.0.0-rc4
04/04 04:25:56.601 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_BUILD_NUMBER=8006
04/04 04:25:56.602 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_BRANCH_NUMBER=0
04/04 04:25:56.602 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_CHROME_MILESTONE=51
04/04 04:25:56.603 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_PATCH_NUMBER=0-rc4
04/04 04:25:56.604 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_TRACK=testimage-channel
04/04 04:25:56.604 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_DESCRIPTION=8006.0.0-rc4 (Continuous Builder - Builder: N/A) guado_moblab
04/04 04:25:56.605 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_NAME=Chromium OS
04/04 04:25:56.605 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_BUILD_TYPE=Continuous Builder - Builder: N/A
04/04 04:25:56.606 DEBUG| base_utils:0268| [stdout] CHROMEOS_RELEASE_VERSION=8006.0.0-rc4
04/04 04:25:56.606 DEBUG| base_utils:0268| [stdout] CHROMEOS_AUSERVER=http://build116-m2.golo.chromium.org:8080/update
04/04 04:25:56.609 INFO | autoupdater:0482| Updating from version 8006.0.0-rc4 to R51-8149.0.0-rc4.
04/04 04:25:56.647 INFO | autoupdater:0494| Installing from http://172.17.40.27:8082/update/guado_moblab-paladin/R51-8149.0.0-rc4 to chromeos2-row5-rack10-host1
04/04 04:25:56.648 DEBUG| ssh_host:0153| Running (ssh) 'rm -f /var/run/update_engine_autoupdate_completed'
04/04 04:25:56.801 DEBUG| ssh_host:0153| Running (ssh) 'stop ui || true'
04/04 04:25:57.206 DEBUG| base_utils:0268| [stdout] ui stop/waiting
04/04 04:25:57.208 DEBUG| ssh_host:0153| Running (ssh) 'stop update-engine || true'
04/04 04:25:57.365 DEBUG| base_utils:0268| [stdout] update-engine stop/waiting
04/04 04:25:57.367 DEBUG| ssh_host:0153| Running (ssh) 'start update-engine'
04/04 04:25:57.832 DEBUG| base_utils:0268| [stdout] update-engine start/running, process 11022
04/04 04:25:57.834 DEBUG| ssh_host:0153| Running (ssh) '/usr/bin/update_engine_client -status 2>&1 | grep CURRENT_OP'
04/04 04:25:57.994 DEBUG| base_utils:0268| [stdout] CURRENT_OP=UPDATE_STATUS_IDLE
04/04 04:25:57.997 DEBUG| abstract_ssh:0410| send_file. source: /usr/local/google/chromeos/src/platform/dev/stateful_update, dest: /tmp/stateful_update, delete_dest: True,preserve_symlinks:False
04/04 04:25:57.997 DEBUG| ssh_host:0153| Running (ssh) 'rsync --version'
04/04 04:25:58.151 DEBUG| abstract_ssh:0427| Using Rsync.
04/04 04:25:58.153 DEBUG| base_utils:0177| Running 'rsync -L --delete --timeout=1800 --rsh='/usr/bin/ssh -a -x -o ControlPath=/tmp/_autotmp_Yn4CyEssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g "/usr/local/google/chromeos/src/platform/dev/stateful_update" "root@chromeos2-row5-rack10-host1:"/tmp/stateful_update""'
04/04 04:25:58.539 DEBUG| ssh_host:0153| Running (ssh) '/tmp/stateful_update --stateful_change=reset 2>&1'
04/04 04:25:58.698 DEBUG| base_utils:0268| [stdout] bash: /tmp/stateful_update: Permission denied
04/04 04:25:58.700 WARNI| cros_host:0744| Autoupdate did not complete.
04/04 04:25:58.708 WARNI| test:0606| Autotest caught exception when running test:
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 600, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 810, in _call_test_function
    raise error.UnhandledTestFail(e)
UnhandledTestFail: Unhandled AutoservRunError: command execution error
* Command:
    /usr/bin/ssh -a -x -o ControlPath=/tmp/_autotmp_Yn4CyEssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos2-row5-rack10-host1 "export LIBC_FATAL_STDERR_=1; /tmp/stateful_update --stateful_change=reset 2>&1"
Exit status: 126
Duration: 0.114089012146
stdout:
bash: /tmp/stateful_update: Permission denied
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 804, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 461, in execute
    dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 347, in _call_run_once_with_retry
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 376, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 136, in run_once
    force_full_update=force)
  File "/usr/local/autotest/server/afe_utils.py", line 192, in machine_install_and_update_labels
    image_name, host_attributes = host.machine_install(*args, **dargs)
  File "/usr/local/autotest/server/hosts/cros_host.py", line 742, in machine_install
    updater.run_update()
  File "/usr/local/autotest/site-packages/statsd/timer.py", line 95, in _decorator
    return function(*args, **kwargs)
  File "/usr/local/autotest/client/common_lib/cros/autoupdater.py", line 498, in run_update
    self.reset_stateful_partition()
  File "/usr/local/autotest/client/common_lib/cros/autoupdater.py", line 384, in reset_stateful_partition
    self._run(' '.join(statefuldev_cmd))
  File "/usr/local/autotest/client/common_lib/cros/autoupdater.py", line 292, in _run
    return self.host.run(cmd, *args, **kwargs)
  File "/usr/local/autotest/server/hosts/ssh_host.py", line 162, in run
    options, stdin, args, ignore_timeout)
  File "/usr/local/autotest/server/hosts/ssh_host.py", line 130, in _run
    raise error.AutoservRunError("command execution error", result)
AutoservRunError: command execution error
* Command:
    /usr/bin/ssh -a -x -o ControlPath=/tmp/_autotmp_Yn4CyEssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos2-row5-rack10-host1 "export LIBC_FATAL_STDERR_=1; /tmp/stateful_update --stateful_change=reset 2>&1"
Exit status: 126
Duration: 0.114089012146

Assigning to current deputy.
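As a reading aid for the "Exit status: 126" in the error above: in bash, 126 conventionally means the command was found but could not be executed, while 127 means it was not found. The sketch below pulls the status out of an error message and attaches that meaning. The function name and the lookup table are hypothetical illustrations, not part of autotest.

```python
import re

# Conventional shell meanings for the two "could not run it" statuses.
# Other values are left to the command itself.
EXIT_MEANINGS = {
    126: 'command found but could not be executed',
    127: 'command not found',
}


def explain_exit_status(error_text):
    """Return (status, meaning) parsed from 'Exit status: N', or None."""
    match = re.search(r'Exit status: (\d+)', error_text)
    if match is None:
        return None
    status = int(match.group(1))
    return status, EXIT_MEANINGS.get(status, 'command-specific failure')


if __name__ == '__main__':
    sample = 'Exit status: 126 Duration: 0.114089012146'
    print(explain_exit_status(sample))
```

A status of 126 together with "Permission denied" points at the file or its filesystem not permitting execution, rather than at the script being missing.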
Apr 4 2016
The /tmp dir is being mounted as noexec:

localhost tmp # mount | grep "/tmp"
tmp on /tmp type tmpfs (rw,nosuid,nodev,noexec,relatime)

This causes the call to stateful_update to fail, since no script can be executed directly from /tmp. Time to go figure out what changed in the mount options.
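To repeat this check on other DUTs, the mount output can be parsed rather than read by eye. This is a minimal sketch that assumes the Linux `mount` line format shown above ("device on mountpoint type fstype (options)"). The helper names are hypothetical and not part of autotest.

```python
import re

# Matches lines such as:
#   tmp on /tmp type tmpfs (rw,nosuid,nodev,noexec,relatime)
MOUNT_LINE = re.compile(r'^(\S+) on (\S+) type (\S+) \(([^)]*)\)$')


def mount_options(mount_output, mount_point):
    """Return the set of options for mount_point, or None if not listed."""
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match and match.group(2) == mount_point:
            return set(match.group(4).split(','))
    return None


def is_noexec(mount_output, mount_point='/tmp'):
    """True if mount_point is listed and carries the noexec option."""
    options = mount_options(mount_output, mount_point)
    return options is not None and 'noexec' in options


if __name__ == '__main__':
    sample = 'tmp on /tmp type tmpfs (rw,nosuid,nodev,noexec,relatime)'
    print(is_noexec(sample))
```

Feeding it the output of `mount` captured over ssh from each host would show whether the noexec flag is specific to one DUT or common to the image.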
Apr 4 2016
I locked chromeos2-row5-rack10-host1 and started the suite on another DUT (chromeos2-row5-rack10-host2) to see if it repros. If not, it is most likely a bad SSD on the failing DUT, and I will mark it for repair. http://cautotest/afe/#tab_id=view_job&object_id=58975706
Apr 4 2016
It passed on the other DUT, so I'll file a ticket to have the failing DUT repaired.
Apr 4 2016
Filed https://b.corp.google.com/u/0/issues/27999467 using the template from go/cros-lab-device-repair.
Comment 1 by xixuan@chromium.org, Apr 4 2016