
Issue 600403

Starred by 2 users

Issue metadata

Status: Archived
Owner:
Closed: Jun 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




guado_moblab has been failing provisioning

Project Member Reported by aaboagye@chromium.org, Apr 4 2016

Issue description

Since the end of last week, guado_moblab has been failing the provision step. The error I see is "bash: /tmp/stateful_update: Permission denied". It appears to be the same machine each time: chromeos2-row5-rack10-host1.

This might be similar to chromium:591965.

https://uberchromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin

Here is a snippet of the debug logs:
04/04 04:25:56.279 DEBUG|          ssh_host:0153| Running (ssh) 'sudo stop ap-update-manager'
04/04 04:25:56.442 DEBUG|        base_utils:0268| [stderr] stop: Unknown job: ap-update-manager
04/04 04:25:56.445 DEBUG|          ssh_host:0153| Running (ssh) 'cat "/etc/lsb-release"'
04/04 04:25:56.597 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_APPID={BCC6E30A-D921-18E8-73D3-60177AD4AF86}
04/04 04:25:56.598 DEBUG|        base_utils:0268| [stdout] CHROMEOS_BOARD_APPID={BCC6E30A-D921-18E8-73D3-60177AD4AF86}
04/04 04:25:56.599 DEBUG|        base_utils:0268| [stdout] CHROMEOS_CANARY_APPID={90F229CE-83E2-4FAF-8479-E368A34938B1}
04/04 04:25:56.599 DEBUG|        base_utils:0268| [stdout] DEVICETYPE=CHROMEBOX
04/04 04:25:56.600 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_BOARD=guado_moblab
04/04 04:25:56.600 DEBUG|        base_utils:0268| [stdout] CHROMEOS_DEVSERVER=http://build116-m2.golo.chromium.org:8080
04/04 04:25:56.601 DEBUG|        base_utils:0268| [stdout] GOOGLE_RELEASE=8006.0.0-rc4
04/04 04:25:56.601 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_BUILD_NUMBER=8006
04/04 04:25:56.602 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_BRANCH_NUMBER=0
04/04 04:25:56.602 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_CHROME_MILESTONE=51
04/04 04:25:56.603 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_PATCH_NUMBER=0-rc4
04/04 04:25:56.604 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_TRACK=testimage-channel
04/04 04:25:56.604 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_DESCRIPTION=8006.0.0-rc4 (Continuous Builder - Builder: N/A) guado_moblab
04/04 04:25:56.605 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_NAME=Chromium OS
04/04 04:25:56.605 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_BUILD_TYPE=Continuous Builder - Builder: N/A
04/04 04:25:56.606 DEBUG|        base_utils:0268| [stdout] CHROMEOS_RELEASE_VERSION=8006.0.0-rc4
04/04 04:25:56.606 DEBUG|        base_utils:0268| [stdout] CHROMEOS_AUSERVER=http://build116-m2.golo.chromium.org:8080/update
04/04 04:25:56.609 INFO |       autoupdater:0482| Updating from version 8006.0.0-rc4 to R51-8149.0.0-rc4.
04/04 04:25:56.647 INFO |       autoupdater:0494| Installing from http://172.17.40.27:8082/update/guado_moblab-paladin/R51-8149.0.0-rc4 to chromeos2-row5-rack10-host1
04/04 04:25:56.648 DEBUG|          ssh_host:0153| Running (ssh) 'rm -f /var/run/update_engine_autoupdate_completed'
04/04 04:25:56.801 DEBUG|          ssh_host:0153| Running (ssh) 'stop ui || true'
04/04 04:25:57.206 DEBUG|        base_utils:0268| [stdout] ui stop/waiting
04/04 04:25:57.208 DEBUG|          ssh_host:0153| Running (ssh) 'stop update-engine || true'
04/04 04:25:57.365 DEBUG|        base_utils:0268| [stdout] update-engine stop/waiting
04/04 04:25:57.367 DEBUG|          ssh_host:0153| Running (ssh) 'start update-engine'
04/04 04:25:57.832 DEBUG|        base_utils:0268| [stdout] update-engine start/running, process 11022
04/04 04:25:57.834 DEBUG|          ssh_host:0153| Running (ssh) '/usr/bin/update_engine_client -status 2>&1 | grep CURRENT_OP'
04/04 04:25:57.994 DEBUG|        base_utils:0268| [stdout] CURRENT_OP=UPDATE_STATUS_IDLE
04/04 04:25:57.997 DEBUG|      abstract_ssh:0410| send_file. source: /usr/local/google/chromeos/src/platform/dev/stateful_update, dest: /tmp/stateful_update, delete_dest: True,preserve_symlinks:False
04/04 04:25:57.997 DEBUG|          ssh_host:0153| Running (ssh) 'rsync --version'
04/04 04:25:58.151 DEBUG|      abstract_ssh:0427| Using Rsync.
04/04 04:25:58.153 DEBUG|        base_utils:0177| Running 'rsync -L --delete --timeout=1800 --rsh='/usr/bin/ssh -a -x   -o ControlPath=/tmp/_autotmp_Yn4CyEssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g "/usr/local/google/chromeos/src/platform/dev/stateful_update" "root@chromeos2-row5-rack10-host1:"/tmp/stateful_update""'
04/04 04:25:58.539 DEBUG|          ssh_host:0153| Running (ssh) '/tmp/stateful_update --stateful_change=reset 2>&1'
04/04 04:25:58.698 DEBUG|        base_utils:0268| [stdout] bash: /tmp/stateful_update: Permission denied
04/04 04:25:58.700 WARNI|         cros_host:0744| Autoupdate did not complete.
04/04 04:25:58.708 WARNI|              test:0606| Autotest caught exception when running test:
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 600, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 810, in _call_test_function
    raise error.UnhandledTestFail(e)
UnhandledTestFail: Unhandled AutoservRunError: command execution error
* Command: 
    /usr/bin/ssh -a -x    -o ControlPath=/tmp/_autotmp_Yn4CyEssh-
    master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null
    -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o
    ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22
    chromeos2-row5-rack10-host1 "export LIBC_FATAL_STDERR_=1;
    /tmp/stateful_update --stateful_change=reset 2>&1"
Exit status: 126
Duration: 0.114089012146

stdout:
bash: /tmp/stateful_update: Permission denied
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 804, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 461, in execute
    dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 347, in _call_run_once_with_retry
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 376, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 136, in run_once
    force_full_update=force)
  File "/usr/local/autotest/server/afe_utils.py", line 192, in machine_install_and_update_labels
    image_name, host_attributes = host.machine_install(*args, **dargs)
  File "/usr/local/autotest/server/hosts/cros_host.py", line 742, in machine_install
    updater.run_update()
  File "/usr/local/autotest/site-packages/statsd/timer.py", line 95, in _decorator
    return function(*args, **kwargs)
  File "/usr/local/autotest/client/common_lib/cros/autoupdater.py", line 498, in run_update
    self.reset_stateful_partition()
  File "/usr/local/autotest/client/common_lib/cros/autoupdater.py", line 384, in reset_stateful_partition
    self._run(' '.join(statefuldev_cmd))
  File "/usr/local/autotest/client/common_lib/cros/autoupdater.py", line 292, in _run
    return self.host.run(cmd, *args, **kwargs)
  File "/usr/local/autotest/server/hosts/ssh_host.py", line 162, in run
    options, stdin, args, ignore_timeout)
  File "/usr/local/autotest/server/hosts/ssh_host.py", line 130, in _run
    raise error.AutoservRunError("command execution error", result)
AutoservRunError: command execution error
* Command: 
    /usr/bin/ssh -a -x    -o ControlPath=/tmp/_autotmp_Yn4CyEssh-
    master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null
    -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o
    ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22
    chromeos2-row5-rack10-host1 "export LIBC_FATAL_STDERR_=1;
    /tmp/stateful_update --stateful_change=reset 2>&1"
Exit status: 126
Duration: 0.114089012146


Assigning to current deputy.
 
Cc: xixuan@chromium.org dshi@chromium.org kevcheng@chromium.org
Issue 600405 has been merged into this issue.
The /tmp directory is mounted noexec:

localhost tmp # mount | grep "/tmp"
tmp on /tmp type tmpfs (rw,nosuid,nodev,noexec,relatime)

This causes the call to stateful_update to fail, since nothing in /tmp can be executed directly.
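
For anyone who wants to script this check, here is a minimal sketch that parses `mount` output in the shape shown above and reports whether a mount point carries the noexec option. The helper names `mount_options` and `is_noexec` are hypothetical and not part of autotest. The parsing assumes the usual "<source> on <mount point> type <fstype> (<options>)" layout with no spaces in the mount point. The exit status 126 in the log is consistent with this diagnosis: it is what bash reports when a command is found but cannot be executed.

```python
def mount_options(mount_output, mount_point):
    """Return the set of mount options for mount_point, or None if not listed.

    Hypothetical helper: assumes each line looks like
    "<source> on <mount_point> type <fstype> (<opt>,<opt>,...)".
    """
    for line in mount_output.splitlines():
        parts = line.split()
        if len(parts) >= 6 and parts[1] == "on" and parts[2] == mount_point:
            # parts[5] is the parenthesized, comma-separated option list.
            return set(parts[5].strip("()").split(","))
    return None


def is_noexec(mount_output, mount_point="/tmp"):
    """True if mount_point is listed and mounted with the noexec option."""
    options = mount_options(mount_output, mount_point)
    return options is not None and "noexec" in options


# The line captured from the failing DUT in this issue.
sample = "tmp on /tmp type tmpfs (rw,nosuid,nodev,noexec,relatime)"
print(is_noexec(sample))
```

On a live DUT the input would be the stdout of `mount` gathered over ssh. A mount point that is missing from the output is treated as not noexec, which may or may not be the right default for a provisioning check.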


Next step: figure out what changed in the mount options.
I locked chromeos2-row5-rack10-host1 and started the suite on another DUT (chromeos2-row5-rack10-host2) to see if it repros. If it does not, the failing DUT most likely has a bad SSD, and I will mark it for repair.

http://cautotest/afe/#tab_id=view_job&object_id=58975706
It passed. I'll file a ticket to have the DUT repaired.
Components: Tests>Fails
Status: Untriaged (was: Unconfirmed)

Comment 7 by benhenry@google.com, Apr 27 2016

Components: Infra>Labs
Labels: -Infra-Labs
Status: Fixed (was: Untriaged)
Labels: VerifyIn-53
Labels: VerifyIn-54
Labels: VerifyIn-55

Comment 12 by dchan@chromium.org, Oct 10 2016

Labels: -VerifyIn-55

Comment 13 by dchan@google.com, Nov 19 2016

Labels: VerifyIn-56

Comment 14 by dchan@google.com, Jan 21 2017

Labels: VerifyIn-57

Comment 15 by dchan@google.com, Mar 4 2017

Labels: VerifyIn-58

Comment 16 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 17 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61

Comment 19 by dchan@chromium.org, Oct 14 2017

Status: Archived (was: Fixed)
