Starred by 3 users
Status: Fixed
Owner:
Closed: Aug 8
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Moblab in the lab is failing to provision
Project Member Reported by haddowk@chromium.org, Aug 3
Failed: Legacy host verification checks
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/hosts/repair.py", line 329, in _verify_host
    self.verify(host)
  File "/usr/local/autotest/server/hosts/repair.py", line 55, in verify
    host.verify_software()
  File "/usr/local/autotest/server/hosts/cros_host.py", line 1536, in verify_software
    super(CrosHost, self).verify_software()
  File "/usr/local/autotest/server/hosts/abstract_ssh.py", line 755, in verify_software
    self.AUTOTEST_GB_DISKSPACE_REQUIRED)
  File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 355, in check_diskspace
    free_space_gb = int(df[3]) / mb_per_gb
IndexError: list index out of range


Possibly related to https://chromium-review.googlesource.com/#/c/chromiumos/third_party/autotest/+/594663/
 
Before your CL, MoblabHost.verify_software was overriding the parent one without calling super(...).verify_software.

You removed it, so moblab now fails the base host's verify_software sanity check!

This is bad as far as moblab is concerned.

I'd say first "fix this" by adding back a dummy verify_software. This brings us back to where we were.

We can then see why moblab fails the sanity testing expected of all cros hosts.
Forget what I said. I can't see. MoblabHost was calling super's method :(
08/02 16:47:32.403 INFO |            repair:0327| Verifying this condition: Legacy host verification checks
08/02 16:47:32.403 DEBUG|          autotest:0119| Using existing host autodir: /usr/local/autodir
08/02 16:47:32.403 INFO |      base_classes:0353| Checking for >= 0.7 GB of space under /usr/local/autodir on machine chromeos2-row1-rack8-host7
08/02 16:47:32.412 DEBUG|          ssh_host:0296| Running (ssh) 'df -PB 1000000 /usr/local/autodir | tail -1' from 'verify_software|verify_software|check_diskspace|run|wrapper|run_very_slowly'
08/02 16:47:32.563 ERROR|             utils:0280| [stderr] df: /usr/local/autodir: No such file or directory


So, it decided to "use existing autodir", but then df said that autodir doesn't exist.
As a result, df printed nothing to stdout, and we happily assumed that the output would have at least four fields (the code indexes df[3]).
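
The failure mode above can be sketched in Python. This is a minimal illustration, not the autotest code: it assumes check_diskspace splits the stdout of the df command on whitespace before indexing it, and parse_free_space_gb is a hypothetical helper name.

```python
def parse_free_space_gb(df_output, mb_per_gb=1000.0):
    """Return free space in GB from one line of `df -PB 1000000 <path>` output.

    Raises ValueError (rather than IndexError) when the output is empty or
    too short to contain the "available" column.
    """
    fields = df_output.split()
    # POSIX df columns: filesystem, blocks, used, available, capacity, mount.
    if len(fields) < 4:
        raise ValueError('unexpected df output: %r' % df_output)
    return int(fields[3]) / mb_per_gb


# Empty stdout, as when df only writes "No such file or directory" to stderr.
try:
    parse_free_space_gb('')
except ValueError as e:
    print('check failed cleanly: %s' % e)

# A made-up sample line with 12000 MB available.
sample = '/dev/sda1 16000 4000 12000 25% /usr/local'
print(parse_free_space_gb(sample))  # 12.0
```

An unguarded `''.split()[3]` raises the same IndexError as in the traceback; the length check turns that into an error that names the real problem.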
Status: Assigned
So, yeah.
This is indeed failing because of https://chromium-review.googlesource.com/c/594663/5/server/hosts/moblab_host.py#b246

verify_software was creating /usr/local/autodir before verifying diskspace in it.

I don't think that CL is to blame as such -- host.get_autodir seems to not guarantee that autodir exists, but its callers seem to assume it does.

In fact check_diskspace's caller claims to ignore failures when the directory doesn't exist, but it clearly doesn't do so.
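
A rough sketch of what "ignore failures when the directory doesn't exist" could look like in the caller. Everything here is hypothetical and for illustration only: PathNotFoundError, DiskFullError, FakeHost, and verify_autodir_space are made-up names, not autotest APIs.

```python
class PathNotFoundError(Exception):
    """Hypothetical: the path to check does not exist on the host."""


class DiskFullError(Exception):
    """Hypothetical: the path exists but has too little free space."""


class FakeHost(object):
    """Stand-in host that knows the free space (GB) of a few paths."""

    def __init__(self, free_gb_by_path):
        self._free_gb_by_path = free_gb_by_path

    def check_diskspace(self, path, gb_required):
        if path not in self._free_gb_by_path:
            raise PathNotFoundError(path)
        if self._free_gb_by_path[path] < gb_required:
            raise DiskFullError('%s: need %s GB' % (path, gb_required))


def verify_autodir_space(host, autodir, gb_required):
    """Return True if space was checked, False if autodir is absent.

    A missing autodir is tolerated; a real shortage still propagates.
    """
    try:
        host.check_diskspace(autodir, gb_required)
    except PathNotFoundError:
        return False
    return True


print(verify_autodir_space(FakeHost({}), '/usr/local/autodir', 0.7))
```

Separating "path missing" from "disk full" with distinct exception types lets the caller skip only the case it claims to skip.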
Cc: shuqianz@chromium.org
+deputy: This killed the last CQ run, and will kill this one too.

I'm reverting the CL that caused this.
Project Member Comment 6 by bugdroid1@chromium.org, Aug 3
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/26933a0b0dd9642300ef4e83ca9b31f5ccf82352

commit 26933a0b0dd9642300ef4e83ca9b31f5ccf82352
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Thu Aug 03 01:09:12 2017

Revert "[moblab] Move the moblab setup from the host provision to the test."

This reverts commit 007036ca865ca80ad31988aa02080e6370829244.

Reason for revert: This CL tickled a bug that causes moblab provision to fail. 
This CL is a poor victim of earlier bugs. But, CQ must move on...

Original change's description:
> [moblab] Move the moblab setup from the host provision to the test.
> 
> In the aim to have moblab issue show up as test failures vs
> provision failures tests should setup the moblab.
> 
> BUG= chromium:749325 
> TEST=trybot https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/paladin/builds/3451
> 
> Change-Id: I75e486d4726a33243e4e3238b216b33df1838ad8
> Reviewed-on: https://chromium-review.googlesource.com/594663
> Commit-Ready: Keith Haddow <haddowk@chromium.org>
> Tested-by: Keith Haddow <haddowk@chromium.org>
> Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
> Reviewed-by: Keith Haddow <haddowk@chromium.org>

Bug= chromium:749325 
BUG= chromium:751895 

Change-Id: Id2dbfc67a8c108695fb5d42d7bac1ba9e0a9457b
Reviewed-on: https://chromium-review.googlesource.com/599356
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/26933a0b0dd9642300ef4e83ca9b31f5ccf82352/server/cros/moblab_test.py
[modify] https://crrev.com/26933a0b0dd9642300ef4e83ca9b31f5ccf82352/server/hosts/moblab_host.py

Cc: akes...@chromium.org jkwang@chromium.org dbasehore@chromium.org hidehiko@chromium.org
Issue 751885 has been merged into this issue.
Project Member Comment 8 by bugdroid1@chromium.org, Aug 3
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/eade7bf913753e3cd651c59c5c776e5081ac20d3

commit eade7bf913753e3cd651c59c5c776e5081ac20d3
Author: Hidehiko Abe <hidehiko@chromium.org>
Date: Thu Aug 03 06:29:15 2017

Mark guado_moblab experiment temporarily.

The builder is failing and making master-paladin red.

BUG= chromium:751895 
TEST=None

Change-Id: I6780001c60673938e397a6f41931e04d68099f47
Reviewed-on: https://chromium-review.googlesource.com/599134
Tested-by: Hidehiko Abe <hidehiko@chromium.org>
Trybot-Ready: Hidehiko Abe <hidehiko@chromium.org>
Reviewed-by: Manoj Gupta <manojgupta@chromium.org>
Commit-Queue: Hidehiko Abe <hidehiko@chromium.org>

[modify] https://crrev.com/eade7bf913753e3cd651c59c5c776e5081ac20d3/cbuildbot/config_dump.json
[modify] https://crrev.com/eade7bf913753e3cd651c59c5c776e5081ac20d3/cbuildbot/chromeos_config.py

FYI: The blamed CL should have been blocked in the CQ, if the CQ had been running autotest tests with SSP. This is the second indication in 2 days that we're not using SSP at all in the CQ atm.

That is a problem.
Re #9: My no-SSP guess was wrong. The CQ run in which the original CL was submitted did run with SSP: https://viceroy.corp.google.com/chromeos/suite_details?job_id=132044471


The problem is that provision never uses SSP. So this provision didn't catch the problem:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/hosts/chromeos2-row1-rack8-host7/636588-provision/20170108154626/

This means that you can't use a moblab trybot to test this change. You must provision a moblab DUT from your local autotest setup (with your change) to test this.
I have tried to reproduce this locally by provisioning a local moblab with the faulty code in my client:

 test_that --board guado_moblab 172.22.19.56 provision_AutoUpdate --args='value=guado_moblab-release/R62-9802.0.0'

But the test just passes and the device gets auto updated.  I will continue with fixes but it would be nice to know how to reproduce this error locally so I know that I do not make another mistake.
Re #11: Nothing in provision_AutoUpdate makes me believe that it requests skipping SSP by default. Can you double check that your test_that invocation didn't end up using SSP (look for ssp_logs in the results folder)?
Status: Started
I can find no ssp_logs in the test_that results folder.
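
For anyone repeating that check, a small sketch that searches a results folder for entries named ssp_logs. It assumes only that SSP leaves a file or directory with exactly that name somewhere under the results folder; the layout is not confirmed here, and find_ssp_logs is a hypothetical helper.

```python
import os


def find_ssp_logs(results_dir):
    """Return sorted paths of files or directories named 'ssp_logs'."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(results_dir):
        for name in dirnames + filenames:
            if name == 'ssp_logs':
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

An empty list from `find_ssp_logs(<results folder>)` would be consistent with the run not having used SSP, under the naming assumption above.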
Project Member Comment 16 by bugdroid1@chromium.org, Aug 6
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/07f1d3e95c260b4958fbc122da02e200f95f5f8b

commit 07f1d3e95c260b4958fbc122da02e200f95f5f8b
Author: Keith Haddow <haddowk@chromium.org>
Date: Sun Aug 06 07:30:48 2017

[autotest] Make the diskspace check more robust.

Check to see the path provided to check the diskspace exists and
catch some exceptions that may happen if the du command fails

BUG= chromium:751895 
TEST= test_that, trybot job

Change-Id: I132ecbb4c92a61d496a08b22f841d55b00bef8ee
Reviewed-on: https://chromium-review.googlesource.com/601414
Commit-Ready: Keith Haddow <haddowk@chromium.org>
Tested-by: Keith Haddow <haddowk@chromium.org>
Reviewed-by: Keith Haddow <haddowk@chromium.org>

[modify] https://crrev.com/07f1d3e95c260b4958fbc122da02e200f95f5f8b/client/common_lib/hosts/base_classes.py
[modify] https://crrev.com/07f1d3e95c260b4958fbc122da02e200f95f5f8b/client/common_lib/error.py
[modify] https://crrev.com/07f1d3e95c260b4958fbc122da02e200f95f5f8b/server/hosts/abstract_ssh.py
[modify] https://crrev.com/07f1d3e95c260b4958fbc122da02e200f95f5f8b/client/common_lib/hosts/base_classes_unittest.py

Issue 753015 has been merged into this issue.
Project Member Comment 18 by bugdroid1@chromium.org, Aug 8
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/b45c84267bc5fbbb4cec0aad025cf3a52f387a57

commit b45c84267bc5fbbb4cec0aad025cf3a52f387a57
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Tue Aug 08 05:18:51 2017

[autotest] Fix check_diskspace path existence check

CL:601414 incorrectly used os.path.exists to check for path existence.
All os level calls must go via the |run| method on hosts classes. FixIt.

BUG= chromium:751895 
BUG=chromium:753015
TEST=(new) unittests; test-push passes.

Change-Id: I7781411e3f5b8edafe4fa38ce3ae4b574826308e
Reviewed-on: https://chromium-review.googlesource.com/604207
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Keith Haddow <haddowk@chromium.org>

[modify] https://crrev.com/b45c84267bc5fbbb4cec0aad025cf3a52f387a57/client/common_lib/hosts/base_classes.py
[modify] https://crrev.com/b45c84267bc5fbbb4cec0aad025cf3a52f387a57/client/common_lib/hosts/base_classes_unittest.py
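
The distinction behind this fix: os.path.exists inspects the filesystem of the machine running the server-side code, not the DUT. A minimal sketch of a check that goes through the host's run method instead, assuming run accepts an ignore_status argument and returns an object with an exit_status attribute. FakeRemoteHost and path_exists are hypothetical stand-ins, not the code from the CL.

```python
import collections
import shlex

CmdResult = collections.namedtuple('CmdResult', ['exit_status', 'stdout'])


class FakeRemoteHost(object):
    """Stand-in for a host object; `run` pretends to execute on the DUT."""

    def __init__(self, remote_paths):
        self._remote_paths = set(remote_paths)

    def run(self, command, ignore_status=False):
        args = shlex.split(command)
        # Only `test -e <path>` is emulated in this sketch.
        assert args[:2] == ['test', '-e']
        exists = args[2] in self._remote_paths
        return CmdResult(exit_status=0 if exists else 1, stdout='')


def path_exists(host, path):
    """Check for `path` on the remote host rather than on the local machine."""
    result = host.run('test -e %s' % shlex.quote(path), ignore_status=True)
    return result.exit_status == 0


host = FakeRemoteHost(['/usr/local'])
print(path_exists(host, '/usr/local'))          # True
print(path_exists(host, '/usr/local/autodir'))  # False
```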

Status: Fixed