
Issue 621257 link

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Closed: Aug 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




SYNC_COUNT > 1 tests fail to run

Project Member Reported by lgoo...@chromium.org, Jun 18 2016

Issue description

Version: 8172.27.0 guado_moblab

I have some tests that require multiple DUTs.

Using the SYNC_COUNT parameter in the control file to specify the
number of DUTs needed by the test, I am seeing the following
behavior:

 - the requested number of DUTs are successfully provisioned

 - as soon as all DUTs are provisioned, the job aborts

 - the AFE shows the job status as 'Stopped'

Attached is a simple example server-based test, autotest_SyncCount, that
uses SYNC_COUNT = 2 to demonstrate the problem.
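
For readers without the attachment, here is a rough sketch of how a control file can declare the DUT count and how a tool might read it back. The attached control file is not reproduced in this report, so the control text below is purely illustrative. `read_sync_count` is a hypothetical helper (not an Autotest API), and the default of 1 when SYNC_COUNT is absent is an assumption.

```python
import ast

# Hypothetical control-file text. The real autotest_SyncCount control file
# is only available in the attachment, so every value here is illustrative.
EXAMPLE_CONTROL = '''
NAME = "autotest_SyncCount"
TEST_TYPE = "server"
SYNC_COUNT = 2
DOC = "Placeholder test that needs two DUTs."
'''


def read_sync_count(control_text, default=1):
    """Return the top-level SYNC_COUNT assignment, or `default` if absent.

    Hypothetical helper for illustration; the default of 1 is an assumption.
    """
    tree = ast.parse(control_text)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "SYNC_COUNT":
                # literal_eval only accepts plain literals, so nothing in the
                # control text is executed.
                return int(ast.literal_eval(node.value))
    return default


print(read_sync_count(EXAMPLE_CONTROL))
```

The sketch parses the text instead of executing it, because a real control file expects names such as `job` to be injected by the harness.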

To reproduce:

  1) connect 2 chromeos devices as moblab clients and give them a pool
     label. This example assumes panther clients.

  2) add the autotest_SyncCount test to autotest/files/server/site_tests/

  3) cros_workon --board=panther autotest-server-tests

  4) add the test to the autotest-server-tests ebuild

  5) build_packages --board=panther autotest-all

  6) cros stage gs://chromeos-image-archive/panther-release/R53-8460.0.0 <moblab-ip>

  7) run the test:

     test_that --board=panther --build=panther-custom/R53-8460.0.0 --web=<moblab-ip> --pool=<pool> :lab: autotest_SyncCount

Observed:

  - DUTs are provisioned successfully

  - The test then fails immediately with:

      autotest_SyncCount     ABORT: Timed out, did not run.

  - The AFE shows the job in state '2 Stopped'

Expected:

  - The test should execute

 
autotest_SyncCount.tgz (839 bytes)
Cc: kevcheng@chromium.org
Components: Infra>Client>ChromeOS
Nothing in Moblab treats SYNC_COUNT differently from
the regular Autotest code, so this bug is bound to be generic.

There's a decent chance that SYNC_COUNT > 1 fails regardless of
whether it's passed via the control file, or passed in at job
creation.

I don't know if this feature ever worked, but if it did, it's
probably been years since it was exercised.  :-(

Regardless of the current status of the feature, our long-term
plans for Servo V4 support depend on it working correctly,
so we want to fix it, not deprecate it.

I have had success with this patch to scheduler_models Job.run_if_ready:

*** /usr/local/autotest/scheduler/scheduler_models.py.orig
--- /usr/local/autotest/scheduler/scheduler_models.py
***************
*** 1459,1465 ****
          ready to run.
          """
          if not self.is_ready():
!             self.stop_if_necessary()
          elif queue_entry.atomic_group:
              self.run_with_ready_delay(queue_entry)
          else:
--- 1459,1465 ----
          ready to run.
          """
          if not self.is_ready():
!             logging.info('Job not ready: %s', self)
          elif queue_entry.atomic_group:
              self.run_with_ready_delay(queue_entry)
          else:

However, I don't know whether it is the right solution.
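
To make the effect of that one-line change easier to reason about, here is a toy model of the two branches. It is not Autotest code: `ToyJob`, `dut_provisioned`, and `simulate` are hypothetical names, and the "stop" branch is assumed to halt the job outright, which is a simplification. The report does not state when the real stop_if_necessary() actually stops entries, and the toy stops at the first not-ready check, which need not match the real timing.

```python
import logging


class ToyJob:
    """Hypothetical stand-in for a scheduler job; not the Autotest class."""

    def __init__(self, synch_count, stop_when_not_ready):
        self.synch_count = synch_count
        self.stop_when_not_ready = stop_when_not_ready
        self.pending = 0          # DUTs that have finished provisioning
        self.state = "Queued"

    def is_ready(self):
        # Ready once enough DUTs are pending to satisfy the sync count.
        return self.pending >= self.synch_count

    def run_if_ready(self):
        if self.state != "Queued":
            return
        if not self.is_ready():
            if self.stop_when_not_ready:
                # Assumed simplification of the original branch:
                # the job is halted as soon as it is found not ready.
                self.state = "Stopped"
            else:
                # Patched branch: only log and wait for more DUTs.
                logging.info("Job not ready: %d/%d DUTs pending",
                             self.pending, self.synch_count)
        else:
            self.state = "Running"

    def dut_provisioned(self):
        self.pending += 1
        self.run_if_ready()


def simulate(stop_when_not_ready, duts=2):
    """Provision `duts` DUTs one at a time and return the final job state."""
    job = ToyJob(synch_count=duts, stop_when_not_ready=stop_when_not_ready)
    for _ in range(duts):
        job.dut_provisioned()
    return job.state


print(simulate(stop_when_not_ready=True))   # Stopped
print(simulate(stop_when_not_ready=False))  # Running
```

Under these assumptions, a job that is stopped while waiting for its remaining DUTs never gets the chance to run, whereas the logging-only branch lets it start once the last DUT is pending. Whether dropping the stop call is safe for other job types is exactly the open question above.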

Labels: Pri-2
Owner: kevcheng@chromium.org
Status: Assigned (was: Untriaged)
-> kevcheng@ since you'll have to fix this eventually anyway (I think) for servov4
I did some investigation of this and have a fix to propose. Could you take a look to see if it is on the right track?

Doc at go/autotest-sync-count-fix

Status: Fixed (was: Assigned)
Labels: VerifyIn-54
Labels: VerifyIn-55

Comment 11 by dchan@chromium.org, Oct 10 2016

Labels: -VerifyIn-55

Comment 12 by dchan@google.com, Nov 19 2016

Labels: VerifyIn-56

Comment 13 by dchan@google.com, Jan 21 2017

Labels: VerifyIn-57

Comment 14 by dchan@google.com, Mar 4 2017

Labels: VerifyIn-58

Comment 15 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 16 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Project Member

Comment 17 by bugdroid1@chromium.org, Jun 1 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/483287116e6b9607ef3415f527116e88a3e70471

commit 483287116e6b9607ef3415f527116e88a3e70471
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Jun 01 04:14:31 2017

autotest-server-tests: build autotest_SyncCount

CQ-DEPEND=CL:519465
BUG=chromium:726490,  chromium:621257 
TEST=Attempt to run autotest_SyncCount in lab, see that it in fact runs.

Change-Id: Ie844d6c0956fd9b7eb7434cd07585c083658a723
Reviewed-on: https://chromium-review.googlesource.com/518982
Commit-Ready: Aviv Keshet <akeshet@chromium.org>
Tested-by: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/483287116e6b9607ef3415f527116e88a3e70471/chromeos-base/autotest-server-tests/autotest-server-tests-9999.ebuild

Project Member

Comment 18 by bugdroid1@chromium.org, Jun 1 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/828e78005dbda76677072d88c465d7304c8230ae

commit 828e78005dbda76677072d88c465d7304c8230ae
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Jun 01 04:14:31 2017

autotest: temporarily remove autotest_SyncControl from push_to_prod

BUG=chromium:726490,  chromium:621257 
TEST=None

Change-Id: Ib9dec32b0f8880b27dec2c683f60ec7e7f8e07b3
Reviewed-on: https://chromium-review.googlesource.com/519465
Commit-Ready: Aviv Keshet <akeshet@chromium.org>
Tested-by: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Dan Shi <dshi@google.com>

[modify] https://crrev.com/828e78005dbda76677072d88c465d7304c8230ae/server/site_tests/autotest_SyncCount/control

Project Member

Comment 19 by bugdroid1@chromium.org, Jun 2 2017

Labels: merge-merged-release-R58-9334.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/deadecfbe39473b474fa542ab65786e138bc28ad

commit deadecfbe39473b474fa542ab65786e138bc28ad
Author: Aviv Keshet <akeshet@chromium.org>
Date: Fri Jun 02 21:44:45 2017

autotest-server-tests: build autotest_SyncCount

CQ-DEPEND=CL:519465
BUG=chromium:726490,  chromium:621257 
TEST=Attempt to run autotest_SyncCount in lab, see that it in fact runs.

Change-Id: Ie844d6c0956fd9b7eb7434cd07585c083658a723
Reviewed-on: https://chromium-review.googlesource.com/518982
Commit-Ready: Aviv Keshet <akeshet@chromium.org>
Tested-by: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>
(cherry picked from commit 483287116e6b9607ef3415f527116e88a3e70471)
Reviewed-on: https://chromium-review.googlesource.com/523265
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Commit-Queue: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/deadecfbe39473b474fa542ab65786e138bc28ad/chromeos-base/autotest-server-tests/autotest-server-tests-9999.ebuild

Project Member

Comment 20 by bugdroid1@chromium.org, Jun 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/9f917d77714f7e6c12bbf699a4ecfe507ecd999f

commit 9f917d77714f7e6c12bbf699a4ecfe507ecd999f
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Sat Jun 03 01:39:40 2017

Revert "autotest: temporarily remove autotest_SyncControl from push_to_prod"

This reverts commit 828e78005dbda76677072d88c465d7304c8230ae.

Reason for revert: autotest_SyncCount has been shown to be stable (famous last words)

Original change's description:
> autotest: temporarily remove autotest_SyncControl from push_to_prod
>
> BUG=chromium:726490,  chromium:621257 
> TEST=None
>
> Change-Id: Ib9dec32b0f8880b27dec2c683f60ec7e7f8e07b3
> Reviewed-on: https://chromium-review.googlesource.com/519465
> Commit-Ready: Aviv Keshet <akeshet@chromium.org>
> Tested-by: Aviv Keshet <akeshet@chromium.org>
> Reviewed-by: Dan Shi <dshi@google.com>
>

BUG=chromium:726490,  chromium:621257 

Change-Id: I39a364998f68691974932f352cc0544f9fc700df
Reviewed-on: https://chromium-review.googlesource.com/522932
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/9f917d77714f7e6c12bbf699a4ecfe507ecd999f/server/site_tests/autotest_SyncCount/control

Project Member

Comment 21 by bugdroid1@chromium.org, Jun 5 2017

Labels: merge-merged-release-R60-9592.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/6036317cf1e24db61a077186e23a37b4770c269b

commit 6036317cf1e24db61a077186e23a37b4770c269b
Author: Aviv Keshet <akeshet@chromium.org>
Date: Mon Jun 05 17:01:22 2017

autotest-server-tests: build autotest_SyncCount

CQ-DEPEND=CL:519465
BUG=chromium:726490,  chromium:621257 
TEST=Attempt to run autotest_SyncCount in lab, see that it in fact runs.

Change-Id: Ie844d6c0956fd9b7eb7434cd07585c083658a723
Reviewed-on: https://chromium-review.googlesource.com/518982
Commit-Ready: Aviv Keshet <akeshet@chromium.org>
Tested-by: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>
(cherry picked from commit 483287116e6b9607ef3415f527116e88a3e70471)
Reviewed-on: https://chromium-review.googlesource.com/523263
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Commit-Queue: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/6036317cf1e24db61a077186e23a37b4770c269b/chromeos-base/autotest-server-tests/autotest-server-tests-9999.ebuild

Labels: VerifyIn-61
Status: Verified (was: Fixed)
Closing. Please reopen it if it's not fixed. Thanks!
