
Issue 746751


Issue metadata

Status: Verified
Owner:
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocking:
issue 726490




autotest_SyncCount test failing to run

Project Member Reported by lgoo...@chromium.org, Jul 20 2017

Issue description

This test broke at R61-9745.0.0.

The server job fails with:

07/14 22:39:37.013 ERROR|        server_job:0932| Exception escaped control file, job aborting:
Traceback (most recent call last):
  File "/usr/local/autotest/server/server_job.py", line 884, in run
    self._execute_code(GET_NETWORK_STATS_CONTROL_FILE, namespace)
  File "/usr/local/autotest/server/server_job.py", line 1425, in _execute_code
    execfile(code_file, namespace, namespace)
  File "/usr/local/autotest/server/control_segments/get_network_stats", line 57, in <module>
    job.parallel_simple(get_network_stats, machines)
  File "/usr/local/autotest/server/server_job.py", line 657, in parallel_simple
    return_results=return_results)
  File "/usr/local/autotest/server/subcommand.py", line 103, in parallel_simple
    subcommands.append(subcommand(function, args, subdir))
  File "/usr/local/autotest/server/subcommand.py", line 116, in __init__
    os.mkdir(self.subdir)

OSError: [Errno 36] File name too long: "/usr/local/autotest/results/128553426-chromeos-test/{'host_info_store': <autotest_lib.server.hosts.afe_store.AfeStore object at 0x7f827443d290>, 'hostname': 'chromeos6-row2-rack4-host21', 'connection_pool': <autotest_lib.server.hosts.ssh_multiplex.ConnectionPool object at 0x7f827443d1d0>, 'afe_host': HOST OBJECT: chromeos6-row2-rack4-host21}"

It looks like the problem occurs when trying to create a subdirectory for each of the machines involved in this multi-DUT test. However, a host dictionary is passed rather than a machine name, so the directory name ends up being the string representation of the host dictionary.
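
To illustrate the failure mode described above, here is a minimal sketch (not the autotest code). It assumes a per-component file name limit of 255 bytes, which is typical on Linux filesystems. `FakeRepr` and `naive_subdir_name` are hypothetical names introduced only for this example; the repr strings are copied from the traceback above.

```python
NAME_MAX = 255  # assumed per-component limit, typical of Linux filesystems


class FakeRepr(object):
    """Hypothetical stand-in that prints like the objects in the traceback."""

    def __init__(self, text):
        self._text = text

    def __repr__(self):
        return self._text


def naive_subdir_name(machine):
    """Mimic the failure mode: the directory name is simply str(machine)."""
    return str(machine)


HOSTNAME = 'chromeos6-row2-rack4-host21'

# Host dictionary as it might have looked before connection_pool was added.
machine_before = {
    'host_info_store': FakeRepr(
        '<autotest_lib.server.hosts.afe_store.AfeStore object at 0x7f827443d290>'),
    'hostname': HOSTNAME,
    'afe_host': FakeRepr('HOST OBJECT: ' + HOSTNAME),
}

# Same dictionary with the extra connection_pool entry.
machine_after = dict(machine_before)
machine_after['connection_pool'] = FakeRepr(
    '<autotest_lib.server.hosts.ssh_multiplex.ConnectionPool object at 0x7f827443d1d0>')

for label, machine in (('without connection_pool', machine_before),
                       ('with connection_pool', machine_after)):
    name = naive_subdir_name(machine)
    # Report the name length and whether it exceeds the assumed limit.
    print(label, len(name), len(name) > NAME_MAX)
```

Under these assumptions the stringified dictionary stays under the limit until the extra entry is added, which would explain why the underlying bug went unnoticed until the job started failing.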

It seems the test failed to run when this string became too long after connection_pool info was added to the host dictionary in this change:

https://chromium-review.googlesource.com/c/547077/7/server/server_job.py

The subdirectory name is set here:

https://chromium.git.corp.google.com/chromiumos/third_party/autotest/+/5a2cac10377d910d6d7f2553b99b595cb570dae2/server/subcommand.py#100

Example failure:

http://cautotest/afe/#tab_id=view_job&object_id=128553426

 
Cc: pprabhu@chromium.org ayatane@chromium.org
Labels: -Pri-3 Pri-1
Looks related to code that either ayatane or pprabhu is touching.
Owner: pprabhu@chromium.org
Status: Assigned (was: Untriaged)
I have had a CL up for this, but it didn't seem to fix everything: https://chromium-review.googlesource.com/c/521706/

Now that this _is_ broken, we can try fixing it ;)

Comment 3 by ihf@chromium.org, Jul 20 2017

Cc: ihf@chromium.org
Adding an observation:

If this change is moved up a level (before the call to _make_parallel_wrapper), it corrects the problem:

https://chromium-review.googlesource.com/c/317579/2/server/server_job.py

insert before this line:

https://chromium.git.corp.google.com/chromiumos/third_party/autotest/+/5a2cac10377d910d6d7f2553b99b595cb570dae2/server/server_job.py#654

but there may be some functions that take advantage of being passed a machine dictionary rather than just a machine name, so this approach might have some undesirable side effects.

Any thoughts on a solution approach? Our mesh tests are currently down due to this issue.

Blocking: 726490
Status: Started (was: Assigned)
Project Member

Comment 10 by bugdroid1@chromium.org, Jul 25 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/d08c86bd1b0371b6c7b77aae3b6c92366a18e887

commit d08c86bd1b0371b6c7b77aae3b6c92366a18e887
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Tue Jul 25 05:54:03 2017

[autotest] Respect machine dict in parallel_simple

A long time ago, the machines list passed in to server_job was converted
from a list of strings to possibly a list of dicts. Some places didn't
get this update. Fix another one.

BUG= chromium:746751 
TEST=(1) (new) unittests
     (2) Ran a local autotest_SyncCount job and verified that subdirs
         names are correctly inferred.

Change-Id: I7479241d97155e639c85a6b5469d134c1f48eba8
Reviewed-on: https://chromium-review.googlesource.com/582234
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Richard Barnette <jrbarnette@google.com>
Reviewed-by: Laurence Goodby <lgoodby@chromium.org>

[modify] https://crrev.com/d08c86bd1b0371b6c7b77aae3b6c92366a18e887/server/server_job.py
[modify] https://crrev.com/d08c86bd1b0371b6c7b77aae3b6c92366a18e887/server/subcommand_unittest.py
[modify] https://crrev.com/d08c86bd1b0371b6c7b77aae3b6c92366a18e887/server/subcommand.py
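
The diff itself is not reproduced in this thread. As a rough sketch of the idea in the commit message (derive the subdirectory from the host name while still passing the full machine entry to the function), something like the following could work. `machine_name` and `plan_subcommands` are hypothetical helpers, not autotest APIs, and the sketch assumes that dict entries carry a 'hostname' key as in the traceback above.

```python
import os


def machine_name(machine):
    """Return a short name for a machine given as a str or a dict.

    Assumption: dict entries carry a 'hostname' key.
    """
    if isinstance(machine, dict):
        return machine['hostname']
    return machine


def plan_subcommands(function, machines, results_dir):
    """Build (function, args, subdir) tuples, one per machine.

    The machine entry is passed through unchanged, so functions that
    rely on the dict form keep working; only the subdir uses the name.
    """
    plans = []
    for machine in machines:
        subdir = os.path.join(results_dir, machine_name(machine))
        plans.append((function, [machine], subdir))
    return plans


if __name__ == '__main__':
    machines = [
        'chromeos6-row2-rack4-host20',
        {'hostname': 'chromeos6-row2-rack4-host21', 'afe_host': None},
    ]
    for _, args, subdir in plan_subcommands(len, machines, 'results'):
        print(subdir, type(args[0]).__name__)
```

Keeping the dict as the function argument avoids the side effect raised earlier in the thread, where converting to plain names too early could break callers that expect a machine dictionary.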

Status: Fixed (was: Started)
I guess.
lgoodby@ can verify in his lab.
Status: Verified (was: Fixed)
Verified fix in R62-9777.0.0.

Labels: M-61 Merge-Request-61
Pls apply appropriate OSs label. Thank you.
Labels: OS-Chrome
Project Member

Comment 16 by sheriffbot@chromium.org, Jul 26 2017

Labels: -Merge-Request-61 Hotlist-Merge-Approved Merge-Approved-61
Your change meets the bar and is auto-approved for M61. Please go ahead and merge the CL to branch 3163 manually. Please contact milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), ketakid@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 17 by bugdroid1@chromium.org, Jul 28 2017

Labels: merge-merged-release-R61-9765.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/16fd2155abe7c16729efa6891ff217f1b0b28cd0

commit 16fd2155abe7c16729efa6891ff217f1b0b28cd0
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Fri Jul 28 23:20:20 2017

[autotest] Respect machine dict in parallel_simple

A long time ago, the machines list passed in to server_job was converted
from a list of strings to possibly a list of dicts. Some places didn't
get this update. Fix another one.

BUG= chromium:746751 
TEST=(1) (new) unittests
     (2) Ran a local autotest_SyncCount job and verified that subdirs
         names are correctly inferred.

Change-Id: I7479241d97155e639c85a6b5469d134c1f48eba8
Reviewed-on: https://chromium-review.googlesource.com/582234
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Richard Barnette <jrbarnette@google.com>
Reviewed-by: Laurence Goodby <lgoodby@chromium.org>
(cherry picked from commit d08c86bd1b0371b6c7b77aae3b6c92366a18e887)
Reviewed-on: https://chromium-review.googlesource.com/588003
Commit-Queue: Laurence Goodby <lgoodby@chromium.org>
Tested-by: Laurence Goodby <lgoodby@chromium.org>

[modify] https://crrev.com/16fd2155abe7c16729efa6891ff217f1b0b28cd0/server/server_job.py
[modify] https://crrev.com/16fd2155abe7c16729efa6891ff217f1b0b28cd0/server/subcommand_unittest.py
[modify] https://crrev.com/16fd2155abe7c16729efa6891ff217f1b0b28cd0/server/subcommand.py

Project Member

Comment 18 by sheriffbot@chromium.org, Jul 31 2017

This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Merge-Approved-61
