Issue 822112 link

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocked on:
issue 717268



Containers cannot start on Rodete

Project Member Reported by jkop@chromium.org, Mar 15 2018

Issue description

On ToT (git SHA e38310559ebc0c5695c27ab4492683977d461fba), on my workstation, lxc_functional_test reliably fails.

Here is the most relevant chunk of error log:

jkop@jkop:~/chromiumos/src/third_party/autotest/files/site_utils/lxc$ ./lxc_functional_test.py
DEBUG:root:Running 'sudo -n true'
2018-03-14 18:44:39,214.214 INFO |lxc_functional_tes:0184|       MainThread(140254145861376)| Rebuild base container in folder /usr/local/autotest/containers/container_test_HfP4DQ.
2018-03-14 18:44:49,766.766 INFO |lxc_functional_tes:0187|       MainThread(140254145861376)| Base container created: base
2018-03-14 18:44:49,946.946 INFO |lxc_functional_tes:0200|       MainThread(140254145861376)| Create test container.
2018-03-14 18:44:51,450.450 ERROR|lxc_functional_tes:0363|       MainThread(140254145861376)| ERROR:
Traceback (most recent call last):
  File "./lxc_functional_test.py", line 359, in <module>
    main(options)
  File "./lxc_functional_test.py", line 344, in main
    container = setup_test(bucket, container_id, options.skip_cleanup)
  File "./lxc_functional_test.py", line 206, in setup_test
    dut_name='192.168.0.3')
  File "/usr/local/google/home/jkop/chromiumos/src/third_party/autotest/files/site-packages/chromite/lib/metrics.py", line 483, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/google/home/jkop/chromiumos/src/third_party/autotest/files/site_utils/lxc/cleanup_if_fail.py", line 40, in func_cleanup_if_fail
    return func(*args, **kwargs)
  File "/usr/local/google/home/jkop/chromiumos/src/third_party/autotest/files/site_utils/lxc/container_bucket.py", line 202, in setup_test
    container = self._factory.create_container(container_id)
  File "/usr/local/google/home/jkop/chromiumos/src/third_party/autotest/files/site_utils/lxc/container_factory.py", line 71, in create_container
    new_container = self._create_from_base(name, lxc_path)
  File "/usr/local/google/home/jkop/chromiumos/src/third_party/autotest/files/site-packages/chromite/lib/metrics.py", line 483, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/google/home/jkop/chromiumos/src/third_party/autotest/files/site_utils/lxc/container_factory.py", line 116, in _create_from_base
    cleanup=self._force_cleanup)
  File "/usr/local/google/home/jkop/chromiumos/src/third_party/autotest/files/site_utils/lxc/container.py", line 227, in clone
    new_name)
ContainerError: Container test_123_1521078289_77533 already exists.

(complete log here: gpaste/5686101524086784)

Other LXC tests also fail, but this one is significantly less likely to be a result of bad config on my workstation.
 

Comment 1 by jkop@chromium.org, Mar 15 2018

Owner: nxia@chromium.org
nxia@ could you try reproducing this?

Comment 2 by nxia@chromium.org, Mar 15 2018

How do I run lxc_functional_test.py? Do I need to run a command to create the container first?


nxia@nxia:~/chromiumos/src/third_party/autotest/files/site_utils/lxc$ ./lxc_functional_test.py
Traceback (most recent call last):
  File "./lxc_functional_test.py", line 33, in <module>
    prefix='container_test_')
  File "/usr/lib/python2.7/tempfile.py", line 339, in mkdtemp
    _os.mkdir(file, 0700)
OSError: [Errno 2] No such file or directory: '/usr/local/autotest/containers/container_test_DfCiFI'

Comment 3 by jkop@chromium.org, Mar 15 2018

Hmm, that's a plausible cause for the issue. I think I created the directory by hand with mkdir and then forgot I had done so.

Comment 4 by nxia@chromium.org, Mar 15 2018

Do I need to test anything else?

Comment 5 by jkop@chromium.org, Mar 15 2018

Try creating the directory with mkdir and then running the test again. If it passes, I'll investigate and eventually get back to you; if it fails with a log similar to mine, there's no need for anything else, just mark the bug confirmed.

Comment 6 by nxia@chromium.org, Mar 15 2018

Hit the same error


2018-03-15 12:10:49,754.754 INFO |lxc_functional_tes:0367|       MainThread(140297052468992)| Cleaning up temporary directory /usr/local/autotest/containers/container_test_MThD75.
Traceback (most recent call last):
  File "./lxc_functional_test.py", line 359, in <module>
    main(options)
  File "./lxc_functional_test.py", line 344, in main
    container = setup_test(bucket, container_id, options.skip_cleanup)
  File "./lxc_functional_test.py", line 206, in setup_test
    dut_name='192.168.0.3')
  File "/usr/local/google/home/nxia/chromiumos/src/third_party/autotest/files/site-packages/chromite/lib/metrics.py", line 483, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/google/home/nxia/chromiumos/src/third_party/autotest/files/site_utils/lxc/cleanup_if_fail.py", line 40, in func_cleanup_if_fail
    return func(*args, **kwargs)
  File "/usr/local/google/home/nxia/chromiumos/src/third_party/autotest/files/site_utils/lxc/container_bucket.py", line 202, in setup_test
    container = self._factory.create_container(container_id)
  File "/usr/local/google/home/nxia/chromiumos/src/third_party/autotest/files/site_utils/lxc/container_factory.py", line 71, in create_container
    new_container = self._create_from_base(name, lxc_path)
  File "/usr/local/google/home/nxia/chromiumos/src/third_party/autotest/files/site-packages/chromite/lib/metrics.py", line 483, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/google/home/nxia/chromiumos/src/third_party/autotest/files/site_utils/lxc/container_factory.py", line 116, in _create_from_base
    cleanup=self._force_cleanup)
  File "/usr/local/google/home/nxia/chromiumos/src/third_party/autotest/files/site_utils/lxc/container.py", line 227, in clone
    new_name)
autotest_lib.client.common_lib.error.ContainerError: Container test_123_1521141048_155403 already exists.


Comment 7 by nxia@chromium.org, Mar 15 2018

Owner: jkop@chromium.org

Comment 8 by jkop@chromium.org, Mar 15 2018

Status: Started (was: Unconfirmed)

Comment 9 by jkop@chromium.org, Mar 15 2018

Blockedon: 717268
This is a result of lxc-clone not being available on my workstation, and could be fixed by migrating to lxc-copy.

Comment 10 by jkop@chromium.org, Mar 16 2018

That was insufficient to fix the problem. Starting the container still fails: AppArmor deliberately blocks it. Log: gpaste/6488436445806592

Comment 11 by jkop@chromium.org, Mar 16 2018

Got a more detailed log and updated the paste. Here's the most relevant bit:

       lxc-start 20180316002846.452 INFO     lxc_cgfsng - cgroups/cgfsng.c:cgfsng_setup_limits:1991 - cgroup has been setup
      lxc-start 20180316002846.452 INFO     lxc_start - start.c:do_start:836 - Unshared CLONE_NEWCGROUP.
      lxc-start 20180316002846.452 WARN     lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:218 - Incomplete AppArmor support in your kernel
      lxc-start 20180316002846.452 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:220 - If you really want to start this container, set
      lxc-start 20180316002846.452 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:221 - lxc.aa_allow_incomplete = 1
      lxc-start 20180316002846.452 ERROR    lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:222 - in your container configuration file
      lxc-start 20180316002846.452 ERROR    lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
      lxc-start 20180316002846.453 ERROR    lxc_start - start.c:__lxc_start:1354 - Failed to spawn container "test_123_1521150125_207921".
      lxc-start 20180316002846.453 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
      lxc-start 20180316002846.453 WARN     lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
      lxc-start 20180316002846.453 INFO     lxc_conf - conf.c:run_script_argv:427 - Executing script "/usr/share/lxcfs/lxc.reboot.hook" for container "test_123_1521150125_207921", config section "lxc".
      lxc-start 20180316002846.963 ERROR    lxc_start_ui - tools/lxc_start.c:main:366 - The container failed to start.
      lxc-start 20180316002846.963 ERROR    lxc_start_ui - tools/lxc_start.c:main:370 - Additional information can be obtained by setting the --logfile and --logpriority options.
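To pull the warnings and errors out of a log like the one above, a small filter can help. This is a minimal sketch that assumes every line follows the `tool timestamp LEVEL module - location - message` layout seen in the excerpt; the names `LOG_LINE` and `filter_log` are hypothetical and not part of autotest or lxc.

```python
import re

# Assumed layout, inferred from the excerpt above:
#   lxc-start 20180316002846.452 ERROR    lxc_apparmor - lsm/apparmor.c:func:220 - message
LOG_LINE = re.compile(
    r"^\s*(?P<tool>\S+)\s+(?P<stamp>\d{14}\.\d{3})\s+(?P<level>[A-Z]+)\s+"
    r"(?P<module>\S+) - (?P<location>\S+) - (?P<message>.*)$"
)


def filter_log(lines, levels=("ERROR", "WARN")):
    """Return (level, module, message) tuples for lines at the given levels.

    Lines that do not match the assumed layout are skipped silently.
    """
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("level") in levels:
            hits.append(
                (match.group("level"), match.group("module"), match.group("message"))
            )
    return hits
```

Calling `filter_log(open(path))` on a saved log file would yield only the WARN and ERROR entries, which in this excerpt are dominated by the `lxc_apparmor` module.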

Comment 12 by jkop@chromium.org, Mar 16 2018

The command I ran to get that log was 

`sudo lxc-start -P /usr/local/autotest/containers/container_test_L2LhnJ -n test_123_1521150125_207921 -dFo ~/container_log_L2LhnJ.log --logpriority=DEBUG`. The results are in gpaste/6488436445806592.
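For repeated debugging runs, the same invocation can be assembled as an argument list. This is a sketch only: it uses exactly the flags from the command above (with `-dFo` written out as `-d -F -o`), and the function name `build_lxc_start_argv` is hypothetical.

```python
import os


def build_lxc_start_argv(lxc_path, name, log_file, log_priority="DEBUG"):
    """Build the argv for the lxc-start command quoted in this comment.

    The flags mirror the original command; nothing here is verified
    against other lxc versions.
    """
    return [
        "sudo", "lxc-start",
        "-P", lxc_path,                       # container directory
        "-n", name,                           # container name
        "-d", "-F",                           # same as the combined -dF
        "-o", os.path.expanduser(log_file),   # expand a leading ~
        "--logpriority=" + log_priority,
    ]
```

The resulting list can be passed to `subprocess.call` without a shell, which avoids quoting problems in container names and paths.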

Comment 13 by jkop@chromium.org, Mar 16 2018

This works on a drone with the same code that fails on my workstation.

Comment 14 by jkop@chromium.org, Mar 16 2018

This is at least partially a Rodete problem, but doesn't seem to be entirely a Rodete problem.

Comment 15 by jkop@chromium.org, Mar 16 2018

Cc: akes...@chromium.org ayatane@chromium.org
Labels: Chase-Pending
Status: Assigned (was: Started)
Summary: Containers cannot start on Rodete (was: lxc_functional_test fails with duplicated containers)
I'm not sure whether this applies to all containers or just the LXC pool zygotes, but I have determined what is blocking the tests. When running lxc-start (exact command and log in #12), startup is blocked by AppArmor, an access control framework used in Ubuntu and Debian. AppArmor does not appear to exist on servers but is present on workstations.

Comment 16 by ihf@chromium.org, Mar 16 2018

Might be related to the work in b/71629580?
Unrelated to the work in b/71629580. Rodete includes a kernel with Apparmor support, and lxc has support for using Apparmor to restrict what's happening within a container. Unfortunately the lxc support requires additional patches that are carried in the Ubuntu kernel and aren't in the mainline kernel. I think lxc's behaviour here is a bug - it shouldn't depend on features that aren't mainline, and should degrade gracefully instead. You can work around this by setting the lxc.aa_allow_incomplete = 1 option as indicated in the error message.
Labels: -Chase-Pending
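The workaround named in the error message amounts to one line in the container's configuration file. Below is a minimal sketch of a helper that adds or updates such a `key = value` line in config text; the function name `ensure_option` is hypothetical, and the helper treats the config purely as text without validating LXC semantics. Note that comment 19 reports this setting was already in place and had no effect in this case.

```python
def ensure_option(config_text, key, value):
    """Return config_text with `key = value` set exactly once.

    An existing uncommented assignment to `key` is overwritten in place;
    otherwise the assignment is appended at the end.
    """
    wanted = "%s = %s" % (key, value)
    lines = config_text.splitlines()
    for index, line in enumerate(lines):
        # Commented-out lines start with '#', so their name never equals key.
        if "=" in line and line.split("=", 1)[0].strip() == key:
            lines[index] = wanted
            return "\n".join(lines) + "\n"
    lines.append(wanted)
    return "\n".join(lines) + "\n"
```

For example, `ensure_option(text, "lxc.aa_allow_incomplete", "1")` returns the text with the option from the error message present.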

Comment 19 by jkop@chromium.org, Mar 19 2018

Cc: matthewgarrett@google.com
Re #17: That 'workaround' was already in place when I got the error messages above. It has no effect.
Well, in that case it seems like a bug in lxc - opening a bug against that in rodete seems reasonable.

Comment 21 by jkop@chromium.org, Mar 19 2018

Filed; Rodete bug submission rules say to go through Techstop first, so I've done that.

Comment 22 by bugdroid1@chromium.org, Mar 31 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/bc9842db7e514a9c7c624c139cc88dd92f6b5a30

commit bc9842db7e514a9c7c624c139cc88dd92f6b5a30
Author: Jacob Kopczynski <jkop@google.com>
Date: Sat Mar 31 04:53:35 2018

autotest: lxc-functional-test creates directory

The functional tests created temp containers within a directory but did
not check whether that directory existed. Now it does, and creates it.

BUG=chromium:822112
TEST=removed directory and ran test

Change-Id: I5da9376b4c0a8b581ad318819355b05427b5fcd3
Reviewed-on: https://chromium-review.googlesource.com/964930
Commit-Ready: Jacob Kopczynski <jkop@chromium.org>
Tested-by: Jacob Kopczynski <jkop@chromium.org>
Reviewed-by: Jacob Kopczynski <jkop@chromium.org>

[modify] https://crrev.com/bc9842db7e514a9c7c624c139cc88dd92f6b5a30/site_utils/lxc/lxc_functional_test.py
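The idea behind this change can be sketched as follows: make sure the parent directory exists before asking `tempfile.mkdtemp` for a directory inside it, which avoids the `OSError: [Errno 2]` seen in comment 2. This is not the committed code. The helper name `make_test_dir` is hypothetical, and the sketch uses Python 3's `exist_ok`, whereas the tracebacks in this thread come from Python 2.7.

```python
import os
import tempfile


def make_test_dir(parent, prefix="container_test_"):
    """Create a temporary directory under `parent`.

    `parent` (and any missing ancestors) is created first, so mkdtemp
    does not fail when the containers directory is absent.
    """
    os.makedirs(parent, exist_ok=True)
    return tempfile.mkdtemp(dir=parent, prefix=prefix)
```

Creating a directory under `/usr/local/autotest/containers` may still require elevated permissions; the sketch does not handle that case.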

Comment 23 by bugdroid1@chromium.org, Apr 28 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/c54fecdc26aac0d9c3d08dcae6e9cfa7660adecd

commit c54fecdc26aac0d9c3d08dcae6e9cfa7660adecd
Author: Jacob Kopczynski <jkop@google.com>
Date: Sat Apr 28 04:27:38 2018

autotest: lxc: use lxc-copy where available

On shards, we have lxc-copy, which is preferred.
On moblab, we still use an old version of lxc, which does not have
lxc-copy and still uses the deprecated lxc-clone.
This checks for the existence of lxc-copy, uses it if able, and falls
back to the old command if necessary.

BUG=chromium:822112
TEST=Hwtest tryjob on moblab

Change-Id: Ie0dbe9048ef503db371a3f74c42b81ec6c6767b0
Reviewed-on: https://chromium-review.googlesource.com/965398
Commit-Ready: Jacob Kopczynski <jkop@chromium.org>
Tested-by: Jacob Kopczynski <jkop@chromium.org>
Reviewed-by: Laurence Goodby <lgoodby@chromium.org>
Reviewed-by: Dan Shi <dshi@google.com>

[modify] https://crrev.com/c54fecdc26aac0d9c3d08dcae6e9cfa7660adecd/site_utils/lxc/utils.py
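The fallback described in this commit message might look roughly like the sketch below: prefer `lxc-copy` when it is on the PATH, otherwise use the deprecated `lxc-clone`. This is an illustration under assumptions, not the code in `utils.py`. The function name `build_clone_command` is hypothetical, only the basic name arguments are shown (`-n`/`-N` for lxc-copy, `-o`/`-n` for lxc-clone), and path and snapshot options are omitted.

```python
import shutil


def build_clone_command(base_name, new_name, which=shutil.which):
    """Return the argv for cloning `base_name` into `new_name`.

    `which` is injectable so the choice can be exercised without
    either tool installed.
    """
    if which("lxc-copy"):
        # Newer lxc: -n is the source container, -N the new name.
        return ["lxc-copy", "-n", base_name, "-N", new_name]
    # Older lxc (e.g. on moblab, per the commit message): -o is the
    # original container, -n the new name.
    return ["lxc-clone", "-o", base_name, "-n", new_name]
```

Detecting the tool at call time rather than hard-coding a version keeps one code path working across shards, moblab, and workstations.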

Comment 24 by jkop@chromium.org, Jun 7 2018

Labels: -Pri-2 Pri-3
Status: Available (was: Assigned)

Comment 25 by jkop@chromium.org, Jun 7 2018

Owner: ----