guado_moblab-paladin HWTest failures: lxc-attach fails mysteriously
Issue description

I've seen a few instances of this since yesterday. Latest: https://luci-milo.appspot.com/buildbot/chromeos/guado_moblab-paladin/9379

Failure:
FAIL moblab_RunSuite moblab_RunSuite timestamp=1525824676 localtime=May 08 17:11:16
Unhandled AutoservRunError: command execution error
* Command:
    /usr/bin/ssh -a -x -o Protocol=2 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -l root -p 22 chromeos2-row1-rack8-host1 "export LIBC_FATAL_STDERR_=1; if type \"logger\" > /dev/null 2>&1; then logger -tag \"autotest\" \"server[stack::run_once|run_as_moblab|run] -> ssh_run(su - moblab -c '/usr/local/autotest/site_utils/run_suite.py --pool='' --board=cyan --build=cyan-release/R66-10452.74.0 --suite_name=dummy_server --retry=True --max_retries=1')\";fi; su - moblab -c '/usr/local/autotest/site_utils/run_suite.py --pool='' --board=cyan --build=cyan-release/R66-10452.74.0 --suite_name=dummy_server --retry=True --max_retries=1'"
Exit status: 1
Duration: 730.166491032

stdout:
[?25h[?0c

stderr:
Autotest instance created: localhost
05-08-2018 [16:58:23] Submitted create_suite_job rpc
05-08-2018 [16:58:28] Created suite job: http://localhost/afe/#tab_id=view_job&object_id=1
@@@STEP_LINK@Link to suite@http://localhost/afe/#tab_id=view_job&object_id=1@@@
05-08-2018 [17:10:32] Suite job is finished.
05-08-2018 [17:10:32] Start collecting test results and dump them to json.
Suite job                          [ PASSED ]
dummy_PassServer.ssp_SERVER_JOB    [ FAILED ]
dummy_PassServer.ssp_SERVER_JOB    FAIL:
dummy_PassServer                   [ PASSED ]
dummy_PassServer_SERVER_JOB        [ FAILED ]
dummy_PassServer_SERVER_JOB        FAIL:
Comment 1 by pprabhu@chromium.org, May 9 2018
haddowk@ is in the middle of a container upgrade for moblab, which may help here, but we don't know either way.
May 10 2018
guado_moblab-paladin has been flaky in the CQ.
May 10 2018
Go ahead and mark as experimental for now.
May 10 2018
How do I mark a paladin builder as experimental? It doesn't look like something I can do via the GE UI.
May 10 2018
It is, oddly enough, a magic string in the tree status. I have already done it.
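For the record, the mechanism is an annotation in the tree status message that chromite's tree status handling parses to get the list of experimental builders. A minimal sketch of the idea, assuming a marker of the form EXPERIMENTAL=<builder>[,<builder>...] (check chromite for the authoritative token; this is illustrative only):

    # Sketch, not the real chromite code: pull experimental builder names out
    # of a tree status string. The EXPERIMENTAL= token format is an assumption.
    import re

    def experimental_builders(tree_status_message):
        match = re.search(r'EXPERIMENTAL=([\w.,-]+)', tree_status_message)
        return match.group(1).split(',') if match else []

    print(experimental_builders('Tree is open (EXPERIMENTAL=guado_moblab-paladin)'))
    # -> ['guado_moblab-paladin']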
May 11 2018
I can repro on a local moblab. autoserv is segfaulting; the best trace so far is:

--- modulename: scanner, funcname: _import_c_make_scanner
scanner.py(5): try:
scanner.py(6):     from simplejson._speedups import make_scanner
Segmentation fault (core dumped)

Might have to get out gdb.
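(The "--- modulename" lines come from running autoserv under Python's stdlib trace module.) A standalone repro, assuming the crash really is triggered by importing the C extension and not by something autoserv-specific:

    # repro_speedups.py - hypothetical minimal repro inside the moblab container.
    # Run it as-is, or under the trace module to get the same per-line output:
    #   python -m trace --trace repro_speedups.py
    # If the shared object is broken, the interpreter dies with SIGSEGV on the
    # import line and the print is never reached.
    from simplejson._speedups import make_scanner
    print(make_scanner)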
May 12 2018
This does not help me much. Anyone else have any other ideas? We might have to ask the toolchain guys. gdb session below:
Starting program: /usr/bin/python /usr/local/autotest/server/autoserv -s -P 2-moblab/192.168.231.100 -m 192.168.231.100 -l nami-release/R68-10658.0.0/gts/cheets_GTS.GtsTvBugReportTestCases -u moblab --lab True -n --parent_job_id=1 -r /usr/local/autotest/results/2-moblab --verify_job_repo_url -p /usr/local/autotest/drone_tmp/control_attach --use-existing-results --pidfile-label container_autoserv
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
BFD: /usr/local/lib/python2.7/dist-packages/simplejson/_speedups.so: don't know how to handle section `.relr.dyn' [0x 13]
warning: `/usr/local/lib/python2.7/dist-packages/simplejson/_speedups.so': Shared library architecture unknown is not compatible with target architecture i386:x86-64.
Program received signal SIGSEGV, Segmentation fault.
elf_dynamic_do_Rela (skip_ifunc=0, lazy=<optimized out>, nrelative=<optimized out>, relsize=<optimized out>, reladdr=<optimized out>, map=0xe9a5f0) at do-rel.h:136
136 do-rel.h: No such file or directory.
(gdb) backtrace
#0 elf_dynamic_do_Rela (skip_ifunc=0, lazy=<optimized out>, nrelative=<optimized out>, relsize=<optimized out>, reladdr=<optimized out>, map=0xe9a5f0) at do-rel.h:136
#1 _dl_relocate_object (scope=<optimized out>, reloc_mode=reloc_mode@entry=0, consider_profiling=<optimized out>, consider_profiling@entry=0) at dl-reloc.c:264
#2 0x00007ffff7deed71 in dl_open_worker (a=a@entry=0x7fffffff8958) at dl-open.c:427
#3 0x00007ffff7dea094 in _dl_catch_error (objname=objname@entry=0x7fffffff8948, errstring=errstring@entry=0x7fffffff8950, mallocedp=mallocedp@entry=0x7fffffff8940,
operate=operate@entry=0x7ffff7deea30 <dl_open_worker>, args=args@entry=0x7fffffff8958) at dl-error.c:187
#4 0x00007ffff7dee44b in _dl_open (file=0xe82af0 "/usr/local/lib/python2.7/dist-packages/simplejson/_speedups.so", mode=-2147483646, caller_dlopen=<optimized out>, nsid=-2, argc=23,
argv=0x7fffffffe488, env=0x7fffffffe548) at dl-open.c:661
#5 0x00007ffff75f002b in dlopen_doit (a=a@entry=0x7fffffff8b70) at dlopen.c:66
#6 0x00007ffff7dea094 in _dl_catch_error (objname=0x99e2b0, errstring=0x99e2b8, mallocedp=0x99e2a8, operate=0x7ffff75effd0 <dlopen_doit>, args=0x7fffffff8b70) at dl-error.c:187
#7 0x00007ffff75f062d in _dlerror_run (operate=operate@entry=0x7ffff75effd0 <dlopen_doit>, args=args@entry=0x7fffffff8b70) at dlerror.c:163
#8 0x00007ffff75f00c1 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#9 0x000000000059dfc3 in _PyImport_GetDynLoadFunc ()
#10 0x000000000042f9a6 in _PyImport_LoadDynamicModule ()
#11 0x000000000053fe2c in ?? ()
#12 0x00000000005406c5 in PyImport_ImportModuleLevel ()
#13 0x0000000000546e37 in ?? ()
#14 0x00000000004d40fb in PyEval_CallObjectWithKeywords ()
#15 0x00000000004ca061 in PyEval_EvalFrameEx ()
#16 0x00000000004c8762 in PyEval_EvalFrameEx ()
#17 0x00000000004cfedc in PyEval_EvalCodeEx ()
#18 0x0000000000596e82 in PyEval_EvalCode ()
#19 0x0000000000596f9a in PyImport_ExecCodeModuleEx ()
#20 0x00000000005b200f in ?? ()
#21 0x000000000053fe2c in ?? ()
#22 0x000000000054056b in PyImport_ImportModuleLevel ()
#23 0x0000000000546e37 in ?? ()
#24 0x00000000004d40fb in PyEval_CallObjectWithKeywords ()
#25 0x00000000004ca061 in PyEval_EvalFrameEx ()
#26 0x00000000004cfedc in PyEval_EvalCodeEx ()
#27 0x0000000000596e82 in PyEval_EvalCode ()
#28 0x0000000000596f9a in PyImport_ExecCodeModuleEx ()
#29 0x00000000005b200f in ?? ()
#30 0x000000000042abf0 in ?? ()
#31 0x0000000000581e65 in ?? ()
#32 0x000000000053fd2f in ?? ()
#33 0x0000000000540342 in PyImport_ImportModuleLevel ()
#34 0x0000000000546e37 in ?? ()
#35 0x00000000004d40fb in PyEval_CallObjectWithKeywords ()
#36 0x00000000004ca061 in PyEval_EvalFrameEx ()
#37 0x00000000004cfedc in PyEval_EvalCodeEx ()
#38 0x0000000000596e82 in PyEval_EvalCode ()
#39 0x0000000000596f9a in PyImport_ExecCodeModuleEx ()
#40 0x00000000005b200f in ?? ()
#41 0x000000000053fe2c in ?? ()
#42 0x000000000054056b in PyImport_ImportModuleLevel ()
#43 0x0000000000546e37 in ?? ()
#44 0x00000000004d40fb in PyEval_CallObjectWithKeywords ()
#45 0x00000000004ca061 in PyEval_EvalFrameEx ()
#46 0x00000000004cfedc in PyEval_EvalCodeEx ()
#47 0x0000000000596e82 in PyEval_EvalCode ()
#48 0x0000000000596f9a in PyImport_ExecCodeModuleEx ()
#49 0x00000000005b200f in ?? ()
#50 0x0000000000581e65 in ?? ()
#51 0x000000000048c5b4 in ?? ()
#52 0x00000000005403fe in PyImport_ImportModuleLevel ()
#53 0x0000000000546e37 in ?? ()
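One reading of the trace above, offered as a hypothesis rather than a confirmed diagnosis: the BFD warning about the `.relr.dyn' section and the fault inside elf_dynamic_do_Rela / _dl_relocate_object both point at the dynamic loader choking while relocating _speedups.so. The extension may have been built by a newer toolchain (`.relr.dyn' is the packed-relative-relocation section emitted by newer linkers) than the glibc inside the container understands, so dlopen faults instead of failing cleanly. A crude check for that, assuming the section-name heuristic is good enough:

    # check_relr.py - rough heuristic only: if the extension's section header
    # string table mentions ".relr.dyn", the .so was likely built with RELR
    # relocations that an older in-container glibc cannot process.
    SO = '/usr/local/lib/python2.7/dist-packages/simplejson/_speedups.so'

    with open(SO, 'rb') as f:
        data = f.read()

    if b'.relr.dyn' in data:
        print('%s mentions .relr.dyn; suspect a toolchain/glibc mismatch' % SO)
    else:
        print('no .relr.dyn marker found in %s' % SO)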
May 12 2018
The new lxc container is in and we got one green run of the guado moblab. I will monitor over the weekend and hopefully can put guado moblab back in the CQ.
May 12 2018
After the update to lxc 2.1.1 there have been 5 green CQ runs in a row. Marking this as fixed; I will remove the experimental flag for the builder at EOD, assuming no failed builds.