suite failed with: No such file or directory: '/opt/infra-tools/usr/bin/lucifer'
Issue description
build: https://ci.chromium.org/p/chromeos/builders/luci.chromeos.general/Prod/b8941450348129903888#
suite job in question: http://cautotest-prod/afe/#tab_id=view_job&object_id=215557238
logs: https://stainless.corp.google.com/browse/chromeos-autotest-results/215557238-chromeos-test/hostless/debug/
job reporter logs: https://storage.cloud.google.com/chromeos-autotest-results/215557238-chromeos-test/hostless/lucifer/job_reporter_output.log

snippet:
job_reporter: 2018-07-09 14:36:19,542:DEBUG:eventlib:run_event_command:90:Starting event command with ['/opt/infra-tools/usr/bin/lucifer', 'test', '-autotestdir', '/usr/local/autotest', '-abortsock', '/usr/local/autotest/leases/215557238.sock', '-hosts', '', '-x-level', 'STARTING', '-resultsdir', '/usr/local/autotest/results/215557238-chromeos-test/hostless', '-x-control-file', '/usr/local/autotest/results/215557238-chromeos-test/hostless/lucifer/control_attach', '-x-execution-tag', '215557238-chromeos-test/hostless', '-x-job-owner', u'chromeos-test', '-x-job-name', u'caroline-arcnext-chrome-pfq/R69-10861.0.0-rc1-test_suites/control.sanity', '-x-reboot-after', 'never', '-x-run-reset', '-x-test-retries', '0', '-x-require-ssp']
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/autotest/venv/lucifer/cmd/job_reporter.py", line 212, in <module>
    sys.exit(main(sys.argv))
  File "/usr/local/autotest/venv/lucifer/cmd/job_reporter.py", line 45, in main
    ret = _main(args)
  File "/usr/local/autotest/venv/lucifer/cmd/job_reporter.py", line 95, in _main
    return _run_autotest_job(args)
  File "/usr/local/autotest/venv/lucifer/cmd/job_reporter.py", line 108, in _run_autotest_job
    ret = _run_lucifer_job(handler, args, job)
  File "/usr/local/autotest/venv/lucifer/cmd/job_reporter.py", line 151, in _run_lucifer_job
    event_handler=event_handler, args=command_args)
  File "/usr/local/autotest/venv/lucifer/eventlib.py", line 91, in run_event_command
    with subprocess32.Popen(args, stdout=PIPE) as proc:
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-a234a83456f26d726445fc6c8e6ce271/local/lib/python2.7/site-packages/subprocess32.py", line 825, in __init__
    restore_signals, start_new_session)
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-a234a83456f26d726445fc6c8e6ce271/local/lib/python2.7/site-packages/subprocess32.py", line 1574, in _execute_child
    raise child_exception_type(errno_num, err_msg)
OSError: [Errno 2] No such file or directory: '/opt/infra-tools/usr/bin/lucifer'
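The traceback bottoms out in `Popen` raising `OSError` with errno 2 because the executable path does not exist on the host. Below is a minimal sketch of that failure mode and of a pre-flight check that would report it more explicitly. It uses the standard `subprocess` module from Python 3 rather than the `subprocess32` backport seen in the log. The helper name `run_event_command_checked` and the check itself are hypothetical and are not the actual `eventlib` code.

```python
import errno
import os
import subprocess


def run_event_command_checked(args):
    """Run args[0] with its arguments, failing early if the binary is missing.

    Hypothetical helper for illustration; not the real
    eventlib.run_event_command.
    """
    binary = args[0]
    # Pre-flight check: report a missing or non-executable binary explicitly
    # instead of letting Popen raise a bare ENOENT deep in the call stack.
    if not (os.path.isfile(binary) and os.access(binary, os.X_OK)):
        raise OSError(errno.ENOENT,
                      'event command binary not deployed on this host',
                      binary)
    with subprocess.Popen(args, stdout=subprocess.PIPE) as proc:
        output, _ = proc.communicate()
    return proc.returncode, output
```

A check like this does not remove the race described later in the thread; it only makes the cause easier to read in the job reporter log.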
Jul 9
Jul 9
A group of paladins failed the same way:
https://luci-milo.appspot.com/buildbot/chromeos/sentry-paladin/3546
https://luci-milo.appspot.com/buildbot/chromeos/wolf-paladin/18301
https://luci-milo.appspot.com/buildbot/chromeos/nyan_big-paladin/5597
https://luci-milo.appspot.com/buildbot/chromeos/link-paladin/32290
https://luci-milo.appspot.com/buildbot/chromeos/nyan_kitty-paladin/5600
https://luci-milo.appspot.com/buildbot/chromeos/tidus-paladin/3425
https://luci-milo.appspot.com/buildbot/chromeos/peppy-paladin/19117
https://luci-milo.appspot.com/buildbot/chromeos/peach_pit-paladin/19700
Jul 9
Jul 9
Onset of issues roughly corresponds to the push to prod at 1:30 pm today, with the following autotest range:

git log --oneline 24e3b6e00..a022d333f | grep autotest
4b5f7e033 autotest: Update CrosHost.cleanup_services to start powerd.
0a47a6780 autotest: Also catch IOError for missing file
08fd214e5 autotest: Use universal lucifer binary
375783bd9 autotest: Remove Logcat alias.
73560c9d5 autotest: Move hardware_RamFio to bvt-perbuild.
0cb61953e Revert "[autotest] Remove unnecessary tko/db construction logic"
1b7318a41 autotest: audio_LoopbackLatency - Log performance data for 3.5mm and USB separately.
8b9126d21 autotest: Replace screenshot.py with screenshot command.
1ea28c617 autotest: improve run_test_binary() error message
027561f6e [autotest] Remove unnecessary tko/db construction logic
1196de33d autotest: Call the correct function.
6fbd3afdf autotest: Fix wrong use of utils.system().
2c7c98f64 autotest: Save logcat while waiting for ARC boot.
b279566d7 autotest: Pass job_keyvals to run_suite_skylab.
41dbe2309 autotest: Remove trailing slash for swarming offloads
9f6e11fad autotest: Remove unused code.
9abea6507 autotest: Remove explicit extension.
0d9e8aff8 autotest: hostap_config: Allow configuring max STA count
059df7bb1 autotest: Remove 'remote' dir from Tast data paths.
7c3c321a9 autotest-capability: Apply video_PowerConsumption
c92805bfe [autotest] Handle version labels from `cros stage`
da6280387 [autotest][cfm] Refactores enterprise_CFM_AtrusUpdaterStress to a client test
d486356f1 Add truncate and write_file to SmbProvider autotest
7658c597c Add delete_entry to SmbProvider autotest
f91f9c606 Add create_file to SmbProvider autotest
d6aa01143 Add read_file to SmbProvider autotest
b1455cb59 Add open_file and close_file to SmbProvider autotest
1ba15530f Add get_metadata to SmbProvider autotest
43022474e Add unmount to SmbProvider autotest
34739f93a Add ReadDirectory to SmbProvider autotest
67c1181da autotest: Don't log pending provision tasks in buildbot.
Jul 9
Suspects:

autotest: Use universal lucifer binary

BUG=chromium:857603
TEST=None
CQ-DEPEND=CL:1119105
Change-Id: I68865e035e40aa5a5409ad2cdb047157c19b6969
Reviewed-on: https://chromium-review.googlesource.com/1119117
Commit-Ready: Allen Li <ayatane@chromium.org>
Tested-by: Allen Li <ayatane@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Jul 9
Initial prognosis: I landed a change in the same push as its dependency, so it caused one run to fail due to a race. Let me take a few minutes to double-check.
Jul 9
Checking the timestamps for sentry-paladin, the suite failed at 14:57:17,664. That HWTest ran on drone cros-full-0032, where the new lucifer binary got deployed at 14:58. So this was an unfortunate race due to landing a change one push too early. I'll keep this for a few CQ runs.
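The ordering argument above can be checked with a few lines of Python. This is only a sketch using the two timestamps quoted in the comment. The deployment time is known only to the minute, so the seconds are assumed to be :00 here, and the function name is made up for illustration.

```python
from datetime import datetime

# Timestamps quoted in the comment above. The deployment time is only known
# to the minute, so its seconds are assumed to be :00 for illustration.
SUITE_FAILED_AT = datetime.strptime('14:57:17,664', '%H:%M:%S,%f')
BINARY_DEPLOYED_AT = datetime.strptime('14:58', '%H:%M')


def failed_before_deploy(failed_at, deployed_at):
    """Return True if the failure happened before the binary was deployed."""
    return failed_at < deployed_at


# Time between the suite failure and the (assumed) start of the deploy minute.
gap = BINARY_DEPLOYED_AT - SUITE_FAILED_AT
print(failed_before_deploy(SUITE_FAILED_AT, BINARY_DEPLOYED_AT), gap)
```

Under that assumption the suite failed less than a minute before the binary reached the drone, which is consistent with the race explanation.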
Jul 9
Even if it was a race, +Chase-Pending for follow up as we got 0 alerts about this and were notified only because the oncalls directed deputy attention to the failures.
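As a rough illustration of the kind of check that could raise an alert here, the sketch below counts how many job reporter logs contain the same missing-binary signature and flags paths that recur. It is not the alerting discussed in this thread. The function names and the threshold of 3 are assumptions chosen for the example.

```python
import re
from collections import Counter

# Matches the error signature from the job_reporter traceback.
MISSING_BINARY_RE = re.compile(
    r"OSError: \[Errno 2\] No such file or directory: '(?P<path>[^']+)'")


def missing_binary_counts(log_texts):
    """Count, per missing path, how many logs contain the ENOENT signature."""
    counts = Counter()
    for text in log_texts:
        # Use a set so a log that mentions a path twice is counted once.
        paths = {m.group('path') for m in MISSING_BINARY_RE.finditer(text)}
        for path in paths:
            counts[path] += 1
    return counts


def paths_to_alert_on(log_texts, threshold=3):
    """Return missing paths seen in at least `threshold` logs.

    The default threshold is a hypothetical value, not a tuned one.
    """
    counts = missing_binary_counts(log_texts)
    return sorted(p for p, n in counts.items() if n >= threshold)
```

Several builders failing with an identical path within one push window, as the paladins listed earlier did, is the pattern such a check would pick up.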
Jul 16
The same alerting or monitoring as Issue 863504 would have surfaced this.
Comment 1 by akes...@chromium.org
, Jul 9