lab-staging failure: deploy_server_local choking on UPDATE_COMMANDS directive
Issue description

[chromeos-staging-master2.hot.corp.google.com] out: DEBUG:root:[stdout] Running update commands: build_externals, afe
[chromeos-staging-master2.hot.corp.google.com] out: DEBUG:root:[stdout] build_externals
[chromeos-staging-master2.hot.corp.google.com] out: DEBUG:root:[stdout] Running: build_externals: /usr/local/autotest/utils/build_externals.py --use_chromite_master
[chromeos-staging-master2.hot.corp.google.com] out: ERROR:root:[stderr] Traceback (most recent call last):
[chromeos-staging-master2.hot.corp.google.com] out: ERROR:root:[stderr] File "/usr/local/autotest/site_utils/deploy_server_local.py", line 538, in <module>
[chromeos-staging-master2.hot.corp.google.com] out: ERROR:root:[stderr] sys.exit(main(sys.argv[1:]))
[chromeos-staging-master2.hot.corp.google.com] out: ERROR:root:[stderr] File "/usr/local/autotest/site_utils/deploy_server_local.py", line 530, in main
[chromeos-staging-master2.hot.corp.google.com] out: ERROR:root:[stderr] use_chromite_master=behaviors.update_push_servers)
[chromeos-staging-master2.hot.corp.google.com] out: ERROR:root:[stderr] File "/usr/local/autotest/site_utils/deploy_server_local.py", line 359, in run_deploy_actions
[chromeos-staging-master2.hot.corp.google.com] out: ERROR:root:[stderr] use_chromite_master=use_chromite_master)
[chromeos-staging-master2.hot.corp.google.com] out: ERROR:root:[stderr] File "/usr/local/autotest/site_utils/deploy_server_local.py", line 237, in update_command
[chromeos-staging-master2.hot.corp.google.com] out: ERROR:root:[stderr] raise UnknownCommandException(cmd_tag, cmds)
[chromeos-staging-master2.hot.corp.google.com] out: ERROR:root:[stderr] __main__.UnknownCommandException: ('afe\nbuild_externals', {'migrate': 'AUTOTEST_REPO/database/migrate.py sync', 'afe': 'AUTOTEST_REPO/utils/compile_gwt_clients.py -c autotest.AfeClient', 'build_externals': 'AUTOTEST_REPO/utils/build_externals.py', 'test_importer': 'AUTOTEST_REPO/utils/test_importer.py', 'tko': 'AUTOTEST_REPO/utils/compile_gwt_clients.py -c autotest.TkoClient', 'apache': 'sudo service apache2 reload'})

Looks like a bad puppet change that pushed corrupted UPDATE_COMMANDS to shadow_config. Pri-1 since test_push hasn't passed in days.
Dec 19 2017
Looks like only staging shard is affected, so fixing manually. This was caused by doing something clever in Puppet to format the ini file nicely, and then removing it. Not sure why only this one machine is affected, probably an odd edge case.
Dec 19 2017
This was a combination of previously using newline commas in the inifile for convenience, removing it for sanity, probably some quirk in how Puppet inifile works, the INI format not being a standard, and some coincidence that only affected this one machine. In any case, it's fixed now, and rerunning Puppet does not re-corrupt the file.
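For anyone hitting something similar: a minimal sketch of how a wrapped INI value can end up with an embedded newline. This uses Python 3's standard configparser as a stand-in; whether the real shadow_config reader and Puppet's inifile handling behave identically is an assumption, and the indented continuation line in the sample is hypothetical (the actual whitespace in the corrupted file is not shown in this bug).

```python
import configparser

# Hypothetical shadow_config fragment. The indented second line is an
# assumption: configparser treats indented lines as a continuation of
# the previous value.
SHADOW_CONFIG = """\
[UPDATE]
commands = build_externals,afe
    build_externals, afe
"""

parser = configparser.ConfigParser()
parser.read_string(SHADOW_CONFIG)

# Continuation lines are joined with a newline, not a comma.
raw_value = parser.get("UPDATE", "commands")

# A plain comma split therefore yields one tag containing a newline.
tags = [tag.strip() for tag in raw_value.split(",")]

print(repr(raw_value))
print(tags)
```

The middle tag comes out as 'afe\nbuild_externals', which matches the tag reported in the UnknownCommandException above.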
Comment 1 by pprabhu@chromium.org, Dec 19 2017
Status: Assigned (was: Started)
The problem is in push-shard's shadow_config:

[UPDATE]
commands = build_externals,afe
build_externals, afe
------

deploy_server_local parses the commands field and expects it to be a comma-separated list. This newline-separated list was introduced by a combination of CLs in this stack: https://chrome-internal-review.googlesource.com/c/chromeos/chromeos-admin/+/525547/5

A single revert won't fix this. Perhaps the right answer is to teach deploy_server_local to start using ~/push_update_commands instead of reading shadow_config? This is blocking test_push, so if a simple revert is possible, I'd prefer that.
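As a possible stopgap on the parsing side, here is a rough sketch of a more tolerant split that accepts both commas and newlines as separators and drops duplicates. The function name parse_update_commands is hypothetical and is not part of deploy_server_local; this only illustrates the idea and is not a proposed patch.

```python
import re


def parse_update_commands(raw_value):
    """Split a commands value on commas or newlines (hypothetical helper).

    Returns the command tags in first-seen order, without blanks or
    duplicates.
    """
    tags = []
    for tag in re.split(r"[,\n]", raw_value):
        tag = tag.strip()
        if tag and tag not in tags:
            tags.append(tag)
    return tags


# The value as it would look with the newline embedded in the field.
corrupted = "build_externals,afe\nbuild_externals, afe"
print(parse_update_commands(corrupted))
```

With the corrupted value above this yields build_externals and afe once each. It would hide the bad config rather than fix it, so the config itself would still need cleaning up.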