A bunch of quarantined bots with "Failed to call hook get_state(): 'ascii' codec can't encode characters"
Issue description

There's a bunch of Linux bots in the luci.chromium.try pool stuck with:

Failed to call hook get_state(): 'ascii' codec can't encode characters in position 108-132: ordinal not in range(128)
/../../.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/_current/ENV/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
File "/opt/infra-bot-setup/infra-python/ENV/bin/../../.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/_current/ENV/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
File "/opt/infra-bot-setup/infra-python/ENV/bin/../../.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/_current/ENV/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
File "/opt/infra-bot-setup/infra-python/ENV/bin/../../.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/_current/ENV/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
File "/opt/infra-bot-setup/infra-python/ENV/bin/../../.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/_current/ENV/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
File "/opt/infra-bot-setup/infra-python/ENV/bin/../../.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/_current/ENV/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
File "/opt/infra-bot-setup/infra-python/ENV/bin/../../.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/_current/ENV/lib/python2.7/os.py", line 294, in walk
    for x in walk(new_path, topdown, onerror, followlinks):
File "/opt/infra-bot-setup/infra-python/ENV/bin/../../.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/_current/ENV/lib/python2.7/os.py", line 284, in walk
    if isdir(join(top, name)):
File "/opt/infra-bot-setup/infra-python/ENV/bin/../../.cipd/pkgs/infra_python_linux-amd64-ubuntu14_04_tap59FFGuW/_current/ENV/lib/python2.7/genericpath.py", line 41, in isdir
    st = os.stat(s)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 108-132: ordinal not in range(128)

Calling stack:
0 /b/swarming/swarming_bot.2.zip/api/bot.py:207:post_error()
1 /b/swarming/swarming_bot.2.zip/bot_code/bot_main.py:295:_call_hook_safe()
2 /b/swarming/swarming_bot.2.zip/bot_code/bot_main.py:350:_get_state()
3 /b/swarming/swarming_bot.2.zip/bot_code/bot_main.py:1041:_run_bot_inner()
4 /b/swarming/swarming_bot.2.zip/bot_code/bot_main.py:944:_run_bot()
5 /b/swarming/swarming_bot.2.zip/bot_code/bot_main.py:1326:main()
6 /b/swarming/swarming_bot.2.zip/__main__.py:166:CMDstart_bot()
7 /b/swarming/swarming_bot.2.zip/__main__.py:252:main()
8 /b/swarming/swarming_bot.2.zip/__main__.py:264:<module>()
9 /usr/lib/python2.7/runpy.py:72:_run_code()
10 /usr/lib/python2.7/runpy.py:162:_run_module_as_main()

These ones:
https://chromium-swarm.appspot.com/bot?id=swarm691-c4&sort_stats=total%3Adesc
https://chromium-swarm.appspot.com/bot?id=swarm909-c4&sort_stats=total%3Adesc
https://chromium-swarm.appspot.com/bot?id=swarm982-c4&sort_stats=total%3Adesc
https://chromium-swarm.appspot.com/bot?id=swarm983-c4&sort_stats=total%3Adesc

I'm curious to see what's causing this, since these bots run only recipes and recipes aren't expected to do any crazy stuff to bots.
Feb 5 2018
get_recursive_size fails on
/b/swarming/c/W8/cast_shell_linux/src/third_party/WebKit/LayoutTests/http/tests/local/fileapi/resources/file-for-drag-to-send3-ABC~‾¥≈¤・・•∙·☼★星🌟星★☼·∙•・・¤≈¥‾~XYZ.txt

But I'm not sure why it is using ASCII for file system path :( The default encoding is supposed to be utf-8 (and it works just fine in this case). A bandaid would be to catch UnicodeEncodeError and give up.
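For anyone trying to reproduce this: in Python 2 on Linux, a unicode path passed to os.stat() is encoded with sys.getfilesystemencoding(), which is derived from the process locale and is separate from sys.getdefaultencoding(). A process started under a C/POSIX locale would therefore encode paths as ASCII. Whether that is what happened on these bots is an assumption and is not confirmed anywhere in this thread. The sketch below is written for Python 3, uses a shortened, hypothetical file name, and only prints the relevant settings and checks which codecs can encode the name; describe_encodings and can_encode are illustrative helpers, not part of the bot code.

```python
import locale
import sys


def describe_encodings():
    """Collects the encodings that influence how paths are encoded."""
    return {
        "default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        "locale": locale.getpreferredencoding(False),
    }


def can_encode(name, encoding):
    """Returns True if name can be encoded with the given codec."""
    try:
        name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical, shortened stand-in for the LayoutTests file name.
name = "file-ABC~\u2605\u661f~XYZ.txt"

print(describe_encodings())        # values depend on the environment
print(can_encode(name, "ascii"))   # False
print(can_encode(name, "utf-8"))   # True
```

Running the first print on an affected bot, under the same environment the bot process starts with, would show whether the filesystem encoding is ASCII there.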
Feb 6 2018
The following revision refers to this bug:
https://chromium.googlesource.com/infra/luci/luci-py.git/+/ca8e4ec9caf1c31e1b2f944fa600eb40a7d6ef75

commit ca8e4ec9caf1c31e1b2f944fa600eb40a7d6ef75
Author: Vadim Shtayura <vadimsh@chromium.org>
Date: Tue Feb 06 13:49:08 2018

Make get_recursive_size skip symlinks, don't crash on unicode paths.

It is not clear why it doesn't handle unicode paths, but it is better to return -1 rather than crash. Crashing in get_recursive_size causes the bot to quarantine itself, which is bad.

Also add dumb typo-catching test.

R=maruel@chromium.org, iannucci@chromium.org
BUG= 809196

Change-Id: I6a26bcc6a2fc4651596473da4c8377a41fe25896
Reviewed-on: https://chromium-review.googlesource.com/902814
Commit-Queue: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>

[modify] https://crrev.com/ca8e4ec9caf1c31e1b2f944fa600eb40a7d6ef75/appengine/swarming/swarming_bot/api/os_utilities.py
[modify] https://crrev.com/ca8e4ec9caf1c31e1b2f944fa600eb40a7d6ef75/appengine/swarming/swarming_bot/api/os_utilities_test.py
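The behavior described in the commit message (skip symlinks, return -1 instead of crashing) can be sketched roughly as follows. This is not the code from the linked change; it is a minimal Python 3 illustration, and the _reraise helper, the exception list, and the temporary files in the usage part are assumptions made for the example.

```python
import os
import tempfile


def _reraise(error):
    # os.walk() ignores directory listing errors unless onerror raises them.
    raise error


def get_recursive_size(path):
    """Returns the total size in bytes of regular files under path, or -1 on failure."""
    total = 0
    try:
        # followlinks defaults to False, so symlinked directories are not entered.
        for root, _dirs, files in os.walk(path, onerror=_reraise):
            for name in files:
                full = os.path.join(root, name)
                if os.path.islink(full):
                    continue  # skip symlinked files
                total += os.lstat(full).st_size
    except (OSError, UnicodeError):
        # UnicodeEncodeError is a subclass of UnicodeError.
        return -1
    return total


# Usage: one 5-byte file plus a symlink to it, then a path that does not exist.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.txt"), "wb") as f:
        f.write(b"12345")
    os.symlink(os.path.join(tmp, "a.txt"), os.path.join(tmp, "link.txt"))
    size = get_recursive_size(tmp)                              # 5: the symlink is not counted
    missing = get_recursive_size(os.path.join(tmp, "missing"))  # -1: listing fails
```

Returning a sentinel keeps a single unreadable path from failing the whole get_state() hook, at the cost of reporting no size at all for that directory tree.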
Feb 6 2018
Almost all Dart Linux bots have quarantined themselves because of that error:
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&c=pool&f=pool%3Aluci.dart.try&l=100&s=id%3Aasc

I'm raising the priority because our CQ will soon be out of capacity if this continues.
Feb 6 2018
Woah, I didn't notice it is so severe for Dart bots. We have a potential fix, I'll be deploying it now to staging, and a bit later to chromium-swarm.
Feb 6 2018
Thanks!
Feb 6 2018
Looks like the fix worked.
Comment 1 by vadimsh@chromium.org, Feb 5 2018