Intermittent goma failure on multiple tryservers
Issue description
https://build.chromium.org/p/tryserver.chromium.android/builders/cast_shell_android/builds/29154

/usr/bin/python /b/build/goma/goma_ctl.py ensure_start
creating crash dump dir (/tmp/goma_crash.chrome-bot).
17512
Using goma VERSION=99 (no_auto_update)
Traceback (most recent call last):
GOMA version b1b0d518a1273b5ae920bd4805b899acc23a9e15@1455784174
  File "/b/build/goma/goma_ctl.py", line 2292, in <module>
waiting for compiler_proxy...
waiting for compiler_proxy...
waiting for compiler_proxy...
waiting for compiler_proxy...
waiting for compiler_proxy...
waiting for compiler_proxy...
waiting for compiler_proxy...
waiting for compiler_proxy...
waiting for compiler_proxy...
waiting for compiler_proxy...
waiting for compiler_proxy...
    sys.exit(main())
  File "/b/build/goma/goma_ctl.py", line 2287, in main
compiler proxy (pid=17512,17497) status: http://127.0.0.1:8088 error: timed out to send request to backend servers
    goma.Dispatch(sys.argv[1:])
  File "/b/build/goma/goma_ctl.py", line 1009, in Dispatch
    self._action_mappings.get(args[0], self._DefaultAction)()
  File "/b/build/goma/goma_ctl.py", line 645, in _EnsureStartCompilerProxy
    self._GenericStartCompilerProxy(ensure=True)
  File "/b/build/goma/goma_ctl.py", line 639, in _GenericStartCompilerProxy
    raise Error('Failed to start compiler_proxy successfully.')
__main__.Error: Failed to start compiler_proxy successfully.
/usr/bin/python /b/build/goma/goma_ctl.py jsonstatus /tmp/tmpGd6VlI.json
/usr/bin/python /b/build/goma/goma_ctl.py stop
Killing compiler proxy.
compiler proxy status: http://127.0.0.1:8088 quit!
/b/build/scripts/slave/gsutil cp file:///tmp/tmpS0pqMg gs://chrome-goma-log/2016/03/01/slave106-c4/compiler_proxy.slave106-c4.chrome-bot.log.INFO.20160229-165705.17512.gz
Copying file:///tmp/tmpS0pqMg [Content-Type=application/octet-stream]...
Uploading ...-c4.chrome-bot.log.INFO.20160229-165705.17512.gz: 0 B/73.26 KiB
Uploading ...-c4.chrome-bot.log.INFO.20160229-165705.17512.gz: 72 KiB/73.26 KiB
Uploading ...-c4.chrome-bot.log.INFO.20160229-165705.17512.gz: 73.26 KiB/73.26 KiB
Copied log file to gs://chrome-goma-log/2016/03/01/slave106-c4/compiler_proxy.slave106-c4.chrome-bot.log.INFO.20160229-165705.17512.gz
Visualization at http://chromium-build-stats.appspot.com/compiler_proxy_log/2016/03/01/slave106-c4/compiler_proxy.slave106-c4.chrome-bot.log.INFO.20160229-165705.17512.gz
/usr/bin/python /opt/infra-python/run.py infra.tools.send_monitoring_event --event-mon-run-type prod --build-event-type BUILD --event-mon-timestamp-kind POINT --event-logrequest-path /b/build/slave/cast_shell_android/build/.recipe_runtime/tmpIRVar6/build_data/log_request_proto --build-event-goma-stats-path /b/build/slave/cast_shell_android/build/.recipe_runtime/tmpIRVar6/build_data/goma_stats_proto
error: failed to start goma; fallback has been disabled
Traceback (most recent call last):
  File "/b/build/scripts/slave/compile.py", line 1317, in <module>
    sys.exit(real_main())
  File "/b/build/scripts/slave/compile.py", line 1313, in real_main
    return main(options, args)
  File "/b/build/scripts/slave/compile.py", line 841, in main_ninja
    goma_ready = goma_setup(options, env)
  File "/b/build/scripts/slave/compile.py", line 200, in goma_setup
    raise Exception('failed to start goma')
Exception: failed to start goma
step returned non-zero exit code: 1
@@@STEP_LOG_LINE@json.output@{@@@
@@@STEP_LOG_LINE@json.output@ "notice": [@@@
@@@STEP_LOG_LINE@json.output@ {@@@
@@@STEP_LOG_LINE@json.output@ "infra_status": {@@@
@@@STEP_LOG_LINE@json.output@ "num_compiler_info_fail": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_compiler_info_miss": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_exec_compiler_proxy_failure": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_exec_fail_fallback": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_http_active": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_http_error": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_http_retry": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_http_sent": 3, @@@
@@@STEP_LOG_LINE@json.output@ "num_http_timeout": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_network_error": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_network_recovered": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_user_error": 0, @@@
@@@STEP_LOG_LINE@json.output@ "num_user_warning": 0, @@@
@@@STEP_LOG_LINE@json.output@ "ping_status_code": 408@@@
@@@STEP_LOG_LINE@json.output@ }, @@@
@@@STEP_LOG_LINE@json.output@ "version": 1@@@
@@@STEP_LOG_LINE@json.output@ }@@@
@@@STEP_LOG_LINE@json.output@ ]@@@
@@@STEP_LOG_LINE@json.output@}@@@
@@@STEP_LOG_END@json.output@@@
@@@STEP_EXCEPTION@@@

From https://build.chromium.org/p/tryserver.chromium.android/builders/cast_shell_android?numbuilds=200 it looks like there are several such failures. Needs investigation. CC'ing sheriffs. Not sure which label to use - there's no auto-complete for Sheriff-related labels.
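For anyone triaging similar builds, the json.output in the log has a simple shape: a "notice" list whose entries carry an "infra_status" object, here with "ping_status_code": 408. A minimal Python sketch for flagging such builds follows. The helper name `ping_failures` and the abridged `SAMPLE` are hypothetical, and treating any code other than 200 as a failed ping is an assumption, not documented goma behavior.

```python
import json

# Sample shaped like the json.output in the log above (fields abridged).
SAMPLE = """
{"notice": [{"infra_status": {"num_http_sent": 3,
                              "num_http_timeout": 0,
                              "num_network_error": 0,
                              "ping_status_code": 408},
             "version": 1}]}
"""


def ping_failures(status_text):
    """Return the ping_status_code of each notice whose ping was not 200.

    Hypothetical helper: assumes 200 is the only healthy value. A notice
    with no ping_status_code is reported as None.
    """
    status = json.loads(status_text)
    codes = []
    for notice in status.get("notice", []):
        code = notice.get("infra_status", {}).get("ping_status_code")
        if code != 200:
            codes.append(code)
    return codes


print(ping_failures(SAMPLE))
```

An empty result would mean every notice reported a ping status of 200.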
Mar 1 2016
Updating synopsis. Seen also on linux_chromium_asan_rel_ng: https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_asan_rel_ng/builds/123653
Mar 2 2016
Mar 24 2016
Mar 25 2016
I believe VERSION=102 handles this case better. Is this still happening?
Mar 25 2016
Not so much since the version 102 release? https://goto.google.com/ytlnx

I am a bit late in asking this, but which is preferred?
a. The goma client gives up and reports a network/server error as soon as possible.
b. The goma client retries on network errors for as long as possible.
In other words, how long can people wait for goma to set up, and when is it OK to give up? Since goma is a client/server application, it sometimes needs to give up and warn when the network and/or the server has an issue.

On March 1st, the behavior was: give up after 10 seconds, retry every 4 seconds, and never reuse an IP address that had an issue.
Currently: give up after 30 seconds, retry every 10 seconds, and avoid an IP address that had an issue until all IP addresses are used up, after which it may be used again.
(In fact, a certain number of cases do take 10 seconds: https://goto.google.com/ocyno)
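To make the trade-off concrete, the current policy described above (30-second give-up, 10-second retry interval, failed addresses skipped until every address has failed) can be sketched in Python. This is only an illustration of the policy as stated in this comment, not the goma client's code. The function name `connect_with_retry`, its parameters, and the `try_connect` callback are all hypothetical.

```python
import time


def connect_with_retry(addresses, try_connect, give_up_after=30.0,
                       retry_interval=10.0, clock=time.monotonic,
                       sleep=time.sleep):
    """Return the first address that connects, or None on give-up.

    Hypothetical sketch: addresses that failed are skipped until every
    address has failed once; then the failed set is cleared so that they
    may be tried again. No attempt is started at or after the deadline.
    """
    if not addresses:
        return None
    deadline = clock() + give_up_after
    failed = set()
    while True:
        candidates = [a for a in addresses if a not in failed]
        if not candidates:
            # Every address has had an issue: allow all of them again.
            failed.clear()
            candidates = list(addresses)
        addr = candidates[0]
        if try_connect(addr):
            return addr
        failed.add(addr)
        if clock() + retry_interval >= deadline:
            return None  # the next attempt would not start before the deadline
        sleep(retry_interval)
```

Passing `give_up_after=10.0` and `retry_interval=4.0` gives the March 1st timings, though the March 1st rule of never reusing a failed address would also need the `failed.clear()` branch replaced by an early return. The `clock` and `sleep` parameters exist only so the policy can be exercised without real waiting.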
Apr 4 2016
The following revision refers to this bug:
https://chrome-internal.googlesource.com/goma/client/+/9fd3d9653daa3669e7e7c8e302097d0dc2b5a321

commit 9fd3d9653daa3669e7e7c8e302097d0dc2b5a321
Author: Yoshisato Yanagisawa <yyanagisawa@google.com>
Date: Fri Apr 01 01:09:51 2016
Apr 27 2016
May 20 2016
Still failing?
May 20 2016
Not as far as I can tell. Shall we call this fixed by the above commit?
May 20 2016
Let's mark this as fixed. If it happens again, please reopen or file a new bug.
Comment 1 by kbr@chromium.org, Mar 1 2016