veyron_rialto-chrome-pfq timeout
Issue description
Here's the bot: https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_rialto-chrome-pfq
First failing build: https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_rialto-chrome-pfq/builds/550
Most recent failing build: https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_rialto-chrome-pfq/builds/556

Looks like the build is taking too long?

Log snippet:
@@@STEP_FAILURE@@@
02:14:19: ERROR: Timeout occurred- waited 4701 seconds, failing. Timeout reason: This build has reached the timeout deadline set by the master. Either this stage or a previous one took too long (see stage timing historical summary in ReportStage) or the build failed to start on time.
cros_sdk: Signaled to shutdown: caught 15 signal.
@@@STEP_FAILURE@@@
02:14:19: ERROR: Traceback (most recent call last):
  File "/b/cbuild/internal_master/chromite/cbuildbot/stages/generic_stages.py", line 525, in Run
    self.PerformStage()
  File "/b/cbuild/internal_master/chromite/cbuildbot/stages/build_stages.py", line 331, in PerformStage
    extra_env=self._portage_extra_env)
  File "/b/cbuild/internal_master/chromite/cbuildbot/commands.py", line 482, in Build
    enter_chroot=True)
  File "/b/cbuild/internal_master/chromite/cbuildbot/commands.py", line 139, in RunBuildScript
    raise failures_lib.BuildScriptFailure(ex, cmd[0])
  File "/b/cbuild/internal_master/chromite/cbuildbot/commands.py", line 124, in RunBuildScript
    return runcmd(cmd, **kwargs)
  File "/b/cbuild/internal_master/chromite/lib/cros_build_lib.py", line 594, in RunCommand
    (cmd_result.output, cmd_result.error) = proc.communicate(input)
  File "/usr/lib/python2.7/subprocess.py", line 751, in communicate
    self.wait()
  File "/usr/lib/python2.7/subprocess.py", line 1291, in wait
    pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
  File "/usr/lib/python2.7/subprocess.py", line 478, in _eintr_retry_call
    return func(*args)
  File "/b/cbuild/internal_master/chromite/lib/timeout_util.py", line 131, in kill_us
    raise SystemExit(error_message)
SystemExit: Timeout occurred- waited 4575 seconds, failing. Timeout reason: This build has reached the timeout deadline set by the master. Either this stage or a previous one took too long (see stage timing historical summary in ReportStage) or the build failed to start on time.
,
Sep 6 2016
,
Sep 6 2016
,
Sep 6 2016
Successful run lasts more than 3 hours: https://00e9e64bac8930c4b86005811899dc66c8907d4cc29753134e-apidata.googleusercontent.com/download/storage/v1/b/chromeos-image-archive/o/veyron_rialto-chrome-pfq%2FR55-8765.0.0-rc1%2Ftimeline-stages.html?qk=AD5uMEub_Lm_1wsSGKa6wwRIs1s_cVidHolHaox84scgLEMEMoGjj5kqyWc0CJHegsDNkootaK8sxwn2yhGUdzCGsBxkOWobs4i7FGEN36HRI2L5tGfn4Xxw4LKB-0RsMYv8pb12pUBNYeevxBY11YNiAAIpxFgqnrDBMYhTejNfTZbXwoVWtEMa8cuPmYhVzWFPlH2Ps6P-8y-2icHNZSemz9OVQPuzDiBxUYPmQ533k-aJvDZv3unxq72Gi2rusHoP5KL8REY8W3tlqtIWhkvWH-WoF6fs0rT5VNcbUeFeUXrzfHwh8AXKoINHlx3uMvEfnGT2pABeVlNiSlUPf4COPzp4HqlT7MeO0P0gHSu05i7GNC_f63Fa34ba8b1acG_6Z6W_Cmfyl1rrrRZdxTjocLyj5qSeyvAY8TQ9NLOCSJt-51mEID48djnYDQjeUqjPEQMqC6hVmKCjf6Ca8L9GCFjcD6-8o_yM1pPQNFScI7EBBP1WcoTxY1gXjR8glqzPFLvJGzIzl9Gqyk6OjoiXzfOcKlgwnnVgdjY97Pr_Qgi6HYTmiAdv1wC4QIZwrfiTI0Sf5PcieG431p2LiG1KXZ3n4tM6po6l6zRer4KzQolCMfSy9zudtwBgF0umMLgWWljT9LTeGlX8QKhivLsq9vo8_d6XQ2HZM1hYmcdYu1XQp9A8dpNiapvcmGYi_yuwnzZx4ZTaZucAJ7gAydTRCxxiJYk-ERq-HuJl_5eHsZmu0m-tbFsyMHoQb4z8GlOweoGEpfFCeJ4k-Q_-E7WEiB9ENl03BTnqYZlv5cE1LlGl7wjko0qUqBvM3bz-CQD44WM0H5NZXf5W62G0mkDvx-kb42RoYSdbj44mI7xpxI5BrS0ZA1Y Failed run is killed after hour or so: 
https://00e9e64bacb522463dd90e9287be75c5d961bd638b41a30bc4-apidata.googleusercontent.com/download/storage/v1/b/chromeos-image-archive/o/veyron_rialto-chrome-pfq%2FR55-8775.0.0-rc1%2Ftimeline-stages.html?qk=AD5uMEv_xPgbVdBmOiEXJ3huws-_Luvj8J74DvbNovOhx-iSzCoQlCkH5hwzKMxRLLValnxJBKCzvn-Dla_iwRSQ_qZ51e0P95omr91FwL88kAyBF5rFYvH6glzQpq9l6UT--6JKDFEhPdNfDuUFgwARJSxnfhqRUGV8NKoJq8kFDoHpEEPUGCM3UxFBfrp13mxUXRzZd476aq496sWk8CQFYWNBozwBSJW_XDo4r0t_3r0R1oGa2g_odYFGNwx4p1c894H8Z_4J5VbhikTbHUdohxAuYOXHZnKGohODou7V8yRXHMsN8_8VPD7YrVxIXDKyujcJdvjibRzieBR9ux9PcbeOvWYvK4vMJJSrIOj8yk5L50bTeheRsuivx0sVEPGTgvVcmPIAn51pHMgmBiMKucF2PcVfdKMhViuT7ihit5R4do_taqMH4lsqb6DORt7u4MjMkLcazCzAFrrTvato33imGG-lMOOjsUDfzBZMPqT-JI3-qWJWsBwVNKN5Exghj12I0tOt9iv1tVTjUiIxCdtixOkRt_R6844uEHTsXJhuqEfvvfk93IQTPva0hC9m4vnHYFrTmB_3LlcZeMsClkjrLmeTaYcUm8Gd890czelgiFQI3uWw1oYbkr71GpVbl1SGYP9dfJYMDzxgjagcZ0T_skaAWt3eKErxBOhfQqTM_LWHMnW_O3i3gbFf4L65ljWSQJucNmLAHTNIdt1FybjG9DVDECsligIZyiLTkXWhQosPmNVv7FE-fxXNtMCgHavr46vLbOSNzrcBMJddMK7Lgf1WlgKaGwtBoe4OFqYKjwqYJeAXhuSngHh43loJw2GZUDwtoHKQ2P6jGMzQZhq2NoVJS91wG1XfCuqXodWc5-Gpbpw Did we change time limit recently? Where is it set?
,
Sep 6 2016
Somewhere in the depths of cbuildbot: https://cs.chromium.org/chromium/src/third_party/chromite/scripts/cbuildbot.py?rcl=0&l=1280
As a trooper, I'm not sure how much I can help here, since cbuildbot.py seems to implement its own timeout logic.
,
Sep 6 2016
Looks like I have more data.
Successful run:
17:08:41: INFO: Updating slave build timeout to 15986 seconds enforced by the master
(from https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_rialto-chrome-pfq/builds/549/steps/steps/logs/stdio)
Timing-out run:
00:55:57: INFO: Updating slave build timeout to 4701 seconds enforced by the master
(from https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_rialto-chrome-pfq/builds/556/steps/steps/logs/stdio)
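For anyone comparing more builds, here is a minimal sketch of pulling the enforced timeout out of a stdio log. It assumes the log line keeps the exact wording quoted in this comment; the function name `extract_slave_timeout` and the regular expression are illustrative and are not part of chromite.

```python
import re

# Matches the INFO line quoted in this thread. The wording is taken from the
# two log excerpts; treating it as stable across builds is an assumption.
TIMEOUT_RE = re.compile(r"Updating slave build timeout to (\d+) seconds")


def extract_slave_timeout(log_text):
    """Return the enforced slave timeout in seconds, or None if not found."""
    match = TIMEOUT_RE.search(log_text)
    return int(match.group(1)) if match else None


good = ("17:08:41: INFO: Updating slave build timeout to 15986 seconds "
        "enforced by the master")
bad = ("00:55:57: INFO: Updating slave build timeout to 4701 seconds "
       "enforced by the master")

print(extract_slave_timeout(good))
print(extract_slave_timeout(bad))
```

Running this over the stdio logs of builds 549 through 556 would show at which build the enforced timeout dropped.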
,
Sep 6 2016
+Luis in case he has some ideas on where the timeouts are set
,
Sep 7 2016
By the time I had an idea, the builder had gone back to green: https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_rialto-chrome-pfq/builds/558

Apparently, on the failing runs veyron_rialto did not start for about 2 hours after the master began waiting for slaves. E.g. here the master started around 9:30 and began waiting for slaves around 10:30 https://uberchromegw.corp.google.com/i/chromeos/builders/master-chromium-pfq/builds/3342/steps/steps/logs/stdio but the message "ERROR: No status found for build config veyron_rialto-chrome-pfq" disappeared only around 12:40.

The master sets the time limit to 16200 seconds (can be seen in this file), but rialto eventually starts only when around 4000 seconds are left, which is insufficient for this builder. I am not sure why it didn't start in time, though (maybe there was a long queue of build requests for this machine?).

Anyway, this start-up slowness has now disappeared, so we can probably close this bug until it happens again.
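The budget arithmetic behind this can be sketched roughly as follows. This is not chromite's actual timeout code; it only assumes that the slave's enforced timeout is whatever is left of the master's 16200-second limit when the slave starts. The names `MASTER_TIMEOUT_SECONDS`, `start_delay` and `fits_in_budget` are hypothetical, and the 3-hour figure stands in for the "more than 3 hours" a successful run takes.

```python
# Master-wide limit mentioned in this thread, in seconds.
MASTER_TIMEOUT_SECONDS = 16200

# Lower bound on a successful veyron_rialto run ("more than 3 hours").
TYPICAL_BUILD_SECONDS = 3 * 60 * 60


def start_delay(slave_timeout, master_timeout=MASTER_TIMEOUT_SECONDS):
    """Seconds of the master's budget already spent when the slave started.

    Assumes slave_timeout == master_timeout - delay, which is an
    interpretation of the logs, not a confirmed chromite formula.
    """
    return master_timeout - slave_timeout


def fits_in_budget(slave_timeout, build_seconds=TYPICAL_BUILD_SECONDS):
    """True if a build of the given length can finish before the deadline."""
    return build_seconds <= slave_timeout


# The two timeouts reported in the logs above.
for label, timeout in (("build 549", 15986), ("build 556", 4701)):
    print(label, "delay:", start_delay(timeout),
          "fits:", fits_in_budget(timeout))
```

Under that assumption the successful build started 214 seconds into the master's window, while the failing one started 11499 seconds in, leaving far less than the three hours the build needs.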
,
Sep 12 2016
,
Sep 12 2016
Issue 644466 has been merged into this issue.
,
Oct 7 2016
,
Oct 10 2016
,
Nov 19 2016
,
Jan 21 2017
,
Mar 4 2017
,
Apr 17 2017
,
May 30 2017
,
Aug 1 2017
,
Oct 14 2017
Comment 1 by achuith@chromium.org
, Sep 6 2016