cave-release:1635 failed
Issue description

cave-release:1635 failed

Builders failed on:
- cave-release: https://luci-milo.appspot.com/buildbot/chromeos/cave-release/1635

This one is a little tricky to follow, but we can see that the suite gets aborted:
http://cautotest-prod.corp.google.com/afe/#tab_id=view_job&object_id=153007773

Only 2 of the tests in the suite run, and this is the last line from the suite log:

11/01 02:05:14.728 DEBUG| suite:1378| Adding job keyval for autoupdate_EndToEndTest.paygen_au_canary_full=153008096-chromeos-test

Previously, we see a really slow step communicating with MySQL:

11/01 00:56:13.323 DEBUG| suite:1378| Adding job keyval for autoupdate_EndToEndTest.paygen_au_canary_delta=153008091-chromeos-test
11/01 02:04:46.180 ERROR| db:0023| 02:04:46 11/01/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet

Back over in the PaygenTestCanary log:
https://logs.chromium.org/v/?s=chromeos%2Fbb%2Fchromeos%2Fcave-release%2F1635%2F%2B%2Frecipes%2Fsteps%2FPaygenTestCanary%2F0%2Fstdout

02:10:06: INFO: Refreshing due to a 401 (attempt 1/2)
02:10:06: INFO: Refreshing access_token
02:13:46: INFO: Refreshing due to a 401 (attempt 1/2)
02:13:46: INFO: Refreshing access_token
02:16:38: INFO: Refreshing due to a 401 (attempt 1/2)
02:16:38: INFO: Refreshing access_token
02:23:44: WARNING: Killing tasks: [<_BackgroundTask(_BackgroundTask-7:6:6:3, started)>]
02:23:44: WARNING: Killing 16475 (sig=24 SIGXCPU)

ayatane, have you run into issues like this with MySQL slowness before? This does seem like a really long time.
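For reference, the size of the gap between the two suite log lines quoted above can be worked out with a short script. This is only a sketch: the helper name `parse_log_time` is made up for illustration, and the year is assumed from the issue date because the log timestamps do not include one.

```python
from datetime import datetime

# Assumed layout of the suite log timestamps ("MM/DD HH:MM:SS.mmm"),
# with the year prepended because the log itself omits it.
LOG_FORMAT = "%Y %m/%d %H:%M:%S.%f"


def parse_log_time(stamp, year=2017):
    """Parse a suite-log timestamp; hypothetical helper, not autotest code."""
    return datetime.strptime(f"{year} {stamp}", LOG_FORMAT)


# Timestamps copied from the two log lines quoted in the description.
last_keyval = parse_log_time("11/01 00:56:13.323")
db_error = parse_log_time("11/01 02:04:46.180")

gap = db_error - last_keyval
print(gap)  # 1:08:32.857000
```

So a little over an hour passes between the delta job's keyval being recorded and the MySQL error being logged.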
Nov 1 2017
The MySQL error is a red herring. All the suite job does is create its child jobs and wait on them. The MySQL connection it holds can time out while idle ("MySQL server has gone away"), so the suite job refreshes it. The "slow step communicating with MySQL" is just the suite job doing nothing while it waits for its tests to finish. The actual failure is simply the suite timing out. I will need to dig to find out why; a blind guess is that there are not enough DUTs, or that the host scheduler, which is responsible for matching tests to DUTs, has a problem.
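To illustrate why the reconnect is harmless, here is a minimal sketch of a wait loop that refreshes a stale connection on error 2006 and otherwise just keeps polling. Everything in it is hypothetical (`FakeConnection`, `OperationalError`, `wait_for_children`, and counting polls instead of wall-clock time); it is not the autotest suite code, only the general pattern described above.

```python
MYSQL_SERVER_GONE = 2006  # error code quoted in the suite log


class OperationalError(Exception):
    """Stand-in for a DB driver's operational error (hypothetical)."""

    def __init__(self, code, message):
        super().__init__(code, message)
        self.code = code


class FakeConnection:
    """Hypothetical connection; a stale one fails the way an idle one might."""

    def __init__(self, stale):
        self.stale = stale

    def children_done(self, poll, polls_until_done):
        if self.stale:
            raise OperationalError(MYSQL_SERVER_GONE,
                                   "MySQL server has gone away")
        # Pretend the child jobs finish after a fixed number of polls.
        return poll >= polls_until_done


def wait_for_children(connect, polls_until_done, max_polls):
    """Poll until the children finish or the poll budget (timeout) runs out."""
    conn = connect(stale=True)  # simulate a connection that idled out
    reconnects = 0
    for poll in range(max_polls):
        try:
            done = conn.children_done(poll, polls_until_done)
        except OperationalError as err:
            if err.code != MYSQL_SERVER_GONE:
                raise
            # Refresh the connection and retry; this does not fail the suite.
            conn = connect(stale=False)
            reconnects += 1
            done = conn.children_done(poll, polls_until_done)
        if done:
            return "completed", reconnects
    return "timed out", reconnects


print(wait_for_children(FakeConnection, polls_until_done=2, max_polls=5))
print(wait_for_children(FakeConnection, polls_until_done=10, max_polls=3))
```

In both calls the connection is refreshed exactly once. The outcome depends only on whether the children finish within the poll budget, which is the sense in which the MySQL message is a symptom of waiting and not the cause of the failure.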
Nov 6 2017
Comment 1 by ayatane@chromium.org, Nov 1 2017