daisy_spring-release almost never works
Issue description
Can we turn this off, or fix it?
https://luci-milo.appspot.com/buildbot/chromeos/daisy_spring-release/?limit=200
daisy_spring-release:2591 failed
Builders failed on:
- daisy_spring-release: https://luci-milo.appspot.com/buildbot/chromeos/daisy_spring-release/2591
,
Jun 30 2017
These can be ignored:
04:03:50: INFO: RunCommand: /b/c/cbuild/repository/.cache/common/gsutil_4.19.tar.gz/gsutil/gsutil -o 'Boto:num_retries=10' stat -- gs://chromeos-releases/canary-channel/daisy-spring/9700.0.0/payloads/signing/18656-140019727279936/1.payload.hash.update_signer.signed.bin
04:03:51: WARNING: GS_ERROR: No URLs matched: gs://chromeos-releases/canary-channel/daisy-spring/9700.0.0/payloads/signing/18656-140019727279936/1.payload.hash.update_signer.signed.bin
Filed crbug.com/738539 for that.
,
Jun 30 2017
Autotest link:
http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=125966498
Going to the Sponge invocation shows that everything passed:
https://tests.corp.google.com/target?tab=Test+Cases&run=1&sortBy=STATUS&show=ALL&id=7404e274-6f2c-4115-a558-dc30ccdb2ff6&target=daisy_spring-release/R61-9700.0.0-test_suites/control.bvt-inline&searchFor=
,
Jun 30 2017
Halfway down the page, under 'Hosts for this job', there is an entry that says:
Host        Status
(hostless)  Aborted
with links to log output:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/125966498-chromeos-test/hostless/
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/125966498-chromeos-test/hostless/debug
,
Jun 30 2017
In the first link, status.log shows 'END GOOD' for all tests.
In the second (debug) link, autoserv.ERROR shows:
06/30 04:37:31.520 ERROR| db:0024| 04:37:31 06/30/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet
06/30 04:53:32.615 ERROR| db:0024| 04:53:32 06/30/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet
,
Jun 30 2017
autoserv.WARNING has:
06/30 03:57:18.187 WARNI| subcommand:0081| parallel_simple was called with an empty arglist, did you forget to pass in a list of machines?
06/30 03:57:18.187 WARNI| server_job:0799| Not checking if job_repo_url contains autotest packages on []
06/30 03:57:18.232 WARNI| suite:0922| /usr/local/autotest/server/cros/dynamic_suite/suite.py:922: UserWarning: Calling method "name_in_tag_predicate" from Suite is deprecated
  func.__name__)
06/30 03:57:18.233 WARNI| suite:0922| /usr/local/autotest/server/cros/dynamic_suite/suite.py:922: UserWarning: Calling method "get_test_source_build" from Suite is deprecated
  func.__name__)
06/30 04:37:31.520 ERROR| db:0024| 04:37:31 06/30/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet
06/30 04:53:32.615 ERROR| db:0024| 04:53:32 06/30/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet
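For anyone who wants to pull the ERROR lines out of a larger autoserv log rather than eyeballing it, here is a minimal sketch. The line layout (timestamp, level, module:line, message) is only inferred from the excerpts quoted in this thread, not from any autotest documentation, and `parse_line`, `LINE_RE` and the `year` default are hypothetical names and values chosen for illustration.

```python
import re
from datetime import datetime

# Layout inferred from the quoted excerpts (an assumption):
#   "MM/DD HH:MM:SS.mmm LEVEL| module:lineno| message"
LINE_RE = re.compile(
    r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (\w+)\s*\|\s*(\w+):(\d+)\|\s*(.*)$"
)


def parse_line(line, year=2017):
    """Return (timestamp, level, module, message), or None if no match.

    The log lines carry no year, so one is supplied by the caller;
    2017 is taken from the dates on this thread.
    """
    match = LINE_RE.match(line)
    if match is None:
        return None
    stamp, level, module, _lineno, message = match.groups()
    when = datetime.strptime(f"{year}/{stamp}", "%Y/%m/%d %H:%M:%S.%f")
    return when, level, module, message


sample = [
    "06/30 03:57:18.187 WARNI| subcommand:0081| parallel_simple was called "
    "with an empty arglist, did you forget to pass in a list of machines?",
    "06/30 04:37:31.520 ERROR| db:0024| 04:37:31 06/30/17: An operational "
    "error occurred during a database operation: (2006, 'MySQL server has "
    "gone away'); retrying, don't panic yet",
    "06/30 04:53:32.615 ERROR| db:0024| 04:53:32 06/30/17: An operational "
    "error occurred during a database operation: (2006, 'MySQL server has "
    "gone away'); retrying, don't panic yet",
]

parsed = [entry for entry in map(parse_line, sample) if entry is not None]
errors = [entry for entry in parsed if entry[1] == "ERROR"]
gap = errors[1][0] - errors[0][0]
print(f"{len(errors)} ERROR lines, {gap} apart")
```

The gap between the two database errors is only a rough hint about when the job was still retrying; it does not by itself say why the suite was aborted.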
,
Jun 30 2017
So after all that I still cannot figure out what went wrong.
,
Jun 30 2017
> So after all that I still cannot figure out what went wrong.
<sigh> The short summary is that for the last two builds, all the tests in bvt-inline passed, but for some reason, the suite timed out. I don't know why, but we have to blame infra.
,
Jun 30 2017
OK thanks. Well I'll just leave this bug for infra.
,
Jun 30 2017
Actually I should probably check older builds and see if there is a different error.
,
Jun 30 2017
Cutting to the chase, here are the suite jobs for the failures:
http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=125966498
http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=125849334
Result logs are here:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/125966498-chromeos-test/hostless/
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/125849334-chromeos-test/hostless/
In those folders, status.log shows an abort, but the key isn't
there. Instead, if you look in debug/autoserv.DEBUG, you'll
see that the last few lines logged look like this:
06/29 21:36:54.524 INFO | server_job:0200| START 125849658-chromeos-test/chromeos6-row2-rack16-host18/login_RetrieveActiveSessions login_RetrieveActiveSessions timestamp=1498797246 localtime=Jun 29 21:34:06
06/29 21:36:54.525 INFO | server_job:0200| GOOD 125849658-chromeos-test/chromeos6-row2-rack16-host18/login_RetrieveActiveSessions login_RetrieveActiveSessions timestamp=1498797256 localtime=Jun 29 21:34:16 completed successfully
06/29 21:36:54.525 INFO | server_job:0200| END GOOD 125849658-chromeos-test/chromeos6-row2-rack16-host18/login_RetrieveActiveSessions login_RetrieveActiveSessions timestamp=1498797256 localtime=Jun 29 21:34:16
06/29 21:36:54.526 DEBUG| suite:1451| Adding job keyval for login_RetrieveActiveSessions=125849658-chromeos-test
Those log lines come from the 125849334 job for R61-9699.0.0. The
job was created at 20:24:33, and timed out at 23:24:33. So, the
summary of the suite job is "autoserv went silent after about 1 hour
15 minutes of work, and stayed silent until it finally timed out."
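The arithmetic behind that summary can be checked with a short sketch. It assumes the creation and timeout times quoted above fall on the same day (06/29, year 2017 per the dates on this thread) and use the same clock as the log timestamps; the variable names are made up for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Times quoted in this comment; the date and year are assumptions.
created = datetime.strptime("2017-06-29 20:24:33", FMT)    # job created
last_log = datetime.strptime("2017-06-29 21:36:54", FMT)   # last autoserv.DEBUG line
timed_out = datetime.strptime("2017-06-29 23:24:33", FMT)  # suite timeout

active = last_log - created    # time autoserv was visibly doing work
silent = timed_out - last_log  # time with no log output before the timeout
total = timed_out - created    # overall time from creation to timeout

print(f"active: {active}, silent: {silent}, total: {total}")
```

Under those assumptions the job logged activity for a bit over an hour and then produced nothing for the larger part of the three hours between creation and timeout.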
,
Jun 30 2017
> Actually I should probably check older builds and see if there is a different error.
Given the symptoms, I think earlier build failures will be found to be either a) this same problem, or b) no longer occurring. For daisy_spring, I don't think we have anything to look at but an infra problem. I'll try to find a longer-term owner for this.
,
Jul 10 2017
I don't know what changed, but the builder has been happy for quite some time, so I'm declaring victory.
,
Jul 10 2017
,
Jul 10 2017
It must have heard us complaining. Great!
Comment 1 by sjg@chromium.org
, Jun 30 2017