
Issue 738038


Issue metadata

Status: Fixed
Owner:
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




daisy_spring-release almost never works

Project Member Reported by sjg@google.com, Jun 29 2017

Issue description

Can we turn this off, or fix it?

https://luci-milo.appspot.com/buildbot/chromeos/daisy_spring-release/?limit=200



daisy_spring-release:2591 failed

Builders failed on: 
- daisy_spring-release: 
  https://luci-milo.appspot.com/buildbot/chromeos/daisy_spring-release/2591



 

Comment 1 by sjg@chromium.org, Jun 30 2017

HW test failure

06:20:57: WARNING: (stderr):
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No current process: you must name one.
The program is not being run.
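
For anyone chasing this gdb warning, here is a minimal Python sketch that reads the Yama setting named in the message and says what the value means. The helper name describe_ptrace_scope is made up for illustration, and the value descriptions are my paraphrase of the Linux kernel Yama documentation, so treat this as a quick check rather than a diagnosis of the HW test failure.

```python
from pathlib import Path

# Path quoted in the gdb warning above.
PTRACE_SCOPE_PATH = Path("/proc/sys/kernel/yama/ptrace_scope")

# Paraphrased from the kernel's Yama documentation.
MEANINGS = {
    0: "classic ptrace permissions (same-uid processes may attach)",
    1: "restricted ptrace (only ancestors or declared tracers may attach)",
    2: "admin-only attach (CAP_SYS_PTRACE required)",
    3: "no attach (ptrace attach disabled)",
}


def describe_ptrace_scope(path=PTRACE_SCOPE_PATH):
    """Return a one-line description of the current ptrace_scope value."""
    try:
        raw = path.read_text().strip()
    except OSError:
        # Yama may not be built in, or /proc may not be readable here.
        return "Yama ptrace_scope not available (file missing or unreadable)"
    try:
        value = int(raw)
    except ValueError:
        return f"unexpected ptrace_scope content: {raw!r}"
    return f"{value}: {MEANINGS.get(value, 'unknown value')}"


if __name__ == "__main__":
    print(describe_ptrace_scope())
```

A nonzero value would be consistent with "ptrace: Operation not permitted" when gdb tries to attach as a non-root user.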

Comment 2 by sjg@chromium.org, Jun 30 2017

These can be ignored:

04:03:50: INFO: RunCommand: /b/c/cbuild/repository/.cache/common/gsutil_4.19.tar.gz/gsutil/gsutil -o 'Boto:num_retries=10' stat -- gs://chromeos-releases/canary-channel/daisy-spring/9700.0.0/payloads/signing/18656-140019727279936/1.payload.hash.update_signer.signed.bin
04:03:51: WARNING: GS_ERROR: No URLs matched: gs://chromeos-releases/canary-channel/daisy-spring/9700.0.0/payloads/signing/18656-140019727279936/1.payload.hash.update_signer.signed.bin 


Filed crbug.com/738539 for that

Comment 4 by sjg@chromium.org, Jun 30 2017

Half-way down the page under 'Hosts for this job' there is something that says:

Host         Status
(hostless)   Aborted

with links to log output:

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/125966498-chromeos-test/hostless/

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/125966498-chromeos-test/hostless/debug

Comment 5 by sjg@chromium.org, Jun 30 2017

In the first link, status.log shows 'END GOOD' for all tests.
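
A rough way to repeat that check without eyeballing the file is sketched below. It assumes status.log entries are tab-separated in the same shape as the status lines autoserv echoes into its debug log (status, subdirectory, test name, then key=value fields). end_statuses is a hypothetical helper, not part of autotest.

```python
def end_statuses(lines):
    """Map test name -> final status taken from 'END <STATUS>' entries."""
    results = {}
    for line in lines:
        # Entries may be tab-indented and may end with a trailing tab.
        fields = line.strip().split("\t")
        if len(fields) >= 3 and fields[0].startswith("END "):
            results[fields[2]] = fields[0][len("END "):]
    return results


# Sample entries modeled on the status lines quoted in this bug.
SAMPLE_STATUS = [
    "START\t125849658-chromeos-test/chromeos6-row2-rack16-host18/"
    "login_RetrieveActiveSessions\tlogin_RetrieveActiveSessions\t"
    "timestamp=1498797246\tlocaltime=Jun 29 21:34:06\t",
    "\tGOOD\t125849658-chromeos-test/chromeos6-row2-rack16-host18/"
    "login_RetrieveActiveSessions\tlogin_RetrieveActiveSessions\t"
    "timestamp=1498797256\tlocaltime=Jun 29 21:34:16\tcompleted successfully",
    "END GOOD\t125849658-chromeos-test/chromeos6-row2-rack16-host18/"
    "login_RetrieveActiveSessions\tlogin_RetrieveActiveSessions\t"
    "timestamp=1498797256\tlocaltime=Jun 29 21:34:16\t",
]

statuses = end_statuses(SAMPLE_STATUS)
not_good = {name: s for name, s in statuses.items() if s != "GOOD"}
print(statuses, not_good)
```

An empty not_good dict corresponds to "END GOOD for all tests".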

In the second (debug) link, autoserv.ERROR shows:

06/30 04:37:31.520 ERROR|                db:0024| 04:37:31 06/30/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet
06/30 04:53:32.615 ERROR|                db:0024| 04:53:32 06/30/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet


Comment 6 by sjg@chromium.org, Jun 30 2017

autoserv.WARNING has:

06/30 03:57:18.187 WARNI|        subcommand:0081| parallel_simple was called with an empty arglist, did you forget to pass in a list of machines?
06/30 03:57:18.187 WARNI|        server_job:0799| Not checking if job_repo_url contains autotest packages on []
06/30 03:57:18.232 WARNI|             suite:0922| /usr/local/autotest/server/cros/dynamic_suite/suite.py:922: UserWarning: Calling method "name_in_tag_predicate" from Suite is deprecated
  func.__name__)

06/30 03:57:18.233 WARNI|             suite:0922| /usr/local/autotest/server/cros/dynamic_suite/suite.py:922: UserWarning: Calling method "get_test_source_build" from Suite is deprecated
  func.__name__)

06/30 04:37:31.520 ERROR|                db:0024| 04:37:31 06/30/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet
06/30 04:53:32.615 ERROR|                db:0024| 04:53:32 06/30/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet
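
Picking the ERROR lines out of these logs can be scripted. The sketch below assumes every autoserv log entry follows the "MM/DD HH:MM:SS.mmm LEVEL| module:line| message" layout seen in the excerpts above. The regular expression and entries_at_level are illustrative only, not autotest code.

```python
import re

# Layout inferred from the log excerpts in this bug (an assumption).
LOG_LINE = re.compile(
    r"^(?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[A-Z]+)\s*\|\s*(?P<module>\w+):(?P<lineno>\d+)\| (?P<message>.*)$"
)


def entries_at_level(lines, levels=("ERROR",)):
    """Yield parsed entries whose level is in `levels`.

    Continuation lines (e.g. the second line of a UserWarning) do not
    match the pattern and are skipped.
    """
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("level") in levels:
            yield match.groupdict()


SAMPLE = [
    "06/30 03:57:18.187 WARNI|        subcommand:0081| parallel_simple was called with an empty arglist, did you forget to pass in a list of machines?",
    "  func.__name__)",
    "06/30 04:37:31.520 ERROR|                db:0024| 04:37:31 06/30/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet",
    "06/30 04:53:32.615 ERROR|                db:0024| 04:53:32 06/30/17: An operational error occurred during a database operation: (2006, 'MySQL server has gone away'); retrying, don't panic yet",
]

for entry in entries_at_level(SAMPLE):
    print(entry["date"], entry["time"], entry["module"], entry["message"][:60])
```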


Comment 7 by sjg@chromium.org, Jun 30 2017

Owner: jrbarnette@chromium.org
So after all that I still cannot figure out what went wrong.

> So after all that I still cannot figure out what went wrong.

<sigh> The short summary is that for the last two builds,
all the tests in bvt-inline passed, but for some reason,
the suite timed out.  I don't know why, but we have to blame
infra.

Comment 9 by sjg@chromium.org, Jun 30 2017

Components: Infra>Labs
Owner: ----
OK thanks. Well I'll just leave this bug for infra.

Comment 10 by sjg@chromium.org, Jun 30 2017

Owner: sjg@chromium.org
Status: Started (was: Available)
Actually I should probably check older builds and see if there is a different error.

Cutting to the chase, here are the suite jobs for the failures:
    http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=125966498
    http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=125849334

Result logs are here:
    https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/125966498-chromeos-test/hostless/
    https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/125849334-chromeos-test/hostless/

In those folders, status.log shows an abort, but the key isn't
there.  Instead, if you look in debug/autoserv.DEBUG, you'll
see that the last few lines logged look like this:

06/29 21:36:54.524 INFO |        server_job:0200| START	125849658-chromeos-test/chromeos6-row2-rack16-host18/login_RetrieveActiveSessions	login_RetrieveActiveSessions	timestamp=1498797246	localtime=Jun 29 21:34:06	
06/29 21:36:54.525 INFO |        server_job:0200| 	GOOD	125849658-chromeos-test/chromeos6-row2-rack16-host18/login_RetrieveActiveSessions	login_RetrieveActiveSessions	timestamp=1498797256	localtime=Jun 29 21:34:16	completed successfully
06/29 21:36:54.525 INFO |        server_job:0200| END GOOD	125849658-chromeos-test/chromeos6-row2-rack16-host18/login_RetrieveActiveSessions	login_RetrieveActiveSessions	timestamp=1498797256	localtime=Jun 29 21:34:16	
06/29 21:36:54.526 DEBUG|             suite:1451| Adding job keyval for login_RetrieveActiveSessions=125849658-chromeos-test

Those log lines come from the 125849334 job for R61-9699.0.0.  The
job was created at 20:24:33, and timed out at 23:24:33.  So, the
summary of the suite job is "autoserv went silent after about 1 hour
15 minutes of work, until it finally timed out."
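
The arithmetic behind that summary, as a small Python sketch. It assumes all three timestamps fall on Jun 29 2017 and are in the same time zone, and it takes the 21:36:54 entry quoted above as the last line autoserv logged.

```python
from datetime import datetime

# Times quoted above; the date and shared time zone are assumptions.
created = datetime(2017, 6, 29, 20, 24, 33)    # suite job created
last_log = datetime(2017, 6, 29, 21, 36, 54)   # last line in autoserv.DEBUG
timed_out = datetime(2017, 6, 29, 23, 24, 33)  # suite job timed out

active = last_log - created     # time spent visibly doing work
silent = timed_out - last_log   # time with no log output before the timeout
timeout = timed_out - created   # overall suite timeout

print("active: ", active)
print("silent: ", silent)
print("timeout:", timeout)
```

So the job logged for roughly 1h12m and then sat silent for about 1h48m of its 3-hour window.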

Owner: jrbarnette@chromium.org
Status: Assigned (was: Started)
> Actually I should probably check older builds and see if there is a different error.

Given the symptoms, I think earlier build failures will be
found to be either a) this same problem, or b) no longer
occurring.

For daisy_spring, I don't think we have anything to look at
but an infra problem.  I'll try to find a longer term owner for
this.

I don't know what changed, but the builder has been happy for quite
some time, so I'm declaring victory.

Status: Fixed (was: Assigned)

Comment 15 by sjg@google.com, Jul 10 2017

It must have heard us complaining. Great!
