Issue 679013

Starred by 2 users

Issue metadata

Status: Archived
Owner:
Closed: Jan 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




orco-release builder failing paygen and hwtest due to suite timeout

Project Member Reported by kevcheng@chromium.org, Jan 6 2017

Issue description

It looks like the tests aren't even running: they get scheduled, but nothing else happens after that. The orco DUTs are all idle, so they haven't been running anything either.

https://uberchromegw.corp.google.com/i/chromeos/builders/orco-release/builds/735

There are a ton of queued jobs right now and nothing is running on the shard (chromeos-server46.hot).

There are 3 aborted jobs that show up in the running status:
http://chromeos-server46.hot.corp.google.com/afe/#tab_id=job_list&state_filter=running&type_filter=all


 
Owner: nxia@chromium.org
assigning to current deputy

Luigi mentioned Aseda might have some ideas what's happening in the paygen stage.
Nothing fishy in the PaygenTest stages. The tests were indeed scheduled. Autotest claims that they were queued, but then aborted.

canary channel suite job: http://shortn/_v7nI1039vG
dev channel suite job: http://shortn/_LmRMplaZGc
It seems like the 3 jobs are just stuck in provisioning. Is there any way to just clobber those jobs?

One of the provisioning jobs just seems stuck... http://shortn/_tDlHKPuGOV
Mmm.  Do you see any processes getting stuck at or near exit?

We're looking into a strange unittest failure in issue 678643. It seems to coincide with a glibc change in the chroot. If you can catch a stuck process in the act, run sudo cat /proc/<pid>/stack and see if it's also stuck in a futex call.
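
For anyone repeating the check suggested above on several PIDs, here is a minimal Python sketch. It assumes the usual Linux /proc/<pid>/stack layout (one frame per line, an address in "[<...>]" followed by "symbol+offset/size") and that the script is run as root. The helper names kernel_frames, in_futex and read_stack are made up for illustration and are not part of any Chrome OS or Autotest tooling.

```python
import re
import sys
from pathlib import Path

# Assumed frame layout: "[<address>] symbol+0xoff/0xsize".
_FRAME_RE = re.compile(r"^\[<[0-9a-fx]+>\]\s+([A-Za-z0-9_.]+)")


def kernel_frames(stack_text):
    """Return the kernel symbol names found in /proc/<pid>/stack text."""
    frames = []
    for line in stack_text.splitlines():
        match = _FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(1))
    return frames


def in_futex(stack_text):
    """True if any frame in the kernel stack mentions futex."""
    return any("futex" in frame for frame in kernel_frames(stack_text))


def read_stack(pid):
    """Read the kernel stack of a process (normally requires root)."""
    return Path(f"/proc/{pid}/stack").read_text()


if __name__ == "__main__":
    # Hypothetical usage: sudo python3 check_futex.py <pid> [<pid> ...]
    for pid in sys.argv[1:]:
        try:
            text = read_stack(pid)
        except OSError as err:
            print(f"{pid}: cannot read stack ({err})")
            continue
        state = "in futex" if in_futex(text) else "not in futex"
        print(f"{pid}: {state}")
```

This only looks at the main thread of each PID. Per-thread stacks live under /proc/<pid>/task/<tid>/stack and would need the same check if a worker thread is the one that is blocked.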
Owner: dgarr...@chromium.org
assigning to current deputy to resolve, builder is still failing.
Labels: OS-Chrome
The glibc change mentioned in #4 appears to be a red herring.
Cc: shuqianz@chromium.org
Labels: -Pri-3 Pri-1
upping priority and adding Charlene in case it's an issue with the shard.

Charlene, jobs seem to be stalled on this shard. Do you know what might be going on?
It looks like every Orco DUT in the BVT pool is marked as dead. Investigating further.
That shard was rebooted on Jan 4th, but appears to be up and running to me. I'm not sure if it's healthy, but it's up and running shard-like processes.
I just forced another restart, which seems to have aborted the stuck jobs.

I'm now forcing a reverify for all BVT pool DUTs.
Status: Fixed (was: Untriaged)
They seem to have all passed!

Comment 12 by dchan@google.com, Mar 4 2017

Labels: VerifyIn-58

Comment 13 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 14 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61

Comment 16 by dchan@chromium.org, Oct 14 2017

Status: Archived (was: Fixed)
