
Issue 847560 link


Issue metadata

Status: Fixed
Merged: issue 847540
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




suite_scheduler: M69 builds not being scheduled

Project Member Reported by kmshelton@google.com, May 29 2018

Issue description

Background: I'm trying to get cr50_stress_experimental running on fizz and scarlet in chromeos1. I suspect faft in ATL may also be impacted.

Currently I see R69 builds being rejected as not ToT. I suspect this is unintended: M68 has already branched, so R69 builds should count as ToT.

scheduler log snippets:


   135: {
    logMessage:  "Running Cr50StressExperimental on fizz-release/R69-10727.0.0"     
    severity:  "INFO"     
    time:  "2018-05-28T02:15:08.864150Z"     
   }
   136: {
    logMessage:  "branch_build spec 69 doesn't fit this task's requirement: ['==tot']"     
    severity:  "DEBUG"     
    time:  "2018-05-28T02:15:08.864251Z"     
   }
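
To make the DEBUG message above easier to read, here is a minimal sketch of how a branch spec such as '==tot' might be checked against a build's milestone. This is not suite_scheduler's actual code: the function name `fits_branch_spec`, its parameters, and the supported operators are assumptions for illustration only. It shows that if the scheduler's notion of ToT were still 68, an R69 build would be rejected by '==tot'.

```python
def fits_branch_spec(build_milestone, tot_milestone, specs):
    """Return True if build_milestone satisfies every spec in specs.

    Hypothetical helper: each spec is a two-character operator
    ('==', '>=' or '<=') followed by 'tot', 'tot-N', or a milestone number.
    """
    for spec in specs:
        op, target = spec[:2], spec[2:]
        # Resolve the target relative to the scheduler's idea of ToT.
        if target == 'tot':
            wanted = tot_milestone
        elif target.startswith('tot-'):
            wanted = tot_milestone - int(target[len('tot-'):])
        else:
            wanted = int(target)

        if op == '==':
            ok = build_milestone == wanted
        elif op == '>=':
            ok = build_milestone >= wanted
        elif op == '<=':
            ok = build_milestone <= wanted
        else:
            raise ValueError('unsupported operator in spec: %r' % spec)

        if not ok:
            return False
    return True


# Assumed scenario: ToT is still computed as 68, so R69 does not fit.
print(fits_branch_spec(69, 68, ['==tot']))
# Once ToT is computed as 69, the same build fits.
print(fits_branch_spec(69, 69, ['==tot']))
```

Under these assumptions, the rejection depends entirely on the ToT value the scheduler computes, not on the build itself.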


 
Cc: xixuan@chromium.org
Whatever the problem is, it appears to be specific to the
"cr50_stress_experimental" test suite, not to anything relating to
the M69 milestone.  I see plenty of jobs that suite_scheduler
scheduled against M69 builds, including specifically against
fizz-release/R69-10727.0.0.  However, I can't find any jobs scheduled
against that cr50 suite.

Mergedinto: 847540
Status: Duplicate (was: Untriaged)
I checked, and there are plenty of instances of "cr50_stress_experimental"
being scheduled against M68, but none against M69.

There's a very similar symptom reported against the wifi tests.

Status: Assigned (was: Duplicate)
Hmmm...  I note this:

$ atest host list -b pool:cr50_stress_experimental | count_labels -m
      1 bob
      1 electro
      1 eve
      1 robo360
      1 scarlet
      1 soraka
      1 teemo

That explains the lack of testing on fizz: the pool has no host labeled fizz.  There's also this:

$ dut-status -p cr50_stress_experimental -m scarlet
hostname                       S   last checked         URL
chromeos1-row2-rack1-host2     NO  2018-05-29 12:16:29  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row2-rack1-host2/63432422-repair/

$ dut-status -p cr50_stress_experimental
hostname                       S   last checked         URL
chromeos1-row1-rack5-host2     ??  2018-05-27 10:40:53  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row1-rack5-host2/1147405-reset/
chromeos1-row1-rack5-host3     NO  2018-05-29 12:38:13  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row1-rack5-host3/782555-repair/
chromeos1-row1-rack5-host1     NO  2018-05-29 12:07:42  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row1-rack5-host1/926816-repair/
chromeos1-row2-rack1-host4     NO  2018-05-29 06:57:17  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row2-rack1-host4/63430473-repair/
chromeos1-row2-rack1-host3     --  ---                  ---
chromeos1-row2-rack1-host2     NO  2018-05-29 12:16:29  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row2-rack1-host2/63432422-repair/
chromeos1-row2-rack1-host5     ??  2018-05-25 15:40:07  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos1-row2-rack1-host5/1281027-repair/

A number of DUTs are currently down, which would explain much of the
remaining lack of testing.
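
As a rough way to summarize output like the above, the sketch below tallies hosts by the value in the "S" column. It assumes only the whitespace-separated layout shown (hostname first, status second); the helper name `tally_status` is hypothetical, and the URL column is omitted from the sample for brevity.

```python
from collections import Counter

# Rows copied from the dut-status output above, without the URL column.
SAMPLE = """\
hostname                       S   last checked
chromeos1-row1-rack5-host2     ??  2018-05-27 10:40:53
chromeos1-row1-rack5-host3     NO  2018-05-29 12:38:13
chromeos1-row1-rack5-host1     NO  2018-05-29 12:07:42
chromeos1-row2-rack1-host4     NO  2018-05-29 06:57:17
chromeos1-row2-rack1-host3     --  ---
chromeos1-row2-rack1-host2     NO  2018-05-29 12:16:29
chromeos1-row2-rack1-host5     ??  2018-05-25 15:40:07
"""


def tally_status(text):
    """Count hosts per status code, skipping the header and blank lines."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] == 'hostname':
            continue
        counts[fields[1]] += 1
    return counts


for status, count in sorted(tally_status(SAMPLE).items()):
    print(status, count)
```

For the seven hosts listed, every status is NO, ??, or --.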

There's also an update on bug 847540, which is enough to
explain any remaining lack of testing.  But even if we
fix how ToT is calculated, DUT availability will still prevent
testing of the cr50 suite at issue here.

Owner: kmshelton@chromium.org
To the extent that there's still a problem that _isn't_ bug 847540,
the problem is DUTs in the chromeos1 lab.  That's outside the scope
of CrOS Infra.

So...  Back to the OP, who can close this or act on it, as necessary.

Owner: jrbarnette@chromium.org
Status: Fixed (was: Assigned)
Thanks for all the investigation.

The M69 layer of the problem is now well understood.

The physical-layer problems are mostly well understood and are tracked in b/80306939.

The model/board mismatch noted in c4 may need more attention. I will take a closer look once the physical-layer problems are cleared (the scheduler syntax is "board", so teemo matching for fizz would be intuitive). I'm considering that out of scope for this bug.
