
Issue 685870

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Mar 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




No tests in the control.power_daily sequence are being run

Project Member Reported by dbasehore@chromium.org, Jan 27 2017

Issue description

Looking at cautotest, it seems that the only power_daily tests that are run are tests that set the suite:power_daily attribute outside of the sequence control file.

The control.power_daily shows up as a server test, but none of the test runs that this control file should schedule seem to show up.

Can someone from the infra team comment on this?
 

Comment 1 by tbroch@chromium.org, Jan 27 2017

Cc: -snanda@chromium.org -jrbarnette@chromium.org semenzato@chromium.org dshi@chromium.org
So for example parent job,

http://cautotest/afe/#tab_id=view_job&object_id=98057307

Kicks off,


98153257	veyron_jaq-release/R58-9221.0.0/power_daily/power_SuspendStress.bareDaily
98153253	veyron_jaq-release/R58-9221.0.0/power_daily/power_LoadTest.WIRED_1hr_acok
98153252	veyron_jaq-release/R58-9221.0.0/power_daily/Power daily tests


Presumably 'Power daily tests' is meant to be

server/site_tests/sequences/control.power_daily

and it is, if I examine the control.srv file

http://cautotest/afe/#tab_id=view_job&object_id=98153252

I do see this failure in autoserv.ERROR

01/26 17:03:20.254 ERROR|           metrics:0429| Caught exception while flushing: No module named pyasn1.codec.ber


So it may relate to crbug.com/676696, which has a similar signature


Owner: akes...@chromium.org
Status: Assigned (was: Untriaged)
Status: Fixed (was: Assigned)
suite_scheduler was down, this should be fixed now
Status: Assigned (was: Fixed)
There's still another issue. Even when the suite shows up in cautotest/, none of the child jobs are run.
The child jobs haven't run for a long time by the way.
Cc: shuqianz@chromium.org
Owner: dbasehore@chromium.org
Can you show me an example of such a suite job?
Here's the latest one that completed: http://cautotest/afe/#tab_id=view_job&object_id=101119863

It's seeing the following error:
Caught exception while flushing: No module named pyasn1_modules.rfc5208

There's also the issue that these aren't being caught as failures... The suite is reported as having run successfully.
Owner: akes...@chromium.org
The suite job corresponding to that is here (arrived at by following the parent-job link): http://cautotest/afe/#tab_id=view_job&object_id=101015596 . It is running and has 3 child jobs. I think maybe you are referring to something other than an autotest suite (e.g. you're referring to the set of sub-tests of http://cautotest/afe/#tab_id=view_job&object_id=101119863 as a suite).

I'm 90% sure that pyasn1_modules logging is benign and unrelated to the problem (it's a metrics dependency which we ignore if missing).

The real culprit here seems to be that status.log is empty, and therefore even after the job completes, tko/parse doesn't know what to do with it.
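For readers unfamiliar with the pattern mentioned above (a metrics dependency that is ignored if missing), here is a generic sketch of an optional import that logs the problem and carries on. It is not the actual autotest metrics code; `try_import` is a hypothetical helper name used only for illustration.

```python
import importlib
import logging


def try_import(module_name):
    """Import module_name if available; log and return None otherwise."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        # Missing optional dependency: report it, but do not fail the caller.
        logging.error('Optional module unavailable: %s', e)
        return None
```

A message logged this way looks alarming in autoserv.ERROR but does not by itself stop the job, which would be consistent with the pyasn1 message being a red herring.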
There are 2 kinds of tests that are run under power_daily. There are the tests that are run under the sequence file, server/site_tests/sequences/control.power_daily, and there are tests that set the power_daily suite in their control file.

Is there no way to support both of these for a test suite?

Does "Power daily test" work when run locally? I'm suspicious of the test itself.

+sbasi what are all these "sequences" about anyway?

Also, NAME = "Power daily tests" is not really in line with the naming convention for tests. This should be NAME = "power_daily" to match the control file name.
Re #10: I don't know much about these "sequences". From what I can tell, yes, they should be supported just fine as a child job of a suite. But I am suspicious that the sequence itself is not working.
Cc: sbasi@chromium.org
+sbasi actually

Comment 14 by sbasi@chromium.org, Feb 14 2017

Cc: gwendal@chromium.org
What is interesting is that the server/site_tests/sequences directory seems to contain more tests than just the ones that use the sequence library...

Gwendal moved them from a server/site_tests/suites directory to server/site_tests/sequences.

So while they are in the sequences directory, it's unrelated to sequences.

Moving on from that and looking at the problem, it's as if the control file never runs any of these tests. I did a quick local test with a DUT in the lab via test_that, and it never runs the sub-tests, so I think something is indeed broken in the Autotest framework.

Comment 15 by sbasi@chromium.org, Feb 14 2017

Cc: ayatane@chromium.org
A little more info. I added some debug statements:
def _run_client_test(machine):
    logging.debug("Hello world")
    client = hosts.create_host(machine)
    logging.debug("Hello world2")
    client_attributes = site_host_attributes.HostAttributes(machine)
    logging.debug("Hello world3")
....

Hello world3 never came up.

02/14 12:51:33.342 DEBUG|           control:0023| Hello world
02/14 12:51:33.343 DEBUG|          ssh_host:0285| Running (ssh) 'grep -q CHROMEOS /etc/lsb-release && ! test -f /mnt/stateful_partition/.android_tester && ! grep -q moblab /etc/lsb-release'
02/14 12:51:33.353 INFO |      abstract_ssh:0809| Starting master ssh connection '/usr/bin/ssh -a -x -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_ywpr2Hssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/tmp/tmpCnSxGn -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=300 -l root -p 22 chromeos4-row10-rack4-host7.cros'
02/14 12:51:33.353 DEBUG|        base_utils:0185| Running '/usr/bin/ssh -a -x -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_ywpr2Hssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/tmp/tmpCnSxGn -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=300 -l root -p 22 chromeos4-row10-rack4-host7.cros'
02/14 12:51:33.594 DEBUG|      abstract_ssh:0756| Nuking master_ssh_job.
02/14 12:51:34.596 DEBUG|      abstract_ssh:0762| Cleaning master_ssh_tempdir.
02/14 12:51:34.612 DEBUG|           control:0025| Hello world2
02/14 12:51:34.612 INFO |        server_job:0801| Finished processing control file

That site_host_attributes library has not been touched since 2013... I did see some emails/threads about removing site_* classes; maybe that work broke this?
I do have a CL that moves site_host_attributes into host_attributes, but that CL is nowhere close to landing at the moment.


Can this just be fixed as is? Based on #15, it looks like there's a bug with site_host_attributes. We shouldn't wait on a refactoring CL to fix this.
Owner: dbasehore@chromium.org
Without further investigation, I don't see precisely what the site_host_attributes bug is.

Can we simply modify the tests to be conventional and not use this sequence library? The infra team has its hands full keeping the infra itself up and running.
Cc: pprabhu@chromium.org
Owner: akes...@chromium.org
We would have to add functionality to turn off the charging for a test based on a flag. I support doing this in the long run if it's less work to support that than sequences, but we'd like a more immediate fix.
Cc: snanda@chromium.org
Owner: pprabhu@chromium.org
Status: Started (was: Assigned)
Think I got it.


This must have been broken ever since the 'machine' argument to the control file changed from always being a 'str' to being either a 'str' or a 'dict', depending on our whim.

https://chromium-review.googlesource.com/#/c/449037/


There may be other suite control files that assume 'machine' is a string.
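As a rough illustration of what to look for in those other control files: any code that uses `machine` directly as a hostname breaks when a dict is passed instead. A minimal defensive sketch, assuming the dict carries the name under a 'hostname' key (the key name and the helper `machine_hostname` are assumptions, not taken from autotest):

```python
def machine_hostname(machine):
    """Return the hostname whether machine is a plain str or a dict."""
    if isinstance(machine, dict):
        # Assumed layout: {'hostname': ..., <other data>}
        return machine['hostname']
    return machine
```

The CL linked above avoids this kind of type guessing by reading the hostname from the constructed host object instead.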
FTR:
- This test isn't using sequences. afaict, sequences are a dead feature that we will soon remove.
- This test _will not_ create new jobs. It just runs multiple client tests on the given DUT(s). Many server tests do that. I would not expect to see new jobs in cautotest/ for each of the client tests.
Cc: jean@chromium.org
+Jean

Okay. Unrelated, but as an FYI:

* Yes, this is not using sequences. It's just in the sequences folder because Gwendal moved it there, since it's similar to sequences.
* Sequences are not a dead feature and should not be removed. They are used by the Partners to do Storage Qual, which is one of the main use cases for MobLab.
Filed issue 698087 after scrubbing other server side tests. Either I'm misreading the situation, or all these tests have been broken for ~1 year...

I do not intend to fix all those myself.

(Ignore my comment about deleting sequences. It's not relevant here. I should've kept mum)

Comment 26 by bugdroid1@chromium.org, Mar 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/119b04fcc1d3922314c86419fc081ff403c2ad13

commit 119b04fcc1d3922314c86419fc081ff403c2ad13
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Fri Mar 03 04:17:45 2017

power_daily: Use Host's hostname attribute.

The machine argument to jobs can be a str hostname or a dict containing
hostname along with other data. Instead of trying to guess, just use the
hostname attribute of the constructed host object.

BUG= chromium:685870 
TEST=None

Change-Id: I882d8cc404c369969dd6ccaee4add79ec713c563
Reviewed-on: https://chromium-review.googlesource.com/449037
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Derek Basehore <dbasehore@chromium.org>

[modify] https://crrev.com/119b04fcc1d3922314c86419fc081ff403c2ad13/server/site_tests/sequences/control.power_daily
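
The pattern described in the commit message can be sketched as follows. `StubHost` is a hypothetical stand-in for whatever hosts.create_host() returns, and the 'hostname' dict key is an assumption; only the idea of reading the `hostname` attribute from the constructed host comes from the commit.

```python
class StubHost(object):
    """Hypothetical stand-in for a constructed host object."""

    def __init__(self, machine):
        # Assumed: a dict-style machine carries its name under 'hostname'.
        if isinstance(machine, dict):
            self.hostname = machine['hostname']
        else:
            self.hostname = machine


def hostname_for(machine, create_host=StubHost):
    """Build the host first, then ask it for its hostname."""
    host = create_host(machine)
    return host.hostname
```

In a real control file, create_host would presumably be hosts.create_host, and the resulting string is what would be handed to code that expects a hostname rather than the raw machine argument.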

That seems to fix the problem. We just need to apply the change to power_build, power_weekly, and power control files.
Status: Verified (was: Started)
Yep. Other failing tests are collected in issue 698087.
