New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 643703 link

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Closed: Nov 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

Verify jobs results mixed with test results

Project Member Reported by ka...@chromium.org, Sep 2 2016

Issue description

Starting with R55-8762.0.0, Thu Sep 01

Screenshot: https://screenshot.googleplex.com/jczfqCbkWHj
 
Owner: ----

Comment 2 by dshi@chromium.org, Sep 2 2016

Cc: jrbarnette@chromium.org shchen@chromium.org dchan@chromium.org waihong@chromium.org
 Issue 643711  has been merged into this issue.

Comment 3 by ka...@chromium.org, Sep 6 2016

Owner: pho...@chromium.org
Can this be prioritized to be looked at?
Are there blocking issues for this one to be picked up?

Comment 4 by ka...@chromium.org, Sep 7 2016

Cc: harpreet@chromium.org
Labels: -Pri-1 Pri-2
Owner: jrbarnette@chromium.org
Status: Assigned (was: Untriaged)
Argh!

This is because of https://chromium-review.googlesource.com/#/c/372980/

I'd begun to worry about side-effects of the extra stuff in status.log.

I'll take a look at what we can do, but it may have to be a few days.

Comment 8 by ka...@chromium.org, Sep 27 2016

Can we have some action here?
For a test suite with no more than 40 tests, I see more than 120 test results - https://screenshot.googleplex.com/cOdW8KxGGzS

Comment 9 by dchan@chromium.org, Oct 4 2016

Labels: -Pri-2 Pri-1
Richard, this is getting really bad, we can no longer tell if the result are good or bad with hundreds of test result instead of the expected 50-ish


https://screenshot.googleplex.com/fSAePjagaGu



Labels: M-55
This is affecting M55 results, but not M54.
2 suggestions on how to fix this:

1) Update the parser to not parse verify/provision in the status.log
2) Update wmatrix to ignore these.

1 is probably easier for us to do and turnaround.
Cc: akes...@chromium.org
+ Aviv to help assign out this work 
Having the verify/provision logs in the job's status.log seems like a useful feature. It makes it easier to associate a test job with its corresponding special tasks, no?

Richard please comment on choice between 1 and 2 in comment 12.
How about
3) Present these results in single test result view page only, and probably only in the failed ones like https://wmatrix.googleplex.com/testrun/unfiltered?test_ids=358873910, so to indicate where the issue is coming from.

Meanwhile either 1) or 2) is needed to get rid of the confusing reporting
regards to c#14, the short answer is no. 

see https://wmatrix.googleplex.com/unfiltered?suites=faft_bios&days_back=30&hide_missing=True&show_faft_view=True

Having an inconsistent test count make it very difficult to know if all or no test were executed.


> 1) Update the parser to not parse verify/provision in the status.log

This is possibly hard.  There's not anything I know of to
distinguish the log items created by verify() from legitimate test
output.  By design, there isn't supposed to be any such difference.
We could probably key off of names, but I don't think the names
are unique...


> 2) Update wmatrix to ignore these.

This likely is better, architecturally.  wmatrix is the best place
to put semantic knowledge about the names of tests.  That said,
using wmatrix could still suffer from the same problem of "how do
we distinguish valid tests from uninteresting ones?"  However,
that question seems to be part of the complaint in bug 653357, so
addressing it here could have positive side-effects.

I suspect both solutions require hard-coding a test-name pattern
to distinguish "ignorable" from "important", and that implementation
choice is likely the biggest problem in either approach.  If that's
true, changing the parser is less work than changing wmatrix.

I note also that there's a third option, which is to change repair
and verify to use a logging solution that isn't "status.log".  That's
technically not too difficult, since all the logging is isolated in
common code in client/common_lib/hosts/repair.py.  There's probably
still some effort, since someone still has to create that new logging
solution.

Finally, I'd really like someone to take a deeper look at why this is
happening.  I *think* that this problem is caused by the new servo
verify and repair logging, but that should only affect some small number
of tests.  I'd like confirmation; there may be some options for a special
case solution.

Something changed around Aug 31st / Sep 2nd. If you look at the test results prior to Sept. verify/provision job results were not part of the test results on wmatrix. Eg. https://wmatrix.googleplex.com/matrix/unfiltered?suites=hotrod&releases=tot&hide_missing=True&days_back=60
Re: Finally, I'd really like someone to take a deeper look at why this is
happening.
Can these two facts contribute to a resolution
 - This is observed ONLY on M55 tests
 - This started with R55-8762.0.0
To recap/explain:
  * The problem is almost certainly caused by this CL:
        https://chromium-review.googlesource.com/#/c/372980/
  * If the problem is caused by that CL, it will show up as
    "the extra bogus data is present in status.log for certain
    tests."
  * I believe "certain tests" means "tests that use servo."

Finally, if I'm right about this, it seems like the problems should
actually cause failures in the tests:  The test requires servo, but
servo isn't working, so the test fails.

Yes, this looks to be servo tests issue. But at the same time it affects ONLY the pure 'suites=' view.
If I add 'tests=SOME_SERVO_TEST' in the URL, I don't get the extra counts.
Such workaround though is not good solution for us when observing resutls, esp on suites with big number of tests.
Cc: chingcodes@chromium.org
Example of most the repair fail reasons on M55:

- No control file for servohost_Reboot
https://wmatrix.googleplex.com/failures/unfiltered?platforms=tricky&tests=repair.reboot&days_back=10&builds=R55-8869.0.0&releases=55&suites=hotrod&hide_missing=True

or

- <Fault 1: '<type \'exceptions.Exception\'>:method "safe_switch_usbkey_power" is not supported'>
https://wmatrix.googleplex.com/failures/unfiltered?platforms=tricky&tests=verify.servod&days_back=10&builds=R55-8869.0.0&releases=55&suites=hotrod&hide_missing=True



These repair failures are preventing tests from running on this device on M55 but the same tests run without issue on this device on M54 (link below). What needs to be done to fix this? 
https://wmatrix.googleplex.com/platform/unfiltered?platforms=tricky&suites=hotrod&days_back=10&releases=54&hide_missing=True

Comment 24 by ka...@chromium.org, Oct 26 2016

P-i-n-g! P-i-n-g! P-i-n-g!

This is not working as intended.
Can this bug be assigned appropriately? There are lots of false positive and false negative conclusions we draw from the current view on wmatrix. 

Comment 25 by ka...@chromium.org, Oct 26 2016

Labels: cros-te-lab

Comment 26 by dchan@chromium.org, Oct 26 2016

Cc: autumn@chromium.org sosa@chromium.org
The best course of action is to revert  https://chromium-review.googlesource.com/#/c/372980/ until we have a better solution for wmatrix.
+richard, sosa, autumn can we have this happen soon ?
Status: Started (was: Assigned)
> The best course of action is to revert
> https://chromium-review.googlesource.com/#/c/372980/
> until we have a better solution for wmatrix.

I'm reasonably certain I can silence the code for cases where
we're invoked from a test.  That will be a much better solution
that erverting.

Comment 28 by ihf@chromium.org, Oct 26 2016

Cc: ihf@chromium.org rohi...@chromium.org
Project Member

Comment 29 by bugdroid1@chromium.org, Oct 31 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/07c2e1d04c551dd6a0011243121340d49e562e81

commit 07c2e1d04c551dd6a0011243121340d49e562e81
Author: Richard Barnette <jrbarnette@chromium.org>
Date: Wed Oct 26 21:24:28 2016

[autotest] Keep servo repair events out of status.log in tests.

For tests that depend on servo, and use the standard recommended
boilerplate for ensuring that the servo host is created, the servo
host's repair() method will be called prior to the test.  The
resulting log records in status.log cause wmatrix to report bogus
tests that pollute the UI.

This changes servo repair in this case not to write to status.log.

BUG= chromium:643703 
TEST=unit tests

Change-Id: I293282b3e80c484dd8e636c1f3c104c3ecf9283c
Reviewed-on: https://chromium-review.googlesource.com/403453
Commit-Ready: Richard Barnette <jrbarnette@chromium.org>
Tested-by: Richard Barnette <jrbarnette@chromium.org>
Reviewed-by: danny chan <dchan@chromium.org>

[modify] https://crrev.com/07c2e1d04c551dd6a0011243121340d49e562e81/server/hosts/servo_host.py
[modify] https://crrev.com/07c2e1d04c551dd6a0011243121340d49e562e81/client/common_lib/hosts/repair.py
[modify] https://crrev.com/07c2e1d04c551dd6a0011243121340d49e562e81/client/common_lib/hosts/repair_unittest.py

Status: Fixed (was: Started)
Needs a push to prod, but otherwise, it's done.

Comment 31 by ka...@chromium.org, Nov 16 2016

Labels: Merge-Request-55

Comment 32 by dimu@chromium.org, Nov 16 2016

Labels: -Merge-Request-55 Merge-Approved-55 Hotlist-Merge-Approved
Your change meets the bar and is auto-approved for M55 (branch: 2883)
Owner: ka...@chromium.org
Status: Assigned (was: Fixed)
Merge CL is here:
    https://chromium-review.googlesource.com/#/c/411948/

Awaiting approval from kalin@

Project Member

Comment 34 by bugdroid1@chromium.org, Nov 18 2016

Labels: merge-merged-release-R55-8872.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/0d98310070b856d68ad99f741e91ce7c6bfb63dc

commit 0d98310070b856d68ad99f741e91ce7c6bfb63dc
Author: Richard Barnette <jrbarnette@chromium.org>
Date: Wed Oct 26 21:24:28 2016

[autotest] Keep servo repair events out of status.log in tests.

For tests that depend on servo, and use the standard recommended
boilerplate for ensuring that the servo host is created, the servo
host's repair() method will be called prior to the test.  The
resulting log records in status.log cause wmatrix to report bogus
tests that pollute the UI.

This changes servo repair in this case not to write to status.log.

This change is a cherry-pick from the main branch, and includes a
required bug fix from a separate CL.

BUG= chromium:643703 ,chromium:661201
TEST=unit tests

Change-Id: I64756f20c1eda0d60d16a06bb1838991c22f4f3c
Reviewed-on: https://chromium-review.googlesource.com/411948
Commit-Queue: Richard Barnette <jrbarnette@chromium.org>
Tested-by: Richard Barnette <jrbarnette@chromium.org>
Reviewed-by: Kalin Stoyanov <kalin@chromium.org>

[modify] https://crrev.com/0d98310070b856d68ad99f741e91ce7c6bfb63dc/server/hosts/servo_host.py
[modify] https://crrev.com/0d98310070b856d68ad99f741e91ce7c6bfb63dc/client/common_lib/hosts/repair.py
[modify] https://crrev.com/0d98310070b856d68ad99f741e91ce7c6bfb63dc/client/common_lib/hosts/repair_unittest.py

Project Member

Comment 35 by sheriffbot@chromium.org, Nov 30 2016

This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Status: Verified (was: Assigned)
Project Member

Comment 37 by sheriffbot@chromium.org, Dec 4 2016

This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Hotlist-Merge-Approved -Merge-Approved-55 -merge-merged-release-R55-8872.B
It's already all merged; there's nothing more to do.

Sign in to add a comment