[Kernel HW Tests] power_SuspendShutdown and power_SuspendStress.bare are broken
Issue description: These HW tests are consistently failing or having issues on the daisy, oak, and x86-alex boards in ChromeOS 54 (see wmatrix results for more info).
,
Sep 8 2016
If power_DarkResumeShutdownServer is having problems, it is likely that mosys is just broken. The test should just call into mosys to check the platform and return on daisy, oak, and alex.
,
Sep 8 2016
,
Sep 8 2016
Can someone please own this issue? This is a P1 and has been open for a while. We are trying to run kernel tests as part of compiler validation, and these few failures make it impossible for us to get green builders. If some tests are board specific, maybe you could split the suite into board-specific and non-board-specific parts so that we can run the non-board-specific tests?
,
Sep 8 2016
I looked at the consistent failures of this test suite (kernel_daily_regression) on go/wmatrix for R55, for the 14 boards we test regularly. Most of the boards are passing most of the tests (or sometimes passing a test & sometimes failing it). There are 3 tests that appear to be major problems:
  power_DarkResumeDisplay (10 boards consistently fail test; 3 boards don't run it)
  power_DarkResumeShutdownServer (6 boards consistently fail; 3 boards don't run it)
  power_SuspendStress.bare (7 boards consistently fail; 1 board doesn't run it)
1 board consistently fails power_SuspendShutdown.
Which tests fail on what boards (consistently):
  power_DarkResumeDisplay: falco, x86-alex, oak, squawks, terra, peppy, veyron_jaq, sentry, chell, nyan_big
  power_DarkResumeShutdownServer: falco, x86-alex, sentry, chell, nyan_big, link
  power_SuspendStress.bare: falco, daisy, x86-alex, oak, squawks, veyron_jaq, lumpy
  power_SuspendShutdown: falco
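For triage it can help to flip the per-test lists above into a per-board view. The following is only a sketch: the data is copied from the lists in this comment, and the names `CONSISTENT_FAILURES` and `failures_by_board` are made up for illustration and are not part of autotest or wmatrix.

```python
from collections import defaultdict

# Consistent failures as listed in the comment above (R55, kernel_daily_regression).
# CONSISTENT_FAILURES is an illustrative name, not something exported by wmatrix.
CONSISTENT_FAILURES = {
    "power_DarkResumeDisplay": [
        "falco", "x86-alex", "oak", "squawks", "terra",
        "peppy", "veyron_jaq", "sentry", "chell", "nyan_big",
    ],
    "power_DarkResumeShutdownServer": [
        "falco", "x86-alex", "sentry", "chell", "nyan_big", "link",
    ],
    "power_SuspendStress.bare": [
        "falco", "daisy", "x86-alex", "oak", "squawks", "veyron_jaq", "lumpy",
    ],
    "power_SuspendShutdown": ["falco"],
}


def failures_by_board(failures):
    """Invert a test -> boards mapping into board -> sorted list of tests."""
    by_board = defaultdict(list)
    for test, boards in failures.items():
        for board in boards:
            by_board[board].append(test)
    return {board: sorted(tests) for board, tests in by_board.items()}


if __name__ == "__main__":
    by_board = failures_by_board(CONSISTENT_FAILURES)
    # Boards with the most consistently failing tests first, then by name.
    for board in sorted(by_board, key=lambda b: (-len(by_board[b]), b)):
        print(f"{board}: {', '.join(by_board[board])}")
```

Sorting by failure count puts boards that fail several tests at the top, which may point at a board-level problem rather than a test-level one.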
,
Sep 8 2016
Let's keep discussion of the dark resume tests in issue 625281 so it doesn't get spread across two bugs. Julius, you're listed as the owner (well, author) of power_SuspendStress.bare. Any thoughts about the failures? Looking at https://wmatrix.googleplex.com/unfiltered?hide_missing=True&releases=tot&tests=power_SuspendStress.bare, "Autotest client terminated unexpectedly: DUT is no longer pingable, it may have rebooted or hung." appears to be a common reason. I'll volunteer to delete power_SuspendShutdown unless someone speaks up in its defense. powerd already has unit tests ensuring that the Suspender class requests shutdown after repeated failures. I suppose that the autotest ensures that we report failures from the powerd_suspend script back to powerd, but that doesn't feel too hard to get right and I'm not sure that it justifies the noise from this autotest.
,
Sep 9 2016
crbug.com/626467 has more details on recent power_SuspendStress efforts.
@#c4, definitely don't want power_SuspendStress/control.bareDaily in its current state to block compiler validation. The test itself isn't board specific, but expectations of it succeeding on older platforms are lower since it wasn't run across many iterations like FSI (10k).
Possible solutions:
1. Migrate control.bareDaily in its current form to another daily suite. I'm not aware of one to add it to, though. Should we create another suite?
2. Add some whitelist mechanism for boards which we know to have a lower quality bar for s2r stress.
3. Remove kernel_daily_regression from compiler qual and find a suite that parallels those tests, excluding control.bareDaily.
4. Fix all s2r stress bugs on all platforms. Not a short-term reality.
@#c6, agree the utility of power_SuspendShutdown is low given the unit tests. I also wasn't able to find any issues where it identified a real problem. I vote for removing it as well.
@Will, have time to take a look at resolving the power_SuspendStress side of this?
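Option 2 above (a per-board whitelist) is not spelled out in this thread. One possible shape, purely as a sketch: keep a mapping from test name to the boards where a failure is reported but does not block the consumer of the suite. The names `NON_BLOCKING`, `is_blocking`, and `blocking_failures` are hypothetical, and the board set is a placeholder taken from the failure list earlier in this thread, not an agreed list of "lower quality bar" boards.

```python
# Hypothetical allowlist: test name -> boards where a failure is tolerated.
# The board set is a placeholder copied from the failure list earlier in the
# thread; it is NOT an agreed-upon list of boards with a lower s2r bar.
NON_BLOCKING = {
    "power_SuspendStress.bare": frozenset(
        {"falco", "daisy", "x86-alex", "oak", "squawks", "veyron_jaq", "lumpy"}
    ),
}


def is_blocking(test, board, non_blocking=NON_BLOCKING):
    """Return True if a failure of `test` on `board` should block validation."""
    return board not in non_blocking.get(test, frozenset())


def blocking_failures(failed, non_blocking=NON_BLOCKING):
    """Filter an iterable of (test, board) failures down to the blocking ones."""
    return [(t, b) for t, b in failed if is_blocking(t, b, non_blocking)]


if __name__ == "__main__":
    failed = [
        ("power_SuspendStress.bare", "daisy"),  # tolerated by the allowlist
        ("power_SuspendStress.bare", "link"),   # not listed, so still blocking
    ]
    print(blocking_failures(failed))
```

The trade-off is the usual one for allowlists: the noise goes away for the suite's users, but the listed boards can regress further without anyone noticing unless the tolerated failures are still reported somewhere.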
,
Sep 9 2016
I've uploaded https://chromium-review.googlesource.com/c/383712/ and https://chromium-review.googlesource.com/c/383751/ to remove power_SuspendShutdown.
,
Sep 9 2016
Can we please put the onus of these tests on you instead of on the users of the suite? That is, if these tests are flaky or need some setup to work on only a few boards, then disable the tests until you decide what to do with them, instead of having the users of the suite analyze these failures every day. I think we can all agree that flaky or constantly failing tests are BAD.
,
Sep 9 2016
#9: Who's the "you" in your comment? I'm in complete agreement that flaky tests are worse-than-useless and should be disabled or deleted.
,
Sep 9 2016
Unless there's disagreement I'm migrating power_SuspendStress.bareDaily to power_daily to decouple ongoing work in crbug.com/626467 from compiler validation using kernel_daily_regression.
,
Sep 9 2016
,
Sep 9 2016
About #10: by "you" I mean the owner of the suites or tests in question. I am not aware of all the different roles of the different people copied on this bug.
,
Sep 9 2016
,
Sep 9 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/218f3d7f02dbab06ff09d8fbdfd4235d5f9beff3
commit 218f3d7f02dbab06ff09d8fbdfd4235d5f9beff3
Author: Daniel Erat <derat@chromium.org>
Date: Fri Sep 09 16:55:58 2016
autotest-server-tests: Unlist power_SuspendShutdown.
Unlist the power_SuspendShutdown test. This functionality is already covered by the power_manager package's unit tests.
BUG= chromium:631504
TEST=none
CQ-DEPEND=Ie30f0679387550b1fd446a25734d8c5c436aba66
Change-Id: I42d38cce87903c261a7f7ea8291f6ed71fb8a261
Reviewed-on: https://chromium-review.googlesource.com/383712
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Todd Broch <tbroch@chromium.org>
[modify] https://crrev.com/218f3d7f02dbab06ff09d8fbdfd4235d5f9beff3/chromeos-base/autotest-server-tests/autotest-server-tests-9999.ebuild
,
Sep 9 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/cc676563631416e3a190d29e7caacbee9c2ef866
commit cc676563631416e3a190d29e7caacbee9c2ef866
Author: Daniel Erat <derat@chromium.org>
Date: Fri Sep 09 16:56:51 2016
autotest: Remove power_SuspendShutdown.
Remove the power_SuspendShutdown test. This functionality is already covered by the power_manager package's unit tests.
BUG= chromium:631504
TEST=none
CQ-DEPEND=I42d38cce87903c261a7f7ea8291f6ed71fb8a261
Change-Id: Ie30f0679387550b1fd446a25734d8c5c436aba66
Reviewed-on: https://chromium-review.googlesource.com/383751
Commit-Ready: Dan Erat (out monday) <derat@chromium.org>
Tested-by: Dan Erat (out monday) <derat@chromium.org>
Reviewed-by: Todd Broch <tbroch@chromium.org>
[delete] https://crrev.com/785cf458ee0ce7516687add68f102825d8891df5/server/site_tests/power_SuspendShutdown/power_SuspendShutdown.py
[delete] https://crrev.com/785cf458ee0ce7516687add68f102825d8891df5/server/site_tests/power_SuspendShutdown/control
,
Sep 10 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/e7adc633f998487f6396bc9b83da03f4e17df8a0
commit e7adc633f998487f6396bc9b83da03f4e17df8a0
Author: Todd Broch <tbroch@chromium.org>
Date: Fri Sep 09 18:53:17 2016
power_SuspendStress.bareDaily move to power_daily suite.
Migrate from kernel_daily_regression to power_daily.
BUG= chromium:631504
TEST=none
Change-Id: I34e26a6ecdeef75b1b24ae6541d89c56c6fe8f5a
Reviewed-on: https://chromium-review.googlesource.com/383752
Commit-Ready: Todd Broch <tbroch@chromium.org>
Tested-by: Todd Broch <tbroch@chromium.org>
Reviewed-by: Sameer Nanda <snanda@chromium.org>
[modify] https://crrev.com/e7adc633f998487f6396bc9b83da03f4e17df8a0/client/site_tests/power_SuspendStress/control.bareDaily
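The diff for control.bareDaily is not shown in this thread. As a rough illustration only, and assuming the control file declares suite membership through an ATTRIBUTES string of comma-separated "suite:<name>" entries, the migration amounts to swapping one entry for another. The helper `move_suite` below is hypothetical and is not part of autotest.

```python
def move_suite(attributes, old_suite, new_suite):
    """Swap 'suite:<old_suite>' for 'suite:<new_suite>' in an ATTRIBUTES string.

    `attributes` is assumed to be a comma-separated string such as
    "suite:kernel_daily_regression"; other entries are left untouched and
    duplicates created by the swap are dropped.
    """
    old_entry = f"suite:{old_suite}"
    new_entry = f"suite:{new_suite}"
    result = []
    for entry in (e.strip() for e in attributes.split(",")):
        if not entry:
            continue
        if entry == old_entry:
            entry = new_entry
        if entry not in result:
            result.append(entry)
    return ", ".join(result)


if __name__ == "__main__":
    # Assumed "before" value; the real file contents are not in this thread.
    before = "suite:kernel_daily_regression"
    print(move_suite(before, "kernel_daily_regression", "power_daily"))
```

If the test was already a member of the target suite, the swap leaves a single entry rather than a duplicate.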
,
Sep 10 2016
Marking fixed based on:
#c15 & #c16 :: power_SuspendShutdown failure addressed by removing test ... thanks Dan.
#c17 :: power_SuspendStress.bareDaily failure addressed by migrating test to different suite.
issue 625281 :: power_Dark* being addressed there.
,
Sep 10 2016
thanks!
,
Sep 19 2016
Sorry, re-opening this issue. There is still a failure on ARM from this builder: https://uberchromegw.corp.google.com/i/chromeos/builders/arm-gcc-toolchain/builds/6
The error is:
[Test-Logs]: power_DarkResumeShutdownServer: ERROR: Unhandled RemotePowerException: Failed to change outlet status for host: chromeos2-row3-rack5-host21 to state: ON.
which links to: http://cautotest/tko/retrieve_logs.cgi?job=/results/77296329-chromeos-test/
Please help. This seems to be the last one. Testing with this suite is looking much better now.
,
Sep 19 2016
Issue 625281 was tracking power_DarkResumeShutdownServer. I've reopened it.
,
Nov 15 2016
Verified.
power_SuspendShutdown removed. https://wmatrix.googleplex.com/unfiltered?hide_missing=True&tests=power_SuspendShutdown
power_SuspendStress.bareDaily test migrated to power_daily suite. https://wmatrix.googleplex.com/unfiltered?releases=tot&suites=power_daily&tests=power_SuspendStress.bare
Comment 1 by cmt...@chromium.org
, Sep 8 2016