[Clapper] AU Tests are missing recently on M57
Issue description
Chromeos: 9202.56.1 / 57.0.2987.123
Device: Clapper
The tests below are missing:
R57-9202.56.1/au/autoupdate_Rollback
https://wmatrix.googleplex.com/unfiltered?hide_missing=True&tests=autoupdate_Rollback&releases=57
R57-9202.56.1/au/platform_Powerwash
https://wmatrix.googleplex.com/unfiltered?hide_missing=True&tests=platform_Powerwash&releases=57
R57-9202.56.1/au/autoupdate_EndToEndTest_npo_delta_9202.56.1
GE page: https://cros-goldeneye.corp.google.com/chromeos/console/qaRelease?releaseName=M57-STABLE-CHROMEOS-1
,
Mar 24 2017
This isn't a GE bug. The tests haven't been running.
,
Mar 25 2017
,
Mar 27 2017
Xixuan@, will you please take a look to see why this board might be having issues? Could this be an insufficient-DUT issue?
,
Mar 28 2017
Sorry for the delay; the CQ has not been in good condition. Looking now.
,
Mar 28 2017
I can't find this job on either the master or the shard, which means it was never kicked off. Who, or which service, is responsible for kicking off this test?
,
Mar 28 2017
The au stage on the builder kicks off the au suite with the three tests listed in the first comment of this bug. Since the sanity stage failed on the builder, the au stage was never run: https://uberchromegw.corp.google.com/i/chromeos_release/builders/clapper-release%20release-R57-9202.B
,
Mar 28 2017
dgarrett@, any idea what could be going on here?
,
Mar 28 2017
I think there are several causes here. The most recent 3 failures look like devserver issues; the first two look like DUT issues (possibly indicating a bad build that doesn't boot on the DUT). Passing to the deputy to investigate the devserver problems.
,
Mar 28 2017
Thanks, Don. Xixuan@, can you please take a look?
,
Mar 28 2017
Failure reasons:
1. Issue 689105 happened once in the PaygenTestStable stage.
   Solution: that bug is already making progress.
2. In several cases the DUT lost its connection in the PaygenTestStable stage.
   Solution: check whether these are the same DUTs. If it happens frequently on a fixed set of DUTs, lock them and send them to repair (go/cros-lab-device-repair). If many DUTs suffer from this problem, it may be a build problem: this error has happened a lot since build 45 (45, 46, 47), so something may be wrong with the build.
3. When the paygen test does not finish cleanly, the DUT is left in a bad state with an old build. That causes the error "DevServerStartupError('Timeout (30) waiting for remote devserver port_file',)", because the old build is missing a lib (ImportError: /lib64/libc.so.6: version `GLIBC_2.16' not found).
   Solution: a fix is ready for the CQ, but I'm not 100% sure it will fix every such case. Let me first get it through the CQ.
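The check suggested in item 2 (is it always the same DUTs that lose connection?) can be sketched roughly as below. This is only an illustration: the function name `repeat_offenders`, the `(hostname, build)` record shape, and the sample hostnames are all hypothetical and not taken from any lab tool or real data.

```python
from collections import Counter


def repeat_offenders(failures, min_count=2):
    """Return {hostname: count} for DUTs that failed at least min_count times.

    `failures` is an iterable of (hostname, build_number) pairs. The record
    shape is an assumption made for this sketch.
    """
    counts = Counter(host for host, _build in failures)
    return {host: n for host, n in counts.items() if n >= min_count}


# Hypothetical lost-connection records; NOT real lab data.
sample = [
    ("dut-a", 45), ("dut-b", 45), ("dut-a", 46), ("dut-a", 47), ("dut-c", 47),
]

offenders = repeat_offenders(sample)
print(offenders)
```

If a small fixed set of hosts dominates the result, that points toward locking and repairing those DUTs; if the failures are spread thinly over many hosts, a build problem becomes the more likely explanation.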
,
Mar 29 2017
All enguarde tests have succeeded in today's stable build. https://cros-goldeneye.corp.google.com/chromeos/console/viewRelease?releaseName=M57-STABLE-CHROMEOS-2 So this doesn't look like a build issue? more like a dut issue?
,
Mar 29 2017
> All enguarde tests have succeeded in today's stable build.
> [ ... ] more like a dut issue?
This is not a DUT issue. It's some sort of software issue, but
I can't say I understand it. I checked the complete history of
all BVT clapper DUTs in the last 24 hours. There were 45 provision
failures. That's a whole lot. Of those 45 failures, 43 had this
error:
RootfsUpdateError: Failed to perform rootfs update: DevServerStartupError('Timeout (30) waiting for remote devserver port_file',)
So, that's the problem we need to explain.
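For illustration, a tally like the "43 of 45" above could be produced by grouping failure messages by their innermost error name. This is a minimal sketch under assumptions: the helper name `tally_errors` and the sample list are hypothetical, and only the first message format is copied from the error quoted in this comment.

```python
import re
from collections import Counter

# Matches identifiers ending in "Error", e.g. RootfsUpdateError.
ERROR_NAME = re.compile(r"\b(\w+Error)\b")


def tally_errors(messages):
    """Count failure messages by the last (innermost) error name they mention."""
    counts = Counter()
    for message in messages:
        names = ERROR_NAME.findall(message)
        counts[names[-1] if names else "unknown"] += 1
    return counts


DEVSERVER_MSG = (
    "RootfsUpdateError: Failed to perform rootfs update: "
    "DevServerStartupError('Timeout (30) waiting for remote devserver port_file',)"
)

# Hypothetical sample; the second message is a made-up placeholder.
sample_messages = [DEVSERVER_MSG, DEVSERVER_MSG, DEVSERVER_MSG, "some other provision failure"]

counts = tally_errors(sample_messages)
print(counts.most_common())
```

Keying on the innermost error rather than the outer RootfsUpdateError keeps the devserver timeouts separate from other rootfs update failures.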
,
Mar 29 2017
I think we're talking about whether an error like "Autotest client terminated unexpectedly: DUT is pingable, SSHable and did NOT restart un-expectedly. We probably lost connectivity during the test." in the PaygenTestStable stage is a DUT issue or not. As for the DevServerStartupError, I think it comes from "installing a new build on a DUT with old builds" (in comment #11).
,
Mar 29 2017
Looking at the provision failures, I suspect that bug 689105 is the biggest contributor to failures here, and quite possibly the entire source of the problem. In particular, in c#48 on that bug, clapper is specifically called out as affected. One implication of all this is that after the bug is fixed in M59, we need to merge the fix back to M58 and M57.
,
May 15 2017
I think we can close this now. Bug 689105 has made progress, and clapper has been running this test again recently.
Comment 1 by abod...@chromium.org
, Mar 23 2017