Speed up or split tests which take >20 minutes to run
Issue description

We are planning on setting the timeout for tests to 20 minutes. The following tests occasionally take >20 minutes to pass (each line gives the test name, the percentage of runs, and the number of occurrences).

moblab_RunSuite 62.8% 254
cheets_NotificationTest 1.00% 14
cheets_CTS_N.CtsOpenGLTestCases 0.70% 12
cheets_CTS_N.CtsOpenGlPerf2TestCases 0.58% 10
provision_AutoUpdate.double 0.19% 10
cheets_CTS_N.CtsDramTestCases 0.46% 8
cheets_CTS_N.CtsAccountManagerTestCases 0.35% 6
jetstream_LocalApi 0.40% 4
cheets_GTS.GtsNetTestCases 0.14% 4
cheets_GTS.GtsAdminTestCases 0.07% 2
jetstream_NetworkInterfaces 0.20% 2
video_VideoSanity 0.64% 2
cheets_CTS.android.core.tests.libcore.package.harmony_java_math 0.10% 2
platform_DMVerityBitCorruption.last 0.02% 1
jetstream_ApiServerAttestation 0.10% 1
platform_DMVerityBitCorruption.first 0.02% 1

Here is the original query for it: https://plx.corp.google.com/script/#a=qo%7Ci=google%253A%253Ascript_b4._04f204_3f65_4d63_8b38_90087ed9bcca

We need to do something about tests that occasionally take >20 minutes, such as the above tests. Note that while many cheets tests are in the list above, it's not the case that all cheets tests are slow -- there are plenty that run faster.
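For anyone wanting to reproduce numbers like these outside the original query, here is a minimal sketch of the calculation. The record shape (test name, duration in seconds, pass flag) and the function name are assumptions for illustration; the real query's schema is not shown in this issue.

```python
from collections import defaultdict

TIMEOUT_SECONDS = 20 * 60  # the 20-minute timeout proposed in this issue


def slow_pass_stats(runs, threshold=TIMEOUT_SECONDS):
    """Return {test: (percent_over_threshold, count_over_threshold)}.

    `runs` is an iterable of (test_name, duration_seconds, passed) tuples.
    This record shape is hypothetical, not the real query schema.
    Only passing runs are counted, and tests with no slow runs are omitted.
    """
    total = defaultdict(int)
    slow = defaultdict(int)
    for name, duration, passed in runs:
        if not passed:
            continue
        total[name] += 1
        if duration > threshold:
            slow[name] += 1
    return {
        name: (100.0 * slow[name] / total[name], slow[name])
        for name in total
        if slow[name]
    }
```

The same function applied to failing runs (by flipping the pass filter) would give the ">20 mins to fail" variant of the list.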
May 23 2017
moblab_RunSuite is an outlier because it's really an entire suite contained within a single test (including provisioning of sub-DUTs). Some ideas for making it faster are discussed in Issue 718618. Nevertheless, I propose a different timeout for it.
May 23 2017
Here are tests which took >20 mins to fail (each line gives the test name, the percentage of runs, and the number of occurrences over 30 days):

moblab_RunSuite 88.89% 32
cheets_CTS.android.core.tests.libcore.package.harmony_java_math 31.11% 14
cheets_CTS.com.android.cts.dram 33.33% 12
cheets_CTS_N.CtsOpenGLTestCases 13.79% 8
cheets_CTS_N.CtsDramTestCases 8.57% 6
video_ChromeHWDecodeUsed.h264 6.19% 6
provision_AutoUpdate.double 40.00% 6
cheets_CTS_N.CtsAccountManagerTestCases 8.33% 6
cheets_StartAndroid.stress 30.00% 6
cheets_CTS_N.CtsOpenGlPerf2TestCases 6.45% 4
video_ChromeRTCHWDecodeUsed.mjpeg 100.00% 4
graphics_Idle 100.00% 2
video_VideoSanity.vp9 100.00% 2
login_CryptohomeIncognito 100.00% 2

moblab_RunSuite is a special case (since it's not part of the suite and doesn't have retries configured). It currently just uses the overall suite timeout of 2.5 hours. It would probably benefit from a shorter timeout, say 40 minutes. It's very rare to see moblab runs over 40 or 60 minutes: we've only seen 2 runs over 40 minutes that passed (taking 110 minutes or so), and 26 runs over 40 minutes that failed (taking between 40 and 60 minutes).
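The proposal amounts to a default timeout with a per-test exception. A minimal sketch of that lookup follows; the table and function names are hypothetical and do not reflect how the test infrastructure actually stores timeouts. The 20- and 40-minute values are the ones suggested in this thread.

```python
DEFAULT_TIMEOUT_MINS = 20  # proposed default for all tests

# Hypothetical per-test overrides; moblab_RunSuite gets the 40 minutes
# suggested above instead of the 2.5-hour suite timeout.
TIMEOUT_OVERRIDES_MINS = {
    "moblab_RunSuite": 40,
}


def timeout_mins_for(test_name):
    """Return the timeout in minutes for a test, falling back to the default."""
    return TIMEOUT_OVERRIDES_MINS.get(test_name, DEFAULT_TIMEOUT_MINS)
```

With the counts quoted above, a 40-minute limit would have cut short the 2 passing runs that took about 110 minutes, while the 26 failing runs would have been stopped at 40 minutes rather than running for up to 60.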
May 23 2017
I suspect that the data above is somehow including provisioning
time, at least in some cases. The red flags are these:
platform_DMVerityBitCorruption.first
platform_DMVerityBitCorruption.last
cheets_CTS.android.core.tests.libcore.package.harmony_java_math
Those tests commonly show up as the victims of provisioning failures.
If the test times that you gathered were including provisioning,
that would explain their presence on the list. Most especially,
a spot check says that the "DMVerity" tests take about 20 seconds.
I consider it unlikely that these tests have a bug that causes them
sometimes to take 20 minutes and then pass.
Moreover, the fact that _any_ test passes with more than 20 minutes
run time is a red flag: The default job timeout is 20 minutes. Jobs
have a little bit of leeway in the timeout, but basically, any job with
more than 20 minutes of runtime should have aborted, not passed.
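One way to check this suspicion is to look for runs that cross the threshold only when provisioning time is added in. The sketch below assumes provisioning and test time are available as separate fields, which is an assumption and may not match the data behind the query; the function name is made up for illustration.

```python
THRESHOLD_SECONDS = 20 * 60


def provisioning_suspects(runs, threshold=THRESHOLD_SECONDS):
    """Return sorted test names that exceed `threshold` only with provisioning.

    `runs` is an iterable of (test_name, provision_seconds, test_seconds)
    tuples. This record shape is hypothetical.
    """
    suspects = set()
    for name, provision_s, test_s in runs:
        # The test itself fits in the limit, but provisioning pushes it over.
        if test_s <= threshold and provision_s + test_s > threshold:
            suspects.add(name)
    return sorted(suspects)
```

A "DMVerity" run with about 20 seconds of test time would land in this list whenever its provisioning step alone took close to 20 minutes.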
May 24 2017
My take is that the cheets_CTS tests were hitting the shard server slowdown (issue 724396), which was fixed on all servers on Monday. The update tests may have been slow due to slow devservers/network. The cheets_StartAndroid.stress failures may have been due to the TPM blocking Chrome start on newer devices. The video tests may have hit system hangs/reboots. These are just wild guesses from memory, as I don't have the log files.
May 24 2018
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
May 24 2018
Comment 1 by davidjames@chromium.org, May 23 2017