
Issue 725615

Starred by 2 users

Issue metadata

Status: Archived
Owner: ----
Closed: May 2018
Cc:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 1
Type: Bug




Speed up or split tests which take >20 minutes to run

Project Member Reported by davidjames@chromium.org, May 23 2017

Issue description

We are planning on setting the timeout for tests to 20 minutes.

The following tests occasionally take >20 minutes to pass (each row gives the test name, the percentage, and the number of occurrences).

moblab_RunSuite	62.8%	254
cheets_NotificationTest	1.00%	14
cheets_CTS_N.CtsOpenGLTestCases	0.70%	12
cheets_CTS_N.CtsOpenGlPerf2TestCases	0.58%	10
provision_AutoUpdate.double	0.19%	10
cheets_CTS_N.CtsDramTestCases	0.46%	8
cheets_CTS_N.CtsAccountManagerTestCases	0.35%	6
jetstream_LocalApi	0.40%	4
cheets_GTS.GtsNetTestCases	0.14%	4
cheets_GTS.GtsAdminTestCases	0.07%	2
jetstream_NetworkInterfaces	0.20%	2
video_VideoSanity	0.64%	2
cheets_CTS.android.core.tests.libcore.package.harmony_java_math	0.10%	2
platform_DMVerityBitCorruption.last	0.02%	1
jetstream_ApiServerAttestation	0.10%	1
platform_DMVerityBitCorruption.first	0.02%	1

Here is the original query for it:

https://plx.corp.google.com/script/#a=qo%7Ci=google%253A%253Ascript_b4._04f204_3f65_4d63_8b38_90087ed9bcca

We need to do something about tests that occasionally take >20 minutes, such as the above tests. Note that while many cheets tests are in the list above, it's not the case that all cheets tests are slow -- there are plenty that run faster.
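
As a rough illustration of how figures like the ones above could be derived, here is a minimal sketch that counts, per test, how many runs exceeded the 20-minute threshold and what share of that test's runs they represent. The record shape (test name plus duration in minutes), the function name, and the sample data are assumptions for illustration; the real numbers came from the query linked above, whose schema is not shown here.

```python
from collections import defaultdict

TIMEOUT_MINUTES = 20  # proposed per-test timeout from this issue


def slow_run_stats(runs, threshold_minutes=TIMEOUT_MINUTES):
    """Return {test_name: (percent_slow, slow_count)} for tests with slow runs.

    `runs` is an iterable of (test_name, duration_minutes) pairs. This record
    shape is a hypothetical stand-in, not the schema of the original query.
    """
    totals = defaultdict(int)
    slow = defaultdict(int)
    for name, minutes in runs:
        totals[name] += 1
        if minutes > threshold_minutes:
            slow[name] += 1
    # Only tests with at least one slow run appear in the result.
    return {
        name: (100.0 * count / totals[name], count)
        for name, count in slow.items()
    }


# Made-up sample data, only to show the call shape.
sample_runs = [
    ("example_SlowTest", 25.0),
    ("example_SlowTest", 5.0),
    ("example_FastTest", 1.0),
]

if __name__ == "__main__":
    stats = slow_run_stats(sample_runs)
    # Sort by number of slow occurrences, highest first.
    for name, (percent, count) in sorted(stats.items(), key=lambda kv: -kv[1][1]):
        print(f"{name}\t{percent:.2f}%\t{count}")
```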

 
Cc: ihf@chromium.org
moblab_RunSuite is an outlier because it's really an entire suite contained within it (including provisioning of sub-DUTs).

Some ideas for making it faster are discussed in Issue 718618. Nevertheless, I propose a different timeout for it.
Here are tests which took >20 mins to fail (and number of occurrences over 30 days):

moblab_RunSuite	88.89%	32
cheets_CTS.android.core.tests.libcore.package.harmony_java_math	31.11%	14
cheets_CTS.com.android.cts.dram	33.33%	12
cheets_CTS_N.CtsOpenGLTestCases	13.79%	8
cheets_CTS_N.CtsDramTestCases	8.57%	6
video_ChromeHWDecodeUsed.h264	6.19%	6
provision_AutoUpdate.double	40.00%	6
cheets_CTS_N.CtsAccountManagerTestCases	8.33%	6
cheets_StartAndroid.stress	30.00%	6
cheets_CTS_N.CtsOpenGlPerf2TestCases	6.45%	4
video_ChromeRTCHWDecodeUsed.mjpeg	100.00%	4
graphics_Idle	100.00%	2
video_VideoSanity.vp9	100.00%	2
login_CryptohomeIncognito	100.00%	2


moblab_RunSuite is a special case (since it's not part of the suite and doesn't have retries configured).
It currently just uses the overall suite timeout of 2.5 hours. It would probably benefit from a shorter timeout, say 40 minutes. It's very rare to see moblab runs over 40 or 60 minutes: we've only seen 2 runs over 40 minutes that passed (taking 110 minutes or so), and 26 runs over 40 minutes that failed (taking between 40 and 60 minutes).
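
One way to picture a per-test exception to the 20-minute default is a small override table. This is only a sketch: the table, the constant names, and `timeout_for` are hypothetical and do not reflect how autotest actually configures timeouts. The 40-minute value is the one suggested above.

```python
DEFAULT_TIMEOUT_MINUTES = 20  # proposed default from this issue

# Hypothetical override table; 40 minutes is the value suggested above.
TIMEOUT_OVERRIDES_MINUTES = {
    "moblab_RunSuite": 40,
}


def timeout_for(test_name):
    """Return the timeout in minutes for a test, falling back to the default."""
    return TIMEOUT_OVERRIDES_MINUTES.get(test_name, DEFAULT_TIMEOUT_MINUTES)


if __name__ == "__main__":
    for name in ("moblab_RunSuite", "video_VideoSanity"):
        print(name, timeout_for(name))
```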
I suspect that the data above is somehow including provisioning
time, at least in some cases.  The red flags are these:
    platform_DMVerityBitCorruption.first
    platform_DMVerityBitCorruption.last
    cheets_CTS.android.core.tests.libcore.package.harmony_java_math

Those tests commonly show up as the victims of provisioning failures.
If the test times that you gathered were including provisioning,
that would explain their presence on the list.  In particular,
a spot check shows that the "DMVerity" tests take about 20 seconds.
I consider it unlikely that these tests have a bug that causes them
sometimes to take 20 minutes and then pass.

Moreover, the fact that _any_ test passes with more than 20 minutes
run time is a red flag:  The default job timeout is 20 minutes.  Jobs
have a little bit of leeway in the timeout, but basically, any job with
more than 20 minutes of runtime should have aborted, not passed.
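
The two checks described above can be sketched as follows: subtract provisioning time before comparing against the timeout, and flag any job that passed even though its test time alone exceeded the timeout. The `JobRecord` fields, the category names, and the sample numbers are all assumptions for illustration; the real data may not record provisioning time separately.

```python
from dataclasses import dataclass

JOB_TIMEOUT_MINUTES = 20  # default job timeout mentioned above


@dataclass
class JobRecord:
    """Hypothetical record shape; field names are not from the real schema."""
    test_name: str
    total_minutes: float      # wall time including provisioning
    provision_minutes: float  # time spent provisioning the DUT
    passed: bool


def classify(rec, timeout=JOB_TIMEOUT_MINUTES):
    """Label a job by whether provisioning explains its long duration."""
    if rec.total_minutes <= timeout:
        return "within_timeout"
    test_minutes = rec.total_minutes - rec.provision_minutes
    if test_minutes <= timeout:
        # Over the limit only because provisioning was counted.
        return "provisioning_inflated"
    if rec.passed:
        # Should have been aborted by the job timeout; worth investigating.
        return "suspicious_pass"
    return "slow_failure"


if __name__ == "__main__":
    # Made-up numbers, only to show the call shape.
    rec = JobRecord("example_QuickTest", 21.0, 20.5, True)
    print(rec.test_name, classify(rec))
```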

Comment 5 by ihf@chromium.org, May 24 2017

My take is that the cheets_CTS tests were hitting the shard server slowdown (issue 724396), which was fixed on all servers on Monday. The update tests may have been slow due to slow devservers/network. The cheets_StartAndroid.stress slowness may have been due to the TPM blocking Chrome start on newer devices. The video tests may have been system hangs/reboots. These are just wild guesses from memory, as I don't have the log files.
Project Member

Comment 6 by sheriffbot@chromium.org, May 24 2018

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Status: Archived (was: Untriaged)
