
Issue 669935

Starred by 1 user

Issue metadata

Status: WontFix
Owner: ----
Closed: Jan 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 1
Type: Bug




Enforce a per-step and/or per-run deadline for performance tests

Project Member Reported by charliea@chromium.org, Nov 30 2016

Issue description

https://bugs.chromium.org/p/chromium/issues/detail?id=669088 tracks an instance where a deadlock in cloud storage resulted in a four day hang on Win High-DPI (1).

I think we should make it impossible for this to happen again.

At the very least, it seems like we should have a per-step deadline. This could be high enough to guarantee that it doesn't kick in until something goes seriously wrong: maybe 12, 18, or 24 hours. This would prevent a multi-day hang from ever occurring again; the hang would instead manifest as a more manageable test failure.
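For illustration only, here is a minimal sketch of what a per-step deadline could look like if a step is launched as a child process. The names `run_step` and `STEP_DEADLINE_SECONDS` are hypothetical and not taken from the perf test harness; the 12-hour value is just one of the numbers floated above. Note that `subprocess.run` kills only the direct child on timeout, so a real implementation would probably also need to clean up any grandchild processes.

```python
import subprocess
import sys

# Hypothetical value; the issue suggests something like 12, 18, or 24 hours.
STEP_DEADLINE_SECONDS = 12 * 60 * 60


def run_step(cmd, deadline_seconds=STEP_DEADLINE_SECONDS):
    """Run one step as a child process; return (succeeded, reason)."""
    try:
        completed = subprocess.run(cmd, timeout=deadline_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising, so the hang
        # surfaces as an ordinary step failure instead of blocking the bot.
        return False, "deadline exceeded"
    if completed.returncode != 0:
        return False, "exit code %d" % completed.returncode
    return True, "ok"


if __name__ == "__main__":
    # A trivial stand-in for a real perf test step.
    print(run_step([sys.executable, "-c", "print('step ran')"], deadline_seconds=60))
```

With this shape, a hung step is reported with the reason "deadline exceeded" and the run can move on to the next step.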

I could also imagine a situation where different steps each hang until the deadline for the same underlying reason. If there are 20 steps, and 10 of them hang for the same reason until a 12-hour deadline, that would still result in a 5-day hang (10 × 12 hours = 120 hours), which would still be unacceptable. For this reason, it seems like it might make sense to also enforce a deadline for the entire run, one that's longer than the deadline for a single step.
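As a rough sketch of how a per-run deadline could sit on top of a per-step one, each step can be given whichever is smaller: the step deadline or the time left in the run budget. Everything here is hypothetical (`run_with_deadlines`, the step callables that accept a timeout, the injectable `clock`); it only illustrates the bookkeeping, not how the perf harness actually schedules steps.

```python
import time


def run_with_deadlines(steps, step_deadline, run_deadline, clock=time.monotonic):
    """Run steps in order under both a per-step and a per-run deadline.

    steps is a list of (name, step) pairs, where step is a callable that
    takes a timeout in seconds and returns True on success. Both deadlines
    are in seconds. Returns a dict mapping step name to an outcome string.
    """
    run_start = clock()
    results = {}
    for name, step in steps:
        remaining = run_deadline - (clock() - run_start)
        if remaining <= 0:
            # The run budget is spent: skip the step rather than start it.
            results[name] = "skipped: run deadline exceeded"
            continue
        # A step never gets more time than what is left of the run budget.
        succeeded = step(min(step_deadline, remaining))
        results[name] = "passed" if succeeded else "failed"
    return results
```

Under this scheme, 10 hanging steps with a 12-hour step deadline and, say, an 18-hour run deadline would cost at most 18 hours in total rather than 5 days; the remaining steps would be reported as skipped.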

eyaich@, nednguyen@, WDYT? Any ideas about how hard this would be to implement?
 
Cc: -charliea@google.com

Comment 2 by eyaich@chromium.org, Nov 30 2016

This is already handled on swarming: we have a 5-hour execution timeout per run for our perf tests. Our unit tests are swarmed too, but I'm not sure what the execution timeout is set to there. I think the default swarming timeout is 2 hours.
Should our policy for these types of "solved by swarming" problems be not to take action with the idea that all perf unit tests will be swarmed soon enough?
After more discussion, I think that "solved by swarming" is okay as long as it's accompanied by a "warn jbudorick@ that he might want to come up with something for Android" too.
Status: WontFix (was: Untriaged)
