tast.py should choose reasonable timeout for "tast run" command |
||
Issue descriptionRight now, the tast.py server test uses a hardcoded timeout of 10 minutes for the "tast run" command. This is inflexible and will become a problem as more tests are added. tast.py already runs "tast list" to get the list of expected tests. It should probably sum up the individual test timeouts reported there and use that to determine a reasonable timeout (i.e. with a hardcoded max) for "tast run". Before this can happen, I need to update "tast list" to include the default timeout for tests that haven't specified their own values -- right now, it just reports 0.
,
Jun 12 2018
I sorted my thoughts. Basically each test timeout is imposed by a higher level. A timeout is not inherent, like an expected runtime. It is a "you are way too late" situation and impacting other operations. In particular for individual tests it should be 100...1000 times the expected runtime. (Not really outrageous because we have Chromebooks that are 20 times faster than others.) So a 100ms...1s test (on eve) could have a 2-5 minutes timeout. That way we know for sure it wasn't just a minor slowdown that lead to the timeout. The global timeout is dictated by the lab and the suite it runs in. If tast runs in the cq, it overstays its welcome after 20 minutes. Ideally it knows about that and takes action before 20 minutes are (20 minutes is dictated by infra team). It could wait for the lab to handle the situation, but that is a heavier handed outcome. It is preferable if tast can be proactive. Now if not in the cq, but doing daily style regression testing, tast might get hours of runtime allocated. But even there it should ideally notice when things are going wrong and handle the timeout before the lab does (some suites are expected to run days). Notice that in a situation where many tests are failing, we will normally see a lab slowdown (retries, timeouts etc.) That said I hear the Chrome testing framework will check if there is a high percentage rate of individual test failures (timeouts) and abort the run early to keep constant runtime expectations (or even get faster in the failure case which could be a very beneficial property if the whole lab does it, not just tast). Long winding story why long term summing up times is not the way to go, but short time (as long as we just have a few dozen tests) no harm in it.
,
Jun 12 2018
Thanks for all the details! I understand and agree that the "tast run" command isn't expected to take as long to run as the sum of the individual test timeouts; that's why I added a one-hour cap in https://crrev.com/c/1096358. It makes sense for that to be shorter and/or overridable by control files depending on where the suite is going to be run. I still think that it probably makes sense to keep some form of summing: if we're just running five tests with one-minute timeouts, then something has gone horribly wrong if the tast command is taking more than ~5 minutes to finish, and I'd prefer for us to cut it off without waiting for a probably-longer fixed timeout. Longer-term, tests should detect bad conditions like Chrome crash loops and bail out quickly with descriptive error messages instead of timing out with "context deadline exceeded".
,
Jun 12 2018
Sounds good.
,
Jun 13 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/tast/+/fd8eb71c4cd53e0e55207de54a5c15dcf4222e59 commit fd8eb71c4cd53e0e55207de54a5c15dcf4222e59 Author: Daniel Erat <derat@chromium.org> Date: Wed Jun 13 04:50:49 2018 tast: Disallow tests with negative timeouts. Make testing.Registry.AddTest return an error when passed a test with a negative timeout. If anyone did this, it could cause problems once the tast.py Autotest test uses test timeouts as a guide when choosing a timeout for the "tast run" command. BUG= chromium:851572 TEST=updated unit tests Change-Id: I94c00355cf1fee7b13d8d19a0415124729d0f4c9 Reviewed-on: https://chromium-review.googlesource.com/1096367 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: Shuhei Takahashi <nya@chromium.org> [modify] https://crrev.com/fd8eb71c4cd53e0e55207de54a5c15dcf4222e59/src/chromiumos/tast/testing/registry.go [modify] https://crrev.com/fd8eb71c4cd53e0e55207de54a5c15dcf4222e59/src/chromiumos/tast/testing/registry_test.go
,
Jun 13 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/tast/+/42f938bd3eae9d4fe43a1776545015bb89177253 commit 42f938bd3eae9d4fe43a1776545015bb89177253 Author: Daniel Erat <derat@chromium.org> Date: Wed Jun 13 08:00:53 2018 tast: Include default test timeouts in "tast list" output. Update the bundle package to assign the default timeout to unset testing.Test.Timeout fields when producing the initial list of matching tests in readArgs, rather than later on in runTest. This causes tests to include their actual timeouts in the output from "tast list -json". BUG= chromium:851572 TEST=added a unit test; also manually verified that "tast list -json" prints default timeouts for tests that haven't specified custom ones Change-Id: I243b28e823dec56871bf6f379b23a9188493f03d Reviewed-on: https://chromium-review.googlesource.com/1095567 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: Shuhei Takahashi <nya@chromium.org> [modify] https://crrev.com/42f938bd3eae9d4fe43a1776545015bb89177253/src/chromiumos/tast/bundle/bundle_test.go [modify] https://crrev.com/42f938bd3eae9d4fe43a1776545015bb89177253/src/chromiumos/tast/bundle/bundle.go [modify] https://crrev.com/42f938bd3eae9d4fe43a1776545015bb89177253/src/chromiumos/tast/bundle/args_test.go [modify] https://crrev.com/42f938bd3eae9d4fe43a1776545015bb89177253/src/chromiumos/tast/testing/test.go [modify] https://crrev.com/42f938bd3eae9d4fe43a1776545015bb89177253/src/chromiumos/tast/bundle/args.go
,
Jun 13 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/tast/+/10d1688c149af7c5d7f7bb1ee64a06bba91ad7e3 commit 10d1688c149af7c5d7f7bb1ee64a06bba91ad7e3 Author: Daniel Erat <derat@chromium.org> Date: Wed Jun 13 08:00:53 2018 tast: Make testing.Registry return copies of tests. Make testing.Registry's AllTests, TestsForPatterns, and TestsForAttrExpr functions return deep copies of Test structs instead of pointers to the actual structs held in the registry. This reduces the risk of callers making changes (e.g. applying default timeouts) that affect the underlying tests. BUG= chromium:851572 TEST=updated unit tests; also manually verified that "tast run" and "tast list" still work when requesting all tests, wildcards, and attribute expressions Change-Id: Ia75301dbf88ac5ca644739577dc4e11136b4b915 Reviewed-on: https://chromium-review.googlesource.com/1095857 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: Shuhei Takahashi <nya@chromium.org> [modify] https://crrev.com/10d1688c149af7c5d7f7bb1ee64a06bba91ad7e3/src/chromiumos/tast/testing/test.go [modify] https://crrev.com/10d1688c149af7c5d7f7bb1ee64a06bba91ad7e3/src/chromiumos/tast/testing/registry.go [modify] https://crrev.com/10d1688c149af7c5d7f7bb1ee64a06bba91ad7e3/src/chromiumos/tast/testing/test_test.go [modify] https://crrev.com/10d1688c149af7c5d7f7bb1ee64a06bba91ad7e3/src/chromiumos/tast/bundle/bundle.go [modify] https://crrev.com/10d1688c149af7c5d7f7bb1ee64a06bba91ad7e3/src/chromiumos/tast/testing/registry_test.go
,
Jun 14 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/e5f0fa39db2b01ed1f414c39f79f0fd48d62a168 commit e5f0fa39db2b01ed1f414c39f79f0fd48d62a168 Author: Daniel Erat <derat@chromium.org> Date: Thu Jun 14 21:40:36 2018 autotest: Make tast.py sum per-test timeouts. Make the "tast" server test sum individual Tast tests' timeouts to determine the overall timeout for the "tast run" command. The sum is capped to a configurable duration with a default of one hour. BUG= chromium:851572 TEST=added unit tests CQ-DEPEND=I243b28e823dec56871bf6f379b23a9188493f03d Change-Id: Ia809c5fbd1e0b608885e694f1622062ae20d6ad6 Reviewed-on: https://chromium-review.googlesource.com/1096358 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: Ilja H. Friedel <ihf@chromium.org> [modify] https://crrev.com/e5f0fa39db2b01ed1f414c39f79f0fd48d62a168/server/site_tests/tast/tast_unittest.py [modify] https://crrev.com/e5f0fa39db2b01ed1f414c39f79f0fd48d62a168/server/site_tests/tast/tast.py
,
Jun 25 2018
|
||
►
Sign in to add a comment |
||
Comment 1 by ihf@chromium.org
, Jun 11 2018