If the first collected swarming task is expired, no remaining swarming tasks are collected
Issue description
See: https://build.chromium.org/p/chromium.android.fyi/builders/x64%20Device%20Tester/builds/1212 vs https://build.chromium.org/p/chromium.android.fyi/builders/x64%20Device%20Tester/builds/1213
In the second build, all tasks expired, but only the first task is displayed as expired. r06e6b3accc66f6c31053055c8e0efcd978f18b03 landed between those two builds and is likely related, so assigning to its author.
,
Apr 12 2017
er, rather, this was likely happening before for non-gtest tasks, but it's only appearing on that bot now because it only runs gtest tasks.
,
Apr 12 2017
,
Apr 12 2017
Bumping up the priority since this is causing us to lose a lot of perf data.
,
Apr 12 2017
,
Apr 12 2017
,
Apr 12 2017
It seems our tests also had a similar problem: https://uberchromegw.corp.google.com/i/internal.mediarouter/builders/Windows%20Build/builds/2504 It only happens on the Windows build, though.
,
Apr 12 2017
Thanks John for the quick fix in https://chromium-review.googlesource.com/c/476031/. I still think the root cause is that the loop in https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromium_tests/api.py?rcl=424fd63dc75bf3fec55f76b98c932621c52577cf&l=286 is too harsh:

    for t in tests:
      try:
        t.run(self._api_for_tests, suffix)
      except self.m.step.InfraFailure:  # pragma: no cover
        raise
      except self.m.step.StepFailure:  # pragma: no cover
        failed_tests.append(t)
        if t.abort_on_failure:
          raise

This means any InfraFailure on any step will block the whole build, but I don't think we can guarantee a zero percent failure rate for InfraFailure, especially given that any SwarmingFailure will be an InfraFailure (see https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/swarming/api.py?rcl=424fd63dc75bf3fec55f76b98c932621c52577cf&l=668).

I think the loop here should be adjusted so that the build keeps going upon InfraFailure. What do other folks think?
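For illustration only, the adjustment proposed in this comment might look roughly like the sketch below. It is not the actual recipe code: `StepFailure`, `InfraFailure`, `FakeTest`, and `run_tests` are hypothetical stand-ins defined here so the example runs on its own, and modeling `InfraFailure` as a subclass of `StepFailure` is an assumption of the sketch.

```python
class StepFailure(Exception):
    """Hypothetical stand-in for the recipe engine's step failure."""


class InfraFailure(StepFailure):
    """Hypothetical stand-in for an infra failure (e.g. an expired task)."""


class FakeTest:
    """Hypothetical test object; raises `error` when run, if one is given."""

    def __init__(self, name, error=None, abort_on_failure=False):
        self.name = name
        self.error = error
        self.abort_on_failure = abort_on_failure

    def run(self, api, suffix):
        if self.error is not None:
            raise self.error


def run_tests(tests, api=None, suffix=''):
    """Runs every test, recording infra failures instead of re-raising."""
    failed_tests = []
    infra_failed_tests = []
    for t in tests:
        try:
            t.run(api, suffix)
        except InfraFailure:
            # Record the infra failure and keep going with the next test.
            infra_failed_tests.append(t)
        except StepFailure:
            failed_tests.append(t)
            if t.abort_on_failure:
                raise
    return failed_tests, infra_failed_tests


tests = [
    FakeTest('first', error=InfraFailure('task expired')),
    FakeTest('second'),
    FakeTest('third', error=StepFailure('test failed')),
]
failed, infra_failed = run_tests(tests)
print([t.name for t in failed], [t.name for t in infra_failed])
```

In this sketch the caller decides afterwards how to report the recorded infra failures, so one expired task no longer hides the results of the tests that follow it.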
,
Apr 12 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/b2c7e2705067c2f48e6ef702402467972f74de09

commit b2c7e2705067c2f48e6ef702402467972f74de09
Author: John Budorick <jbudorick@chromium.org>
Date: Wed Apr 12 21:08:49 2017

Catch all exceptions from swarming summary json processing.

Bug:710710
Change-Id: I628e6c695165221f59103f58b3bb6f266f7dc5a2
Reviewed-on: https://chromium-review.googlesource.com/476031
Reviewed-by: Stephen Martinis <martiniss@chromium.org>
Commit-Queue: John Budorick <jbudorick@chromium.org>

[modify] https://crrev.com/b2c7e2705067c2f48e6ef702402467972f74de09/scripts/slave/recipe_modules/swarming/example.expected/swarming_expired_new.json
[modify] https://crrev.com/b2c7e2705067c2f48e6ef702402467972f74de09/scripts/slave/recipe_modules/ios/example.expected/expired.json
[modify] https://crrev.com/b2c7e2705067c2f48e6ef702402467972f74de09/scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_isolated_chartjson_test_harness_failure.json
[modify] https://crrev.com/b2c7e2705067c2f48e6ef702402467972f74de09/scripts/slave/recipe_modules/swarming/api.py
[modify] https://crrev.com/b2c7e2705067c2f48e6ef702402467972f74de09/scripts/slave/recipes/chromium.expected/dynamic_swarmed_sharded_invalid_json_isolated_script_test.json
[modify] https://crrev.com/b2c7e2705067c2f48e6ef702402467972f74de09/scripts/slave/recipe_modules/swarming/example.expected/swarming_expired_old.json
[modify] https://crrev.com/b2c7e2705067c2f48e6ef702402467972f74de09/scripts/slave/recipes/chromium.expected/dynamic_swarmed_passed_isolated_script_test_with_swarming_failure.json
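As a rough illustration of the idea in the commit message (catching every exception raised while processing a task's summary JSON so that one bad task does not stop collection of the rest), a minimal sketch could look like this. The summary layout, with a `shards` list whose entries carry a `state` and may be `null`, and the helper names `summarize_task` and `collect_all` are assumptions made for the example, not the actual code in swarming/api.py.

```python
import json


def summarize_task(raw_summary):
    """Parses one (hypothetical) summary JSON string into a coarse state."""
    summary = json.loads(raw_summary)
    states = [shard['state'] for shard in summary['shards']]
    return 'EXPIRED' if 'EXPIRED' in states else 'COMPLETED'


def collect_all(raw_summaries):
    """Summarizes every task; a failure in one task does not stop the loop."""
    results = {}
    for name, raw in raw_summaries.items():
        try:
            results[name] = summarize_task(raw)
        except Exception as e:  # Deliberately broad, mirroring "catch all".
            results[name] = 'ERROR: %s' % type(e).__name__
    return results


raw_summaries = {
    'first': '{"shards": [null]}',  # Shard with no result data.
    'second': '{"shards": [{"state": "COMPLETED"}]}',
    'third': 'not json',
}
print(collect_all(raw_summaries))
```

Every task ends up with an entry in the result, so later tasks are still reported even when an earlier summary cannot be processed.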
,
Apr 12 2017
#7: That's likely a different issue triggered by the same CL.
#8: I think that'll be out of the scope of this bug.
,
Apr 12 2017
,
Apr 12 2017
#10: Sorry, I filed issue 711030 for that.
,
Apr 13 2017
,
Apr 27 2017
,
May 1 2017
Proximate issue fixed; continuing on infra failure in the general case is tracked in https://bugs.chromium.org/p/chromium/issues/detail?id=711030
Comment 1 by jbudorick@chromium.org
, Apr 12 2017