SchedulerWorkerPoolImpl tests may fail on heavily loaded systems
Issue description
SchedulerWorkerPoolImpl tests post work to the SchedulerWorkerPool and then use ExpectWorkerCapacityAfterDelay() to wait briefly for the pool capacity to increase in response to all the running-but-blocked tasks. The ExpectWorkerCapacityAfterDelay() implementation assumes that the capacity will have reached its expected value within four wait cycles. That assumption does not hold if the host system is under heavy load, since the workers or the service thread may be starved of the CPU cycles needed to increase the pool size. We see this fail relatively often on the Fuchsia bots, which run the OS under QEMU, in some cases without KVM hypervisor acceleration.
Dec 6 2017
Dec 6 2017
Example output:
[ RUN ] TaskSchedulerWorkerPoolBlockingTest.WorkersIdleWhenOverCapacity/MAY_BLOCK
../../base/task_scheduler/scheduler_worker_pool_impl_unittest.cc:983: Failure
Expected: worker_pool_->GetWorkerCapacityForTesting()
Which is: 7
To be equal to: expected_worker_capacity
Which is: 8
../../base/task_scheduler/scheduler_worker_pool_impl_unittest.cc:1124: Failure
Expected: worker_pool_->GetWorkerCapacityForTesting()
Which is: 7
To be equal to: 2 * kNumWorkersInWorkerPool
Which is: 8
[ FAILED ] TaskSchedulerWorkerPoolBlockingTest.WorkersIdleWhenOverCapacity/MAY_BLOCK, where GetParam() = 12-byte object <00-00 00-00 00-00 00-00 00-00 00-00> (1006 ms)
Dec 6 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/4bac235cb5696108cc2377780aef3337dbeefaaa
commit 4bac235cb5696108cc2377780aef3337dbeefaaa
Author: Wez <wez@chromium.org>
Date: Wed Dec 06 17:16:22 2017
Allow SchedulerWorkerPoolImplTests to wait longer for capacity increase.
Previously the tests would wait up to four times in the ExpectWorkerCapacityAfterDelay() helper function, which leads to them flaking sometimes under systems with heavy load, or systems running under slow emulation, such as QEMU. We now allow the test to wait indefinitely, provided that the capacity of the worker pool is stable, or increases.
Bug: 792310
Change-Id: Ida8aa3abbb2d290771f74a7aae4ba7fe5c2176a0
Reviewed-on: https://chromium-review.googlesource.com/809710
Reviewed-by: François Doray <fdoray@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#522117}
[modify] https://crrev.com/4bac235cb5696108cc2377780aef3337dbeefaaa/base/task_scheduler/scheduler_worker_pool_impl_unittest.cc
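The waiting policy described in this commit message (keep polling while the capacity is stable or increasing) can be sketched roughly as below. This is a standalone illustration and not the code from the change: the name WaitForCapacity, the std::function accessor, and the poll interval parameter are all hypothetical, and treating an overshoot past the expected value as a failure is an added assumption.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>

// Hypothetical stand-in for the test helper; NOT the Chromium implementation.
// Polls `get_capacity` until it returns `expected`. There is no upper bound on
// the number of polls: the wait continues as long as the observed capacity is
// stable or increasing. Returns false if the capacity ever decreases, or
// (assumption) if it exceeds `expected`.
bool WaitForCapacity(const std::function<std::size_t()>& get_capacity,
                     std::size_t expected,
                     std::chrono::milliseconds poll_interval) {
  assert(poll_interval.count() >= 0);
  std::size_t last = get_capacity();
  while (last != expected) {
    if (last > expected)
      return false;  // Overshoot: the pool grew past the expected value.
    std::this_thread::sleep_for(poll_interval);
    const std::size_t current = get_capacity();
    if (current < last)
      return false;  // Capacity went backwards; stop waiting.
    last = current;
  }
  return true;
}
```

With a policy like this, a pool that never reaches the expected capacity makes the caller wait until some outer timeout fires, which is the trade-off raised in the following comments.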
Dec 6 2017
fdoray: It's a shame to have removed the timeout, since we now rely on the TestLauncher timeout to catch issues there. Assigning to you to decide whether to follow up with a fix to reintroduce one.
Dec 7 2017
fdoray: It looks like the actual problem is that we _never_ reach the expected capacity; Fuchsia flaked again after this change landed: https://ci.chromium.org/buildbot/chromium.fyi/Fuchsia%20ARM64/2712
This time the test simply timed out. Is there any debugging output that would help you diagnose this?
Dec 8 2017
We just had this failure relating to worker capacity miscalculation: https://luci-milo.appspot.com/buildbot/chromium.fyi/Fuchsia/12002
[ RUN ] TaskSchedulerWorkerPoolBlockingTest.MaximumWorkersTest
../../base/task_scheduler/scheduler_worker_pool_impl_unittest.cc:1530: Failure
Expected equality of these values:
worker_pool_->GetWorkerCapacityForTesting()
Which is: 15
kNumWorkersInWorkerPool + kNumExtraTasks
Which is: 14
[ FAILED ] TaskSchedulerWorkerPoolBlockingTest.MaximumWorkersTest (7726 ms)
Feb 7 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/e31677326baffe050afd6750c6c1f79b33eaa375
commit e31677326baffe050afd6750c6c1f79b33eaa375
Author: Scott Graham <scottmg@chromium.org>
Date: Wed Feb 07 21:54:16 2018
fuchsia: Disable TaskSchedulerWorkerPoolBlockingTest.MaximumWorkersTest
Most recently https://build.chromium.org/p/chromium.fyi/builders/Fuchsia%20%28dbg%29/builds/15991
TBR: wez@chromium.org
Bug: 768436, 792310
Change-Id: I836a3e918bf67a0780d781bea1b64fa750f75ef3
Reviewed-on: https://chromium-review.googlesource.com/907531
Commit-Queue: Scott Graham <scottmg@chromium.org>
Reviewed-by: Scott Graham <scottmg@chromium.org>
Cr-Commit-Position: refs/heads/master@{#535157}
[modify] https://crrev.com/e31677326baffe050afd6750c6c1f79b33eaa375/testing/buildbot/filters/fuchsia.base_unittests.filter
Feb 9 2018
Feb 22 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/92a8cd0fa676e90318fca3b78871477dc0448e61
commit 92a8cd0fa676e90318fca3b78871477dc0448e61
Author: Wez <wez@chromium.org>
Date: Thu Feb 22 17:07:42 2018
Filter all TaskSchedulerWorkerPoolBlockingTests which wait on capacity.
Several of these tests wait until the worker pool's capacity reaches an expected value, which appears not to be a deterministic operation, and fails often on the Fuchsia bots. Also adds a filter for a flaking Mojo system unit-test.
Bug: 768436, 792310, 814596
Change-Id: If453b3cda30747c995871fbeae6cb23830f02a88
Reviewed-on: https://chromium-review.googlesource.com/930476
Reviewed-by: Scott Graham <scottmg@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#538463}
[modify] https://crrev.com/92a8cd0fa676e90318fca3b78871477dc0448e61/testing/buildbot/filters/fuchsia.base_unittests.filter
[modify] https://crrev.com/92a8cd0fa676e90318fca3b78871477dc0448e61/testing/buildbot/filters/fuchsia.mojo_system_unittests.filter
May 2 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/4ce50aa89014267e7589941571976204c6928991
commit 4ce50aa89014267e7589941571976204c6928991
Author: Francois Doray <fdoray@chromium.org>
Date: Wed May 02 15:21:47 2018
TaskScheduler: Enable TaskSchedulerWorkerPoolBlockingTest.* on fuchsia.
Flakyness was fixed by https://chromium-review.googlesource.com/1033533
Bug: 768436, 792310
Change-Id: I8720a1ae503d096419bcc368c3629bcaa6679e74
Reviewed-on: https://chromium-review.googlesource.com/1039767
Reviewed-by: François Doray <fdoray@chromium.org>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Reviewed-by: Wez <wez@chromium.org>
Commit-Queue: François Doray <fdoray@chromium.org>
Cr-Commit-Position: refs/heads/master@{#555393}
[modify] https://crrev.com/4ce50aa89014267e7589941571976204c6928991/testing/buildbot/filters/fuchsia.base_unittests.filter
Comment 1 by w...@chromium.org, Dec 6 2017