MemoryDumpSchedulerTest.StartStopQuickly sometimes fails on Fuchsia |
|||||
Issue description[00011.100] 02443.02591> [ RUN ] MemoryDumpSchedulerTest.StartStopQuickly [00011.100] 02443.02591> ../../base/trace_event/memory_dump_scheduler_unittest.cc:157: Failure [00011.100] 02443.02591> Expected: (time_ms) >= (kPeriodMs * (kTicks - 1)), actual: 8.622 vs 9 [00011.100] 02443.02591> [ FAILED ] MemoryDumpSchedulerTest.StartStopQuickly (9 ms) Other tests (e.g. bug 734130 ) have demonstrated that the scheduler is a bit antagonistic, so this may be racy.
,
Jun 19 2017
I believe this is a bug in MemoryDumpScheduler::StartInternal (or in the test, I'm not sure which). What happens is that the Stop()/Start() loop because of the "schedule first dump now" in https://cs.chromium.org/chromium/src/base/trace_event/memory_dump_scheduler.cc?rcl=2adca2eee880e4446ce64d8c57eaba4b89a7e19b&l=77 will trigger up to as many dumps as are in the Stop()/Start() loop (5 in the test). That can happen in ~0 time as long as the right thread context switches happen. Then the PostDelayedTask()s are ignored correctly (due to the wrong generation_), then 5 more ticks happen on the correct schedule. So the EXPECT at the end of the test that it should take num_ticks*period is not satisfied. https://cs.chromium.org/chromium/src/base/trace_event/memory_dump_scheduler_unittest.cc?rcl=f1d3694aaf4a75db1c248315b1b68cbfee93bbb7&l=157 With https://chromium-review.googlesource.com/c/540284/ this repros reliably for me on Linux. +primiano: Do you have an opinion on whether the implementation should ensure that there's a delay before the tick (i.e. don't dump right away on a Start)? Or whether that condition of the test should be ignored? Or something else? I haven't investigated https://crbug.com/734218 yet, but it smells very similar. (Not Fuchsia-specific, so removing OS bit; just happens to hit there more often due to a different thread scheduler.)
,
Jun 21 2017
Scott: very likely this is the test. History has proven that testing a timing-based scheduler is hard, and I ended up in several occasions making wrong assumptions on the test. hjd@ recently fixed something very similar in https://chromium-review.googlesource.com/c/535541. Does this repro still after that? hjd@, if you have few mins today can you check whether the Linux repro Scott did provide ( https://chromium-review.googlesource.com/c/540284/ ) still repros on ToT?
,
Jun 21 2017
I quickly glanced at the code, I think I see what happened: - In https://codereview.chromium.org/2902413002 I recently-ish made the first tick happen immediately. - As a consequence, I introduced a timing race in the StartStopQuickly test, which matches Scott's explanation in #2. The right way to fix this is IMHO by using two different LevelOfDetail (the quick stop/start loop vs the last start) so we can differentiate the spurious ticks of the start/stop cycle. Specifically I'd: - In the "for (int i = 0; i < 5; i++) {" sequence use MemoryDumpLevelOfDetail::LIGHT. - On the last Start() call, switch the config to use MemoryDumpLevelOfDetail::DETAILED. - Change the expectation to be: EXPECT_CALL(on_tick_, OnTick(kLight)).AtMost(5) // 5 being the #iterations in the quick stop/start for loop. EXPECT_CALL(on_tick_, OnTick(kDetailed)).Times(kTicks - 1); EXPECT_CALL(on_tick_, OnTick(kDetailed)).WillRepeatedly(.... evt.signal) This should correctly swallow any spurious ticks coming from the quick stop/start sequence. I would do this myself, but I am just back from vacations and I am doing 3 other things in parallel at the moment. If I manage to go back to this by this afternoon (your morning) will fix this myself. Otherwise if I will forget about this by then (very likely) feel free to either send a patch or assign it to me.
,
Jun 21 2017
Scott's CL repros on ToT, I will try what you mentioned in #4.
,
Jun 21 2017
https://chromium-review.googlesource.com/c/543039/ fixes the issue with Scott's change applied on top. Sending to the CQ.
,
Jun 21 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c704dbef54549387ce86bffb76b80c38019b23ed commit c704dbef54549387ce86bffb76b80c38019b23ed Author: Hector Dearman <hjd@google.com> Date: Wed Jun 21 14:34:35 2017 memory-infra: Fix race in MemoryDumpSchedulerTest.StartStopQuickly R: primiano@chromium.org Cc: scottmg@chromium.org Bug: 734216 Change-Id: I92b797955cd9f57223a5eaab797095639323f512 Reviewed-on: https://chromium-review.googlesource.com/543039 Commit-Queue: Hector Dearman <hjd@chromium.org> Reviewed-by: Primiano Tucci <primiano@chromium.org> Cr-Commit-Position: refs/heads/master@{#481202} [modify] https://crrev.com/c704dbef54549387ce86bffb76b80c38019b23ed/base/trace_event/memory_dump_scheduler_unittest.cc
,
Jun 21 2017
Thank you!
,
Jun 21 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/7718ac3cc5e6b0d1aabcecf1312455ceacf7ef64 commit 7718ac3cc5e6b0d1aabcecf1312455ceacf7ef64 Author: Scott Graham <scottmg@chromium.org> Date: Wed Jun 21 19:15:47 2017 fuchsia: remove filter of MemoryDumpSchedulerTest.StartStopQuickly Fixed by https://chromium-review.googlesource.com/c/543039/. Bug: 706592, 734216 Change-Id: Iad8ddccff2ad975752ab6e57e56d1b5070f0754d Reviewed-on: https://chromium-review.googlesource.com/543557 Reviewed-by: Nico Weber <thakis@chromium.org> Commit-Queue: Scott Graham <scottmg@chromium.org> Cr-Commit-Position: refs/heads/master@{#481271} [modify] https://crrev.com/7718ac3cc5e6b0d1aabcecf1312455ceacf7ef64/testing/buildbot/filters/fuchsia.base_unittests.filter
,
Jun 29 2017
|
|||||
►
Sign in to add a comment |
|||||
Comment 1 by scottmg@chromium.org
, Jun 16 2017Status: Assigned (was: Untriaged)