MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior flaked in base_unittests on Fuchsia |
||||
Issue descriptionhttps://ci.chromium.org/p/chromium/builders/luci.chromium.try/fuchsia_x64/51308 ../../base/message_loop/message_loop_task_runner_unittest.cc:247: Failure Value of: reply_deleted_on Actual: true Expected: false Stack trace: bt#00: pc 0x659c5899e4a6 (app:/pkg/bin/app,0xcce4a6) bt#01: pc 0x659c5899de4b (app:/pkg/bin/app,0xccde4b) bt#02: pc 0x659c58297849 (app:/pkg/bin/app,0x5c7849) bt#03: end [ FAILED ] MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior (10 ms)
,
Jun 28 2018
@fdoray as PostTaskAndReply expert: if anything could have made this test flaky it would be r550189 even though I'm not sure how at first sight.
,
Jun 28 2018
std::move(*this) from a destructor is potentially ill-defined C++? (ref. ~PostTaskAndReplyRelay())
,
Aug 23
,
Sep 26
Seems to be pretty reliably flaky: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/fuchsia-fyi-arm64-rel/1685 plus a couple other bugs that I'm going to dupe in a sec. This is the actual stack, not much more useful though. [ RUN ] MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior ../../base/message_loop/message_loop_task_runner_unittest.cc:247: Failure Value of: reply_deleted_on Actual: true Expected: false Stack trace: #00: StackTraceGetter::CurrentStackTrace(int, int) at third_party/googletest/custom/gtest/internal/custom/stack_trace_getter.cc:24 #01: testing::internal::UnitTestImpl::CurrentOsStackTraceExceptTop(int) at third_party/googletest/src/googletest/src/gtest.cc:846 #02: testing::internal::AssertHelper::operator=(testing::Message const&) const at third_party/googletest/src/googletest/src/gtest.cc:418 #03: base::MessageLoopTaskRunnerTest_PostTaskAndReply_DeadReplyTaskRunnerBehavior_Test::TestBody() at base/message_loop/message_loop_task_runner_unittest.cc:247 #04: void testing::internal::InvokeHelper<void, std::__2::tuple<> >::InvokeMethod<base::RunLoop, void (base::RunLoop::*)()>(base::RunLoop*, void (base::RunLoop::*)(), std::__2::tuple<> const&) at third_party/googletest/src/googlemock/include/gmock/gmock-generated-actions.h:68
,
Sep 26
Issue 889278 has been merged into this issue.
,
Sep 26
,
Sep 26
scottmg@around:/work/cr/src$ tools/run-swarmed.py -C out/fuch --target-os fuchsia --gtest_filter=MessageLoopTaskRunnerTest.PostTaskAndReply* --test-name base_unittests --copies 500 ninja -C out/fuch base_unittests -j 1120 -l 56 ninja: Entering directory `out/fuch' ninja: no work to do. python tools/swarming_client/isolate.py check -i out/fuch/base_unittests.isolate -s out/fuch/base_unittests.isolated If you get authentication errors, follow: https://www.chromium.org/developers/testing/isolated-testing/for-swes#TOC-Login-on-the-services Uploading to isolate server, this can take a while... Triggering 500 tasks... [500/500] collected: 499 successes, 1 failures, 0 bot errors... Results logs collected into /usr/local/google/work/cr/src/results. and the failure is the same. I don't really understand any of this infrastructure, but I guess I'll dig in since it's fairly reproducible.
,
Sep 26
With a sleep added like this https://chromium-review.googlesource.com/c/chromium/src/+/1246481 it fails and/or crashes on other platforms reliably too, so I'm going to disable it. scottmg@around:/work/cr/src$ ninja -C out/lin base_unittests -j1000 && out/lin/base_unittests --gtest_filter=MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior ninja: Entering directory `out/lin' [0->2/2 ~1] LINK ./base_unittests IMPORTANT DEBUGGING NOTE: batches of tests are run inside their own process. For debugging a test inside a debugger, use the --gtest_filter=<your_test_name> flag along with --single-process-tests. Using sharding settings from environment. This is shard 0/1 Using 1 parallel jobs. Note: Google Test filter = MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior [==========] Running 1 test from 1 test case. [----------] Global test environment set-up. [----------] 1 test from MessageLoopTaskRunnerTest [ RUN ] MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior ../../base/message_loop/message_loop_task_runner_unittest.cc:249: Failure Value of: reply_deleted_on Actual: true Expected: false Stack trace: #0 0x000000ea4f6c testing::internal::UnitTestImpl::CurrentOsStackTraceExceptTop() #1 0x000000ea4939 testing::internal::AssertHelper::operator=() #2 0x000000706c38 base::MessageLoopTaskRunnerTest_PostTaskAndReply_DeadReplyTaskRunnerBehavior_Test::TestBody() [ FAILED ] MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior (1011 ms) [----------] 1 test from MessageLoopTaskRunnerTest (1011 ms total) [----------] Global test environment tear-down [==========] 1 test from 1 test case ran. (1012 ms total) [ PASSED ] 0 tests. [ FAILED ] 1 test, listed below: [ FAILED ] MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior 1 FAILED TEST [1/1] MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior (1010 ms) 1 test failed: MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior (../../base/message_loop/message_loop_task_runner_unittest.cc:204) Tests took 1 seconds.
,
Sep 26
Nice job tracking that down, Scott! IIUC the problem is that the test calls UnblockTaskThread(), allowing the task-thread to process the posted task, but then doesn't wait for that to complete before it resets |current_loop_| - so there is a race-condition in the PostTask'ing the reply task, and we may either: 1. Complete running the posted task before resetting, in which case the task will be queued to |current_loop_|, and actually deleted when that is reset. 2. Not complete running the posted task before the reset(), in which case it will be leaked, as expected.
,
Sep 26
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/f37018c402c4af20a6b89d2daf22766033f0d57d commit f37018c402c4af20a6b89d2daf22766033f0d57d Author: Scott Graham <scottmg@chromium.org> Date: Wed Sep 26 16:38:18 2018 Disable flaky MessageLoopTaskRunnerTest.PostTaskAndReply_DeadReplyTaskRunnerBehavior See linked bug for reliable repro and investigation. TBR: fdoray@chromium.org Bug: 856900 Change-Id: Ibb112c9a573ea4a2451edebffba63cccc80918de Reviewed-on: https://chromium-review.googlesource.com/1246463 Reviewed-by: Scott Graham <scottmg@chromium.org> Commit-Queue: Scott Graham <scottmg@chromium.org> Cr-Commit-Position: refs/heads/master@{#594345} [modify] https://crrev.com/f37018c402c4af20a6b89d2daf22766033f0d57d/base/message_loop/message_loop_task_runner_unittest.cc
,
Sep 26
Thanks for finding this, I'm not a fan of leaving tests disabled though, let's prioritize fixing this. |
||||
►
Sign in to add a comment |
||||
Comment 1 by w...@chromium.org
, Jun 27 2018Components: Internals