New issue
Advanced search Search tips

Issue 766221 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Fuchsia
Pri: 3
Type: Bug

Blocking:
issue 882906
issue 754861



Sign in to add a comment

CacheStorageManagerTests/CacheStorageManagerTestP.DeletedKeysGone/1 flaked in content_unittests on Fuchsia

Project Member Reported by scottmg@chromium.org, Sep 18 2017

Issue description

[00325.296] 03508.03535> [ RUN      ] CacheStorageManagerTests/CacheStorageManagerTestP.DeletedKeysGone/1
[00325.296] 03508.03535> [3:452180003:0917/100411.183739:325174320:FATAL:test_browser_thread_bundle.cc(70)] Check failed: base::MessageLoop::current()->IsIdleForTesting().
#00: StackTrace at ../../../c/builder/linux/src/base/debug/stack_trace_fuchsia.cc:173
#01: std::__1::basic_ostream<char, std::__1::char_traits<char> >& std::__1::endl<char, std::__1::char_traits<char> >(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) at ../../../c/builder/linux/src/buildtools/third_party/libc++/trunk/include/ostream:1001
      (inlined by) std::__1::basic_ostream<char, std::__1::char_traits<char> >::operator<<(std::__1::basic_ostream<char, std::__1::char_traits<char> >& (*)(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)) at ../../../c/builder/linux/src/buildtools/third_party/libc++/trunk/include/ostream:195
      (inlined by) ~LogMessage at ../../../c/builder/linux/src/base/logging.cc:561
#02: std::__1::unique_ptr<base::test::ScopedTaskEnvironment, std::__1::default_delete<base::test::ScopedTaskEnvironment> >::reset(base::test::ScopedTaskEnvironment*) at ../../../c/builder/linux/src/buildtools/third_party/libc++/trunk/include/memory:2582
      (inlined by) ~TestBrowserThreadBundle at ../../../c/builder/linux/src/content/public/test/test_browser_thread_bundle.cc:76
#03: ~CacheStorageManagerTest at ../../../c/builder/linux/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:113
#04: ~CacheStorageManagerTestP_DeletedKeysGone_Test at ../../../c/builder/linux/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:686
#05: testing::TestInfo::Run() at /b/c/builder/linux/src/out/Release/../../third_party/googletest/src/googletest/include/gtest/gtest.h:?
#06: testing::TestCase::Run() at ../../../c/builder/linux/src/third_party/googletest/src/googletest/src/gtest.cc:2771
#07: testing::internal::UnitTestImpl::RunAllTests() at ../../../c/builder/linux/src/third_party/googletest/src/googletest/src/gtest.cc:4676
#08: testing::UnitTest::Run() at ../../../c/builder/linux/src/third_party/googletest/src/googletest/src/gtest.cc:4285
#09: RUN_ALL_TESTS() at ../../../c/builder/linux/src/third_party/googletest/src/googletest/include/gtest/gtest.h:2237
      (inlined by) base::TestSuite::Run() at ../../../c/builder/linux/src/base/test/test_suite.cc:270
#10: base::RepeatingCallback<int ()>::Run() const & at ../../../c/builder/linux/src/base/callback.h:92
      (inlined by) base::(anonymous namespace)::LaunchUnitTestsInternal(base::RepeatingCallback<int ()> const&, unsigned long, int, bool, base::RepeatingCallback<void ()> const&) at ../../../c/builder/linux/src/base/test/launcher/unit_test_launcher.cc:216
#11: base::LaunchUnitTests(int, char**, base::RepeatingCallback<int ()> const&) at ../../../c/builder/linux/src/base/test/launcher/unit_test_launcher.cc:475
#12: main at ../../../c/builder/linux/src/content/test/run_all_unittests.cc:20
#13: pc 0x4c83bdc4de3e (libc.so,0x1be3e)
[00325.306] 03508.03535>
[00325.307] 03508.03535> [3942/4669] CacheStorageManagerTests/CacheStorageManagerTestP.DeletedKeysGone/1 (CRASHED)

Looks pretty cross-platform at first glance, but will investigate if it's Fuchsia only first.
 
Project Member

Comment 2 by bugdroid1@chromium.org, Sep 18 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/45cfea793a845c6b8cdfbd515391293189af3be9

commit 45cfea793a845c6b8cdfbd515391293189af3be9
Author: Scott Graham <scottmg@chromium.org>
Date: Mon Sep 18 18:57:29 2017

fuchsia: Disable a few content_unittests that flaked recently

TBR: sergeyu@chromium.org
Bug: 760687, 766221 ,766223
Change-Id: Ice03f167b78c8ce6ca1efebc54c78d3c3c187663
Reviewed-on: https://chromium-review.googlesource.com/671494
Reviewed-by: Scott Graham <scottmg@chromium.org>
Reviewed-by: Sergey Ulanov <sergeyu@chromium.org>
Commit-Queue: Scott Graham <scottmg@chromium.org>
Cr-Commit-Position: refs/heads/master@{#502627}
[modify] https://crrev.com/45cfea793a845c6b8cdfbd515391293189af3be9/testing/buildbot/filters/fuchsia.content_unittests.filter

Same in CacheStorageManagerTest.BadOriginName so that sort of narrows it down to the fixture.

[00052.273] 03531.03558> [ RUN      ] CacheStorageManagerTest.BadOriginName
[00052.273] 03531.03558> [3:1943816969:0919/023021.923404:51957019:FATAL:test_browser_thread_bundle.cc(70)] Check failed: base::MessageLoop::current()->IsIdleForTesting().
#00: StackTrace at ../../../c/builder/linux/src/base/debug/stack_trace_fuchsia.cc:173
#01: std::__1::basic_ostream<char, std::__1::char_traits<char> >& std::__1::endl<char, std::__1::char_traits<char> >(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) at ../../../c/builder/linux/src/buildtools/third_party/libc++/trunk/include/ostream:1001
      (inlined by) std::__1::basic_ostream<char, std::__1::char_traits<char> >::operator<<(std::__1::basic_ostream<char, std::__1::char_traits<char> >& (*)(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)) at ../../../c/builder/linux/src/buildtools/third_party/libc++/trunk/include/ostream:195
      (inlined by) ~LogMessage at ../../../c/builder/linux/src/base/logging.cc:561
#02: std::__1::unique_ptr<base::test::ScopedTaskEnvironment, std::__1::default_delete<base::test::ScopedTaskEnvironment> >::reset(base::test::ScopedTaskEnvironment*) at ../../../c/builder/linux/src/buildtools/third_party/libc++/trunk/include/memory:2582
      (inlined by) ~TestBrowserThreadBundle at ../../../c/builder/linux/src/content/public/test/test_browser_thread_bundle.cc:76
#03: ~CacheStorageManagerTest at ../../../c/builder/linux/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:113
#04: ~CacheStorageManagerTest_BadOriginName_Test at ../../../c/builder/linux/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:886
#05: testing::TestInfo::Run() at /b/c/builder/linux/src/out/Release/../../third_party/googletest/src/googletest/include/gtest/gtest.h:?
#06: testing::TestCase::Run() at ../../../c/builder/linux/src/third_party/googletest/src/googletest/src/gtest.cc:2771
#07: testing::internal::UnitTestImpl::RunAllTests() at ../../../c/builder/linux/src/third_party/googletest/src/googletest/src/gtest.cc:4676
#08: testing::UnitTest::Run() at ../../../c/builder/linux/src/third_party/googletest/src/googletest/src/gtest.cc:4285
#09: RUN_ALL_TESTS() at ../../../c/builder/linux/src/third_party/googletest/src/googletest/include/gtest/gtest.h:2237
      (inlined by) base::TestSuite::Run() at ../../../c/builder/linux/src/base/test/test_suite.cc:270
#10: base::RepeatingCallback<int ()>::Run() const & at ../../../c/builder/linux/src/base/callback.h:92
      (inlined by) base::(anonymous namespace)::LaunchUnitTestsInternal(base::RepeatingCallback<int ()> const&, unsigned long, int, bool, base::RepeatingCallback<void ()> const&) at ../../../c/builder/linux/src/base/test/launcher/unit_test_launcher.cc:216
#11: base::LaunchUnitTests(int, char**, base::RepeatingCallback<int ()> const&) at ../../../c/builder/linux/src/base/test/launcher/unit_test_launcher.cc:475
#12: main at ../../../c/builder/linux/src/content/test/run_all_unittests.cc:20
#13: pc 0x4d8af6c31ebe (libc.so,0x1bebe)
[00052.277] 03531.03558>
[00052.278] 03531.03558> [489/4668] CacheStorageManagerTest.BadOriginName (CRASHED)
Meh, I can't get it to crash locally. By inspection, I'm looking in the direction of the teardown of MockQuotaManager but that's only unfounded speculation.

Maybe I'll try to get the IsIdleForTesting() failure to print out where its tasks came from when it fails.
Finally repro'd with extra logging after a bit of trybot abuse (https://chromium-review.googlesource.com/c/chromium/src/+/673583/1)

The ("A"?) leaked task is coming from LoadIndexEntries, ../../net/disk_cache/simple/simple_index_file.cc:371.

[00295.934] 03530.03557> [ RUN      ] CacheStorageManagerTests/CacheStorageManagerTestP.DeletedKeysGone/1
[00295.934] 03530.03557> [3:1547678439:0919/235906.890361:295758195:ERROR:incoming_task_queue.cc(96)] IncomingTaskQueue not empty as expected:
[00295.935] 03530.03557> [3:1547678439:0919/235906.898631:295765507:ERROR:incoming_task_queue.cc(98)] Task 0:
[00295.935] 03530.03557> [3:1547678439:0919/235906.908758:295775641:ERROR:incoming_task_queue.cc(99)]   posted_from: LoadIndexEntries, ../../net/disk_cache/simple/simple_index_file.cc:371
[00295.935] 03530.03557> [3:1547678439:0919/235906.920261:295787137:ERROR:incoming_task_queue.cc(102)]   delayed_run_time: 0 bogo-microseconds
[00295.935] 03530.03557> [3:1547678439:0919/235906.930538:295797412:ERROR:incoming_task_queue.cc(104)]   task_backtrace: 0
[00295.936] 03530.03557> [3:1547678439:0919/235906.937266:295804142:ERROR:incoming_task_queue.cc(104)]   task_backtrace: 0
[00295.936] 03530.03557> [3:1547678439:0919/235906.943003:295809880:ERROR:incoming_task_queue.cc(104)]   task_backtrace: 0
[00295.936] 03530.03557> [3:1547678439:0919/235906.950622:295817502:ERROR:incoming_task_queue.cc(104)]   task_backtrace: 0
[00295.937] 03530.03557> [3:1547678439:0919/235906.957285:295824161:FATAL:incoming_task_queue.cc(110)] Check failed: i == 0. IncomingTaskQueue wasn't empty
#00: base::debug::StackTrace::StackTrace(unsigned long) at ../../../c/b/linux/src/base/debug/stack_trace_fuchsia.cc:173
#01: endl<char, std::__1::char_traits<char> > at ../../../c/b/linux/src/buildtools/third_party/libc++/trunk/include/ostream:1001
      (inlined by) operator<< at ../../../c/b/linux/src/buildtools/third_party/libc++/trunk/include/ostream:195
      (inlined by) ~LogMessage at ../../../c/b/linux/src/base/logging.cc:561
#02: CheckIsIdleForTesting at ../../../c/b/linux/src/base/message_loop/incoming_task_queue.cc:110
#03: reset at ../../../c/b/linux/src/buildtools/third_party/libc++/trunk/include/memory:2582
      (inlined by) ~TestBrowserThreadBundle at ../../../c/b/linux/src/content/public/test/test_browser_thread_bundle.cc:76
#04: ~CacheStorageManagerTest at ../../../c/b/linux/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:113
#05: ~CacheStorageManagerTestP_DeletedKeysGone_Test at ../../../c/b/linux/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:686
#06: Run at /b/c/b/linux/src/out/Release/../../third_party/googletest/src/googletest/include/gtest/gtest.h:?
#07: Run at ../../../c/b/linux/src/third_party/googletest/src/googletest/src/gtest.cc:2771
#08: RunAllTests at ../../../c/b/linux/src/third_party/googletest/src/googletest/src/gtest.cc:4676
#09: Run at ../../../c/b/linux/src/third_party/googletest/src/googletest/src/gtest.cc:4285
#10: RUN_ALL_TESTS at ../../../c/b/linux/src/third_party/googletest/src/googletest/include/gtest/gtest.h:2237
      (inlined by) Run at ../../../c/b/linux/src/base/test/test_suite.cc:270
#11: Run at ../../../c/b/linux/src/base/callback.h:92
      (inlined by) LaunchUnitTestsInternal at ../../../c/b/linux/src/base/test/launcher/unit_test_launcher.cc:216
#12: LaunchUnitTests at ../../../c/b/linux/src/base/test/launcher/unit_test_launcher.cc:475
#13: main at ../../../c/b/linux/src/content/test/run_all_unittests.cc:20
#14: pc 0x5a91cb053ebe (libc.so,0x1bebe)
[00295.943] 03530.03557>


content::CacheStorageManagerTest::Open() in the test calls
CacheStorageCache::CreateBackend() which eventually gets around to calling
SimpleBackendImpl::Init() which creates a 
SimpleIndexFile, passing it 

namespace disk_cache {
base::LazyInstance<LeakySequencedWorkerPool>::Leaky g_sequenced_worker_pool = ...
}

SimpleIndexFile::LoadIndexEntries() posts tasks to this worker pool, which is the FROM_HERE above. But I don't understand how it's escaping the flushes that are scattered everywhere yet.
Unfortunately on a run where I'd switched back to plain trunk (no extra logging). Same location, different test in the same fixture, so pretty clearly something common to the fixture's shutdown.

https://chromium-swarm.appspot.com/task?id=38c4b0a204daf010&refresh=10&show_raw=1

  [00410.641] 03535.03562> 1 test crashed:
  [00410.641] 03535.03562>     CacheStorageManagerTest.DeletedCacheIgnoredInIndex (../../content/browser/cache_storage/cache_storage_manager_unittest.cc:948)

  [00053.588] 03535.03562> [ RUN      ] CacheStorageManagerTest.DeletedCacheIgnoredInIndex
  [00053.589] 03535.03562> [3:393356087:0922/231641.462788:53470881:FATAL:test_browser_thread_bundle.cc(64)] Check failed: base::MessageLoop::current()->IsIdleForTesting().
  #00: StackTrace at ../../../../work/cr/src/base/debug/stack_trace_fuchsia.cc:173
  #01: base::debug::StackTrace::StackTrace() at ../../../../work/cr/src/base/debug/stack_trace.cc:199
  #02: ~LogMessage at ../../../../work/cr/src/base/logging.cc:560
  #03: ~TestBrowserThreadBundle at ../../../../work/cr/src/content/public/test/test_browser_thread_bundle.cc:65
  #04: ~CacheStorageManagerTest at ../../../../work/cr/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:113
  #05: content::CacheStorageManagerTest_DeletedCacheIgnoredInIndex_Test::~CacheStorageManagerTest_DeletedCacheIgnoredInIndex_Test() at ../../../../work/cr/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:948
  #06: content::CacheStorageManagerTest_DeletedCacheIgnoredInIndex_Test::~CacheStorageManagerTest_DeletedCacheIgnoredInIndex_Test() at ../../../../work/cr/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:948
  #07: testing::Test::DeleteSelf_() at ../../../../work/cr/src/third_party/googletest/src/googletest/include/gtest/gtest.h:453
  #08: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2400
  #09: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2453
  #10: testing::TestInfo::Run() at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2662
  #11: testing::TestCase::Run() at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2771
  #12: RunAllTests at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:4676
  #13: bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2400
  #14: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2453
  #15: Run at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:4285
  #16: RUN_ALL_TESTS() at ../../../../work/cr/src/third_party/googletest/src/googletest/include/gtest/gtest.h:2237
  #17: Run at ../../../../work/cr/src/base/test/test_suite.cc:270
  #18: Run at ../../../../work/cr/src/content/public/test/unittest_test_suite.cc:45
  #19: int base::internal::FunctorTraits<int (content::UnitTestTestSuite::*)(), void>::Invoke<content::UnitTestTestSuite*>(int (content::UnitTestTestSuite::*)(), content::UnitTestTestSuite*&&) at ../../../../work/cr/src/base/bind_internal.h:194
  #20: MakeItSo<int (content::UnitTestTestSuite::*const &)(), content::UnitTestTestSuite *> at ../../../../work/cr/src/base/bind_internal.h:277
  #21: RunImpl<int (content::UnitTestTestSuite::*const &)(), const std::__1::tuple<base::internal::UnretainedWrapper<content::UnitTestTestSuite> > &, 0> at ../../../../work/cr/src/base/bind_internal.h:349
  #22: base::internal::Invoker<base::internal::BindState<int (content::UnitTestTestSuite::*)(), base::internal::UnretainedWrapper<content::UnitTestTestSuite> >, int ()>::Run(base::internal::BindStateBase*) at ../../../../work/cr/src/base/bind_internal.h:331
  #23: base::RepeatingCallback<int ()>::Run() const & at ../../../../work/cr/src/base/callback.h:92
  #24: LaunchUnitTestsInternal at ../../../../work/cr/src/base/test/launcher/unit_test_launcher.cc:216
  #25: base::LaunchUnitTests(int, char**, base::RepeatingCallback<int ()> const&) at ../../../../work/cr/src/base/test/launcher/unit_test_launcher.cc:475
  #26: main at ../../../../work/cr/src/content/test/run_all_unittests.cc:20
  #27: pc 0x4704a4023ebe (libc.so,0x1bebe)
  [00053.603] 03535.03562>
  [00053.604] 03535.03562> [500/4673] CacheStorageManagerTest.DeletedCacheIgnoredInIndex (CRASHED)




Another in BadOriginName, from the same run of 100 as the previous one (without extra logging). I guess the extra logging changes the timing/ordering enough that it gets hidden again.

https://chromium-swarm.appspot.com/task?id=38c4b2116ef5ba10&refresh=10&show_raw=1

[00065.375] 03530.03557> [ RUN      ] CacheStorageManagerTest.BadOriginName
[00065.377] 03530.03557> [3:46000289:0922/231743.063901:65211391:FATAL:test_browser_thread_bundle.cc(64)] Check failed: base::MessageLoop::current()->IsIdleForTesting().
#00: StackTrace at ../../../../work/cr/src/base/debug/stack_trace_fuchsia.cc:173
#01: base::debug::StackTrace::StackTrace() at ../../../../work/cr/src/base/debug/stack_trace.cc:199
#02: ~LogMessage at ../../../../work/cr/src/base/logging.cc:560
#03: ~TestBrowserThreadBundle at ../../../../work/cr/src/content/public/test/test_browser_thread_bundle.cc:65
#04: ~CacheStorageManagerTest at ../../../../work/cr/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:113
#05: content::CacheStorageManagerTest_BadOriginName_Test::~CacheStorageManagerTest_BadOriginName_Test() at ../../../../work/cr/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:886
#06: content::CacheStorageManagerTest_BadOriginName_Test::~CacheStorageManagerTest_BadOriginName_Test() at ../../../../work/cr/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:886
#07: testing::Test::DeleteSelf_() at ../../../../work/cr/src/third_party/googletest/src/googletest/include/gtest/gtest.h:453
#08: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2400
#09: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2453
#10: testing::TestInfo::Run() at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2662
#11: testing::TestCase::Run() at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2771
#12: RunAllTests at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:4676
#13: bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2400
#14: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:2453
#15: Run at ../../../../work/cr/src/third_party/googletest/src/googletest/src/gtest.cc:4285
#16: RUN_ALL_TESTS() at ../../../../work/cr/src/third_party/googletest/src/googletest/include/gtest/gtest.h:2237
#17: Run at ../../../../work/cr/src/base/test/test_suite.cc:270
#18: Run at ../../../../work/cr/src/content/public/test/unittest_test_suite.cc:45
#19: int base::internal::FunctorTraits<int (content::UnitTestTestSuite::*)(), void>::Invoke<content::UnitTestTestSuite*>(int (content::UnitTestTestSuite::*)(), content::UnitTestTestSuite*&&) at ../../../../work/cr/src/base/bind_internal.h:194
#20: MakeItSo<int (content::UnitTestTestSuite::*const &)(), content::UnitTestTestSuite *> at ../../../../work/cr/src/base/bind_internal.h:277
#21: RunImpl<int (content::UnitTestTestSuite::*const &)(), const std::__1::tuple<base::internal::UnretainedWrapper<content::UnitTestTestSuite> > &, 0> at ../../../../work/cr/src/base/bind_internal.h:349
#22: base::internal::Invoker<base::internal::BindState<int (content::UnitTestTestSuite::*)(), base::internal::UnretainedWrapper<content::UnitTestTestSuite> >, int ()>::Run(base::internal::BindStateBase*) at ../../../../work/cr/src/base/bind_internal.h:331
#23: base::RepeatingCallback<int ()>::Run() const & at ../../../../work/cr/src/base/callback.h:92
#24: LaunchUnitTestsInternal at ../../../../work/cr/src/base/test/launcher/unit_test_launcher.cc:216
#25: base::LaunchUnitTests(int, char**, base::RepeatingCallback<int ()> const&) at ../../../../work/cr/src/base/test/launcher/unit_test_launcher.cc:475
#26: main at ../../../../work/cr/src/content/test/run_all_unittests.cc:20
#27: pc 0x4d0f775f9ebe (libc.so,0x1bebe)
[00065.388] 03530.03557>
[00065.388] 03530.03557> [478/4673] CacheStorageManagerTest.BadOriginName (CRASHED)

Maybe there's something different about the 3 tests it's happened in.
Components: Blink>Storage>CacheStorage
CacheStorageManagerTest.DeletedCacheIgnoredInIndex in https://build.chromium.org/p/chromium.fyi/builders/Fuchsia/builds/9657
Status: Started (was: Assigned)
This is the best repro (most logging) repro I've gotten so far:

sgraham@river:/work/cr/src$ tools/fuchsia/run-swarmed.py --out-dir=out/fuch --test-name content_unittests -n 1000
python tools/swarming_client/isolate.py check -i out/fuch/content_unittests.isolate -s out/fuch/content_unittests.isolated
If you get authentication errors, follow:
  https://www.chromium.org/developers/testing/isolated-testing/for-swes#TOC-Login-on-the-services
Uploading to isolate server, this can take a while...
Triggering 1000 tasks...
Collected 1000/1000 results (999 succeeded, 1 failed)...

Full log of the failure at:
https://chromium-swarm.appspot.com/task?id=38d920c8ddcfd410&refresh=10&show_raw=1

An annotated excerpt from the failure:

  [00021.547] 03523.03550> [3:MAIN        :0926/223203.683104:21041552:ERROR:simple_backend_impl.cc(102)] creating new cache_runner
  [00021.547] 03523.03550> [3:MAIN        :0926/223203.688490:21046935:ERROR:task_scheduler_impl.cc(124)] In this one
  [00021.547] 03523.03550> [3:MAIN        :0926/223203.694286:21052728:ERROR:scheduler_worker_pool.cc(142)] in that one
  [00021.547] 03523.03550> [3:MAIN        :0926/223203.699629:21058072:ERROR:simple_backend_impl.cc(274)] cache_runner_: 0x1faed10b1280
  [00021.547] 03523.03550> [3:MAIN        :0926/223203.705364:21063809:ERROR:simple_backend_impl.cc(299)] Posting InitializeIndx to cache_runner_
  [00021.548] 03523.03550> [3:CACHE_RUNNER:0926/223203.713114:21071561:ERROR:post_task_and_reply_impl.cc(45)] Running original:Init@../../net/disk_cache/simple/simple_backend_impl.cc:301
  [00021.548] 03523.03550> [3:CACHE_RUNNER:0926/223203.759101:21117553:ERROR:post_task_and_reply_impl.cc(50)] Running reply:Init@../../net/disk_cache/simple/simple_backend_impl.cc:1000301 on 0x1faed10e7fc0
  [00021.548] 03523.03550> [3:CACHE_RUNNER:0926/223203.768985:21127437:ERROR:post_task_and_reply_impl.cc(55)]   was posted
  [00021.548] 03523.03550> [3:MAIN        :0926/223203.782319:21140766:ERROR:simple_backend_impl.cc(285)] going to destroy cache_runner_ soon, maybe
  [00021.549] 03523.03550> [3:WORKER_POOL :0926/223203.782622:21141074:ERROR:post_task_and_reply_impl.cc(45)] Running original:LoadIndexEntries@../../net/disk_cache/simple/simple_index_file.cc:369
  [00021.549] 03523.03550> [3:MAIN        :0926/223203.795224:21153675:ERROR:disk_cache.cc(99)] ~CacheCreator
  [00021.549] 03523.03550> [3:WORKER_POOL :0926/223203.798767:21157210:ERROR:simple_index_file.cc(398)] SyncLoadIndexEntries
  [00021.549] 03523.03550> [3:MAIN        :0926/223203.872957:21231403:ERROR:post_task_and_reply_impl.cc(45)] Running original:Shutdown@../../content/browser/payments/payment_app_context_impl.cc:36
  [00021.550] 03523.03550> [3:MAIN        :0926/223203.881279:21239738:ERROR:post_task_and_reply_impl.cc(50)] Running reply:Shutdown@../../content/browser/payments/payment_app_context_impl.cc:1000036 on 0x1faed10e7fc0
  [00021.551] 03523.03550> [3:WORKER_POOL :0926/223203.888509:21246965:ERROR:post_task_and_reply_impl.cc(50)] Running reply:LoadIndexEntries@../../net/disk_cache/simple/simple_index_file.cc:1000369 on 0x1faed10e7fc0
  [00021.552] 03523.03550> [3:MAIN        :0926/223203.890724:21249171:ERROR:post_task_and_reply_impl.cc(55)]   was posted
  [00021.553] 03523.03550> [3:WORKER_POOL :0926/223203.905229:21263698:ERROR:post_task_and_reply_impl.cc(55)]   was posted
  [00021.553] 03523.03550> [3:MAIN        :0926/223203.905259:21263729:ERROR:incoming_task_queue.cc(96)] IncomingTaskQueue not empty as expected:
  [00021.553] 03523.03550> [3:MAIN        :0926/223203.923782:21282234:ERROR:incoming_task_queue.cc(98)] Task 0:
  [00021.554] 03523.03550> [3:MAIN        :0926/223203.930512:21288956:ERROR:incoming_task_queue.cc(99)]   posted_from: LoadIndexEntries, ../../net/disk_cache/simple/simple_index_file.cc:1000369
  [00021.554] 03523.03550> [3:MAIN        :0926/223203.939284:21297738:ERROR:incoming_task_queue.cc(102)]   delayed_run_time: 0 bogo-microseconds
  [00021.555] 03523.03550> [3:MAIN        :0926/223203.948172:21306621:ERROR:incoming_task_queue.cc(104)]   task_backtrace: 0
  [00021.555] 03523.03550> [3:MAIN        :0926/223203.958412:21316867:ERROR:incoming_task_queue.cc(104)]   task_backtrace: 0
  [00021.555] 03523.03550> [3:MAIN        :0926/223203.966336:21324781:ERROR:incoming_task_queue.cc(104)]   task_backtrace: 0
  [00021.555] 03523.03550> [3:MAIN        :0926/223203.976630:21335076:ERROR:incoming_task_queue.cc(104)]   task_backtrace: 0
  [00021.563] 03523.03550> [3:MAIN        :0926/223203.982543:21340988:FATAL:incoming_task_queue.cc(110)] Check failed: i == 0. IncomingTaskQueue wasn't empty
  #00: StackTrace at ../../../../work/cr/src/base/debug/stack_trace_fuchsia.cc:173
  #01: base::debug::StackTrace::StackTrace() at ../../../../work/cr/src/base/debug/stack_trace.cc:199
  #02: ~LogMessage at ../../../../work/cr/src/base/logging.cc:560
  #03: CheckIsIdleForTesting at /work/cr/src/out/fuch/../../base/message_loop/incoming_task_queue.cc:?
  #04: base::MessageLoop::CheckIsIdleForTesting() at ../../../../work/cr/src/base/message_loop/message_loop.cc:272
  #05: ~TestBrowserThreadBundle at /work/cr/src/out/fuch/../../content/public/test/test_browser_thread_bundle.cc:?
  #06: ~CacheStorageManagerTest at ../../../../work/cr/src/content/browser/cache_storage/cache_storage_manager_unittest.cc:113

Setup:
- net/disk_cache/disk_cache.cc creates a SimpleBackendImpl, and saves that in a CacheCreator.
- SimpleBackendImpl creates a new SequencedTaskRunner https://cs.chromium.org/chromium/src/net/disk_cache/simple/simple_backend_impl.cc?type=cs&sq=package:chromium&l=94 (cache_runner_)
- SimpleBackendImpl::Init uses that cache_runner_ to PostTaskAndReply its initialization steps
- SimpleIndexFile::LoadIndexEntries is run on cache_runner_ but uses a different SequencedWorkerPool (created at https://cs.chromium.org/chromium/src/net/disk_cache/simple/simple_backend_impl.cc?sq=package:chromium&l=91) to run SimpleIndexFile::SyncLoadIndexEntries() and reply to SimpleIndex::MergeInitializingSet()

On shutdown:
- LoadIndexEntries is running on g_sequenced_worker_pool.
- The Reply from that is posted back to SequencedTaskRunnerHandle::Get() (called at the time of posting https://cs.chromium.org/chromium/src/base/threading/post_task_and_reply_impl.cc?type=cs&q=PostTaskAndReplyRelay&sq=package:chromium&l=36 ) which is what fails.

I (still!) don't see how that task is sneaking through all the flushes though, mainly:
- https://cs.chromium.org/chromium/src/content/browser/cache_storage/cache_storage_manager_unittest.cc?rcl=02b7e5fa7bc98300eebfb88192e66d1f1d2d0ba7&l=131
- and then https://cs.chromium.org/chromium/src/content/public/test/test_browser_thread_bundle.cc?rcl=02b7e5fa7bc98300eebfb88192e66d1f1d2d0ba7&l=32

My theory is vaguely that the task is hiding on cache_runner_ during the test fixture shutdown, then hopping to worker_pool, then to the failure message loop. But even the most minimal of logging to confirm this causes it to not repro.

I've been trying to introduce a state machine type thing to step through the orderings that have to happen with some hacky code https://chromium-review.googlesource.com/c/chromium/src/+/696262/2 but haven't succeeded at that yet.

With that code, https://chromium-swarm.appspot.com/task?id=38f8060b92f28e10&refresh=10&show_raw=1&wide_logs=true is a rare failure with a decent amount of logging inserted (the code in the above CL).


I take it back, base/hack.h has finally prevailed and I can repro somewhat reliably (maybe 50% of the time) with

https://chromium-review.googlesource.com/c/chromium/src/+/696262/3

and

ninja -C out/fuch content_unittests -j3000 && out/fuch/bin/run_content_unittests --gtest_filter=CacheStorageManagerTests/CacheStorageManagerTestP.OpenRunsSerially/1 --test-launcher-jobs=1

Sometimes https://gist.github.com/sgraham/86b04811787670e3300544262578225c
Sometimes gets stuck at https://gist.github.com/sgraham/493a4b73523f78d7deba559349e8b519

So now just to track down what unsticks it sometimes.
Finally reproducible! 5/5 repro with https://chromium-review.googlesource.com/c/chromium/src/+/696262/5

The actual figuring out the bug and fixing it should now be comparatively trivial.
Thanks for pursuing this, and also thanks for capturing your thoughts as you make progress on understanding the issue!

(And you thought you were just talking to yourself here!) 
> (And you thought you were just talking to yourself here!) 

Ha! I mostly do that because I have the memory of a goldfish, so it's like I'm talking to a new me, at least after a couple days. ;)
Cc: jkarlin@chromium.org mmenke@chromium.org morlovich@chromium.org
The two crucial timings for repro are, I believe:

1. cache_runner_ delays posting the reply from SimpleBackendImpl::Init (the call to SimpleBackendImpl::Initialize) until after disk_cache::g_sequenced_worker_pool has been flushed. This means it's possible to later run LoadIndexEntries, but it won't live anywhere that ~TestBrowserThreadBundle() can flush it.

2. When the LoadIndexEntries is posted back into the main loop, it has to happen after RunAllTasksUntilIdle() and before failing CHECK so that it's actually caught. Otherwise, it can happen to get flushed without checking or be posted after the CHECK is checked so silently succeeds.

Looking back at history, this may have started with https://chromium-review.googlesource.com/c/chromium/src/+/579871, but not 100% certain of that.
I think the underlying thing may be that flushing everything from both the scheduler stuff and the worker pool isn't really doable. The "easy" way to do this may be to turn on scheduler override of the worker pool, but that feels way too invasive to do (OTOH, that's what *Chrome* does, so that's something test fixtures should probably be doing as a matter of course). Another option is probably to just port away the worker_pool_ usage to scheduler usage, which we likely will be asked to at some point anyway (I think I saw CLs to that effect elsewhere in net?)

some potentially relevant bits:
https://cs.chromium.org/chromium/src/base/threading/sequenced_worker_pool.cc?rcl=922634cb677067f2480c75bed108341095794e4e&l=813

https://cs.chromium.org/chromium/src/base/task_scheduler/post_task.cc?rcl=922634cb677067f2480c75bed108341095794e4e&l=93


Disclaimer: actually running in similar issues with a CL I am working on.

Never mind that for a bit, that actually crashes content_unittests on Linux ATM..
I think I've fixed that, may be worth testing it?
I can try that CL. It was very tricksy to isolate though, so just passing the tests won't be necessarily give much confidence.
Typically at least 1/1000 would have failed before, but with 697925:


scottmg@around:/work/cr/src$ tools/fuchsia/run-swarmed.py --out-dir=out/fuch --test-name=content_unittests -n 1000
python tools/swarming_client/isolate.py check -i out/fuch/content_unittests.isolate -s out/fuch/content_unittests.isolated
If you get authentication errors, follow:
  https://www.chromium.org/developers/testing/isolated-testing/for-swes#TOC-Login-on-the-services
Uploading to isolate server, this can take a while...
Triggering 1000 tasks...
[1000/1000] collected: 1000 successes, 0 failures, 0 bot errors...
Results logs collected into /usr/local/google/work/cr/src/results.


So seems to help. Maybe. Hopefully. :)
Project Member

Comment 23 by bugdroid1@chromium.org, Oct 4 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f9cc9601e0ca9ff8dc2973c677ab8b164b74d7f3

commit f9cc9601e0ca9ff8dc2973c677ab8b164b74d7f3
Author: Maks Orlovich <morlovich@chromium.org>
Date: Wed Oct 04 15:31:18 2017

SimpleCache: Port (unsequenced) use of SequencedWorkerPool over to TaskScheduler

This is what it does behind scenes in Chrome anyway, but always using the scheduler
in tests as well is helpful since it means a single TaskScheduler::FlushForTesting() call
rather than needing both TaskScheduler and SequencedWorkerPool versions of FlushForTesting()
as separate calls, which causes trouble when control jumps back and forth between the two.

Bug:  766221 
Change-Id: Iab6a49cad86f1fd5a31c66c50d25e2962c538d53
Reviewed-on: https://chromium-review.googlesource.com/697925
Reviewed-by: Asanka Herath <asanka@chromium.org>
Reviewed-by: Scott Graham <scottmg@chromium.org>
Commit-Queue: Maks Orlovich <morlovich@chromium.org>
Cr-Commit-Position: refs/heads/master@{#506395}
[modify] https://crrev.com/f9cc9601e0ca9ff8dc2973c677ab8b164b74d7f3/net/disk_cache/simple/simple_backend_impl.cc

Might need to go to ScopedTaskEnvironment::RunUntilIdle, too, to keep track of the main thread as well. Don't know if it's easily available in that fixture off the top of my head..

Cc: -mmenke@chromium.org
I think the ~TestBrowserThreadBundle will handle flush for the main thread already?
Project Member

Comment 27 by bugdroid1@chromium.org, Oct 4 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e62d0d3641156f832d9b26c4441663e0dcb085d9

commit e62d0d3641156f832d9b26c4441663e0dcb085d9
Author: Samuel Huang <huangs@chromium.org>
Date: Wed Oct 04 18:59:50 2017

Revert "SimpleCache: Port (unsequenced) use of SequencedWorkerPool over to TaskScheduler"

This reverts commit f9cc9601e0ca9ff8dc2973c677ab8b164b74d7f3.

Reason for revert: Speculative revert, for failing net_unittests on Android Tests (dbg).

Original change's description:
> SimpleCache: Port (unsequenced) use of SequencedWorkerPool over to TaskScheduler
> 
> This is what it does behind scenes in Chrome anyway, but always using the scheduler
> in tests as well is helpful since it means a single TaskScheduler::FlushForTesting() call
> rather than needing both TaskScheduler and SequencedWorkerPool versions of FlushForTesting()
> as separate calls, which causes trouble when control jumps back and forth between the two.
> 
> Bug:  766221 
> Change-Id: Iab6a49cad86f1fd5a31c66c50d25e2962c538d53
> Reviewed-on: https://chromium-review.googlesource.com/697925
> Reviewed-by: Asanka Herath <asanka@chromium.org>
> Reviewed-by: Scott Graham <scottmg@chromium.org>
> Commit-Queue: Maks Orlovich <morlovich@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#506395}

TBR=scottmg@chromium.org,asanka@chromium.org,morlovich@chromium.org

Change-Id: Ie79bee31116c2fea12122a0c4cf8339af7a6b8a2
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  766221 
Reviewed-on: https://chromium-review.googlesource.com/700934
Reviewed-by: Samuel Huang <huangs@chromium.org>
Commit-Queue: Samuel Huang <huangs@chromium.org>
Cr-Commit-Position: refs/heads/master@{#506469}
[modify] https://crrev.com/e62d0d3641156f832d9b26c4441663e0dcb085d9/net/disk_cache/simple/simple_backend_impl.cc

Project Member

Comment 28 by bugdroid1@chromium.org, Oct 4 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/35a45032eafb0399718df27a38ed4b4ed8352115

commit 35a45032eafb0399718df27a38ed4b4ed8352115
Author: Maks Orlovich <morlovich@chromium.org>
Date: Wed Oct 04 22:06:15 2017

Reland "SimpleCache: Port (unsequenced) use of SequencedWorkerPool over to TaskScheduler"

This is a reland of f9cc9601e0ca9ff8dc2973c677ab8b164b74d7f3
Original change's description:
> SimpleCache: Port (unsequenced) use of SequencedWorkerPool over to TaskScheduler
> 
> This is what it does behind scenes in Chrome anyway, but always using the scheduler
> in tests as well is helpful since it means a single TaskScheduler::FlushForTesting() call
> rather than needing both TaskScheduler and SequencedWorkerPool versions of FlushForTesting()
> as separate calls, which causes trouble when control jumps back and forth between the two.
> 
> Bug:  766221 
> Change-Id: Iab6a49cad86f1fd5a31c66c50d25e2962c538d53
> Reviewed-on: https://chromium-review.googlesource.com/697925
> Reviewed-by: Asanka Herath <asanka@chromium.org>
> Reviewed-by: Scott Graham <scottmg@chromium.org>
> Commit-Queue: Maks Orlovich <morlovich@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#506395}

Bug:  766221 
Change-Id: I6f76f26780f129499fe9916b717362b7a11e944f
Reviewed-on: https://chromium-review.googlesource.com/700858
Reviewed-by: Asanka Herath <asanka@chromium.org>
Reviewed-by: Samuel Huang <huangs@chromium.org>
Commit-Queue: Samuel Huang <huangs@chromium.org>
Cr-Commit-Position: refs/heads/master@{#506534}
[modify] https://crrev.com/35a45032eafb0399718df27a38ed4b4ed8352115/net/disk_cache/simple/simple_backend_impl.cc

Project Member

Comment 29 by bugdroid1@chromium.org, Oct 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a3f957741654460009f09a80f152ab7713ce1123

commit a3f957741654460009f09a80f152ab7713ce1123
Author: Scott Graham <scottmg@chromium.org>
Date: Tue Oct 10 19:37:39 2017

fuchsia: Improve swarm runner script

- Use multiprocessing to trigger/collect jobs more quickly
- Handle bot errors separate from run failures
- Improve/tersify output

Split out of debugging code in https://chromium-review.googlesource.com/c/chromium/src/+/696262/.

TBR: kmarshall@chromium.org
Bug:  766221 
Change-Id: I93de65da20f5de4b5c58f661be42d293a3849fc0
Reviewed-on: https://chromium-review.googlesource.com/698353
Reviewed-by: Scott Graham <scottmg@chromium.org>
Reviewed-by: Kevin Marshall <kmarshall@chromium.org>
Commit-Queue: Scott Graham <scottmg@chromium.org>
Cr-Commit-Position: refs/heads/master@{#507743}
[modify] https://crrev.com/a3f957741654460009f09a80f152ab7713ce1123/tools/fuchsia/run-swarmed.py

Project Member

Comment 30 by bugdroid1@chromium.org, Oct 16 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/05747987240b9643f54303b5c3747312f642fc9f

commit 05747987240b9643f54303b5c3747312f642fc9f
Author: Scott Graham <scottmg@chromium.org>
Date: Mon Oct 16 20:09:39 2017

fuchsia: Enable CacheStorageManagerTests/CacheStorageManagerTestP.DeletedKeysGone/1

Should be fixed per work in linked bug, re-enabling.

Bug:  766221 
TBR: wez@chromium.org
Change-Id: I11ab522f4173b380cf7fd932db3708e7fa42e8bf
Reviewed-on: https://chromium-review.googlesource.com/721697
Commit-Queue: Scott Graham <scottmg@chromium.org>
Reviewed-by: Wez <wez@chromium.org>
Reviewed-by: Scott Graham <scottmg@chromium.org>
Cr-Commit-Position: refs/heads/master@{#509147}
[modify] https://crrev.com/05747987240b9643f54303b5c3747312f642fc9f/testing/buildbot/filters/fuchsia.content_unittests.filter

Status: Fixed (was: Started)
Blocking: 882906

Sign in to add a comment