New issue
Advanced search Search tips

Issue 705500 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: iOS
Pri: 3
Type: Bug



Sign in to add a comment

Destabilized use of LockImplPosix starting around 3/21

Project Member Reported by olivierrobin@chromium.org, Mar 27 2017

Issue description

Destablized use of LockImplPosix starting around 3/21.

Failure log:
3/17  (Possibly Unrelated) base_unittests Exception Failure to Complete

TaskSchedulerSingleThreadTaskRunnerManagerTest.PostDelayedTask [STACK1]
3/21  
16:42   First Recorded Failure 
        LockImplPosix: Check failed: rv == 0 (22 vs. 0). Invalid argument
        https://uberchromegw.corp.google.com/i/internal.bling.main/builders/smoke/builds/14764

        4 Successful Runs

19:44   https://uberchromegw.corp.google.com/i/internal.bling.main/builders/smoke/builds/14769

        2 Successful Runs

21:41   https://uberchromegw.corp.google.com/i/internal.bling.main/builders/smoke/builds/14772

        5 Successful Runs

04:44   https://uberchromegw.corp.google.com/i/internal.bling.main/builders/smoke/builds/14778

        3 Days Pass

3/24   
11:52   https://uberchromegw.corp.google.com/i/internal.bling.main/builders/smoke/builds/14828

TaskSchedulerSingleThreadTaskRunnerManagerTest.PrioritySetCorrectly [STACK2]
3/24    
19:47   https://uberchromegw.corp.google.com/i/internal.bling.main/builders/smoke/builds/14836

TaskSchedulerSingleThreadTaskRunnerManagerTest.PostDelayedTask [STACK1]
3/24
21:47   https://uberchromegw.corp.google.com/i/internal.bling.main/builders/smoke/builds/14839
 
        2 Successful Runs
 
3/25
10:53   https://uberchromegw.corp.google.com/i/internal.bling.main/builders/smoke/builds/14843

        7 Successful Runs    

3/27
03:27   https://uberchromegw.corp.google.com/i/internal.bling.main/builders/smoke/builds/14851

TaskSchedulerWorkerPoolCheckTlsReuse.CheckDetachedThreads [STACK3]

3/24    
19:47   https://uberchromegw.corp.google.com/i/internal.bling.main/builders/smoke/builds/14854 
        
[STACK1]
[ RUN      ] TaskSchedulerSingleThreadTaskRunnerManagerTest.PostDelayedTask
[31612:125479:0321/164626.669783:151662621172523:FATAL:lock_impl_posix.cc(65)] Check failed: rv == 0 (22 vs. 0). Invalid argument
0   base_unittests                      0x013f8be6 base::debug::StackTrace::StackTrace(unsigned long) + 134
1   base_unittests                      0x013f8c64 base::debug::StackTrace::StackTrace(unsigned long) + 36
2   base_unittests                      0x013f73e7 base::debug::StackTrace::StackTrace() + 39
3   base_unittests                      0x0147239c logging::LogMessage::~LogMessage() + 92
4   base_unittests                      0x014700d7 logging::LogMessage::~LogMessage() + 23
5   base_unittests                      0x0159549e base::internal::LockImpl::Lock() + 334
6   base_unittests                      0x002fe2cf base::Lock::Acquire() + 31
7   base_unittests                      0x015b7a83 base::internal::SchedulerLockImpl::Acquire() + 35
8   base_unittests                      0x00e7d2e4 base::internal::AutoSchedulerLock::AutoSchedulerLock(base::internal::SchedulerLock&) + 36
9   base_unittests                      0x00e7d294 base::internal::AutoSchedulerLock::AutoSchedulerLock(base::internal::SchedulerLock&) + 36
10  base_unittests                      0x015dd337 base::internal::TaskTracker::IsShutdownComplete() const + 39
11  base_unittests                      0x015c441d base::internal::SchedulerWorker::ShouldExit() + 29
12  base_unittests                      0x015c49cc base::internal::SchedulerWorker::Thread::ThreadMain() + 396
13  base_unittests                      0x015ec577 base::(anonymous namespace)::ThreadFunc(void*) + 519
14  libsystem_pthread.dylib             0x0992e780 _pthread_body + 138
15  libsystem_pthread.dylib             0x0992e6f6 _pthread_body + 0
16  libsystem_pthread.dylib             0x0992bf7a thread_start + 34

[STACK2]
[ RUN      ] TaskSchedulerSingleThreadTaskRunnerManagerTest.PrioritySetCorrectly
[57069:1803:0324/195225.144893:72225941318571:FATAL:lock_impl_posix.cc(53)] Check failed: rv == 0 (16 vs. 0). Resource busy
0   base_unittests                      0x013d5796 base::debug::StackTrace::StackTrace(unsigned long) + 134
1   base_unittests                      0x013d5814 base::debug::StackTrace::StackTrace(unsigned long) + 36
2   base_unittests                      0x013d3f97 base::debug::StackTrace::StackTrace() + 39
3   base_unittests                      0x0144ef4c logging::LogMessage::~LogMessage() + 92
4   base_unittests                      0x0144cc87 logging::LogMessage::~LogMessage() + 23
5   base_unittests                      0x015721b3 base::internal::LockImpl::~LockImpl() + 307
6   base_unittests                      0x015721d7 base::internal::LockImpl::~LockImpl() + 23
7   base_unittests                      0x01571683 base::Lock::~Lock() + 259
8   base_unittests                      0x015716a7 base::Lock::~Lock() + 23
9   base_unittests                      0x015946b0 base::internal::SchedulerLockImpl::~SchedulerLockImpl() + 64
10  base_unittests                      0x00e00b37 base::internal::SchedulerLock::~SchedulerLock() + 23
11  base_unittests                      0x00e005e7 base::internal::SchedulerLock::~SchedulerLock() + 23
12  base_unittests                      0x015b97f2 base::internal::TaskTracker::~TaskTracker() + 274
13  base_unittests                      0x015b9967 base::internal::TaskTracker::~TaskTracker() + 23
14  base_unittests                      0x00e0d936 base::internal::(anonymous namespace)::TaskSchedulerSingleThreadTaskRunnerManagerTest::~TaskSchedulerSingleThreadTaskRunnerManagerTest() + 214
15  base_unittests                      0x00e11097 base::internal::TaskSchedulerSingleThreadTaskRunnerManagerTest_PrioritySetCorrectly_Test::~TaskSchedulerSingleThreadTaskRunnerManagerTest_PrioritySetCorrectly_Test() + 23
16  base_unittests                      0x00e0c137 base::internal::TaskSchedulerSingleThreadTaskRunnerManagerTest_PrioritySetCorrectly_Test::~TaskSchedulerSingleThreadTaskRunnerManagerTest_PrioritySetCorrectly_Test() + 23
17  base_unittests                      0x00e0c15a base::internal::TaskSchedulerSingleThreadTaskRunnerManagerTest_PrioritySetCorrectly_Test::~TaskSchedulerSingleThreadTaskRunnerManagerTest_PrioritySetCorrectly_Test() + 26
18  base_unittests                      0x017b5f76 testing::Test::DeleteSelf_() + 38
19  base_unittests                      0x017df3d7 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 103
20  base_unittests                      0x017b413a void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 106
21  base_unittests                      0x017b5db8 testing::TestInfo::Run() + 328
22  base_unittests                      0x017b7494 testing::TestCase::Run() + 260

[STACK3]
[ RUN      ] TaskSchedulerWorkerPoolCheckTlsReuse.CheckDetachedThreads
[17234:118591:0327/080303.411960:21984564534804:FATAL:lock_impl_posix.cc(65)] Check failed: rv == 0 (22 vs. 0). Invalid argument
0   base_unittests                      0x01406566 base::debug::StackTrace::StackTrace(unsigned long) + 134
1   base_unittests                      0x014065e4 base::debug::StackTrace::StackTrace(unsigned long) + 36
2   base_unittests                      0x01404d67 base::debug::StackTrace::StackTrace() + 39
3   base_unittests                      0x0147fd1c logging::LogMessage::~LogMessage() + 92
4   base_unittests                      0x0147da57 logging::LogMessage::~LogMessage() + 23
5   base_unittests                      0x015a324e base::internal::LockImpl::Lock() + 334
6   base_unittests                      0x002fa7df base::Lock::Acquire() + 31
7   base_unittests                      0x015c5533 base::internal::SchedulerLockImpl::Acquire() + 35
8   base_unittests                      0x00e85d64 base::internal::AutoSchedulerLock::AutoSchedulerLock(base::internal::SchedulerLock&) + 36
9   base_unittests                      0x00e85d14 base::internal::AutoSchedulerLock::AutoSchedulerLock(base::internal::SchedulerLock&) + 36
10  base_unittests                      0x015eade7 base::internal::TaskTracker::IsShutdownComplete() const + 39
11  base_unittests                      0x015d1ecd base::internal::SchedulerWorker::ShouldExit() + 29
12  base_unittests                      0x015d247c base::internal::SchedulerWorker::Thread::ThreadMain() + 396
13  base_unittests                      0x015fa027 base::(anonymous namespace)::ThreadFunc(void*) + 519
14  libsystem_pthread.dylib             0x0998c780 _pthread_body + 138
15  libsystem_pthread.dylib             0x0998c6f6 _pthread_body + 0
16  libsystem_pthread.dylib             0x09989f7a thread_start + 34
 
Owner: robliao@chromium.org
Status: Started (was: Assigned)
Summary: Destabilized use of LockImplPosix starting around 3/21 (was: TaskSchedulerSingleThreadTaskRunnerManagerTest are flaky on ios simulator)
Updated the description for accuracy.
Description: Show this description
I am able to repro this on device only with gtest_repeat. This has not yet reproed for me in the simulator.

Comment 6 Deleted

Comment 7 Deleted

SchedulerWorkers are somehow outliving the test.

When the lock returns 22 (invalid argument), it's a simple case where the test has returned and the SchedulerWorker is still running

* thread #1: tid = 0xac1fd4, 0x0000000186085e1c libsystem_kernel.dylib`__psynch_cvwait + 8, queue = 'com.apple.main-thread'
    frame #0: 0x0000000186085e1c libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000018614ab74 libsystem_pthread.dylib`_pthread_cond_wait + 640
    frame #2: 0x00000001011f2658 base_unittests`base::ConditionVariable::Wait(this=0x000000016fdb82e8) + 92 at condition_variable_posix.cc:69
    frame #3: 0x00000001011f4ca0 base_unittests`base::WaitableEvent::TimedWaitUntil(this=0x000000016fdb89a0, end_time=0x000000016fdb83a8) + 548 at waitable_event_posix.cc:225
    frame #4: 0x00000001011f49ac base_unittests`base::WaitableEvent::Wait(this=0x000000016fdb89a0) + 88 at waitable_event_posix.cc:157
    frame #5: 0x0000000100ba0cc4 base_unittests`base::internal::TaskSchedulerSingleThreadTaskRunnerManagerTest_PrioritySetCorrectly_Test::TestBody(this=0x0000000103428cf0) + 1248 at scheduler_single_thread_task_runner_manager_unittest.cc:187

Both threads appear to reference different locks, suggesting this thread might also have outlived its test.

Test Thread
(lldb) print this->task_tracker.shutdown_lock_
(base::internal::SchedulerLock) $6 = {
  base::internal::SchedulerLockImpl = {
    lock_ = {
      owning_thread_ref_ = {
        id_ = 0x0000000000000000
      }
      lock_ = {
        native_handle_ = (__sig = 1297437784, __opaque = char [56] @ 0x00007fb365801580)
      }
    }
  }
}

Worker Thread
(lldb) print this->shutdown_lock_
(base::internal::SchedulerLock) $5 = {
  base::internal::SchedulerLockImpl = {
    lock_ = {
      owning_thread_ref_ = {
        id_ = 0x0000000000000000
      }
      lock_ = {
        native_handle_ = (__sig = 1297437784, __opaque = char [56] @ 0x00007fb355f21020)
      }
    }
  }
}

I have not yet reproed 16 (Resource busy).
Labels: OS-iOS
This is likely caused by the SingleThreadTaskRunner unregistration path. In this path, we release the SchedulerWorker from the SchedulerSingleThreadTaskRunnerManager and instruct it to do a non-blocking termination. Because we release the reference, we can no longer perform a Join on the thread.
Project Member

Comment 11 by bugdroid1@chromium.org, Mar 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/624dc15f7cb6b404e607f536aef0c18d325903ee

commit 624dc15f7cb6b404e607f536aef0c18d325903ee
Author: robliao <robliao@chromium.org>
Date: Wed Mar 29 18:59:09 2017

Change SchedulerWorker::ShouldExit Checks Ordering

A SchedulerWorker may outlive the test environment when it's released
as a result of SingleThreadTaskRunner unregistration in the
SchedulerSingleThreadTaskRunnerManager. As a result, SchedulerWorker
should prefer checking its own fields first to validate that
it should continue to run before consulting the environment provided
TaskTracker.

BUG= 705500 

Review-Url: https://codereview.chromium.org/2781933003
Cr-Commit-Position: refs/heads/master@{#460475}

[modify] https://crrev.com/624dc15f7cb6b404e607f536aef0c18d325903ee/base/task_scheduler/scheduler_worker.cc

Status: Fixed (was: Started)
This should be fixed now. Will monitor the bot.

Sign in to add a comment