New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 629139 link

Starred by 3 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 3
Type: Bug

Blocked on:
issue 324454
issue 552633



Sign in to add a comment

FIx improper usages of the base::Thread API

Project Member Reported by gab@chromium.org, Jul 18 2016

Issue description

Wrote https://codereview.chromium.org/2145463002/ to assert that the base::Thread API is being used as documented. Some asserts fire... need to fix bad callers.

One such caller is MemoryDumpManagerTest which has two tests that incorrectly call Thread::Stop() from another thread (base::Thread's API is not thread-safe).

Other failing tests (haven't dug into those yet): ProfileSyncServiceAutofillTest.*, CRWCertVerificationControllerTest.*.
 

Comment 1 by gab@chromium.org, Jul 21 2016

Cc: mvanouwe...@chromium.org
Another bad usage: GeolocationProviderImpl::InformProvidersPermissionGranted() can call Thread::IsRunning() from underlying thread (it should only be called from owning thread -- BrowserThread::UI in this case).

+content\browser\geolocation owner to fix

Comment 2 by gab@chromium.org, Jul 21 2016

Cc: ajose@chromium.org
AudioTrackRecorder::OnSetFormat/OnData() makes the same mistake (illegal call to Thread::IsRunning() from a different sequence than the one starting the thread).

This is responsible for the WebRtcCaptureFromElementBrowserTest.* failures on https://codereview.chromium.org/2145463002/#ps160001

@ajose who wrote this (https://codereview.chromium.org/1406113002) to fix.

Comment 3 by ajose@chromium.org, Jul 21 2016

Cc: -ajose@chromium.org
Hi @gab, afraid I'm no longer active on chromium.

Comment 4 by gab@chromium.org, Jul 21 2016

Cc: perkj@chromium.org
@perkj to fix #2 as content/renderer/media owner for WebRtc in that case.
Components: Blink>Location

Comment 6 by gab@chromium.org, Jul 22 2016

Another problematic use case : BrowserProcessSubThread::Init() calls BrowserThread::CurrentlyOn() which loops over all its message_loop(). The 
check of base::Thread::message_loop() [1] is intended to allow users who use external synchronization to see |message_loop_ != nullptr| but in this case some of them are still null... BrowserThreadImpl should probably avoid checking threads it knows not to be initialized.

Disabling that test until then.

[1]
    DCHECK(sequence_checker_.CalledOnValidSequencedThread() ||
           id_ == PlatformThread::CurrentId() || message_loop_);
Project Member

Comment 7 by bugdroid1@chromium.org, Jul 22 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1734a80f1519819eb6a3d8eb43064f8fb2c2e067

commit 1734a80f1519819eb6a3d8eb43064f8fb2c2e067
Author: gab <gab@chromium.org>
Date: Fri Jul 22 21:26:13 2016

Modernize base::Thread

No-op cleanup extracted from https://codereview.chromium.org/2135413003/#ps40001

In this CL:

 - Explicitly document the threading requirements of the API after grokking its use of locks
   in some places and lack of in some others)
   - (It's mostly thread unsafe; except for a few calls which are okay from the owner
      as well as the underlying thread; and one method which is actually thread-safe)
   - Add assertions as such (and catch issues on try bots :-O -- http://crbug.com/629139)
   - Remove |thread_lock_| which is unnecessary under the documented conditions
     (delayed until http://crbug.com/629139 is fixed..).

 - C++11 member initializers (makes it cleaner to add new POD members and
                              makes constructors simpler -- e.g. |run_loop_| was uninitialized)
 - Assertions and tests for current API which allows Start/Stop/Start cycles
   but wasn't tested... (actually more than that is poorly/not tested in this class
   but I'm merely adding tests for parts I'm stressing in my actual follow-up CL).

 - Comment nits

 - Make event-based tests use WaitableEvent instead of sleeping + toggling a bool.

BUG= 629716 , 629139

Review-Url: https://codereview.chromium.org/2145463002
Cr-Commit-Position: refs/heads/master@{#407265}

[modify] https://crrev.com/1734a80f1519819eb6a3d8eb43064f8fb2c2e067/base/threading/thread.cc
[modify] https://crrev.com/1734a80f1519819eb6a3d8eb43064f8fb2c2e067/base/threading/thread.h
[modify] https://crrev.com/1734a80f1519819eb6a3d8eb43064f8fb2c2e067/base/threading/thread_unittest.cc

Comment 8 by gab@chromium.org, Jul 25 2016

ping mvanouwerkerk@ and perkj@, can you ack that you'll address issues raised in #1 and #2 which are under your respective ownership and/or find the appropriate person to address it, thanks!

Comment 9 by gab@chromium.org, Jul 25 2016

Blockedon: 324454
Cc: danakj@chromium.org imch...@chromium.org m...@chromium.org
Another incorrect usage: https://codereview.chromium.org/130253002 introduced a while ago by imcheng@. WebContentsCaptureMachine::InternalStop() intentionally stops its worker thread on the blocking pool. As mentioned above, this is incorrect as the base::Thread API is not thread-safe.

This thread currently being owned by the UI thread (on which Stop() can't be called per the AssertIOAllowed() restrictions on Join()), you have 3 options I'd say:

1) Have the thread be owned by another sequence altogether (e.g. the BlockingPool or another thread).

2) Wait for https://codereview.chromium.org/2135413003/ to land and make it a non-joinable thread (then destroying it results in "StopSoon()" semantics -- i.e. it will wind down on its own in a non-blocking way). Disadvantage: you don't know when it's done winding down if you need to confirm it freed resources before doing something else (it could also be killed by the OS on shutdown before being completely done -- only matters if it's saving persistent state required before shutdown).

3) Add an API to base::Thread to support detaching from its sequence and re-binding to a new one (i.e. instead of making the whole API thread-safe, make consumers that care be explicit about it).


IMO (2) is best if you can deal with the pitfalls of a non-joinable thread.

@imcheng, can you address this?

CC miu as reviewer on original CL too.


Thanks!
Gab
Project Member

Comment 10 by bugdroid1@chromium.org, Jul 25 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6e23c37e5f9bf776010908c2d19a0d842d2c70f1

commit 6e23c37e5f9bf776010908c2d19a0d842d2c70f1
Author: gab <gab@chromium.org>
Date: Mon Jul 25 21:21:23 2016

Fix MemoryDumpManagerTest's thread unsafe usage of the base::TestIOThread API

Also move TestIOThread::PostTaskAndWait to those tests as they were the
only user.

BUG=629139

Review-Url: https://codereview.chromium.org/2170953002
Cr-Commit-Position: refs/heads/master@{#407577}

[modify] https://crrev.com/6e23c37e5f9bf776010908c2d19a0d842d2c70f1/base/test/test_io_thread.cc
[modify] https://crrev.com/6e23c37e5f9bf776010908c2d19a0d842d2c70f1/base/test/test_io_thread.h
[modify] https://crrev.com/6e23c37e5f9bf776010908c2d19a0d842d2c70f1/base/trace_event/memory_dump_manager_unittest.cc

Comment 11 by w...@chromium.org, Jul 25 2016

Re #9: Start/Stop occurring on different threads is permitted by the design of the Thread API - the use of an AutoLock in there specifically to protect against the race-condition between startup and teardown.

I looked through the CL and discussion and couldn't find discussion of the rationale for changing the Thread API contract to require same-sequence Start/Stop - could you perhaps update this bug's CL description to provide that?

Comment 12 by gab@chromium.org, Jul 26 2016

@11, see reply @ https://codereview.chromium.org/2145463002/#msg82

tl;dr; this isn't changing the API, it's clarifying that, as always in Chromium : unless specified otherwise APIs are not to be assumed thread-safe (and this is clear from many long-standing comments listed in above reply)

The AutoLock was around |thread_handle_| only and is likely more an artifact than anything of days when handles were used more widely, other state is not synchronized (used in other methods and not using this lock). If this API is to be thread-safe, it will need more than that.
Project Member

Comment 13 by bugdroid1@chromium.org, Jul 26 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/63f8f4eb236c7aa8bb8505bd372487e3a529baee

commit 63f8f4eb236c7aa8bb8505bd372487e3a529baee
Author: gab <gab@chromium.org>
Date: Tue Jul 26 21:09:17 2016

Remove ThreadTest.Restart

I had missed its existence as part of https://codereview.chromium.org/2145463002/
and its now superseeded by ThreadTest.StartTwice and ThreadTest.StopTwiceNop.

BUG=629139

Review-Url: https://codereview.chromium.org/2186703002
Cr-Commit-Position: refs/heads/master@{#407913}

[modify] https://crrev.com/63f8f4eb236c7aa8bb8505bd372487e3a529baee/base/threading/thread_unittest.cc

Comment 14 by gab@chromium.org, Jul 28 2016

print_job.cc is also doing this incorrect Stop() from worker thread : https://cs.chromium.org/chromium/src/chrome/browser/printing/print_job.cc?rcl=0&l=427

And it looks like Thread::StopSoon() was explicitly introduced for that use case...

How about we introduce Detach() (and replace StopSoon() with it):

// Notifies the underlying thread that it should wind down and detaches from it. 
// This Thread instance may be re-used immediately after this call.
// The underlying thread will reply with |on_stopped| on the caller's sequence when done
// winding down.
void Thread::Detach(const base::Closure& on_stopped);

? Instead of hogging a worker thread just to join when these callsites merely care about stopping and being notified when done (e.g., print_job.cc and WebContentsCaptureMachine::InternalStop()).


I think I'll need something like this to solve a similar problem for destruction of live non-joinable Thread objects (https://codereview.chromium.org/2135413003/#msg62), so I'm happy to implement it :-).


Two problems need to be solved to allow this:
 1) The member state needs to be shared (i.e. most likely moved to a RefCountedThreadSafe base::Thread::State object) so that the Thread can be deleted while ThreadMain() is still running.
 2) Virtual member methods accessed from ThreadMain are problematic (Init/Run/Cleanup), best would be to move those to a Delegate which is owned by the underlying thread (optionally passed in Thread::Options for impls that want to override default behavior).

(1) is easy, change implementation details
(2) would require changing all base::Thread users with Init/Run/Cleanup overrides (30+)... less excited about that... any other ideas?

Comment 15 by gab@chromium.org, Aug 2 2016

Some callers of set_message_loop() appear to be passing a null |message_loop|?!

From PS 24 @ https://codereview.chromium.org/2135413003/#ps600001

IncidentReportingServiceTest.AddIncident (run #1):
[ RUN      ] IncidentReportingServiceTest.AddIncident
[30437:30437:0801/145745:23524416156:FATAL:thread.cc(247)] Check failed: message_loop.
#0 0x000003ec2f2e base::debug::StackTrace::StackTrace()
#1 0x000003eda84b logging::LogMessage::~LogMessage()
#2 0x000003f270d0 base::Thread::SetMessageLoop()
#3 0x000001f1ae0c content::BrowserThreadImpl::BrowserThreadImpl()
#4 0x0000031d6e09 content::TestBrowserThread::TestBrowserThread()
#5 0x000001852ac2 IncidentReportingServiceTest::IncidentReportingServiceTest()
#6 0x000001852a06 testing::internal::TestFactoryImpl<>::CreateTest()
#7 0x0000034a314a testing::TestInfo::Run()
#8 0x0000034a35f3 testing::TestCase::Run()
#9 0x0000034aa7c9 testing::internal::UnitTestImpl::RunAllTests()
#10 0x0000034aa46e testing::UnitTest::Run()
#11 0x00000319ea36 base::TestSuite::Run()
#12 0x0000031a041a base::(anonymous namespace)::LaunchUnitTestsInternal()
#13 0x0000031a02e4 base::LaunchUnitTests()
#14 0x00000319ac7c main
#15 0x7f71c040c7ed __libc_start_main
#16 0x00000061a091 <unknown>

Will ignore call when null for now and figure it out in later CL.

Comment 16 by gab@chromium.org, Aug 2 2016

FTR #15 appears to affect only the following unit_tests (on all platforms):

IncidentReportingServiceTest.DelayedAnalysisNoProfileNoUpload
IncidentReportingServiceTest.CoalesceIncidents
IncidentReportingServiceTest.TwoProfilesTwoUploads
IncidentReportingServiceTest.ProcessWideTwoUploads
SafeBrowsingProtocolManagerTest.TestGetHashUrl
SafeBrowsingProtocolManagerTest.UpdateResponseNetworkErrorBackupSuccess
IncidentReportingServiceTest.NoCollectionWithoutIncident
IncidentReportingServiceTest.ProcessWideUploadClearUpload
IncidentReportingServiceTest.ProfileDestroyedDuringUpload
SafeBrowsingProtocolManagerTest.UpdateResponseReset
IncidentReportingServiceTest.OneIncidentOneUpload
IncidentReportingServiceTest.DelayedAnalysisOneUpload
SafeBrowsingProtocolManagerTest.UpdateResponseHttpErrorBackupError
IncidentReportingServiceTest.ProcessWideNoProfileNoUpload
SafeBrowsingProtocolManagerTest.TestChunkStrings
IncidentReportingServiceTest.AddIncident
IncidentReportingServiceTest.AnalysisAfterProfile
SafeBrowsingProtocolManagerTest.TestBackOffTimes
SafeBrowsingProtocolManagerTest.EmptyRedirectResponse
IncidentReportingServiceTest.NoDownloadPrunedSameIncidentNoUpload
IncidentReportingServiceTest.NoDownloadNoWaiting
SafeBrowsingProtocolManagerTest.UpdateResponseConnectionErrorBackupError
IncidentReportingServiceTest.NoUploadBeforeExtendedReporting
IncidentReportingServiceTest.NoSafeBrowsing
SafeBrowsingProtocolManagerTest.ProblemAccessingDatabase
SafeBrowsingProtocolManagerTest.UpdateResponseNetworkErrorBackupError
SafeBrowsingProtocolManagerTest.UpdateResponseBadBodyBackupSuccess
SafeBrowsingProtocolManagerTest.SingleRedirectResponseWithChunks
IncidentReportingServiceTest.NoProfilesNoUpload
IncidentReportingServiceTest.AnalysisWhenRegisteredWithProfile
IncidentReportingServiceTest.ProcessWideNoUploadAfterProfile
IncidentReportingServiceTest.NoDownloadPrunedIncidentOneUpload
IncidentReportingServiceTest.CleanLegacyPruneState
SafeBrowsingProtocolManagerTest.UpdateResponseHttpErrorBackupTimeout
IncidentReportingServiceTest.TwoIncidentsTwoUploads
SafeBrowsingProtocolManagerTest.UpdateResponseConnectionErrorBackupSuccess
IncidentReportingServiceTest.SafeBrowsingFieldTrial
IncidentReportingServiceTest.NonBinaryDownloadStillUploads
SafeBrowsingProtocolManagerTest.UpdateResponseTimeoutBackupSuccess
IncidentReportingServiceTest.ClearProcessIncidentOnCleanState
SafeBrowsingProtocolManagerTest.TestUpdateUrl
SafeBrowsingProtocolManagerTest.TestGetHashBackOffTimes
IncidentReportingServiceTest.ExtendedReportingNoFieldTrial
SafeBrowsingProtocolManagerTest.InvalidRedirectResponse
IncidentReportingServiceTest.UploadsWithBothDownloadTypes
SafeBrowsingProtocolManagerTest.TestNextChunkUrl
SafeBrowsingProtocolManagerTest.UpdateResponseHttpErrorBackupSuccess
IncidentReportingServiceTest.ProcessWideOneUpload
SafeBrowsingProtocolManagerTest.ExistingDatabase
IncidentReportingServiceTest.SafeBrowsingNoFieldTrial
SafeBrowsingProtocolManagerTest.MultipleRedirectResponsesWithChunks
IncidentReportingServiceTest.NoDownloadNoUpload
Project Member

Comment 18 by bugdroid1@chromium.org, Aug 13 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/addc8ef1bcae5bff546fba73eca8bea3e640ebcd

commit addc8ef1bcae5bff546fba73eca8bea3e640ebcd
Author: gab <gab@chromium.org>
Date: Sat Aug 13 17:28:50 2016

Add joinable option to SimpleThread::Options as was just done for Thread::Options.

SimpleThread will be more suitable than Thread for
https://codereview.chromium.org/2197753002/ per it not needing a MessageLoop.

Also, cleanup the SimpleThread::Options API, making it more like Thread::Options.

Non-joinable SimpleThreads will have to remain alive so long as their Run() method uses
member state (SimpleThread itself won't after invoking Run()).
DelegateSimpleThread provides a safe way to do this by explicitly supporting
being deleted while Run() is active but ensuring the Delegate itself outlives Run()).
Note: it might seem overkill to let DelegateSimpleThread be deleted early but this is
eventually what I want to do for base::Thread (https://crbug.com/629139#c14) and this
is a good precursor to this paradigm (it really takes all its sense on base::Thread where
the destructor is more meaningful though -- winds down the MessageLoop).

The logic on SimpleThread is fairly simple, the complexity is in the assertions (which
were put behind DCHECK_IS_ON() to clearly isolate them) and the tests.

BUG= 629716 , 634835, 629139
TBR=avi@chromium.org
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_chromium_tsan_rel_ng

Committed: https://crrev.com/8b35b0572443f4a84377a1fbd199a1edabe3d7d6
Cr-Commit-Position: refs/heads/master@{#409830}

Reverted: https://crrev.com/a9094bdb2cda679e8bf3a5b43b27c12efa975d3c
Review-Url: https://codereview.chromium.org/2204333003
Cr-Original-Commit-Position: refs/heads/master@{#410169}
Cr-Commit-Position: refs/heads/master@{#411897}

[modify] https://crrev.com/addc8ef1bcae5bff546fba73eca8bea3e640ebcd/base/threading/simple_thread.cc
[modify] https://crrev.com/addc8ef1bcae5bff546fba73eca8bea3e640ebcd/base/threading/simple_thread.h
[modify] https://crrev.com/addc8ef1bcae5bff546fba73eca8bea3e640ebcd/base/threading/simple_thread_unittest.cc
[modify] https://crrev.com/addc8ef1bcae5bff546fba73eca8bea3e640ebcd/content/renderer/categorized_worker_pool.cc

Project Member

Comment 19 by bugdroid1@chromium.org, Aug 16 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c3c5c2b2f032d44429ed6cc88bcd125c20ed9e10

commit c3c5c2b2f032d44429ed6cc88bcd125c20ed9e10
Author: perkj <perkj@chromium.org>
Date: Tue Aug 16 07:28:34 2016

Remove unnecessary DCHECK(encoder_thread_.IsRunning()) from AudioRecorder.

Calling encoder_thread_.IsRunning() is apparently illegal from another sequence than the sequence/thread that started the thread.
BUG=629139

Review-Url: https://codereview.chromium.org/2245043002
Cr-Commit-Position: refs/heads/master@{#412187}

[modify] https://crrev.com/c3c5c2b2f032d44429ed6cc88bcd125c20ed9e10/content/renderer/media/audio_track_recorder.cc

Comment 20 by gab@chromium.org, Nov 10 2016

Blockedon: 587199

Comment 21 by gab@chromium.org, Nov 10 2016

Blockedon: -587199

Comment 22 by gab@chromium.org, Nov 10 2016

Blockedon: 552633
Project Member

Comment 23 by bugdroid1@chromium.org, Nov 23 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ca6c3690e7a1daaa6b428958dbf247c6611bb7a3

commit ca6c3690e7a1daaa6b428958dbf247c6611bb7a3
Author: gab <gab@chromium.org>
Date: Wed Nov 23 04:25:31 2016

Fix Thread::SetMessageLoop(nullptr).

Unit tests that pass base::MessageLoop::current() to a Thread need to
explicitly have a base::MessageLoop member as well or current() is null.

A better alternative though is to just have a content::TestBrowserThreadBundle member.

This long-standing issue (629139) was now blocking https://codereview.chromium.org/2464233002
which needs to unconditionally inspect the MessageLoop's member and it being unexpectedly null
is a problem.

Fixing the safe_browsing tests was a bit tricky because the TestSimpleTaskRunner they relied
on doesn't honor delays. Mocking the MessageLoop's TaskRunner with a TestMockTimeTaskRunner
was required. This in turn enabled better tests (that can explicitly wait for the expected timer
to fire). Ideally, their Timers would use the same MockTickClock but FastForwardUntilNoTasksRemain()
being the logical equivalent of the existing TestSimpleTaskRunner::RunUntilIdle() it was used instead
where required to simplify this CL.

BUG=629139,  653916 
TBR=danakj (re-enabling DCHECK in thread.cc)

Review-Url: https://codereview.chromium.org/2487343005
Cr-Commit-Position: refs/heads/master@{#434114}

[modify] https://crrev.com/ca6c3690e7a1daaa6b428958dbf247c6611bb7a3/base/threading/thread.cc
[modify] https://crrev.com/ca6c3690e7a1daaa6b428958dbf247c6611bb7a3/chrome/browser/safe_browsing/incident_reporting/incident_reporting_service_unittest.cc
[modify] https://crrev.com/ca6c3690e7a1daaa6b428958dbf247c6611bb7a3/chrome/browser/safe_browsing/protocol_manager.cc
[modify] https://crrev.com/ca6c3690e7a1daaa6b428958dbf247c6611bb7a3/chrome/browser/safe_browsing/protocol_manager.h
[modify] https://crrev.com/ca6c3690e7a1daaa6b428958dbf247c6611bb7a3/chrome/browser/safe_browsing/protocol_manager_unittest.cc

Cc: -mvanouwe...@chromium.org
Project Member

Comment 25 by bugdroid1@chromium.org, Jan 25 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/fc17e28f48e807564535c1447075656e622094a1

commit fc17e28f48e807564535c1447075656e622094a1
Author: gab <gab@chromium.org>
Date: Wed Jan 25 21:01:37 2017

Fix Thread::StopSoon() documentation.

It definitely can and ThreadTest.StopTwiceNop even exercises precisely
doing that.

BUG=629139

Review-Url: https://codereview.chromium.org/2650073004
Cr-Commit-Position: refs/heads/master@{#446116}

[modify] https://crrev.com/fc17e28f48e807564535c1447075656e622094a1/base/threading/thread.h

Components: Blink>Geolocation
Components: -Blink>Location
Components: -Internals

Sign in to add a comment