New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 738183 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 3
Type: Bug



Sign in to add a comment

WeakPtr change broke cronet tests

Project Member Reported by h...@chromium.org, Jun 29 2017

Issue description

This change: https://codereview.chromium.org/2963623002/

Appears to have broken a bunch of cronet bots:
https://build.chromium.org/p/chromium.android/builders/Android%20Cronet%20ARMv6%20Builder/builds/5160
https://build.chromium.org/p/chromium.android/builders/Android%20Cronet%20Builder%20%28dbg%29/builds/5325
https://build.chromium.org/p/chromium.android/builders/Android%20Cronet%20Builder%20Asan/builds/2674
https://build.chromium.org/p/chromium.android/builders/Android%20Cronet%20KitKat%20Builder/builds/2587


mgersh posted on the CL:

Example failure:
https://build.chromium.org/p/chromium.android/builders/Android%20Cronet%20Bui...

Stack trace:
Stack Trace:
  RELADDR   FUNCTION                                                            
                                                                                
                                                                                
                                               FILE:LINE
  0005e97b  logging::LogMessage::~LogMessage()+110                              
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/logging.cc:553
  0002d0dd  base::internal::WeakReference::Flag::IsValid() const+64             
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/memory/weak_ptr.h:132
  0002d055  base::WeakPtr<JsonPrefStore>::get() const+8                         
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/memory/weak_ptr.h:287
  v------>  base::WeakPtr<net::URLRequestJob>::operator bool() const            
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/memory/weak_ptr.h:305
  00289155  void base::internal::InvokeHelper<true, void>::MakeItSo<void
(net::URLRequestJob::* const&)(bool, int), base::WeakPtr<net::URLRequestJob>
const&, bool const&, int>(void (net::URLRequestJob::* const&)(bool, int),
base::WeakPtr<net::URLRequestJob> const&, bool const&, int&&)+22 
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/bind_internal.h:292
  000da4e1  base::Callback<void (net::Error), (base::internal::CopyMode)0,
(base::internal::RepeatMode)0>::Run(net::Error) &&+28                           
                                                                                
                                                    
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/callback.h:91
  002e322d  net::URLRequestJob::ReadRawDataComplete(int)+248                    
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/net/url_request/url_request_job.cc:586
  0004237f  base::Callback<void (), (base::internal::CopyMode)0,
(base::internal::RepeatMode)0>::Run() &&+22                                     
                                                                                
                                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/callback.h:91
  0004c359  base::debug::TaskAnnotator::RunTask(char const*,
base::PendingTask*)+140                                                         
                                                                                
                                                                  
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/debug/task_annotator.cc:59
  000655f1  base::MessageLoop::RunTask(base::PendingTask*)+180                  
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/message_loop/message_loop.cc:422
  00065851  base::MessageLoop::DeferOrRunPendingTask(base::PendingTask)+44      
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/message_loop/message_loop.cc:433
  00065a03  base::MessageLoop::DoWork()+166                                     
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/message_loop/message_loop.cc:540
  0006794b  base::MessagePumpLibevent::Run(base::MessagePump::Delegate*)+50     
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/message_loop/message_pump_libevent.cc:219
  00065435  base::MessageLoop::Run()+60                                         
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/message_loop/message_loop.cc:369
  0007ee61  base::RunLoop::Run()+100                                            
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/run_loop.cc:111
  000a02b5  base::Thread::Run(base::RunLoop*)+100                               
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/threading/thread.cc:255
  000a0595  base::Thread::ThreadMain()+432                                      
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/threading/thread.cc:338
  0009b633  base::(anonymous namespace)::ThreadFunc(void*)+66                   
                                                                                
                                                                                
                                              
/b/build/slave/Android_Cronet_Builder__dbg_/build/src/base/threading/platform_thread_posix.cc:71
  0000d173  <unknown>                                                           
                                                                                
                                                                                
                                               /system/lib/libc.so
  0000d30b  <unknown>
 

Comment 1 by h...@chromium.org, Jun 29 2017

mgersh: How did you get the stack trace? I looked at the bot output but it's not there.

Comment 2 by mge...@chromium.org, Jun 29 2017

Components: Internals>Network>Library
Labels: OS-Android
First I looked in the logcat output for crashes (search it for *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***), and found the one with the right timestamp. Then I looked in the "stack tool with logcat dump" step and found the one with matching pid and addresses, which isn't the last one for some reason.

Comment 3 by mge...@chromium.org, Jun 29 2017

Also, looks like this (from the logcat dump) didn't make it into my copy-paste: 4b976:  06-29 21:03:03.239 14199 14215 F chromium: [0629/210303.238603:FATAL:weak_ptr.h(132)] Check failed: sequence_checker_.CalledOnValidSequence(). WeakPtrs must be checked on the same sequenced thread.

The same crash shows up on the android_cronet_tester trybot. Release bots don't hit the check and have timeouts instead.

Comment 4 by h...@chromium.org, Jun 29 2017

Thanks!

Trying to get an emulator running so I can reproduce this, but the instructions don't work :-(

Comment 5 by mge...@chromium.org, Jun 29 2017

Sorry, I don't know about using emulators since I have physical test devices. For validating fixes you can use the android_cronet_tester trybot, but I know that's not as helpful for debugging. Hopefully there's someone near you who can help with that? When you do get it working, the instructions for building Cronet are at https://chromium.googlesource.com/chromium/src/+/master/components/cronet/android/build_instructions.md.

In the meantime I'd prefer that you revert since this is blocking Cronet changes from landing. I'm about to go home for the day but I'm happy to help debug further tomorrow.

Comment 6 by h...@chromium.org, Jun 29 2017

Yup, reverting now: https://codereview.chromium.org/2966633002/
Project Member

Comment 7 by bugdroid1@chromium.org, Jun 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/24046fd36c1a132243a7ffd279f59734f947d223

commit 24046fd36c1a132243a7ffd279f59734f947d223
Author: hans <hans@chromium.org>
Date: Thu Jun 29 22:07:42 2017

Revert of Make base::WeakPtr::Get() fast (patchset #10 id:180001 of https://codereview.chromium.org/2963623002/ )

Reason for revert:
This seems to have broken Cronet tests and Linux TSan. See bugs.

Original issue's description:
> Make base::WeakPtr::Get() fast
>
> It was previously calling WeakReference::is_valid() which was an out-of-line
> method. That means code like
>
>   if (my_weak_ptr)
>     my_weak_ptr->foo();
>
> would do two calls to is_valid(), plus tests and branching.
>
> The is_valid() method showed up as one of the most frequently called non-inline
> functions during browser start-up on Windows (about 1M calls).
>
> is_valid() was also inefficient because it had to do a null-pointer check on
> flag_, as well as checking if the flag was marked valid.
>
> This patch removes the null-pointer check by using a sentinel object instead of
> a nullptr. And instead of storing a bool in the flag, it stores a pointer-sized
> bitmask 0 or ~0, so it can be AND'ed with the pointer value and conveniently
> setting EFLAGS so that the code above just becomes a few loads, AND, branch,
> and call. (The size of Flag is unchanged; it grows into the padding only.)
>
> This is expected to reduce the binary size by ~48KB on Android, and increase it
> by 79KB on x64 Linux -- something that's paid for by the split-out refactorings
> to WeakPtr and WeakPtrFactory.
>
> Exposing the SequenceChecker calls in inline functions caused the
> MessageLoopTestType*.Nesting test to overflow on Win64 dcheck release builds,
> which is why this patch also lowers the recursion depth there.
>
> BUG= 728324 
>
> Review-Url: https://codereview.chromium.org/2963623002
> Cr-Commit-Position: refs/heads/master@{#483427}
> Committed: https://chromium.googlesource.com/chromium/src/+/526f714c9ae172c3b16581b7e11eb035fd14274e

TBR=thakis@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= 728324 , 738167 , 738183 , 738193 

Review-Url: https://codereview.chromium.org/2966633002
Cr-Commit-Position: refs/heads/master@{#483509}

[modify] https://crrev.com/24046fd36c1a132243a7ffd279f59734f947d223/base/memory/weak_ptr.cc
[modify] https://crrev.com/24046fd36c1a132243a7ffd279f59734f947d223/base/memory/weak_ptr.h
[modify] https://crrev.com/24046fd36c1a132243a7ffd279f59734f947d223/base/message_loop/message_loop_test.cc

Comment 8 by timloh@chromium.org, Jun 30 2017

Cc: mmenke@chromium.org
 Issue 738280  has been merged into this issue.

Comment 9 by h...@chromium.org, Jun 30 2017

The bots went green after the revert.

And I can now reproduce this locally, stack trace and all *wohoo*.

Comment 10 by h...@chromium.org, Jun 30 2017

Very mysterious.. it seems the NullFlag() returns different values sometimes. I put log statements in the NullFlag constructor and the IsValid check:

I/chromium(21000): [0630/125030.143226:INFO:weak_ptr.cc(22)] Created Null Flag: 0xa3f1af9c
I/chromium(21000): [0630/125030.146004:INFO:weak_ptr.h(131)] IsValid 0xa2e0da8c isvalid: 0 (nullflag = 0xa2e0da8c)
I/chromium(21000): [0630/125030.146278:INFO:weak_ptr.h(131)] IsValid 0xa2e0da8c isvalid: 0 (nullflag = 0xa2e0da8c)
I/chromium(21000): [0630/125030.146553:INFO:weak_ptr.h(131)] IsValid 0xa2e0da8c isvalid: 0 (nullflag = 0xa2e0da8c)
I/chromium(21000): [0630/125030.146797:INFO:weak_ptr.h(131)] IsValid 0xa3f1af9c isvalid: 0 (nullflag = 0xa3f1af9c)
I/chromium(21000): [0630/125030.146919:INFO:weak_ptr.h(131)] IsValid 0xa3f1af9c isvalid: 0 (nullflag = 0xa3f1af9c)
I/chromium(21000): [0630/125030.146858:INFO:weak_ptr.h(131)] IsValid 0xa2e0da8c isvalid: 0 (nullflag = 0xa2e0da8c)
I/chromium(21000): [0630/125030.147011:INFO:weak_ptr.h(131)] IsValid 0xa3f1af9c isvalid: 0 (nullflag = 0xa3f1af9c)
I/chromium(21000): [0630/125030.147102:INFO:weak_ptr.h(131)] IsValid 0xa3f1af9c isvalid: 0 (nullflag = 0xa3f1af9c)
I/chromium(21000): [0630/125030.147621:INFO:weak_ptr.h(131)] IsValid 0xa2e0da8c isvalid: 0 (nullflag = 0xa2e0da8c)
I/chromium(21000): [0630/125030.147865:INFO:weak_ptr.h(131)] IsValid 0xa2e0da8c isvalid: 0 (nullflag = 0xa2e0da8c)
I/chromium(21000): [0630/125030.148110:INFO:weak_ptr.h(131)] IsValid 0xa2e0da8c isvalid: 0 (nullflag = 0xa2e0da8c)


The nullflag printout is from calling NullFlag(). It seems different threads sometimes sees a different value. But the constructor only ran once. I have no idea how this is happening.

I tried switching to LazyInstance instead of relying on thread-safe initialization of the static local, but that didn't change anything.

Where is that 0xa2e0da8c value coming from?

Comment 11 by h...@chromium.org, Jun 30 2017

> It seems different threads

actually I was just assuming that; I don't have anything to back it up.

Comment 12 by h...@chromium.org, Jun 30 2017

Logging the address of the static variable in NullFlag() shows two different addresses.
Cronet tests use two separate native libraries, libcronet.[version number].so and libcronet_tests.so. The particular test that's crashing does some weirdness with task runners that might be relevant, although it's not the only test that does this. https://cs.chromium.org/chromium/src/components/cronet/android/test/cronet_test_util.cc?type=cs&l=74

Comment 14 by h...@chromium.org, Jun 30 2017

Ah, that's very interesting.

Comment 15 by h...@chromium.org, Jun 30 2017

I might have to continue debugging this from the airport now :-)
/me -> afk
mgersh: since this is starting to look pretty cronet specific, any chance we could some debugging help from y'all? We'd like to get our change in to collect perf numbers, but both Hans and I are out next week.
Cc: pauljensen@chromium.org
+pauljensen This is probably a consequence of this silly arrangement. Unfortunately, Cronet is intentionally having two copies of //base in there, which is not something that tends to end well. :-(
https://codereview.chromium.org/2406273002

(We had a similar issue recently with two copies of BoringSSL in a corner of Android and it took months to track it down.)
This is looking relevant to some changes pauljensen@ made semi-recently. Adding him to cc, but we're both out next week too.
Cc: mef@chromium.org
Also adding mef@ in case he wants to look at it while the rest of us are gone. The code causing the problem is only used in a few tests and I would love to find a way to get rid of it entirely, but I'm not sure how feasible that is.

Comment 20 by mef@chromium.org, Jun 30 2017

I can take a look next week if that helps.

Comment 21 by mef@chromium.org, Jun 30 2017


This will take me some time to wrap my head around.
I'll take it as an action item.

Comment 22 by h...@chromium.org, Jun 30 2017

I think maybe I can make my WeakPtr patch not depend on the uniqueness of the Null Flag object, which would make the problem of having two copies of base go away.

But it probably won't happen before I'm back, and I don't think there's any need to rush this.
In the long run, I don't think it's reasonable to require //base or //net survive objects from one copy of it being passed into functions from another copy. //third_party/boringssl certainly doesn't support this. Though that would work as a short-term fix.
I agree with davidben. linking base statically into two .sos isn't something people should do. Why can't your tests link in cronet statically? And if that's absolutely not possible, could your tester bot do a component build so that it could link against a libbase.so?

Comment 25 by h...@chromium.org, Jun 30 2017

And I'm not sure why the NullFlag constructor didn't run twice, once in each library. Maybe the static variable initialization flag ended up in the PLT but the actual static variable didn't?
I might be able to take a look, maybe this weekend.  Of course this happens the minute I go on vacation for two weeks.  The second .so is hardly used, it just contains some test infrastructure that we didn't want to ship in Cronet; it's not a great setup but prior to this we were shipping untested binaries.
I agree with the sentiments of comments 17, 23, 24.  This arrangement wasn't desirable.  It was instead the consequence of a couple factors:
1. Cronet was moving into an app where testing was done on the actual shipping production version of the app, but it was still desirable to run all the Cronet tests on the app.
2. Cronet was being shipped without being tested.  We would build libcronet.so and ship it without really testing it.  We'd build another libcronet_test.so that included the testing infrastructure and test that.  I felt it was desirable to test the actual binary being shipped.
The solution I came up with at the time was to expose a couple items via JNI from libcronet.so and pass in a couple testing infrastructure items constructed from another testing libcronet_test.so file.  This way libcronet.so could be tested without modifications, and without bloat from including testing infrastructure.
The unfortunate consequence was that both libcronet.so and libcronet_test.so both contained copies of much of //base and //net.
This hasn't caused any problems in the 7 months since I landed the change, but I do apologize for the annoyance it has caused in this issue, and I'm thankful for the time spent tracking it down.
If there was an easy way to remove the static dependence like comment 22 mentions, that might be the easiest short term solution.  As comment 22 says, I don't think we should rush this.
I think a longer term solution would be to make libcronet_test.so use more/all of libcronet.so, instead of duplicating symbols/code.  The downside of this would be a size increase to libcronet.so as the dynamic symbol table grows.  For reference libcronet.so is 3MB...but 40MB unstripped, so exposing all the symbols in the dynamic symbol table is probably a non-starter given the size increase.  Perhaps we could whitelist symbols to expose, and start with WeakReference::Flag::NullFlag.
Perhaps there is some way to statically link libcronet_test.so against libcronet.so so we don't have to grow the dynamic symbol table; I just don't know how this could be made to work given that separate shared libraries are loaded at random offsets from each other.
I'd like to think more about this, when I'm back from vacation.
I'm open to other ideas.
What do we do with Chrome itself (When not using incremental linking)?  My feeling is that if something is good enough for Chrome, it's good enough for Cronet.  I'd assume we just link everything into one binary, though I could be wrong.

If we really want to test on the production .so, I'd suggest just one or two tests checking basic loading, and with no external dependencies on base.  Exporting all symbols from base and net from cronet's .so file just doesn't seem like a great idea to me.
We run the Cronet tests, not net_unittests.  The Cronet tests are only a test of the Cronet API.  We could disable all tests relying on the mock URLRequestJobFactory, but that would be a significant portion of the tests.
Right...but I'm suggesting two sets of tests - one which links Cronet with everything else, like tests do (So doesn't use Cronet.so, but instead works like other test do), and a more limited set that use the real cronet.so.
That suggestion is approximately the state of the world described in point 2 of comment #27. I agree it's the simplest approach. I believe that is basically how browser_tests works.

Comment 32 by mef@chromium.org, Jul 5 2017

Is there something I could/should do this week before Paul returns from vacation?
I think the question is do we want the WeakPtr change in M61, or is M62 fine (Branch point is in 2.5 weeks, and I'd prefer not to be rushing everything out last minute)?  If we want it in M62, it would probably be best to get this taken care of during the next two weeks, while he's out.

Comment 34 by mef@chromium.org, Jul 5 2017

You mean if we want it in M61, right? 
The change looks useful for Chrome overall, so I would think that we want it in M61.

Would it make sense to introduce an explicit method WeakReference::Flag::ResetNullFlag(WeakReference::Flag* globalNullFlag) that cronet_tests.so would call during initialization with actual WeakReference::Flag::NullFlag() obtained from cronet.so? 

This will make checks in cronet and cronet_tests consistent.


Cronet should not ask //base to should not support two copies of itself interacting with each other. This is a not reasonable thing to support.
+1 to that.  Whatever we do, it should not require that we support having two copies of base (Or of anything else) loaded at once in the same process.
Can't the cronet tests link in cronet statically? (I.e. have all of cronet in a source_set, and make both the real cronet.so and cronet_tests depend on that)

That's how browser_tests and chrome work, and that seems to work fine in practice.

I assume the reason for the difference here is that Cronet includes code to dynamically link in the Cronet so file (Or use another alternative), and if we've already loaded it due to the test fixture itself depending on net/ and base/, we lose some test coverage of that logic.

Comment 39 by mef@chromium.org, Jul 5 2017

That was the configuration prior to https://codereview.chromium.org/2406273002 - cronet_static library was linked into production libcronet.so and test libcronet_tests.so, which also contained all the native test support classes.

This has changed as part of https://bugs.chromium.org/p/chromium/issues/detail?id=629299 which forces us to load libcronet.so from one apk, and tests from another. We cannot put libcronet_tests.so into first apk as it is shipped in production. The hacky split introduced in https://codereview.chromium.org/2406273002 allows us to test the actual shipping version of libcronet.so.

I would really like to find non-hacky solution that would let us do that. 

As a compromise solution we could split tests between those that require test support classes and those that don't, and only run latter for internal tests. This will limit test coverage, but avoid duplicate base.

Ideally we would export base/ symbols from libcronet.so and import them into libcronet_test.so, but without keeping actual symbolic names in libcronet.so. 
Windows DEF files allow using ordinals instead of function names (https://msdn.microsoft.com/en-us/library/d91k01sh.aspx), is something similar available on Android?


I'd:
1. Revert that change, since it's not a config we support, and it didn't get buy-in from base (and net?) owners
2. If you must have it, add some spot checking for the actual cronet.so, but make that test library not depend on base or anything (and hence, keep it very simple and make it test just a few basics -- the actual testing is done by the thing statically linking all the code)
Exporting base/ symbols from libcronet.so would be unhygienic and hurt your binary size because the static linker will not be able to drop anything from //base at that point. Also if the libcronet.so / libcronet_test.so boundary includes //net, you need to export all those symbols too, at which point you'll become huge.

Rather if you need to export things in libcronet.so for tests, you should expose just the minimal amount of surface area you need (what Nico suggests).

Having libcronet_test.so still link in //base *might* work, but only if the libcronet/libcronet_test boundary does not include any objects from //base or //net. E.g. you must never take a WeakPtr from one //base and pass it into another //base. And you certainly shouldn't touch each others' TaskRunners. You would be doing so because libcronet_test itself wants to use random //base utilities to read files or whatever.

(As an example, every Chrome/Android process actually has two copies of BoringSSL in there, probably even three. One from Chrome, one from the system, and maybe also one from Play Services. But these are treated as unrelated libraries. Their symbols and existence are not exported out of their respective owners.)
> You would be doing so because libcronet_test itself wants to [...]

Sorry, that was poorly phrased. What I meant is:

It *might* work if libcronet_test pulled in //base to read files, set up its own internal threads and TaskRunners, etc. Stuff that does not cross the libcronet/libcronet_test boundary.

It would not work for libcronet_test to, say, call a libcronet-exported GetTaskRunner() function and post things to it.

Comment 43 by mef@chromium.org, Jul 5 2017

Cc: kapishnikov@chromium.org
Rolling back https://codereview.chromium.org/2406273002 maybe non-trivial as it has landed 8 months ago.

Simplest thing to do to unblock WeakPtr re-land is to disable testMockReadDataAsyncError that is breaking on those builders.

I'll do this first and will try to figure out a proper solution.
Project Member

Comment 44 by bugdroid1@chromium.org, Jul 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/baaa9eae9337c541976270798c4d0b6609c8bd18

commit baaa9eae9337c541976270798c4d0b6609c8bd18
Author: mef <mef@chromium.org>
Date: Thu Jul 06 18:55:47 2017

[Cronet] Disable testMockReadDataAsyncError as it breaks with WeakPtr changes.

BUG= 738183 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_cronet_tester

Review-Url: https://codereview.chromium.org/2972903002
Cr-Commit-Position: refs/heads/master@{#484696}

[modify] https://crrev.com/baaa9eae9337c541976270798c4d0b6609c8bd18/components/cronet/android/test/javatests/src/org/chromium/net/CronetUrlRequestTest.java

Comment 45 by mef@chromium.org, Jul 6 2017

Cc: h...@chromium.org
Owner: mef@chromium.org
Status: Started (was: Assigned)
I've disabled testMockReadDataAsyncError so WeakPtr change could reland and will try figure out a good way to fix it on Cronet side. 

Comment 46 by h...@chromium.org, Jul 17 2017

Thanks! I'll try relanding the WeakPtr change.
Project Member

Comment 47 by bugdroid1@chromium.org, Jul 17 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3856eab55bb241fb054a878d9922e65e7b5b0931

commit 3856eab55bb241fb054a878d9922e65e7b5b0931
Author: hans <hans@chromium.org>
Date: Mon Jul 17 12:03:22 2017

Make base::WeakPtr::Get() fast

It was previous calling the out-of-line method WeakReference::is_valid().
That means that code like

  if (my_weak_ptr)
    my_weak_ptr->foo();

would do two calls to is_valid(), plus tests and branching.

The is_valid() method showed up as one of the most frequently called non-inline
functions during browser start-up on Windows (about 1M calls).

is_valid() was also inefficient in itself, because it had to do a null-check
flag_, as well as checking if the flag was marked valid.

This patch removes the null-pointer check by using a sentinel Flag object instead of
a nullptr. And instead of storing a bool in the Flag, it stores a pointer-sized
bitmask 0 or ~0, so it can be AND'ed with the pointer value and conveniently
setting EFLAGS so that the code above just becomes a few loads, AND, branch,
and call. (The size of Flag is unchanged; it grows into the padding only.)

This is expected to reduce the binary size by ~48KB on Android, and increase it
by 79KB on x64 Linux -- something that's paid for by the split-out refactorings
to WeakPtr and WeakPtrFactory.

Exposing the SequenceChecker calls in inline functions caused the
MessageLoopTestType*.Nesting test to overflow on Win64 dcheck release builds,
which is why this patch also lowers the recursion depth there.

BUG= 728324 , 738183 , 738193 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_cronet_tester

Review-Url: https://codereview.chromium.org/2963623002
Cr-Commit-Position: refs/heads/master@{#487048}

[modify] https://crrev.com/3856eab55bb241fb054a878d9922e65e7b5b0931/base/memory/weak_ptr.cc
[modify] https://crrev.com/3856eab55bb241fb054a878d9922e65e7b5b0931/base/memory/weak_ptr.h
[modify] https://crrev.com/3856eab55bb241fb054a878d9922e65e7b5b0931/base/message_loop/message_loop_test.cc

Status: Fixed (was: Started)
I filed  bug 744567  for not linking in base twice in cronet_tests.
Project Member

Comment 49 by bugdroid1@chromium.org, Aug 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4cfa301cda2ab84a496f3c13a5e3fa2f8de94fca

commit 4cfa301cda2ab84a496f3c13a5e3fa2f8de94fca
Author: Hans Wennborg <hans@chromium.org>
Date: Tue Aug 08 03:33:38 2017

Revert "Make base::WeakPtr::Get() fast"

This reverts commit 71f161e3bd95827876f67c656168f936d88a6e3a and
3856eab55bb241fb054a878d9922e65e7b5b0931.

Reason for revert:
I'm backing out the WeakPtr changes. While the new code was objectively
better, it didn't show on any perf bots, regressed size on Linux quite a
bit because the new code was smaller and thus inlined more often, and
seems to have introduced a new crasher on Android. To summarize: it seems
it's not worth it.  The WeakPtr revert will be done in two patches. The
first part is in https://chromium-review.googlesource.com/c/604087
This second part should be merged to M61.

Original change's description:
> Make base::WeakPtr::Get() fast
>
> It was previous calling the out-of-line method WeakReference::is_valid().
> That means that code like
>
>   if (my_weak_ptr)
>     my_weak_ptr->foo();
>
> would do two calls to is_valid(), plus tests and branching.
>
> The is_valid() method showed up as one of the most frequently called non-inline
> functions during browser start-up on Windows (about 1M calls).
>
> is_valid() was also inefficient in itself, because it had to do a null-check
> flag_, as well as checking if the flag was marked valid.
>
> This patch removes the null-pointer check by using a sentinel Flag object instead of
> a nullptr. And instead of storing a bool in the Flag, it stores a pointer-sized
> bitmask 0 or ~0, so it can be AND'ed with the pointer value and conveniently
> setting EFLAGS so that the code above just becomes a few loads, AND, branch,
> and call. (The size of Flag is unchanged; it grows into the padding only.)
>
> This is expected to reduce the binary size by ~48KB on Android, and increase it
> by 79KB on x64 Linux -- something that's paid for by the split-out refactorings
> to WeakPtr and WeakPtrFactory.
>
> Exposing the SequenceChecker calls in inline functions caused the
> MessageLoopTestType*.Nesting test to overflow on Win64 dcheck release builds,
> which is why this patch also lowers the recursion depth there.
>
> BUG= 728324 , 738183 , 738193 
> CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_cronet_tester
>
> Review-Url: https://codereview.chromium.org/2963623002
> Cr-Commit-Position: refs/heads/master@{#487048}

BUG=748495, 748075 , 744747 

Change-Id: Id85da06115ab9453267ade4d6f8efcfd1b26220c
Reviewed-on: https://chromium-review.googlesource.com/604088
Commit-Queue: Hans Wennborg <hans@chromium.org>
Reviewed-by: Nico Weber <thakis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#492520}
[modify] https://crrev.com/4cfa301cda2ab84a496f3c13a5e3fa2f8de94fca/base/memory/weak_ptr.cc
[modify] https://crrev.com/4cfa301cda2ab84a496f3c13a5e3fa2f8de94fca/base/memory/weak_ptr.h

Project Member

Comment 50 by bugdroid1@chromium.org, Aug 10 2017

Labels: merge-merged-3163
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f393cc0504d4da327d541a054c404aa422470c1c

commit f393cc0504d4da327d541a054c404aa422470c1c
Author: Hans Wennborg <hans@chromium.org>
Date: Thu Aug 10 15:45:38 2017

Revert "Make base::WeakPtr::Get() fast"

This reverts commit 71f161e3bd95827876f67c656168f936d88a6e3a and
3856eab55bb241fb054a878d9922e65e7b5b0931.

Reason for revert:
I'm backing out the WeakPtr changes. While the new code was objectively
better, it didn't show on any perf bots, regressed size on Linux quite a
bit because the new code was smaller and thus inlined more often, and
seems to have introduced a new crasher on Android. To summarize: it seems
it's not worth it.  The WeakPtr revert will be done in two patches. The
first part is in https://chromium-review.googlesource.com/c/604087
This second part should be merged to M61.

Original change's description:
> Make base::WeakPtr::Get() fast
>
> It was previous calling the out-of-line method WeakReference::is_valid().
> That means that code like
>
>   if (my_weak_ptr)
>     my_weak_ptr->foo();
>
> would do two calls to is_valid(), plus tests and branching.
>
> The is_valid() method showed up as one of the most frequently called non-inline
> functions during browser start-up on Windows (about 1M calls).
>
> is_valid() was also inefficient in itself, because it had to do a null-check
> flag_, as well as checking if the flag was marked valid.
>
> This patch removes the null-pointer check by using a sentinel Flag object instead of
> a nullptr. And instead of storing a bool in the Flag, it stores a pointer-sized
> bitmask 0 or ~0, so it can be AND'ed with the pointer value and conveniently
> setting EFLAGS so that the code above just becomes a few loads, AND, branch,
> and call. (The size of Flag is unchanged; it grows into the padding only.)
>
> This is expected to reduce the binary size by ~48KB on Android, and increase it
> by 79KB on x64 Linux -- something that's paid for by the split-out refactorings
> to WeakPtr and WeakPtrFactory.
>
> Exposing the SequenceChecker calls in inline functions caused the
> MessageLoopTestType*.Nesting test to overflow on Win64 dcheck release builds,
> which is why this patch also lowers the recursion depth there.
>
> BUG= 728324 , 738183 , 738193 
> CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_cronet_tester
>
> Review-Url: https://codereview.chromium.org/2963623002
> Cr-Commit-Position: refs/heads/master@{#487048}

BUG=748495, 748075 , 744747 
TBR=hans@chromium.org

(cherry picked from commit 4cfa301cda2ab84a496f3c13a5e3fa2f8de94fca)

Change-Id: Id85da06115ab9453267ade4d6f8efcfd1b26220c
Reviewed-on: https://chromium-review.googlesource.com/604088
Commit-Queue: Hans Wennborg <hans@chromium.org>
Reviewed-by: Nico Weber <thakis@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#492520}
Reviewed-on: https://chromium-review.googlesource.com/610446
Reviewed-by: Hans Wennborg <hans@chromium.org>
Cr-Commit-Position: refs/branch-heads/3163@{#432}
Cr-Branched-From: ff259bab28b35d242e10186cd63af7ed404fae0d-refs/heads/master@{#488528}
[modify] https://crrev.com/f393cc0504d4da327d541a054c404aa422470c1c/base/memory/weak_ptr.cc
[modify] https://crrev.com/f393cc0504d4da327d541a054c404aa422470c1c/base/memory/weak_ptr.h

Sign in to add a comment