New issue
Advanced search Search tips

Issue 736015 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Jul 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug-Regression



Sign in to add a comment

60 kb regression in resource_sizes (MonochromePublic.apk) at 481091:481091

Project Member Reported by estevenson@chromium.org, Jun 22 2017

Issue description

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=736015

Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgIDgjpDW5gkM


Bot(s) for this bug's original alert(s):

Android Builder
Labels: OS-Android
Owner: h...@chromium.org
Status: Assigned (was: Untriaged)
Growth is from native library size. Partial diff:

Index | Running Total | Section@Address | Δ PSS (Δ size_without_padding) | Path
------------------------------------------------------------
~ 0)      -8362 (-13.5%) t@Group      -8362 (0->0)       {{no path}}
               ** symbol gaps (count=9)
~ 1)      -7774 (-12.6%) t@0x41d2e4   +588 (2584->3172)  cc/tiles/tile_manager.cc
               cc::TileManager::ScheduleTasks
~ 2)      -7318 (-11.8%) t@0x40c66c   +456 (888->1344)   cc/tiles/gpu_image_decode_cache.cc
               cc::GpuImageDecodeCache::GetTaskForImageAndRefInternal
~ 3)      -7518 (-12.2%) t@0x2348fac  -200 (2012->1812)  chrome/browser/io_thread.cc
               IOThread::IOThread
~ 4)      -7342 (-11.9%) t@0x166f6c4  +176 (384->560)    gpu/ipc/service/gpu_channel.cc
               gpu::GpuChannel::GpuChannel
~ 5)      -7178 (-11.6%) t@0xd85a08   +164 (1996->2160)  content/browser/renderer_host/render_process_host_impl.cc
               content::RenderProcessHostImpl::CreateMessageFilters
~ 6)      -7022 (-11.4%) t@0x410c78   +156 (672->828)    cc/tiles/image_controller.cc
               cc::ImageController::ProcessNextImageDecodeOnWorkerThread
~ 7)      -6866 (-11.1%) t@0x7cb4fc   +156 (684->840)    ui/gfx/image/image.cc
               gfx::Image::As1xPNGBytes const
~ 8)      -6714 (-10.9%) t@0x41714c   +152 (1272->1424)  cc/tiles/software_image_decode_cache.cc
               cc::SoftwareImageDecodeCache::GetTaskForImageAndRefInternal
~ 9)      -6570 (-10.6%) t@0x428da0   +144 (168->312)    cc/trees/layer_tree_host.cc
               cc::LayerTreeHost::InitializeThreaded
~ 10)     -6426 (-10.4%) t@0xde12be   +144 (84->228)     content/browser/shared_worker/worker_storage_partition.cc
               content::WorkerStoragePartition::WorkerStoragePartition
~ 11)     -6284 (-10.2%) t@Group      +142 (98->240)     cc/trees/task_runner_provider.cc
               cc::TaskRunnerProvider::~TaskRunnerProvider (count=2)
~ 12)     -6148 (-9.9%) t@0x240d8dc  +136 (576->712)    chrome/browser/android/preferences/website_preference_bridge.cc
               Java_org_chromium_chrome_browser_preferences_website_WebsitePreferenceBridge_nativeClearCookieData
~ 13)     -6020 (-9.7%) t@0xde75f8   +128 (764->892)    content/browser/storage_partition_impl.cc
               content::StoragePartitionImpl::DataDeletionHelper::ClearDataOnUIThread
- 14)     -6144 (-9.9%) t@0x0        -124 (124->0)      cc/tiles/tile_manager.cc
               cc::InsertNodeForTask
- 15)     -6266 (-10.1%) t@0x0        -122 (122->0)      cc/tiles/image_controller.cc
               cc::ImageController::ImageDecodeRequest::operator=
~ 16)     -6144 (-9.9%) t@0x160dce2  +122 (16->138)     gpu/{{shared}}/3
               scoped_refptr::~scoped_refptr
~ 17)     -6024 (-9.7%) t@0xde6d38   +120 (1444->1564)  content/browser/storage_partition_impl.cc
               content::StoragePartitionImpl::Create
~ 18)     -5914 (-9.6%) t@Group      +110 (98->208)     base/files/file_descriptor_watcher_posix.cc
               base::FileDescriptorWatcher::Controller::Watcher::~Watcher (count=2)
~ 19)     -5804 (-9.4%) t@Group      +110 (98->208)     base/task_scheduler/scheduler_worker.cc
               base::internal::SchedulerWorker::Thread::~Thread (count=2)
~ 20)     -5694 (-9.2%) t@0x41f35c   +110 (234->344)    cc/tiles/tile_manager.cc
               cc::InsertNodeForDecodeTask
~ 21)     -5586 (-9.0%) t@0x35d02c   +108 (480->588)    base/synchronization/waitable_event_watcher_posix.cc
               base::WaitableEventWatcher::StartWatching
- 22)     -5692 (-9.2%) t@0x0        -106 (106->0)      cc/trees/layer_tree_host.cc
               cc::TaskRunnerProvider::Create
~ 23)     -5588 (-9.0%) t@0x379dd4   +104 (912->1016)   base/trace_event/memory_dump_manager.cc
               base::trace_event::MemoryDumpManager::SetupForTracing
~ 24)     -5484 (-8.9%) t@0xd2c124   +104 (1168->1272)  content/browser/loader/resource_dispatcher_host_impl.cc
               content::ResourceDispatcherHostImpl::BeginNavigationRequest
- 25)     -5586 (-9.0%) t@0x0        -102 (102->0)      gpu/command_buffer/service/sync_point_manager.cc
               base::DefaultRefCountedThreadSafeTraits::Destruct
~ 26)     -5484 (-8.9%) t@Group      +102 (102->204)    cc/layers/picture_layer.cc
               cc::PictureLayer::~PictureLayer (count=2)
~ 27)     -5383 (-8.7%) t@0xa937d4   +101 (118->160)    components/policy/core/common/schema.cc
               std::__ndk1::vector::__swap_out_circular_buffer (num_aliases=2->1)
~ 28)     -5283 (-8.5%) t@0xc5d308   +100 (568->668)    content/browser/byte_stream.cc
               content::CreateByteStream
~ 29)     -5183 (-8.4%) t@0xd86278   +100 (1788->1888)  content/browser/renderer_host/render_process_host_impl.cc
               content::RenderProcessHostImpl::RegisterMojoInterfaces
~ 30)     -5279 (-8.5%) t@0x2352390  -96 (1376->1280)   chrome/browser/metrics/chrome_metrics_service_client.cc
               ChromeMetricsServiceClient::RegisterMetricsServiceProviders
~ 31)     -5183 (-8.4%) t@Group      +96 (1026->1122)   base/message_loop/message_loop.cc
               base::MessageLoop::~MessageLoop (count=2)
~ 32)     -5087 (-8.2%) t@0x160d40e  +96 (538->634)     gpu/command_buffer/service/sync_point_manager.cc
               gpu::SyncPointOrderData::BeginProcessingOrderNumber
~ 33)     -4991 (-8.1%) t@0x160d8bc  +96 (536->632)     gpu/command_buffer/service/sync_point_manager.cc


You can see more details if you run "tools/binary_size/diagnose_bloat.py 5e3bdec771619fd558e176dafe393bfe1690078a -v --cloud".

From looking at the comments on the CL it seems that this size change was probably unexpected. 

hans@ - Is this change necessary? Is there anything we can do to reduce the size jump on Android?

Comment 4 Deleted

Comment 5 Deleted

Comment 6 Deleted

Comment 7 by h...@chromium.org, Jun 22 2017

Cc: thakis@chromium.org
> hans@ - Is this change necessary? Is there anything we can do to reduce the size jump on Android?

The change resulted in a size decrease on other platforms. Let me investigate why it's causing growth on Android/ARM..

Comment 8 by h...@chromium.org, Jun 23 2017

It looks like the machine code for atomic decrement (and probably increment too) on ARM and Thumb is a larger than on X86.

Short example:

bool ref_dec(volatile int *ptr) {
  return __atomic_fetch_add(ptr, -1, __ATOMIC_SEQ_CST) - 1 != 0;
}


I suppose we could #ifdef for ARM and out-line the implementation..
Cc: agrieve@chromium.org
60 kb is a pretty large regression for Android but I'm not sure if we want to start recommending arch specific implementations for this amount of growth..

agrieve@ - thoughts on c#8? 
I agree, unless this makes ARM measurably faster, it's not worth 60kb. 
On x86, the change _reduces_ size. The question is: Should we have different code paths on x86 and arm here?
Perhaps we could define the method body in a macro to ensure it doesn't differ between .h and .cc? If we don't want to have different paths, then I do not think this change should go in.
Wow, great work on the start-up time front! 

I don't think any of the other changes appeared as size alerts, so it's possible that this one in particular isn't necessary (on arm). 

I'd say that if this change was even responsible for 10ms at 50%, then that would justify keeping it.

Comment 15 by h...@chromium.org, Jun 30 2017

I looked at the generated code, and while it's very small for X86 (that's why we want it inline), there doesn't seem to be a way to generate a really small code sequence for ARM, so perhaps it makes sense to have it out-of-line there.

I'm working on a patch to see what it would look like.

Comment 16 by h...@chromium.org, Jun 30 2017

Taking a stab here, using an inline method instead of macro: https://codereview.chromium.org/2961413003/
Project Member

Comment 17 by bugdroid1@chromium.org, Jun 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/79eb7ebaabeb87fd76da56f35de0ab5776806a15

commit 79eb7ebaabeb87fd76da56f35de0ab5776806a15
Author: hans <hans@chromium.org>
Date: Fri Jun 30 18:14:48 2017

Outline RefCountedThreadSafeBase::AddRef and Release on non-X86

They were made inline in #481091, which caused a size drop on X86 and
size increase on ARM.

The code sequence on ARM is larger than on X86, and probably not worth
inlining.

BUG= 728324 , 736015 

Review-Url: https://codereview.chromium.org/2961413003
Cr-Commit-Position: refs/heads/master@{#483759}

[modify] https://crrev.com/79eb7ebaabeb87fd76da56f35de0ab5776806a15/base/memory/ref_counted.cc
[modify] https://crrev.com/79eb7ebaabeb87fd76da56f35de0ab5776806a15/base/memory/ref_counted.h

Status: Fixed (was: Assigned)
Graph here: https://chromeperf.appspot.com/report?sid=99caa6c5389f77e13b562afa5703ad8fc1dfd9784effeb8760626be0c71e498b&start_rev=483726&end_rev=483779 - 60 kb drop as expected!

Thanks for fixing :).
Status: Started (was: Fixed)
Let's keep this open for another two weeks so we remember to look at startup perf with next week's canary.

Comment 21 by h...@chromium.org, Jul 17 2017

Status: Fixed (was: Started)
Closing as I don't think there's more to do here.

Sign in to add a comment