New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 780100 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 10
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug

Blocking:
issue 872044



Sign in to add a comment

Speed up PostTask for Android UI threads using Native ALooper interface

Project Member Reported by mthiesse@chromium.org, Oct 31 2017

Issue description

See discussion in https://groups.google.com/a/google.com/forum/#!msg/chrome-scheduler/Mynvw2U7cfQ/ml4PHRVgBQAJ (sorry, only visible to Googlers)

Currently, PostTask on Java goes through SystemMessageHandler.java to add a message to the java message queue for each task posted to the thread. I believe the reasons for this were twofold: Maintaining fairness between java and c++ tasks, and preserving ordering of java and c++ tasks.

However, c++ tasks are now marked asynchronous in the message queue, removing any ordering guarantees (if any java task is blocked on a barrier, the c++ tasks will run) in a non-deterministic way.

Android provides a native Looper API to post tasks by signalling a File Descriptor. This API is >4x faster in terms of CPU time and wall clock duration (as measured on my Android N Pixel phone), but only guarantees ordering between descriptors based on signalling time. This means that Java tasks, which are all (analogously) backed by a single descriptor can't maintain absolute ordering as there isn't a signal per task to order them across descriptors. This is okay because ordering already isn't guaranteed, and this will make violations of ordering assumptions more common so we can find these issues.

Fairness can be maintained by limiting the number of c++ tasks run each time the Looper fires our callback (running one task will alternate between java and c++ tasks). I'll initially implement alternating between java and c++ tasks, and this can be tuned further later.


My plan is to land this behind a feature flag, enable by default with a finch control group to measure performance/stability, and a kill-switch in case it needs to turned off on stable.
 

Comment 1 by dskiba@chromium.org, Oct 31 2017

Cc: dskiba@chromium.org
Project Member

Comment 2 by bugdroid1@chromium.org, Nov 1 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/fc7067fedcb53c50ac95c2cff7b1f6f4c22c1e22

commit fc7067fedcb53c50ac95c2cff7b1f6f4c22c1e22
Author: Michael Thiessen <mthiesse@chromium.org>
Date: Wed Nov 01 22:33:01 2017

Ensure MessagePumpForUI doesn't run tasks after being aborted.

The previous behaviour of MessagePumpForUI::Abort was to prevent
delayed tasks from running, but non-delayed tasks would continue to run.
This was almost certainly not intentional, and an aborted pump probably
shouldn't be running any tasks at all.

Bug:  780100 
Change-Id: I56ad5b0b5d0621450fc62a35c4d927f4975ae6a4
Reviewed-on: https://chromium-review.googlesource.com/748979
Reviewed-by: danakj <danakj@chromium.org>
Commit-Queue: Michael Thiessen <mthiesse@chromium.org>
Cr-Commit-Position: refs/heads/master@{#513303}
[modify] https://crrev.com/fc7067fedcb53c50ac95c2cff7b1f6f4c22c1e22/base/message_loop/message_loop_unittest.cc
[modify] https://crrev.com/fc7067fedcb53c50ac95c2cff7b1f6f4c22c1e22/base/message_loop/message_pump_android.cc

Project Member

Comment 3 by bugdroid1@chromium.org, Nov 15 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/781ddeb91a1846ed336066f0d30f2ac7aa34878c

commit 781ddeb91a1846ed336066f0d30f2ac7aa34878c
Author: Michael Thiessen <mthiesse@chromium.org>
Date: Wed Nov 15 17:07:23 2017

Clean up message_pump_android and related tests.

This is a cleanup CL with no intended functional changes.

This is in preparation for the work in  crbug.com/780100 , and allows the
tests for JavaHandlerThread to remain agnostic to the message pump's
implementation (so that I can change the implementation later without
having to mess with the existing tests). I believe the modified tests
test everything the existing tests covered.

Bug:  780100 
Change-Id: I84e584f6521b15230811ddc39c9d0a1a9e77c430
Reviewed-on: https://chromium-review.googlesource.com/747922
Reviewed-by: Richard Coles <torne@chromium.org>
Reviewed-by: danakj <danakj@chromium.org>
Commit-Queue: Michael Thiessen <mthiesse@chromium.org>
Cr-Commit-Position: refs/heads/master@{#516725}
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/BUILD.gn
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/android/java/src/org/chromium/base/JavaHandlerThread.java
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/android/java/src/org/chromium/base/SystemMessageHandler.java
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/android/java_handler_thread.cc
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/android/java_handler_thread.h
[delete] https://crrev.com/d581d074704d0d88763cb87cb0e667e163c74e31/base/android/java_message_handler_factory.h
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/message_loop/message_loop.cc
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/message_loop/message_loop.h
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/message_loop/message_loop_unittest.cc
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/message_loop/message_pump_android.cc
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/message_loop/message_pump_android.h
[modify] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/test/BUILD.gn
[rename] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/test/android/java/src/org/chromium/base/JavaHandlerThreadHelpers.java
[delete] https://crrev.com/d581d074704d0d88763cb87cb0e667e163c74e31/base/test/android/java/src/org/chromium/base/TestSystemMessageHandler.java
[delete] https://crrev.com/d581d074704d0d88763cb87cb0e667e163c74e31/base/test/android/java_handler_thread_for_testing.cc
[delete] https://crrev.com/d581d074704d0d88763cb87cb0e667e163c74e31/base/test/android/java_handler_thread_for_testing.h
[add] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/test/android/java_handler_thread_helpers.cc
[add] https://crrev.com/781ddeb91a1846ed336066f0d30f2ac7aa34878c/base/test/android/java_handler_thread_helpers.h
[delete] https://crrev.com/d581d074704d0d88763cb87cb0e667e163c74e31/base/test/android/test_system_message_handler_link_android.cc
[delete] https://crrev.com/d581d074704d0d88763cb87cb0e667e163c74e31/base/test/android/test_system_message_handler_link_android.h

Labels: -M-64 M-65
Coming in too hot for M64, I'll try for 65.
By the way, do you have a link to a prototype patch? I'm curious how delayed work (PostDelayedTask) is handled.
Implementation is complete and can be found here: https://chromium-review.googlesource.com/c/chromium/src/+/751322

I'm just working through a few stubbornly racy tests.

PostDelayedTask is implemented using timerfd.
Description: Show this description

Comment 8 by dskiba@chromium.org, Nov 22 2017

What is the end goal here? The CL cites improvements, but they are pretty small to me. In which cases those improvements matter?
Well I'm not sure what decides whether or not the improvement is worth it. It makes startup ~20ms faster, and every subsequent postTask ~0.03ms faster. That's not a whole lot in itself, but it adds up over time.

It has some other benefits like letting us run unit tests like the base unit tests on the actual implementation, and not a stub (previously the actual implementation was practically untested).

Is this work not worth it in your opinion? (I've spent ~2 weeks to get it to this point)
It looks like CL has some unrelated changes? E.g. why does it include a change to LocationBarLayout.java? This makes it hard to access its size.

Have you tried benchmarking on Android Go device? I think you should get better numbers.

Also, what did you mean by "startup" in "makes startup ~20ms faster"? How did you measure that?

The CL currently has 'unrelated' changes because I'm fixing all of the test failures resulting from the change - the CL exposes some races in existing code, and some things that relied on java/native task ordering. I'll be splitting it up into multiple CLs before I actually send it out for review.

I don't have an Android Go device, I'll see if I can find one. I did try to benchmark on a nexus 4 (running K), but the telemetry startup scripts didn't work. I was hoping to land with a finch trial to get more representative measurements with and without the patch across all devices to see what the impact of the change is.

The 20ms number has huge error bars around it. Startup is super noisy and I didn't want to spend a ton of time averaging runs, so what I did was run telemetry during startup on a pixel device, and average the cumulative CPU-self time for all work done on the UI thread (in the first 5 seconds) across 5 runs to sanity check. The difference was an average of 23ms.
I have a Go device to lend you
I've analyzed performance impact of the CL on startup on 1GiB Android Go device ("gobo", running OMB1.171127.001): https://docs.google.com/a/chromium.org/spreadsheets/d/11G-7oLEajhr_D1NZQRh6d6ezej7wW0bHAEiL-eHZi4s/edit?usp=sharing

The test was opening Chrome, which was reloading a single wikipedia tab. Terminology:

* Startup: "ChromeApplication.onCreate" -> "BenchmarkInstrumentation::ImplThreadRenderingStats" (first drawLayers)
* StartupToNavigation: "ChromeApplication.onCreate" -> "NavigationTiming navigationStart"
* NavigationToCommit: "Navigation StartToCommit" slice
* NavigationToLoad: "NavigationTiming navigationStart" -> "firstContentfulPaint"
* StartupToLoad: "ChromeApplication.onCreate" -> "firstContentfulPaint"

Takeaway is that everything except StartupToNavigation has regressed. I guess StartupToNavigation improved because we're running all Java events first, and navigation is started from a Java event.

Sample traces attached.
master_50-10s-2.html.gz
1.6 MB Download
alooper_50-10s-39.html.gz
1.6 MB Download
That's very interesting. I would guess this is due to the event reordering somehow, or maybe alternating between java and native tasks is just slower for the Looper itself. That might also imply your proposed simplifications in issue 788008 would also lead to perf regressions.
Also, did you automate this somehow? If so, I'd love to know how.
Looking at the two traces a bit - the results are a bit surprising. We are definitely doing some critical stuff earlier on. The time after finishNativeInitializatoin in the alooper trace is weird - we rifle through many small chrome native functions that usually happen too early but also seemingly more FrameHandler callbacks than in the typical case.

So for example, we get to finishNativeInitialization nearly 400ms earlier (beacuse I believe we're doing more important java stuff earlier) but then only reach gpu init 100ms later than baseline. It's hard to understand why there's that delta and I'm curious if it's consistent across runs.
Labels: -M-65 M-66
I'm blocked on getting an Android Go device to debug the odd slowdowns.
Labels: -M-66 M-67
Labels: -M-67 M-68

Comment 20 by gab@chromium.org, May 3 2018

Components: -Internals>TaskScheduler Internals
Not related to TaskScheduler as it doesn't use MessageLoop (maybe we should have an Internals>Tasks for these types of tracking, for now it's just in plain Internals)
Labels: -M-68 M-69
Cc: alexclarke@chromium.org
Cc: altimin@chromium.org
CL is out, at https://chromium-review.googlesource.com/c/chromium/src/+/751322.

In doing a bunch of startup testing on a gobo device I was randomly getting performance wins or losses, all pretty small and well within noise. I wasn't able to replicate dskiba's results with the latest change.
Project Member

Comment 25 by bugdroid1@chromium.org, Jul 10

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d7ae73502ade3255987a4b5c72c5294c2c82896a

commit d7ae73502ade3255987a4b5c72c5294c2c82896a
Author: Michael Thiessen <mthiesse@chromium.org>
Date: Tue Jul 10 00:57:13 2018

ALooper implementation for MessagePumpForUI::Schedule(Delayed)Work

This CL switches the Android MessagePumpForUI from queuing tasks on
the Android MessageQueue, to signalling File Descriptors added to the
ALooper's epoll implementation.

Ordering between events queued in Java and C++ is not guaranteed, but
this was already not the case on Android; this CL simply takes better
advantage of that. Ordering for events queued in Java and C++ are still
correctly ordered with respect to other events queued in Java and C++,
respectively.

This CL also changes the Android test support MessagePumpForUIStub to use
the real implementation when non-nested, which means that the message pump
code is now *actually tested*.

Performance Improvements (measured locally using Tracing):
Device: gobo (low-end Android GO device)
OS: Android O (8.1.0)

During startup on the main thread:
On Master, ScheduleWork is called ~207 times, averaging .047ms per call.
With this patch, ScheduleWork is called ~63 times, averaging .02ms per call.
That's ~11ms less time spent in ScheduleWork during startup.
ScheduleDelayedWork is similarly faster per call, and saves ~1ms during startup.

Startup profiling doesn't show any meaningful difference in startup time
overall, probably because the sample sizes were too small given how noisy
startup is.
Data here: https://docs.google.com/spreadsheets/d/1SQJ7zYeq0Eo4lQziYCALA-noP4pHlyeAd4--4fK6mbk/edit

Bug:  780100 
Change-Id: Icca7443d62154e4b516a70f8afffb1882000b555
Reviewed-on: https://chromium-review.googlesource.com/751322
Commit-Queue: Michael Thiessen <mthiesse@chromium.org>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Reviewed-by: kylechar <kylechar@chromium.org>
Reviewed-by: Yaron Friedman <yfriedman@chromium.org>
Reviewed-by: Sami Kyöstilä <skyostil@chromium.org>
Cr-Commit-Position: refs/heads/master@{#573570}
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/BUILD.gn
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/android/java/src/org/chromium/base/JavaHandlerThread.java
[delete] https://crrev.com/1f820fd3d5a5acb195e9dcf68b7376992e6c53d3/base/android/java/src/org/chromium/base/SystemMessageHandler.java
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/android/java_handler_thread.cc
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/android/java_handler_thread.h
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/message_loop/incoming_task_queue.cc
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/message_loop/incoming_task_queue.h
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/message_loop/message_loop.cc
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/message_loop/message_loop.h
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/message_loop/message_pump_android.cc
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/message_loop/message_pump_android.h
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/base/test/test_support_android.cc
[modify] https://crrev.com/d7ae73502ade3255987a4b5c72c5294c2c82896a/chrome/android/javatests/src/org/chromium/chrome/browser/contextual_suggestions/ContextualSuggestionsTest.java

Status: Fixed (was: Started)
Looks like a 15% startup time win on some go devices :D

https://bugs.chromium.org/p/chromium/issues/detail?id=862984
Blocking: 872044

Sign in to add a comment