New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 616998 link

Starred by 1 user

Issue metadata

Status: Duplicate
Merged: issue 726489
Owner:
please use my google.com address
Closed: May 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug-Regression



Sign in to add a comment

9.2% regression in service_worker.service_worker_micro_benchmark at 394780:394844

Project Member Reported by horo@chromium.org, Jun 3 2016

Issue description

See the link to graphs below.
 

Comment 1 by horo@chromium.org, Jun 3 2016

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=616998

Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgICgrOyA5AgM


Bot(s) for this bug's original alert(s):

android-galaxy-s5
Cc: gayane@chromium.org
Owner: gayane@chromium.org

=== Auto-CCing suspected CL author gayane@chromium.org ===

Hi gayane@chromium.org, the bisect results pointed to your CL below as possibly
causing a regression. Please have a look at this info and see whether
your CL be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Make the launch params the default client behavior for UMA 3g
Author  : gayane
Commit description:
  
BUG=455847

Review-Url: https://codereview.chromium.org/1974593002
Cr-Commit-Position: refs/heads/master@{#394811}
Commit  : 8ff878076934185e5679df6f75a26f0eff47da22
Date    : Thu May 19 18:06:00 2016


===== TESTED REVISIONS =====
Revision         Mean     Std Dev   N   Good?
chromium@394779  21.0638  0.217748  12  good
chromium@394796  21.1729  0.412952  12  good
chromium@394804  20.959   0.191422  5   good
chromium@394808  20.9269  0.217337  8   good
chromium@394810  20.9871  0.241712  12  good
chromium@394811  21.4554  0.298424  12  bad    <--
chromium@394812  21.6263  0.420015  8   bad
chromium@394844  22.315   1.47884   8   bad

Bisect job ran on: android_s5_perf_bisect
Bug ID: 616998

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --also-run-disabled-tests service_worker.service_worker_micro_benchmark
Test Metric: concurrent_10_response_50_percentile/concurrent_10_response_50_percentile
Relative Change: 7.72%
Score: 99.5

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_s5_perf_bisect/builds/680
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9010906026470237424


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5271412869169152

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Cc: dmu...@chromium.org
Owner: dmu...@chromium.org

=== Auto-CCing suspected CL author dmurph@chromium.org ===

Hi dmurph@chromium.org, the bisect results pointed to your CL below as possibly
causing a regression. Please have a look at this info and see whether
your CL be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : [BlobAsync] Fixed race between IPC messages and IO task queue
Author  : dmurph
Commit description:
  
The message responses to our blob start message are dispatched on the IO
thread but they're not added to the task queue. So a race can happen
where OnMemoryRequest is called before we store the consolidation in our
map.

I moved the StartBuildingBlob message - which signals the browser to
start requesting memory from the renderer - to our IO task which stores
our blob data. This eliminates the race case with a small sacrifice of
construction speed.

In the future, I'd like to be able to still dispatch both IPC messages
right away to keep the fast path, and correctly handle the race case.

BUG= 612358 

Review-Url: https://codereview.chromium.org/1988293002
Cr-Commit-Position: refs/heads/master@{#394549}
Commit  : 65a8fb95a8c71a7be8515f5e1c8125b04a477716
Date    : Wed May 18 21:16:15 2016


===== TESTED REVISIONS =====
Revision         Mean    Std Dev    N  Good?
chromium@394446  19.915  0.445042   5  good
chromium@394509  19.906  0.34153    5  good
chromium@394540  19.614  0.138492   5  good
chromium@394548  19.527  0.0699643  5  good
chromium@394549  20.781  0.12477    5  bad    <--
chromium@394550  21.019  0.147199   5  bad
chromium@394552  20.683  0.0626099  5  bad
chromium@394556  20.961  0.119027   5  bad
chromium@394571  21.06   0.245484   5  bad
chromium@394696  21.025  0.163974   5  bad
chromium@394945  21.675  0.12465    5  bad
chromium@395443  23.181  0.499192   5  bad

Bisect job ran on: android_s5_perf_bisect
Bug ID: 616998

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --also-run-disabled-tests service_worker.service_worker_micro_benchmark
Test Metric: concurrent_10_response_50_percentile/concurrent_10_response_50_percentile
Relative Change: 16.40%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_s5_perf_bisect/builds/682
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9010890922493196464


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5805335017684992

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Project Member

Comment 4 by sheriffbot@chromium.org, Jun 3 2016

Labels: -M-52 M-53 MovedFrom-52
Moving this nonessential bug to the next milestone.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
That CL was reverted, so this should be fixed?
dmurph: Right, if that CL was the cause and it was reverted, we should see it on the graph.

But, looking at the graphs (https://chromeperf.appspot.com/group_report?bug_id=616998), there was no clear step up where the original change was, which suggests that this isn't a regression caused by one change, if it is a real regression at all. After the revert, I didn't see any step down.

OK to close this as WontFix?
Project Member

Comment 7 by sheriffbot@chromium.org, Jul 7 2016

Labels: -M-53 MovedFrom-53
This issue has been moved once and is lower than Pri-1. Removing the milestone.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: -gayane@chromium.org -dmu...@chromium.org rsch...@chromium.org
Components: Blink>ServiceWorker
Owner: ----
There's definitely a regression here, but I don't think it's in the range of the alert. I just sent off some more, wider bisects.

For the record, the bisects that have come back so far are all pretty dubious. I'm removing dmurph and gayane, I don't think their patches are to blame. Or at least the bisects haven't proved that yet :)
Project Member

Comment 11 by 42576172...@developer.gserviceaccount.com, Jul 15 2016

Cc: dmu...@chromium.org
Owner: dmu...@chromium.org

=== Auto-CCing suspected CL author dmurph@chromium.org ===

Hi dmurph@chromium.org, the bisect results pointed to your CL below as possibly
causing a regression. Please have a look at this info and see whether
your CL be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : [BlobAsync] Fixed race between IPC messages and IO task queue
Author  : dmurph
Commit description:
  
The message responses to our blob start message are dispatched on the IO
thread but they're not added to the task queue. So a race can happen
where OnMemoryRequest is called before we store the consolidation in our
map.

I moved the StartBuildingBlob message - which signals the browser to
start requesting memory from the renderer - to our IO task which stores
our blob data. This eliminates the race case with a small sacrifice of
construction speed.

In the future, I'd like to be able to still dispatch both IPC messages
right away to keep the fast path, and correctly handle the race case.

BUG= 612358 

Review-Url: https://codereview.chromium.org/1988293002
Cr-Commit-Position: refs/heads/master@{#394549}
Commit  : 65a8fb95a8c71a7be8515f5e1c8125b04a477716
Date    : Wed May 18 21:16:15 2016


===== TESTED REVISIONS =====
Revision         Mean    Std Dev   N  Good?
chromium@394447  20.629  0.348755  5  good
chromium@394524  21.096  0.293394  5  good
chromium@394544  21.06   0.188978  5  good
chromium@394547  20.793  0.34892   5  good
chromium@394548  21.147  0.268319  5  good
chromium@394549  22.548  0.557221  5  bad    <--
chromium@394554  22.536  0.217152  5  bad
chromium@394563  22.269  0.581672  5  bad
chromium@394601  23.079  1.30643   5  bad
chromium@394755  22.775  0.313947  5  bad

Bisect job ran on: android_s5_perf_bisect
Bug ID: 616998

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --also-run-disabled-tests service_worker.service_worker_micro_benchmark
Test Metric: concurrent_10_response_90_percentile/concurrent_10_response_90_percentile
Relative Change: 10.40%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_s5_perf_bisect/builds/803
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9007038243971353568


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5143118131757056

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
That bisect *is* cleaner, but it doesn't solve the issue that this was reverted and doesn't show an improvement there.

Confusingly, the last line of that bisect contains the revert at 394740, but the value is still "bad"!
 > chromium@394755  22.775  0.313947  5  bad

dmurph, were there any other changes to this code roughly in that range? Could there have been a patch in between the land and the revert that somehow caused the regression to stick despite the revert?
Hm.... well because my email here https://groups.google.com/a/chromium.org/forum/?fromgroups#!topic/chromium-dev/f5rMehPu-LE

they reverted send being sent directly on mojo instead of scheduling on the IO thread. This is the same thing as my revert CL, so this would have had the same performance impact.
Owner: ben@chromium.org
Ah. I'm assuming you mean that would have done the same thing as your initial CL.

If the switch to Mojo ended up causing the regression, Ben: who would be right to investigate that?
Well not quite switching to Mojo. That actually should have been a perf increase, where messages were sent from any thread instead of scheduled on the IO thread. However, this caused a bunch of races where people assumed that IPCs would be received or sent on the IO task scheduler. Since there were other bizzare problems, they reverted that specific aspect, where they scheduled the sends on the IO thread. So it should have been the same as pre-mojo.

Comment 16 by ben@chromium.org, Jul 16 2016

Owner: roc...@chromium.org
Don't think I changed this yet, so over to Ken for thoughts.
I reverted thread-safe send behavior in r394568. This is likely responsible for part of the uptick, but it's not something that's going to be undone.

It looks like there are further regressions around the same time though and I'm not sure how to explain those. I don't know how to trigger a perf bisect, but can we do one with 394568 as a "good" baseline to see if there's another clear offender?
Cc: roc...@chromium.org
Labels: -Performance-Sheriff Performance OS-Android
Owner: horo@chromium.org
horo-- could you take this as this week's blink-worker triager? (and original one...  our triage process broke down here).

It's interesting that the regression apparently is only on Android.
@rockot I submitted the bisect you mentioned in 17.
Project Member

Comment 21 by 42576172...@developer.gserviceaccount.com, Jul 19 2016


===== BISECT JOB RESULTS =====
Status: completed


=== Bisection aborted ===
The bisect was aborted because The metric values for the initial "good" and "bad" revisions do not represent a clear regression.
Please contact the the team (see below) if you believe this is in error.

=== Warnings ===
The following warnings were raised by the bisect job:

 * Bisect failed to reproduce the regression with enough confidence.

===== TESTED REVISIONS =====
Revision         Mean     Std Dev   N   Good?
chromium@394568  22.7972  0.958085  18  good
chromium@394755  22.8536  0.375562  18  bad

Bisect job ran on: android_s5_perf_bisect
Bug ID: 616998

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --also-run-disabled-tests service_worker.service_worker_micro_benchmark
Test Metric: concurrent_10_response_90_percentile/concurrent_10_response_90_percentile
Relative Change: 1.28%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_s5_perf_bisect/builds/814
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9006701654756461936


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5882370922643456

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Project Member

Comment 23 by 42576172...@developer.gserviceaccount.com, Jul 19 2016


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : [BlobAsync] Fixed race between IPC messages and IO task queue
Author  : dmurph
Commit description:
  
The message responses to our blob start message are dispatched on the IO
thread but they're not added to the task queue. So a race can happen
where OnMemoryRequest is called before we store the consolidation in our
map.

I moved the StartBuildingBlob message - which signals the browser to
start requesting memory from the renderer - to our IO task which stores
our blob data. This eliminates the race case with a small sacrifice of
construction speed.

In the future, I'd like to be able to still dispatch both IPC messages
right away to keep the fast path, and correctly handle the race case.

BUG= 612358 

Review-Url: https://codereview.chromium.org/1988293002
Cr-Commit-Position: refs/heads/master@{#394549}
Commit  : 65a8fb95a8c71a7be8515f5e1c8125b04a477716
Date    : Wed May 18 21:16:15 2016


===== TESTED REVISIONS =====
Revision         Mean     Std Dev   N  Good?
chromium@394493  21.8325  1.64868   8  good
chromium@394526  20.982   0.318916  5  good
chromium@394543  20.902   0.292203  5  good
chromium@394547  20.719   0.177883  5  good
chromium@394548  20.868   0.264613  5  good
chromium@394549  22.079   0.301069  5  bad    <--
chromium@394551  22.617   0.189328  5  bad
chromium@394559  22.31    0.296202  8  bad
chromium@394624  22.555   0.251471  5  bad
chromium@394755  22.697   0.461771  5  bad

Bisect job ran on: android_s5_perf_bisect
Bug ID: 616998

Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --also-run-disabled-tests service_worker.service_worker_micro_benchmark
Test Metric: concurrent_10_response_90_percentile/concurrent_10_response_90_percentile
Relative Change: 6.08%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_s5_perf_bisect/builds/815
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9006697759983286288


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5810014145478656

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!

Comment 24 by horo@chromium.org, Jul 20 2016

Cc: -roc...@chromium.org horo@chromium.org
Components: Internals>Mojo
Owner: roc...@chromium.org
I think this regression was introduced by ChannelMojo.

When this fetch handler is executed in the ServiceWorker, several IPC messages are sent from the ServiceWorker.
--
this.onfetch = function(event) {
  event.respondWith(new Response('Hello from ServiceWorker!'));
};
--
(1) BlobStorageMsg_RegisterBlobUUID
(2) BlobStorageMsg_StartBuildingBlob
(3) ServiceWorkerHostMsg_FetchEventFinished
(4) BlobHostMsg_DecrementRefCount

Before r389497, the all messages are sent via the IO thread using ChannelPosix.
After r389497, the all messages are sent directly from the ServiceWorker thread using ChannelMojo.
After r394549, (2) is sent via the IO thread.
After r394568, all messages are sent via the IO thread.
Even after r394740 which reverted r394549 all all messages are sent via the IO thread, because of r394568.

         |1|2|3|4|
-r389496 |I|I|I|I| (ChannelPosix)
r389497- |W|W|W|W| (ChannelMojo)
r394549- |W|I|W|W| (ChannelMojo)
r394568- |I|I|I|I| (ChannelMojo)
r394740- |I|I|I|I| (ChannelMojo)

 * W: IPC message is sent directly from the ServiceWorker thread.
 * I: IPC message is sent via the IO thread from the ServiceWorker thread.

This regression happens only when the IO thread is super busy.
So it is not surprising that it is only on Android.

ChannelMojo was enabled in the perf bot by r389497.
At that moment, we saw another regression ( Issue 607034 ).
rockot@ is there anything you plan to do for this?

The regression is old now. According to comment #24 the regression is due to the change from ChannelPosix to ChannelMojo, not due a change in the sending thread (since they are still sent from the IO thread). I don't think we're going to go back from ChannelMojo to ChannelPosix. So we might just have to accept this.
Well we certainly don't plan to go back to using ChannelPosix, but we also care about performance. Are these particular tests representative of something a user is likely to experience, or are they more like microbenchmarks?

In any case these could be useful benchmarks against which to analyze perf details and help us improve mojo performance over time. For example we could expend some effort reducing the amount of internal locking going on, and I suspect that should have some noticeable impact on some benchmarks on some Android devices.
This is a microbenchmark, and it's quite artificial: it's a service worker responding to many XHR requests using respondWith(new Response('hello')).

We basically are using it to monitor recent changes in case something really egregious gets checked in. The plan is that the loading team will create realistic end-to-end page load benchmarks including for pages that use service worker:  issue 489617 

Mergedinto: 726489
Status: Duplicate (was: Assigned)
As previously suggested I don't think there's anything to do about this regression specifically, so I'm duping this into the umbrella Mojo optimization bug.

I don't see the original regression bots (galaxy s5) around still, but I'll include Android service worker benchmarks (e.g. something like [1]) in the list of things to monitor as changes land.

[1] https://chromeperf.appspot.com/report?sid=d2e6e26909e28f899b66b6b97159d3df28ac9081cfdf31096d03452fce1d7391

Sign in to add a comment