New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 750187 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Aug 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression

Blocked on:
issue 754004



Sign in to add a comment

39.7% regression in loading.desktop at 489230:489278

Project Member Reported by majidvp@google.com, Jul 28 2017

Issue description

See the link to graphs below.
 
Project Member

Comment 1 by 42576172...@developer.gserviceaccount.com, Jul 28 2017

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=750187

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=0170136c7fb51e4b3ccb74f351a2586006f5c611efc11619bb0125c143d3eaea


Bot(s) for this bug's original alert(s):

chromium-rel-win7-dual
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Jul 28 2017

Cc: yhirano@chromium.org
Owner: yhirano@chromium.org

=== Auto-CCing suspected CL author yhirano@chromium.org ===

Hi yhirano@chromium.org, the bisect results pointed to your CL, please take a look at the
results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Yutaka Hirano
  Commit : 4f87e237dabe3d4025c52a43885a469a7e872cd2
  Date   : Tue Jul 25 08:13:52 2017
  Subject: Use Independent URLLoader

Bisect Details
  Configuration: win_perf_bisect
  Benchmark    : loading.desktop
  Metric       : timeToFirstContentfulPaint_avg/pcv1-cold/HatenaBookmark
  Change       : 2.39% | 784.559999999 -> 812.188777778

Revision             Result                  N
chromium@489229      784.56 +- 1604.4        14      good
chromium@489240      665.163 +- 159.297      6       good
chromium@489248      656.519 +- 61.6624      6       good
chromium@489251      654.964 +- 70.2163      6       good
chromium@489253      663.522 +- 79.4227      6       good
chromium@489254      813.264 +- 64.2293      14      bad       <--
chromium@489278      812.189 +- 135.3        9       bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=HatenaBookmark loading.desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8972811546857410608


For feedback, file a bug with component Speed>Bisection
Project Member

Comment 4 by 42576172...@developer.gserviceaccount.com, Jul 28 2017

Cc: horo@chromium.org
 Issue 749923  has been merged into this issue.
Status: Assigned (was: Untriaged)
Cc: kinuko@chromium.org yzshen@chromium.org
Components: Blink>Loader Internals>Mojo
Project Member

Comment 8 by 42576172...@developer.gserviceaccount.com, Jul 31 2017


=== BISECT JOB RESULTS ===
NO Perf regression found

Bisect Details
  Configuration: winx64_10_perf_bisect
  Benchmark    : loading.desktop
  Metric       : timeToFirstContentfulPaint_avg/pcv1-warm/Elmundo

Revision             Result                  N
chromium@489274      394.256 +- 36.9881      21      good
chromium@489334      391.154 +- 22.0294      21      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=Elmundo loading.desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8972593505669681264


For feedback, file a bug with component Speed>Bisection
Project Member

Comment 12 by 42576172...@developer.gserviceaccount.com, Jul 31 2017


=== BISECT JOB RESULTS ===
NO Perf regression found

Bisect Details
  Configuration: winx64_10_perf_bisect
  Benchmark    : loading.desktop
  Metric       : timeToFirstContentfulPaint_avg/pcv1-warm/Elmundo

Revision             Result                  N
chromium@489274      394.265 +- 34.5819      21      good
chromium@489334      396.304 +- 41.2866      21      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=Elmundo loading.desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8972580771284937712


For feedback, file a bug with component Speed>Bisection
Project Member

Comment 13 by 42576172...@developer.gserviceaccount.com, Jul 31 2017


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Yutaka Hirano
  Commit : 4f87e237dabe3d4025c52a43885a469a7e872cd2
  Date   : Tue Jul 25 08:13:52 2017
  Subject: Use Independent URLLoader

Bisect Details
  Configuration: win_perf_bisect
  Benchmark    : loading.desktop
  Metric       : timeToFirstContentfulPaint_avg/pcv1-cold/HatenaBookmark
  Change       : 7.50% | 752.212428574 -> 853.535666668

Revision             Result                  N
chromium@489229      752.212 +- 1243.95      14      good
chromium@489240      669.632 +- 163.354      6       good
chromium@489248      689.133 +- 384.399      9       good
chromium@489251      663.098 +- 78.0577      6       good
chromium@489253      662.404 +- 95.3864      6       good
chromium@489254      801.234 +- 163.758      14      bad       <--
chromium@489278      853.536 +- 450.056      9       bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=HatenaBookmark loading.desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8972580778823235760


For feedback, file a bug with component Speed>Bisection
Project Member

Comment 14 by 42576172...@developer.gserviceaccount.com, Jul 31 2017


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Yutaka Hirano
  Commit : 4f87e237dabe3d4025c52a43885a469a7e872cd2
  Date   : Tue Jul 25 08:13:52 2017
  Subject: Use Independent URLLoader

Bisect Details
  Configuration: win_perf_bisect
  Benchmark    : loading.desktop
  Metric       : timeToFirstContentfulPaint_avg/pcv1-cold/HatenaBookmark
  Change       : 1.62% | 786.871333337 -> 799.645444443

Revision             Result                  N
chromium@488500      786.871 +- 1268.42      9      good
chromium@489000      651.71 +- 110.671       6      good
chromium@489250      688.362 +- 546.546      9      good
chromium@489252      655.198 +- 51.5453      6      good
chromium@489253      638.074 +- 56.6773      6      good
chromium@489254      782.135 +- 97.9629      9      bad       <--
chromium@489258      793.984 +- 60.1303      9      bad
chromium@489266      842.29 +- 278.255       9      bad
chromium@489282      789.114 +- 52.9829      9      bad
chromium@489313      855.404 +- 449.677      9      bad
chromium@489375      793.937 +- 98.12        9      bad
chromium@489500      799.645 +- 86.7808      9      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=HatenaBookmark loading.desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8972577330053181728


For feedback, file a bug with component Speed>Bisection
Owner: yzshen@chromium.org
Reported regressions are:

 A. TTFCP on Elmundo
 B. TTFCP on HatenaBookmark
 C. ServiceWorker micro benchmark.

But bisect fails for A and we decided to not care about C (see  issue 728467 ). So B is the only "real" regression.

yzshen@, sorry for bothering you again, but could you take a look at the regression?

Thank you!

Sure. Have you also looked into case B / got any findings?


The regression is observable on *some* windows bots and *some* mac bots (https://chromeperf.appspot.com/report?sid=dd569ae19ac3a9c1a3e2ccf3c8c8723ec3a39b925a9760cd2529e6cf08074a2e&rev=489278), but that's all I know.
Correction: The regression is observable on *some* windows bots. https://chromeperf.appspot.com/report?sid=dd569ae19ac3a9c1a3e2ccf3c8c8723ec3a39b925a9760cd2529e6cf08074a2e&rev=489278

It's not observable on mac and linux bots.
Sorry if this is a silly question: when I looked at the graph:
https://chromeperf.appspot.com/report?sid=369f5804eec7f388050c1732ec4de0134ccb605e1e26754b437210392b061119&rev=489278

The number on the graph seems to be different from the number in the trace. Does anyone know why they are different?

For example, on the graph around the exclamation point icon the number of timeToFirstContentfulPaint_avg is ~850ms.

In the popup window, I clicked "Trace" to open the trace. And then in the "Metrics" side-tab, I chose "loadingMetric" and "avg" in the drop lists. It showed me timeToFirstContentfulPaint 385.750ms.
Answer my own question on #19:

That trace is actually for the "warm" case, while the graph for the "cold" case. Maybe there is a bug in our server serving the wrong trace file.
Status update:

I added some code to count some URLLoader-related activities during the "code" case of HatenaBookmark:
(All on the browser IO thread)
- Bind a URLLoader request: 244 (~58 happened during the time of timeToFirstContentfulPaint)
- URLLoader::FollowRedirect: 32 (~0)
- URLLoader::SetPriority: 7 (~5)

Ken came up with a theory that maybe it is because setting up a cross-process message pipe requires 2 control messages while an associated interface just requires 1.
Ken's experimental CL to reduce message pipe control messages to 1:
https://chromium-review.googlesource.com/c/598744

Perf try jobs didn't show improvements, though. The initial run actually showed worse results, but the test results fluctuated quite significantly so maybe that was just noise. I am running more runs of the tests on the perf bots.

Other than that, I am going to write a perf test that compare the cost of passing a message pipe across process v.s. that of passing an associated interface. And see whether I can find any hot-spots.
Cc: roc...@chromium.org
Project Member

Comment 25 by bugdroid1@chromium.org, Aug 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/60efee1a417c930ff0096c3a60bdd071c182e998

commit 60efee1a417c930ff0096c3a60bdd071c182e998
Author: Yuzhu Shen <yzshen@chromium.org>
Date: Tue Aug 08 21:53:45 2017

Add perf test cases to compare the cost of passing associated and non-associated interfaces.


BUG= 753164 ,  750187 

Change-Id: I1e1c9f4fa889a18ff7078b52e9b346eb19b45db5
Reviewed-on: https://chromium-review.googlesource.com/604873
Reviewed-by: Ken Rockot <rockot@chromium.org>
Reviewed-by: Tom Sepez <tsepez@chromium.org>
Commit-Queue: Yuzhu Shen <yzshen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#492757}
[modify] https://crrev.com/60efee1a417c930ff0096c3a60bdd071c182e998/ipc/ipc_mojo_perftest.cc
[modify] https://crrev.com/60efee1a417c930ff0096c3a60bdd071c182e998/ipc/ipc_test.mojom

Blockedon: 754004
Status: Fixed (was: Assigned)
Status update:

I wrote some perf tests to compare the cost ( crbug.com/753164 ). It showed that setting up non-associated interfaces (i.e., message pipes) is more expensive than associated interfaces. Looking at the profiling data has revealed an issue that base::RandBytes() was quite time consuming on Linux. Ken has kindly landed a fix for that ( crbug.com/754004 ). Other than that I didn't see any low-hanging fruit.

After the fix, there is still perf difference between the two. Imagine we have the following sequence:
- create [associated] interface ptr and request;
- send the [associated] interface request to a different process;
- close the [associated] interface ptr and request.

In the sequence above, 6 control messages are sent for a non-associated interface (for setting up and destroying the message pipe); while only 1 control message for an associated interface.

Even in the same-process case (no control messages at all for non-associated interfaces), non-associated interfaces still take more time than associated ones. (Please see the numbers in  crbug.com/753164 .)

Given that it takes more work to set up a non-associated interface, it is natural that it is slower. Unless we have good ideas how to make further improvements, I will close this bug. Please reopen if you have any concerns or suggestions.



Thank you for the investigation!

Sign in to add a comment