Issue metadata
Sign in to add a comment
|
39.7% regression in loading.desktop at 489230:489278 |
||||||||||||||||||||||
Issue descriptionSee the link to graphs below.
,
Jul 28 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8972811546857410608
,
Jul 28 2017
=== Auto-CCing suspected CL author yhirano@chromium.org === Hi yhirano@chromium.org, the bisect results pointed to your CL, please take a look at the results. === BISECT JOB RESULTS === Perf regression found with culprit Suspected Commit Author : Yutaka Hirano Commit : 4f87e237dabe3d4025c52a43885a469a7e872cd2 Date : Tue Jul 25 08:13:52 2017 Subject: Use Independent URLLoader Bisect Details Configuration: win_perf_bisect Benchmark : loading.desktop Metric : timeToFirstContentfulPaint_avg/pcv1-cold/HatenaBookmark Change : 2.39% | 784.559999999 -> 812.188777778 Revision Result N chromium@489229 784.56 +- 1604.4 14 good chromium@489240 665.163 +- 159.297 6 good chromium@489248 656.519 +- 61.6624 6 good chromium@489251 654.964 +- 70.2163 6 good chromium@489253 663.522 +- 79.4227 6 good chromium@489254 813.264 +- 64.2293 14 bad <-- chromium@489278 812.189 +- 135.3 9 bad To Run This Test src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=HatenaBookmark loading.desktop More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8972811546857410608 For feedback, file a bug with component Speed>Bisection
,
Jul 28 2017
,
Jul 31 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8972593505669681264
,
Jul 31 2017
,
Jul 31 2017
,
Jul 31 2017
=== BISECT JOB RESULTS === NO Perf regression found Bisect Details Configuration: winx64_10_perf_bisect Benchmark : loading.desktop Metric : timeToFirstContentfulPaint_avg/pcv1-warm/Elmundo Revision Result N chromium@489274 394.256 +- 36.9881 21 good chromium@489334 391.154 +- 22.0294 21 bad To Run This Test src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=Elmundo loading.desktop More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8972593505669681264 For feedback, file a bug with component Speed>Bisection
,
Jul 31 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8972580778823235760
,
Jul 31 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8972580771284937712
,
Jul 31 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8972577330053181728
,
Jul 31 2017
=== BISECT JOB RESULTS === NO Perf regression found Bisect Details Configuration: winx64_10_perf_bisect Benchmark : loading.desktop Metric : timeToFirstContentfulPaint_avg/pcv1-warm/Elmundo Revision Result N chromium@489274 394.265 +- 34.5819 21 good chromium@489334 396.304 +- 41.2866 21 bad To Run This Test src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=Elmundo loading.desktop More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8972580771284937712 For feedback, file a bug with component Speed>Bisection
,
Jul 31 2017
=== BISECT JOB RESULTS === Perf regression found with culprit Suspected Commit Author : Yutaka Hirano Commit : 4f87e237dabe3d4025c52a43885a469a7e872cd2 Date : Tue Jul 25 08:13:52 2017 Subject: Use Independent URLLoader Bisect Details Configuration: win_perf_bisect Benchmark : loading.desktop Metric : timeToFirstContentfulPaint_avg/pcv1-cold/HatenaBookmark Change : 7.50% | 752.212428574 -> 853.535666668 Revision Result N chromium@489229 752.212 +- 1243.95 14 good chromium@489240 669.632 +- 163.354 6 good chromium@489248 689.133 +- 384.399 9 good chromium@489251 663.098 +- 78.0577 6 good chromium@489253 662.404 +- 95.3864 6 good chromium@489254 801.234 +- 163.758 14 bad <-- chromium@489278 853.536 +- 450.056 9 bad To Run This Test src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=HatenaBookmark loading.desktop More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8972580778823235760 For feedback, file a bug with component Speed>Bisection
,
Jul 31 2017
=== BISECT JOB RESULTS === Perf regression found with culprit Suspected Commit Author : Yutaka Hirano Commit : 4f87e237dabe3d4025c52a43885a469a7e872cd2 Date : Tue Jul 25 08:13:52 2017 Subject: Use Independent URLLoader Bisect Details Configuration: win_perf_bisect Benchmark : loading.desktop Metric : timeToFirstContentfulPaint_avg/pcv1-cold/HatenaBookmark Change : 1.62% | 786.871333337 -> 799.645444443 Revision Result N chromium@488500 786.871 +- 1268.42 9 good chromium@489000 651.71 +- 110.671 6 good chromium@489250 688.362 +- 546.546 9 good chromium@489252 655.198 +- 51.5453 6 good chromium@489253 638.074 +- 56.6773 6 good chromium@489254 782.135 +- 97.9629 9 bad <-- chromium@489258 793.984 +- 60.1303 9 bad chromium@489266 842.29 +- 278.255 9 bad chromium@489282 789.114 +- 52.9829 9 bad chromium@489313 855.404 +- 449.677 9 bad chromium@489375 793.937 +- 98.12 9 bad chromium@489500 799.645 +- 86.7808 9 bad To Run This Test src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=HatenaBookmark loading.desktop More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8972577330053181728 For feedback, file a bug with component Speed>Bisection
,
Jul 31 2017
Reported regressions are: A. TTFCP on Elmundo B. TTFCP on HatenaBookmark C. ServiceWorker micro benchmark. But bisect fails for A and we decided to not care about C (see issue 728467 ). So B is the only "real" regression. yzshen@, sorry for bothering you again, but could you take a look at the regression? Thank you!
,
Jul 31 2017
Sure. Have you also looked into case B / got any findings?
,
Aug 1 2017
The regression is observable on *some* windows bots and *some* mac bots (https://chromeperf.appspot.com/report?sid=dd569ae19ac3a9c1a3e2ccf3c8c8723ec3a39b925a9760cd2529e6cf08074a2e&rev=489278), but that's all I know.
,
Aug 1 2017
Correction: The regression is observable on *some* windows bots. https://chromeperf.appspot.com/report?sid=dd569ae19ac3a9c1a3e2ccf3c8c8723ec3a39b925a9760cd2529e6cf08074a2e&rev=489278 It's not observable on mac and linux bots.
,
Aug 1 2017
Sorry if this is a silly question: when I looked at the graph: https://chromeperf.appspot.com/report?sid=369f5804eec7f388050c1732ec4de0134ccb605e1e26754b437210392b061119&rev=489278 The number on the graph seems to be different from the number in the trace. Does anyone know why they are different? For example, on the graph around the exclamation point icon the number of timeToFirstContentfulPaint_avg is ~850ms. In the popup window, I clicked "Trace" to open the trace. And then in the "Metrics" side-tab, I chose "loadingMetric" and "avg" in the drop lists. It showed me timeToFirstContentfulPaint 385.750ms.
,
Aug 1 2017
Answer my own question on #19: That trace is actually for the "warm" case, while the graph for the "cold" case. Maybe there is a bug in our server serving the wrong trace file.
,
Aug 2 2017
According to [1] (found in [2], "Buildbot status page" link), there are two traces ([3], [4]) for the "cold" case. TTFCP on [3] is 641.159ms and TTFCP on [4] is 1018.998ms. The average of them is 830.0785 which almost matches with TTFCP_avg in the graph. 1: https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FWin_7_Perf%2F981%2F%2B%2Frecipes%2Fsteps%2Floading.desktop_on__102b__GPU_on_Windows_on_Windows-2008ServerR2-SP1%2F0%2Fstdout 2: http://build.chromium.org/p/chromium.perf/builders/Win%207%20Perf/builds/981 3: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_36-2017-07-25_09-09-33-41766.html 4: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_154-2017-07-25_09-13-50-80468.html
,
Aug 2 2017
Yutaka: Thanks for looking! You are right. So the following are the "cold" case traces before/after the CL: Before (associated interface): https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_36-2017-07-25_02-39-32-70609.html https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_154-2017-07-25_02-43-29-81775.html After (non-associated interface): https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_36-2017-07-25_09-09-33-41766.html https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_154-2017-07-25_09-13-50-80468.html The first three traces have mostly the same timeToFirstContentfulPaint (<700). But the last one is 1018.998. Looking...
,
Aug 3 2017
Status update: I added some code to count some URLLoader-related activities during the "code" case of HatenaBookmark: (All on the browser IO thread) - Bind a URLLoader request: 244 (~58 happened during the time of timeToFirstContentfulPaint) - URLLoader::FollowRedirect: 32 (~0) - URLLoader::SetPriority: 7 (~5) Ken came up with a theory that maybe it is because setting up a cross-process message pipe requires 2 control messages while an associated interface just requires 1. Ken's experimental CL to reduce message pipe control messages to 1: https://chromium-review.googlesource.com/c/598744 Perf try jobs didn't show improvements, though. The initial run actually showed worse results, but the test results fluctuated quite significantly so maybe that was just noise. I am running more runs of the tests on the perf bots. Other than that, I am going to write a perf test that compare the cost of passing a message pipe across process v.s. that of passing an associated interface. And see whether I can find any hot-spots.
,
Aug 3 2017
,
Aug 8 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/60efee1a417c930ff0096c3a60bdd071c182e998 commit 60efee1a417c930ff0096c3a60bdd071c182e998 Author: Yuzhu Shen <yzshen@chromium.org> Date: Tue Aug 08 21:53:45 2017 Add perf test cases to compare the cost of passing associated and non-associated interfaces. BUG= 753164 , 750187 Change-Id: I1e1c9f4fa889a18ff7078b52e9b346eb19b45db5 Reviewed-on: https://chromium-review.googlesource.com/604873 Reviewed-by: Ken Rockot <rockot@chromium.org> Reviewed-by: Tom Sepez <tsepez@chromium.org> Commit-Queue: Yuzhu Shen <yzshen@chromium.org> Cr-Commit-Position: refs/heads/master@{#492757} [modify] https://crrev.com/60efee1a417c930ff0096c3a60bdd071c182e998/ipc/ipc_mojo_perftest.cc [modify] https://crrev.com/60efee1a417c930ff0096c3a60bdd071c182e998/ipc/ipc_test.mojom
,
Aug 9 2017
,
Aug 14 2017
Status update: I wrote some perf tests to compare the cost ( crbug.com/753164 ). It showed that setting up non-associated interfaces (i.e., message pipes) is more expensive than associated interfaces. Looking at the profiling data has revealed an issue that base::RandBytes() was quite time consuming on Linux. Ken has kindly landed a fix for that ( crbug.com/754004 ). Other than that I didn't see any low-hanging fruit. After the fix, there is still perf difference between the two. Imagine we have the following sequence: - create [associated] interface ptr and request; - send the [associated] interface request to a different process; - close the [associated] interface ptr and request. In the sequence above, 6 control messages are sent for a non-associated interface (for setting up and destroying the message pipe); while only 1 control message for an associated interface. Even in the same-process case (no control messages at all for non-associated interfaces), non-associated interfaces still take more time than associated ones. (Please see the numbers in crbug.com/753164 .) Given that it takes more work to set up a non-associated interface, it is natural that it is slower. Unless we have good ideas how to make further improvements, I will close this bug. Please reopen if you have any concerns or suggestions.
,
Aug 15 2017
Thank you for the investigation! |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Jul 28 2017