Issue metadata
Sign in to add a comment
|
Contradictory Perf Result from Perf Dashboard and Cluster Telemetry |
||||||||||||||||||||||||
Issue descriptionBackground: We are running 'loading.desktop' benchmark w/ and w/o the '--enable-features=NetworkService' flag to make sure we are not regressing performance. Problem: 1. According to Perf Dashboard we have a 17ms, or 12% improvements over 'timeToFirstContentfulPaint_avg'[1]. 2. According to Cluster Telemetry we have a 13% regression over 'timeToFirstContentfulPaint'[2]. Actions Performed: 1. I double-checked my Perf Dashboard script[3] and don't feel that I misplaced _control and _enabled. 2. Cluster Telemetry shows similar regression since the first run[4]. 3. No-patch Cluster Telemetry run[5] shows that the top-level 'timeToFirstContentfulPaint' is reasonably stable. 4. Tracing from Cluster Telemetry shows that there might be some mismatch between 'navigationStart' and 'firstPaint'[6]. Not sure if that's the cause, still investigating (e.g. Frequency). [1] https://chromeperf.appspot.com/report?sid=53254efa729e991a6e8539b43fc4598bb2b3f473575343ff394a73d286e24a9b [2] https://ct.skia.org/results/cluster-telemetry/tasks/chromium_perf_runs/chongz-20180521181534/html/index.html [3] https://cs.chromium.org/chromium/src/tools/perf/contrib/network_service/loading.py [4] https://ct.skia.org/results/cluster-telemetry/tasks/chromium_perf_runs/jam-20180321232641/html/index.html [5] https://ct.skia.org/results/cluster-telemetry/tasks/chromium_perf_runs/chongz-20180328204144/html/index.html [6] Data gathered from [2] - http://www.windowsazure.com (#5262) - 'trace1' w/o and with patch * Trace w/o patch: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/http___www_windowsazure_com_2018-05-21_13-15-21_19152.html a. 'navigationStart', start 331.258 ms, frame "513432E51EF9F5B071F456521A671321", data {documentLoaderURL: "http://www.windowsazure.com/", isLoadingMainFrame: true} b. 'navigationStart', start 4,258.779 ms / 4,258.855 ms, frame "513432E51EF9F5B071F456521A671321", data {documentLoaderURL: "https://azure.microsoft.com/en-us/?v=17.14", isLoadingMainFrame: true} c. 'firstContentfulPaint', start 4,709.618 ms, frame "513432E51EF9F5B071F456521A671321" d. Small 'timeToFirstContentfulPaint' (c - b) * Trace with patch: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/http___www_windowsazure_com_2018-05-21_13-16-45_77905.html a. 'navigationStart', start 516.035 ms, frame "99793CCF63D6759982A4D84E3EBB7DD2", data {documentLoaderURL: http://www.windowsazure.com/, isLoadingMainFrame: true} b. No other 'navigationStart's c. 'firstContentfulPaint', start 5,964.665 ms, frame "99793CCF63D6759982A4D84E3EBB7DD2" d. Large 'timeToFirstContentfulPaint' (c - a)
,
Jun 1 2018
,
Jun 1 2018
hi tim and annie, we would really appreciate if you can assist us here. We are trying to launch network service to canary in M69. Essentially we are flying blind right now. It would be great if you can to let us know why there is a difference between the dashboards and which one we should use to track regressions. Thanks!
,
Jun 1 2018
Is NetworkService controlled by a finch trial? The real source of truth is the experimental data you get from the heartbeat tab in finch. Benchmark results shouldn't block turning on an experiment to gather user data, especially in canary.
,
Jun 1 2018
Re #c4: Just to clarify - Are you saying that Cluster Telemetry runs under finch trial? I don't think we have NetworkService in finch trial, but there might be other features (e.g. MojoLoading, PlzNavigate?) that could be in finch trial and do something different with NetworkService. -OS-Windows +OS-Linux Since both tests run under linux.
,
Jun 1 2018
Cluster telemetry can run any CL (whether it turns on a finch trial or not). What I'm saying is that benchmark results shouldn't block getting performance data on canary. I'm heading out on vacation, but Tim and +Ben can probably help you figure out next steps.
,
Jun 4 2018
Yeah, if data from the lab is confusing, I'd say go ahead and ship a Canary/Dev Finch trial to get some data from the wild. Data from the wild is always more representative.
,
Jun 4 2018
@tim, we are just wondering what the difference between https://ct.skia.org/results/cluster-telemetry/tasks/chromium_perf_runs/chongz-20180521181534/html/index.html and https://chromeperf.appspot.com/report?sid=53254efa729e991a6e8539b43fc4598bb2b3f473575343ff394a73d286e24a9b why 1 shows a 10% increase while the other shows a 10% decrease in performance.
,
Jun 4 2018
It's likely that the change makes some changes pages slower, and some faster. Did you do a perf tryjob? https://chromeperf.appspot.com/report?sid=53254efa729e991a6e8539b43fc4598bb2b3f473575343ff394a73d286e24a9b is just showing me a few graphs, not a perf tryjob results report. I've attached the histogram of FCP deltas for cluster telemetry. Only 1% of pages got faster, which indicates that my hypothesis is likely wrong. Seeing the perf tryjob result will help.
,
Jun 4 2018
Does NetworkService change the NavigationStart timestamp?
,
Jun 4 2018
,
Jun 4 2018
π Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/14b5471e240000
,
Jun 4 2018
π Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/14cb2fb6240000
,
Jun 4 2018
πΏ Pinpoint job stopped with an error. https://pinpoint-dot-chromeperf.appspot.com/job/14cb2fb6240000 Buildbucket says the build completed successfully, but Pinpoint can't find the isolate hash.
,
Jun 4 2018
π Job complete. See results below. https://pinpoint-dot-chromeperf.appspot.com/job/14b5471e240000
,
Jun 6 2018
,
Jun 7 2018
,
Jun 8 2018
+dtu for failing pinpoint job.
,
Jun 8 2018
Re #c10: Thanks for the investigation and sorry for the late reply. (Was sheriffing for the past few days and didn't have space to followup) I don't think the feature changed NavigationStart timestamp, but I noticed a few issues about the timestamp in issue 850258 . Basically (both w/ and w/o patch): 1. (Rare) The NavigationStart timestamp was sampled on the closest 'navigationStart', which could cause mismatch when there are multiple. 2. (Frequent) The NavigationStart timestamp was picked at the very beginning of 'TimelineBasedPageTest.RunNavigateSteps', and counting telemetry overheads. Not sure if it's expected. --- Re email: * I'm running loading.desktop on CT Linux bots, and hopefully we could eliminate hardware differences. * Regarding repro on mobile devices: The feature is probably not ready for Android yet, but I will need to double check.
,
Jun 8 2018
π Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/16ea5701240000
,
Jun 8 2018
π Job complete. See results below. https://pinpoint-dot-chromeperf.appspot.com/job/16ea5701240000
,
Jun 8 2018
Finished the loading.desktop.network_service run on CT bot, the result seems to be similar to the Perf Dashboard (which means hardware difference is not an issue).
e.g. mean('timeToFirstContentfulPaint_avg_diff::values') = -10.17 ms, showing an improvement.
Thus the remaining possibilities are:
1. Infra difference between loading.desktop and loading.cluster_telemetry
* e.g. WebReply, cache, traffic shaping (devtools? tsproxy?), telemetry measurement.
* Observations:
a. loading.desktop has 'SharedDesktopPageState.WillRunStory' before 'SharedDesktopPageState.RunStory', and 'navigationStart' only picked in the latter.
b. mean(loading.desktop/timeToFirstContentfulPaint) = 134.723 ms (Β± 127.026)
mean(loading.cluster_telemetry/timeToFirstContentfulPaint) 23269636.681 / 7895 = 2947.39 ms
2. Story-set issue.
* e.g. `LoadingDesktopStorySet` doesn't represent the web since the TtoFCP is an order of magnitudes smaller.
3. loading.cluster_telemetry has bugs when picking 'navigationStart' ( issue 850258 )
```CT-build85-m5-results-chart.json
"timeToFirstContentfulPaint_avg_diff": {
"summary": {
"std": 67.79791972121144,
"name": "timeToFirstContentfulPaint_avg_diff",
"grouping_keys": {
"cache_temperature": "cold"
},
"improvement_direction": "down",
"important": false,
"values": [
-3.8029999993741512,
305.45799999684095,
1.840999998152256,
4.047999996691942,
-13.333999998867512,
-9.990000005811453,
2.5009999983012676,
6.188999999314547,
-4.213000003248453,
-17.17799999937415,
-5.611999999731779,
11.144000001251698,
-10.63399999588728,
4.993999999016523,
0.9140000008046627,
-11.580000001937151,
-0.1380000002682209,
-46.98499999940395,
-19.024000000208616,
-13.506000004708767,
-33.81300000101328,
1.6299999989569187,
3.344999998807907,
-4.469999998807907,
-1.0480000041425228,
-4.735000003129244,
3.440999999642372,
-0.2579999975860119,
-14.066000003367662,
17.91099999845028,
12.803999997675419,
9.421999998390675,
4.321000002324581,
-8.25900000333786,
9.18300000205636,
12.813000001013279,
-48.552000001072884,
-48.010999999940395,
52.25899999961257,
38.60700000077486,
5.833000004291534,
0.125,
13.543999999761581,
2.5359999984502792,
-4.085999999195337,
0.6580000035464764,
6.000999998301268,
-6.867999996989965,
5.698000002652407,
-1.805000003427267,
-21.95600000023842,
-17.502000000327826,
6.760999999940395,
3.565999999642372,
5.370000001043081,
-5.2870000042021275,
-8.49600000306964,
-4.925999999046326,
3.907999999821186,
-44.35300000011921,
-7.8389999978244305,
0.886999998241663,
-4.960999999195337,
-6.30799999833107,
108.22399999946356,
-84.56200000271201,
1.574999999254942,
2.2200000025331974,
9.925999999046326,
3.2549999989569187,
0.9979999959468842,
1.702999997884035,
3.3680000007152557,
-0.4790000021457672,
-1.1500000022351742,
2.2200000025331974,
-4.279999997466803,
294.79199999943376,
-55.18799999728799,
-10.168999996036291,
-5.885000001639128,
-3.3640000000596046,
-12.230000000447035,
-9.950000002980232,
1.8589999973773956,
0.8070000000298023,
6.387999996542931,
-11.476999998092651,
-357.43299999833107,
10.700000002980232,
-22.607999999076128,
-4.768999997526407,
-368.2859999984503,
-4.9690000005066395,
-48.63800000026822,
-3.8679999969899654,
1.13400000333786,
-11.95099999755621,
-32.5190000012517,
-8.80799999833107,
-32.38399999961257,
-3.0309999994933605,
10.152000002563,
-25.750999998301268,
-26.48699999973178,
1.1920000053942204,
-3.5020000003278255,
-2.190999999642372,
-7.60700000077486,
9.324999999254942,
4.618000000715256,
-0.875,
-10.653000004589558,
-8.605000000447035,
1.5000000037252903,
-21.25899999961257,
12.778000000864267,
-3.7390000000596046,
-5.840000003576279,
9.263999998569489,
-348.21399999782443,
-331.097999997437,
-44.97300000116229,
-46.01500000059605,
9.893999997526407,
-220.25600000098348,
2.1099999994039536,
-8.321999996900558,
2.4109999984502792,
-0.7139999978244305,
-7.057000003755093,
12.40600000321865,
-1.7470000013709068,
-3.4479999989271164,
1.267000000923872,
-8.879999998956919,
-0.8530000001192093,
-5.108999997377396,
-4.623999997973442,
-15.592999994754791,
-26.89700000360608,
7.952999997884035,
-6.6009999960660934,
-6.754999998956919,
3.434000000357628,
4.391000002622604,
-6.456999998539686,
-3.4480000026524067,
-36.479999996721745,
-5.454999998211861,
-2.5,
-24.63900000229478,
-1.505000002682209,
-1.2809999994933605,
0.7120000012218952,
-9.048000000417233,
0.27800000086426735,
-4.717000000178814,
-0.48400000110268593,
6.828999999910593,
4.945999998599291,
13.199000000953674,
6.149999998509884,
7.0030000023543835,
19.70899999886751,
2.7599999979138374,
-13.048000000417233,
1.8640000000596046,
2.3969999998807907,
-9.483000002801418,
-194.80000000074506,
19.20499999821186,
-1.6889999993145466,
7.095000006258488,
-2.527999997138977,
-5.752999998629093,
-3.4639999978244305,
0.25600000098347664,
10.0,
-0.261999998241663,
-27.412999998778105,
8.466000005602837,
-9.789000000804663,
-33.55499999970198,
121.06000000238419,
64.2199999988079,
86.3099999949336,
-39.027000002563,
-0.6490000002086163,
6.900999993085861,
1.199999999254942,
2.753999996930361,
-253.4349999986589,
-53.331999998539686,
5.208999998867512,
7.364000000059605
],
"units": "ms",
"type": "list_of_scalar_values",
"description": "time to first contentful paint"
}
},
```
,
Jun 11 2018
Interesting, that's not what I expected. Let's focus on figuring out the the source of the missing navStart events, hopefully that will clear thing up? In the meantime, can you just go ahead and launch a Finch experiment without blocking on this?
,
Jun 11 2018
Thanks for the comments and suggestions! Yep focusing on navStart sounds good. Iβve discussed with dxie@ and this issue wonβt block Canary launch. Will double check if the feature is ready for Finch experiment.
,
Jun 12 2018
xunjieli@ discovered that Network Service is probably not clearing cache & closing connections correctly for benchmark - issue 838949 (Thanks for the finding!). I will leave this issue for now (since there might be other potential issues making the benchmark not reliable), and revisit when we are ready for Canary / Finch Trial. (100 bugs away, don't want to experiment too early to disturb other teams)
,
Aug 1
The NextAction date has arrived: 2018-08-01
,
Aug 1
,
Aug 31
Latest Cluster Telemetry result shows that we have minimum to none regressions: https://ct.skia.org/results/cluster-telemetry/tasks/chromium_perf_runs/chongz-20180830180335/html/index.html Which matches finch trial UMA: https://uma.googleplex.com/p/chrome/variations/?sid=362e7d650a97a60c75985870ef115409 and latest Perf Dashboard results: https://chromeperf.appspot.com/report?sid=8567732d6c57c94fd29da4899296fb25f6d3943206506b2fc19f1630927ee481 The previous differences may have been cause by missing functionalities similar to issue 874264. Closing as WontFix since we don't see differences anymore.
,
Jan 16
(6 days ago)
,
Jan 16
(6 days ago)
|
|||||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||||
Comment 1 by sullivan@chromium.org
, Jun 1 2018