
Issue 821388

Starred by 1 user

Issue metadata

Status: WontFix
Owner: ----
Closed: May 2018
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression




53.4% regression in loading.desktop at 540494:540624

Project Member Reported by hjd@google.com, Mar 13 2018

Issue description

See the link to graphs below.
 
Project Member

Comment 1 by 42576172...@developer.gserviceaccount.com, Mar 13 2018

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=821388

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=f60a14ce57c4d7cdcc9c0e7415dbe64392fabc806ab9be7742e8986869e0ee0b


Bot(s) for this bug's original alert(s):

linux-release
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Mar 13 2018

Cc: x...@chromium.org xhw...@chromium.org m...@chromium.org
Owner: x...@chromium.org
Status: Assigned (was: Untriaged)
📍 Found a significant difference after 1 commit.
https://pinpoint-dot-chromeperf.appspot.com/job/14c03eee440000

Media remoting cleanup: Remove codes that support encrypted contents. by xjz@chromium.org
https://chromium.googlesource.com/chromium/src/+/acebf6504d2e1d1b051f6f8e9fb128d4c7e404c2

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions
Project Member

Comment 5 by 42576172...@developer.gserviceaccount.com, Mar 13 2018

📍 Found a significant difference after 1 commit.
https://pinpoint-dot-chromeperf.appspot.com/job/1486215e440000

Media remoting cleanup: Remove codes that support encrypted contents. by xjz@chromium.org
https://chromium.googlesource.com/chromium/src/+/acebf6504d2e1d1b051f6f8e9fb128d4c7e404c2

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions

Comment 7 by x...@chromium.org, Mar 14 2018

Status: Started (was: Assigned)
Project Member

Comment 8 by 42576172...@developer.gserviceaccount.com, Mar 14 2018

📍 Couldn't reproduce a difference.
https://pinpoint-dot-chromeperf.appspot.com/job/14d13c0e440000

Comment 10 by x...@chromium.org, Mar 15 2018

Issue 821387 has been merged into this issue.
Project Member

Comment 11 by 42576172...@developer.gserviceaccount.com, Mar 15 2018

📍 Couldn't reproduce a difference.
https://pinpoint-dot-chromeperf.appspot.com/job/11908d76440000
Project Member

Comment 13 by 42576172...@developer.gserviceaccount.com, Mar 15 2018

📍 Couldn't reproduce a difference.
https://pinpoint-dot-chromeperf.appspot.com/job/1303551e440000
Project Member

Comment 15 by 42576172...@developer.gserviceaccount.com, Mar 16 2018

📍 Couldn't reproduce a difference.
https://pinpoint-dot-chromeperf.appspot.com/job/1421fe36440000

Comment 16 by x...@chromium.org, Mar 16 2018

Owner: ----
Status: Untriaged (was: Started)
Sent this back for triage.
My change mainly removes unused code, and Pinpoint couldn't reproduce a difference even with that CL completely reverted (see Comment 15).
Cc: simonhatch@chromium.org dtu@chromium.org
Owner: x...@chromium.org
Status: Assigned (was: Untriaged)
I'm really sorry the messaging on the bug is unclear. +dtu is working on improving the UI and bug messaging for Pinpoint tryjobs. In this case, the "no difference" result actually refers to "success rate" (the percentage of times the benchmark passed), which is super confusing. Currently, you need to click on "Analyze benchmark results" to see all the metrics for the benchmark.

Example from #15:
* Analyze Benchmark Results points here: https://pinpoint-dot-chromeperf.appspot.com/results2/1421fe36440000
* Clicking on "timeToFirstContentfulPaint" in the table, you see 114.779ms at head vs 111.667ms with your CL reverted.
Project Member

Comment 19 by 42576172...@developer.gserviceaccount.com, Mar 28 2018

📍 Found a significant difference after 1 commit.
https://pinpoint-dot-chromeperf.appspot.com/job/107a438b440000

[NOT FOR REVIEW] Revert "Media remoting cleanup: Remove codes that support encrypted contents." by xjz@chromium.org
https://chromium-review.googlesource.com/c/chromium/src/+/963852/3

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions

Comment 20 by dtu@chromium.org, Mar 28 2018

Sorry, the docs may be out of date. If you use the "+" on the Job results page (the one with the graph) it will run a try job with the right metric.
Project Member

Comment 22 by 42576172...@developer.gserviceaccount.com, Mar 29 2018

📍 Couldn't reproduce a difference.
https://pinpoint-dot-chromeperf.appspot.com/job/14bf8297440000

Comment 23 by x...@chromium.org, Mar 29 2018

Owner: sullivan@chromium.org
Pinpoint tests indicate that the CL causes a slight regression on one metric, but a larger improvement on another metric.

I ran Pinpoint with the revert CL twice, once in #15 and once in #22. "timeToFirstContentfulPaint" is 114.779ms vs 111.667ms in #15 and 130.049ms vs 130.338ms in #22, which looks like a slight regression. However, "timeToFirstMeaningfulPaint" is 142.996ms vs 160.390ms in #15 and 236.113ms vs 241.098ms in #22, which indicates a larger improvement.

sullivan@: It seems that my CL doesn't cause a regression as significant as reported (53.4%), which makes sense since the CL mostly removed code that was never called. One change might affect timing a little when loading a video element. However, I did try Pinpoint with that feature disabled in #8, and the test result didn't indicate an improvement either.
Owner: tdres...@chromium.org
tdresser: As metric owner, can you take a look? This is pretty confusing: pinpoint bisect clearly reproduces a large shift, but the perf try job doesn't (see #23). Any idea what's happening here?

Comment 25 by dtu@chromium.org, Mar 29 2018

There are a few additional confounding factors here. The original regression report and bisects were on just two stories (webpages), PremierLeague and ja.wikipedia, whereas the perf try job results are across all stories in the benchmark.

When I break down the try job results by story and cache temperature, I see 122.577 ms to 157.103 ms for PremierLeague / warm.

I also want to note that the bisects show the median, while the perf try job results show the mean. The two can differ significantly for bimodal results like these.
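
To illustrate how the two statistics can diverge, here is a minimal sketch. The sample values are hypothetical (they are not taken from the Pinpoint jobs in this bug); the only assumption is a bimodal distribution where the share of runs landing in the slow mode shifts from 40% to 60%.

```python
from statistics import mean, median

# Hypothetical load times in ms, NOT measured data from this bug.
# Two modes: a fast one near 110 ms and a slow one near 160 ms.
before = [110.0] * 6 + [160.0] * 4  # 60% of runs land in the fast mode
after = [110.0] * 4 + [160.0] * 6   # 40% of runs land in the fast mode


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# The median sits on whichever mode holds the majority of runs, so it
# jumps from one mode to the other when the majority flips.
median_shift = pct_change(median(before), median(after))

# The mean moves in proportion to how many runs changed mode.
mean_shift = pct_change(mean(before), mean(after))

print(f"median shift: {median_shift:.1f}%")
print(f"mean shift:   {mean_shift:.1f}%")
```

In this sketch only two of ten runs changed mode, yet the median moves by the full distance between the modes while the mean moves by a fraction of it.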
Woah. Why do we use different statistics between bisects and tryjobs?
Cc: -xhw...@chromium.org

Comment 28 by dtu@chromium.org, Mar 29 2018

I would say it has to do with the level of detail you want to show.

Internally, bisects use the full distribution, so we don't use any of the summary statistics like mean/median/etc. except for a few edge cases like frame_times.

If you want to reduce the distribution to a single number, like we show in results2, the mean (or trimmed mean) is a better choice, since it's resistant to the aliasing effects of bimodal distributions.

The Pinpoint chart (for display only) is somewhere in between, so the intent is to show the five-number summary (min, Q1, median, Q3, max), though Q1 and Q3 aren't visible on the chart yet (go/catabug/3877).
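
For reference, a five-number summary can be sketched as follows. This is not Pinpoint's implementation; the helper name `five_number_summary`, the sample values, and the choice of the "inclusive" quantile method are all assumptions made for illustration.

```python
from statistics import quantiles


def five_number_summary(samples):
    """Return (min, Q1, median, Q3, max) for a list of numbers.

    Hypothetical helper for illustration; uses the 'inclusive' quartile
    method, which is one of several common conventions.
    """
    q1, med, q3 = quantiles(samples, n=4, method="inclusive")
    return min(samples), q1, med, q3, max(samples)


# Hypothetical bimodal load times in ms, NOT measured data from this bug.
samples = [108.0, 110.0, 112.0, 111.0, 158.0, 160.0, 157.0, 161.0]
lo, q1, med, q3, hi = five_number_summary(samples)
print(f"min={lo} Q1={q1} median={med} Q3={q3} max={hi}")
```

With an evenly split bimodal sample like this one, the median falls in the gap between the two modes, at a value no individual run produced, while Q1 and Q3 each sit inside one of the modes.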
Cc: tdres...@chromium.org
Owner: x...@chromium.org
Gotcha, thanks for clarifying.

I think the next step here is probably to try repro'ing locally. The data is consistent enough that it seems likely there's a real problem here.

xjz@, are you able to try reproducing this regression locally?

Comment 30 by x...@chromium.org, Mar 29 2018

Thanks for the explanation.

I re-checked the tests in #15 and #22 and broke down the results by story and cache temperature as mentioned in #25. They still don't indicate a significant regression. For PremierLeague / warm, the result (head vs head + revert CL) is 108.810ms vs 153.498ms in #15 and 122.577ms vs 157.103ms in #22, which both indicate a >20% improvement from my CL. For PremierLeague / cold, there is no result in #15, and the result in #22 is 164.249ms vs 156.442ms, which indicates a ~5% regression. For ja.wikipedia, the results are not consistent. #15 shows a slight regression: 727.231ms vs 712.028ms for cold, and 87.100ms vs 86.644ms for warm. However, #22 shows a slight improvement: 729.404ms vs 747.102ms for cold, and 91.183ms vs 91.234ms for warm.
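
The percentages above can be checked with a short sketch. The numbers are the PremierLeague values quoted in this comment; the function name `cl_effect_pct` and the convention of measuring head relative to head + revert (negative means the CL made the page faster) are assumptions for illustration.

```python
def cl_effect_pct(head_ms, reverted_ms):
    """Percent change attributable to the CL.

    Hypothetical helper: compares head (with the CL) against head + revert
    (without the CL). Negative means the CL is an improvement.
    """
    return (head_ms - reverted_ms) / reverted_ms * 100.0


# (label, head ms, head + revert CL ms), as quoted in this comment.
results = [
    ("PremierLeague / warm, #15", 108.810, 153.498),
    ("PremierLeague / warm, #22", 122.577, 157.103),
    ("PremierLeague / cold, #22", 164.249, 156.442),
]

effects = {label: cl_effect_pct(head, rev) for label, head, rev in results}
for label, pct in effects.items():
    print(f"{label}: {pct:+.1f}%")
```

Under this convention both warm results come out below -20% and the cold result comes out just under +5%, consistent with the figures stated above.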

Please let me know if I still didn't analyze the results properly. 

Comment 31 by x...@chromium.org, Apr 11 2018

Owner: ----
Status: Untriaged (was: Assigned)
Sent this back for triage. 

I was trying to find the root cause. However, according to the experimental results mentioned in #30, my CL introduces a slight regression (<5%) in some cases, but also improvements (>20%) in other cases. Neither experiment reproduced the significant regression that was reported.

My CL is a refactoring CL that just removed unused code, though it might change the timing of something unrelated, which would be hard for me to track down. For now I don't have any actionable plan other than closing this issue.
Status: WontFix (was: Untriaged)
This looks like it just shifts the timing around from bimodal low to bimodal high.
