
Issue 752457


Issue metadata

Status: WontFix
Merged: issue 747815
Owner:
Closed: Sep 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression




24.6%-194.2% regression in thread_times.tough_scrolling_cases at 487411:488741

Project Member Reported by sullivan@chromium.org, Aug 4 2017

Issue description

Split from bug 747815, as the two appear to be bisecting to different culprits.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=752457

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=0c9abfc531de2b0742191f917e900819e2a29987e771891618556aa85e6a3177


Bot(s) for this bug's original alert(s):

linux-release
Mergedinto: 747815
Status: Duplicate (was: Untriaged)

=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Florin Malita
  Commit : 2a27a8ddae522a22a22e0aadf56399537b695df1
  Date   : Tue Jul 18 17:13:13 2017
  Subject: Enable Skia's integral-translate bilerp optimization

Bisect Details
  Configuration: linux_perf_bisect
  Benchmark    : thread_times.tough_scrolling_cases
  Metric       : thread_raster_cpu_time_per_frame/canvas_05000_pixels_per_second
  Change       : 27.70% | 0.691850527859 -> 0.883490238341

Revision             Result                     N
chromium@487410      0.691851 +- 0.117418       6      good
chromium@487494      0.679939 +- 0.210904       6      good
chromium@487505      0.698331 +- 0.0806706      6      good
chromium@487506      0.670778 +- 0.0816483      6      good
chromium@487507      0.888985 +- 0.29552        9      bad       <--
chromium@487508      0.875228 +- 0.164776       6      bad
chromium@487510      0.874644 +- 0.128743       6      bad
chromium@487515      0.904772 +- 0.0958558      6      bad
chromium@487536      0.877581 +- 0.0826828      6      bad
chromium@487577      0.932945 +- 0.0666199      6      bad
chromium@487743      0.881487 +- 0.0852376      6      bad
chromium@488076      0.902546 +- 0.144359       6      bad
chromium@488741      0.88349 +- 0.0991412       6      bad
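
The "Change" figure above can be reproduced from the two endpoint means in the table. A minimal Python sketch of that arithmetic, assuming the percentage is the difference relative to the good mean; the function name is illustrative and not part of the bisect tooling:

```python
def relative_change_percent(before: float, after: float) -> float:
    """Percent change from `before` to `after`, relative to `before`."""
    return (after - before) / before * 100.0


# Means reported by the bisect for the endpoints of the range:
# chromium@487410 (good) and chromium@488741 (bad).
good_mean = 0.691850527859
bad_mean = 0.883490238341

change = relative_change_percent(good_mean, bad_mean)
print(f"{change:.2f}%")
```

Note that the per-revision standard deviations are large relative to the means, so the percentage alone says little about how noisy the measurement is.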

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=canvas.05000.pixels.per.second thread_times.tough_scrolling_cases

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8972188362955296624


For feedback, file a bug with component Speed>Bisection
Cc: tdres...@chromium.org
Owner: fmalita@chromium.org
Status: Assigned (was: Duplicate)
Keeping this split off from bug 747815, but there's some context there. +tdresser, owner of the thread_times.tough_scrolling_cases benchmark; see the comments on bug 747815: should we be concerned about this?
Yeah, we've seen this a few times before.
I don't think there's much we can do about this, other than be aware of it. Ideally we'd be able to see some metric improve whenever this happens.

I can imagine the dashboard keeping track of some stats that should remain constant, like trace duration, or number of input events processed, and then indicating if these varied when an alert is detected. Are there other invariants like this that would benefit other benchmarks?
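
As a rough illustration of the invariant idea, here is a minimal Python sketch. It assumes each run can be summarized as a dict of stats that should stay constant; the stat names, the 5% tolerance, and the sample values are all made up for the example and do not come from the dashboard or any real run:

```python
def changed_invariants(before: dict, after: dict, tolerance: float = 0.05) -> list:
    """Return the names of stats whose relative change exceeds `tolerance`."""
    changed = []
    for name, old in before.items():
        new = after[name]
        if old == 0:
            # Avoid dividing by zero: any non-zero value counts as a change.
            drifted = new != 0
        else:
            drifted = abs(new - old) / abs(old) > tolerance
        if drifted:
            changed.append(name)
    return changed


# Hypothetical stats for the runs before and after an alert.
before = {"trace_duration_ms": 10000.0, "input_event_count": 600}
after = {"trace_duration_ms": 10050.0, "input_event_count": 450}

print(changed_invariants(before, after))
```

If a stat like the input event count shows up in the result, the alert may reflect a change in how much work the benchmark did rather than a change in the cost of that work.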
Cc: eakuefner@chromium.org benjhayden@chromium.org simonhatch@chromium.org
This is an interesting idea! +Ben, Ethan, Simon because it seems like there could be a diagnostic around this.

The most similar thing I can think of is power: when it regresses, you want to know if CPU or idle wakeups changed.
The tracing metric measures trace size and peak event rate, but it could be extended to measure the total number of events, etc. Other metrics could build RelatedHistogramMaps to point to tracingMetric's histograms, and/or add sample Breakdown diagnostics to count the number of events that contributed to each sample.
I've been wanting to work with metric authors more closely to help them write good diagnostics:
Loading metric could use some more breakdowns: https://github.com/catapult-project/catapult/issues/3326
Memory metric could use breakdown sample diagnostics.
It didn't seem like there was very much interest in this, so I figured I'd focus on the histogram pipeline for now and revisit these diagnostics after that. Let me know if I should prioritize.

Status: WontFix (was: Assigned)
WontFix-ing per #5
