Issue 712208

Starred by 1 user

Issue metadata

Status: Archived
Owner: ----
Closed: Aug 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows
Pri: 2
Type: Bug-Regression



sunspider failing on 6 builders

Project Member Reported by fmea...@chromium.org, Apr 17 2017

Issue description

Launching a regression bisect on benchmark_duration, since the benchmark currently times out.
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Apr 17 2017


=== BISECT JOB RESULTS ===
NO Perf regression found

Bisect Details
  Configuration: linux_perf_bisect
  Benchmark    : sunspider
  Metric       : benchmark_duration/benchmark_duration

Revision             Result                  N
chromium@464246      0.350898 +- 0.3909      21      good
chromium@464349      2.29648 +- 11.3725      21      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests sunspider

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8982048640897775440

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5856316829990912


| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection.  Thank you!
Project Member

Comment 5 by 42576172...@developer.gserviceaccount.com, Apr 17 2017


=== BISECT JOB RESULTS ===
NO Perf regression found

Bisect Details
  Configuration: linux_perf_bisect
  Benchmark    : sunspider
  Metric       : benchmark_duration/benchmark_duration

Revision             Result                   N
chromium@464246      0.33116 +- 0.178776      21      good
chromium@464349      1.80441 +- 10.637        21      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests sunspider

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8982042919529290288

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5856316829990912


Trying a return code bisect
Project Member

Comment 8 by 42576172...@developer.gserviceaccount.com, Apr 17 2017


=== BISECT JOB RESULTS ===
NO Test failure found

Bisect Details
  Configuration: linux_perf_bisect
  Benchmark    : sunspider
  Metric       : benchmark_duration/benchmark_duration

Revision             Exit Code      N
chromium@464246      0 +- N/A       5      good
chromium@464349      0 +- N/A       5      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests sunspider

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8982035951409017744

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5803615232458752


Cc: sullivan@chromium.org simonhatch@chromium.org
I'm not sure where you got the revision range for the return code bisect? The debug button here:
https://chromeperf.appspot.com/report?sid=aac0dd8b335248f537b95064ae5ee96d24d03480392f33be8586e16dff251fbf

Says r464922:r465010. I'm going to try bisecting on that range.
The range r464246:r464349 seems to match the chromium revisions on https://build.chromium.org/p/chromium.perf/builders/Linux%20Perf where sunspider seemingly first started failing, i.e. on builds 564/565.

Sunspider had a successful run on build 577, so this looks flaky?
Yes, it is flaky but mostly failing. I changed the ranges in case the flakiness started earlier than the first failure. Most of my ranges are based on the benchmark_duration spike in the graph, assuming that a failure triggers the timeout, which in turn affects benchmark_duration.
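
To illustrate the reasoning above (a failure triggers the timeout, which inflates benchmark_duration), here is a minimal Python sketch. The durations are hypothetical, in arbitrary units, and are not the bots' data. It only shows how a single timed-out run among otherwise fast runs raises the mean and pushes the spread above it, as in the "bad" rows of the bisect tables.

```python
import statistics

# Hypothetical durations (arbitrary units): these values are illustrative
# assumptions, not measurements from the perf bots.
NORMAL_RUN = 0.35
TIMED_OUT_RUN = 5.0


def summarize(durations):
    """Return (mean, sample standard deviation) for a list of durations."""
    return statistics.mean(durations), statistics.stdev(durations)


# 21 runs that all finish normally, versus 20 normal runs plus one timeout.
all_passing = [NORMAL_RUN] * 21
one_timeout = [NORMAL_RUN] * 20 + [TIMED_OUT_RUN]

if __name__ == "__main__":
    for label, runs in (("all passing", all_passing),
                        ("one timeout", one_timeout)):
        mean, spread = summarize(runs)
        print(f"{label}: {mean:.3f} +- {spread:.3f}")
```

A spread larger than the mean is a hint that a few outlier runs, rather than a uniform slowdown, dominate the metric.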
Project Member

Comment 16 by 42576172...@developer.gserviceaccount.com, Apr 18 2017


=== BISECT JOB RESULTS ===
NO Test failure found

Bisect Details
  Configuration: linux_perf_bisect
  Benchmark    : sunspider
  Metric       : Total/Total

Revision             Exit Code      N
chromium@464922      1 +- N/A       20      good
chromium@465010      1 +- N/A       20      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests sunspider

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8981959046088236400

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5888498818613248


Project Member

Comment 17 by 42576172...@developer.gserviceaccount.com, Apr 18 2017


=== BISECT JOB RESULTS ===
NO Perf regression found

Bisect Details
  Configuration: linux_perf_bisect
  Benchmark    : sunspider
  Metric       : benchmark_duration/benchmark_duration

Revision             Result                    N
chromium@461528      0.356595 +- 0.440258      21      good
chromium@464350      2.0583 +- 11.0596         21      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests sunspider

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8981964097060799168

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5800358237962240


Project Member

Comment 19 by 42576172...@developer.gserviceaccount.com, Apr 19 2017


=== BISECT JOB RESULTS ===
NO Perf regression found

Bisect Details
  Configuration: linux_perf_bisect
  Benchmark    : sunspider
  Metric       : benchmark_duration/benchmark_duration

Revision             Result                    N
chromium@461528      0.352703 +- 0.425653      21      good
chromium@464350      1.07455 +- 8.30167        21      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests sunspider

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8981944772014837216

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5800358237962240


Cc: benhenry@chromium.org hablich@chromium.org
I think we are going to have to disable the sunspider benchmark: it is no longer passing, but it is flaky enough that the failure doesn't bisect.
(also failing on android)
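
As a rough sketch of why a flaky failure is hard to bisect: if a revision fails with some probability per run, a finite number of repeats can still come back all green, so a broken revision may be classified as good. The Python below assumes independent runs with a fixed failure rate; both function names and the example rates are hypothetical and are not taken from the bisect tooling.

```python
import math


def prob_all_pass(failure_rate, runs):
    """Probability that `runs` independent runs all pass."""
    return (1.0 - failure_rate) ** runs


def runs_needed(failure_rate, confidence=0.95):
    """Smallest number of runs that shows at least one failure with the
    given confidence, assuming independent runs (an assumption)."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # Example rates are illustrative only.
    for rate in (0.5, 0.1):
        print(rate, prob_all_pass(rate, 5), runs_needed(rate))
```

Under these assumptions, a low per-run failure rate needs many more repeats per revision than the 5 to 21 runs used in the bisects above.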
Project Member

Comment 22 by bugdroid1@chromium.org, Apr 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/19045842e1eff8063bd5aca05c8a94aa2035f7a5

commit 19045842e1eff8063bd5aca05c8a94aa2035f7a5
Author: sullivan <sullivan@chromium.org>
Date: Thu Apr 27 00:04:35 2017

Disable failing sunspider test on all platforms

BUG= 712208 
TBR=nednguyen@google.com

Review-Url: https://codereview.chromium.org/2844003003
Cr-Commit-Position: refs/heads/master@{#467521}

[modify] https://crrev.com/19045842e1eff8063bd5aca05c8a94aa2035f7a5/tools/perf/benchmarks/sunspider.py

Cc: bmeu...@chromium.org
Owner: mvstan...@chromium.org
Status: Assigned (was: Available)
Looks like the JS condition

window.location.pathname.indexOf("results.html") >= 0 && typeof(output) != "undefined"

is not true anymore. Reference https://codesearch.chromium.org/chromium/src/tools/perf/benchmarks/sunspider.py?rcl=659c8284c33baf483121f1248d5aaa989d5dabe0&l=95

Adding benchmark owners.
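
For readers unfamiliar with this kind of check, here is a minimal Python sketch of a poll-until-true wait with a timeout. It is not the Telemetry implementation: `wait_for_condition`, `results_ready`, and the `page` dictionary are hypothetical stand-ins that mirror the JS expression above, and the 300-second default is the timeout mentioned later in this thread.

```python
import time


def results_ready(page):
    """Mirror of the JS condition: the path contains results.html and an
    `output` value is defined. `page` is a hypothetical dict."""
    return "results.html" in page.get("pathname", "") and "output" in page


def wait_for_condition(predicate, timeout_s=300.0, poll_interval_s=0.1,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll `predicate` until it returns True or `timeout_s` elapses.

    Returns True if the condition was met, False on timeout.
    `clock` and `sleep` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


if __name__ == "__main__":
    page = {"pathname": "/results.html", "output": []}
    print(wait_for_condition(lambda: results_ready(page), timeout_s=1.0))
```

If the page reaches results.html but `output` never becomes defined, a wait like this returns False only after the full timeout, which shows up as a multi-minute failure rather than a fast one.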
Cc: mvstan...@chromium.org
Owner: bmeu...@chromium.org
Assigning to perf sheriff
Cc: -bmeu...@chromium.org
Owner: sullivan@chromium.org
Works for me. hablich@ thinks it might be infra related.
Background: The JavaScript expression still seems to fire correctly when the test runs on a workstation, without any flakiness. It seems the 300-second timeout is no longer enough, though. It is strange that a 5-minute timeout is not enough when the benchmark runs locally in about 15 seconds. This hints that something on the hardware side has changed.
Cc: martiniss@chromium.org
Owner: nedngu...@google.com
Ned, Stephen: see #26. Can you take a look at why this test doesn't finish on the bots?
Owner: ----
Status: Available (was: Assigned)
There is no reason to believe that this is a bot problem, given that the "reference" benchmark is running fine on the same bot.

https://build.chromium.org/p/chromium.perf/builders/Win%2010%20Perf/builds/722
sunspider fails after 5 minutes
sunspider.reference passes after 30s

To me the flakiness is more likely to be a Chrome binary bug. Maybe v8, maybe blink, I am not sure.
This graph shows the clear differences between the two binaries: https://chromeperf.appspot.com/report?sid=221fe881082a4a03021a16b78cc82e5ef4645370a14876e4aafc0c927e7a3e04
Cc: -martiniss@chromium.org
Hmmm, I doubt this is V8 related. I looked at a few of the spikes and the V8 changes are not related or there are no changes at all (e.g. http://test-results.appspot.com/revision_range?start=464857&end=464866).

If it were V8-related, the JavaScript expression would never succeed.

Is the ref build running the same Telemetry/Catapult?
Yes, it does. Everything is exactly the same, except Chrome binary.

Note that since this is a flaky failure type, I don't think looking at the spike point in the graph can help us find out the culprit.
Based on the debug screenshot https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/profiler-file-id_0-2017-04-13_12-50-0770910.png, it looks like the test finished but the output results are not in the DOM.

Probably some blink bug?
Cc: mlippautz@chromium.org haraken@chromium.org
Possibly, or maybe a GC (V8 or Oilpan) thing. We had one GC bug in the past where DOM elements were collected even though they were not dead. That bug was fixed long ago, though, and I would be surprised if it triggered only on Sunspider.

+mlippautz and +haraken to confirm that this is not the case here.
Hard to say. If it's only happening on M57 then it is likely the bug.

We also recently had some hiccups after a bindings change, which got reverted in 4a97922bec93566369465c2e3d7facc45d165005. Is it gone after this one?

(I don't have a full bindings overview; only the stuff that is related to wrapper tracing.)
Project Member

Comment 36 by bugdroid1@chromium.org, May 15 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/14232dcc19c1dc66c2adad323dfda282b69cb0c8

commit 14232dcc19c1dc66c2adad323dfda282b69cb0c8
Author: nednguyen <nednguyen@google.com>
Date: Mon May 15 17:14:04 2017

Remove sunspider benchmark

Since this benchmark doesn't use press benchmark harness, we clean it up to reduce
technical debt (also see https://bugs.chromium.org/p/chromium/issues/detail?id=708103#c13 for
further context). In addition, the benchmark has been disabled everywhere
due to a crash bug ( crbug.com/712208 ), removing this won't reduce the
current coverage anyway.

BUG= 712208 , 714231

Review-Url: https://codereview.chromium.org/2874983003
Cr-Commit-Position: refs/heads/master@{#471812}

[modify] https://crrev.com/14232dcc19c1dc66c2adad323dfda282b69cb0c8/tools/perf/benchmark.csv
[delete] https://crrev.com/8fef90770e83bf54f895c971ff08d5016ddabefc/tools/perf/benchmarks/sunspider.py

Status: Archived (was: Available)
