
Issue 638012

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: Aug 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug-Regression




16.1% regression in v8.infinite_scroll_tbmv2 at 411850:411870

Project Member Reported by benjhayden@chromium.org, Aug 15 2016

Issue description

Did the discourse page change?
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=638012

Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgICgjq3asAkM


Bot(s) for this bug's original alert(s):

chromium-rel-mac10
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Aug 16 2016


===== BISECT JOB RESULTS =====
Status: completed


=== Bisection aborted ===
The bisect was aborted because the metric values for the initial "good" and "bad" revisions do not represent a clear regression.
Please contact the team (see below) if you believe this is in error.

=== Warnings ===
The following warnings were raised by the bisect job:

 * Bisect failed to reproduce the regression with enough confidence.

===== TESTED REVISIONS =====
Revision         Mean      Std Dev  N   Good?
chromium@411849  76145191  6817540  12  good
chromium@411870  75727971  7010140  18  bad

Bisect job ran on: mac_10_10_perf_bisect
Bug ID: 638012

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests v8.infinite_scroll_tbmv2
Test Metric: memory:chrome:all_processes:reported_by_chrome:v8:effective_size_max/memory:chrome:all_processes:reported_by_chrome:v8:effective_size_max
Relative Change: 6.59%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/mac_10_10_perf_bisect/builds/2301
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9004219612618019024


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5228703013928960

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Cc: fmea...@chromium.org petrcermak@chromium.org
fmeawad, petr, does this look like a real regression in the discourse page for v8 memory?
Cc: u...@chromium.org
This looks like a real regression to me. I have also checked the V8 rolls and nothing there is a red flag. The catapult roll does not change the metric, and both of them are still running Yosemite (some bots migrated to El Capitan recently).

Adding Ulan.

Yes, it looks like a genuine regression in the "discourse" story. I kicked off a bisect on that single story.
Labels: OS-Mac
Project Member

Comment 9 by 42576172...@developer.gserviceaccount.com, Aug 17 2016


===== BISECT JOB RESULTS =====
Status: completed


=== Bisection aborted ===
The bisect was aborted because the metric values for the initial "good" and "bad" revisions do not represent a clear regression.
Please contact the team (see below) if you believe this is in error.

=== Warnings ===
The following warnings were raised by the bisect job:

 * Bisect failed to reproduce the regression with enough confidence.

===== TESTED REVISIONS =====
Revision         Mean      Std Dev   N   Good?
chromium@411849  53264384  19541121  12  good
chromium@411870  57346341  24853368  14  bad

Bisect job ran on: mac_10_10_perf_bisect
Bug ID: 638012

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests v8.infinite_scroll_tbmv2
Test Metric: memory:chrome:all_processes:reported_by_chrome:v8:effective_size_max/tumblr
Relative Change: 1.57%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/mac_10_10_perf_bisect/builds/2312
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9004076830297826768


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5910730331652096

Owner: petrcermak@chromium.org
Running a wider bisect ↑
Project Member

Comment 13 by 42576172...@developer.gserviceaccount.com, Aug 19 2016


===== BISECT JOB RESULTS =====
Status: completed


=== Bisection aborted ===
The bisect was aborted because the metric values for the initial "good" and "bad" revisions do not represent a clear regression.
Please contact the team (see below) if you believe this is in error.

=== Warnings ===
The following warnings were raised by the bisect job:

 * Bisect failed to reproduce the regression with enough confidence.

===== TESTED REVISIONS =====
Revision         Mean      Std Dev   N   Good?
chromium@411737  44788395  14600410  12  good
chromium@411870  47468089  14103963  18  bad

Bisect job ran on: mac_10_10_perf_bisect
Bug ID: 638012

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests v8.infinite_scroll_tbmv2
Test Metric: memory:chrome:all_processes:reported_by_chrome:v8:effective_size_avg/tumblr
Relative Change: 6.51%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/mac_10_10_perf_bisect/builds/2314
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9003908029181022976


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5318588693479424

The bisect can't reproduce the regression at all (even though it's very clear on the dashboard). I kicked off another bisect (#14) to check whether the bisect can reproduce recent values from the dashboard.
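One way to see why the bisect bot reported no clear regression is to run Welch's t-test on the summary statistics from the Comment 13 table. This is only a minimal sketch: the helper `welch_t` is a hypothetical name, and the thread does not say which confidence check the bisect bot actually uses, so this should not be read as the bot's own method.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    var_a = sd_a ** 2 / n_a  # squared standard error of sample A
    var_b = sd_b ** 2 / n_b  # squared standard error of sample B
    t = (mean_b - mean_a) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t, df


# (mean, std dev, N) in bytes, copied from the Comment 13 table.
good = (44788395, 14600410, 12)  # chromium@411737
bad = (47468089, 14103963, 18)   # chromium@411870

t, df = welch_t(*good, *bad)
print(f"t = {t:.2f}, df = {df:.1f}")
```

With standard deviations of roughly 14 MB on means of roughly 45 MB, the difference between the two revisions is well inside the noise: the t statistic stays far below the value of about 2 that is commonly used as a rough significance threshold.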
Project Member

Comment 16 by 42576172...@developer.gserviceaccount.com, Aug 19 2016


===== BISECT JOB RESULTS =====
Status: completed


=== Bisection aborted ===
The bisect was aborted because the metric values for the initial "good" and "bad" revisions do not represent a clear regression.
Please contact the team (see below) if you believe this is in error.

=== Warnings ===
The following warnings were raised by the bisect job:

 * Bisect failed to reproduce the regression with enough confidence.

===== TESTED REVISIONS =====
Revision         Mean       Std Dev   N   Good?
chromium@412644  100887211  16740436  12  good
chromium@412670  100013397  14323265  18  bad

Bisect job ran on: mac_10_10_perf_bisect
Bug ID: 638012

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests v8.infinite_scroll_tbmv2
Test Metric: memory:chrome:all_processes:reported_by_chrome:v8:effective_size_max/tumblr
Relative Change: 0.46%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/mac_10_10_perf_bisect/builds/2315
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9003886704654578432


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5906878551293952

Cc: pras...@chromium.org robert...@chromium.org
Prasad, Roberto, check comment #15.
It turns out that the bisect can roughly match the numbers on the dashboard at r412644-r412670, so let's do a wider bisect...
Project Member

Comment 20 by 42576172...@developer.gserviceaccount.com, Aug 23 2016

Cc: csharrison@chromium.org
Owner: csharrison@chromium.org

=== Auto-CCing suspected CL author csharrison@chromium.org ===

Hi csharrison@chromium.org, the bisect results pointed to your CL below as possibly
causing a regression. Please have a look at this info and see whether
your CL may be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Add testing configs for ParseHTMLOnMainThread experiment
Author  : csharrison
Commit description:
  
BUG=623165

Review-Url: https://codereview.chromium.org/2221193002
Cr-Commit-Position: refs/heads/master@{#411880}
Commit  : 46be1b831ffec878df1b258a4f26872451d7795e
Date    : Sat Aug 13 06:18:55 2016


===== TESTED REVISIONS =====
Revision         Mean       Std Dev   N   Good?
chromium@411849  69196914   25602722  18  good
chromium@411874  53439147   17197683  12  good
chromium@411878  77863481   26772025  12  good
chromium@411879  52477952   3852714   5   good
chromium@411880  106479616  8606225   12  bad    <--
chromium@411881  100974592  26802496  8   bad
chromium@411887  105955328  9408012   5   bad
chromium@411899  110359347  468937    5   bad
chromium@411949  110411776  741455    8   bad
chromium@412048  106165043  9506832   5   bad
chromium@412247  97217195   11166806  18  bad
chromium@412644  103945557  8628939   12  bad

Bisect job ran on: mac_10_10_perf_bisect
Bug ID: 638012

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests v8.infinite_scroll_tbmv2
Test Metric: memory:chrome:all_processes:reported_by_chrome:v8:effective_size_max/tumblr
Relative Change: 34.45%
Score: 99.5

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/mac_10_10_perf_bisect/builds/2321
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9003610503062281328


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5899454129897472

csharrison: I assume your patch (r411880) only modifies testing code that doesn't run in production, right?
Cc: kouhei@chromium.org
No, it made the bots use my experimental feature. It is reasonable that that CL caused a regression, but I'm not sure how it could cause a memory regression (overall).

The experiment pulls the background parser into the main thread (from the background thread). Do the memory metrics fully account for off-thread memory allocations?

+kouhei for any ideas.
I see the change in the effective_size metric of the "discourse" story. The story seems to have two modes, and the CL has fixed the mode to the "regression" mode.
This is totally possible if the memory consumption depends on particular task scheduling, as the patch changes how parser tasks are scheduled. However, I think it is questionable if we should treat this as a regression.
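The "two modes" reading can be checked loosely against the Comment 20 table: if runs at a revision flip between a low and a high mode, the standard deviation should be large relative to the mean, and it should shrink once runs settle in one mode. The sketch below compares that ratio for good and bad revisions. It works only from the published summary rows, so it is consistent with the two-mode explanation rather than proof of it; individual rows (for example 411879 and 411881) do not follow the pattern.

```python
from statistics import median

# (revision, mean, std dev, verdict) in bytes, copied from the Comment 20 table.
rows = [
    (411849, 69196914, 25602722, "good"),
    (411874, 53439147, 17197683, "good"),
    (411878, 77863481, 26772025, "good"),
    (411879, 52477952, 3852714, "good"),
    (411880, 106479616, 8606225, "bad"),
    (411881, 100974592, 26802496, "bad"),
    (411887, 105955328, 9408012, "bad"),
    (411899, 110359347, 468937, "bad"),
    (411949, 110411776, 741455, "bad"),
    (412048, 106165043, 9506832, "bad"),
    (412247, 97217195, 11166806, "bad"),
    (412644, 103945557, 8628939, "bad"),
]

# Coefficient of variation (std dev / mean) per revision, grouped by verdict.
cv_by_verdict = {"good": [], "bad": []}
for _rev, mean, std_dev, verdict in rows:
    cv_by_verdict[verdict].append(std_dev / mean)

median_cv = {verdict: median(cvs) for verdict, cvs in cv_by_verdict.items()}
for verdict, cv in median_cv.items():
    print(f"{verdict}: median std dev / mean = {cv:.2f}")
```

The median ratio comes out several times larger for the good revisions than for the bad ones, which is what one would expect if the runs before r411880 were split between two modes and the runs after it mostly stayed in the higher one.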
Project Member

Comment 25 by 42576172...@developer.gserviceaccount.com, Aug 24 2016


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Add testing configs for ParseHTMLOnMainThread experiment
Author  : csharrison
Commit description:
  
BUG=623165

Review-Url: https://codereview.chromium.org/2221193002
Cr-Commit-Position: refs/heads/master@{#411880}
Commit  : 46be1b831ffec878df1b258a4f26872451d7795e
Date    : Sat Aug 13 06:18:55 2016


===== TESTED REVISIONS =====
Revision         Mean      Std Dev  N   Good?
chromium@411849  72745543  3915742  8   good
chromium@411874  74792122  4994849  12  good
chromium@411878  75705528  5481875  18  good
chromium@411879  75785587  6499378  18  good
chromium@411880  82364787  1858700  12  bad    <--
chromium@411881  83434326  1106590  12  bad
chromium@411887  81840653  3043056  8   bad
chromium@411899  83688350  778709   5   bad
chromium@411949  82260598  2075426  5   bad
chromium@412048  81108116  4755766  8   bad
chromium@412247  83686103  3397496  5   bad
chromium@412644  82386654  1778358  5   bad

Bisect job ran on: mac_10_10_perf_bisect
Bug ID: 638012

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests v8.infinite_scroll_tbmv2
Test Metric: memory:chrome:all_processes:reported_by_chrome:v8:effective_size_max/memory:chrome:all_processes:reported_by_chrome:v8:effective_size_max
Relative Change: 14.12%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/mac_10_10_perf_bisect/builds/2325
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9003543458991093376


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=6469746598346752

Another bisect confirmed that r411880 is the most likely culprit for a +6.3 MiB increase in the maximum effective size of V8. I found that the following V8 values regressed (https://chromeperf.appspot.com/report?sid=3bc3861624431439e2deafdd55e0326d57be182da9baac2bd426ecf6331de593&rev=411870):

memory:chrome:all_processes:reported_by_chrome:v8:effective_size_avg
memory:chrome:all_processes:reported_by_chrome:v8:allocated_objects_size_max
memory:chrome:all_processes:reported_by_chrome:v8:heap:effective_size_avg
memory:chrome:all_processes:reported_by_chrome:v8:heap:allocated_objects_size_max
memory:chrome:all_processes:reported_by_chrome:v8:heap:old_space:effective_size_max
memory:chrome:all_processes:reported_by_chrome:v8:heap:old_space:allocated_objects_size_max
memory:chrome:all_processes:reported_by_chrome:v8:heap:new_space:effective_size_max
memory:chrome:all_processes:reported_by_chrome:v8:heap:map_space:effective_size_max
memory:chrome:all_processes:reported_by_chrome:v8:heap:map_space:allocated_objects_size_max

csharrison,ulan: I leave it up to you to decide if this is a genuine regression and what should be done.
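For reference, a figure close to the +6.3 MiB quoted above can be obtained from the Comment 25 table by subtracting the mean at the last good revision from the mean at the first bad one. This is an assumption about how the number was derived; the thread does not state the calculation.

```python
MIB = 1024 * 1024

# Means in bytes, copied from the Comment 25 table.
last_good = 75785587  # chromium@411879
first_bad = 82364787  # chromium@411880

delta = first_bad - last_good
print(f"{delta / MIB:.1f} MiB ({delta / last_good:.1%} over the last good mean)")
```

Note that this simple step change is not the same quantity as the "Relative Change" line in the bisect report, whose formula is not given in the thread.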
Note that the page that actually regressed is tumblr (https://github.com/catapult-project/catapult/issues/2694).
Charles, Ulan, Any updates you can share on this bug?
I plan on investigating this soon, just haven't had the time. I consider investigating this a blocker to launching the experiment.

Two TODOs on my end:
1. Repro locally and inspect traces to see exactly what is happening on both versions.
2. Check out V8 memory size for the different experiment variations.
I couldn't repro this locally. I started a try job on mac here:
https://codereview.chromium.org/2335313005
Ping Chris: any update from your tryjob?
Here are two tries that worked:
http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-09-14_18-58-31

http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-09-14_18-35-33

For one of these I see that my revert caused a gain in effective_size_max, and for the other I see a loss. The same goes for allocated_objects_size_max. tumblr doesn't always look to be the odd page out.
Cc: -petrcermak@chromium.org
Chris, are you the right owner to continue driving this? What are the next steps here?
I think you mean me (I'm Charlie :), and yeah I'm the correct driver for this. I'll try to get to it this week. Sorry for the delays.
I'm inclined to WontFix this. I can't reproduce the tumblr regression in perf jobs. I see a regression in flickr, but I checked the traces and it seems like we're doing a *lot* more JS execution in the single-threaded patch. I don't exactly know what to make of that, but it doesn't really seem like the parser's fault.

I also tested this locally to see what it looks like on Linux and the tumblr test scrolled to different positions with and without the patch. Do we have a metric for "amount scrolled" for these infinite scrolling cases? I imagine a patch that causes the page to scroll more before timing out (after 70s it looks like) would look like a memory regression.
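To make the scrolling concern concrete, here is a small sketch of how an "amount scrolled" metric could be used to normalize the memory number. Everything in it is hypothetical: the `Run` type, the `scrolled_px` field, and the numbers are made up for illustration and are not measurements from this bug or an existing Telemetry metric.

```python
from dataclasses import dataclass


@dataclass
class Run:
    label: str
    v8_effective_size_max: int  # bytes
    scrolled_px: int            # hypothetical "amount scrolled" metric


def bytes_per_px(run: Run) -> float:
    """Memory normalized by how far the page scrolled before the timeout."""
    return run.v8_effective_size_max / run.scrolled_px


# Made-up numbers for illustration only.
baseline = Run("baseline", 80_000_000, 40_000)
patched = Run("patched", 90_000_000, 45_000)

raw_change = patched.v8_effective_size_max / baseline.v8_effective_size_max - 1
normalized_change = bytes_per_px(patched) / bytes_per_px(baseline) - 1
print(f"raw: {raw_change:+.1%}, per scrolled pixel: {normalized_change:+.1%}")
```

In this made-up case the raw metric rises by 12.5% while memory per scrolled pixel is unchanged, which is the kind of apparent regression the comment above is worried about.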
I forgot to mention: this would be a bit easier to analyze if we could specify which tracing categories to apply to the bisect. Right now the traces are pretty bare.

Comment 39 by u...@chromium.org, Oct 13 2016

Charlie, does the ParseHTMLOnMainThread experiment move parsing work to the main thread?
If so, then maybe there is less idle time available for garbage collection, which causes the regression.

Note that the memory graphs recovered (and improved) because we landed GC optimizations recently.
You're correct: we're moving parsing work to the main thread, and it could conceivably result in less idle time for GC, etc. It is really hard to tell from the traces because we trace so few categories for this metric.

The one thing I could tell was that with my experiment on, the flickr page did a lot more v8 work:

Experiment turned off (with patch):
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_3-2016-10-12_15-07-29-78148.html

Experiment on (without patch):
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_3-2016-10-12_16-01-10-64975.html

The parsing tasks can be seen when the experiment is turned off. It looks like they take up something like 7ms when they run off the main thread. They are very hard to see when they are mixed in with the main thread's work, though.
Cc: -kouhei@chromium.org rmcilroy@chromium.org
+rmcilroy as it seems there are some questions about the efficacy of this specific metric in the last few comments. 
Cc: hpayer@chromium.org
I'm travelling right now so can't look closely. If I read the comments right this isn't a question of the efficacy of the metric, but a question of whether the regression is a side effect of moving more work to the main thread? I'm not sure I can provide much input on how to (or whether we need to) address this metric. +Hannes in case he has any thoughts.
I think it's both. While investigating issue 663032, I learned how to selectively enable trace categories for certain tbms, so I could try doing that with this one locally and see why we're seeing differences. It would be great to be able to do that from the perf dashboards so I could automatically get those traces on the bots.

Do the infinite scroll metrics have any metrics related to how much was scrolled before the 70s timeout?
Looks like we haven't made progress on this bug in 8 months. Should we wontfix? We did add the ability to enable trace categories from the perf dashboard, but unfortunately it won't work for regressions this old.
Status: WontFix (was: Assigned)
Yeah, let's WontFix. Sorry, this regression was very difficult for me to understand, since the work we are moving to the main thread was so minimal (7ms out of 70 seconds) and completely unrelated to JavaScript or V8.
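As a quick check on the scale involved, the share of the 70-second scroll window taken by 7ms of parsing can be worked out directly. This takes the two figures from the comment at face value and assumes the 7ms is a total for the window, which the thread does not spell out.

```python
parse_ms = 7            # parsing work moved to the main thread, per the comment
window_ms = 70 * 1000   # 70-second scroll timeout

share = parse_ms / window_ms
print(f"{share:.2%} of the window")
```

That is one ten-thousandth of the window, which fits the view that the parsing work by itself is unlikely to account for a multi-megabyte change in V8 memory.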
