New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 635483 link

Starred by 1 user

Issue metadata

Status: Archived
Owner:
Last visit > 30 days ago
Closed: Sep 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression

Blocked on:
issue 640915



Sign in to add a comment

77.4%-116.5% regression in system_health.memory_desktop at 409975:410033

Project Member Reported by petrcermak@chromium.org, Aug 8 2016

Issue description

See the link to graphs below.
 

===== BISECT JOB RESULTS =====
Status: failed


===== TESTED REVISIONS =====
Revision         Mean     Std Dev  N  Good?
chromium@409979  3187350  3788.79  5  good
chromium@409998  3187350  3788.79  5  good
chromium@410007  3182266  3788.79  5  good
chromium@410008  5701312  0.0      5  bad
chromium@410009  5683238  35865.9  5  bad
chromium@410010  5697923  4640.31  5  bad
chromium@410012  5699618  3788.79  5  bad
chromium@410016  5701312  0.0      5  bad

Bisect job ran on: win_perf_bisect
Bug ID: 635483

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests system_health.memory_desktop
Test Metric: browse_social-memory:chrome:all_processes:reported_by_chrome:web_cache:effective_size_avg/browse_social_twitter
Relative Change: 78.87%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/win_perf_bisect/builds/6812
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9004896728337231024


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5331865460801536

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Cc: csharrison@chromium.org
Owner: y...@yoav.ws
yoav: The bisect marked your patch (https://codereview.chromium.org/2174563003) as the most likely culprit causing the regression (+2.4 MiB in web_cache on Twitter). Please investigate and decide on further action.

===== BISECT JOB RESULTS =====
Status: failed


===== TESTED REVISIONS =====
Revision         Mean     Std Dev  N  Good?
chromium@409979  3187350  3788.79  5  good
chromium@409998  3187350  3788.79  5  good
chromium@410007  3182266  3788.79  5  good
chromium@410008  5701312  0.0      5  bad
chromium@410009  5683238  35865.9  5  bad
chromium@410010  5697923  4640.31  5  bad
chromium@410012  5699618  3788.79  5  bad
chromium@410016  5701312  0.0      5  bad

Bisect job ran on: win_perf_bisect
Bug ID: 635483

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests system_health.memory_desktop
Test Metric: browse_social-memory:chrome:all_processes:reported_by_chrome:web_cache:effective_size_avg/browse_social_twitter
Relative Change: 78.87%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/win_perf_bisect/builds/6812
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9004896728337231024


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5331865460801536

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Hm very interesting. It may be worthwhile to look at traces from the reports.

What is "web_cache" measuring?
Thanks! Do you know if the bots can run the bisect with more tracing categories turned on? I think that's probably the next step, either manually or automatically (with memory-infra).

It's strange this patch is showing a regression, as it more eagerly evicts resources from MemoryCache. Perhaps there's a logic error somewhere?
Cc: sullivan@chromium.org
I don't think you can do this on the bots (+cc sullivan). You can of course modify the benchmark and run it locally (https://cs.chromium.org/chromium/src/tools/perf/benchmarks/system_health.py?l=33).
OK yeah that may be the best step. Actually, it may be simpler to just dump the list of resources associated with web_cache in both versions.
Cc: eakuefner@chromium.org nednguyen@chromium.org
+nednguyen, eakuefner: Is there a flag we could pass to telemetry to enable additional tracing categories? We can either pass a flag directly or slightly modify the benchmark pretty easily with a perf try job.
We can pass a flag only the benchmark defined the flag. Otherwise, slightly modifying the benchmark is the most straight forward option here.

+Petr: I think we should consider implement SetupTraceRerunOptions for system health benchmarks. Though it should be cleaned up first (https://github.com/catapult-project/catapult/issues/2137)
Ned: What do you think the method should do? Enable all tracing categories (that is overkill, in my opinion)?
At the minimal, I would expect it enable enough categories for all the rail stages. Another route is we can add "--extra-tracing-categories" as a commandline flag.
Cc: perezju@chromium.org
I think adding a command flag (maybe "--tracing-categories-for-rail"?) would be better because it would apply to all benchmarks.
Making this work for all benchmarks is hard, because not all of them use tracing. When I said adding a commandline flag, I meant adding it for only system health benchmarks, that we know all use tbm2
Good point. I'm fine with that.
Project Member

Comment 20 by 42576172...@developer.gserviceaccount.com, Aug 10 2016

Cc: y...@yoav.ws

=== Auto-CCing suspected CL author yoav@yoav.ws ===

Hi yoav@yoav.ws, the bisect results pointed to your CL below as possibly
causing a regression. Please have a look at this info and see whether
your CL be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Clear all preloads when document is detached
Author  : yoav
Commit description:
  
Currently link rel preload based preloads are never evicted from MemoryCache. This fixes that
by evicting all preloads when the document is detached.

This also fixes issues found when Link preloads were not reused after DCL due to bugs inside clearPreloads().

BUG=627026

Review-Url: https://codereview.chromium.org/2174563003
Cr-Commit-Position: refs/heads/master@{#410008}
Commit  : 57696ed320e6ed0263b7c447ef17bb6bef5236ca
Date    : Fri Aug 05 07:06:05 2016


===== TESTED REVISIONS =====
Revision         Mean     Std Dev  N  Good?
chromium@409979  3187350  3788.79  5  good
chromium@409998  3189044  0.0      5  good
chromium@410007  3183961  4640.31  5  good
chromium@410008  5681544  34988.0  5  bad    <--
chromium@410009  5699618  3788.79  5  bad
chromium@410010  5701312  0.0      5  bad
chromium@410012  5681544  34988.0  5  bad
chromium@410016  5699618  3788.79  5  bad

Bisect job ran on: win_perf_bisect
Bug ID: 635483

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests system_health.memory_desktop
Test Metric: browse_social-memory:chrome:all_processes:reported_by_chrome:web_cache:effective_size_avg/browse_social_twitter
Relative Change: 78.82%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/win_perf_bisect/builds/6823
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9004896728337231024


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5331865460801536

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!

Comment 21 by ssid@chromium.org, Aug 25 2016

Cc: keishi@chromium.org haraken@chromium.org
Looks like a considerable regression on PartitionAlloc and webcache. This has not recovered yet.

Sorry I missed this email. The webcache includes raw fonts, scripts and image resources downloaded from web.
in this case the script resources have increased. Not sure if this was intended.
Attached relevant screenshots.
webcache_without_scripts.png
94.9 KB View Download
webcache_extra_scripts.png
106 KB View Download
Cc: -csharrison@chromium.org hirosh...@chromium.org
Yeah, this looks like a real regression.

Maybe should we revert the CL at the moment?

I'm happy with reverting.

yoav: ping!
Yoav is on vacation, so might be slow to respond.

BTW: Can you point to documentation where we can run this test on trybots before landing?
You can run the test with 
./tools/perf/run_benchmark try <bot name> system_health.memory_desktop

For more documentation, run:
./tools/perf/run_benchmark try --help
Yoav: I think probably the change to extend link rel preloads past DCL must have triggered this, as I know twitter uses link rel preload. It would be good to see what's going on in MemoryCache here.

I will investigate. I agree that we should revert soon though.
I reverted the patch locally and found that twitter.com double downloads f14f52256fc806867c97cba1c3877a20e9fe8449.js without the patch.

I think this is not a real regression. The previous memory metrics were just with a buggy implementation of link rel preload that didn't persist resources past dom content loaded.
Cc: csharrison@chromium.org
Adding myself back to cc.

Comment 30 by ssid@chromium.org, Aug 26 2016

Were these resources not part of MemoryCache earlier and now added?
My thinking was that these resources are loaded in memory cache and evicted before the memory dump trace. With Yoavs patch they are kept until the document is detached.
Project Member

Comment 32 by 42576172...@developer.gserviceaccount.com, Aug 26 2016


===== BISECT JOB RESULTS =====
Status: failed


===== TESTED REVISIONS =====
Revision         Mean      Std Dev  N   Good?
chromium@409974  23488643  2122791  12  good
chromium@410015  19247138  2876929  12  bad
chromium@410044  17346266  1145261  12  bad

Bisect job ran on: win_x64_perf_bisect
Bug ID: 635483

Test Command: src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --also-run-disabled-tests system_health.memory_desktop
Test Metric: browse_social-memory:chrome:all_processes:reported_by_chrome:partition_alloc:effective_size_avg/browse_social_twitter
Relative Change: 21.32%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/win_x64_perf_bisect/builds/1445
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9003311713978105680


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5873056656916480

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Blockedon: 640915
Looks like the bisect failed for the problem investigated in issue 640915.
Cc: jasontiller@chromium.org
Project Member

Comment 36 by 42576172...@developer.gserviceaccount.com, Aug 30 2016


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Clear all preloads when document is detached
Author  : yoav
Commit description:
  
Currently link rel preload based preloads are never evicted from MemoryCache. This fixes that
by evicting all preloads when the document is detached.

This also fixes issues found when Link preloads were not reused after DCL due to bugs inside clearPreloads().

BUG=627026

Review-Url: https://codereview.chromium.org/2174563003
Cr-Commit-Position: refs/heads/master@{#410008}
Commit  : 57696ed320e6ed0263b7c447ef17bb6bef5236ca
Date    : Fri Aug 05 07:06:05 2016


===== TESTED REVISIONS =====
Revision         Mean     Std Dev  N  Good?
chromium@409974  2931504  571152   5  good
chromium@409998  3187350  3788.79  5  good
chromium@410003  3187350  3788.79  5  good
chromium@410006  3182266  3788.79  5  good
chromium@410007  3172665  36625.0  5  good
chromium@410008  5699618  3788.79  5  bad    <--
chromium@410018  5699618  3788.79  5  bad

Bisect job ran on: win_8_perf_bisect
Bug ID: 635483

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests system_health.memory_desktop
Test Metric: browse_social-memory:chrome:all_processes:reported_by_chrome:web_cache:effective_size_avg/browse_social_twitter
Relative Change: 94.43%
Score: 99.9

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/win_8_perf_bisect/builds/2163
Job details: https://chromeperf.appspot.com/buildbucket_job_status/9002949730302690480


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5771676980084736

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Labels: SystemHealth-Sheriff

Comment 38 by y...@yoav.ws, Aug 31 2016

Back from vacation...

csharrison@'s analysis makes sense to me. What's the best next step here? If this is not a real regression, reverting doesn't make sense.

Comment 39 by ssid@chromium.org, Aug 31 2016

So, IIUC the regression is because these resources are kept for longer than it used to be. So, in effect this is going to increase some memory usage in devices. My question would be:
can it be evicted earlier?
How much longer are these resources kept alive than before the CL?
While this patch does seem to increase memory, the resource is used later (and reloaded from the browser process). My guess is that the memory dump just takes place after the resource has been evicted from MemoryCache, but before it has been referenced by the page (and reloaded).

This is the tradeoff that <link rel="preload"> has. Devs expect the results of the preload to last past DOMContentLoaded, where unused preloads typically die. In this case I think the best thing to do is trust the devs.

Yoav would you mind double checking my work? Ideally we could run this locally (using tools/perf/run_benchmark) on just the twitter page. I haven't quite figured out how to do that (some story sets allow easy commenting out of pages).
To answer #39 these resources are kept until the document detaches.
#40: You can run a single story by providing a story filter to the command:

tools/perf/run_benchmark [...] --story-filter=twitter

Comment 43 by y...@yoav.ws, Aug 31 2016

csharrison@ - sure, I can rerun the twitter benchmarks and verify that your findings are reproducible on my end.

petrcermak@ or someone else familiar with the benchmark - when is the memory snapshot being taken? Pointers towards to relevant code would be highly appreciated :)

Short answer:

After the story is run, there's a 3s wait, then we force GCs, then we flush the entire system cache, then we wait for another 3s and then, finally, measure memory.

Long answer:

This is the story: https://cs.chromium.org/chromium/src/tools/perf/page_sets/system_health/browsing_stories.py?l=188
It's a subclass of: https://cs.chromium.org/chromium/src/tools/perf/page_sets/system_health/browsing_stories.py?l=45
which is a subclass of: https://cs.chromium.org/chromium/src/tools/perf/page_sets/system_health/system_health_story.py?l=30
which measures memory *after* the story using: https://github.com/catapult-project/catapult/blob/fa2c4f21907f44a7e34e1b92df2205bda1b2c135/telemetry/telemetry/internal/actions/action_runner.py#L111 (with deterministic_mode=True)
#44 huh... That kind of invalidates my thinking :) I'd be interested to see what you find, Yoav.
Note that "flushing the entire system cache" has been added recently (r413128), so it was not used when the regression occurred.

Comment 47 by y...@yoav.ws, Sep 1 2016

I'm failing to run these tests locally due to permission errors :/ (even after running gsutil config)
You can use the try command in #25 to run on try bots.

Comment 49 by y...@yoav.ws, Sep 14 2016

I've got https://codereview.chromium.org/2319483002/ which is one way of reducing the mem impact of preloads, but I have little faith that it's directly related to this specific benchmark. I'm also looking into other potential causes.
Labels: Hotlist-SystemHealthBankruptcy
Status: Archived (was: Assigned)
Temporarily declaring bankruptcy on the *desktop* system health benchmark.
The number of alerts became unmanageable and the overall process needs to be improved to make it sustainable.
The alerts have been turned off and I'm archiving the outstanding regressions.
Note: this is just about desktop, the mobile system health stays up. 

Sign in to add a comment