
Issue 792020


Issue metadata

Status: Assigned
Owner:
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression




1.3%-1.5% regression in memory.top_10_mobile at 518659:518766

Project Member Reported by pmeenan@chromium.org, Dec 5 2017

Issue description

See the link to graphs below.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=792020

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=df4d5c5f9bd98457d6dc50c1576c157e4585cf207904dacc41aee8b718ac3c8b


Bot(s) for this bug's original alert(s):

android-nexus5X
Cc: morlovich@chromium.org
Owner: morlovich@chromium.org
Status: Assigned (was: Untriaged)

=== Auto-CCing suspected CL author morlovich@chromium.org ===

Hi morlovich@chromium.org, the bisect results pointed to your CL; please take a look at the results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Maks Orlovich
  Commit : 6bf5e45d34018dfd5fbb114d88d422581aa33be4
  Date   : Wed Nov 22 21:30:29 2017
  Subject: Add fieldtrial_testing_config for SimpleCachePrefetch for platforms

Bisect Details
  Configuration: android_nexus5X_perf_bisect
  Benchmark    : memory.top_10_mobile
  Metric       : memory:chrome:all_processes:reported_by_chrome:malloc:effective_size_avg/background/after_https_m_facebook_com_rihanna
  Change       : 1.03% | 43223465.3333 -> 43668876.8889

Revision             Result                  N
chromium@518658      43223465 +- 811157      9      good
chromium@518712      42863501 +- 521870      6      good
chromium@518739      42826885 +- 678000      9      good
chromium@518746      42834563 +- 776329      6      good
chromium@518750      42938060 +- 783754      9      good
chromium@518752      42853037 +- 436398      6      good
chromium@518753      43477405 +- 905928      9      bad       <--
chromium@518766      43668877 +- 711847      9      bad

Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=https.m.facebook.com.rihanna memory.top_10_mobile

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8961038381820465952


For feedback, file a bug with component Speed>Bisection
Cc: pasko@chromium.org
FYI re: possibility of launching this...

Comment 5 by pasko@google.com, Dec 5 2017

hmm... 430 KiB is massive...

On the other hand, we could argue that this memory bump happens mostly while the page is loading, which is typically earlier than when the memory peak happens. Hence Chrome is unlikely to be killed more often, or to cause more background apps to be killed, with this feature ON.

Another way could be to limit the number of concurrent prefetched-but-unused entries per backend. This looks not too involved, but I may be missing some important details in the state machine.
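
A minimal sketch of what such a cap could look like, assuming a simple per-backend counter; the class and constant names are invented for illustration and are not actual //net/disk_cache code:

  // Hypothetical per-backend budget for prefetched-but-not-yet-used entries.
  class PrefetchBudget {
   public:
    // Returns true if another speculative pre-read may start.
    bool TryAcquire() {
      if (unused_prefetches_ >= kMaxUnusedPrefetches)
        return false;  // Too many prefetched buffers sitting unused.
      ++unused_prefetches_;
      return true;
    }
    // Called once a prefetched buffer is consumed or discarded.
    void Release() {
      if (unused_prefetches_ > 0)
        --unused_prefetches_;
    }

   private:
    static constexpr int kMaxUnusedPrefetches = 4;  // Arbitrary example cap.
    int unused_prefetches_ = 0;
  };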

Maks, can you start a discussion on, say, loading-dev@ to converge on the solution?
Cc: xunji...@chromium.org
@xunjieli --- any chance you have time to give me some advice/pointers here?

Not really familiar with the loading-dev list, but hmm, more immediately, I can't make much sense out of the traces. I think it only measures at the end --- and at that point the net usage is basically zero, so it's not even the pre-read buffers directly?

Also, from picking through a couple of traces, the difference seems to be in the renderer process. Maybe the change is a bad idea for CacheStorage or something? The numbers for that certainly look worse --- comparing the ratio of ReadStream1FromPrefetched = true to SyncOpenDidPrefetch = true (which basically measures the portion of opens where the prefetch was useful) between SimpleCache.App. and SimpleCache.Http is quite drastic.
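
For concreteness, the "useful prefetch" ratio being compared is just the following (hypothetical helper; the histogram names are the ones above):

  // Fraction of prefetching opens whose prefetched data was actually used:
  //   count(ReadStream1FromPrefetched == true) / count(SyncOpenDidPrefetch == true)
  double UsefulPrefetchRatio(int reads_from_prefetched, int opens_that_prefetched) {
    if (opens_that_prefetched == 0)
      return 0.0;
    return static_cast<double>(reads_from_prefetched) / opens_that_prefetched;
  }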

If this (or some other bot) is tracking canary, it may be worth trying to turn off pre-read just for cache_type_ == APP_CACHE, since that would be a trivial CL.
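
That gate might look something like the sketch below; the stand-in enum and helper name are assumptions (the real check would live wherever SimpleCache decides to issue the speculative read):

  // Stand-in for Chromium's net::CacheType, to keep the sketch self-contained.
  enum class CacheType { kDisk, kMedia, kApp };

  // Skip the speculative pre-read for the AppCache backend, where the
  // histogram ratios above suggest the prefetched data is rarely used.
  bool ShouldPrefetchOnOpen(CacheType type) {
    return type != CacheType::kApp;
  }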

I looked at the traces. I can't tell what's going on.
You can try to see if this is reproducible locally, or use the perf trybots, and from there add some probes to see what has grown.

(https://chromium.googlesource.com/chromium/src/+/master/docs/speed/perf_trybots.md)


Re #8:
To get all perf data for a revision, you can use this url:

https://chromeperf.appspot.com/group_report?rev=518753
Hmm, how would I get it to report the metric this report is about?
(And how do I find out what exactly it means?)
re #8: not a good link :) The blah-apidata domains come and go, but there is a way to share a working link if you grab it before some redirects kick in. Kudos to the inventors of this convenience.

This metric is a newfangled perf infra metric, written in JavaScript, living in the catapult repo, and summarized into a fancy interactive HTML page (which can also be embedded into a trace).

So for instance this regression has a "View trace from this run" link if you click on the exclamation point on the dashboard (and scroll into the hidden area). Which leads here: https://storage.cloud.google.com/chrome-telemetry-output/after_https_m_facebook_com_rihanna_2017-11-22_22-36-43_86103.html (does not open with @chromium.org account for me - convenience everywhere)

When the trace opens (with the blah-apidata dontshareme URL), there is a vertical tab called "Metrics"; choose the dropdown value "memoryMetric", and there is a row "memory:chrome:all_processes:reported_by_chrome:effective_size" with one sample (of course, we ran the page only once). Then you can click on the blue bars next to it.

More documentation about this UI:
https://github.com/catapult-project/catapult/blob/master/docs/metrics-results-ui.md

With a perf trybot you'd get a link to an uploaded report, where you'd see essentially the same UI, but with multiple samples and a side-by-side comparison. Hopefully you'd be able to dig into details from there, but I usually cannot; then I give up and run the benchmark locally, hack it to be able to grab all the raw data, and filter/slice it appropriately in a colab notebook. I could not convince enough people to get me the raw data easily at all times. Yet. Maybe yet.

Good luck!
Maybe useful: a URL for trying what's essentially a revert on the bots:
https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/html-results/results-2017-12-06_14-26-38#r=TOT&s=%25%ce%94avg

This does show a change of 430K... (well, at p=.14); and changing it to preload just the http/media caches doesn't. Hmm.
Probably the browser process, too.

Helen explained some of this to me --- what I was missing was that it's possible to split up the effective_size metric to just the malloc row by clicking the labels under the bar chart. Opening up traces from it, it seems like the difference is under malloc -> allocated_objects -> <unspecified>, which isn't very helpful (but maybe I can hack the code to make things specified somehow...)

Still, even with this I am not sure what exactly effective_size is counting, and what the after_ prefix is supposed to mean...

Cc: primiano@chromium.org haraken@chromium.org ssid@chromium.org
+ more folks to help with digging into the details of "malloc -> allocated_objects -> <unspecified>". There were recent changes in the memory infra, AFAIR.
So looking at the netlog, this is actually a /cold/ load, so the only opens seem to just read stream 0, then write ~12 bytes to stream 2, usually within a millisecond or two, with one instance taking ~7ms.

Preread is kinda useless there, but it's hard to see how that yields any sort of permanent memory usage change, yet my revert seems to show memory improvement pretty consistently.

I wonder if any of the scenarios cover warm loads?

Hmm, I am reminded of this:
https://chromium.googlesource.com/chromium/src/+/248782b9cda664a1ea6d347dd508354d23160a5a

... Since this is cross-thread as well, but more intentionally so. 

After experimenting with some hacks to avoid it, cross-thread alloc/delete probably isn't the issue (besides, that was about system_memory:native_heap:proportional_resident_size_avg).

Comment 17 by ssid@chromium.org, Dec 12 2017

Re #13: "malloc -> allocated_objects -> <unspecified>" on Android could increase for 2 reasons: the app allocated 400KB extra (this could be either Chrome or an indirect Android allocation), or it allocated and freed the 400K, so dlmalloc kept the memory around in freelists.
If you can reproduce the regression locally, run with the heap profiler enabled and check whether Chrome allocations have increased. If so, we can see the stack traces of where these allocations came from. If not, we would need to get an Android native heap dump.
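
A self-contained illustration of the second case (allocated-and-freed memory lingering in allocator freelists), using mallinfo(), which is available on glibc and Bionic; the sizes are arbitrary and just echo the ~400K figure:

  #include <malloc.h>
  #include <stdio.h>
  #include <stdlib.h>

  // Prints live bytes vs. bytes sitting in malloc's freelists.
  static void Report(const char* label) {
    struct mallinfo mi = mallinfo();
    printf("%-10s live=%d bytes, freelists=%d bytes\n",
           label, mi.uordblks, mi.fordblks);
  }

  int main(void) {
    enum { kChunks = 100, kChunkSize = 4096 };  // ~400 KB total.
    void* chunks[kChunks];

    Report("baseline");
    for (int i = 0; i < kChunks; ++i)
      chunks[i] = malloc(kChunkSize);
    void* pin = malloc(16);  // Pin the heap top so the freed run can't be trimmed.
    Report("allocated");
    for (int i = 0; i < kChunks; ++i)
      free(chunks[i]);
    // The freed ~400 KB typically moves to 'freelists' instead of going back
    // to the OS --- the same effect as dlmalloc keeping it around.
    Report("freed");
    free(pin);
    return 0;
  }
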
Thanks, that's helpful. "Allocated and freed" is more consistent with what it should be doing, though the total amount is surprising --- perhaps the large number of threads involved is a factor.  

I am not sure profiling would be that helpful, since I know what the extra allocations are (well, at least the ones I am making; maybe changes in timing make other things allocate more) --- it's just that I expected them to be very short-term, and timed in a way that they make way for later usage[1]. So the Android native heap dump may be the thing to do to understand how they interact with things. Are there any recent recommended docs for that? (The Android ones seem to be "use Android Studio", which may be more than sufficient.)

[1] hmm, entries being reopened to write out V8 compilation cache might be different in that respect, though. Actually some of the allocations for those are kinda useless...

Hmm, allocated_object_size_avg has the same curve shape --- does that also potentially include things in freelists?

Also kinda weird that other devices don't show the same symptom.
So... rounding up all the "new" allocations to 66,000 bytes seems to fix it... as does rounding them up to 65,000. The experiment was meant to show that the per-thread cache was relevant (given its cap on Android seems to be 65,536 by default), but I am not sure how to interpret this result.
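
For reference, the experiment amounts to something like this (the helper is invented; 65,536 is the default Android per-thread-cache cap mentioned above):

  #include <cstddef>

  // Pad every new prefetch-related allocation up to a fixed floor. Both
  // 66,000 (above the assumed 65,536-byte cap) and 65,000 (below it)
  // reportedly made the regression disappear, hence the confusion.
  std::size_t PadAllocationForExperiment(std::size_t requested,
                                         std::size_t floor_bytes) {
    return requested < floor_bytes ? floor_bytes : requested;
  }
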
Cc: -pmeenan@chromium.org
