
Issue 759598

Starred by 1 user

Issue metadata

Status: Archived
Owner:
Closed: Jan 2018
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression




(likely a revert) 9.6% regression in system_health.memory_mobile at 497393:497553

Reported by kraynov@chromium.org (Project Member), Aug 28 2017

Issue description

See the link to graphs below.
 
Comment 1 by 42576172...@developer.gserviceaccount.com (Project Member), Aug 28 2017

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=759598

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=82e61c6ba3fff2e4b3d11b50559c7e014a1a094da592b6ff4746c9b4430b9508


Bot(s) for this bug's original alert(s):

android-webview-nexus6
Comment 3 by 42576172...@developer.gserviceaccount.com (Project Member), Aug 28 2017

Cc: adamk@chromium.org
Owner: adamk@chromium.org
Status: Assigned (was: Untriaged)

=== Auto-CCing suspected CL author adamk@chromium.org ===

Hi adamk@chromium.org, the bisect results pointed to your CL; please take a look at the results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Adam Klein
  Commit : f9a3a5af2aba5a4689be19cce8a200eed526e736
  Date   : Thu Aug 24 23:31:37 2017
  Subject: Simplify usage of runtime hashing functions in weak-collection.js

Bisect Details
  Configuration: android_webview_nexus6_aosp_perf_bisect
  Benchmark    : system_health.memory_mobile
  Metric       : memory:webview:all_processes:reported_by_chrome:v8:effective_size_avg/background_social/background_social_facebook

Revision                           Result                  N
chromium@497392                    2093944 +- 0.0          6      good
chromium@497433                    2093944 +- 0.0          6      good
chromium@497453                    2093944 +- 0.0          6      good
chromium@497453,v8@8974b75bce      2105037 +- 38428.4      6      good
chromium@497453,v8@4e45342994      2099491 +- 30380.3      6      good
chromium@497453,v8@46cb812fa1      2093944 +- 0.0          6      good
chromium@497453,v8@f9a3a5af2a      2297584 +- 197.18       6      bad       <--
chromium@497453,v8@2ee967d253      2290720 +- 197.18       6      bad
chromium@497454                    2292932 +- 0.0          6      bad
chromium@497455                    2292896 +- 197.18       6      bad
chromium@497456                    2292896 +- 197.18       6      bad
chromium@497458                    2292932 +- 0.0          6      bad
chromium@497463                    2292932 +- 0.0          6      bad
chromium@497473                    2292896 +- 197.18       6      bad
chromium@497553                    2292900 +- 0.0          6      bad
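
For illustration only, here is a rough Python sketch of the good/bad call made from samples like those above. It is not the real Chromium bisect implementation, and the 50% threshold is just an assumption; the point is that a revision is labelled bad when its samples sit closer to the regressed level than to the known-good baseline.

  # Toy good/bad classification over the samples above.
  # NOT the real Chromium bisect logic; the threshold is illustrative only.
  from statistics import mean

  BASELINE  = 2093944   # chromium@497392, known good
  REGRESSED = 2292900   # chromium@497553, known bad

  def classify(samples, threshold=0.5):
      """Label a revision 'bad' if its mean has moved more than `threshold`
      of the way from the baseline towards the regressed level."""
      delta = mean(samples) - BASELINE
      return "bad" if delta > threshold * (REGRESSED - BASELINE) else "good"

  # e.g. six samples from the suspected commit chromium@497453,v8@f9a3a5af2a:
  print(classify([2297584] * 6))   # -> bad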

Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-webview --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=background.social.facebook system_health.memory_mobile
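
If you want to script the repro, a minimal wrapper around the command above could look like the following. It assumes you run it from the directory containing the Chromium src/ checkout with an Android device attached; the flags are copied verbatim from the command above.

  # Minimal sketch: invoke the repro command from a Chromium checkout.
  # Assumes the current directory contains src/ and a device is attached.
  import subprocess

  subprocess.run(
      [
          "src/tools/perf/run_benchmark", "-v",
          "--browser=android-webview",
          "--output-format=chartjson",
          "--upload-results",
          "--pageset-repeat=1",
          "--also-run-disabled-tests",
          "--story-filter=background.social.facebook",
          "system_health.memory_mobile",
      ],
      check=True,
  )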

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8970006426422747040


For feedback, file a bug with component Speed>Bisection

Comment 4 by adamk@chromium.org, Aug 28 2017

Owner: ----
Status: Available (was: Assigned)
My CL should not have had any appreciable effect on memory. Looking at the graph, it looks bimodal to me. And the most recent regression bug, issue 741727, shows very similar absolute numbers.
Cc: perezju@chromium.org hpayer@chromium.org u...@chromium.org
+ulan, hpayer, perezju: see #4: bisect is pretty clearly reproducing ~200 KiB regressions on both these bugs, but they don't get followed up on, and the CL author doesn't think the CL should have an effect on memory. What should the next step be?
Owner: adamk@chromium.org
Status: Assigned (was: Available)
The bisect results are actually rock-solid, and they don't have any noise.

I would suggest looking at and comparing a couple of traces to debug further:

last_good (src@497453,v8@47584):
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_0-2017-08-28_13-58-42-67878.html

first_bad (src@497453,v8@47585):
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_0-2017-08-28_12-15-28-73567.html

The increase in v8 is small, but note that, among the alerts caught in this bug, browse_chrome_omnibox on N5 shows around 0.5 MiB; so I think more investigation is warranted.

There was also a larger (~1 MiB) regression on malloc, which may or may not be related; I've split that one off to issue 763343.
Attachment: v8_heap.png (21.8 KB)

Comment 7 by adamk@chromium.org, Sep 8 2017

Cc: yangguo@chromium.org rmcilroy@chromium.org
I believe the bisect results, but I don't believe the regression has anything in particular to do with my change:

 - We've seen similar jumps up and down in the past. See, e.g., the bisect job here: https://bugs.chromium.org/p/chromium/issues/detail?id=741727#c5
 - See the graph for the bimodal distribution I'm talking about: https://chromeperf.appspot.com/report?sid=40bf19f3d1d446223a994117699e7a0b284040d69c148d8813bb9b0769471aa0&start_rev=475259&end_rev=500323
 - My change literally just removes some JS code from V8, so any memory increase isn't "caused" by my change in any way other than as a catalyst.

Adding yangguo@ and rmcilroy@, who've both dealt with V8 snapshot size in the past, in case they have any insights.

Comment 8 by adamk@chromium.org, Sep 8 2017

From a high level, here's a possibility (I'd love any of the other V8 folks on this thread to comment on it): since we removed some JS stuff in this CL, we skip one round of GC, which means our heap size is larger at the time of measurement. That would explain why removing stuff actually makes the measured heap bigger. It would also help explain why this bounces up and down.

If that were the case, this particular regression would be a WontFix. And it would suggest that these benchmarks might benefit from additional post-run instrumentation.
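
To make the hypothesis above concrete, here is a toy model (all numbers invented; nothing here reflects V8's actual GC heuristics) of how skipping one GC before the dump could make a workload that allocates less report a larger heap:

  # Toy model of the "skipped GC" hypothesis in #8 (all numbers invented;
  # this is not V8's real GC heuristic). The heap only shrinks when a GC
  # runs, so if the memory dump lands before the next GC, garbage is counted.
  GC_BUDGET = 1_000_000   # hypothetical allocation budget between GCs

  def heap_at_dump(allocations, live_fraction=0.3):
      heap = since_gc = 0
      for size in allocations:
          heap += size
          since_gc += size
          if since_gc >= GC_BUDGET:             # scheduled GC fires
              heap = int(heap * live_fraction)  # garbage is collected
              since_gc = 0
      return heap                               # the dump is taken here

  # Removing one allocation keeps the workload under the GC budget, so the
  # smaller workload reports a larger heap at dump time.
  with_js    = [400_000, 400_000, 250_000]   # crosses the budget -> GC runs
  without_js = [400_000, 400_000, 150_000]   # stays under budget -> no GC
  print(heap_at_dump(with_js), heap_at_dump(without_js))   # 315000 950000
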
I had a look at the two traces before [1] and after [2] this CL (linked on the dashboard graph). If you click the memory dump (the "m" circle on the trace) and select "v8", you can see the differences in size - it looks like there is a small reduction in "code_space" (as expected) but a 200 KB increase in "old_space".

Regarding #8 - I'm not sure that it would be due to skipping a GC. I think we explicitly do a GC before taking a memory dump, and if you look at "allocated_objects_size" it is roughly the same for both traces. The effective_size is the size of the heap, not the size of the objects on the heap, so it should be less affected by GC timing. Instead it seems like we just squeezed in with peak memory usage under 633.6 KB in the preceding run, but after the CL somehow ended up peaking slightly above this, so the heap ended up allocating another page. I'm not sure what in your CL would cause this, though; it would be good if the GC folks could comment on it.
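
To illustrate the page-granularity effect described above: the heap is sized in whole pages, so a tiny bump in peak usage near a page boundary adds a full page to the reported size. The page size here is an assumption for illustration only, not V8's actual value.

  # Sketch of page quantization: the reported heap size rounds peak usage up
  # to whole pages. PAGE_SIZE and the peak values are made-up numbers.
  import math

  PAGE_SIZE = 256 * 1024   # hypothetical page size in bytes

  def reported_heap(peak_live_bytes):
      pages = math.ceil(peak_live_bytes / PAGE_SIZE)
      return pages * PAGE_SIZE

  before = reported_heap(786_000)   # just under the 3-page boundary (786,432)
  after  = reported_heap(787_000)   # barely over it
  print(after - before)             # one whole extra page: 262144 bytes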

Do we know if this increased the size of the startup snapshot, or does the increase only happen on this page?

[1] https://00e9e64bac624f1baaa801e9698d4dc3b49d1a7867b19cdcc9-apidata.googleusercontent.com/download/storage/v1/b/chrome-telemetry-output/o/trace-file-id_0-2017-08-25_15-59-35-63045.html?qk=AD5uMEu_jraF-HHNGUZJAATEXtQfc4FlnEoyYfPF6NkfpJCfzLerSc8qGmkC687N2HjfLJiqUf08cnrYpqgwQcKi0KkdYFP-q-4zAOqyRke1FKWSZEaAevQudqhLQi2dUCJXwAXdis8OYDSqaWea8gXd2HSqBC9jd_iErwuLuBjozn0j9oQhsefQnGWro5z8xyPuluVOZcGvPujbQRr7PuW8j11xw3DTO_uQ8ESVJ8fCj4CX-sXdl2j7uHy12RzPr2AGSpr6Rz9dT4KOv1L9NgiKraEWnaOXRGkR3Z5ncRoOInI_8_Lfh73hJO_UO5csZ6LnnoyJSPsoUnTQ_6Nbii-zL4jgBKmmhRdHJA1oDdS74UCCf-uiomHRlvqH59SQ_4cEPi415R_Ze07AVAVCm9HbsdV3ty599Oj-33ZA2hjZNeuWDXeaAspAiTNdoiBmI6wJj2GqqdnfQ_nVz9mZgU9WDSFhkH_8xgHtKwNtSSCWZs_y0YF_NqQdBUOaq-ajispANfvJ3ZD2i6uvclwsT04XKlUtoNph7-bjbOsfEcZqy3jLa19m9uOHUSMnQ7FH6AAH33yPD9LXXBcbRvbFrrHayGTViUvSRAPlxMAa9VlmpqGLq4sXerG18FYcB0RaPfoIPtFcGMAjSxuAfMAWPepaUgPCUJagKe1B-E8JoZ2VINSmtM6XrBW_ZdrMV18MTHSzoneq8R2SXD6rAvPGKb8x5VOfSNUS0ezYIbNhzP6PrbmvL6WmYZM_t6dwq4Xxx02E0SNslyN0pzJQdah50uaa826S_WRXSiDPaBmHcrKT42Z0XKw-E8Hx8uh4bb5PEmHd-8nC7vl2bl3Li8kQke7EELqZ7qkfzA

[2] https://00e9e64bac4ffd754c2fa6bf57c86645f13b925889a4e52719-apidata.googleusercontent.com/download/storage/v1/b/chrome-telemetry-output/o/trace-file-id_0-2017-08-26_00-22-03-33275.html?qk=AD5uMEtnEh_sYrMv3c2ZZoPDdnDYD4_Diohvk5Wgld38sKjiCDRKThGAAHC5ib_Okh_UAtp4Vc_EJxjjD1yTZzYjEO4Hf_7C4LVe57jVwEs31vaz6MebU8ez63c_JGY6R8bMbvpFndIvOoE75MAKUUvMacu25wtkLcwf3GqYwv1JdL9LO91Z05ocyXMNLM7qRZjVAODoe2lbHW5LhcPwQpFH1tSlXuSHZx4PGQc8Zn-YLLBnw5URUd6CSqzNHRn9F8VleJADH3Jc7pLpJ6D4xneAgYvG-J7K6JiWGLfHIOnmFYceuXNK7sdF0z2zJCDgofgFJwFra80eJm-GvaP2e_e0yfr3_lCXcteJUhlVXb3Vu4WSVRbJS_TX6eA6BLU1BxylTOMccjeQoGXiFJ1LMwMhrI6agpEL4b47n_JNi_pSDE_WoX9UVQZ6uaKQUBp3b2XnszutDUqIR6P5luSnV7fqY5DvKdcTiylDwGKy_tNhvJxzSxkTlCQUecrTt024NgV7cD5U0Qkjw_hye2ynYNYLN3D4VTCNFAI-GRPqZXUp2qww0nsZXjb26FLPGx-2tZQJXBjDRFv2C1y5QbGBZNbsKGc_-GU_FYCD7lSOh4M-uQkJR502a-eV0sAeKXbe4Bi5kCd2VJa4jG7lFtkEXNZFpE2gHUVjWX-EhqYTfOvfV3tdsO1HurAT5I4dJlLkDIBWA0jc36ePlH1RBm5wfsiuzvBix-Cx3s0CkW3BkeRgibFMSTPIRSx-T8aueP1ivnvxMJFPdV3aO2v7v1k1igltcXykG_qbvG8HcYu7Lihn9CtINkOb3HHzfeibrMB-Z-9D-UOcawNGRYEZnQFM0FTP4JdZLRXHzw
Sorry for the late response to #9: It appears this only repros on this page.
Status: Archived (was: Assigned)
Speaking of late responses, I found this bug while going through the back of my inbox. Given the changes to the code (and the graphs) related to this bug since it was filed, I'm inclined to archive it. If someone thinks this still deserves more investigation, feel free to reopen (preferably with some suggested next steps).
