
Issue 825820

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Mar 2018
Cc:
EstimatedDays: ----
NextAction: 2018-03-27
OS: ----
Pri: 2
Type: Bug-Regression



3.3% regression in system_health.memory_desktop at 545505:545650

Project Member Reported by sullivan@chromium.org, Mar 26 2018

Issue description

See the link to graphs below.
 
Project Member

Comment 1 by 42576172...@developer.gserviceaccount.com, Mar 26 2018

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=825820

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=ab71f41ee5e8f296819621bb38a858bfc3e4dedd2b93ae91d17c886503952851


Bot(s) for this bug's original alert(s):

win-high-dpi
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Mar 26 2018

Cc: jam@chromium.org mmenke@chromium.org csharrison@chromium.org
Owner: mmenke@chromium.org
Status: Assigned (was: Untriaged)
📍 Found a significant difference after 1 commit.
https://pinpoint-dot-chromeperf.appspot.com/job/13dfd15d440000

RDH:  Ensure there's at most one LoadInfoList in flight. by mmenke@chromium.org
https://chromium.googlesource.com/chromium/src/+/d03adf5a2c53f00c4663a93917a9fb0a53f082ad

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions

Comment 4 by mmenke@chromium.org, Mar 26 2018

Owner: ----
Status: Untriaged (was: Assigned)
Given that the CL was reverted and the metric hasn't recovered (https://chromium-review.googlesource.com/c/chromium/src/+/979674 / 545616 is the revert, but the regression graph shows the regression continuing into 545729), the bisect is wrong, or the benchmark isn't measuring what it thinks it is measuring. Punting into the untriaged bin.

Comment 5 by sullivan@chromium.org, Mar 26 2018

NextAction: 2018-03-27
Owner: sullivan@chromium.org
Status: Assigned (was: Untriaged)
The commit position I see on the revert is r545741, and the graphs are still at 545729. Adding a next action item for myself to check the graphs tomorrow, but the bisect looks pretty clear.

Comment 6 by mmenke@chromium.org, Mar 26 2018

Owner: mmenke@chromium.org
Oops, 545741 is the revert. Too many numbers. Re-assigning to myself.

Comment 7 by mmenke@chromium.org, Mar 26 2018

So the benchmark calls itself "memory:chrome:all_processes:reported_by_chrome:malloc:allocated_objects_size_avg". But looking at the Total line under malloc, I see 55.8 MiB in the trace from before and 63.8 MiB after (the difference is mostly in the renderer, which is curious, since this CL doesn't touch the renderer). The benchmark graph shows about 252 MB total size. Am I looking at the right thing?
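
(Back-of-the-envelope, assuming the graph's ~252 MB and the per-process malloc totals in the trace are even comparable, which is part of what I'm asking:)

  # Rough scale check only; assumes the benchmark total (~252 MB) and the
  # malloc totals in the trace measure comparable things.
  before_mib, after_mib = 55.8, 63.8
  total_mb = 252.0
  delta_mb = (after_mib - before_mib) * 1.048576  # MiB -> MB
  print('delta: %.1f MB = %.1f%% of %.0f MB'
        % (delta_mb, 100 * delta_mb / total_mb, total_mb))
  # -> delta: 8.4 MB = 3.3% of 252 MB

So an ~8 MiB renderer-side malloc delta would be about the right size to explain a 3.3% movement, if that's really what the metric sums.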
Cc: erikc...@chromium.org
+benchmark owner erikchen, can you help with mmenke's questions?
The numbers in the overview report "discounted" malloc usage - "effective_size" - which discounts memory that is allocated from malloc but is already included in other MDPs (memory dump providers).

allocated_objects_size includes everything - that would be the "size" column. In this case, both "allocated_objects_size" and "effective_size" increase by 9MB in the renderer process.
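
As a toy illustration of the discounting (not the real memory-infra code; the names and numbers here are made up):

  # Illustration only. "size" is everything each dump provider reports;
  # "effective_size" subtracts bytes that another memory dump provider (MDP)
  # already claims, so the overview doesn't double-count them.
  def effective_sizes(reported_sizes, ownership):
      # reported_sizes: {mdp_name: mib}  (the "size" column)
      # ownership: {owner_mdp: owned_mdp}, i.e. the owner's bytes actually
      # live inside the owned MDP's total (e.g. objects allocated via malloc).
      effective = dict(reported_sizes)
      for owner, owned in ownership.items():
          effective[owned] = max(0, effective[owned] - reported_sizes[owner])
      return effective

  # Toy numbers: malloc reports 100 MiB, another MDP reports 30 MiB of objects
  # that were allocated through malloc.
  print(effective_sizes({'malloc': 100.0, 'other_mdp': 30.0},
                        {'other_mdp': 'malloc'}))
  # -> {'malloc': 70.0, 'other_mdp': 30.0}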

Given that this is a WPR test, I'd guess that the original change is a networking-related change that happens to change the resources that get loaded by WPR, thus causing a difference in renderer memory usage.

sullivan: Do we have a list of false positives we suspect are caused by WPR non-determinism? Anecdotally, I feel like I see this a lot - it might be good to quantify this.
Attachment: Screen Shot 2018-03-26 at 1.20.21 PM.png (24.9 KB)
Status: WontFix (was: Assigned)
The change rate-limited the data we send to the WebContents that displays the little status bar in the lower-left. (The data object was pretty big, and this was OOMing the browser when the UI thread was blocked; csharrison already reduced the size of the object, and my CL just made sure there was only one status update message in flight at a time.)
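
(For the record, the throttling is roughly this shape - a toy sketch, not the actual ResourceDispatcherHost code, and the names here are invented:)

  # Toy sketch of "at most one update in flight". While an update is
  # outstanding, newer ones are coalesced into a single pending slot, so a
  # blocked UI thread sees at most one queued message instead of a pile-up.
  class LoadInfoSender(object):
      def __init__(self, send_to_ui):
          self._send_to_ui = send_to_ui  # posts the update; calls done_cb when consumed
          self._in_flight = False
          self._pending = None

      def update(self, load_info_list):
          if self._in_flight:
              self._pending = load_info_list  # overwrite, don't queue
              return
          self._in_flight = True
          self._send_to_ui(load_info_list, self._on_done)

      def _on_done(self):
          self._in_flight = False
          if self._pending is not None:
              pending, self._pending = self._pending, None
              self.update(pending)

So no matter how many status changes arrive while the UI thread is blocked, there is never more than one update message outstanding at a time.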

I'm not sure why fewer/different status updates would make the renderer consume more memory, but since this is renderer memory, I think that's what we're seeing. (If it were a browser process memory change, I could look for leaks, but in the renderer, this couldn't have created a new leak.)

WontFixing this, per above analysis.  If either of you disagrees, please say so, and we can figure out how to proceed from there.
Cc: perezju@chromium.org nednguyen@chromium.org
+Ned, Juan, has either of you looked at regressions caused by WPR non-determinism?

I remember something similar with mojo loading in August; see the Times of India investigation here: https://docs.google.com/document/d/1wissv31WUjeHaSaf78uAH0FiFQfSU6sqv14Lcuo1toA/edit#heading=h.voebn3wvz3na

But I don't see it that often on bugs I'm cc-ed on. Ned, Juan, any thoughts?
The NextAction date has arrived: 2018-03-27
#11: No, I haven't looked into regressions caused by WPR non-determinism at scale.

It seems hard to capture in theory. I also think it's hard to distinguish those from legit regressions caused by changing the ordering of network loads & how resources are fetched. The best tool we have is tracing to look at what is going on & manually evaluate whether the regression is expected.
Theoretically, we could look at which WPR resources get served [and how many times] for each telemetry page run, and then compare that between page runs. This would give us an up-front warning when different resources are getting served.
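
Something like this, say (a sketch assuming WPR could be made to write one served-request URL per line into a log per page run - that log doesn't exist today, as far as I know):

  import collections

  # Sketch only: compares which URLs WPR served, and how many times each,
  # between two page runs, given hypothetical per-run logs of served URLs.
  def load_served_urls(log_path):
      with open(log_path) as f:
          return collections.Counter(line.strip() for line in f if line.strip())

  def diff_runs(log_a, log_b):
      a, b = load_served_urls(log_a), load_served_urls(log_b)
      return {
          'only_in_a': sorted(set(a) - set(b)),
          'only_in_b': sorted(set(b) - set(a)),
          'count_changed': {url: (a[url], b[url])
                            for url in set(a) & set(b) if a[url] != b[url]},
      }

Any non-empty diff on a run that didn't change networking code would be a strong hint that the bot replayed different resources, rather than that the CL regressed memory.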
Erik: that's a nice proposal. Probably something like a WPR trace.
#14: I filed issue 826309 to track the WPR tracing. Can't promise when we can staff that project, but it's a good direction moving forward for WPR.
Thanks! I feel like I see issues of this type quite frequently [maybe observation bias due to being owner of memory benchmarks] - I'll make sure to add more examples when I run across them.
