
Issue 763710 link

Starred by 5 users

Issue metadata

Status: Fixed
Owner:
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug-Regression

Blocked on:
issue 776013




Chrome causes system to go out of memory after resuming from sleep

Reported by darkbaha...@gmail.com, Sep 10 2017

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.9 Safari/537.36

Steps to reproduce the problem:
1. Set a hard commit limit (fixed page file size)
2. Sleep computer for several hours with a chrome session running
3. Upon resume, the system goes OOM and Windows terminates various processes in an attempt to recover, leading to various application crashes.

What is the expected behavior?
System resumes in the state it went to sleep.

What went wrong?
My system runs a fixed commit limit of 21GB: 16GB of physical RAM with a 5GB page file. On average it runs with about 11GB of commit used on the desktop, with a large Chrome session (~70 tabs) and a couple of background applications.

As of 3192, when the system is put to sleep and then resumed some time later, Chrome appears to try allocating lots of memory for something. Because this system runs a fixed commit limit, the system goes OOM and various applications crash as a result. Usually a complete reboot is the only way to recover the system.

The amount of time the system is asleep is definitely a factor here. If I only sleep it for 1 hour then I notice a spike on the memory graph, but there is no OOM condition. If I sleep the system overnight (~10 hours sleep time) then the system is always OOM on resume, even though it had more than 10GB of commit free at the time the system went to sleep.

If Chrome is closed before the system goes to sleep, the issue doesn't occur and the system resumes normally. It only happens if Chrome is running when the system enters sleep.

Did this work before? Yes 62.0.3188.4

Chrome version: 62.0.3202.9  Channel: dev
OS Version: 10.0
Flash Version: 

The issue has been occurring since 62.0.3192.0. Build 3188 appeared to work with no problems, so the regression seems to have been introduced between those two builds.
 
Cc: ligim...@chromium.org
Components: Internals>Core
Labels: Needs-Triage-M62 Needs-Bisect
Cc: sc00335...@techmahindra.com
Labels: Triaged-ET Needs-Feedback
Tested this on Windows-10 chrome version: 62.0.3202.9 as per below test steps: 
1. Opened 20/30 tabs in 2 user profiles.
2. Kept the system to sleep.

This is being investigated; I will resume the laptop from sleep tomorrow and update with the result.

darkbahamut@: Could you please confirm whether any crash is encountered when the system is resumed from sleep? If yes, please attach the crash ID from chrome://crashes.

Thank you!
Hi!

Thanks for taking a look. Yes, there are crashes if it's been asleep long enough to go OOM. Sometimes Chrome locks up and doesn't seem to generate a crash report for it, but it seems one was sent a couple of days ago.

Uploaded Crash Report ID af4fb3e91ed7a302 (Local Crash ID: a08cfb60-e2b3-4b31-ba50-737b26ecfe00)

Crash report captured on Saturday, September 9, 2017 at 9:01:40 PM, uploaded on Saturday, September 9, 2017 at 9:05:29 PM

I grabbed this screenshot of last night's memory usage. The system was asleep for 8 hours and 5 minutes and was woken up this morning. The spike on the graph is the system waking up; the period before it is pre-sleep, and the period after it is the memory usage recovering. High CPU usage (one core under full load) was observed in the main browser process while the memory usage recovered (but the process's own memory usage was normal).
mem.png (29.6 KB)
Project Member

Comment 4 by sheriffbot@chromium.org, Sep 12 2017

Labels: -Needs-Feedback
Thank you for providing more feedback. Adding requester "sc00335628@techmahindra.com" to the cc list and removing "Needs-Feedback" label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: Stability-Crash
@Reporter: Thank you for providing crash id. 

Updating appropriate labels for further triaging.
Cc: sandeepkumars@chromium.org
Labels: Needs-Feedback
@darkbahamut: The crash ID you provided in comment #3 doesn't have meaningful stack trace information. Could you please update Chrome to the latest dev #63.0.3213.3 and check whether you still face the issue?

If so, please navigate to chrome://crashes and add a sample crash server ID for further action on this.

This is being investigated on the latest dev #63.0.3213.3; I will resume the laptop from sleep after some time and update with the behavior.

Thanks!!

I'm not sure if my chrome had updated to 63.0.3213.3 or not before I put the system to sleep last night (I think it had) but upon waking today the system went right back to OOM and chrome was forcibly terminated by Windows. No crash log was generated, likely because it got killed by Windows rather than crashing. I'm not sure the crash logs will provide much information really, as when it does 'crash' it's only crashing because it runs out of address space. The real issue is whatever is happening to cause the huge spike in memory usage after sleep which leads to the OOM if left long enough.

From what I can see, the memory usage seems to increase at roughly 900MB/hour while it's sleeping, so on this system more than about 10 hours asleep pushes it to OOM (but any time asleep is enough for the issue to occur).
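
As a rough sanity check on these figures, the sketch below estimates how long the machine could sleep before the free commit is used up. It assumes linear growth at the observed ~900MB/hour and uses the numbers from the report (21GB limit, ~11GB in use); the function name and the linear model are illustrative assumptions, not measurements.

```python
def hours_until_commit_exhausted(limit_gb, used_gb, growth_mb_per_hour):
    """Hours of sleep before the growth would consume all free commit.

    Assumes the growth is linear in sleep time (an assumption based on
    the ~900MB/hour figure observed in this report).
    """
    headroom_mb = (limit_gb - used_gb) * 1024
    return headroom_mb / growth_mb_per_hour


hours = hours_until_commit_exhausted(limit_gb=21, used_gb=11, growth_mb_per_hour=900)
print(f"{hours:.1f} hours")  # prints "11.4 hours"
```

That lands in the same range as the overnight sleeps that reliably end in OOM on this machine.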
Project Member

Comment 8 by sheriffbot@chromium.org, Sep 14 2017

Labels: -Needs-Feedback
Thank you for providing more feedback. Adding requester "sandeepkumars@chromium.org" to the cc list and removing "Needs-Feedback" label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Needs-Bisect
Unable to reproduce the issue on Win-10 using latest chrome dev version #63.0.3213.3 and the crash id: af4fb3e91ed7a302 (comment#3) doesn't show any meaningful stack trace. Hence, unable to proceed further with crash triaging.

Removing the Needs-Bisect label as of now as it is not reproducible from TE-end. Please feel free to add the same if required.

Thanks...!!


This is an odd one for sure. I've done more testing on this machine and it's definitely chrome related. I also tested on my tablet but it didn't occur there, so it appears it's setup dependent (which isn't going to make working out the cause very easy..). This is where I'm at so far with testing over the past 2 days.

Dev (63.0.3213.3) - Issue occurs
Canary (63.0.3215.0) - Issue occurs (clean install)
Stable (61.0.3163.79) - No issue (clean install)

All tested with the same tabs running. So a clean install does nothing, but as soon as I revert to a pre 62.0.3192.0 build then the issue disappears. I've updated a 2nd desktop here to the latest dev now so will keep an eye on that. It's much closer in setup to this PC than the tablet was so I'll see if that shows any signs of the same issue when it gets woken up later today.

I've attached 3 screenshots of the memory usage on wake-up from the 3 builds tested above. Filename says which channel and time asleep. Both Dev and Canary show a visible bump in usage, followed by high CPU usage and high I/O on the chrome main process. Stable is completely flat and shows nothing at all.
Dev8hrs.png (44.7 KB)
Canary2hrs.png (43.6 KB)
stable2hrs.png (45.2 KB)
Still getting this issue. I've been trying to get more information on it. Looking at the elevated CPU usage after resuming, it appears to be in a message loop on the main browser process. One theory is that a backlog of messages builds up while the system is asleep, which causes the spike in system memory usage on wakeup; the message loop thread then clears these out, which shows up as the recovery in memory usage alongside the high CPU usage.

As to why any of that is happening is still unclear though!

Attached a stack trace of the message loop in question while the system was recovering memory usage. Not sure if it provides anything helpful, but more information can't hurt!
stack.txt (2.1 KB)
I've spent some time trying to find the issue. After bisecting the builds I've found the commit which causes the issue. It's this:

https://chromium.googlesource.com/chromium/src/+/7be7aceb006f81a3638bba4249c4367ddf659351

Without that commit the memory usage is fine, with it the spikes on resuming are observed.

I'm just checking whether it's still present in today's Canary to make sure it hasn't been fixed in the meantime, but the commit above is the source of the problems I observe.
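
For anyone repeating the bisect, the procedure is a binary search over an ordered list of builds. The sketch below is generic: the build numbers and the `REGRESSED_AT` value are placeholders for illustration, and `is_bad` stands in for the manual step of sleeping the machine, resuming, and checking the memory graph.

```python
def first_bad_build(builds, is_bad):
    """Return the first build for which is_bad() is true.

    Assumes builds are ordered oldest to newest, builds[0] is known good,
    builds[-1] is known bad, and the regression stays once introduced.
    """
    good, bad = 0, len(builds) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(builds[mid]):
            bad = mid
        else:
            good = mid
    return builds[bad]


# Placeholder build numbers between a known-good and a known-bad release.
builds = [3188, 3189, 3190, 3191, 3192]
REGRESSED_AT = 3190  # made-up value, only to make the example runnable

# is_bad stands in for "sleep, resume, look for the memory spike".
print(first_bad_build(builds, lambda build: build >= REGRESSED_AT))  # prints 3190
```

Each step halves the remaining range, which matters here because every test costs a multi-hour sleep cycle.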

Comment 13 by ajha@chromium.org, Oct 11 2017

Cc: l...@chromium.org ducbui@google.com
Labels: -Pri-2 ReleaseBlock-Stable M-62 Pri-1
Owner: nasko@chromium.org
Status: Assigned (was: Unconfirmed)
Based on C#12 bisect assigning to the suspected CL owner/reviewers for more inputs and debugging of this.

Tagging with M-62 RB-Stable for tracking, feel free to remove if this should not be blocking.

Comment 14 by nasko@chromium.org, Oct 11 2017

Cc: nasko@chromium.org
Owner: ojan@chromium.org
Assigning to ojan@, as I think he took the lead on GRC. Feel free to dispatch to more appropriate person.

Comment 15 by l...@chromium.org, Oct 11 2017

Cc: -l...@chromium.org
Owner: l...@chromium.org
Assigning to myself. This issue should be fixed in https://chromium-review.googlesource.com/c/chromium/src/+/627031
@15

It looks like that commit has already been merged to master from what I can see. The issue is still present in every build since 62.0.3189 including Canary 63.0.3236.0 so I don't think that has resolved the issue unfortunately.

Comment 17 by l...@chromium.org, Oct 11 2017

Cc: fmea...@chromium.org
Let me try disabling the code you pointed to today, and let's see if that helps once Canary rolls out. :)

CC the owner of related metrics collected by that code.
Cc: abdulsyed@chromium.org

Comment 19 by l...@chromium.org, Oct 11 2017

Cc: erikc...@chromium.org
Add memory expert for insight :)
Given that we have a bisect in c#12, and c#15 didn't fix the problem, please revert the CLs in question and investigate further. 
lpy@ This issue is marked as RB-Stable for M62. Could you please let us know whether there is any update on it?

Thanks!
Cc: oysteine@chromium.org
+oysteine - this is a RB-S bug, that has a bisect pointing to a CL, see c#12. Is there any reason we can't revert and investigate further?
Cc: ojan@chromium.org
I doubt the commit could just be reverted outright; it's two months old and has some base functionality that's also used for other features now.

We could disable or Finch-gate the actual sending of the EQT from renderer_scheduler_impl.cc though. lpy, have you had any luck reproducing this?

Comment 24 by l...@chromium.org, Oct 16 2017

I don't have a Windows machine to reproduce it on, but I can remove the code that collects EQT.
#24: Great, yep given that this is a stable release blocker that's probably a good idea.
I attempted to revert it myself yesterday but as mentioned, given the age there are quite a few conflicts so it's not so easy it seems.

I tested the latest Canary last night (63.0.3239.6) and still getting the issue. Right now I'm on 64.0.3241.2 built from source so I'll confirm the issue is still present on this build tonight and I'll try disabling the EQT code tomorrow if so. I think it seems likely that should resolve it (I hope), assuming that won't cause any issues with anything else.

Comment 27 by l...@chromium.org, Oct 16 2017

Cool, I am going to remove the code here: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/scheduler/renderer/renderer_scheduler_impl.cc?l=2002

The patch is https://chromium-review.googlesource.com/c/chromium/src/+/721613

Maybe you can apply the patch, compile Chrome, and try it :)
As discussed with erikchen@, we will continue with our m62 stable push tomorrow. We will monitor for this issue and include fix/revert in our next M62 Respin. 

Comment 29 by zh...@chromium.org, Oct 17 2017

Cc: zh...@chromium.org
Project Member

Comment 30 by bugdroid1@chromium.org, Oct 17 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/de184fe0065ceca2a3c8e65f91cc7ac48b7b288c

commit de184fe0065ceca2a3c8e65f91cc7ac48b7b288c
Author: Peiyong Lin <lpy@chromium.org>
Date: Tue Oct 17 19:21:48 2017

Remove EQT plumbing to GRC.

The patch that added EQT plumbing to GRC was suspected as a cause of OOM
crash when Windows was waken up, thus revert it speculatively.

BUG= 763710 

Change-Id: I4695328752aa5c9a19f100dbeaaefd2afb1b350b
Reviewed-on: https://chromium-review.googlesource.com/721613
Commit-Queue: lpy <lpy@chromium.org>
Reviewed-by: Alexander Timin <altimin@chromium.org>
Reviewed-by: Timothy Dresser <tdresser@chromium.org>
Cr-Commit-Position: refs/heads/master@{#509483}
[modify] https://crrev.com/de184fe0065ceca2a3c8e65f91cc7ac48b7b288c/third_party/WebKit/Source/platform/scheduler/renderer/renderer_scheduler_impl.cc

Comment 31 by l...@chromium.org, Oct 17 2017

Hello darkbahamut@, could you please help us verify the problem once the new Canary rolls out? :)

Btw, could you please also share some reproduction steps with us?
Hello!

I gave this a test last night and today. I built a 64.0.3241.2 build and put the computer to sleep overnight. Memory issue observed upon waking.

I applied the patch above to that branch this morning and then put the system back to sleep while I was at work (~9 hours), and the issue is no longer present. I tried it again for another 3.5 hours asleep and again saw no issues. I can confirm the patch resolves the issue on my end :)

Reproducing the issue is an odd one. On this machine I can reproduce it 100% as long as that code is present. It persists across all channels (beta/dev/canary) and even when building from source, so there is no issue with the installation. But on a 2nd machine here the issue is not present at all, so there must be something specific it doesn't like about this machine's setup.

At a guess, the intention of that code appears to be to check something once per second. When the system is put to sleep and then resumed, the browser seems to notice that the time has changed, queue up all the work that should have happened while the system wasn't running, and try to do it all at once, instead of discarding the time the system was off. This causes memory usage to spike, and the message loop goes to 100% CPU load to clear out the work, which is the recovery of the memory usage observed in the images above. The longer the system is asleep, the worse it is: somewhere in the region of 900MB/hour on an 80-tab session. That's only a guess from the behaviour observed, though.
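
The guess above can be illustrated with a small simulation. This is a sketch of the hypothesis only, not Chromium's scheduler code: `PeriodicTask` and its method are made-up names, and the one-second period is taken from the description above.

```python
class PeriodicTask:
    """Toy periodic timer that catches up on every missed tick."""

    def __init__(self, period_s, start_s=0.0):
        self.period_s = period_s
        self.next_due_s = start_s + period_s

    def pending_ticks(self, now_s):
        """Queue one work item for each period that has elapsed by now_s."""
        queued = 0
        while self.next_due_s <= now_s:
            queued += 1
            self.next_due_s += self.period_s
        return queued


task = PeriodicTask(period_s=1.0)
print(task.pending_ticks(now_s=1.0))             # normal operation: prints 1
print(task.pending_ticks(now_s=1.0 + 8 * 3600))  # after 8 hours asleep: prints 28800
```

If something like this ran once per tab, an 80-tab session would queue on the order of 2.3 million items after an 8 hour sleep, which would fit a backlog that grows with sleep time and is drained by a busy message loop.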

Steps to reproduce on this machine are:
1. Chrome session open with ~80 tabs
2. Running any build with the EQT code/plumbing in place
3. Put system to sleep then wake up at a later point in time.

Upon wakeup, memory/commit usage will have increased by roughly 900MB per hour asleep, and if the system has a fixed commit limit like mine, it is possible for the system to go OOM and for applications to crash on resume. It's possible other machines are affected, but if they run a normal dynamic page file then that will grow to avoid the OOM condition, so it may not be as noticeable for some.

Comment 33 by l...@chromium.org, Oct 18 2017

Cc: tdres...@chromium.org
cc Tim
Labels: M-63

Comment 35 by zh...@chromium.org, Oct 23 2017

Blockedon: 776013

Comment 36 by npm@chromium.org, Oct 24 2017

Cc: npm@chromium.org
Was the fix in #30 merged to M62?
The fix was not requested for M-62. Lpy@ can you confirm if this is ready for M62? Has this been well tested in lower level channels? Is it a safe merge overall?

Comment 38 by l...@chromium.org, Oct 24 2017

Labels: Merge-Request-62
Project Member

Comment 39 by sheriffbot@chromium.org, Oct 24 2017

Labels: -Merge-Request-62 Merge-Review-62 Hotlist-Merge-Review
This bug requires manual review: Request affecting a post-stable build
Please contact the milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), bhthompson@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 40 by l...@chromium.org, Oct 24 2017

Labels: -Merge-Review-62 Merge-Request-63
Labels: -M-62 found-in-62
As discussed with lpy@, it's a fairly rare scenario, I recommend we consider this for M63. Removing M62. 
Project Member

Comment 42 by sheriffbot@chromium.org, Oct 25 2017

Labels: -Merge-Request-63 Hotlist-Merge-Approved Merge-Approved-63
Your change meets the bar and is auto-approved for M63. Please go ahead and merge the CL to branch 3239 manually. Please contact milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), gkihumba@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Please merge your change to M63 branch 3239 by 4:00 PM PT today (Thursday). Thank you.
Project Member

Comment 44 by bugdroid1@chromium.org, Oct 26 2017

Labels: -merge-approved-63 merge-merged-3239
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a0771ddd35addd512f8ae728e4c48177c602d74b

commit a0771ddd35addd512f8ae728e4c48177c602d74b
Author: Peiyong Lin <lpy@chromium.org>
Date: Thu Oct 26 17:37:47 2017

Remove EQT plumbing to GRC.

The patch that added EQT plumbing to GRC was suspected as a cause of OOM
crash when Windows was waken up, thus revert it speculatively.

BUG= 763710 

Change-Id: I4695328752aa5c9a19f100dbeaaefd2afb1b350b
Reviewed-on: https://chromium-review.googlesource.com/721613
Commit-Queue: lpy <lpy@chromium.org>
Reviewed-by: Alexander Timin <altimin@chromium.org>
Reviewed-by: Timothy Dresser <tdresser@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#509483}(cherry picked from commit de184fe0065ceca2a3c8e65f91cc7ac48b7b288c)
Reviewed-on: https://chromium-review.googlesource.com/739762
Reviewed-by: lpy <lpy@chromium.org>
Cr-Commit-Position: refs/branch-heads/3239@{#245}
Cr-Branched-From: adb61db19020ed8ecee5e91b1a0ea4c924ae2988-refs/heads/master@{#508578}
[modify] https://crrev.com/a0771ddd35addd512f8ae728e4c48177c602d74b/third_party/WebKit/Source/platform/scheduler/renderer/renderer_scheduler_impl.cc

Project Member

Comment 45 by bugdroid1@chromium.org, Oct 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/bc1a27994f49199f4f193489c5fe36dfb22a3057

commit bc1a27994f49199f4f193489c5fe36dfb22a3057
Author: Peiyong Lin <lpy@chromium.org>
Date: Mon Oct 30 18:49:22 2017

Re-add EQT plumbing to GRC.

In https://chromium-review.googlesource.com/c/chromium/src/+/737430, we
introduced an approach to avoid calculating EQT for long idle period. Thus, we
re-add EQT plumbing to GRC in this patch. Previously it was removed in:
https://chromium-review.googlesource.com/c/chromium/src/+/721613

BUG= 763710 

Change-Id: I511f4ad6ea0f6e35d16d299248053e19f4834d21
Reviewed-on: https://chromium-review.googlesource.com/740013
Commit-Queue: lpy <lpy@chromium.org>
Reviewed-by: Alexander Timin <altimin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#512556}
[modify] https://crrev.com/bc1a27994f49199f4f193489c5fe36dfb22a3057/third_party/WebKit/Source/platform/scheduler/renderer/renderer_scheduler_impl.cc
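
The re-landed change depends on not calculating EQT across long idle periods. The real logic is in the CL referenced in the commit message; what follows is only a sketch of the general idea, with a made-up threshold and made-up names, showing a periodic task that discards an overly long gap instead of replaying it.

```python
MAX_CATCH_UP_S = 30.0  # hypothetical threshold, not taken from Chromium


def ticks_to_run(next_due_s, now_s, period_s, max_catch_up_s=MAX_CATCH_UP_S):
    """Return (ticks to process now, new next-due time).

    If the gap since the last due time exceeds max_catch_up_s (for example
    after system sleep), the missed periods are discarded and the schedule
    restarts from now_s.
    """
    if now_s < next_due_s:
        return 0, next_due_s
    if now_s - next_due_s > max_catch_up_s:
        return 0, now_s + period_s
    ticks = int((now_s - next_due_s) // period_s) + 1
    return ticks, next_due_s + ticks * period_s


print(ticks_to_run(next_due_s=2.0, now_s=4.5, period_s=1.0))             # (3, 5.0)
print(ticks_to_run(next_due_s=2.0, now_s=2.0 + 8 * 3600, period_s=1.0))  # (0, 28803.0)
```

A short stall is still caught up tick by tick, while an 8 hour gap produces no queued work at all.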

 Issue 779391  has been merged into this issue.
[Bulk Edit]
URGENT - PTAL.
M63 Stable promotion is coming soon and your bug is labelled as Stable ReleaseBlock, pls make sure to land the fix and get it merged into the release branch ASAP. Thank you.

Comment 48 by l...@chromium.org, Nov 1 2017

Status: Fixed (was: Assigned)
