New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 748696 link

Starred by 9 users

Issue metadata

Status: Assigned
Owner:
Last visit > 30 days ago
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

Browser process taking up >100% of CPU while idle

Project Member Reported by rsch...@chromium.org, Jul 25 2017

Issue description

Chrome Version       : 60.0.3112.52
OS Version: 9592.42.0

See attached trace. I'm seeing a huge amount of work on the IO thread, specifically MessagePumpLibevent::OnLibeventNotification and MessageLoop::RunTask, though I'm not sure if that tells us any information about what's actually happening.

 
trace_browser100cpu.json.gz
2.4 MB Download
I also filed feedback about this issue from my @google.com account.
This cycle is firing 200 times per *millisecond*
Here's another trace with more trace categories enabled, all tabs closed, and all extensions disabled (except those that corp forces me to have). Still not too enlightening...
trace_moredetails (1).json.gz
3.3 MB Download
Cc: snanda@chromium.org
Owner: bccheng@chromium.org
Status: Assigned (was: Unconfirmed)
Feedback report: https://listnr.corp.google.com/report/69161315555

CHROME VERSION: 60.0.3112.52 beta
CHROMEOS_RELEASE_BOARD: samus-signed-mp-v3keys
CHROMEOS_RELEASE_DESCRIPTION: 9592.42.0 (Official Build) beta-channel samus
CHROMEOS_RELEASE_TRACK: beta-channel
CHROMEOS_RELEASE_VERSION: 9592.42.0
Cc: ojan@chromium.org
Here's one with rendererscheduler.debug (per Ojan's request)
trace_rendererscheduler.debug.json.gz
4.7 MB Download
Cc: deanliao@chromium.org
Cc: rsch...@chromium.org
I observed that Chrome process took 20% CPU time all the time on OOBE screen when it is just reflashed to the target version. (I saw CPU% using top via SSH terminal.) Reboot (still in OOBE screen) can see the same behavior. After logging into guest, the behavior is gone.

I cannot reproduce the situation that do nothing browser taking > 100% CPU time. Ryan, can you reveal reproducible steps?
I have no steps to reproduce, but I've been avoiding my laptop in case we need to take any additional diagnostic information. This issue is still happening, >100% in browser process all the time.
er, avoiding *restarting* my laptop
Ryan, can you submit feedback from your chromebook?  (alt-shift-i)

Dean, are there any other traces that would be useful to collect?
Feedback report - ahh, it's in comment #4. Dean can you please take a look?

Comment 12 by ojan@chromium.org, Jul 26 2017

Cc: fdoray@chromium.org skyos...@chromium.org altimin@chromium.org alexclarke@chromium.org gab@chromium.org
+some scheduling folks who might have good debugging ideas while Ryan still has his laptop in this broken state.
Conrad, sorry that I cannot see any way to reproduce it from feedback report.
Components: Internals>Mojo
Based on the trace in comment #5 the browser's IO thread is pegged by notifications about a file descriptor becoming readable and calls to start watching a file descriptor. Could this be a mojo/IPC issue?
Cc: jam@chromium.org
+jam since Sami said Mojo. FWIW, my computer is still in this state if there's any info that would be helpful.

Comment 16 by jam@chromium.org, Jul 27 2017

Cc: roc...@chromium.org tzik@chromium.org yzshen@chromium.org yhirano@chromium.org
+some more folks
Ryan: can you copy & paste the output of variations from chrome:version so we can see what experiments are enabled?
Sure thing:

98ee9f3e-98ee9f3e
6a89113b-ccb9a6a5
ea8deb27-803f8fc4
6c18ba9d-3d98b302
241fff6c-4eda1c57
c68ab9a3-6edc92c7
4c328cc8-3d47f4f4
6345b824-3d47f4f4
38eb801c-3f4a17df
7c1bc906-f55a7974
47e5d3db-3d47f4f4
2a33b90e-1d964d50
1c752ce9-1c752ce9
a605b19e-65bced95
5ca89f9-f23d1dea
f3499283-7711d854
811bc6bc-ca7d8d80
98be3390-54f732d1
9e201a2b-65bced95
d0165945-3f4a17df
68812885-3f4a17df
9bd94ed7-b1c9f6b0
b791c1b8-3f4a17df
9773d3bd-3f4a17df
99144bc3-4da47e09
165e16d1-3f4a17df
9e5c75f1-7ee11c33
e79c5ffa-3f4a17df
f79cb77b-3f4a17df
b7786474-d93a0620
be3b5415-bed9585
27219e67-b2047178
23a898eb-e0e2610f
d39326b0-d93a0620
e856c60b-ca7d8d80
97b8b9d6-3f4a17df
4ea303a6-53a3c5c2
3d7e3f6a-2eb01455
64224f74-5087fa4a
56302f8c-f23d1dea
de03e059-1410f10
2697ea25-ca7d8d80
f56e0452-ca7d8d80
b2f0086-93053e47
1354da85-bed86e5
3ac60855-3ec2a267
f296190c-7158671e
4442aae2-e1cc0f14
ed1d377-e1cc0f14
75f0f0a0-6bdfffe7
e2b18481-a90023b1
e7e71889-4ad60575
828a5926-ca7d8d80
81c6897f-2d9ebb2e
90fec5c8-431b2210
e0a8874-e0a8874
Note that the IO message pump is implemented in such a way that any task posted to the IO thread will result in the FD watcher wakeup path - the pump is just watching a pool of FDs, and a task posted to an idle loop will signal a special notification FD to force a wakeup.

In the trace from #5 I see a few sparse actual IPC messages being dispatched, but the majority of the wakeups do not appear to dispatch IPC.

There is a remote possibility (though we've never seen such a thing in the wild) that some internal Mojo bug would cause the browser and a renderer to get stuck circulating some internal Mojo control messages in high frequency, and we wouldn't see real IPC message dispatches in those cases. We would however expect to see both the browser and at least one other process pegging CPU.

To me it seems more likely that some code in the browser is spamming the IO message loop with tasks.

Could be related to Mojo loading since I see MojoAsyncResourceHandler uses Mojo's SimpleWatcher helper from the IO thread, and it calls SimpleWatcher::ArmOrNotify() which if not used carefully (i.e. if called synchronously from the provided watcher callback without ensuring the handle state has been changed meaningfully), can quite easily result in this kind of task spam.

Comment 20 by jam@chromium.org, Jul 28 2017

FWIW I'm seeing this on my macbook pro. The mojo loading experiment was enabled, but I disabled it and I'm still seeing high cpu usage that's draining my battery.

Comment 21 by jam@chromium.org, Jul 28 2017

btw I can generate a trace, which categories would be useful?

Comment 22 by ojan@chromium.org, Jul 28 2017

I typically will trace all the categories in the left column and add in the renderer_scheduler ones from the right column for renderer things (although, the renderer_scheduler ones can sometimes generate too much data and cause tracing to go nuts).

I'm less familiar with what's useful for browser process.

gab@ any suggestions?

Comment 23 by gab@chromium.org, Jul 28 2017

I usually just start with "toplevel" and "task_scheduler" and then see where the majority of tasks come from with src_file field. From there, if src_file isn't enough, I refine with another trace that enables categories relevant to that src_file.

Comment 24 by jam@chromium.org, Jul 28 2017

Cc: -yhirano@chromium.org -tzik@chromium.org
Thanks, ok I did the categories that gab@ suggested. It looks like the source is 
 task_group_sampler.cc (task manager!). Once I closed the task manager, the cpu usage went down dramatically (~20% cpu to ~2).

Here's a trace (not using attachment since it has some internal URLs): https://drive.google.com/a/google.com/file/d/0B68xQb9wZOl7WllxUXBFUTRUWmM/view?usp=sharing

Comment 25 by gab@chromium.org, Jul 29 2017

Cc: brucedaw...@chromium.org
Oh yeah... I think poor task manager performance is a known issue which Bruce has looked into
There have been problems with task manager being a bit inefficient (but nothing this bad) but the known issues were resolved. This is Linux, right? Either task manager is much less efficient there or, more likely, some weird edge case is being triggered.
FWIW, my original problem was likely not related to task manager. My fans were running pretty much all the time, which is what prompted me to open up the task manager in the first place. Even after closing it I thought my laptop was going to take off.

So, it's possible these are different issues. Unfortunately at this point I've restarted Chrome to update it, so I no longer have the high CPU behavior.

Comment 28 by gab@chromium.org, Jul 31 2017

Ok if it happens again a chrome trace with "toplevel" and "task_scheduler"
is the easiest way to see at least the origin of the problem.

Le lun. 31 juil. 2017 09:10, rschoen via monorail <
monorail+v2.3577582279@chromium.org> a écrit :
Feedback report only indicates a busy IO thread in the 'top' section as people have reported:

top - 14:36:51 up 10 days, 23:13,  0 users,  load average: 1.86, 1.61, 1.38
Threads: 570 total,   3 running, 567 sleeping,   0 stopped,   0 zombie
%Cpu(s): 32.3 us,  8.0 sy,  0.0 ni, 58.6 id,  1.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  16344884 total, 12488440 used,  3856444 free,   240032 buffers
KiB Swap: 23942700 total,        0 used, 23942700 free.  5998244 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 1520 chronos   12  -8 1721624 501428 145988 R 94.7  3.1 141:30.12 Chrome_IOTh+
    1 root      20   0   14700   2584   1384 S  0.0  0.0   0:01.78 init

Should we assign the bug to the Mojo team based on the comment in #18?
Cc: msnoxell@chromium.org
I'm also now seeing this problem on my Chromebook (60.0.3112.80). A trace shows the browser IO thread being similarly pegged with StartWatching/OnFileCanReadWithoutBlocking.

chrome://inducebrowsercrashforrealz gave me this: https://crash.corp.google.com/browse?q=reportid=%27eebfc9c64cf5a3db%27#0 (still being processed)
If the chrome://tracing traces are inclusive can you grab an ETW trace the next time it is in this state? See go/etw or https://randomascii.wordpress.com/2015/09/01/xperf-basics-recording-a-trace-the-ultimate-easy-way/. After installing UIforETW you probably want to change the tracing mode from Circular buffer tracing to Tracing to file. Share the trace with me and I'll take a look.
The crash report didn't really reveal anything. There's some sqlite activity but I think that's just because I was typing in the omnibox at the time. I think we need a perf/ETW trace lik Bruce said.

I have reproduced the problem again on my Chromebook. As far as I can tell the trigger seems to be taking a trace while casting the screen to hangouts. However I don't know if I can pull any useful data out of a Chromebook while it's in this state. Any ideas?
Actually looks like you just need to take a trace on this Chromebook to get it into a bad state -- doesn't seem to matter what else is going on.
Fairly certain this is what I'm seeing on my Debian stretch laptop running Chrome 60.0.3112.101-1. Chrome is completely idle, yet the Browser process is at 100% CPU. I don't have an absurd number of tabs open, and they're not doing anything in the background AFAICT. The last time this wasn't an issue was somewhere around Chrome 59.0.3071.115-1.

I don't have to open up the task manager or inspector to get it into this state. It eventually just happens. 
#36: Could you please record a trace[1] of the problem and attach it here so we can see if it's the same problem.

[1] https://www.chromium.org/developers/how-tos/trace-event-profiling-tool/recording-tracing-runs
Trace attached.
trace.json.gz
5.1 MB Download
The trace shows two (unidentified) tabs that are waking up ten times a second, which is wasteful, but not enough to account for the observed behavior.

"Sell Out: Part Two | The New Yorker" is animating at 60 fps, so that uses more time.

I'm curious about VoiceProcessThread on uBlock Origin, but I *think* it's just waking up briefly once a second.

The real culprit is IndexedDBDatabase::CreateTransaction. Each invocation just takes 3-4 ms but they are happening back-to-back, continuously. There are 4,426 of them in the trace. They are scheduled from content/browser/indexed_db/database_impl.cc in DatabaseImpl::CreateTransaction, but I'm not sure what is triggering that, or why. There have been a lot of changes to indexed_db.mojom.cc lately (generated from src/content/common/indexed_db/indexed_db.mojom) so maybe one of those is the trigger?


Cc: jsb...@chromium.org reillyg@chromium.org
+some people who are familiar with the indexed DB impl
Calls to DatabaseImpl::CreateTransaction come from the other end of a Mojo pipe which is written to from WebIDBDatabaseImpl::CreateTransaction. Do we see any renderer activity in the trace indicating calls to IndexedDB?
Cc: dmu...@chromium.org cmumford@chromium.org pwnall@chromium.org
+more IDB people

The IndexedDBDatabase::CreateTransaction calls are interspersed with IndexedDBDatabase::DeleteRange calls with corresponding transaction IDs (yay for tracing args) which are incrementing monotonically. This is the trace similar to what you'd get if one tab was running:

while (true) { db.transaction('s', 'readwrite').objectStore('s').delete(0); }

... but I don't see a renderer doing so. (Also not sure why CreateTransaction takes 3ms, it should be cheap...)

The transaction IDs are incremented on the renderer side at the point of the script's .transaction() call , so this doesn't look like e.g. a message replay issue.
Cc: -rsch...@chromium.org

Comment 44 by ojan@chromium.org, May 8 2018

Cc: -ojan@chromium.org
Cc: -roc...@chromium.org rockot@google.com

Sign in to add a comment