Browser process taking up >100% of CPU while idle
Issue description
Chrome Version: 60.0.3112.52
OS Version: 9592.42.0
See attached trace. I'm seeing a huge amount of work on the IO thread, specifically MessagePumpLibevent::OnLibeventNotification and MessageLoop::RunTask, though I'm not sure whether that tells us anything about what's actually happening.
,
Jul 25 2017
This cycle is firing 200 times per *millisecond*
,
Jul 25 2017
Here's another trace with more trace categories enabled, all tabs closed, and all extensions disabled (except those that corp forces me to have). Still not too enlightening...
,
Jul 25 2017
Feedback report: https://listnr.corp.google.com/report/69161315555
CHROME VERSION: 60.0.3112.52 beta
CHROMEOS_RELEASE_BOARD: samus-signed-mp-v3keys
CHROMEOS_RELEASE_DESCRIPTION: 9592.42.0 (Official Build) beta-channel samus
CHROMEOS_RELEASE_TRACK: beta-channel
CHROMEOS_RELEASE_VERSION: 9592.42.0
,
Jul 25 2017
Here's one with rendererscheduler.debug (per Ojan's request)
,
Jul 26 2017
I observed the Chrome process consistently using about 20% CPU on the OOBE screen right after the device was reflashed to the target version. (I watched CPU% with top over an SSH terminal.) The same behavior shows up after a reboot (still on the OOBE screen). After logging into a guest session, the behavior is gone. I cannot reproduce an idle browser taking >100% CPU. Ryan, can you share reproduction steps?
,
Jul 26 2017
I have no steps to reproduce, but I've been avoiding my laptop in case we need to take any additional diagnostic information. This issue is still happening: >100% CPU in the browser process all the time.
,
Jul 26 2017
er, avoiding *restarting* my laptop
,
Jul 26 2017
Ryan, can you submit feedback from your chromebook? (alt-shift-i) Dean, are there any other traces that would be useful to collect?
,
Jul 26 2017
Feedback report - ahh, it's in comment #4. Dean, can you please take a look?
,
Jul 26 2017
+some scheduling folks who might have good debugging ideas while Ryan still has his laptop in this broken state.
,
Jul 27 2017
Conrad, sorry, I can't see any way to reproduce it from the feedback report.
,
Jul 27 2017
Based on the trace in comment #5 the browser's IO thread is pegged by notifications about a file descriptor becoming readable and calls to start watching a file descriptor. Could this be a mojo/IPC issue?
,
Jul 27 2017
+jam since Sami said Mojo. FWIW, my computer is still in this state if there's any info that would be helpful.
,
Jul 27 2017
+some more folks. Ryan: can you copy & paste the output of the variations field from chrome://version so we can see which experiments are enabled?
,
Jul 27 2017
Sure thing: 98ee9f3e-98ee9f3e 6a89113b-ccb9a6a5 ea8deb27-803f8fc4 6c18ba9d-3d98b302 241fff6c-4eda1c57 c68ab9a3-6edc92c7 4c328cc8-3d47f4f4 6345b824-3d47f4f4 38eb801c-3f4a17df 7c1bc906-f55a7974 47e5d3db-3d47f4f4 2a33b90e-1d964d50 1c752ce9-1c752ce9 a605b19e-65bced95 5ca89f9-f23d1dea f3499283-7711d854 811bc6bc-ca7d8d80 98be3390-54f732d1 9e201a2b-65bced95 d0165945-3f4a17df 68812885-3f4a17df 9bd94ed7-b1c9f6b0 b791c1b8-3f4a17df 9773d3bd-3f4a17df 99144bc3-4da47e09 165e16d1-3f4a17df 9e5c75f1-7ee11c33 e79c5ffa-3f4a17df f79cb77b-3f4a17df b7786474-d93a0620 be3b5415-bed9585 27219e67-b2047178 23a898eb-e0e2610f d39326b0-d93a0620 e856c60b-ca7d8d80 97b8b9d6-3f4a17df 4ea303a6-53a3c5c2 3d7e3f6a-2eb01455 64224f74-5087fa4a 56302f8c-f23d1dea de03e059-1410f10 2697ea25-ca7d8d80 f56e0452-ca7d8d80 b2f0086-93053e47 1354da85-bed86e5 3ac60855-3ec2a267 f296190c-7158671e 4442aae2-e1cc0f14 ed1d377-e1cc0f14 75f0f0a0-6bdfffe7 e2b18481-a90023b1 e7e71889-4ad60575 828a5926-ca7d8d80 81c6897f-2d9ebb2e 90fec5c8-431b2210 e0a8874-e0a8874
,
Jul 27 2017
Note that the IO message pump is implemented in such a way that any task posted to the IO thread will result in the FD watcher wakeup path - the pump is just watching a pool of FDs, and a task posted to an idle loop will signal a special notification FD to force a wakeup. In the trace from #5 I see a few sparse actual IPC messages being dispatched, but the majority of the wakeups do not appear to dispatch IPC. There is a remote possibility (though we've never seen such a thing in the wild) that some internal Mojo bug would cause the browser and a renderer to get stuck circulating some internal Mojo control messages in high frequency, and we wouldn't see real IPC message dispatches in those cases. We would however expect to see both the browser and at least one other process pegging CPU. To me it seems more likely that some code in the browser is spamming the IO message loop with tasks.
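To illustrate the wakeup path (a minimal, self-contained sketch of the general wakeup-FD pattern, not the actual MessagePumpLibevent code; every name below is made up):

// Minimal standalone sketch of the "wakeup pipe" pattern: the pump polls a set
// of FDs, and PostTask() writes one byte to a pipe so the poll wakes up even
// when no real I/O happened. Every posted task therefore shows up as an FD
// "readable" notification on the IO thread.
#include <poll.h>
#include <unistd.h>

#include <cstdio>
#include <functional>
#include <mutex>
#include <queue>

class TinyIoPump {
 public:
  TinyIoPump() { pipe(wakeup_fds_); }  // [0] = read end, [1] = write end.
  ~TinyIoPump() { close(wakeup_fds_[0]); close(wakeup_fds_[1]); }

  void PostTask(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    char byte = 0;
    write(wakeup_fds_[1], &byte, 1);  // Force the poll() below to return.
  }

  void RunOnce() {
    pollfd pfd{wakeup_fds_[0], POLLIN, 0};
    poll(&pfd, 1, -1);                           // "FD became readable".
    char drain[64];
    read(wakeup_fds_[0], drain, sizeof(drain));  // Drain the wakeup bytes.
    std::queue<std::function<void()>> ready;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ready.swap(tasks_);
    }
    while (!ready.empty()) {                     // "RunTask" equivalent.
      ready.front()();
      ready.pop();
    }
  }

 private:
  int wakeup_fds_[2];
  std::mutex mutex_;
  std::queue<std::function<void()>> tasks_;
};

int main() {
  TinyIoPump pump;
  pump.PostTask([] { std::puts("task ran via wakeup FD"); });
  pump.RunOnce();
}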
,
Jul 27 2017
Could be related to Mojo loading: MojoAsyncResourceHandler uses Mojo's SimpleWatcher helper from the IO thread, and it calls SimpleWatcher::ArmOrNotify(), which, if not used carefully (i.e. if called synchronously from the provided watcher callback without ensuring the handle state has changed meaningfully), can quite easily result in this kind of task spam.
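To show why that misuse spins the loop, here is a self-contained model of the anti-pattern; the Watcher and FakeHandle types below are hypothetical stand-ins, not Mojo's actual SimpleWatcher API:

// "ArmOrNotify" here models the behavior described above: if the watched
// handle is already signaled, a notification task is posted immediately. If
// the callback re-arms without consuming any data, every task immediately
// schedules the next one and the IO loop never goes idle.
#include <cstdio>
#include <functional>
#include <queue>

struct FakeHandle {
  bool readable = true;  // Never drained, so it stays "readable" forever.
};

class Watcher {
 public:
  Watcher(std::queue<std::function<void()>>* loop, FakeHandle* handle,
          std::function<void()> callback)
      : loop_(loop), handle_(handle), callback_(std::move(callback)) {}

  // If the handle is already signaled, notify immediately by posting a task.
  void ArmOrNotify() {
    if (handle_->readable)
      loop_->push(callback_);
  }

 private:
  std::queue<std::function<void()>>* loop_;
  FakeHandle* handle_;
  std::function<void()> callback_;
};

int main() {
  std::queue<std::function<void()>> io_loop;  // Stand-in for the IO MessageLoop.
  FakeHandle handle;
  Watcher* watcher = nullptr;
  Watcher w(&io_loop, &handle, [&] {
    // Anti-pattern: re-arm synchronously without reading from the handle, so
    // its state never changes and another notification is posted at once.
    watcher->ArmOrNotify();
  });
  watcher = &w;
  watcher->ArmOrNotify();

  for (int i = 0; i < 5 && !io_loop.empty(); ++i) {
    auto task = io_loop.front();
    io_loop.pop();
    task();
    std::printf("iteration %d: pending tasks = %zu\n", i, io_loop.size());
  }
}

The point being that the loop stays busy forever unless the callback actually changes the handle's state before re-arming.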
,
Jul 28 2017
FWIW I'm seeing this on my macbook pro. The mojo loading experiment was enabled, but I disabled it and I'm still seeing high cpu usage that's draining my battery.
,
Jul 28 2017
btw I can generate a trace, which categories would be useful?
,
Jul 28 2017
I typically trace all the categories in the left column and add the renderer_scheduler ones from the right column for renderer issues (although the renderer_scheduler ones can sometimes generate too much data and cause tracing to go nuts). I'm less familiar with what's useful for the browser process. gab@, any suggestions?
,
Jul 28 2017
I usually just start with "toplevel" and "task_scheduler" and then see where the majority of tasks come from via the src_file field. From there, if src_file isn't enough, I refine with another trace that enables categories relevant to that src_file.
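For context on why src_file pinpoints the origin, here is a standalone model: the Location type and PostTask below are simplified stand-ins for Chromium's base::Location / FROM_HERE, not the real API. The idea is that the posting site is captured when the task is posted, and the trace records it for every task run.

#include <cstdio>
#include <functional>
#include <queue>

struct Location {
  const char* file;
  int line;
};
#define FROM_HERE Location{__FILE__, __LINE__}

struct PendingTask {
  Location posted_from;  // This is what shows up as src_file in the trace.
  std::function<void()> task;
};

std::queue<PendingTask> g_queue;

void PostTask(Location from, std::function<void()> task) {
  g_queue.push({from, std::move(task)});
}

void DoWork() {}

int main() {
  PostTask(FROM_HERE, &DoWork);  // src_file will point at this line.
  while (!g_queue.empty()) {
    PendingTask pending = g_queue.front();
    g_queue.pop();
    std::printf("running task posted from %s:%d\n", pending.posted_from.file,
                pending.posted_from.line);
    pending.task();
  }
}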
,
Jul 28 2017
Thanks, OK, I traced with the categories gab@ suggested. It looks like the source is task_group_sampler.cc (task manager!). Once I closed the task manager, CPU usage went down dramatically (from ~20% to ~2%). Here's a trace (not using an attachment since it has some internal URLs): https://drive.google.com/a/google.com/file/d/0B68xQb9wZOl7WllxUXBFUTRUWmM/view?usp=sharing
,
Jul 29 2017
Oh yeah... I think poor task manager performance is a known issue which Bruce has looked into
,
Jul 29 2017
There have been problems with task manager being a bit inefficient (but nothing this bad) but the known issues were resolved. This is Linux, right? Either task manager is much less efficient there or, more likely, some weird edge case is being triggered.
,
Jul 31 2017
FWIW, my original problem was likely not related to task manager. My fans were running pretty much all the time, which is what prompted me to open up the task manager in the first place. Even after closing it I thought my laptop was going to take off. So, it's possible these are different issues. Unfortunately at this point I've restarted Chrome to update it, so I no longer have the high CPU behavior.
,
Jul 31 2017
OK, if it happens again, a Chrome trace with "toplevel" and "task_scheduler" is the easiest way to see at least the origin of the problem.
,
Jul 31 2017
Feedback report only indicates a busy IO thread in the 'top' section as people have reported:
top - 14:36:51 up 10 days, 23:13, 0 users, load average: 1.86, 1.61, 1.38
Threads: 570 total, 3 running, 567 sleeping, 0 stopped, 0 zombie
%Cpu(s): 32.3 us, 8.0 sy, 0.0 ni, 58.6 id, 1.1 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 16344884 total, 12488440 used, 3856444 free, 240032 buffers
KiB Swap: 23942700 total, 0 used, 23942700 free. 5998244 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1520 chronos 12 -8 1721624 501428 145988 R 94.7 3.1 141:30.12 Chrome_IOTh+
1 root 20 0 14700 2584 1384 S 0.0 0.0 0:01.78 init
Should we assign the bug to the Mojo team based on the comment in #18?
,
Aug 11 2017
I'm also now seeing this problem on my Chromebook (60.0.3112.80). A trace shows the browser IO thread being similarly pegged with StartWatching/OnFileCanReadWithoutBlocking.
,
Aug 11 2017
chrome://inducebrowsercrashforrealz gave me this: https://crash.corp.google.com/browse?q=reportid=%27eebfc9c64cf5a3db%27#0 (still being processed)
,
Aug 11 2017
If the chrome://tracing traces are inconclusive, can you grab an ETW trace the next time it is in this state? See go/etw or https://randomascii.wordpress.com/2015/09/01/xperf-basics-recording-a-trace-the-ultimate-easy-way/. After installing UIforETW you probably want to change the tracing mode from "Circular buffer tracing" to "Tracing to file". Share the trace with me and I'll take a look.
,
Aug 14 2017
The crash report didn't really reveal anything. There's some sqlite activity, but I think that's just because I was typing in the omnibox at the time. I think we need a perf/ETW trace like Bruce said. I have reproduced the problem again on my Chromebook. As far as I can tell, the trigger seems to be taking a trace while casting the screen to Hangouts. However, I don't know if I can pull any useful data out of a Chromebook while it's in this state. Any ideas?
,
Aug 14 2017
Actually looks like you just need to take a trace on this Chromebook to get it into a bad state -- doesn't seem to matter what else is going on.
,
Aug 24 2017
Fairly certain this is what I'm seeing on my Debian stretch laptop running Chrome 60.0.3112.101-1. Chrome is completely idle, yet the Browser process is at 100% CPU. I don't have an absurd number of tabs open, and they're not doing anything in the background AFAICT. The last time this wasn't an issue was somewhere around Chrome 59.0.3071.115-1. I don't have to open up the task manager or inspector to get it into this state. It eventually just happens.
,
Aug 24 2017
#36: Could you please record a trace[1] of the problem and attach it here so we can see if it's the same problem. [1] https://www.chromium.org/developers/how-tos/trace-event-profiling-tool/recording-tracing-runs
,
Aug 24 2017
Trace attached.
,
Aug 25 2017
The trace shows two (unidentified) tabs that are waking up ten times a second, which is wasteful, but not enough to account for the observed behavior. "Sell Out: Part Two | The New Yorker" is animating at 60 fps, so that uses more time. I'm curious about VoiceProcessThread on uBlock Origin, but I *think* it's just waking up briefly once a second. The real culprit is IndexedDBDatabase::CreateTransaction. Each invocation just takes 3-4 ms but they are happening back-to-back, continuously. There are 4,426 of them in the trace. They are scheduled from content/browser/indexed_db/database_impl.cc in DatabaseImpl::CreateTransaction, but I'm not sure what is triggering that, or why. There have been a lot of changes to indexed_db.mojom.cc lately (generated from src/content/common/indexed_db/indexed_db.mojom) so maybe one of those is the trigger?
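For scale (assuming the 3-4 ms figure is typical of all of them): 4,426 invocations x ~3.5 ms is roughly 15.5 seconds of back-to-back CreateTransaction work over the trace, which by itself is enough to keep a thread pegged.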
,
Aug 25 2017
+some people who are familiar with the indexed DB impl
,
Aug 25 2017
Calls to DatabaseImpl::CreateTransaction come from the other end of a Mojo pipe which is written to from WebIDBDatabaseImpl::CreateTransaction. Do we see any renderer activity in the trace indicating calls to IndexedDB?
,
Aug 25 2017
+more IDB people
The IndexedDBDatabase::CreateTransaction calls are interspersed with IndexedDBDatabase::DeleteRange calls with corresponding transaction IDs (yay for tracing args) which are incrementing monotonically. This is the trace similar to what you'd get if one tab was running:
while (true) { db.transaction('s', 'readwrite').objectStore('s').delete(0); }
... but I don't see a renderer doing so. (Also not sure why CreateTransaction takes 3ms, it should be cheap...)
The transaction IDs are incremented on the renderer side at the point of the script's .transaction() call, so this doesn't look like e.g. a message replay issue.