New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 878823 link

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Sep 24
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug



Sign in to add a comment

Chrome 68 - "Enable Native Notifications" results in high CPU usage

Project Member Reported by ykrychala@google.com, Aug 29

Issue description

Chrome version: 68.0.3440.106
OS version: Citrix environment on Server 2016 1607 
Case#: 16742264

Description:
user sessions often result in race conditions between explorer.exe and chrome.exe (primary process), where they fight over C:\Users\<username>\AppData\Local\Microsoft\Windows\Notifications\wpndatabase.db-wal. 

The result is that explorer.exe and chrome.exe both consume 10-20% CPU each for an indefinite period. Once the handle for that file is closed manually via procexp, the CPU usage clears immediately. The flag ""Enable native notifications"" does not appear to have a policy associated with it, the customer needs a way to disable this option for all users. Please advise if there's any supported method to disable the flag via GPO. 

Customer has rolled back to version 67 and issue no longer reproduces.

Steps to reproduce: 
1. Install Chrome browser
2. Keep user session running for uncertain amount of time
3. Task Manager starts to show high CPU consumption for both Chrome.exe and explorer.exe processes

Current Behavior / Reproduction: 
- High CPU usage

Expected Behavior: 
- No significant CPU usage when using system native notifications

Existing workaround:
- Disable native notifications

Drive link to logs: 
- unable to collect as the issue is very inconsistent for all users and can't be reproduced on demand

However, customer was able to capture procmon logs:
https://drive.google.com/open?id=15GcdvvRRGkZwAx5hRsgXCTp45XxtG5hr

Screenshot of task manager:
https://drive.google.com/open?id=19f9LpoOksJKILKuXAmaBLxJA5n3tMWWa

 
Components: UI>Notifications UI>Shell>Notifications
Labels: -Pri-2 OS-Windows Pri-1
I took a look at the PML files. One thing they show is that the machine is running 10.0.14393 Windows - Anniversary Update - which is two years old. Is that normal for enterprise customers? Could that explain why they are hitting this problem and other users aren't?

Unfortunately the PML file doesn't contain stacks on the profile events so it is difficult to tell what is consuming CPU time in explorer.exe and chrome.exe. However I did notice a lot of Thread Create events happening. The trace only covers about 34 s (from 3:49:23 to 3:49:57) and during that time chrome.exe (browser process) creates (and destroys) threads 1,535 times, or 45 times per second. This thread creation is happening from a thread that is running 100% Windows code - here is the call stack:

ntdll.dll	NtCreateThreadEx + 0x14
KERNELBASE.dll	CreateRemoteThreadEx + 0x274
KERNEL32.DLL	CreateThreadStub + 0x3c
wpnapps.dll	ToastFeedback::Invoke + 0x78, d:\rs1\onecoreuap\base\diagnosis\platform\notifications\developer\toastfeedback.cpp(145)
RPCRT4.dll	Invoke + 0x73
RPCRT4.dll	Ndr64StubWorker + 0xbfd
RPCRT4.dll	NdrStubCall3 + 0xc9
combase.dll	CStdStubBuffer_Invoke + 0x59, d:\rs1\onecore\com\combase\ndr\ndrole\stub.cxx(1527)
RPCRT4.dll	CStdStubBuffer_Invoke + 0x3b
combase.dll	ObjectMethodExceptionHandlingAction<<lambda_76d9e92c799d246a4afbe64a2bf5673d> > + 0x4c, d:\rs1\onecore\com\combase\dcomrem\excepn.hxx(91)
combase.dll	DefaultStubInvoke + 0x222, d:\rs1\onecore\com\combase\dcomrem\channelb.cxx(1891)
combase.dll	ServerCall::ContextInvoke + 0x448, d:\rs1\onecore\com\combase\dcomrem\ctxchnl.cxx(1541)
combase.dll	AppInvoke + 0x32d, d:\rs1\onecore\com\combase\dcomrem\channelb.cxx(1616)
combase.dll	ComInvokeWithLockAndIPID + 0x574, d:\rs1\onecore\com\combase\dcomrem\channelb.cxx(2722)
combase.dll	ThreadInvoke + 0xe2c, d:\rs1\onecore\com\combase\dcomrem\channelb.cxx(7096)
RPCRT4.dll	DispatchToStubInCNoAvrf + 0x24
RPCRT4.dll	RPC_INTERFACE::DispatchToStubWorker + 0x1bd
RPCRT4.dll	RPC_INTERFACE::DispatchToStubWithObject + 0x15e
RPCRT4.dll	LRPC_SCALL::DispatchRequest + 0x177
RPCRT4.dll	LRPC_SCALL::HandleRequest + 0x2bc
RPCRT4.dll	LRPC_ADDRESS::HandleRequest + 0x36c
RPCRT4.dll	LRPC_ADDRESS::ProcessIO + 0x91b
RPCRT4.dll	LrpcIoComplete + 0xaa
ntdll.dll	TppAlpcpExecuteCallback + 0x25e
ntdll.dll	TppWorkerThread + 0x8d9
KERNEL32.DLL	BaseThreadInitThunk + 0x14
ntdll.dll	RtlUserThreadStart + 0x21

Sorry for the wall of text, but ultimately ToastFeedback::Invoke is creating a thread so this is definitely related to the native notifications. I still don't know why things are going wrong. Maybe there were notification bugs in anniversary update? It is entirely consistent that creating 50 threads per second would cause problems. I'm not sure that the file itself is a problem. Maybe this is a COM registration problem or ??? I thought I'd share this information in case anyone else can add to it.
Labels: Needs-Triage-M68
Digging into this the callstack is servicing the event callback run when a native notification is clicked through or dismissed/failed. Assuming the architecture hasn't changed from the design doc this means that the Toast clickthrough invokes a method in our OOP COM server which should then invoke a new chrome instance to service the call (or maybe it's invoked directly through the COM server now?).

That leads me to believe the ultimate issue is likely something Chrome is doing during the activation process which is causing the notification to be handled repeatedly. Perhaps the locked file is causing an exception/error which is handled by a retry with no bail out? One possibility is the notifications are failing to start with Citrix installed (perhaps due to that locked file), which causes the Failed event to fire which causes the callback to be invoked. If we keep trying in the hope it'll fix itself it might call it repeatedly.

It's also possible something to do with Citrix is effectively causing the notification click through to be fired repeatedly, which wouldn't be a Chrome issue though if that was the case we might expect to see chrome.exe being started thousands of times which I assume you'd have noticed in the PML. Do we see the OOP COM process in the PML (don't have a Windows machine easily to hand for looking at the PML) as that might narrow down whether the event is coming from the inside or externally generated.

Looking at the code on Spring Creators Update (1803) it's using the thread pool API not directly creating a new thread to service the request. Perhaps the bug is still there in newer versions of Windows+Chrome but is hidden as it's not spinning up a massive number of new threads?

In conclusion as we can't easily determine from the stack trace which event was being fired we have to assume that it's something to do with how chrome is issuing the native notifications and I'd dig into whether a failure to setup the notification causes it to try again repeatedly.
Cc: jmukthavaram@chromium.org
Labels: Triaged-ET TE-NeedsTriageFromHYD
The issue needs to be tested on Citrix environment on Server 2016 1607 which is out of TE-scope. Hence, forwarding it to Inhouse enterprise team for further triaging of the issue.

Thanks...!!
Cc: chengx@chromium.org
Cc: finnur@chromium.org brucedaw...@chromium.org grt@chromium.org
Is it possible that whatever it is that is showing this notification is doing it in a loop? What is the notification for?
We have a separate report of looping notifications on versions of Windows 10 prior to 17134, leading to memory leaks. This is tracked in the (Google only) crbug.com/882622. It does look like both bugs are due to a bug in Windows 10 prior to 17134.
The latest Chrome Stable, 69.0.3497.100, turns off native notifications on Windows 10 16299 and earlier due to one or more Windows 10 bugs that are fixed in Windows 10 17134. I believe that one of those bugs was what was causing this issue. Therefore I think it is highly likely that this bug is fixed by this change:

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/46a64bff87005917e5acb3c0e292a5806a158852

commit 46a64bff87005917e5acb3c0e292a5806a158852
Author: Xi Cheng <chengx@chromium.org>
Date: Wed Sep 12 22:00:35 2018

Disable Windows 10 native notification for OS builds old than 17134

Windows 10 native notification seems to have memory leak issues on OS
builds older than 17134. See  crbug.com/882622 . This seems to be a Windows
issue which has been fixed in 17134.

Bug:  882622 
Change-Id: Icb251eb9ebb88b19b20b8e09692d85f7fcec6b54
Reviewed-on: https://chromium-review.googlesource.com/1220569
Reviewed-by: Justin DeWitt <dewittj@chromium.org>
Commit-Queue: Xi Cheng <chengx@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#590531}(cherry picked from commit 9b89d86e112072f3bbf8c7720e8023de15c1dfc8)
Reviewed-on: https://chromium-review.googlesource.com/1222702
Reviewed-by: Krishna Govind <govind@chromium.org>
Cr-Commit-Position: refs/branch-heads/3550@{#4}
Cr-Branched-From: 56b6dbab1f1949a336a0f1dd048cf678d6f1733a-refs/heads/master@{#590498}
[modify] https://crrev.com/46a64bff87005917e5acb3c0e292a5806a158852/chrome/browser/notifications/notification_platform_bridge_win.cc



If you observe this high CPU usage on Windows 10 17134 or later then please let us know. Otherwise I think we should close this as fixed.

Cc: -chengx@chromium.org
Owner: chengx@chromium.org
Status: Assigned (was: Unconfirmed)
Assign to myself so that it's under my surveillance.
Status: Fixed (was: Assigned)

Sign in to add a comment