Chrome 68 - "Enable Native Notifications" results in high CPU usage |
|||||||
Issue descriptionChrome version: 68.0.3440.106 OS version: Citrix environment on Server 2016 1607 Case#: 16742264 Description: user sessions often result in race conditions between explorer.exe and chrome.exe (primary process), where they fight over C:\Users\<username>\AppData\Local\Microsoft\Windows\Notifications\wpndatabase.db-wal. The result is that explorer.exe and chrome.exe both consume 10-20% CPU each for an indefinite period. Once the handle for that file is closed manually via procexp, the CPU usage clears immediately. The flag ""Enable native notifications"" does not appear to have a policy associated with it, the customer needs a way to disable this option for all users. Please advise if there's any supported method to disable the flag via GPO. Customer has rolled back to version 67 and issue no longer reproduces. Steps to reproduce: 1. Install Chrome browser 2. Keep user session running for uncertain amount of time 3. Task Manager starts to show high CPU consumption for both Chrome.exe and explorer.exe processes Current Behavior / Reproduction: - High CPU usage Expected Behavior: - No significant CPU usage when using system native notifications Existing workaround: - Disable native notifications Drive link to logs: - unable to collect as the issue is very inconsistent for all users and can't be reproduced on demand However, customer was able to capture procmon logs: https://drive.google.com/open?id=15GcdvvRRGkZwAx5hRsgXCTp45XxtG5hr Screenshot of task manager: https://drive.google.com/open?id=19f9LpoOksJKILKuXAmaBLxJA5n3tMWWa
,
Aug 30
I took a look at the PML files. One thing they show is that the machine is running 10.0.14393 Windows - Anniversary Update - which is two years old. Is that normal for enterprise customers? Could that explain why they are hitting this problem and other users aren't? Unfortunately the PML file doesn't contain stacks on the profile events so it is difficult to tell what is consuming CPU time in explorer.exe and chrome.exe. However I did notice a lot of Thread Create events happening. The trace only covers about 34 s (from 3:49:23 to 3:49:57) and during that time chrome.exe (browser process) creates (and destroys) threads 1,535 times, or 45 times per second. This thread creation is happening from a thread that is running 100% Windows code - here is the call stack: ntdll.dll NtCreateThreadEx + 0x14 KERNELBASE.dll CreateRemoteThreadEx + 0x274 KERNEL32.DLL CreateThreadStub + 0x3c wpnapps.dll ToastFeedback::Invoke + 0x78, d:\rs1\onecoreuap\base\diagnosis\platform\notifications\developer\toastfeedback.cpp(145) RPCRT4.dll Invoke + 0x73 RPCRT4.dll Ndr64StubWorker + 0xbfd RPCRT4.dll NdrStubCall3 + 0xc9 combase.dll CStdStubBuffer_Invoke + 0x59, d:\rs1\onecore\com\combase\ndr\ndrole\stub.cxx(1527) RPCRT4.dll CStdStubBuffer_Invoke + 0x3b combase.dll ObjectMethodExceptionHandlingAction<<lambda_76d9e92c799d246a4afbe64a2bf5673d> > + 0x4c, d:\rs1\onecore\com\combase\dcomrem\excepn.hxx(91) combase.dll DefaultStubInvoke + 0x222, d:\rs1\onecore\com\combase\dcomrem\channelb.cxx(1891) combase.dll ServerCall::ContextInvoke + 0x448, d:\rs1\onecore\com\combase\dcomrem\ctxchnl.cxx(1541) combase.dll AppInvoke + 0x32d, d:\rs1\onecore\com\combase\dcomrem\channelb.cxx(1616) combase.dll ComInvokeWithLockAndIPID + 0x574, d:\rs1\onecore\com\combase\dcomrem\channelb.cxx(2722) combase.dll ThreadInvoke + 0xe2c, d:\rs1\onecore\com\combase\dcomrem\channelb.cxx(7096) RPCRT4.dll DispatchToStubInCNoAvrf + 0x24 RPCRT4.dll RPC_INTERFACE::DispatchToStubWorker + 0x1bd RPCRT4.dll RPC_INTERFACE::DispatchToStubWithObject + 0x15e RPCRT4.dll LRPC_SCALL::DispatchRequest + 0x177 RPCRT4.dll LRPC_SCALL::HandleRequest + 0x2bc RPCRT4.dll LRPC_ADDRESS::HandleRequest + 0x36c RPCRT4.dll LRPC_ADDRESS::ProcessIO + 0x91b RPCRT4.dll LrpcIoComplete + 0xaa ntdll.dll TppAlpcpExecuteCallback + 0x25e ntdll.dll TppWorkerThread + 0x8d9 KERNEL32.DLL BaseThreadInitThunk + 0x14 ntdll.dll RtlUserThreadStart + 0x21 Sorry for the wall of text, but ultimately ToastFeedback::Invoke is creating a thread so this is definitely related to the native notifications. I still don't know why things are going wrong. Maybe there were notification bugs in anniversary update? It is entirely consistent that creating 50 threads per second would cause problems. I'm not sure that the file itself is a problem. Maybe this is a COM registration problem or ??? I thought I'd share this information in case anyone else can add to it.
,
Aug 30
,
Aug 30
Digging into this the callstack is servicing the event callback run when a native notification is clicked through or dismissed/failed. Assuming the architecture hasn't changed from the design doc this means that the Toast clickthrough invokes a method in our OOP COM server which should then invoke a new chrome instance to service the call (or maybe it's invoked directly through the COM server now?). That leads me to believe the ultimate issue is likely something Chrome is doing during the activation process which is causing the notification to be handled repeatedly. Perhaps the locked file is causing an exception/error which is handled by a retry with no bail out? One possibility is the notifications are failing to start with Citrix installed (perhaps due to that locked file), which causes the Failed event to fire which causes the callback to be invoked. If we keep trying in the hope it'll fix itself it might call it repeatedly. It's also possible something to do with Citrix is effectively causing the notification click through to be fired repeatedly, which wouldn't be a Chrome issue though if that was the case we might expect to see chrome.exe being started thousands of times which I assume you'd have noticed in the PML. Do we see the OOP COM process in the PML (don't have a Windows machine easily to hand for looking at the PML) as that might narrow down whether the event is coming from the inside or externally generated. Looking at the code on Spring Creators Update (1803) it's using the thread pool API not directly creating a new thread to service the request. Perhaps the bug is still there in newer versions of Windows+Chrome but is hidden as it's not spinning up a massive number of new threads? In conclusion as we can't easily determine from the stack trace which event was being fired we have to assume that it's something to do with how chrome is issuing the native notifications and I'd dig into whether a failure to setup the notification causes it to try again repeatedly.
,
Aug 30
The issue needs to be tested on Citrix environment on Server 2016 1607 which is out of TE-scope. Hence, forwarding it to Inhouse enterprise team for further triaging of the issue. Thanks...!!
,
Aug 30
,
Aug 30
,
Sep 6
Is it possible that whatever it is that is showing this notification is doing it in a loop? What is the notification for?
,
Sep 13
We have a separate report of looping notifications on versions of Windows 10 prior to 17134, leading to memory leaks. This is tracked in the (Google only) crbug.com/882622. It does look like both bugs are due to a bug in Windows 10 prior to 17134.
,
Sep 17
The latest Chrome Stable, 69.0.3497.100, turns off native notifications on Windows 10 16299 and earlier due to one or more Windows 10 bugs that are fixed in Windows 10 17134. I believe that one of those bugs was what was causing this issue. Therefore I think it is highly likely that this bug is fixed by this change: The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/46a64bff87005917e5acb3c0e292a5806a158852 commit 46a64bff87005917e5acb3c0e292a5806a158852 Author: Xi Cheng <chengx@chromium.org> Date: Wed Sep 12 22:00:35 2018 Disable Windows 10 native notification for OS builds old than 17134 Windows 10 native notification seems to have memory leak issues on OS builds older than 17134. See crbug.com/882622 . This seems to be a Windows issue which has been fixed in 17134. Bug: 882622 Change-Id: Icb251eb9ebb88b19b20b8e09692d85f7fcec6b54 Reviewed-on: https://chromium-review.googlesource.com/1220569 Reviewed-by: Justin DeWitt <dewittj@chromium.org> Commit-Queue: Xi Cheng <chengx@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#590531}(cherry picked from commit 9b89d86e112072f3bbf8c7720e8023de15c1dfc8) Reviewed-on: https://chromium-review.googlesource.com/1222702 Reviewed-by: Krishna Govind <govind@chromium.org> Cr-Commit-Position: refs/branch-heads/3550@{#4} Cr-Branched-From: 56b6dbab1f1949a336a0f1dd048cf678d6f1733a-refs/heads/master@{#590498} [modify] https://crrev.com/46a64bff87005917e5acb3c0e292a5806a158852/chrome/browser/notifications/notification_platform_bridge_win.cc If you observe this high CPU usage on Windows 10 17134 or later then please let us know. Otherwise I think we should close this as fixed.
,
Sep 17
Assign to myself so that it's under my surveillance.
,
Sep 24
|
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by ykrychala@chromium.org
, Aug 29Labels: -Pri-2 OS-Windows Pri-1