
CPU / Memory usage on terminal server

Reported by amgede...@gmail.com, Aug 10 2017

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36

Steps to reproduce the problem:
1. Log in to an RDS or Citrix 2012 R2 environment.
2. After a while, the CPU usage of the Chrome session rises to 100% and memory usage grows.
3. 

What is the expected behavior?
Hanging sessions

What went wrong?
We have this at the moment with 4 customers with identical chrome version

Did this work before? Yes 

Chrome version: 60.0.3112.90  Channel: stable
OS Version: 2012R2
Flash Version:
 
Showing comments 7 - 106 of 106 Older
Cc: brucedaw...@chromium.org
Labels: -Performance -TE-NeedsTriageHelp -Via-Wizard-Other -Needs-Triage-M60 Performance-Memory Performance-Power
Hey Bruce, is this something you can help us with? I think running Chrome via RDS is pegging CPU. 

For the folks experiencing this problem, could you try running a trace and attaching the output?

https://www.chromium.org/developers/how-tos/trace-event-profiling-tool/recording-tracing-runs
Can you clarify which software is being used to connect to the remote machine? Is it Microsoft's Remote Desktop software?

chrome://tracing (previous comment) may reveal the cause, or it may be helpful (if this is some sort of system-level issue) to get an ETW trace on the remote machine, as described here:
https://randomascii.wordpress.com/2015/09/01/xperf-basics-recording-a-trace-the-ultimate-easy-way/


Comment 9 Deleted

I too can confirm this is an issue with our Windows 2012 R2 RDS Servers.

Happy to assist with producing logs to resolve!
If you can get an ETW trace on the remote machine, as described here:
https://randomascii.wordpress.com/2015/09/01/xperf-basics-recording-a-trace-the-ultimate-easy-way/
that would be very helpful. You can share it directly with me (brucedawson@chromium.org) if you don't want to share it publicly (because ETW traces do contain a lot of information).
Components: Blink>Media>Audio Blink>MemoryAllocator>Partition
I received an ETW trace from one of the customers and it shows that something has gone wrong in PartitionAlloc, triggered by audio. The afflicted process has four threads that are all trying to use 100% of CPU time. There are only two CPUs on the system so they are switching between the CPUs. It looks like all four threads are trying to acquire the PartitionAlloc spin lock in order to make an audio allocation. Two threads are consuming all of their CPU time on this stack:

  chrome_child.dll!media::AudioDeviceThread::ThreadMain
  chrome_child.dll!media::AudioOutputDevice::AudioThreadCallback::Process
  chrome_child.dll!content::RendererWebAudioDeviceImpl::Render
  chrome_child.dll!blink::AudioDestination::Render
  chrome_child.dll!blink::CrossThreadBind<>
  chrome_child.dll!WTF::BindInternal<>
  chrome_child.dll!WTF::Partitions::FastMalloc
  chrome_child.dll!std::lock_guard<base::subtle::SpinLock>::lock_guard<base::subtle::SpinLock>

From the lock_guard function they are repeatedly calling KernelBase.dll!SwitchToThread.

The other two threads are spending 100% of their time on this call stack:

  chrome_child.dll!base::Thread::ThreadMain
  chrome_child.dll!base::RunLoop::Run
  chrome_child.dll!base::MessagePumpDefault::Run
  chrome_child.dll!base::MessageLoop::DoWork
  chrome_child.dll!base::MessageLoop::RunTask
  chrome_child.dll!base::debug::TaskAnnotator::RunTask
  chrome_child.dll!base::Callback<void __cdecl(void),0,0>::Run
  chrome_child.dll!base::internal::Invoker<>::Run
  chrome_child.dll!blink::scheduler::TaskQueueManager::DoWork
  chrome_child.dll!blink::scheduler::TaskQueueManager::ProcessTaskFromWorkQueue
  chrome_child.dll!base::debug::TaskAnnotator::RunTask
  chrome_child.dll!base::Callback<void __cdecl(void),0,0>::Run
  chrome_child.dll!base::internal::Invoker<>::Run
  chrome_child.dll!blink::`anonymous namespace'::RunCrossThreadClosure
  chrome_child.dll!WTF::Partitions::FastFree

Within FastFree they are repeatedly calling KernelBase.dll!SwitchToThread - my guess is that the lock_guard spin-lock got inlined in this case.

I can't tell for sure why chrome is getting into this state. It could be useful to know what page was triggering this behavior, but it sounds like there have been multiple reports so maybe that doesn't really matter. It may be that there is an audio bug that is triggering a storm of allocations. The spin locks then exacerbate the situation. It is impossible to tell for sure due to the nature of CPU sampling but I see no evidence that any of the four threads spinning on the lock *ever* acquire it. It seems quite likely that the lock is held by some other thread and that the spin locks are starving it. The SwitchToThread calls are presumably an attempt to let other threads run but this can easily be ineffective. SwitchToThread will, ultimately, only switch to a thread of equal priority, and such a thread would get a chance to run eventually anyway. Whatever thread owns the lock must be running at a lower priority - low enough that the priority boost/decay given to starved/busy threads was not sufficient to ever let the low priority thread run.
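
To illustrate the failure mode described above, here is a minimal sketch of a yield-based spin lock in portable C++. It is not Chromium's base::subtle::SpinLock: the class name YieldingSpinLock and the lock_with_budget() helper are hypothetical, and std::this_thread::yield() only stands in for SwitchToThread. The point it tries to show is that a waiter makes no progress, however often it yields, until the owner gets CPU time to release the lock.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for a yield-based spin lock.
// This is NOT Chromium's base::subtle::SpinLock.
class YieldingSpinLock {
 public:
  // Single attempt: returns true if the lock was free and is now held.
  bool try_lock() {
    return !locked_.exchange(true, std::memory_order_acquire);
  }

  // Spins for at most max_attempts, yielding between attempts.
  // Returns true if the lock was acquired, false if the budget ran out.
  bool lock_with_budget(long max_attempts) {
    assert(max_attempts > 0);
    for (long i = 0; i < max_attempts; ++i) {
      if (try_lock()) return true;
      // Offer the CPU to another ready thread (assumed analogue of
      // SwitchToThread). This does not help if the owner is never scheduled.
      std::this_thread::yield();
    }
    return false;
  }

  // Unbounded variant: keeps burning CPU until the owner releases the lock.
  void lock() {
    while (!try_lock()) std::this_thread::yield();
  }

  void unlock() { locked_.store(false, std::memory_order_release); }

 private:
  std::atomic<bool> locked_{false};
};
```

If the owner is a lower-priority thread that the scheduler never runs, the unbounded lock() loop behaves like lock_with_budget() with an infinite budget: it consumes a core without ever succeeding.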

The two audio threads are running at high priority - priority 15. Their spinning should be enough to saturate the two cores and never let any other threads run. It's not clear to me why the other two threads were ever allowed to run given that their priority was only 8. Maybe the threads kept changing each other's priorities?

The two audio threads were each context-switched in about 277,000 times over the 9.16 s where I have full trace data, so ~31,000 times per second - pretty fast. The attached WPA screen shot shows the context switches over a 1 ms period, just because I like pretty pictures.

I'm assuming that the root cause which has triggered this is an audio issue. As a separate but related issue, spin locks are dangerous, and acquiring a spin lock from a priority-15 thread is particularly risky.

The report also mentioned memory growth but unfortunately the trace did not record that data for the timeline where chrome was running.

ContextSwitches.PNG
Note: The Windows CriticalSection lock handles this case by trying to acquire the lock (using an interlocked-exchange operation) first. If this fails then it optionally spins for a user-defined time period. If that fails then it waits on an event, thus avoiding priority inversions or other sources of long-term spinning, while still being 100% efficient in the non-contended case. Critical section objects are also instrumented so that debuggers and profilers can track down deadlocks and can track transfers of lock ownership between threads. For all of these reasons I think that we should consider using critical sections on Windows instead of custom spin locks.
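
The "try, spin briefly, then block" strategy in the note above can be sketched in portable C++ as follows. This is an illustration only, not the Windows CRITICAL_SECTION implementation: the class name SpinThenBlockLock and its spin_limit parameter are hypothetical, and std::mutex stands in for the event-based wait. On Windows itself the spin count is normally set through InitializeCriticalSectionAndSpinCount rather than hand-rolled like this.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical sketch of a hybrid lock: bounded spinning, then a blocking
// wait. Not the actual Windows critical section code.
class SpinThenBlockLock {
 public:
  explicit SpinThenBlockLock(int spin_limit) : spin_limit_(spin_limit) {
    assert(spin_limit >= 0);
  }

  void lock() {
    // Fast path and bounded spin: cheap when the lock is free or is about
    // to be released.
    for (int i = 0; i <= spin_limit_; ++i) {
      if (mutex_.try_lock()) return;
      std::this_thread::yield();
    }
    // Slow path: block in the OS instead of spinning, so a lower-priority
    // owner can be scheduled and release the lock.
    mutex_.lock();
  }

  void unlock() { mutex_.unlock(); }

 private:
  const int spin_limit_;
  std::mutex mutex_;
};
```

Because the class provides lock() and unlock(), it can be used with std::lock_guard<SpinThenBlockLock> in the same way as the lock_guard seen in the stack trace earlier in this issue.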

Addendum: I found some memory data for chrome itself and, during the nine seconds it was spinning there was no apparent growth. The problematic process held steady at about 403 MB of private working set. RDCMan.exe held steady at 1.9 GB, dropped to 981 MB after the Chrome processes were killed, and then bounced back up to 1.7 GB, with a lot of memory allocation/free traffic, and significant CPU usage (making up for lost time?) after Chrome went away. I don't know why RDCMan uses so much memory. 2 GB of memory on a machine with 7 GB is not fatal, but it is significant, and it doesn't leave a lot for, for instance, the system cache (838 MB).

Cc: hongchan@chromium.org dalecur...@chromium.org rtoy@chromium.org
TL;DR calling PartitionAlloc from a real-time priority (15) thread risks deadlock on single-core machines. Calling PartitionAlloc from *two* real-time priority (15) threads risks deadlock on dual core machines, which seems to be what has happened here. Adding possibly relevant audio developers.

If there are more questions about the data in the ETW trace please let me know.

Comment 15 by rtoy@chromium.org, Aug 18 2017

Components: Blink>WebAudio
Cc: maxmorin@chromium.org
> real-time priority (15) thread risks deadlock on single-core machines

It seems like the app is already using AudioDeviceThread and it has been a real-time thread from the beginning. The analysis in #12 makes me think that the high-frequency PostTask/CrossThreadBind (~3ms) is torturing the system on low-end devices.

FWIW, the fix (revert to the previous threading model) is ready and I would like to merge it as soon as the review is completed. Also I believe Max recently worked on the similar problem, so I am cc-ing him to this issue.
Yes, this is 710245. An alternative way to fix it would be to add Windows next to Chrome OS here: https://cs.chromium.org/chromium/src/media/audio/audio_device_thread.cc?l=15 (or just drop the os check completely).
We're experiencing the same problem with multiple customers, all on RDS or CTX server.
Cc: -hongchan@chromium.org bustamante@chromium.org
Labels: -Pri-2 M-60 Pri-1
Owner: hongchan@chromium.org
Status: Assigned (was: Unconfirmed)
Assigning to hongchan based on #16. Also ccing M60 release owner. Not sure exactly how large an impact this has, but a fix should probably be merged to M60?

Comment 20 by olka@chromium.org, Aug 21 2017

Cc: olka@chromium.org
Hi, any update on this matter? We have multiple clients reporting this issue on chrome version: 60.0.3112.101
All use 2012 R2 with Citrix or RDS 
re #17: maxmorin@

If you think this is a dupe of 710245, please mark this issue so. My tentative fix is a revert of WebAudio's dual-threading and has nothing to do with the root cause of 710245. Also the alternative fix you suggested is actually outside of WebAudio.

How about we try to fix AudioDeviceThread first (by adding Windows) and then to revert the WebAudio's dual-threading? I think it's bad to fix both at the same time or in a random order. WDYT?
I mean it has the same root cause as 710245, not that it's literally the same. I agree it makes sense to lower the priority of the AudioDeviceThread on Windows first, and merge that to M60, as it's a safe change (as far as stability goes).
Owner: maxmorin@chromium.org
per #23, maxmorin@ and I agreed to work on the priority inversion issue first and then fix the WebAudio dual-threading later. After a CL from Max lands, I will reassign this issue to myself.
As noted on the code review, we shouldn't do this for all ADT objects unless we know they're broken in normal playback cases too. Nor should we merge a change to do so for all if we do.

It's too risky given the number of active playbacks. If we can limit this fix to WebAudio only that's okay with me to merge, but would defer to the WebAudio team on the risk.
Re #25:

Dale, the comment from #16 suggests that this is an artifact of the battle between two real-time threads. WebAudio doesn't use any REALTIME_AUDIO thread at this moment, and even the fix will use DISPLAY priority.

I am not really familiar with AudioDeviceThread, but it seems like the offending change is actually posting a cross-thread task from AudioDeviceThread to WebThread every 3~20 ms. So from my perspective, WebAudio's fix is simply changing the priority and it won't solve the issue in AudioDeviceThread. That's why I am not sure if WebAudio is the right venue for the fix.

Anyhow, my plan is:

1) Use single-thread rendering unless AudioWorklet is enabled. (default)
2) Use AudioWorkletThread for AudioWorklet, which has the DISPLAY priority.

So without a proper fix for this priority inversion problem, I suspect that option 2) will still suffer from the same root cause. That's bad news for WebAudio developers.
I just meant we should only apply the priority change when an ADT is created for WebAudio right now.
Dale: I see your point. In that case, I think we skip my CL and try to merge Hongchan's CL instead if it's needed (https://chromium-review.googlesource.com/c/chromium/src/+/617588).
As far as I know, WebAudio is the only client of ADT affected by this issue, since it's the only client allocating memory with the dangerous WTF allocator. Hongchan's CL to not use a separate thread when not using AudioWorklet avoids the blink::CrossThreadBind that does the allocating, so it should fix the issue. As for avoiding this issue when using audio worklets, I think it's a separate bug/design concern from this one.
Owner: hongchan@chromium.org
Status: Started (was: Assigned)

Comment 30 by cluma...@gmail.com, Aug 22 2017

Same Issue here on several Microsoft Windows 2012R2 RDS servers. Can this please soon be fixed?

Comment 31 by rtoy@chromium.org, Aug 22 2017

Won't Hongchan's CL that does not use a separate thread just hide this issue? To support AudioWorklets, we do need a separate thread (AFAIK), so when AudioWorklets are enabled, we'll have exactly this problem again, right?
re #31:

That's what I am afraid of as well. Of course the elevated thread priority (DISPLAY) will help the situation, but I don't think our task scheduler is capable of handling the high-frequency cross-thread calls from the audio device. (per #12)


Comment 33 by mcco...@gmail.com, Aug 22 2017

Same issue with MS Windows 2012 R2 RDS impacting a couple of customers.
> I don't think our task scheduler is capable of handling the
> high-frequency cross-thread calls from the audio device.

Isn't the real root cause the spin lock? If we could avoid spinning (or at least cap it to a ms or less) then the priority inversions would go away. It seems like that is the best long-term fix.

The alternative is to not use PartitionAlloc from threads that are not extra-high or extra-low priority.

brucedawson@

Who is the best person to talk to about this issue? This is too far from WebAudio, so I would like to defer the fix to someone who is more knowledgeable in that area.

With that said, my workaround fix for this issue is under the review and hopefully can land soon.
Drive-by:

If the problem is partition alloc's implementation, then the right folks would be the partition alloc OWNERS, no? If they can't provide a fix to partition alloc, then maybe the right course of action is to not use partition alloc here.
I am talking to PartitionAlloc owners about this. However, changing the PartitionAlloc lock is a big decision, which may not happen. I think the audio subsystem should consider not using PartitionAlloc. It sounds like audio does not represent the sort of workload which it was designed for.

So, I will own pushing for changes to PartitionAlloc, but WebAudio should consider stopping using PartitionAlloc ASAP.

Comment 38 by rtoy@chromium.org, Aug 22 2017

It appears that webaudio uses WTF::Partitions::FastMalloc in just one place (platform/audio/AudioArray.h). What should be used instead?
new []/delete []?

Comment 40 by rtoy@chromium.org, Aug 22 2017

Is that fast? I don't know the history of the code, but presumably the original author wanted this to be very fast so as not to glitch audio.

It also doesn't help that AudioArray keeps reallocating memory if it's not aligned on 16 or 32 byte boundary.

We also lose the WTF_HEAP_PROFILER_TYPE_NAME. Not sure how important that is, but seems really useful to know who is using all of the heap.
new/delete are assumed to be fast - we use them very heavily so we count on that. Like any general-purpose allocator their performance will vary depending on the current state of the heap. But, for instance, on Windows they can do many allocations smaller than 16 KB in a lock-free manner which can be extremely fast.

As a general rule of thumb you can only beat the performance of a general-purpose allocator if you make use of significant domain specific knowledge (order of allocs/frees or heavily restricted sizes) to implement custom optimizations, and that is not being done in this case.

So, absent any specific reason to believe that PartitionAlloc is faster we should (I think) start by assuming that it has similar performance to new/delete. And, we now know that it is currently unsuitable for use by high-priority threads.

I think there are other heap profiling options.

+1 to the above, since you need aligned memory, you might see if base::AlignedAlloc() can be used; it should certainly be faster than reallocating if you get the wrong alignment...
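
As a rough illustration of the alignment point above, the sketch below requests an aligned buffer in a single allocation instead of reallocating until the address happens to be aligned. It is not Blink's AudioArray and does not use base::AlignedAlloc(); the class name AlignedFloatBuffer is hypothetical, and the approach shown (over-allocate once, then align with std::align) is just one portable way to get the same effect.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <new>

// Hypothetical illustration; not Blink's AudioArray or base::AlignedAlloc().
class AlignedFloatBuffer {
 public:
  // alignment must be a power of two, e.g. 16 or 32.
  AlignedFloatBuffer(std::size_t count, std::size_t alignment)
      : count_(count) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::size_t bytes = count * sizeof(float);
    // Over-allocate once so that an aligned address always fits in the block.
    std::size_t space = bytes + alignment;
    raw_ = std::malloc(space);
    if (raw_ == nullptr) throw std::bad_alloc();
    void* cursor = raw_;
    data_ = static_cast<float*>(std::align(alignment, bytes, cursor, space));
    assert(data_ != nullptr);
    std::memset(data_, 0, bytes);  // start with silence
  }

  ~AlignedFloatBuffer() { std::free(raw_); }

  AlignedFloatBuffer(const AlignedFloatBuffer&) = delete;
  AlignedFloatBuffer& operator=(const AlignedFloatBuffer&) = delete;

  float* data() { return data_; }
  std::size_t size() const { return count_; }

 private:
  void* raw_ = nullptr;    // pointer returned by malloc, used for free()
  float* data_ = nullptr;  // aligned pointer inside the raw block
  std::size_t count_;
};
```

The cost of this design is up to one alignment's worth of extra bytes per buffer; in exchange there is exactly one allocation and one free per buffer.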
Is there an eta on the fix/workaround.

We're considering downgrading to version 59 till the issue is fixed but would like to know if it is still worth the effort?
We too are an MSP experiencing the same issues. This is causing us a major headache on 2012R2 RDS sessions. Seems to impact on disconnected sessions more where we have observed. Is a fix likely to be forthcoming in the next few days, or should we look at downgrading to a previous version?
We're also having this issue in our hosted RDS (2012 R2) environment.
I wrote this cmdlet as a bandaid: https://github.com/RobBiddle/Stop-GreedyProcess
Needs to be run as a scheduled task with priority 0 (realtime) in order to do the job.
re #40: I don't think we're passing AudioArray in AudioDestination.

1. AudioDestination has USING_FAST_MALLOC() in its header.
2. AudioDestination::Render() calls CrossThreadBind() and then BindInternal() inside. I believe this is making a cross-thread copy.
3. To make a copy of AudioDestination object in the bind, WTF::Partitions::FastMalloc() gets called.

In order to avoid the issue, we can replace USING_FAST_MALLOC() with something else.
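
The mechanism in the steps above can be sketched as follows: a class-scoped operator new routes every heap allocation of that class through a custom allocator, so creating one heap object per audio callback means one trip through that allocator (and its lock) per callback. Everything named here is hypothetical: CustomMalloc/CustomFree merely count calls and stand in for a custom allocator, and RenderTask stands in for whatever object the bind copies. This is not the actual USING_FAST_MALLOC() expansion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter standing in for a custom allocator's entry point.
static int g_custom_alloc_calls = 0;

void* CustomMalloc(std::size_t size) {
  ++g_custom_alloc_calls;  // a real allocator would take its lock here
  void* p = std::malloc(size);
  if (p == nullptr) throw std::bad_alloc();
  return p;
}

void CustomFree(void* p) { std::free(p); }

// Hypothetical class whose heap allocations are routed to the custom
// allocator via class-scoped operator new/delete.
class RenderTask {
 public:
  explicit RenderTask(int frames) : frames_(frames) {}

  static void* operator new(std::size_t size) { return CustomMalloc(size); }
  static void operator delete(void* p) { CustomFree(p); }

  int frames() const { return frames_; }

 private:
  int frames_;
};

// Simulates creating one heap-allocated task per audio callback and returns
// how many times the custom allocator was entered.
int AllocationsForCallbacks(int callbacks) {
  assert(callbacks >= 0);
  const int before = g_custom_alloc_calls;
  for (int i = 0; i < callbacks; ++i) {
    RenderTask* task = new RenderTask(128);  // goes through CustomMalloc
    delete task;                             // goes through CustomFree
  }
  return g_custom_alloc_calls - before;
}
```

With a callback every few milliseconds, as estimated earlier in this issue, that is hundreds of allocator entries per second from the audio thread alone.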
The workaround fix is landed (https://chromium-review.googlesource.com/c/chromium/src/+/617588) but this issue was not updated automatically. (crbug.com is having a bad day)

However, merging this to M60 is too risky and the merge request is highly likely to be rejected. So I will try to merge this to M61.
If some of the affected users could check whether this is fixed using canary that would be great. Ideally we want somebody to install canary today, reproduce the problem, and then confirm that tomorrow's canary resolves the issue.
Cc: gov...@chromium.org
Labels: ReleaseBlock-Stable M-61
#47 Yeah since M60 has been in stable for a month, and M61 is coming up in a couple weeks, it's too late for a post-stable merge for this issue.

Adding govind@ (M61 release owner) and tagging for 61.
Cc: pbomm...@chromium.org
Per comment #47 this is too risky for an M60 Stable merge and M61 is going to stable soon (we will only have 1 beta release before stable promotion). I'm hesitant to take this merge in for M61 unless it is fully baked and verified as safe in Canary.

Comment 51 by rtoy@chromium.org, Aug 24 2017

See https://chromium-review.googlesource.com/c/chromium/src/+/633852 for a CL that replaces FastMalloc with base::AlignedAlloc. Ran a few simple tests and performance didn't really seem to change. (Can't really tell; we don't have any performance-type benchmarks that do lots of allocation.)

This will help a little with this issue, but, as hongchan@ pointed out, we have USING_FAST_MALLOC() all over the place.

Comment 52 by tlamm...@gmail.com, Aug 25 2017

MSP here, please let us know if there is anything we can do to test with this bug.
Status: Fixed (was: Started)
re #52: The fix from #47 is now in 62.0.3196.0, which is the latest Canary.
Labels: Needs-Feedback
Labels: Merge-TBD
[Auto-generated comment by a script] We noticed that this issue is targeted for M-60; it appears the fix may have landed after branch point, meaning a merge might be required. Please confirm if a merge is required here - if so add Merge-Request-60 label, otherwise remove Merge-TBD label. Thanks.
govind@

Like I stated in #47, can we consider merging this to M61 once all the reports here confirmed the glitch is fixed? Or do we need more data to verify the fix?
[a drive by comment after seeing the blink-dev thread. Don't want to add more stuff on the fire, especially given the priority of this bug, so feel free to ignore this entirely]

Out of curiosity, how often does AudioArray::Allocate get called and with which size (dozens of bytes? hundreds?).
If that happens quite frequently and the size is large-ish (say hundreds of KB) it might be worth dropping the memset and using a low-level api (mmap/VirtualAlloc) which guarantees that the memory is already zeroed (most OS have a pool of zeroed pages, that's the trick).

Doing very rough math, on recent-ish desktop CPUs (say Intel Nehalem family and beyond) the L3 bandwidth is on the order of ~10-50 GB/s, which gives ~2-10 us for zeroing a 100 KiB buffer via memset, which I expect to be greater than the cost of a mmap. Especially considering that if your allocation is on the order of hundreds of KB both PartitionAlloc and malloc will extremely likely do a mmap under the hood anyway.
The main bummer is that, IIRC, neither base nor blink expose a raw page allocator. but if there is a need I think it's reasonable to make a case to expose this (I know about other places that implement their own page allocator with #if defined(OS_...) mmap / VirtualAlloc etc)
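
The rough math in the comment above can be reproduced with a few lines of C++. The helper name ZeroingTimeMicroseconds is hypothetical, and the calculation assumes 1 GB = 10^9 bytes and that memset runs at the full quoted bandwidth, which is an idealization.

```cpp
#include <cassert>

// Hypothetical helper: time in microseconds to write `bytes` bytes at a
// sustained bandwidth of `gigabytes_per_second` (1 GB taken as 1e9 bytes).
inline double ZeroingTimeMicroseconds(double bytes,
                                      double gigabytes_per_second) {
  assert(bytes >= 0.0);
  assert(gigabytes_per_second > 0.0);
  const double seconds = bytes / (gigabytes_per_second * 1e9);
  return seconds * 1e6;
}
```

For a 100 KiB buffer (102,400 bytes) this gives about 10.2 us at 10 GB/s and about 2.0 us at 50 GB/s, which matches the ~2-10 us range quoted above.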
When we have enough confirmation from users, we can consider merging to M61. A few notes on the fix:

- The audio glitch happens only on a small population (systems with fewer CPU cores).
- The fix was to revert to the previous threading model, and use the new threading model only when the AudioWorklet experimental flag is enabled.
re #57:

Thanks for the opinion. Perhaps we should create a new issue about the memory allocation? I am not really familiar with the memory infrastructure, so someone who works on the area should drive the discussion.

Comment 60 by rtoy@chromium.org, Aug 25 2017

Interesting idea.  (On a different personal project, we used to use mmap to
get zeroed pages.  Turned out that this was slower than just memset'ing the
memory ourselves.  Don't remember all the details anymore, though, but the
pages were relatively small.)

I don't have any hard numbers on this, but a quick grep through the code
suggests that most allocations are probably about 128 floats or so.  A fair
number (depending on what AudioNodes are used) may allocate up to maybe 64K
floats. On rare occasions we might have millions when OfflineAudioContext's
are used.  We recently switched allocators there not to zero out memory
because we were going to overwrite the contents with our own data
eventually.  Saved about 100-200 ms for a 100,000,000 float array.

I also don't think we do a lot of temp allocations.  We basically
pre-allocate everything that might be needed when AudioNodes are created
and connected.
> using a low-level api (mmap/VirtualAlloc) which guarantees that
> the memory is already zeroed

I would recommend against this. The zeroed pages from the OS will be strictly more expensive. The memset(0) still has to be done, just now it is done elsewhere (in Windows by the system process) which makes it less accountable. That is, it might *seem* cheaper (less time in the Chrome process) but the total CPU time will actually be greater.

And, using VirtualAlloc means that the pages have to be faulted in on first use, and then removed from the working set when freed (both are quite expensive, at least on Windows), whereas malloc/new/PartitionAlloc/AlignedAlloc will usually/often already have the pages in the working set.

So, interesting idea but I am reasonably sure that it will increase the cost. I looked at this (from a different perspective) here:
https://randomascii.wordpress.com/2014/12/10/hidden-costs-of-memory-allocation/

It is a good idea to avoid zeroing any more bytes than needed, and to avoid zeroing memory that will then be filled in with other data, but that is orthogonal.

Our org is also experiencing these issues, I see this status has been changed to FIXED, and there is a Chrome update available this morning. Has this fix been incorporated into Version 60.0.3112.113 (Official Build) (64-bit)? 
The fix is not in M60. It is currently in Chrome canary (the latest testing version). If someone can verify the issue doesn't reproduce there (using 62.0.3198.0 or later), we might merge it to M61 (meaning it will be available for users of the stable channel in a few weeks).
bryan.lee.kaufman@ tlamming@

As maxmorin@ said, could you verify the fix with 62.0.3198.0 or later?
Cc: krajshree@chromium.org hongchan@chromium.org kkaluri@chromium.org
 Issue 749344  has been merged into this issue.
Cc: flim@chromium.org
 Issue 755173  has been merged into this issue.
I've been running Canary since Friday but with the weekend and National Holiday yesterday in the UK, I've not really been able to thoroughly test.

I'll try to reproduce the issue over the next couple of days.
After the patch landed, there wasn't enough confirmation from the reporters to merge the fix to M61. The fix will be shipped with M62.
Labels: -Merge-TBD
Removing "Merge-TBD" label per comment #68.
Cc: pastarmovj@chromium.org blumberg@chromium.org ligim...@chromium.org
 Issue 755122  has been merged into this issue.
Labels: -M-60
Status: Assigned (was: Fixed)
We may need to consider shipping the patch with M61. This is impacting users with Windows TS and Citrix RD.
govind@

ligimole@ opened this issue again for the M61 merge, but I think it's too late for it. WDYT?
Cc: yanglee@chromium.org mzheng@chromium.org
+ Enterprise folks as FYI.
Cc: amineer@chromium.org
Per comment #68, we don't have any confirmation yet the fix works. 
Unless the fix is verified and fully safe to merge to M61, I'm not comfortable taking this merge in for M61 as this is a Blink change (affects all OSs except iOS). Please note there are no further M61 Beta releases at this point (if approved, the fix will go directly to M61 Stable, and this regressed in M60).

+amineer@ to take his input. This merge request is for the workaround fix landed at #47 (https://chromium-review.googlesource.com/c/chromium/src/+/617588).
bryan.lee.kaufman@, tlamming@, nicholas.bond@,

Could you verify the fix with 62.0.3198.0 or later? If fix works, we might merge it to M61 (meaning it will be available for users of the stable channel in a few weeks). 


FWIW, WebAudio team (rtoy@, hongchan@) believes the fix at #47 is safe to merge. I'll wait for the confirmation from any reporter in this issue.
If anyone has any steps to reproduce on demand, please let me know. We see this on 300+ citrix-servers but I am unsure of exactly what triggers the high cpu usage. 
thea.s.breistrand@

Can you check if the version 62.0.3198.0 or later (current Canary) fixes the issue?
Labels: Merge-Request-61

Comment 80 by sheriffbot@chromium.org, Aug 31 2017

Labels: -Merge-Request-61 Merge-Review-61 Hotlist-Merge-Review
This bug requires manual review: We are only 4 days from stable.
Please contact the milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), ketakid@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
govind@

Per #76, even though we do not have confirmation from the reporters here, we believe the fix is safe and it resolves two other related issues. So we request the merge to M61.
Labels: -Merge-Review-61 Merge-Approved-61
Approving merge to M61 branch 3163 based on comment #81. Please merge ASAP. Thank you.
We have 20+ terminal servers (not Citrix) on version 60.0.3112.113. In my research, any time a user goes to foxnews.com or yahoo.com the CPU level spikes to 20-30% and stays there per Chrome process. I tested on a terminal server: removed version 60.0.3112.113 and installed 59.0.3071.86, then went to foxnews.com and yahoo.com. The CPU spiked initially but went back down to 2-3%. The terminal server version is Server 2012 R2. The share cpu disk and nic have been disabled.

kfoster38a@

The fix has landed at 62.0.3198.0 or later (latest Canary). Could you verify the fix with one of your machines?
govind@

During the merge process, I found out the fix CL requires other changes that are only available in M62. At this point, I believe merging all the required changes to M61 is a bit risky.

We should leave the fix in M62. If any reporter here is having troubles with this issue, please try it with the latest Canary.
Labels: -Merge-Approved-61 Merge-Rejected-61
Rejecting merge to M61 based on comment #85.
Labels: -M-61 M-62
Status: Fixed (was: Assigned)
Waiting for the verification of the fix from reporters here.
I directly contacted nich.....@8networks.co.uk and got the response below:

---
I see that it looks like the fix is going to be merged anyway, but I have been running the Canary build 62.0.3198.0 and haven’t had any issues since. It was hard to determine exact steps to replicate the issue but before running the Canary build, Chrome would crash approx. twice a day so I’m confident that the fix has resolved the issue.
---

If I get a few more confirmations from the reporters, I will mark this issue as "verified".
I am having better luck with the canary build 63.0.3206.0 when I go to foxnews or yahoo the cpu usage ramps up but drops back down to less than 1-2%.

Status: Verified (was: Fixed)
Another feedback from bryan.lee.kaufman@:

---
We deployed Canary to 3 customers 3 days ago. These customer servers were experiencing the CPU spike daily but have had no additional symptoms since Canary install.
---

I am going to mark this issue as "verified".
I've noticed that the issue has now been "verified".

Is anyone able to confirm a date for when we're likely to see this fix in a stable release? I've seen on the chromium-dev forum that the M62 Full Stable Release is October 17th, 2017 - will this fix be in that release?

Thanks.

Comment 93 by rtoy@chromium.org, Sep 12 2017

As mentioned in c#84, the fix landed in 62.0.3198.0.  It will be available in M62, barring any unforeseen circumstances where the fix needs to be reverted.  Seems unlikely now since it's been almost 2 weeks since the fix.
Google Chrome 62 released today!
I am still experiencing this issue (100% CPU utilization due to Chrome) on Chrome Version 63.0.3239.132 (Official Build) (64-bit).


There were some further improvements which will appear in 64. Can you help us help you by doing two things:

1. Please try a Canary build of Chrome in your test environment and check if this reduces the CPU pressure of Chrome.
2. Either in a test or production env reproduce the issue and collect a performance trace of the system as described here https://randomascii.wordpress.com/2015/09/01/xperf-basics-recording-a-trace-the-ultimate-easy-way/ and share this with us so that we can see what causes the high utilization.

Comment 97 by bugdroid1@chromium.org, Feb 12 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/42a128b7ad14524b01bcf4da74b4d0aa836b3c61

commit 42a128b7ad14524b01bcf4da74b4d0aa836b3c61
Author: Max Morin <maxmorin@chromium.org>
Date: Mon Feb 12 13:24:48 2018

Reenable realtime audio threads on 2-core Chrome OS.

Workaround shouldn't be needed anymore after crrev.com/530835.
Reenabling realtime priority should reduce audio glitches.

This really shouldn't cause any regressions, but adding the old
bugs as FYI just in case.

Bug:  754213 , 734490 ,710245, 770312 ,803419
Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: I2f7c6c793fade47a205a896a25468a7312ba55de
Reviewed-on: https://chromium-review.googlesource.com/913272
Reviewed-by: Olga Sharonova <olka@chromium.org>
Commit-Queue: Max Morin <maxmorin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#536069}
[modify] https://crrev.com/42a128b7ad14524b01bcf4da74b4d0aa836b3c61/media/audio/audio_device_thread.cc

We have two 2012R2 RDS servers that are having this issue.  
Both are using 64.0.3282.186 (Official Build)(64-bit) and are having the same issues.  We contacted Chrome enterprise support and they wanted us to add to this thread our issues.  One server only has 15 users, while the other has 50.  Both see each tab use 2-3% and primarily use their servers for Chrome and looking at their gmail.  Both do have audio enabled through RDP.  Any recommendations on what we can do to improve the situation?  Most users have logged off so the CPU usage has gone down, but we can perform an ETW trace tomorrow if that would help?
An ETW trace would help. It is unexpected that this issue would still be happening.

2-3% CPU usage sounds like a separate issue however - I think this issue was tracking problems that were causing 100% CPU usage and hangs, so consider recording ETW traces and then filing another bug, CC brucedawson@chromium.org
You know, you were right. I posted in Issue 795546 as the issue is relevant to the "Utility:Video Capture Service" that keeps opening and closing while we have gmail open in Chrome. We had to disable hangouts for our company because that task kept hitting each user session with an extra 1 to 2% of server CPU which added up among all our users. Thanks for replying. Only posting this in case it helps anyone else.
In RDS / Terminal services environments, how are you all installing chrome / chrome enterprise? Are you using rd-mode (in control panel), or the change user /install
change user /execute switches? 

Comment 102 by 4bud...@gmail.com, May 21 2018

Any devs which can fix this issue
Bug is now active and reproduced
Without bug CPU load was nice 1-5%

here my RDP connect to amazon instance
Password is    &@KEWXpqZB
ec2-34-240-8-3.eu-west-1.compute.amazonaws.com.rdp
File a new bug and indicate which version of Chrome is used and what the steps to reproduce are.
Having the same issue. I have a Windows 2012 R2 server; with Firefox I can have 40+ users, with Chrome 18 if I'm lucky.

Windows 2012 R2
Chrome 64 version 66.0.3369.181 (official build)

Can we as IT admins, via a Chrome GPO:

1. Limit the amount of processors Chrome fires up?
2. Limit the amount of tabs users can open?
3. Suspend the process if the user tab isn't in focus?

joe



I would love more control over how much cpu chrome gets. It's getting out
of hand.
There are many different things which could potentially be causing runaway CPU usage in Chrome, and we would like to fix such bugs. However, without a profile/trace of the runaway CPU behavior we have no way to understand and fix the issue. If you look at comment #8 you can see a link to instructions for recording an ETW trace. These ETW traces have been used to fix several bugs in Chrome which only showed up on terminal servers. See crrev.com/c/873647 for an example of a fix that was made using an ETW trace that was supplied based on discussion in this bug (see comment #12 for details).

So, if you are having CPU consumption issues with Chrome please consider recording an ETW trace to share with us. You should open a new bug since this particular bug has been fixed, and whatever issue you are experiencing is presumed to be a different issue. After you have opened a bug you can contact me at brucedawson@chromium.org to share the ETW trace.
