New issue
Advanced search Search tips

Issue 614192 link

Starred by 5 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug



Sign in to add a comment

Hangouts is doing ~75 context switches per second, from logging and network I/O

Project Member Reported by brucedaw...@chromium.org, May 23 2016

Issue description

While verifying that  crbug.com/612366  was fixed I noticed another unknown Chrome process (showed up as Subframe: https://google.com in Chrome's task manager) causing ~75-80 context switches per second in Version 53.0.2746.0 canary (64-bit) on Windows. After turning off the site isolation experiment by adding this to the command line:

    --force-fieldtrials=SiteIsolationExtensions/Control

I then found that those context switches were now inside of the hangouts extension. After grabbing an ETW trace I found that the context switches were in three threads, with two of the threads talking to each other:

TID 7,732 wakes up at ~8 Hz on this call stack:
  |    chrome_child.dll!extensions::ObjectBackedNativeHandler::Router
  |    chrome_child.dll!extensions::RuntimeCustomBindings::OpenChannelToExtension
  |    chrome_child.dll!content::RenderThreadImpl::Send
  |    chrome_child.dll!IPC::SyncChannel::Send
  |    chrome_child.dll!IPC::SyncChannel::WaitForReply
  |    chrome_child.dll!base::WaitableEvent::WaitMany
  |    KernelBase.dll!WaitForMultipleObjects
It is woken up by TID 24,348

TID 24,348 wakes up at ~42 Hz on this call stack:
  chrome_child.dll!base::`anonymous namespace'::ThreadFunc
  chrome_child.dll!base::Thread::ThreadMain
  chrome_child.dll!base::Thread::Run
  chrome_child.dll!base::RunLoop::Run
  chrome_child.dll!base::MessagePumpWin::Run
  chrome_child.dll!base::MessagePumpForIO::DoRunLoop
  |- chrome_child.dll!base::MessagePumpForIO::WaitForIOCompletion
  |    KernelBase.dll!GetQueuedCompletionStatus
It is woken up by thread 7,732 two thirds of the time, the remaining times it is woken by the browser process.

TID 24,104 wakes up at 20 Hz on this call stack:
  chrome_child.dll!rtc::PlatformThread::StartThread
  chrome_child.dll!rtc::PlatformThread::Run
  chrome_child.dll!webrtc::RtcEventLogHelperThread::ThreadOutputFunction
  chrome_child.dll!webrtc::RtcEventLogHelperThread::WriteLog
  chrome_child.dll!rtc::Event::Wait
  KernelBase.dll!WaitForSingleObjectEx
It appears to be waking up because a timeout in WaitForSingleObjectEx is expiring.

Task manager shows occasional network activity, but it's not clear what is really going on.

It seems like too much activity for when hangouts is not closed. The only visible page open was an HTML only page. I was signed in with my Google account.

I *think* this is new from last week, but it's hard to be completely certain without more analysis of previous traces.



The 20 Hz wakeups on the WriteLog thread appear to be because of this code at the end of the function:
    wake_up_->Wait(50);
This has been there for about a month, and may just have not been noticed previously.

The other code seems to be constantly checking the network and probably needs to back off to a slower rate?

 
Labels: -Pri-3 Pri-2
Summary: Hangouts is doing ~75 context switches per second, from logging and network I/O (was: Hangouts logging is doing ~75 context switches per second)
Cc: tommi@chromium.org
tommi@ - just making sure you saw this. I think this is unrelated to the previous issue and I think it's not new, but I'm not completely sure. Getting this to zero for most seconds should be the goal I think, for when Hangouts is not actively in use.

Comment 3 by tommi@chromium.org, May 24 2016

Owner: terelius@chromium.org
Bjorn - can you take a look at the RtcEventLogHelperThread issue?
Cc: jansson@chromium.org
Status: Assigned (was: Untriaged)
Status: Started (was: Assigned)
Changing to only run a separate thread when logging to file. I hope to have a CL up tomorrow.
Any update on this? I'm particularly interested in a fix to RtcEventLogHelperThread::ThreadOutputFunction so that I can see whether the remaining idle wakeups are a separate issue.
I have a CL up, and I hope to land it in the webrtc repo today. The change will make the thread sleep until the next event is inserted, if the queues are empty. During normal operation the thread will still wake up periodically, but only if there are elements in the queues i.e. if a PeerConnection is actively being used.

ivoc@ is working on moving ownership of the logging object from PeerConnectionFactory to PeerConnection/Call. After this change, the thread will be terminated when the PeerConnection is terminated.
Fix has landed in WebRTC, but not yet rolled into Chrome. https://codereview.webrtc.org/2035483003/
Owner: brucedaw...@chromium.org
Status: Assigned (was: Started)
Could you verify that the context switches caused by RtcEventLogHelperThread are gone?
Sort of...

I can verify that in Version 53.0.2768.0 canary (64-bit) the context switches from RtcEventLogHelperThread are gone.

However the Hangouts extension process is still waking up frequently - about 50 times a second. When it wakes up it communicates with the browser process, causing *it* to wake up about 50 times a second.

But, I'll file a separate bug for that. It looks like some regression tests for context switch counts in hangouts are called for. It is unfortunate that these are difficult to implement on Windows.

I just opened  crbug.com/620505 . Can you look at that and see if it is a new bug or a separate bug?

BTW, manually measuring context switches with process explorer is easy. Getting context switch call stacks is trickier, although still manageable - I can share details.

The APIActivityLogger::LogInternal is something different, and not related to the WebRTC logs. As far as I understand, it is used to log all extension API calls.
Okay, closing this bug. Discussions are ongoing on the new bug (620505).

Comment 13 by tommi@chromium.org, Jun 16 2016

Thanks. Fwiw I was thinking about WebRTC logging, not APIActivityLogger.
Status: Fixed (was: Assigned)
Oops - forgot to actually tag this as fixed.

Sign in to add a comment