New issue
Advanced search Search tips

Issue 813445 link

Starred by 5 users

Issue metadata

Status: Fixed
Owner: ----
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 3
Type: Bug



Sign in to add a comment

ChromeMac frozen for ~2 minutes - stuck in content::ResourceDispatcherHostImpl::UpdateLoadStateOnUI

Project Member Reported by tapted@chromium.org, Feb 18 2018

Issue description

Chrome Version       : 66.0.3343.3
OS Version: OS X 10.13.3
URLs (if applicable) : https://allo.google.com/web (maybe not applicable - that's what I had open though)

What steps will reproduce the problem?
1. Computer was idle for a while - possibly sleeping
2. Unlock screen, try to interact with Chrome

What is the expected result?

Responsive Chrome.

What happens instead of that?

Button hover effects activate, but never deactivate. Clicking buttons/menus doesn not work. Tab loading spinner spins though. Took a sample:
 - 86% of samples are in a callback that invokes content::ResourceDispatcherHostImpl::UpdateLoadStateOnUI

Maybe there is a huge backlog of messages that got dumped on the UI thread when I unlocked my screen.

The samples point to some things being slow that really shouldn't be, like
  content::ResourceDispatcherHostImpl::UpdateLoadStateOnUI calling free() (~400 samples)
  GURL::~GURL (~30 samples)
  content::WebContentsImpl::FromRenderFrameHostID (~300 samples)

 
Chrome-unusable.txt
484 KB View Download
Chrome-unusable crsym.txt
705 KB View Download

Comment 1 by ajha@chromium.org, Feb 19 2018

Labels: Needs-Triage-M66
Cc: vamshi.kommuri@chromium.org
Labels: Needs-Feedback Triaged-ET
Thanks for filing the issue!

@Reporter: We kept our Mac idle for some time then tried unlocking followed by using Chrome browser, it was functioning well with out getting stuck. Could you please elaborate whether you are running anything before the Machine is kept idle. It would be highly helpful if clarified on the "content::ResourceDispatcherHostImpl::UpdateLoadStateOnUI" as ET team is not very clear what exactly does that mean! Any further inputs from your end may help us to triage the issue in a better way.
Components: Internals>Network
Adding Internals>Network so it gets on networking team's queue (since it's sort of a content/browser/loader thing)

tapted: is possible you have many outstanding network requests when you opened your laptop? This method iterates through all of them so it could be a source of slowness. We call a function that posts this task to the UI thread every 250ms.

Perhaps we could implement this with cancellation so if multiple tasks are posted, we cancel previous ones that haven't run yet.
Components: UI>Browser>History
I also see ~1000 counts for history::HistoryBackend::GetCountsAndLastVisitForOrigins, due to some calls to history::HistoryService::NotifyURLsDeleted.
Cc: gab@chromium.org fdoray@chromium.org
Labels: -Needs-Feedback
Removing Needs-Feedback for now, dev team knows what UpdateLoadStateOnUI is.

The internal timeticks clock keeps running on Mac when the system is sleeping. I wonder if somehow that could cause repeating timers to go haywire on resume, causing a backlog of tasks. I think a new UpdateLoadStateonUI only needs to be scheduled after the previous one runs, which is how I thought the base timers worked.
+gab,fdoray if they've seen something like it or if my understanding is off.

We can probably optimize the load state updates but unless you have many many resources in flight I can't imagine it'll be too slow. If I have time this triage shift I may be able to land a cleanup or two.

Even with millions of requests in flight, UpdateLoadStateOnUI only iterates over frames (Main frames?  Not sure), not requests, so this seems a really weird place to hang.
I think we do iterate through all the requests to generate the WebContents map, and indeed in the trace it seems like PickMoreInterestingLoadInfo is a bottleneck.
Project Member

Comment 8 by bugdroid1@chromium.org, Mar 6 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/911fa064e3cf8c8ab006b39277d01094e05bfdf1

commit 911fa064e3cf8c8ab006b39277d01094e05bfdf1
Author: Charles Harrison <csharrison@chromium.org>
Date: Tue Mar 06 21:01:46 2018

Change LoadInfo to only store URL hosts

The WebContentsImpl is the only consumer of this information, and it
only cares about the host.

URLs can be huge (up to 2MB), while hosts are much much smaller.
It isn't a good idea to copy the URL for every outgoing request to
the UI thread every 250ms.

Note: previously this code did some grouping by RVid (I think) before
thread hopping. This would further alleviate copying issues when there
are lots of requests. Probably we could just use two sets:
1. Global frame routing id -> LoadInfo for subresources
2. All navigation requests

Bug:  813445 
Change-Id: Ied3709fac7bba079d5e31694dc8e477d1c7f2f68
Reviewed-on: https://chromium-review.googlesource.com/951364
Reviewed-by: Matt Menke <mmenke@chromium.org>
Reviewed-by: Avi Drissman <avi@chromium.org>
Commit-Queue: Charlie Harrison <csharrison@chromium.org>
Cr-Commit-Position: refs/heads/master@{#541189}
[modify] https://crrev.com/911fa064e3cf8c8ab006b39277d01094e05bfdf1/content/browser/loader/loader_delegate.h
[modify] https://crrev.com/911fa064e3cf8c8ab006b39277d01094e05bfdf1/content/browser/loader/resource_dispatcher_host_impl.cc
[modify] https://crrev.com/911fa064e3cf8c8ab006b39277d01094e05bfdf1/content/browser/loader/resource_dispatcher_host_impl.h
[modify] https://crrev.com/911fa064e3cf8c8ab006b39277d01094e05bfdf1/content/browser/loader/resource_dispatcher_host_unittest.cc
[modify] https://crrev.com/911fa064e3cf8c8ab006b39277d01094e05bfdf1/content/browser/loader_delegate_impl.cc
[modify] https://crrev.com/911fa064e3cf8c8ab006b39277d01094e05bfdf1/content/browser/loader_delegate_impl.h
[modify] https://crrev.com/911fa064e3cf8c8ab006b39277d01094e05bfdf1/content/browser/web_contents/web_contents_impl.cc
[modify] https://crrev.com/911fa064e3cf8c8ab006b39277d01094e05bfdf1/content/browser/web_contents/web_contents_impl.h

Cc: csharrison@chromium.org
Labels: M-66
Status: Untriaged (was: Unconfirmed)
Changing the status to Untriaged and requesting csharrison@ to update the bug status accordingly if there is no further work to be done here.

Thanks!
I have another fix to UpdateLoadState in review at https://chromium-review.googlesource.com/c/chromium/src/+/951743. However, these will only mitigate the issue, not solve root cause which seems like it could be due to a large amount of tasks queuing up during Mac sleep.

I took a look through the tracker and it seems like issue 808031 is quite similar.
Project Member

Comment 11 by bugdroid1@chromium.org, Mar 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4c135b3c228cfb4c595c8e760428f1ef6a6bb966

commit 4c135b3c228cfb4c595c8e760428f1ef6a6bb966
Author: Charles Harrison <csharrison@chromium.org>
Date: Fri Mar 09 18:23:05 2018

Aggregate LoadState per-frame before copying to UI

Currently we copy hosts + other state for every outgoing request to the
UI thread before doing per-WebContents aggregation there. This can
involve a lot of copying of data around. This CL adds some initial
per-frame aggregation beforehand to reduce this.

Also, we add a browsertest for this functionality, since none previously
existed.

Bug:  813445 
Cq-Include-Trybots: master.tryserver.chromium.linux:linux_mojo
Change-Id: I33438bdd5d4cab9caf3fcd07419fef3f7153935e
Reviewed-on: https://chromium-review.googlesource.com/951743
Reviewed-by: Avi Drissman <avi@chromium.org>
Reviewed-by: Matt Menke <mmenke@chromium.org>
Commit-Queue: Charlie Harrison <csharrison@chromium.org>
Cr-Commit-Position: refs/heads/master@{#542157}
[modify] https://crrev.com/4c135b3c228cfb4c595c8e760428f1ef6a6bb966/content/browser/loader/resource_dispatcher_host_impl.cc
[modify] https://crrev.com/4c135b3c228cfb4c595c8e760428f1ef6a6bb966/content/browser/loader/resource_dispatcher_host_impl.h
[modify] https://crrev.com/4c135b3c228cfb4c595c8e760428f1ef6a6bb966/content/browser/web_contents/web_contents_impl_browsertest.cc
[modify] https://crrev.com/4c135b3c228cfb4c595c8e760428f1ef6a6bb966/content/test/BUILD.gn
[modify] https://crrev.com/4c135b3c228cfb4c595c8e760428f1ef6a6bb966/testing/buildbot/filters/mojo.fyi.network_content_browsertests.filter

Owner: csharrison@chromium.org
Status: Started (was: Untriaged)
Loading triage.
Let me assign csharrison@ and change the status to 'Started'.
Owner: ----
Status: Untriaged (was: Started)
Sorry, I was only really working on this because I was on triage duty and had some spare cycles.

I implemented two low hanging fruits which should improve things but might not really get to the root cause of the issue (task buildup?). It could use more triage.
Labels: Network-Triaged
This is expected to be completely rewritten for the network service - probably not worth digging into.

Comment 15 by gab@chromium.org, Mar 14 2018

Re. lots of tasks on wakeup : we've discussed this previously in issue 707962 (hadn't realized it was closed, need to open new one). We know that the jump in TimeTicks on system wake up causes a lot of delayed tasks to expire (and they all want to execute *right now* even if it was delayed by 6 hours initially... We're thinking of addressing that with TaskScheduler)

Re. Repeating timers : they shouldn't have more than one task each in the queue (the next delay is only posted when the current one runs).
Thanks for the extra info gab. Unless there's some bug in our repeating timer implementation (which seems unlikely, and I recently looked through the code to verify behavior), I think this may just be caused by having tons of outstanding requests, with potentially long URLs.

tapted: If you can repro on a newer canary with these fixes it would falsify this hypothesis.
Labels: Needs-Feedback
I'll keep an eye out - haven't seen anything since filing this. (and thanks for taking a look!)
Status: Fixed (was: Untriaged)
#15: this could also be related to issue 808031.

In any case, since tapted@ hasn't seen anything further in a month (I assume? :)), I'll mark this one Fixed and let the remaining work be tracked on issue 808031.

Sign in to add a comment