
Issue 812478 link


Issue metadata

Status: Assigned
Owner:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




KeywordWebDataService posts too many tasks

Project Member Reported by ssid@chromium.org, Feb 15 2018

Issue description

If a misbehaving website causes repeated redirects, a large number of URL commits happen, so NotifyUrlVisited is called many times. This in turn causes the KeywordWebDataService to post a large number of DB tasks. Is it possible to avoid posting one task per URL and instead post tasks in batches?
A stack trace of the posted tasks is attached.
 
keywork_tasks.png (135 KB)

Comment 1 by ssid@chromium.org, Feb 16 2018

Components: UI>Browser>Search
Owner: pkasting@chromium.org
The attached trace shows 600k tasks in the DB task queue. I guess each task writes to the database and takes too long, which ends up keeping ~150MB in memory for that long. Is there any way to have a cache and not update the DB every time?
In cases where this kind of spam happens, the URL or the origin is usually the same. Maybe we could use that?
Do you know of (or could construct) a test page I can visit to repro this behavior?

That would help in designing and testing a fix.

In principle we could probably cancel some earlier tasks in favor of later ones, but a couple of things make this challenging. First, because there's already batching, we might be posting tasks that include multiple updates rather than single ones, so we can't necessarily cancel earlier tasks outright. Second, we could only safely cancel if the earlier writes would be completely overwritten by later ones (i.e. the earlier task changes no field that the later tasks leave untouched).
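
To make the second condition concrete, here is a minimal sketch of the "completely overwritten" check. It is not Chromium code: the FieldUpdate type, the FullyOverwrittenBy helper, and the field names mentioned in the note below are all hypothetical, and a real update would carry typed fields rather than strings.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical representation of one pending write: field name -> new value.
using FieldUpdate = std::map<std::string, std::string>;

// Returns true only if every field written by `earlier` is also written by
// `later`, i.e. dropping `earlier` cannot change the final database state.
bool FullyOverwrittenBy(const FieldUpdate& earlier, const FieldUpdate& later) {
  for (const auto& entry : earlier) {
    if (later.count(entry.first) == 0)
      return false;  // `earlier` touches a field that `later` leaves alone.
  }
  return true;
}
```

For example, an earlier update that only sets a (hypothetical) "last_visited" field could be dropped in favor of a later one that sets both "last_visited" and "usage_count", but not the other way around.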

Depending on how fast the page is generating updates, a simple way to bypass the above issues might just be to change code like this:

enter batch mode
add changes to queue
flush batch mode

to this:

enter batch mode
add changes to queue
if no such task already exists, post a 5 sec-delayed task to flush batch mode

This would collect updates into 5-second windows before posting them to DB tasks.  There could be shutdown consequences here (we'd probably want to detect shutdown and flush immediately).
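
A rough, self-contained sketch of that delayed-flush pattern follows. It is an illustration under stated assumptions, not the actual KeywordWebDataService code: FakeTaskRunner, BatchedKeywordWriter, and the counters are hypothetical stand-ins, and the fake runner uses a manually advanced clock in place of a real delayed-task API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a task runner, driven by a manual clock.
class FakeTaskRunner {
 private:
  struct Pending {
    int run_at;
    std::function<void()> task;
  };

 public:
  void PostDelayedTask(std::function<void()> task, int delay_seconds) {
    Pending p;
    p.run_at = now_ + delay_seconds;
    p.task = std::move(task);
    pending_.push_back(std::move(p));
  }

  // Advances the clock and runs every task whose deadline has passed.
  void AdvanceBy(int seconds) {
    now_ += seconds;
    std::vector<Pending> due, later;
    for (auto& p : pending_) {
      if (p.run_at <= now_)
        due.push_back(std::move(p));
      else
        later.push_back(std::move(p));
    }
    pending_.swap(later);
    for (auto& p : due)
      p.task();
  }

 private:
  int now_ = 0;
  std::vector<Pending> pending_;
};

// Hypothetical writer that collects updates and flushes them in one DB task.
class BatchedKeywordWriter {
 public:
  static constexpr int kFlushDelaySeconds = 5;

  explicit BatchedKeywordWriter(FakeTaskRunner* runner) : runner_(runner) {
    assert(runner_ != nullptr);
  }

  void NotifyUrlVisited(const std::string& url) {
    queued_.push_back(url);  // Add the change to the queue.
    if (flush_scheduled_)
      return;  // A delayed flush already exists; do not post another.
    flush_scheduled_ = true;
    runner_->PostDelayedTask([this] { Flush(); }, kFlushDelaySeconds);
  }

  // On shutdown, flush immediately instead of waiting for the delayed task.
  void OnShutdown() { Flush(); }

  int db_tasks_posted() const { return db_tasks_posted_; }
  std::size_t updates_written() const { return updates_written_; }

 private:
  void Flush() {
    flush_scheduled_ = false;
    if (queued_.empty())
      return;  // Nothing queued, e.g. shutdown already flushed.
    ++db_tasks_posted_;  // One DB task covers the whole window.
    updates_written_ += queued_.size();
    queued_.clear();
  }

  FakeTaskRunner* runner_;
  std::vector<std::string> queued_;
  bool flush_scheduled_ = false;
  int db_tasks_posted_ = 0;
  std::size_t updates_written_ = 0;
};
```

With this shape, any number of NotifyUrlVisited calls inside one 5-second window results in a single DB task, and OnShutdown covers the case where the delayed task would otherwise never run. The sketch assumes the writer outlives any pending delayed task; real code would need a weak pointer or cancellation for that.
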
Status: Assigned (was: Untriaged)
This bug has an owner, thus, it's been triaged. Changing status to "assigned".
