New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 708458 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Jun 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Feature

Blocking:
issue 609747



Sign in to add a comment

[subresource_filter] Make IndexedRuleset compact and fast.

Project Member Reported by pkalinnikov@chromium.org, Apr 5 2017

Issue description

The IndexedRuleset data structure is memory-mapped whenever subresource_filter is activated. Currently it leads to 2.5+ Mb memory overhead, which is quite a lot, especially on mobile.

This bug is aimed at reducing the size of the IndexedRuleset, while still keeping it fast.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Apr 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c4b33b2fe691abd42aeeaf5489a68bd5b53b2e08

commit c4b33b2fe691abd42aeeaf5489a68bd5b53b2e08
Author: pkalinnikov <pkalinnikov@chromium.org>
Date: Thu Apr 06 08:42:08 2017

[subresource_filter] Don't store no-op rules in IndexedRuleset.

Currently there are more than 1000 URL rules wasting space because they are not
supported. These are a few rules with REGEXP pattern type, and lots of
popup-only filtering rules, as well as 500+ rules controlling activation of CSS
filtering.

This CL removes such rules from IndexedRuleset, reducing its size by ~75 kB.

BUG= 708458 

Review-Url: https://codereview.chromium.org/2796133002
Cr-Commit-Position: refs/heads/master@{#462397}

[modify] https://crrev.com/c4b33b2fe691abd42aeeaf5489a68bd5b53b2e08/components/subresource_filter/core/common/indexed_ruleset.cc
[modify] https://crrev.com/c4b33b2fe691abd42aeeaf5489a68bd5b53b2e08/components/subresource_filter/core/common/indexed_ruleset_unittest.cc

Project Member

Comment 2 by bugdroid1@chromium.org, Apr 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/986188a907c91d38938e24604f7b4397fdaaff75

commit 986188a907c91d38938e24604f7b4397fdaaff75
Author: pkalinnikov <pkalinnikov@chromium.org>
Date: Fri Apr 07 10:04:35 2017

[subresource_filter] Index activation rules separately.

This CL adds a separate index for activation rules. This drastically increases
speed of IndexedRuleset::ShouldDisableFilteringForDocument, because currently
there are only about 20 rules with activation options, as opposed to tens of
thousands URL rules total in the ruleset. Ultimately, this change makes
computing activation states for subdocuments very fast.

Memory footprint added by the new index is less than a kilobyte.

BUG= 708458 

Review-Url: https://codereview.chromium.org/2797133006
Cr-Commit-Position: refs/heads/master@{#462825}

[modify] https://crrev.com/986188a907c91d38938e24604f7b4397fdaaff75/components/subresource_filter/core/common/flat/rules.fbs
[modify] https://crrev.com/986188a907c91d38938e24604f7b4397fdaaff75/components/subresource_filter/core/common/indexed_ruleset.cc
[modify] https://crrev.com/986188a907c91d38938e24604f7b4397fdaaff75/components/subresource_filter/core/common/indexed_ruleset.h
[modify] https://crrev.com/986188a907c91d38938e24604f7b4397fdaaff75/components/subresource_filter/core/common/indexed_ruleset_unittest.cc

Project Member

Comment 3 by bugdroid1@chromium.org, Apr 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f529b3a0319b740dcef179629b6fd949d7417cbd

commit f529b3a0319b740dcef179629b6fd949d7417cbd
Author: pkalinnikov <pkalinnikov@chromium.org>
Date: Fri Apr 07 11:02:36 2017

[subresource_filter] Include WebSocket bit to the default element_types.

This change makes a difference of about 64 kB in the memory footprint of
IndexedRuleset.

BUG= 708458 

Review-Url: https://codereview.chromium.org/2798363002
Cr-Commit-Position: refs/heads/master@{#462833}

[modify] https://crrev.com/f529b3a0319b740dcef179629b6fd949d7417cbd/components/subresource_filter/core/common/flat/rules.fbs
[modify] https://crrev.com/f529b3a0319b740dcef179629b6fd949d7417cbd/components/subresource_filter/core/common/indexed_ruleset.cc

Project Member

Comment 4 by bugdroid1@chromium.org, Apr 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e64c5d36227046633521b638ffc2f0b9a903590f

commit e64c5d36227046633521b638ffc2f0b9a903590f
Author: pkalinnikov <pkalinnikov@chromium.org>
Date: Mon Apr 10 12:26:15 2017

[subresource_filter] Check request flags before URL pattern.

This CL makes IndexedRuleset's request matching 30% faster by putting the rule
flags check before the URL pattern matching. Most of the candidate rules have
non-matching flags, like ElementType, and matching the flags is cheaper than
that of a URL pattern.

BUG= 708458 

Review-Url: https://codereview.chromium.org/2801303002
Cr-Commit-Position: refs/heads/master@{#463231}

[modify] https://crrev.com/e64c5d36227046633521b638ffc2f0b9a903590f/components/subresource_filter/core/common/indexed_ruleset.cc
[modify] https://crrev.com/e64c5d36227046633521b638ffc2f0b9a903590f/components/subresource_filter/core/common/indexed_ruleset_unittest.cc

Project Member

Comment 5 by bugdroid1@chromium.org, Apr 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/154fc5ad392d229570cf7332ce8fb686da7d96c0

commit 154fc5ad392d229570cf7332ce8fb686da7d96c0
Author: pkalinnikov <pkalinnikov@chromium.org>
Date: Thu Apr 13 17:38:14 2017

[IndexedRuleset] Improve worst-case domain list matching.

Some URL rules have long domain lists (up to 100-200 items). The previous
solution simply scanned the list linearly whenever the rule was being matched
against a network request, and checked each domain to be a subdomain of the
document's origin.

The CL replaces this algorithm. Now the origin which is matched against the
list is scanned. Each of its subdomains (in descending order of length) gets
binary-searched in the sorted list of domains, until some of them is found.

The old linear scanning still has place for short domain lists. Quick
measurements have shown that the length threshold of 5 achieves a decent
trade-off. The new binary-search-based algorithm activates only for domain
lists longer than 5 items.

Additionally, the CL changes the FlatBuffers format of the ruleset by splitting
domain list into 2 parts (included/excluded) instead of embedding the exclusion
bit into domain strings.

This CL should improve the long-tail/worst-case matching performance.

BUG= 708458 

Review-Url: https://codereview.chromium.org/2816693002
Cr-Commit-Position: refs/heads/master@{#464457}

[modify] https://crrev.com/154fc5ad392d229570cf7332ce8fb686da7d96c0/components/subresource_filter/core/common/flat/rules.fbs
[modify] https://crrev.com/154fc5ad392d229570cf7332ce8fb686da7d96c0/components/subresource_filter/core/common/indexed_ruleset.cc
[modify] https://crrev.com/154fc5ad392d229570cf7332ce8fb686da7d96c0/components/subresource_filter/core/common/indexed_ruleset_unittest.cc

Status: Fixed (was: Started)
Closing this bug since I don't have plans to optimize further. Charlie, feel free to reopen if needed.

Sign in to add a comment