[subresource_filter] Make IndexedRuleset compact and fast. |
||
Issue descriptionThe IndexedRuleset data structure is memory-mapped whenever subresource_filter is activated. Currently it leads to 2.5+ Mb memory overhead, which is quite a lot, especially on mobile. This bug is aimed at reducing the size of the IndexedRuleset, while still keeping it fast.
,
Apr 7 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/986188a907c91d38938e24604f7b4397fdaaff75 commit 986188a907c91d38938e24604f7b4397fdaaff75 Author: pkalinnikov <pkalinnikov@chromium.org> Date: Fri Apr 07 10:04:35 2017 [subresource_filter] Index activation rules separately. This CL adds a separate index for activation rules. This drastically increases speed of IndexedRuleset::ShouldDisableFilteringForDocument, because currently there are only about 20 rules with activation options, as opposed to tens of thousands URL rules total in the ruleset. Ultimately, this change makes computing activation states for subdocuments very fast. Memory footprint added by the new index is less than a kilobyte. BUG= 708458 Review-Url: https://codereview.chromium.org/2797133006 Cr-Commit-Position: refs/heads/master@{#462825} [modify] https://crrev.com/986188a907c91d38938e24604f7b4397fdaaff75/components/subresource_filter/core/common/flat/rules.fbs [modify] https://crrev.com/986188a907c91d38938e24604f7b4397fdaaff75/components/subresource_filter/core/common/indexed_ruleset.cc [modify] https://crrev.com/986188a907c91d38938e24604f7b4397fdaaff75/components/subresource_filter/core/common/indexed_ruleset.h [modify] https://crrev.com/986188a907c91d38938e24604f7b4397fdaaff75/components/subresource_filter/core/common/indexed_ruleset_unittest.cc
,
Apr 7 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/f529b3a0319b740dcef179629b6fd949d7417cbd commit f529b3a0319b740dcef179629b6fd949d7417cbd Author: pkalinnikov <pkalinnikov@chromium.org> Date: Fri Apr 07 11:02:36 2017 [subresource_filter] Include WebSocket bit to the default element_types. This change makes a difference of about 64 kB in the memory footprint of IndexedRuleset. BUG= 708458 Review-Url: https://codereview.chromium.org/2798363002 Cr-Commit-Position: refs/heads/master@{#462833} [modify] https://crrev.com/f529b3a0319b740dcef179629b6fd949d7417cbd/components/subresource_filter/core/common/flat/rules.fbs [modify] https://crrev.com/f529b3a0319b740dcef179629b6fd949d7417cbd/components/subresource_filter/core/common/indexed_ruleset.cc
,
Apr 10 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e64c5d36227046633521b638ffc2f0b9a903590f commit e64c5d36227046633521b638ffc2f0b9a903590f Author: pkalinnikov <pkalinnikov@chromium.org> Date: Mon Apr 10 12:26:15 2017 [subresource_filter] Check request flags before URL pattern. This CL makes IndexedRuleset's request matching 30% faster by putting the rule flags check before the URL pattern matching. Most of the candidate rules have non-matching flags, like ElementType, and matching the flags is cheaper than that of a URL pattern. BUG= 708458 Review-Url: https://codereview.chromium.org/2801303002 Cr-Commit-Position: refs/heads/master@{#463231} [modify] https://crrev.com/e64c5d36227046633521b638ffc2f0b9a903590f/components/subresource_filter/core/common/indexed_ruleset.cc [modify] https://crrev.com/e64c5d36227046633521b638ffc2f0b9a903590f/components/subresource_filter/core/common/indexed_ruleset_unittest.cc
,
Apr 13 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/154fc5ad392d229570cf7332ce8fb686da7d96c0 commit 154fc5ad392d229570cf7332ce8fb686da7d96c0 Author: pkalinnikov <pkalinnikov@chromium.org> Date: Thu Apr 13 17:38:14 2017 [IndexedRuleset] Improve worst-case domain list matching. Some URL rules have long domain lists (up to 100-200 items). The previous solution simply scanned the list linearly whenever the rule was being matched against a network request, and checked each domain to be a subdomain of the document's origin. The CL replaces this algorithm. Now the origin which is matched against the list is scanned. Each of its subdomains (in descending order of length) gets binary-searched in the sorted list of domains, until some of them is found. The old linear scanning still has place for short domain lists. Quick measurements have shown that the length threshold of 5 achieves a decent trade-off. The new binary-search-based algorithm activates only for domain lists longer than 5 items. Additionally, the CL changes the FlatBuffers format of the ruleset by splitting domain list into 2 parts (included/excluded) instead of embedding the exclusion bit into domain strings. This CL should improve the long-tail/worst-case matching performance. BUG= 708458 Review-Url: https://codereview.chromium.org/2816693002 Cr-Commit-Position: refs/heads/master@{#464457} [modify] https://crrev.com/154fc5ad392d229570cf7332ce8fb686da7d96c0/components/subresource_filter/core/common/flat/rules.fbs [modify] https://crrev.com/154fc5ad392d229570cf7332ce8fb686da7d96c0/components/subresource_filter/core/common/indexed_ruleset.cc [modify] https://crrev.com/154fc5ad392d229570cf7332ce8fb686da7d96c0/components/subresource_filter/core/common/indexed_ruleset_unittest.cc
,
Jun 12 2017
Closing this bug since I don't have plans to optimize further. Charlie, feel free to reopen if needed. |
||
►
Sign in to add a comment |
||
Comment 1 by bugdroid1@chromium.org
, Apr 6 2017