De-dupe duplicate domain lists in url_pattern_index
Issue description
Right now we have vectors of offsets to strings that could have exactly the same contents. We could save some memory (~50 KB for a current EasyList, for example) if we de-dupe these, as the existing CreateSharedString does for strings.
Oct 30
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/2638ba4cf77361775c12be9fc107f0aa634f05fc

commit 2638ba4cf77361775c12be9fc107f0aa634f05fc
Author: Charlie Harrison <csharrison@chromium.org>
Date: Tue Oct 30 15:42:21 2018

Share vectors of domain offsets in url_pattern_index

URL rules have two vectors of domains: domains to exclude matching on, and domains to include matching on (i.e. only match on those domains). Many URL rules share vectors, so we can compress the data by only writing domain vectors once, and referencing existing vectors when we hit a duplicate.

This should slow down indexing, but cut down on memory and disk usage. No meaningful behavior should change.

Extensions code is not modified, so their ruleset version is not incremented, which would trigger an unnecessary re-indexing.

Bug: 899790
Change-Id: Ic5763f8e65185441c1df9774961b9e004f777f31
Reviewed-on: https://chromium-review.googlesource.com/c/1305497
Commit-Queue: Charlie Harrison <csharrison@chromium.org>
Reviewed-by: Josh Karlin <jkarlin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#603912}

[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/subresource_filter/core/common/indexed_ruleset.cc
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/subresource_filter/core/common/indexed_ruleset.h
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/url_pattern_index/url_pattern_index.cc
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/url_pattern_index/url_pattern_index.h
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/url_pattern_index/url_pattern_index_unittest.cc
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/url_pattern_index/url_rule_util_unittest.cc
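
For readers unfamiliar with the idea, the "write each domain vector once, reference it on a duplicate" approach can be sketched roughly as below. This is not the code from the commit: `DomainListPool`, `Intern`, `Get`, and `stored_lists` are hypothetical names, and a plain in-memory `std::vector` stands in for the serialized buffer that the real indexer writes to.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only: interns domain lists so that identical
// lists are stored once and later rules reuse the earlier entry.
class DomainListPool {
 public:
  // Returns the index of a stored list equal to `domains`. The list is
  // appended only if no identical list has been stored before.
  std::size_t Intern(std::vector<std::string> domains) {
    auto it = index_.find(domains);
    if (it != index_.end())
      return it->second;  // Duplicate: reference the existing list.
    const std::size_t id = lists_.size();
    index_.emplace(domains, id);
    lists_.push_back(std::move(domains));
    return id;
  }

  // Returns the stored list for an index previously returned by Intern().
  const std::vector<std::string>& Get(std::size_t id) const {
    return lists_[id];
  }

  // Number of distinct lists actually stored.
  std::size_t stored_lists() const { return lists_.size(); }

 private:
  // Maps list contents to the index of the stored copy. The extra lookup
  // per rule is the indexing cost traded for a smaller result.
  std::map<std::vector<std::string>, std::size_t> index_;
  std::vector<std::vector<std::string>> lists_;
};
```

In this sketch two lists count as duplicates only if they hold the same domains in the same order; whether the real indexer normalizes order before comparing is not stated in the commit message.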
Oct 30
Comment 1 by bugdroid1@chromium.org, Oct 29