
Issue 899790


Issue metadata

Status: Fixed
Closed: Oct 30
Pri: 3
Type: Bug




De-dupe duplicate domain lists in url_pattern_index

Project Member Reported by csharrison@chromium.org, Oct 29

Issue description

Right now we store vectors of offsets to strings, and many of those vectors can have exactly the same contents. We could save some memory (~50 KB for a current EasyList, for example) if we de-duplicated them, similar to what the existing CreateSharedString does for strings.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Oct 29

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/78f3d60afe620382809b3c4ead82cc5d56eea8c0

commit 78f3d60afe620382809b3c4ead82cc5d56eea8c0
Author: Charlie Harrison <csharrison@chromium.org>
Date: Mon Oct 29 18:21:25 2018

Add a helper for serializing domain rules

This CL shouldn't change behavior.

Bug:  899790 
Change-Id: Ia59af2fafaa95130dfbb2138356c8783c22030b7
Reviewed-on: https://chromium-review.googlesource.com/c/1305493
Reviewed-by: Josh Karlin <jkarlin@chromium.org>
Commit-Queue: Charlie Harrison <csharrison@chromium.org>
Cr-Commit-Position: refs/heads/master@{#603571}
[modify] https://crrev.com/78f3d60afe620382809b3c4ead82cc5d56eea8c0/components/url_pattern_index/url_pattern_index.cc

Project Member

Comment 2 by bugdroid1@chromium.org, Oct 30

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2638ba4cf77361775c12be9fc107f0aa634f05fc

commit 2638ba4cf77361775c12be9fc107f0aa634f05fc
Author: Charlie Harrison <csharrison@chromium.org>
Date: Tue Oct 30 15:42:21 2018

Share vectors of domain offsets in url_pattern_index

URL rules have two vectors of domains: domains to exclude matching on,
and domains to include matching on (i.e. only match on those domains).

Many URL rules share vectors, so we can compress the data by only
writing domain vectors once, and referencing existing vectors when
we hit a duplicate.

This should slow down indexing, but cut down on memory and disk usage.
No meaningful behavior should change.

Extensions code is not modified, so its ruleset version is not
incremented; incrementing it would trigger an unnecessary re-indexing.

Bug:  899790 
Change-Id: Ic5763f8e65185441c1df9774961b9e004f777f31
Reviewed-on: https://chromium-review.googlesource.com/c/1305497
Commit-Queue: Charlie Harrison <csharrison@chromium.org>
Reviewed-by: Josh Karlin <jkarlin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#603912}
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/subresource_filter/core/common/indexed_ruleset.cc
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/subresource_filter/core/common/indexed_ruleset.h
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/url_pattern_index/url_pattern_index.cc
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/url_pattern_index/url_pattern_index.h
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/url_pattern_index/url_pattern_index_unittest.cc
[modify] https://crrev.com/2638ba4cf77361775c12be9fc107f0aa634f05fc/components/url_pattern_index/url_rule_util_unittest.cc
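
The sharing scheme described in the commit message above (write each domain vector once, reference the existing vector on a duplicate) can be sketched roughly as follows. This is a minimal illustration and not the code from the CL: the class SharedDomainListBuilder, its methods, and the Offset alias are hypothetical names, and plain std::vector and std::map containers stand in for the real serialized buffer. The sketch also assumes two lists count as duplicates only when they hold the same domains in the same order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an offset into a serialized buffer.
using Offset = std::uint32_t;

// Hypothetical builder, not the Chromium class. Strings and domain lists are
// each stored once; later requests for equal content return the first offset.
class SharedDomainListBuilder {
 public:
  // Stores `text` once and returns its offset (string-level sharing).
  Offset ShareString(const std::string& text) {
    auto it = string_offsets_.find(text);
    if (it != string_offsets_.end())
      return it->second;
    const Offset offset = static_cast<Offset>(strings_.size());
    strings_.push_back(text);
    string_offsets_.emplace(text, offset);
    return offset;
  }

  // Converts `domains` to a vector of string offsets, then stores that vector
  // only if an identical vector has not been stored before.
  Offset ShareDomainList(const std::vector<std::string>& domains) {
    std::vector<Offset> key;
    key.reserve(domains.size());
    for (const std::string& domain : domains)
      key.push_back(ShareString(domain));

    auto it = list_offsets_.find(key);
    if (it != list_offsets_.end())
      return it->second;  // Duplicate list: reference the existing vector.
    const Offset offset = static_cast<Offset>(lists_.size());
    lists_.push_back(key);
    list_offsets_.emplace(std::move(key), offset);
    return offset;
  }

  std::size_t stored_string_count() const { return strings_.size(); }
  std::size_t stored_list_count() const { return lists_.size(); }

 private:
  std::vector<std::string> strings_;        // Stands in for string storage.
  std::vector<std::vector<Offset>> lists_;  // Stands in for vector storage.
  std::map<std::string, Offset> string_offsets_;
  std::map<std::vector<Offset>, Offset> list_offsets_;
};
```

The extra map lookup per list is the kind of indexing-time cost the commit message accepts in exchange for a smaller ruleset. Two rules that pass the same domain list to ShareDomainList get the same offset back, so only one copy of the vector is stored.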

Status: Fixed (was: Untriaged)
