Optimize SameDomainOrHost via a fast-fail path |
||||
Issue description
SameDomainOrHost() is a function in registry_controlled_domain.cc that
determines whether two input strings have the same domain. Currently it works
like this:
bool SameDomainOrHost(base::StringPiece host1,
base::StringPiece host2,
PrivateRegistryFilter filter) {
// Check for exact host matches, which is faster than looking up the domain
// and registry.
if (host1 == host2)
return true;
// Check for a domain and registry match.
const base::StringPiece& domain1 =
GetDomainAndRegistryAsStringPiece(host1, filter);
return !domain1.empty() &&
(domain1 == GetDomainAndRegistryAsStringPiece(host2, filter));
}
We might have an opportunity for a fast-fail path after the first call to
GetDomainAndRegistryAsStringPiece; once we obtain domain1, we can check whether
host2.endsWith(domain1).
As a further optimization, if we calculated the domain of the longer host first,
we might have a chance to fail-fast with a length difference from endsWith().
This idea emerged in https://codereview.chromium.org/2568133007/
csharrison says: This is probably worth it. I think a big chunk of time we end
up calling SameDomainOrHost we are calling it with a subresource URL and a
first_party_for_cookies URL. I would expect in cases where the hosts are not
identical, it is more probable for the domains to be different.
,
Jan 4 2017
The idea is that host2.endsWith(domain1) is a necessary (not sufficient) condition for domain1 == GetDomainAndRegistryAsStringPiece(host2, filter) to be true.
,
Jan 6 2017
Patch is up at http://codereview.chromium.org/2612313003, but we'd still need to actually measure if it's actually faster.
,
Jan 6 2017
It'll be hard to evaluate this change without knowing the distribution of inputs this function typically gets. Maybe we should log the inputs of this function for a small set of popular pages? rsleevi, do you have a good sense of this?
,
Jan 6 2017
csharrison: I'm not quite sure what you're proposing, but it doesn't sound correct. This is used whenever loading profiles or sending requests, so a representative test is simply comparing two sets of existing profiles that have accumulated browsing state. It would be the long-tail of sites the user sees that more negatively affects the performance, rather than a few 'top sites', so a realistic test is one that contains more than just 'popular pages'. I would argue that the test simply needs to use a fixed comparison of data; taking an existing 'real world' profile and comparing the performance would be fine IMO.
,
Jan 6 2017
Sorry, "popular" is the wrong word, maybe "representative"? I'm mostly nervous about popular sites because my gut tells me that they are more likely to use subdomains for resource requests. For this particular optimization those sites will perform worse because they must do an additional endsWith scan. Of course, my intuition may be off-base.
,
Sep 26
,
Jan 11
This issue has been marked as started, but has no owner. Making available. |
||||
►
Sign in to add a comment |
||||
Comment 1 by rsleevi@chromium.org
, Jan 4 2017