Add HostResolver Mojo interface and convert consumers. mmenke@ said this needs some design work, but can largely just keep the same API, and perhaps remove the method to clear the cache.
Still digging into it, but this failure leading to the rollback is a weird one. Culprit finder found a few different environments with the issue, some of which were tested (and passed) with the original CQ. And so far, everything seems to be passing fine in my newly-created retry CL.
Within the failed test reports from culprit finder, the failures are all newly-added-in-the-CL _ResolveHost tests, so it does seem related to the CL. But the failures themselves don't make much sense to me. They're all cases testing the result of a host resolution. In the memory dumps of expected vs actual IPEndPoint objects, the actuals look reasonable (e.g. "00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-01 10-00 50-00", which parses out to [::1]:80, i.e. localhost) but the expecteds look like garbage data (e.g. "01-AE 9D-0C 4F-02 00-00 00-00 00-00 00-00 00-00 00-02 50-00"). Is something going wrong with the IP-from-string parsing used by the tests?
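For anyone who wants to eyeball these dumps without decoding them by hand, here is a rough Python sketch. It assumes a 20-byte object layout that is NOT confirmed anywhere in this thread: 16 address bytes, a 1-byte address size, 1 padding byte, then a little-endian 16-bit port. The function name is made up for illustration.

```python
import ipaddress


def decode_endpoint_dump(dump):
    """Decode a byte dump like "00-00 00-00 ... 50-00".

    ASSUMED layout (not confirmed in this thread): 16 address bytes,
    1 size byte, 1 padding byte, 2-byte little-endian port.
    Returns (address_string_or_None, size, port).
    """
    raw = bytes.fromhex(dump.replace("-", "").replace(" ", ""))
    if len(raw) != 20:
        raise ValueError("expected 20 bytes, got %d" % len(raw))

    addr_bytes = raw[:16]
    size = raw[16]
    port = int.from_bytes(raw[18:20], "little")

    if size == 16:
        address = str(ipaddress.IPv6Address(addr_bytes))
    elif size == 4:
        address = str(ipaddress.IPv4Address(addr_bytes[:4]))
    else:
        # Neither an IPv4 nor an IPv6 length: treat the address as undecodable.
        address = None
    return address, size, port


actual = "00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-01 10-00 50-00"
expected = "01-AE 9D-0C 4F-02 00-00 00-00 00-00 00-00 00-00 00-02 50-00"
print(decode_endpoint_dump(actual))
print(decode_endpoint_dump(expected))
```

Under that assumed layout, the actual value decodes to ::1 with size 16 and port 80, while the expected value has a size byte of 0, so only its port (80) looks sane. If the real layout differs, the interpretation of the second dump changes accordingly.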
I'm running out of ideas. Nothing jumps out as wrong with the relevant parsing code, and I can't find any recent changes to any of it.
Maybe I'll just try resubmitting the CL, and we'll see if it causes any issues again. Maybe something else crazy was going on at the time and then culprit finder found my stuff because it added new tests (that then failed because of the crazy stuff going on). It's a weak theory, but maybe worth a try...
Lots of work remains here. The biggest thing is to implement the service API itself (https://chromium-review.googlesource.com/c/chromium/src/+/1113665, still being finished up). After that, we need to convert at least the code we care about for servicification to that new API (we may do that under a separate bug). And while doing that, we're likely to find new cases that need to be handled by the API, leading to more work there.
I'm also investigating a quicker plan to unblock canary before we finish all the main DNS servicification work for this bug. We would then be able to take more time to do all the actual DNS servicification work, as it's not an effort we are comfortable with rushing.
Notes for the record on a backup "minimal" plan to unblock servicification canary without actually finishing up this bug and the rest of DNS servicification:
*The only root reason this is a canary blocker is the separate host caches: for privacy, they would all need to be cleared whenever we attempt to clear any of them.
*We could unblock canary by adding code to all the cache clearing locations (browser data removal, closing incognito) to, if network service is enabled, also directly clear the old "local" cache rather than just the service cache.
*Would have minor performance implications to have separate caches, but at least the important privacy implications would be handled.
*Better solution is still to continue with DNS servicification, so we'll only plan on implementing this if DNS becomes the long pole for canary. Should be a fairly simple CL if we decide to do the minimal shortcut.
*Whether we do this "minimal" plan or just continue with straight DNS servicification, there is no plan to rush the actual DNS API work. We will take our time with that to ensure it is done right, and it will likely take quite a bit of time.
-> without actually finishing up this bug and the rest of DNS servicification *first*. We would still finish that at our leisure once unblocking canary.
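The "minimal" plan in the notes above can be modeled roughly as follows. This is a toy Python model, not Chromium code: the HostCache class, the clear_host_caches function, and the flag name are all hypothetical stand-ins for the real clearing sites (browser data removal, closing incognito).

```python
class HostCache:
    """Toy stand-in for a DNS host cache (hypothetical, not the real class)."""

    def __init__(self):
        self._entries = {}

    def set(self, host, address):
        self._entries[host] = address

    def clear(self):
        self._entries.clear()

    def size(self):
        return len(self._entries)


def clear_host_caches(network_service_enabled, service_cache, local_cache):
    """Hypothetical clearing site following the backup plan.

    With the network service enabled there are two caches, so both must
    be cleared for the privacy guarantee to hold. Without it, only the
    old in-process cache is in use.
    """
    if network_service_enabled:
        service_cache.clear()  # what the clearing site already does
        local_cache.clear()    # the proposed extra step
    else:
        local_cache.clear()


service, local = HostCache(), HostCache()
service.set("example.test", "192.0.2.1")
local.set("example.test", "192.0.2.1")
clear_host_caches(True, service, local)
print(service.size(), local.size())
```

The point of the sketch is only that every clearing site needs the extra call. The performance cost of keeping two caches is untouched by this change.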
Comment 1 by xunji...@chromium.org, May 9 2018