foo.test triggers a google search instead of DNS resolution
Reported by
joa...@gmail.com,
Nov 7 2017
Issue description
UserAgent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3262.0 Safari/537.36

Example URL: http://foo.test

Steps to reproduce the problem:
1. Go to foo.test in the omnibar
2. See chromium fail miserably opening google.com/search

What is the expected behavior?
The site under foo.test should be rendered.

What went wrong?
Chromium fails to respect the RFC 2606 (and 6761?).
> ".test" is recommended for use in testing of current or new DNS related code.

Did this work before? N/A

Chrome version: 64.0.3262.0 Channel: dev
OS Version: 10.0
Flash Version:

I'm only facing this issue since .dev, which is a gTLD owned by Google, got HSTS, so I had to move my sites to .test, which Google Chrome failed miserably to render.
,
Nov 7 2017
joaumg@gmail.com: Please respect our Code of Conduct (https://chromium.googlesource.com/chromium/src/+/master/CODE_OF_CONDUCT.md) and be respectful and constructive. Saying that "foo.test" is treated as a search term and not a URL would be sufficient to describe the issue and your expectation. It's not helpful to describe this (twice) as "failing miserably".

Note that there's a workaround available that I believe would work in your case. If you enter the test hostname with a scheme ("http://foo.test") and it successfully resolves and loads a page, then subsequently entering the hostname without the scheme ("foo.test") should navigate instead of searching.

mpearson: I *think* we'd do what the reporter expects here if "test" was listed in https://www.publicsuffix.org/list/public_suffix_list.dat but I guess it's not because it's a special case and not a registry-controlled TLD? Do we have special cases for other TLDs in the AutocompleteInput parsing? (I don't see anything obvious but I might be missing something.) Should we? Feel free to assign back to me if you think there's something we should do.
,
Nov 7 2017
Your assessment is right. We special case "localhost" as a host. https://cs.chromium.org/chromium/src/components/omnibox/browser/autocomplete_input.cc?type=cs&q=localhost+file:autocomplete&sq=package:chromium&l=440

We don't specially handle any of the reserved gTLDs. I think this is just an oversight. See https://tools.ietf.org/html/rfc2606 for the list and details. Of the four, I think we should special case .test, .example, and .localhost to mark these inputs as type URL, leaving .invalid off.

Back to jdonnelly@ for further triage and/or assignment. I think this is a "good first bug", as the change is localized in the code and easily tested. I also think pkasting@ should review the code, as he has a lot more knowledge about URL input type identification and our policies around gTLDs than I do; he may know something I overlooked.
,
Nov 7 2017
Hrm...In the DNS resolver, we also special case localhost6, and one or two others. Would it make sense to either do a cache-only DNS lookup, or use net::IsLocalHostname? That wouldn't cover .example or .test, but it would cover domains that are hard-coded. The cache-only DNS option would cover recently visited domains, which may or may not be desirable (On one hand, we know they're valid domains, on the other, that gets us inconsistent behavior, and would be affected by ISPs that domain squat).
,
Nov 7 2017
I think using cache-only DNS may lead to inconsistencies; I'd avoid it. Using IsLocalHostname() as part of the solution sounds fine to me. But if we're touching the omnibox code, we might as well also fix .example and .test while we're at it.
,
Nov 7 2017
I agree that we should hard code .example and .test, was just thinking of what doing that alone might miss.
,
Nov 8 2017
> Saying that "foo.test" is treated as a search term and not a URL would be sufficient to describe the issue and your expectation. It's not helpful to describe this (twice) as "failing miserably".

I apologize for that. It was wrongly worded and won't happen again. I know and respect the effort you people put in the project. BTW, is there a way to edit the issue (so it can be better worded)?

> Of the four, I think we should special case: .test, .example, and .localhost to mark these inputs as type URL, leaving .invalid off.

Seems like people are already using .invalid (maybe in a way it was not designed for?) https://iyware.com/dont-use-dev-for-development/#invalidtld
,
Nov 8 2017
> I apologize for that. It was wrongly worded and won't happen again. I know and respect the effort you people put in the project. BTW, is there a way to edit the issue (so it can be better worded)?

Yes, but I think it's only available to members of the Chromium project. If you really want, I can edit it but I wouldn't worry about it. It's pretty mild. We all get frustrated sometimes and I've seen much worse. :)
,
Nov 8 2017
Ok, cool, I'll do this or find someone else to take it on. For future reference, here's a summary of what to do:
- Somewhere in AutocompleteInput::Parse*:
  - Add a case that checks for either net::IsLocalHostname or a TLD of .example, .test, .invalid or .localhost.
  - Along with a comment referring to https://tools.ietf.org/html/rfc2606.
  - If any of the above conditions are true, return metrics::OmniboxInputType::URL.
- Send to pkasting@chromium.org for code review.

* https://cs.chromium.org/chromium/src/components/omnibox/browser/autocomplete_input.cc?l=177&rcl=905e5b5cd96bfd7dd5b51525cad1d398c7850f10
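The steps in this summary can be sketched roughly as follows. This is a simplified Python model, not Chromium code: `classify_host`, `RESERVED_TLDS` and the string return values are hypothetical stand-ins for the check inside AutocompleteInput::Parse and for metrics::OmniboxInputType, and the `LOCAL_HOSTNAMES` set only approximates what net::IsLocalHostname recognizes.

```python
# Hypothetical model of the proposed check; not Chromium source.
# RFC 2606 reserves these four TLDs for testing and documentation.
RESERVED_TLDS = {"test", "example", "invalid", "localhost"}

# Rough stand-in for net::IsLocalHostname (assumption: the real
# function recognizes more spellings than these two).
LOCAL_HOSTNAMES = {"localhost", "localhost6"}


def classify_host(host: str) -> str:
    """Return "URL" for local or RFC 2606 hosts, otherwise "UNKNOWN"."""
    name = host.lower().rstrip(".")  # tolerate a trailing root dot
    if name in LOCAL_HOSTNAMES:
        return "URL"
    tld = name.rsplit(".", 1)[-1]  # last label of the hostname
    if tld in RESERVED_TLDS:
        return "URL"
    return "UNKNOWN"  # fall through to the existing heuristics


print(classify_host("foo.test"))
print(classify_host("intranet"))
```

In this model an unrecognized single-label host such as "intranet" keeps whatever classification the existing heuristics would give it; only the reserved names are forced to URL.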
,
Nov 9 2017
,
Nov 16 2017
,
Nov 16 2017
I'm deeply uncertain about this.

(1) I wonder if these should be in the PSL despite not being "true" TLDs. This would be the most consistent (and, as a bonus, easy) way to get all parts of Chrome to recognize them as such. If they're effectively "real, but uncontrolled" TLDs, I think listing in the PSL is appropriate. I'd like to hear rsleevi's thoughts on this.

(2) If these do not belong in the PSL, I wonder if they should be specially-handled at all. Fixing bug 104638 ought to make use of these relatively painless (one infobar clickthrough per TLD). This would make them consistent with other "unknown" TLDs that are used in the real world. By contrast, treating them as "definitely navigable" would potentially block legitimate search queries. (That's a consequence of (1) as well.)

(3) If we think these _should_ be specially-handled, and should _not_ be in the PSL, then I'm not certain which of the domains should be specially-handled. The parser is supposed to return URL only for things which are definitely navigable. But .invalid is documented as guaranteed not-navigable and .example is supposed to not be used for real navigations (only fake examples). I would think we should exclude both of these. (I disagree with the page that claims that making .invalid resolve is "acceptable according to the specification" -- I don't read it as acceptable per RFC 2606. If this use is not widespread, I'd rather discourage it than promote it.)

Note that in the past, we've WontFixed requests to specially handle .localhost and the like under the justification given in item (2).
,
Nov 16 2017
1) The PSL maintainers had a long discussion of this (can dig it up), but the conclusion was not to add the IANA-reserved TLDs to the PSL. This was based on an assessment of the widespread use of the PSL for purposes of determining 'valid' TLD, as well as our past commitments that the contents will be the root zone database + gTLDs.

2) Note that RFC 6761 (which establishes these reservations) specifically advises against application software recognizing these as special (Bullets 2 and 3 of their registration profiles).

That said, I defer to the omnibox team regarding how they'd like to handle.
,
Nov 16 2017
RFC 6761 suggests to me that the omnibox should assume .invalid is QUERY (6.4.1-3), .localhost is URL (6.3.1-3), and not do anything special with .test (6.2.1-3). (.example is not covered; I would treat it like .test and have no special handling). Under this, I'd narrow my suggestion regarding the PSL to be "it should include .localhost", since if the purpose of the PSL is to determine "valid" TLDs, RFC 6761 seems to guarantee this is always valid (contra comment 13, "Name resolution APIs and libraries SHOULD recognize localhost names as special and SHOULD always return the IP loopback address for address queries"). I agree with not adding the rest to the PSL.
,
Nov 16 2017
As noted, the conclusion was that 'valid' means the contents of the root zone database and (pending) gTLDs. The discussion specifically included localhost in consideration. https://groups.google.com/forum/#!forum/publicsuffix-discuss can serve for discussion about revisiting that (it was set up after that discussion). See also https://bugzilla.mozilla.org/show_bug.cgi?id=1161102
,
Dec 7 2017
My reading of RFC 6761, specifically sections 6.2 to 6.5, which explain more clearly what should happen with .test, .localhost, .invalid, and .example respectively, leads me to think we should no longer treat them as searches. https://tools.ietf.org/html/rfc6761#section-6.2

I realize there's more to this topic, related to what happens after they are not treated as a special case and used in a search. But just as a 'step 1', I'd like to strive for a consensus on whether they should not be treated as search input.
,
Dec 8 2017
I don't know what "no longer treat them as searches" means. We don't do special handling for any of these today, so they may search or navigate depending on the user's history. We only "assume these are searches" insofar as we assume any other potentially-navigable input to an intranet hostname is a search. That's consistent across intranets. What I think you're asking is not that we "stop assuming these are searches" but that we "begin assuming these are navigations". I'm uncomfortable with that except for .localhost. I stand by my comments in #14: we should assume .invalid isn't navigable, and we treat all known-non-navigable inputs as QUERY. And we should assume .localhost is a URL. The other two should be handled as they are today. This conclusion comes from the directives in the cited RFC sections.
,
Dec 8 2017
In a separate conversation with pkasting@, the topic came up of what happens when Chrome decides whether something is a URL or not, and how we help the user find what they desire.

I tested with a junk .com and .test text to see what Chrome does. IIUC, what I describe below will be different if the user has previously successfully navigated to this site. So this might only happen if the site doesn't really exist or if the user has cleared their history (again, IIUC, which I might not).

For the .com case, if Chrome fails to navigate we explain that and ask whether the user would like to search Google for that text instead (first screen shot). In the .test case, the user is taken to search results (second screen shot). There's an alternative 'did you mean' search, but no indication or suggestion that maybe the user meant to navigate.

Even if we decide not to change the behavior of the omnibox, could we do better helping the user navigate in the .test case? Maybe with a link in the search results, or by explaining that an https:// prefix will navigate to the .test site?
,
Dec 8 2017
That'd be a question for the server-side folks. It would be worth asking how many queries that contain the string ".test" come in, which might in theory be navigable, that look like actual attempts to navigate versus which don't.
,
Dec 8 2017
Does the ".test" site you tried with (that long string) actually exist? I would assume we launch off a ping to the URL when we do the search, and if the URL returns a valid code, we'd display the "did-you-mean to navigate" infobar in that case. This is the best solution (better than putting something directly in the search results page for a particular search engine). It might already work.
,
Dec 8 2017
If mpearson agrees with what I said earlier (that treating .test/.example like other intranet TLDs, i.e. showing accidental nav infobars, is the best solution), then I think the only thing to be done on this bug would be to decide if we want to force .localhost to parse as URL and .invalid as QUERY.
,
Dec 8 2017
I don't have a strong opinion on .test and .example. .localhost should be parsed as a URL. .invalid should be parsed as a query (unless it's something like a http://foo.invalid -- i.e., the user tried to navigate to it)
,
Dec 8 2017
I think even http://foo.invalid should be parsed as a query, since this doc is telling us we should assume it can never resolve. IOW, we should treat it just like we treat inputs with invalid hostnames; even prefixing those with http:// won't force a navigation (nothing will). The counterargument would be real-world evidence that people set up .invalid machines locally in violation of the RFC. We had to back off our more-restrictive hostname-checking rules due to that kind of evidence there.
,
Dec 8 2017
> The counterargument would be real-world evidence that people set up .invalid machines locally in violation of the RFC. See comment 7.
,
Dec 8 2017
I saw, but I'm skeptical that that's actually widely-followed. I'm willing to break a couple people who are doing counter-spec things. Not so willing to break a thousand or something. With the hostname stuff we generally let a couple different complaints about a rule build before slacking it off, so we could be sure it wasn't some random one-off site that could just fix itself. My hope is that almost nobody is actually using .invalid for things they should have used .test for. (I don't know why the author of that article claims the recommendation is within-spec. It's clearly against RFC 6761 6.4 item 4.)
,
Dec 12 2017
Issue 794001 has been merged into this issue.
,
Dec 12 2017
RFC 6761 6.2.2 (considerations for ".test") says:

> Application software SHOULD NOT recognize test names as special

I think this means that showing an "accidental nav infobar" would be inappropriate for ".test" domains.

> and SHOULD use test names as they would other domain names

I think this means that if we treat ".com" domains as navigable, then ".test" domains should be navigable, too.
,
Dec 12 2017
There's lots of discussion on the RFCs above, and Dave and I had another long discussion over IM the other day. The omnibox' existing treatment of domains is complex, and if one is not already familiar with how it works, I'm not sure it makes sense to say what things like "special treatment" and "as other domains would be treated" mean in the context of internal omnibox processing. The two people who understand this code best have opined above.
,
Dec 12 2017
@28 I accept that debating RFC's may be the wrong approach. There are clearly some users that expect foo.test to navigate to foo.test or explain that it is not reachable. The core debate as I currently understand it is whether there will be more unhappy users if foo.test navigates or if it searches; along with: how much effort is needed to understand when and how to recover from the wrong (from the *user's* point of view) choice being made. What is a good way to determine how changing the handling will affect users? Is there a more appropriate method than, change it and see? Could a user study get reliable results for this topic?
,
Dec 12 2017
I think the core debate is whether there is something distinct about these domains that justifies handling them in a way distinct from other intranet domains. If not, then debates about whether users are happy should be made in the context of changing the whole intranet handling UX, rather than for these specific domains. We've also had very little feedback from users on whether these .test domains they were trying to use were actually navigable, whether they got the accidental search infobar, whether using it resolved their problems, and therefore whether the complaint here is "momentary disconcerting behavior" versus "I can't do what I want". What to fix depends on that kind of distinction.
,
Dec 12 2017
#30 Very reasonable. So far, the complaints I've seen are "I can't do what I want" even though that's somewhat incorrect -- they can do what they want, if they only knew how to do so.

I tested with a fake DNS entry for display.test. The first picture below is a screen shot of what happens if it is *not* defined (not a recognized host). The second picture shows what happens if it *is* defined. This matches what pkasting@ has mentioned about whether it's 'actually navigable'.

I agree that in the case where it's actually navigable, the tip is enough. In the case that it is not actually navigable, it seems like Chrome is not working properly and/or the DNS I'm trying to set up is not working properly. I can predict someone getting very frustrated trying to set up a server or do network testing (being misled about whether things are working correctly). This is not a large number of people by percentage, I'm sure, but those doing such work often make recommendations to friends, organizations, and businesses about what browser to use. I'd rather not frustrate them.

Maybe we can show the tip for all *.test (and maybe others)?
,
Dec 12 2017
I'm not clear on the use case for trying to navigate to something that Chrome knows cannot be navigated to. One of several reasons we don't show these infobars on all syntactically-navigable (but not actually responding) inputs is because they're worse than useless -- they give users a false choice of an action that can't succeed. If you're in the case of setting up servers and doing network testing and you know enough to configure DNS to respond to .test entries and the like, then I claim you know enough to tell, today, whether things are working -- you use tools like curl and such. You use your web browser to actually load web pages. And in between the two there are things like the developer tools and network tracing page. So it's not obvious to me that there's a missing use case that we don't support today.
,
Dec 12 2017
#22 On .invalid: entering http://foo.invalid currently navigates to that site (i.e. incorrectly) if it's defined (there's an entry for that host). So maybe at least that should change.
,
Dec 13 2017
Right, that's the discussion in comment 23 ff.
,
Dec 13 2017
#30, #32 Our ".test" domains are navigable. Our team of web developers launches a production site at "foo.com", but on our development machines, we use "foo.test" instead. Our team uses proxy servers like Pow, which scan localhost requests for ".test" hosts, and dynamically start Rails applications based on the host. Because Ruby is slow, it may take a long time (about 10 seconds) before the server becomes responsive, and only then will the accidental search infobar appear. At this point the user already feels that Chrome isn't tailored for visiting these types of fancy faux domains.

My case is, at the very least, an argument for showing an infobar right away for some syntactically-navigable (but not actually responding) inputs. It would be nicer for the 3000+ Pow users (based on GitHub stars) if the browser just went to the domain right away though.
,
Dec 13 2017
OK, that helps me understand what's going on better. If you suffix with "/" or prefix with "http://" once, we'll navigate by default, and then we'll do it ever afterwards for that hostname. If we fix bug 104638, then doing that once for any .test site will cause all navigations for _all_ .test hostnames to navigate by default. Honestly, I think that's sufficient.
,
Dec 13 2017
I was aware that "/" or "http://" would trigger a navigation. I think this is non-obvious and feels more like a workaround than a solution. My usage of the omnibox has trained me to expect that I can type domain names without "the technical junk" and they will just work. I think it's a trap to dismiss issues that only occur "the first time," as this guarantees everyone will trip over it at least once. If we could prove or confidently suppose that it's more common for users to trip over accidental navigations when they intended to search for strings including ".test", then I could accept this inconvenient first-time learning hump. It seems like an unusual search substring though.
,
Dec 13 2017
FWIW, not many search queries include the text ".test". There's an order of magnitude more people (accidentally?) searching for domain names like "speedtest.net" than ".test". There are even 2x more people searching for "define test" than searching for ".test". https://trends.google.com/trends/explore?q=%22.test%22,%22speedtest.net%22,%22define%20test%22 So however you decide to handle ".test", it won't matter that much in the aggregate. :-)
,
Dec 13 2017
@37: I didn't mean to dismiss the issue, just to say that while imperfect, I think having the same solution here that we use for all other intranet domain names is maybe acceptable. I definitely think fixing bug 104638 would make life better though.
,
Dec 13 2017
@39: Certainly, "dismiss" was probably too strong of a word for me to use. I understand that we're trying to balance between two options with similar merit. I also agree that, at least, fixing bug 104638 would make life better. Another argument for promoting these domains to behave more similarly to "real domains" is that, unlike intranet domain names which (I believe?) can be anything at all, these TLDs are specifically reserved for certain purposes. These purposes make them slightly more likely to be navigable than "http://totally.unknown.domain.on.myintranettld".
,
Dec 13 2017
Another FWIW, AFAIK there isn't much difference between searching for "define test" and "define.test". So if "define.test" were treated as a navigation then the search could still be done as "define test", (or "?define test", or "?define.test", or by clicking the search link shown if http://define.test/ is not found).
,
Dec 13 2017
There's no search link in a guest or incognito profile -- just a regular one. (Weird.) I don't think "?define.test" is discoverable, and I still think these should behave like intranet TLDs and not global ones, but I have said enough. If you want to do it the other way, go for it.
,
Dec 18 2017
Thanks Peter,

My current plan is to check for "localhost" and "test" in components/omnibox/browser/autocomplete_input.cc using canonicalized_url->DomainIs(). IIUC, canonicalized_url->DomainIs() overlaps with canonicalized_url->host() so I can replace the existing check for canonicalized_url->host() == "localhost". That covers "localhost" and "test".

For "example", we already consider example.com (and others) a URL. We could say "example" is already WAI or we could say it should be a TLD*. (agrees with #12.3)

For "invalid", IMO, this appears to be more for testing DNS systems, which seems to be out of scope for an omnibox input. I'm inclined to leave the behavior as-is for "invalid", which is to start a QUERY. (agrees with #3, #12, #14, #22)

*The earlier RFC2606 (and the link at #7) mentions .example as a TLD, but the later RFC6761 limits example to Second Level Domains. Is ".example" expected to be a TLD (as mentioned in RFC2606)?

On net::IsLocalHostname(): sounds reasonable, but I'd like to leave it out of the initial CL (consider doing it separately).

There's a CL at https://chromium-review.googlesource.com/c/chromium/src/+/831214/1/components/omnibox/browser/autocomplete_input.cc
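To illustrate why a domain-suffix check can stand in for the exact `host() == "localhost"` comparison mentioned in this plan, here is a small Python model. `domain_is` is a hypothetical helper that only approximates the documented behavior of GURL::DomainIs (a host matches if it equals the domain or is a subdomain of it); it is not the Chromium implementation.

```python
def domain_is(host: str, domain: str) -> bool:
    """Approximate model: host equals domain or is a subdomain of it."""
    host = host.lower().rstrip(".")  # ignore case and a trailing root dot
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


# The exact-match case handled by the old check is still covered...
print(domain_is("localhost", "localhost"))      # True
# ...and subdomains are now covered as well.
print(domain_is("foo.localhost", "localhost"))  # True
# A name that merely ends with the same letters does not match.
print(domain_is("contest", "test"))             # False
```

The label boundary matters: matching on a bare string suffix would wrongly treat "contest" as being under "test".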
,
Dec 18 2017
* "leave behavior as-is for invalid" does not agree with comments 12/14/22, which all request that Parse() mark this as QUERY rather than UNKNOWN. (You use the unclear phrase "start a QUERY"; I think you are simply referring to "search by default", which is not the same thing as parsing as QUERY. UNKNOWN and QUERY both search by default for most inputs, but UNKNOWN may navigate in various scenarios, while QUERY will not.)
* RFC 6761 does not limit example to 2LDs; section 6.5 explicitly mentions the "example." TLD.
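The distinction between UNKNOWN and QUERY drawn in this comment can be modeled like this. It is a deliberately simplified Python sketch: `InputType`, `default_action` and the `visited_hosts` set are hypothetical names, and the only "scenario" modeled for UNKNOWN is a host the user has successfully navigated to before (the workaround described earlier in this thread). The real omnibox weighs more signals than this.

```python
from enum import Enum


class InputType(Enum):
    URL = "url"
    UNKNOWN = "unknown"
    QUERY = "query"


def default_action(input_type: InputType, host: str, visited_hosts: set) -> str:
    """Return "navigate" or "search" for the default omnibox action."""
    if input_type is InputType.URL:
        return "navigate"
    if input_type is InputType.UNKNOWN and host in visited_hosts:
        # UNKNOWN can be promoted to a navigation by other evidence.
        return "navigate"
    # QUERY never navigates; UNKNOWN searches without extra evidence.
    return "search"


history = {"foo.test"}
print(default_action(InputType.UNKNOWN, "foo.test", history))
print(default_action(InputType.QUERY, "foo.test", history))
```

Under this model, parsing .invalid as QUERY is a stronger statement than "searches by default": no amount of history would turn it into a navigation.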
,
Dec 19 2017
#44
(1) Thanks, didn't know.
(2) Ah, I was misreading that the others mentioned (".invalid.", ".test.", and ".localhost.") all had a preceding period.
,
Dec 22 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/9aa2101f42e1727355c43e24310dbc80c0033896

commit 9aa2101f42e1727355c43e24310dbc80c0033896
Author: Dave Schuyler <dschuyler@chromium.org>
Date: Fri Dec 22 01:01:05 2017

[omnibox] treat localhost, test, example, and invalid as top level domains.

This CL changes a omnibox input such as foo.test to be a URL rather than a QUERY. The input of localhost was already being treated as a URL, but now foo.localhost will also be treated as a URL. The invalid TLD will be treated as an invalid URL.

Bug: 782146
Change-Id: I5676ff1f36fa860f6c6f654bcfa475be06a6d875
Reviewed-on: https://chromium-review.googlesource.com/831214
Commit-Queue: Dave Schuyler <dschuyler@chromium.org>
Reviewed-by: Justin Donnelly <jdonnelly@chromium.org>
Reviewed-by: Peter Kasting <pkasting@chromium.org>
Cr-Commit-Position: refs/heads/master@{#525881}

[modify] https://crrev.com/9aa2101f42e1727355c43e24310dbc80c0033896/components/omnibox/browser/autocomplete_input.cc
[modify] https://crrev.com/9aa2101f42e1727355c43e24310dbc80c0033896/components/omnibox/browser/autocomplete_input_unittest.cc
,
Dec 22 2017
dschuyler: Fixed?
,
Dec 22 2017
,
Jan 12 2018
Issue 801103 has been merged into this issue.
,
Mar 12 2018
Issue 815292 has been merged into this issue.
,
Sep 4
Issue 880389 has been merged into this issue.
Comment 1 by morlovich@chromium.org, Nov 7 2017
Status: Untriaged (was: Unconfirmed)