
Issue 592006

Starred by 1 user

Issue metadata

Status: Duplicate
Merged: issue 369989
Owner: ----
Closed: Mar 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug




omnibox searches fail to match in the middle of domain names in history

Reported by wjrog...@gmail.com, Mar 4 2016

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36

Steps to reproduce the problem:
1. Visit http://destinycalcifiedfragments.com/
2. Search for "calcified" in the omnibox

What is the expected behavior?
The URL http://destinycalcifiedfragments.com/ appears in the search results.

What went wrong?
The omnibox suggests 5 google searches and does not return anything from my recent history, despite having visited http://destinycalcifiedfragments.com/ several times over multiple days this week.

Did this work before? No 

Chrome version: 49.0.2623.75  Channel: stable
OS Version: 10.0
Flash Version: Shockwave Flash 20.0 r0

Chrome should match substrings in history URLs so I can find this site by the more-relevant search term "calcified" instead of the more general one "destiny".
 
Attachments:
chrome_search_substring_failure.png (19.6 KB)
chrome_search_substring_success.png (26.1 KB)
Cc: pkasting@chromium.org
Components: -UI UI>Browser>Omnibox
Labels: -OS-Windows -Via-Wizard -Arch-x86_64 OS-All
Owner: mpear...@chromium.org
Status: Assigned (was: Unconfirmed)
Mark, I asked the reporter to file this bug.

In general, I've gotten a lot of comments from people wanting what amounts to "mid-word" matching.  Most frequently this is not actually the middle of a word, it's the start of a word in the midst of a concatenated string of words (as in this example).  Various comments have mentioned e.g. "even though I bookmarked this, I can't get it to appear" or "Firefox matches these much better".

What can we do here?
* Ignoring the ranking challenges, is it feasible to change the HQP to do arbitrary mid-word matches, or is the segmenting system structured so we can only do "word starts"?  (I'm assuming we don't do mid-word matching today, unless my memory is faulty.)
* Could we maybe segment not only based on punctuation but using the dictionary, so we'd get a lot of the cases "for free"?
* If we can find a way to do something akin to mid-word matching, is there a way we can rank it without causing the couple most-commonly-visited URLs to just always match against everything?
Labels: Needs-Feedback
Thanks for filing this bug wjrogers@gmail.com.  We should be returning URLs that match in the middle of the domain name.  To help me investigate, can you please paste the chrome://omnibox output for the input "destinycalcifi" (not including the quotes) here?  Please check the "show all details" and "show results per provider" boxes.  I'd like to verify that Chrome thinks the URL was visited enough for Chrome to put it in the databases the omnibox uses.

Thanks.


Peter, here are the answers to your questions:
* The HQP can and does do mid-word matching.  At some point we disabled mid-word matching for everything except hostnames.  This was mainly a product decision because mid-word matches to later parts of the URL can often look stupid.  The change evaluated as neutral.
* Our theory behind this was mostly the anecdote that people who try to retrieve based on mid-word matches tend to be segmenting the term in the same way as the writer of the web page did in the title.  I.e., mid-word URL matches often match at word boundaries in the title.
* Segmenting based on a dictionary is possible.  But getting dictionaries that work in the many languages that Chrome users speak will be a pain, and lead to a lot of data dependency for little gain.
* Yes, we can rank mid-word matches smartly by giving them much lower scores than regular matches.  This wouldn't help the earlier product decision: if the omnibox doesn't have enough suggestions, it doesn't matter how poorly we rank the URL matches with odd mid-word matches--they'll still appear.
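
To make the distinction in the bullets above concrete, here is a minimal sketch (not the HQP's actual code) of telling a "word start" match from a mid-word match in a hostname, and scoring the mid-word case lower. The function names and the two weights are hypothetical, chosen only for illustration.

```python
from typing import Optional

# Hypothetical weights for illustration; these are not Chrome's real scores.
WORD_START_SCORE = 1.0
MID_WORD_HOST_SCORE = 0.3


def find_match_kind(term: str, text: str) -> Optional[str]:
    """Classify the best occurrence of `term` in `text`.

    "word_start": some occurrence begins the string or follows a
                  non-alphanumeric character (e.g. "." or "/").
    "mid_word":   the term only occurs inside a run of letters/digits.
    None:         the term does not occur at all.
    """
    term, text = term.lower(), text.lower()
    kind = None
    start = text.find(term)
    while start != -1:
        if start == 0 or not text[start - 1].isalnum():
            return "word_start"
        kind = "mid_word"
        start = text.find(term, start + 1)
    return kind


def score_host_match(term: str, host: str) -> float:
    """Give mid-word hostname matches a much lower score than word starts."""
    kind = find_match_kind(term, host)
    if kind == "word_start":
        return WORD_START_SCORE
    if kind == "mid_word":
        return MID_WORD_HOST_SCORE
    return 0.0
```

With these assumed weights, "calcified" matches destinycalcifiedfragments.com only as a mid-word match, while "ma" is a word-start match for mail.google.com and a mid-word match for amazon.com, so the former would rank higher.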

Comment 3 by wjrog...@gmail.com, Mar 4 2016

This may distract from the central issue, but is it relevant here that the title of the page in question has the (separate) word "Calcified" in it?

Comment 4 by wjrog...@gmail.com, Mar 4 2016

Enter omnibox input text:  
destinycalcifi
  Submit
Input parameters:
  Prevent inline autocomplete
  In keyword mode
Current page context:  
Display parameters:
  Show incomplete results
  Show all details
  Show results per provider, not just merged results
cursor position = 14
elapsed time = 69ms
all providers done = true
host = destinycalcifi has is_typed_host = false
Combined results.
Provider	Type	Relevance	Contents	Can Be Default	Starred	Description	URL	Fill Into Edit	Inline Autocompletion	Del	Prev	Tran	Done	Associated Keyword	Keyword	Duplicates	Additional Info
HistoryURL	history-url	1413	destinycalcifiedfragments.com	✔	✗	Destiny Calcified Fragments	http://destinycalcifiedfragments.com/	destinycalcifiedfragments.com	edfragments.com	✔	✗	1	✔			1	
last visit:	3/4/16, 1:21:36 PM
typed count:	3
visit count:	3
Search	search-suggest	1002	destiny calcified fragments	✗	✗	Google Search	https://www.google.com/search?q=destiny+calcified+fragments&oq=destinycalcifi&aqs=chrome.1.69i60j0j69i57j0l2&sourceid=chrome&ie=UTF-8	destiny calcified fragments		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-what-you-typed	1001	destinycalcifi	✔	✗		https://www.google.com/search?q=destinycalcifi&oq=destinycalcifi&aqs=chrome..69i60j0j69i57j0l2&sourceid=chrome&ie=UTF-8	destinycalcifi		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	601	destiny calcified fragment tracker	✗	✗		https://www.google.com/search?q=destiny+calcified+fragment+tracker&oq=destinycalcifi&aqs=chrome.3.69i60j0j69i57j0l2&sourceid=chrome&ie=UTF-8	destiny calcified fragment tracker		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	600	destiny calcified fragment hunter	✗	✗		https://www.google.com/search?q=destiny+calcified+fragment+hunter&oq=destinycalcifi&aqs=chrome.4.69i60j0j69i57j0l2&sourceid=chrome&ie=UTF-8	destiny calcified fragment hunter		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false

Results for individual providers.
Provider	Type	Relevance	Contents	Can Be Default	Starred	Description	URL	Fill Into Edit	Inline Autocompletion	Del	Prev	Tran	Done	Associated Keyword	Keyword	Duplicates	Additional Info
HistoryQuick	history-url	520	destinycalcifiedfragments.com	✔	✗	Destiny Calcified Fragments	http://destinycalcifiedfragments.com/	destinycalcifiedfragments.com	edfragments.com	✔	✗	1	✔			0	
last visit:	3/4/16, 1:21:36 PM
typed count:	3
visit count:	3

Provider	Type	Relevance	Contents	Can Be Default	Starred	Description	URL	Fill Into Edit	Inline Autocompletion	Del	Prev	Tran	Done	Associated Keyword	Keyword	Duplicates	Additional Info
HistoryURL	history-url	1413	destinycalcifiedfragments.com	✔	✗	Destiny Calcified Fragments	http://destinycalcifiedfragments.com/	destinycalcifiedfragments.com	edfragments.com	✔	✗	1	✔			0	
last visit:	3/4/16, 1:21:36 PM
typed count:	3
visit count:	3

Provider	Type	Relevance	Contents	Can Be Default	Starred	Description	URL	Fill Into Edit	Inline Autocompletion	Del	Prev	Tran	Done	Associated Keyword	Keyword	Duplicates	Additional Info
Search	search-what-you-typed	1001	destinycalcifi	✔	✗		https://www.google.com/search?q=destinycalcifi&sourceid=chrome&ie=UTF-8	destinycalcifi		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	1002	destiny calcified fragments	✗	✗		https://www.google.com/search?q=destiny+calcified+fragments&oq=destinycalcifi&sourceid=chrome&ie=UTF-8	destiny calcified fragments		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	601	destiny calcified fragment tracker	✗	✗		https://www.google.com/search?q=destiny+calcified+fragment+tracker&oq=destinycalcifi&sourceid=chrome&ie=UTF-8	destiny calcified fragment tracker		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	600	destiny calcified fragment hunter	✗	✗		https://www.google.com/search?q=destiny+calcified+fragment+hunter&oq=destinycalcifi&sourceid=chrome&ie=UTF-8	destiny calcified fragment hunter		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Thanks for the prompt feedback.  Can you paste the chrome://omnibox output for "calcified" here as well?  It looks like the appropriate database contains the URL.  This will help me figure out if it's not being retrieved or simply being assigned a low score.  Again, check the "show all details" and "show results per provider" boxes.

Comment 6 by wjrog...@gmail.com, Mar 4 2016

Enter omnibox input text:  
calcified
  Submit
Input parameters:
  Prevent inline autocomplete
  In keyword mode
Current page context:  
Display parameters:
  Show incomplete results
  Show all details
  Show results per provider, not just merged results
cursor position = 9
elapsed time = 122ms
all providers done = true
host = calcified has is_typed_host = false
Combined results.
Provider	Type	Relevance	Contents	Can Be Default	Starred	Description	URL	Fill Into Edit	Inline Autocompletion	Del	Prev	Tran	Done	Associated Keyword	Keyword	Duplicates	Additional Info
Search	search-what-you-typed	1300	calcified	✔	✗	Google Search	https://www.google.com/search?q=calcified&oq=calcified&aqs=chrome..69i57j0l5&sourceid=chrome&ie=UTF-8	calcified		✗	✗	5	✔		google.com	1	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	1301	calcified fragments	✗	✗		https://www.google.com/search?q=calcified+fragments&oq=calcified&aqs=chrome.1.69i57j0l5&sourceid=chrome&ie=UTF-8	calcified fragments		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	600	calcified granuloma	✗	✗		https://www.google.com/search?q=calcified+granuloma&oq=calcified&aqs=chrome.2.69i57j0l5&sourceid=chrome&ie=UTF-8	calcified granuloma		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	565	calcified pineal gland	✗	✗		https://www.google.com/search?q=calcified+pineal+gland&oq=calcified&aqs=chrome.3.69i57j0l5&sourceid=chrome&ie=UTF-8	calcified pineal gland		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	564	calcified fragment locations	✗	✗		https://www.google.com/search?q=calcified+fragment+locations&oq=calcified&aqs=chrome.4.69i57j0l5&sourceid=chrome&ie=UTF-8	calcified fragment locations		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	563	calcified fragments insight	✗	✗		https://www.google.com/search?q=calcified+fragments+insight&oq=calcified&aqs=chrome.5.69i57j0l5&sourceid=chrome&ie=UTF-8	calcified fragments insight		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false

Results for individual providers.
Provider	Type	Relevance	Contents	Can Be Default	Starred	Description	URL	Fill Into Edit	Inline Autocompletion	Del	Prev	Tran	Done	Associated Keyword	Keyword	Duplicates	Additional Info
HistoryQuick	history-url	520	destinycalcifiedfragments.com	✗	✗	Destiny Calcified Fragments	http://destinycalcifiedfragments.com/	destinycalcifiedfragments.com		✔	✗	1	✔			0	
last visit:	3/4/16, 1:21:36 PM
typed count:	3
visit count:	3

Provider	Type	Relevance	Contents	Can Be Default	Starred	Description	URL	Fill Into Edit	Inline Autocompletion	Del	Prev	Tran	Done	Associated Keyword	Keyword	Duplicates	Additional Info
Search	search-what-you-typed	1300	calcified	✔	✗		https://www.google.com/search?q=calcified&sourceid=chrome&ie=UTF-8	calcified		✗	✗	5	✔		google.com	1	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	1301	calcified fragments	✗	✗		https://www.google.com/search?q=calcified+fragments&oq=calcified&sourceid=chrome&ie=UTF-8	calcified fragments		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	600	calcified granuloma	✗	✗		https://www.google.com/search?q=calcified+granuloma&oq=calcified&sourceid=chrome&ie=UTF-8	calcified granuloma		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	565	calcified pineal gland	✗	✗		https://www.google.com/search?q=calcified+pineal+gland&oq=calcified&sourceid=chrome&ie=UTF-8	calcified pineal gland		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	564	calcified fragment locations	✗	✗		https://www.google.com/search?q=calcified+fragment+locations&oq=calcified&sourceid=chrome&ie=UTF-8	calcified fragment locations		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Search	search-suggest	563	calcified fragments insight	✗	✗		https://www.google.com/search?q=calcified+fragments+insight&oq=calcified&sourceid=chrome&ie=UTF-8	calcified fragments insight		✗	✗	5	✔		google.com	0	
relevance_from_server:	true
should_prefetch:	false
Regarding disabling mid-word matches outside hostnames: Hmmmmmmmmmmmmmmmmmmmmmm.

I've heard a lot of feedback for a while about wanting mid-word matches, including not-in-hostnames (though it's more commonly requested in hostnames, which is separate, and is what this bug mainly tracks).  My original philosophy with the omnibox was that stupid-looking or low-ranking matches are fine if we don't have something better, so I'm naturally skeptical towards a "these look dumb, turn them off" answer.

It's interesting that that change evaluated as neutral.  Either positive (the "dumb matches" were distracting or misleading) or negative (these were occasionally useful) would make sense to me; neutral is more curious.

I wonder if we just didn't have a good combination of ranking signals?  Perhaps we could try more limited experiments, e.g. allow these but only for bookmarked URLs, and see if we can find something that evaluates positively.  That might help us understand how to do this?  I feel like on principle allowing these matches, just ranked correctly, has to be better than hiding them...

Is this something we should split into another bug/reopen bug 591981 to discuss?
Labels: -Needs-Feedback
Owner: ----
Status: Available (was: Assigned)
Thanks wjrogers@,

It's clear that Chrome finds the match in the middle of the domain name.  However, that URL suggestion is getting out-scored by search query suggestions.  I think this is probably because you've only visited the page three times.

It sounds like this bug is primarily a duplicate of  bug 369989  ("HQP Should Be More Aggressive with Few Visits").  That's still ongoing.  We've tried one solution for that but it evaluated poorly, so we're trying another tactic.

We could also try to score mid-hostname matches more aggressively (regardless of the number of visits).  This however makes me nervous; I don't want to see amazon.com for instance in response to the input "ma".  I'd rather see mail.google.com, maps.google.com, and so on.  That's why I think this is primarily a "few visits" issue.

I'll leave it to pkasting@ to decide whether to dup this bug or whether he wants to morph this in some way.

Maybe the "recency" part of the frecency score should be contributing more?  Three visits within the last few days is significantly stronger than three visits over the past month.
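
As a rough illustration of the idea (not the formula Chrome uses), a recency-weighted visit count could decay each visit by its age, so that three visits this week outweigh three visits spread over the past month. The half-life value and function name here are assumptions.

```python
# Hypothetical half-life; picked only to make the example readable.
HALF_LIFE_DAYS = 7.0


def recency_weighted_visits(visit_ages_days, half_life_days=HALF_LIFE_DAYS):
    """Sum of visits, each discounted by 0.5 ** (age / half_life).

    A visit made right now counts as 1.0; a visit one half-life old
    counts as 0.5; older visits contribute progressively less.
    """
    return sum(0.5 ** (age / half_life_days) for age in visit_ages_days)


# Three visits in the last few days vs. three visits over the past month.
recent = recency_weighted_visits([0, 1, 2])
spread = recency_weighted_visits([10, 20, 30])
```

Under this sketch `recent` is larger than `spread` even though both have a visit count of three.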

This could likely be covered by  bug 369989 , but I'm not sure it currently is.  Mark, do you think there's anything worth looking at in terms of tweaking the recency part of the calculation?  If so, should we add that to  bug 369989  and dupe, or cover that here?

If you don't think so, then this is probably a dupe of that bug, although before duping I'd want to know that your fix(es) on that bug actually fix the behavior in this specific case.
Hey wjrogers@gmail.com, who has been so helpful thus far,

Can I get a copy of the "History" file associated with this profile?  You can e-mail me (mpearson @google.com or mpearson @chromium.org).  I promise to keep it confidential.

Here's why.  pkasting@ writes:
>>>
before duping I'd want to know that your fix(es) on that bug actually fix the behavior in this specific case.
>>>
One of the fixes in 369989 corrects a problem with misidentifying typed visits.  I would look in your history database for this URL and check the visit_info field to see what types of transitions brought you to the URL.  I know from the chrome://omnibox output that it should be counting as a "typed visit" (meaning done through the omnibox).  But if something else is set in that field (e.g., a redirect), then the part of the omnibox system that scores the URLs will not count it as a typed visit.  This is a bug, and I'd like to know if the fix to the bug will fix your problem.

I'd also check to see how recent the visits to the URL are.  Few hours apart, few days apart, etc.
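
For anyone who wants to run the same check on their own profile, here is a sketch of pulling the visit times and transition values for one URL out of a copy of the History file. It assumes the file is an SQLite database with a `urls` table (`id`, `url`) and a `visits` table (`url`, `visit_time`, `transition`); those table and column names are assumptions here, so check them against your own file first. Work on a copy, since Chrome keeps the live file locked.

```python
import sqlite3


def visits_for_url(conn: sqlite3.Connection, url: str):
    """Return [(visit_time, transition), ...] for `url`, oldest first.

    Assumed schema: visits.url is a foreign key to urls.id, and
    visits.transition holds the raw page-transition integer.
    """
    rows = conn.execute(
        "SELECT v.visit_time, v.transition "
        "FROM visits v JOIN urls u ON v.url = u.id "
        "WHERE u.url = ? ORDER BY v.visit_time",
        (url,),
    ).fetchall()
    return [(int(visit_time), int(transition)) for visit_time, transition in rows]


# Example usage against a copy of the profile's History file:
# conn = sqlite3.connect("History-copy")
# print(visits_for_url(conn, "http://destinycalcifiedfragments.com/"))
```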

thanks,
mark

Mergedinto: 369989
Status: Duplicate (was: Available)
Thanks for your history file.

All three of these visits are of transition type 838860801.  This means they're affected by the typed visit bug mentioned in  bug 369989 .  These visits are not getting counted as typed for the purposes of scoring.  Fixing that bug should probably fix this issue.  I'll push again to see if I can roll out that bug fix to everyone.

(For the record, 838860801 (0x32000001) is the bitwise OR of PAGE_TRANSITION_CHAIN_END, PAGE_TRANSITION_CHAIN_START, PAGE_TRANSITION_FROM_ADDRESS_BAR, and PAGE_TRANSITION_TYPED.)
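
The decomposition can be checked with a few lines of Python. The numeric constants below are the values as I understand them from Chromium's page transition definitions, so treat them as assumptions; they do combine to 838860801. The helper names are hypothetical. The sketch also shows why a scorer has to mask off the qualifier bits before testing for a typed visit: comparing the raw value against PAGE_TRANSITION_TYPED would miss it.

```python
# Assumed values (core type in the low byte, qualifiers in the high bits).
CORE_MASK = 0xFF
PAGE_TRANSITION_TYPED = 1
QUALIFIERS = {
    0x02000000: "PAGE_TRANSITION_FROM_ADDRESS_BAR",
    0x10000000: "PAGE_TRANSITION_CHAIN_START",
    0x20000000: "PAGE_TRANSITION_CHAIN_END",
}


def decode_transition(value: int):
    """Split a raw transition into (core_type, [qualifier names])."""
    core = value & CORE_MASK
    flags = [name for bit, name in QUALIFIERS.items() if value & bit]
    return core, flags


def is_typed(value: int) -> bool:
    """True if the core type is TYPED, ignoring any qualifier bits."""
    return (value & CORE_MASK) == PAGE_TRANSITION_TYPED


core, flags = decode_transition(838860801)
# core is PAGE_TRANSITION_TYPED, and all three qualifiers are set.
```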

Not that it matters given my discovery above, but these visits are close together in time.  From the first visit to the second is about 40 minutes, and then the third is about 40 minutes after that.

As I explain in
https://bugs.chromium.org/p/chromium/issues/detail?id=369989#c62
the fix I submitted solves this issue. :-)  Thanks for reporting it; it provided a useful motivating example for the change.
