HQP should (?) return URL in which all inputs terms are mid-word matches
Reported by
teo8...@gmail.com,
Mar 4 2016
Issue description
UserAgent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36
Steps to reproduce the problem:
1. I have an URL in my history that is like: http://www.example.com/aword-whatever-foobar-etc.html (the relevant parts being "aword" and "foobar")
2. type into the omnibox: "example.com aword foo bar"
What is the expected behavior?
I understand that partial word matches score a lot lower than matches of full words, so if there were enough other better results to show I could understand that they could fill up the proposed results and leave the matching URLS out. But since that is not the case and the url in my history IS a match, it should show up.
What went wrong?
Absolutely no result shows up. On the other hand, if I type "example.com aword foobar" or even just "example.com aword foo", the matching url does show up.
Did this work before? N/A
Chrome version: 48.0.2564.116 Channel: n/a
OS Version:
Flash Version: Shockwave Flash 20.0 r0
,
Mar 4 2016
This maybe should be duped against either bug 591979 or bug 592006 .
,
Mar 4 2016
Thanks for mentioning this. We used to have mid-word matches in later parts of the URL. We turned them off because they often looked stupid. I know you say you want these matches to appear when there are no better suggestions. However, we heard enough complaints from other users saying the opposite and we turned them off. I'll take your suggestion into account and listen to hear if the general sentiment on this issue is changing.
,
Mar 4 2016
What is really irritating is when you have a URL that contains, say, "irishcoffee", and you write "irish coffee" and the result doesn't show up. From a brand like Google I would expect a somewhat more "intelligent", natural-language-oriented way of determining whether a URL contains given words than one based just on splitting on certain characters and begins-with/ends-with/contains logic. Anyway, if the problem is that mid-word matches give garbage results, shouldn't the solution perhaps be to make them score lower rather than ignoring them completely? It is one thing to have only a mid-word match and the result not show up. It is another to have a match at the start of a word PLUS a mid-word match, and for the latter to PREVENT the result from showing up when in fact it makes it a better match, not a worse one.
,
Mar 8 2016
I'm reopening this and morphing slightly. This is now about re-enabling the HQP's ability to do mid-word matches outside hostnames. I believe removing these was a mistake. However, when re-adding them, we'll need to experiment with the scoring to find something beneficial, to make sure we don't accidentally make things worse. Mark has said he's willing to offer advice and help interpret the results of experiments, but we still need someone willing to actually come up with a potential algorithm (maybe spelunk to find out what we used to do here, and start with that?) and write a patch to do it. I am unlikely to get to this myself, so not taking.
,
Mar 9 2017
This issue has been available for more than 365 days, and should be re-evaluated. Please re-triage this issue. The Hotlist-Recharge-Cold label is applied for tracking purposes, and should not be removed after re-triaging the issue. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Mar 9 2017
Mark can decide how to dupe/leave open
,
Mar 10 2017
I'm experimenting with this as part of my general HQP experiments. No appropriate bug to dup it into though.
,
Jun 13 2017
Still experimenting. It's looking promising. The experiment won't return URLs that have only mid-word matches, but it will be forgiving of mid-word matches. (Roughly, at least half the matches will have to be matches at the beginning of a word.)
,
Jul 19 2017
,
Aug 1 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/fbc743ee8228f50c12e466a440f2920f3c86d5b8
commit fbc743ee8228f50c12e466a440f2920f3c86d5b8
Author: Mark Pearson <mpearson@chromium.org>
Date: Tue Aug 01 20:08:33 2017
Omnibox - Launch Aggressively Suggest Infrequently Visited URLs
Sets default values for all relevant parameters to the ones we decided to launch. In particular, this change:
* makes it so that URLs that match the input--even if only visited once or twice--are likely to score above low quality query suggestions that come from the server.
* boosts a URL suggestion if it appears the URL suggestion is clearly seeking that URL. In particular, if the omnibox input only matches that single URL from history, it gets a 3x boost (in effect we count it as having three times as many visits). This boost decreases as the number of matching URLs increases, so that if the user input matches five or more items from history, nothing gets a boost.
* lowers the threshold for how well a URL must match the input in order to be displayed. Previously, for example, we wouldn't return URLs that match a word in the input if the word matches in the ?query or #hash section of the URL. Now we do.
* reduces the relative weight of a "typed visit" (a time the URL is selected from the omnibox) compared with a regular visit (click on a link). It used to be that the former was worth 20x the latter. Now it's only 1.5x.
* changes to a scoring model in which additional visits to a URL are guaranteed to increase its score. Previously we used a model based on the average quality of a visit, which means that if a URL has many typed visits and then gets a new untyped visit, its score (the average) will go down. Now we use simply a sum, which means the score will definitely increase.
Precisely, in terms of code / config, we're launching the following settings:
"HQPExperimentalScoringBuckets": "0.0:550,1:625,9.0:1300,90.0:1399",
"HQPTypedValue": "1.5",
"HQPFreqencyUsesSum": "true",
"HQPNumMatchesScores": "1:3,2:2.5,3:2,4:1.5",
"HQPExperimentalScoringTopicalityThreshold": "0.5"
In the process, removes some of the flags for frequency scoring that I don't think are useful (not the right model for scoring) and aren't worth going back to.
Bug: 695560, 327085 , 369989 , 508262, 580688 , 591981, 598184
Change-Id: Id349c5aaa2e09e6b5284c55fc5790f4b14b8fa7b
Reviewed-on: https://chromium-review.googlesource.com/585377
Commit-Queue: Mark Pearson <mpearson@chromium.org>
Reviewed-by: Peter Kasting <pkasting@chromium.org>
Cr-Commit-Position: refs/heads/master@{#491089}
[modify] https://crrev.com/fbc743ee8228f50c12e466a440f2920f3c86d5b8/components/omnibox/browser/history_quick_provider_unittest.cc
[modify] https://crrev.com/fbc743ee8228f50c12e466a440f2920f3c86d5b8/components/omnibox/browser/in_memory_url_index_unittest.cc
[modify] https://crrev.com/fbc743ee8228f50c12e466a440f2920f3c86d5b8/components/omnibox/browser/omnibox_field_trial.cc
[modify] https://crrev.com/fbc743ee8228f50c12e466a440f2920f3c86d5b8/components/omnibox/browser/omnibox_field_trial.h
[modify] https://crrev.com/fbc743ee8228f50c12e466a440f2920f3c86d5b8/components/omnibox/browser/scored_history_match.cc
[modify] https://crrev.com/fbc743ee8228f50c12e466a440f2920f3c86d5b8/components/omnibox/browser/scored_history_match.h
[modify] https://crrev.com/fbc743ee8228f50c12e466a440f2920f3c86d5b8/components/omnibox/browser/scored_history_match_unittest.cc
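For readers trying to picture what the match-count boost and the sum-versus-average change mean in practice, here is a minimal Python sketch. It is not the Chromium implementation (that lives in scored_history_match.cc, in C++). The function names are hypothetical, the "key:value" parsing is an assumption about how the parameter strings are read, and the sketch ignores anything else the real scoring may factor in. It also uses the 1.5x typed weight for both models so that only the sum-versus-average difference shows.

```python
# Illustrative sketch only; names and parsing are assumptions, not Chromium code.

def parse_pairs(spec: str) -> dict[float, float]:
    """Parse a 'key:value,key:value' string into a dict of floats."""
    pairs = {}
    for item in spec.split(","):
        key, value = item.split(":")
        pairs[float(key)] = float(value)
    return pairs

# Values quoted in the commit message above.
NUM_MATCHES_SCORES = parse_pairs("1:3,2:2.5,3:2,4:1.5")  # HQPNumMatchesScores
TYPED_VALUE = 1.5                                        # HQPTypedValue

def match_count_boost(num_matching_urls: int) -> float:
    """Boost factor by number of matching history URLs; 1.0 when not listed."""
    return NUM_MATCHES_SCORES.get(num_matching_urls, 1.0)

def visit_weight(typed: bool) -> float:
    """A typed visit counts 1.5x a regular visit."""
    return TYPED_VALUE if typed else 1.0

def frequency_sum(visits: list[bool]) -> float:
    """New model: every additional visit adds to the score."""
    return sum(visit_weight(typed) for typed in visits)

def frequency_average(visits: list[bool]) -> float:
    """Old-style model: average visit quality, which an untyped visit can lower."""
    return frequency_sum(visits) / len(visits) if visits else 0.0

before = [True, True, True]   # three typed visits
after = before + [False]      # plus one untyped visit

# The sum goes up after the extra visit; the average goes down.
sum_before, sum_after = frequency_sum(before), frequency_sum(after)
avg_before, avg_after = frequency_average(before), frequency_average(after)

# A URL that is the only history match is treated as if it had 3x the visits.
boosted = frequency_sum(after) * match_count_boost(1)
```

With five or more matching URLs the lookup falls through to 1.0, which mirrors the statement that nothing gets a boost in that case.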
,
Aug 1 2017
This change helped but does not fix the issue. The omnibox still doesn't allow mid-word matches, so if one of the omnibox input words only matches a URL in those parts, the URL is not returned. The change listed above made the omnibox more willing to return suggestions that only match in the /path part of the URL, but those matches have to be at the beginning of a word boundary. I'm going to leave this bug open, though will not likely work on it for a while unless it appears to be a widespread problem for users.
,
Nov 30 2017
,
Nov 30 2017
Rephrasing summary, as the behavior changed recently. I *think* the current behavior is: if you have a multi-word omnibox input, and at least one term matches at a word boundary, then the URL will be returned. If all terms match only in the middle of words, then the URL will not be returned.
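To make the rule described in this comment concrete, here is a rough Python sketch. It is an illustration of the behavior as the commenter believes it to be, not the actual HQP code. The helper names are hypothetical, splitting on every non-alphanumeric character is a simplification, and the requirement that every term match somewhere in the URL is an assumption.

```python
import re

# Illustrative sketch only; helper names and the word-splitting rule are assumptions.

def split_words(text: str) -> list[str]:
    """Lowercase the text and split it on any non-alphanumeric character."""
    return [w for w in re.split(r"[^0-9a-z]+", text.lower()) if w]

def classify_term(term: str, words: list[str]) -> str:
    """Return 'boundary' if the term starts some word, 'mid-word' if it only
    appears inside a word, and 'none' if it does not appear at all."""
    if any(word.startswith(term) for word in words):
        return "boundary"
    if any(term in word for word in words):
        return "mid-word"
    return "none"

def would_return(url: str, user_input: str) -> bool:
    """True when every term matches somewhere and at least one term matches
    at the start of a word."""
    words = split_words(url)
    kinds = [classify_term(term, words) for term in split_words(user_input)]
    return bool(kinds) and "none" not in kinds and "boundary" in kinds

URL = "http://www.example.com/aword-whatever-foobar-etc.html"

original_report = would_return(URL, "example.com aword foo bar")  # "bar" is mid-word only
only_mid_word = would_return(URL, "word bar")                     # both terms mid-word
irish = would_return("http://example.com/irishcoffee", "irish coffee")
```

Under this reading, the input from the original report would return the URL because "bar" is the only mid-word term, while an input made up solely of mid-word terms would not.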
,
Jan 9 2018
The NextAction date has arrived: 2018-01-09
,
Jan 25 2018
I think the current state is reasonable. CC jdonnelly@, because he's thinking about fuzzy matching and this falls into that vein.
,
Jan 8
The NextAction date has arrived: 2019-01-08
,
Jan 8
CC skare@ as an FYI. He's thinking about fuzzy matching these days.
Comment 1 by b...@chromium.org
, Mar 4 2016