
Issue 603883

Starred by 13 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug




Omnibox autocomplete incorrectly handles words with combining diacritics

Reported by dani...@gmail.com, Apr 15 2016

Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36

Steps to reproduce the problem:
1. Search for a Hebrew word whose first letter carries a diacritic (nikud), for example עֲבוֹדָה
2. Start a new search by typing just the first letter and let it autocomplete; in the above example, type the Hebrew letter ע
3. The autocomplete will select the rest of the letters as expected, e.g. בודה will be selected
4. Now type another letter, say ג. Unlike in English or in Hebrew without diacritics, the entire word is deleted and only ג remains in the bar.
The expected behavior is that עג is shown in the bar. This probably happens because the diacritic is counted as a letter and not as a mark.

What is the expected behavior?

What went wrong?
Autocomplete behavior is inconsistent with diacritics, see steps to reproduce

Did this work before? N/A 

Chrome version: 50.0.2661.75  Channel: stable
OS Version: OS X 10.11.4
Flash Version: Shockwave Flash 21.0 r0
 
Components: -UI UI>Browser>Omnibox
Labels: Needs-TestConfirmation
Is this Mac-specific? Adding Needs-TestConfirmation to check this on non-Mac.
Labels: -Needs-TestConfirmation -OS-Mac Needs-Feedback OS-All
This wouldn't be Mac-specific.

From Unicode's perspective, the diacritics are distinct code points that combine with the prior characters.  They also work this way when using the backspace key: one backspace will delete the diacritic, the next the letter.
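To make the code-point view concrete, here is a minimal Python sketch (an illustration only, not Chromium code) that lists the code points behind the first letter of the example word. The helper name `describe` is hypothetical, and the sketch assumes that letter is encoded as U+05E2 followed by the combining point U+05B2.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name, canonical combining class) per code point."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.combining(ch))
        for ch in text
    ]

# Assumption: the first letter of the word in comment 0 is AYIN (U+05E2)
# followed by the combining point HATAF PATAH (U+05B2).
first_letter = "\u05E2\u05B2"

for code_point, name, combining_class in describe(first_letter):
    # A non-zero combining class means the code point attaches to the preceding character.
    kind = "combining mark" if combining_class else "base character"
    print(code_point, name, kind)
```

What renders as one letter is stored as two code points, which is why a single backspace can remove only the mark.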

Inline autocompleting in the case where the input string has proceeded past where one of these diacritics would appear, without including it, is problematic, because we have no way to represent an autocompletion that contains the diacritic (we can only add on to the end of the input) and we don't know that the input without the diacritic has the same meaning.  (I don't speak Hebrew, but AFAICT from searching, the two strings aren't the same, or at least aren't treated the same by Google.)

So I think what the omnibox is doing is correct: it's not autocompleting you to a different string that you never typed, because that different string does not necessarily mean the same thing to e.g. a search engine.

Reporter, if I'm mistaken, then I'll need a fuller explanation of how Hebrew diacritics affect semantics.

Comment 3 by dani...@gmail.com, Apr 16 2016

Well, you are right that diacritics can combine to produce different meanings, but modern Hebrew usually doesn't use them. The main uses are:
1. Poetry
2. Bible
3. Content aimed at children who just learned to read

They are supposed to represent vowels, but you can write the same word without them and get the same meaning and pronunciation; the vowels are implied when read.

Moreover, Chrome actually supports this definition: if you use Ctrl-F to search a page for the non-diacritical version of a Hebrew word that appears on the page with diacritics, Chrome finds it.
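A diacritic-insensitive match of this kind can be sketched as follows. This is only one plausible approach (decompose, then drop all mark code points) and is not a description of how Chrome's find-in-page is implemented; the function names `strip_marks` and `contains_ignoring_marks` are hypothetical.

```python
import unicodedata

def strip_marks(text):
    """Decompose to NFD, then drop every code point in a mark category (Mn, Mc, Me)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed
                   if not unicodedata.category(ch).startswith("M"))

def contains_ignoring_marks(haystack, needle):
    """True if needle occurs in haystack once marks are removed from both."""
    return strip_marks(needle) in strip_marks(haystack)

# The example word from comment 0, written with escapes to keep the marks visible.
with_nikud = "\u05E2\u05B2\u05D1\u05D5\u05B9\u05D3\u05B8\u05D4"
without_nikud = "\u05E2\u05D1\u05D5\u05D3\u05D4"

print(contains_ignoring_marks(with_nikud, without_nikud))
```

Dropping marks is lossy: two words that differ only in their marks become indistinguishable, which is exactly the concern raised earlier in the thread.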

Although you are right that there can be two distinct Hebrew words with the same spelling but different diacritical marks, and therefore different meanings, the way Hebrew uses diacritics means that a different mark on a letter does not make it a different letter. That's why the behavior is inconsistent: in the eyes of a user, it is the same letter no matter how it's marked.

Now, I think I haven't explained the problem correctly. The thing is that once you've searched for a Hebrew word with a diacritical mark on the first letter, your search history is contaminated, and you have to type the first letter twice to write a different search term. This also prevents you from using any autocomplete on the word, as any autocomplete suggestion is deleted.

I'll give you an example in the Latin alphabet, which will probably be easier to understand compared to a language you don't speak.

I'm using "y" and " ̆" (U+0306, COMBINING BREVE) to construct the word "y̆our"
Steps to recreate:
1. Clear your search history
2. Use omnibox to search for y̆our
3. Start a new search by typing y. Notice that the entire word is selected by autocomplete, not just the diacritic and "our" as expected
4. Start typing "ep", as you would if you wanted to search for the word "yep". When e is pressed, the entire word is deleted, which leaves you with the string "ep" in the omnibox, and not "yep" as expected




Project Member

Comment 4 by sheriffbot@chromium.org, Apr 16 2016

Labels: -Needs-Feedback Needs-Review
Owner: pkasting@chromium.org
Thank you for providing more feedback. Adding requester "pkasting@chromium.org" for another review and adding "Needs-Review" label for tracking.

For more details visit https://sites.google.com/a/chromium.org/dev/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: msw@chromium.org shrike@chromium.org
Labels: -OS-All OS-Mac
@3: I can't reproduce the behavior you describe, at least on trunk.  Using your steps, while the selection in comment 3 does look as if it covers the whole word (probably because we can't always sanely draw a selection for "just the diacritic on a letter", but CCing msw in case he wants to comment here), typing in step 4 acts as it should -- "ep" results in "yep", not "ep".

For some reason I can't get the Hebrew input in comment 0 to autocomplete at all, despite debug info telling me it ought to, so I can't test that case :/

I wonder if this really is Mac-specific after all. +CC shrike -- can you test the steps in comment 3 on Mac?  If they are broken, then I think this is Mac-only (and maybe msw can once again comment, this time on whether we use all our own Textfield machinery there as we do elsewhere).

Comment 6 by dani...@gmail.com, Apr 18 2016

I've tried this on my work computer, which runs Windows 10, and it didn't reproduce, so it probably is Mac-only.

By the way, when trying to autocomplete the Hebrew input, keep in mind it's a right-to-left language, so copy-pasting from the left won't work. Also, the non-diacritic version of the first letter is "ע".

Anyway, thanks for the quick response for this rather obscure issue :)

Cc: -shrike@chromium.org pkasting@chromium.org
Owner: shrike@chromium.org
Yeah, I did the correct repro steps for the Hebrew autocompletion.  I don't know why it's not happening.

Anyway, I bet this is basically "Mac omnibox does not use views::Textfield" and whatever system it does use is handling this incorrectly.  ->Jayson to triage, I don't know who formally owns the Mac omnibox now.

Comment 8 by shrike@chromium.org, Apr 18 2016

Hello danilan@,

Would you please provide more detailed steps on how to reproduce the problem? I am not super familiar with Hebrew and so I'm not sure I'm typing the right characters. Maybe let me know the exact keys to press on my US keyboard.

It may be easier to make a movie of the whole thing with the onscreen keyboard visible - that way I can see exactly what to tap on that keyboard, and see the results you're getting in the browser.

Comment 9 by shrike@chromium.org, Apr 18 2016

Labels: Needs-Feedback
@8: Some info on inputting Unicode in Mac OS: https://en.wikipedia.org/wiki/Unicode_input#In_Mac_OS

You can search for particular Unicode characters at http://www.fileformat.info/ to obtain their code values and such.
I know about entering Unicode, and how to invoke the Hebrew input manager to enter Hebrew. I just would like exact instructions from the reporter so that I'm sure I'm entering the exact characters he is to reproduce the problem.

OK.  I used the ones he gave in comment 0 :)

(Incidentally, I think the reason I wasn't getting autocompletion on Win is because I was pasting rather than keying in the characters, which disables inline autocompletion.  Unfortunately the Windows omnibox doesn't support a couple of the Windows standard methods for entering Unicode; I've filed a separate bug about that.)

Comment 13 by dani...@gmail.com, Apr 19 2016

@11: I won't have time right now to map the string in comment 1 to a Latin keyboard (it's morning :), but I did recreate it with the Latin alphabet at the end of comment 3. Can you try that first?

Cc: shrike@chromium.org
Labels: -Needs-Feedback
Owner: ----
Status: Available (was: Unconfirmed)
Thank you for the additional info. Here are precise steps to reproduce:

1. Launch Chrome with an empty user-data-dir
2. Create a new tab
3. Paste y̆our into the Omnibox and press return
4. Click in the Omnibox and type y

At this point the Omnibox contains y̆our but the entire word is selected. If you perform the same steps with "your", only the last three characters are selected at this point.

On the Mac the text system is treating the two-character sequence as a single glyph (which I think is the right way to do it), which I suspect is throwing off where the Omnibox thinks the selection should fall. I haven't looked at any code but I think it'll be difficult to correct for this behavior on the Mac side.

Labels: -Needs-Review
@14: The critical bit is not how the selection appears, but what happens when you type the letter "o" after the letter "y".  If this deletes the letter "y", then this basically means typing anything with a diacritic can break typing of that letter in the future.  Even if that's hard to solve, that should be a P1 bug and it needs to have an owner; we can't live with that.  Can you please find an appropriate owner?

If, OTOH, you get "yo" (with or without an autocompletion), then it doesn't seem like we can repro the primary bug reported here.
I understand what the critical issue is, thank you, and my point about selection is that the Omnibox code appears to be directing the Mac text system to select the wrong amount of text, which is why the string gets deleted when you type the second character.
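The failure described here can be modelled with a toy Python sketch. It is not AppKit or omnibox code: `is_boundary` only approximates grapheme boundaries by treating any mark code point as attached to the previous character, and all function names are hypothetical.

```python
import unicodedata

def is_boundary(text, index):
    """Approximate grapheme boundary test: a position is a boundary unless
    the code point at that position is a combining mark."""
    if index <= 0 or index >= len(text):
        return True
    return not unicodedata.category(text[index]).startswith("M")

def expand_to_previous_boundary(text, start):
    """Model of the text view behavior described in this thread: a selection
    start that falls inside a grapheme is moved back to the grapheme's start."""
    while not is_boundary(text, start):
        start -= 1
    return start

def type_over_selection(text, selection_start, typed):
    """Typing replaces everything from selection_start to the end of the text."""
    return text[:selection_start] + typed

suggestion = "y\u0306our"   # "y" + COMBINING BREVE + "our"
requested_start = 1          # the omnibox wants everything after the typed "y" selected
actual_start = expand_to_previous_boundary(suggestion, requested_start)

print(repr(type_over_selection(suggestion, requested_start, "e")))  # what the omnibox intends
print(repr(type_over_selection(suggestion, actual_start, "e")))     # what the expanded selection yields
```

With the requested selection, the typed "y" survives; with the expanded selection, it is replaced along with the rest of the suggestion.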

It is incorrect to say that this is a problem typing anything with a diacritic - for example, I cannot reproduce the problem when searching for a string like über.

That's a good distinction to make.  This only applies to combining diacritics;  ü is a single Unicode codepoint.
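The distinction can be checked with Unicode normalization. The following Python sketch assumes the "ü" in "über" was entered as the precomposed code point U+00FC; `code_point_count` is a hypothetical helper.

```python
import unicodedata

precomposed = "\u00FC"   # LATIN SMALL LETTER U WITH DIAERESIS, as in "über"
combining = "y\u0306"    # "y" followed by COMBINING BREVE, as in comment 3

def code_point_count(text, form):
    """Number of code points after normalizing text to the given form."""
    return len(unicodedata.normalize(form, text))

print(code_point_count(precomposed, "NFC"))  # composed form of the u with diaeresis
print(code_point_count(precomposed, "NFD"))  # decomposed form of the same letter
print(code_point_count(combining, "NFC"))    # y with breve has no precomposed form
```

Because Unicode has no precomposed "y with breve", that sequence stays at two code points even after composition, so normalizing the text to NFC would not make the problem go away in general.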

I'm still concerned, though, since it seems like some languages may have a lot of combining diacritics or other combining/joining characters.  I'm not familiar enough to know for sure, but I'd worry about Vietnamese and maybe Arabic?

Perhaps the solution here is to switch the Mac omnibox to use the views::Textfield, even outside the normal path of other Mac views work?  I don't know if that's possible and I don't know how complete the Mac port of Textfield is, but in theory maybe that would leave us in the same place as other platforms.

Or maybe we're using the Mac text editing APIs incorrectly and there's a fix to how we say to do selection, where we can select in between the diacritic codepoint and the previous (combined) character codepoint, but we're incorrectly not doing so?  Maybe there's some cross-platform omnibox code that needs to be per-codepoint that isn't?

Is there anyone at all available to look into all this more deeply on the mac side?  Maybe we'll end up punting this bug but ideally we could answer the above first to ensure we completely understand the scope of the problems + fixes.
Cc: -shrike@chromium.org
Owner: shrike@chromium.org
Status: Assigned (was: Available)
I will try to find a little time to look at this. I was thinking we might have to resort to some kind of intermediate string massaging - perhaps a views::Textfield is the answer.

Cc: shrike@chromium.org
Labels: M-56
Owner: lgrey@chromium.org
Cc: yukishiino@chromium.org lgrey@chromium.org sky@chromium.org
 Issue 158070  has been merged into this issue.
Summary: Omnibox autocomplete incorrectly handles words with combining diacritics (was: Omnibox autocomplete incorrectly handles words with Hebrew diacritics)
Note from duped-in issue: this affects Thai as well.
Project Member

Comment 22 by bugdroid1@chromium.org, Oct 17 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8a8bae9fbc2e72afc208a0376c0f4a49388a6da3

commit 8a8bae9fbc2e72afc208a0376c0f4a49388a6da3
Author: lgrey <lgrey@chromium.org>
Date: Mon Oct 17 13:56:41 2016

Preserve original selection when suggesting completions with diacritics

When NSTextView is asked to select a range which does not begin on a grapheme
boundary, it expands the selection to the previous boundary. This change
preserves the original selection, then contracts the range sent to
NSTextView to fall on the *next* boundary.

Text editing operations that operate on the selection use the original
selection instead of the visual selection.

Since the omnibox view uses the text view's selected range, the
view's |selectedRange| returns the original range and not the visual
range.
BUG=603883

Review-Url: https://codereview.chromium.org/2395233005
Cr-Commit-Position: refs/heads/master@{#425671}

[modify] https://crrev.com/8a8bae9fbc2e72afc208a0376c0f4a49388a6da3/chrome/browser/ui/cocoa/location_bar/autocomplete_text_field_editor.h
[modify] https://crrev.com/8a8bae9fbc2e72afc208a0376c0f4a49388a6da3/chrome/browser/ui/cocoa/location_bar/autocomplete_text_field_editor.mm
[modify] https://crrev.com/8a8bae9fbc2e72afc208a0376c0f4a49388a6da3/chrome/browser/ui/cocoa/location_bar/autocomplete_text_field_editor_unittest.mm
[modify] https://crrev.com/8a8bae9fbc2e72afc208a0376c0f4a49388a6da3/chrome/browser/ui/cocoa/omnibox/omnibox_view_mac.mm
[modify] https://crrev.com/8a8bae9fbc2e72afc208a0376c0f4a49388a6da3/chrome/browser/ui/cocoa/omnibox/omnibox_view_mac_browsertest.mm
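The idea in the commit message above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the Objective-C++ code from the CL: `SelectionModel` and its methods are hypothetical names, and grapheme boundaries are approximated by skipping mark code points.

```python
import unicodedata

def is_mark(ch):
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")

def contract_to_next_boundary(text, start):
    """Move a selection start forward past any combining marks."""
    while start < len(text) and is_mark(text[start]):
        start += 1
    return start

class SelectionModel:
    """Hypothetical stand-in for the editor: remembers the range the omnibox
    asked for, and separately the range actually shown to the user."""

    def __init__(self, text):
        self.text = text
        self.original_start = len(text)
        self.visual_start = len(text)

    def select_from(self, start):
        self.original_start = start
        self.visual_start = contract_to_next_boundary(self.text, start)

    def selected_start(self):
        # Callers such as the omnibox view see the original range.
        return self.original_start

    def insert(self, typed):
        # Editing uses the original range, not the visual one.
        self.text = self.text[:self.original_start] + typed
        self.original_start = self.visual_start = len(self.text)

model = SelectionModel("y\u0306our")
model.select_from(1)            # the omnibox asks for everything after "y"
shown_start = model.visual_start
model.insert("e")
print(shown_start, repr(model.text))
```

The visual selection starts after the breve, while the edit still replaces from the position the omnibox asked for, so the typed "y" is kept.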

Project Member

Comment 23 by bugdroid1@chromium.org, Oct 18 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/67434dc6fcbb9c1f734b147e75183d864b33f174

commit 67434dc6fcbb9c1f734b147e75183d864b33f174
Author: lgrey <lgrey@chromium.org>
Date: Tue Oct 18 15:55:59 2016

Revert of [Mac] Preserve original selection when suggesting completions with diacritics (patchset #13 id:240001 of https://codereview.chromium.org/2395233005/ )

Reason for revert:
This almost definitely caused crbug.com/656972, but I can't repro the issue. Reverting while I investigate.

Original issue's description:
> Preserve original selection when suggesting completions with diacritics
>
> When NSTextView is asked to select a range which does not begin on a grapheme
> boundary, it expands the selection to the previous boundary. This change
> preserves the original selection, then contracts the range sent to
> NSTextView to fall on the *next* boundary.
>
> Text editing operations that operate on the selection use the original
> selection instead of the visual selection.
>
> Since the omnibox view uses the text view's selected range, the
> view's |selectedRange| returns the original range and not the visual
> range.
> BUG=603883
>
> Committed: https://crrev.com/8a8bae9fbc2e72afc208a0376c0f4a49388a6da3
> Cr-Commit-Position: refs/heads/master@{#425671}

TBR=asvitkine@chromium.org,erikchen@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG=603883

Review-Url: https://codereview.chromium.org/2426983002
Cr-Commit-Position: refs/heads/master@{#425975}

[modify] https://crrev.com/67434dc6fcbb9c1f734b147e75183d864b33f174/chrome/browser/ui/cocoa/location_bar/autocomplete_text_field_editor.h
[modify] https://crrev.com/67434dc6fcbb9c1f734b147e75183d864b33f174/chrome/browser/ui/cocoa/location_bar/autocomplete_text_field_editor.mm
[modify] https://crrev.com/67434dc6fcbb9c1f734b147e75183d864b33f174/chrome/browser/ui/cocoa/location_bar/autocomplete_text_field_editor_unittest.mm
[modify] https://crrev.com/67434dc6fcbb9c1f734b147e75183d864b33f174/chrome/browser/ui/cocoa/omnibox/omnibox_view_mac.mm
[modify] https://crrev.com/67434dc6fcbb9c1f734b147e75183d864b33f174/chrome/browser/ui/cocoa/omnibox/omnibox_view_mac_browsertest.mm

Looking at the Summary tab in the crash reporter, this is the exception that was thrown:

Crashing on exception: *** -[NSBigMutableString substringWithRange:]: Range {0, 32000} out of bounds; string length 88

So something about the current selection looks to be screwed up. This might be another spot where the current selection gets set (separate from setSelectedRange:).

Comment 25 by lgrey@chromium.org, Oct 18 2016

From a few things ellyjones@ discovered, it looks like 32000 is coming from the max size of a text span in the TextEdit framework (https://github.com/steventroughtonsmith/MPWTestSuite/blob/master/MacC/TESample.h#L179 for example.)

That makes us think this isn't so much the selection being updated without us seeing it, but some error condition in the Carbon code that's giving up and setting the "max" selection size as a fallback.
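For illustration, the shape of the exception can be reproduced with a small Python sketch. `substring_with_range` mimics a bounds-checked substring call and `clamp_range` is a purely hypothetical guard; neither is taken from Chromium or AppKit, and nothing in this thread says such a guard was the right fix.

```python
def substring_with_range(text, location, length):
    """Return text[location:location + length], raising if the range does not fit,
    similar in spirit to the exception in the crash report."""
    if location < 0 or length < 0 or location + length > len(text):
        raise IndexError(
            f"Range {{{location}, {length}}} out of bounds; string length {len(text)}"
        )
    return text[location:location + length]

def clamp_range(text, location, length):
    """Hypothetical guard: shrink a range so that it fits inside the text."""
    location = max(0, min(location, len(text)))
    length = max(0, min(length, len(text) - location))
    return location, length

text = "x" * 88
try:
    substring_with_range(text, 0, 32000)
except IndexError as error:
    print(error)

print(clamp_range(text, 0, 32000))
```

A clamp like this would only hide the symptom; the open question in the comments above is where the 32000 comes from in the first place.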

There's also this: https://crash.corp.google.com/browse?stbtiq=57efc43b00000000 If it's the same issue (and I'm moderately confident it is), then the right-click menu is a red herring. Would love to figure out how to repro.


lgrey@, do we have any update on this? Also, we can mark it as 'Fixed' if no other CL is pending.

Thank you!

Comment 27 by lgrey@chromium.org, May 24 2017

Hi manoranjanr@

I haven't had a chance to look at this again yet. I definitely wouldn't mark it fixed, since the CL was rolled back. I think a fix will be quite involved and/or may require action from Apple.
lgrey@: should this be downgraded to P-3?  The issue sounds serious (affects core omnibox functionality in RTL languages it seems), which makes me think P-2, yet it doesn't sound like we can / plan to fix it in the short term, hence P-3?

Comment 29 by lgrey@chromium.org, Jun 15 2017

shrike@, what's your take?
The change in c#22 was made to accommodate the omnibox code, basically maintaining the selection range that the omnibox expects while adjusting that range to conform to how NSTextView works. This was kind of an ugly hack because we had to override a private AppKit method; it ultimately also did not work. What if, instead, we change the omnibox to send Chrome Mac the kind of selection range that NSTextView expects? That way both the omnibox and NSTextView have the same selection range (so there's no trickery needed to keep the two in sync). Short of trying this approach, I'm not sure we will be able to fix this bug.

Comment 31 by lgrey@chromium.org, Jun 16 2017

I think we'd just be trading one sharp corner for another

(using ü to illustrate the point, I know as per above it's a single code point)

Let's say typing u autocompletes to über for me. Since the omnibox knows NSTextView can't select inside the boundary, it sends the selection range as {1, 3}.

ü[ber]

If I press delete at this point, I get:
ü

If I press delete again, I get the empty string. Basically impossible to type just "u". IIRC this is similar to one of the current failure modes.
lgrey@ and shrike@,  (scan comments #14 and onward)

Is there any way out of this wilderness?  This is a problem for Hebrew, Thai, and several other languages with combining characters.  Should we try to re-land the hack that overrides a private AppKit method?  Should we leave this as-is until someday we get regular TextFields on Mac?  Is there another approach?  Or, as a stopgap, do we have to decide between the two bad behaviors:
- when inline autocompletion happens, you cannot type beyond the first combining character in the completion (e.g., to type a different query).
- when inline autocompletion happens, you cannot type a non-combined character if it's part of the inline autocompletion.  (This is lgrey@'s comment #31.)

Ugh.

Comment 33 by lgrey@chromium.org, Oct 11 2017

I wouldn't mind trying to reland the hack. This was my first Chromium CL, so I think I'm better equipped now to debug the crashes this time assuming they still occur.
Cc: m...@jirayu.in.th mpear...@chromium.org
 Issue 773737  has been merged into this issue.
For the record, this appears to be a problem on Views too.  See bug 702716.

Comment 36 by msw@chromium.org, Feb 21 2018

The actual corresponding Views bug is Issue 813534
Is there any solution for this yet, or a way to completely disable autocomplete in the omnibox?

This issue is becoming more and more frustrating.
Issue 813534 would be the one to follow now, since the Cocoa Omnibox is 99.99% likely to be retired starting from M69.

The good news is, it's a much more tractable problem in Views since we're in control of the whole stack there.
Labels: Hotlist-CocoaBrowser
