Issue 699441 link

Starred by 4 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug



Leading 4-byte unicode char can break existing text with MSIME Japanese

Reported by k...@digitaldolphins.jp, Mar 8 2017

Issue description

UserAgent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36

Steps to reproduce the problem:
1. Type the 4-byte Unicode character "hoshi" (the glowing star character, U+1F31F).
2. Type ABCDEF.
3. Move the cursor back to just after "hoshi".
4. Type "sakujosakujosakujo".
5. Press space to convert.
6. Press enter to submit.
7. ABCDEF is overwritten.

(See break.gif)

What is the expected behavior?
The following should remain:
- "hoshi" (A glowing star U+1F31F character).
- conversion result of "sakujosakujosakujo"
- ABCDEF

What went wrong?
Only the following remained:
- "hoshi" (A glowing star U+1F31F character).
- conversion result of "sakujosakujosakujo"

Did this work before? N/A 

Chrome version: 56.0.2924.87  Channel: stable
OS Version: 6.3
Flash Version: Shockwave Flash 24.0 r0

* Ctrl+Z undo command can recover eliminated characters.
 
break.gif (38.6 KB)

Comment 1 by tkent@chromium.org, Mar 8 2017

Components: -UI Blink>Editing

Comment 2 by yosin@chromium.org, Mar 9 2017

Cc: changwan@chromium.org yukawa@chromium.org nona@chromium.org
Components: -Blink>Editing Blink>Editing>IME
Status: Available (was: Unconfirmed)
For ease of repro, the following URL shows an INPUT containing U+1F31F:
https://jsfiddle.net/fx71v6kk/

It seems we should pass offset based on Code Point instead of UTF-16 Code Unit to MS-IME.

Offsets in WebTextInputInfo and TextInputInfo are in UTF-16 code units.
> It seems we should pass offset based on Code Point instead of UTF-16 Code Unit to MS-IME.

IME APIs (IMM32 and Text Services Framework) used in Windows always expect character offsets in UTF-16.
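
To make the two offset units in this discussion concrete, here is a minimal Python sketch. It is illustrative only and is not Chromium or IME code; the helper names utf16_len and code_point_to_utf16_offset are hypothetical. It shows that the caret position just after U+1F31F is offset 1 when counted in code points but offset 2 when counted in UTF-16 code units.

```python
# Illustrative only: helper names are hypothetical, not Chromium APIs.
STAR = "\U0001F31F"  # U+1F31F GLOWING STAR, outside the Basic Multilingual Plane


def utf16_len(text: str) -> int:
    """Return the number of UTF-16 code units needed to encode text."""
    return len(text.encode("utf-16-le")) // 2


def code_point_to_utf16_offset(text: str, cp_offset: int) -> int:
    """Convert an offset counted in code points to UTF-16 code units."""
    # Python str indexes by code point, so slicing gives the prefix directly.
    return utf16_len(text[:cp_offset])


text = STAR + "ABCDEF"

code_points = len(text)                    # star counts as 1
utf16_units = utf16_len(text)              # star counts as 2 (surrogate pair)
utf8_bytes_for_star = len(STAR.encode("utf-8"))

# Caret placed immediately after the star, as in step 3 of the report.
caret_cp = 1
caret_utf16 = code_point_to_utf16_offset(text, caret_cp)

print(code_points, utf16_units, utf8_bytes_for_star, caret_cp, caret_utf16)
```

If one side of an interface counted in code points while the other counted in UTF-16 code units, every offset after the star would differ by one. Whether that is what happens in this bug is not established in this thread.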

Isn't this a regression? I think Chromium has supported surrogate pairs for a long time, and I am skeptical that this has been broken since the beginning.

Comment 4 by yosin@chromium.org, Oct 4 2017

Labels: Pri-3

Comment 5 by sheriffbot@chromium.org, Oct 4

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
My problem is no longer reproducible on the latest Chrome, version 69.0.3497.100 (Official Build) (64-bit).

Status: Available (was: Untriaged)
