Issue 661320 link

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Task



Add a test string that would overflow even with word-break:break-all to f/t/midword-break-before-surrogate-pair

Project Member Reported by js...@chromium.org, Nov 1 2016

Issue description

Spun off from https://codereview.chromium.org/2447513002/

With the update to ICU 58, regional indicator pairs are treated like ID for LB. That is, there is a line breaking opportunity between adjacent regional indicator pairs whether or not 'word-break: break-all' is applied.

midword-break-before-surrogate-pair assumes that there is no line breaking opportunity there and uses them as a test string that would overflow even when 'word-break' is set to 'break-all'.

LB=IS, LB=IN, and LB=OP have some non-BMP characters that can be used for testing instead. (There might be other categories.)

http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:Line_Break=Inseparable:]

http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:Line_Break=Open_Punctuation:]
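
For context, the test needs non-BMP characters because they are encoded as a surrogate pair in UTF-16, which is what the test name refers to. A minimal Python sketch of that encoding follows; the helper name `utf16_surrogates` is hypothetical and not part of any test harness, and the regional indicator U+1F1E6 is used only as a familiar non-BMP example.

```python
def utf16_surrogates(code_point: int) -> tuple:
    """Return the (high, low) UTF-16 surrogates for a non-BMP code point."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a non-BMP code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


# REGIONAL INDICATOR SYMBOL LETTER A
high, low = utf16_surrogates(0x1F1E6)
print(f"{high:04X} {low:04X}")  # D83C DDE6
```

Any candidate taken from the sets listed above would need a code point above U+FFFF for the test to keep exercising the surrogate-pair path.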


See https://drafts.csswg.org/css-text/#valdef-word-break-break-all

 

Comment 1 by kojii@chromium.org, Nov 2 2016

Interesting. UAX#14 PU (proposed update) still has RI
http://www.unicode.org/reports/tr14/proposed.html#RI

Do you happen to know why ICU decided to handle RI differently from UAX#14, or is UAX#14 PU behind ICU 58?

Comment 2 by kojii@chromium.org, Nov 2 2016

Cc: e...@chromium.org nona@chromium.org
BTW, we're trying to solve several types of "shape-across-xxx", such as shape-across-element-boundaries, and many important cases are on our radar, but shape-across-break-opportunities is even more challenging and I can't tell how well we can support it at the moment. Even if we manage to support it, it's likely to hurt layout performance since we need to re-shape after line breaking. Issue 479370 and issue 601694 have similar unresolved technical challenges.

I don't know the motivation behind the change in ICU, but I have a mild preference to tailor the rules for Blink and avoid hitting the technical challenge for regional indicators.

Comment 3 by js...@chromium.org, Nov 17 2016

> I don't know the motivation behind the change in ICU, but I have mild preference
> to tailor the rules for Blink and avoid hitting the technical challenge for 
> regional indicators.

Sorry, I don't understand what this issue has to do with what you wrote about 'shape-across-lb-opportunity'. A pair of RI code points will always stay together, so I have little clue why you brought up 'shape-across-lb-opportunity'.


> Happen to know reasons why ICU decided to handle RI differently from UAX#14, 
> or is UAX#14 PU behind ICU 58?

Because UAX#14 PU makes a lot more sense than otherwise. :-) 

ICU 58 implemented draft Unicode standards in a few places (Emoji 4.0 beta instead of Emoji 3.0, handling of confusable characters, etc.).

 


Comment 4 by js...@chromium.org, Nov 17 2016

Well, in this case, UAX 14 (the latest version) specifies the same behavior regarding RI pairs. So, it's not a matter of UAX 14 vs UAX 14 PU. 


Comment 5 by js...@chromium.org, Nov 17 2016

OK, it appears that you misunderstood what I wrote (sorry if it wasn't clear). There is NO LB opportunity between one RI and the other RI as long as both of them belong to a matching pair. There is an LB opportunity between two adjacent pairs of RIs.
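
A minimal Python sketch of the pairing described above, assuming RIs are paired from the start of each consecutive run (my reading of rule LB30a in UAX #14). The function name `ri_break_offsets` is hypothetical, and the sketch models only breaks between two RIs, not the rest of the line breaking algorithm or what ICU actually does internally.

```python
from typing import List

RI_FIRST, RI_LAST = 0x1F1E6, 0x1F1FF  # REGIONAL INDICATOR SYMBOL LETTER A..Z


def is_regional_indicator(ch: str) -> bool:
    return RI_FIRST <= ord(ch) <= RI_LAST


def ri_break_offsets(text: str) -> List[int]:
    """Return code point offsets where a break is allowed between two RIs."""
    offsets = []
    run_length = 0  # consecutive RIs seen so far in the current run
    for i, ch in enumerate(text):
        if is_regional_indicator(ch):
            # An even, non-zero count means the previous pair is complete,
            # so a break is allowed before this RI.
            if run_length > 0 and run_length % 2 == 0:
                offsets.append(i)
            run_length += 1
        else:
            run_length = 0
    return offsets


flags = "\U0001F1EF\U0001F1F5\U0001F1FA\U0001F1F8"  # J P U S: two pairs
print(ri_break_offsets(flags))  # [2]
```

The only offset reported is between the two pairs; nothing is reported inside either pair.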

Comment 6 by kojii@chromium.org, Nov 17 2016

Ah, got it. Yes, I misunderstood it that way; thank you for pointing it out.
Owner: kojii@chromium.org
Labels: -Type-Bug Type-Task