MUSICAL SYMBOL G CLEF should not have grapheme boundary in it
Issue description
While working on a change to RenderText, we noticed that the UTF-8 byte string for MUSICAL SYMBOL G CLEF '𝄞' (\xF0\x9D\x84\x9E) is allowed to be broken after \xF0. Taken on its own, \xF0 seems to be LATIN SMALL LETTER ETH 'ð', which doesn't seem like a shortened part of the full character. I don't think we should allow breaks in this grapheme. For the context of msw@'s discovery of this, see this message and the one following it: https://chromium-review.googlesource.com/c/chromium/src/+/611789#message-8c0d7f8c56730fb0d8cc71b7d20617d7a801183b
,
Aug 30 2017
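I don't think there is a grapheme boundary problem within MUSICAL SYMBOL G CLEF; this looks more like an encoding mixup. ð is U+00F0, and its UTF-8 representation is C3 B0. 𝄞 is U+1D11E, which is F0 9D 84 9E in UTF-8. The F0 there is only the lead byte of a four-byte sequence, not the code point U+00F0.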
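For anyone who wants to check the byte values above, here is a minimal Python sketch. It is only an illustration of the encodings involved, not the RenderText code path, and the variable names are made up for this example. It assumes the stray 'ð' comes from reading the lead byte \xF0 as a Latin-1 (or raw code point) value, which is a guess about where the mixup happened.

```python
# Illustrative only: checks the byte values quoted in this thread.
# The Latin-1 reinterpretation below is an assumption about the mixup,
# not something confirmed from the RenderText code.

G_CLEF = "\U0001D11E"  # MUSICAL SYMBOL G CLEF
ETH = "\u00F0"         # LATIN SMALL LETTER ETH

clef_utf8 = G_CLEF.encode("utf-8")      # four bytes: F0 9D 84 9E
eth_utf8 = ETH.encode("utf-8")          # two bytes: C3 B0
clef_utf16 = G_CLEF.encode("utf-16-be") # surrogate pair: D834 DD1E

# Reading only the lead byte of the clef as Latin-1 produces eth.
lead_as_latin1 = clef_utf8[:1].decode("latin-1")

# The lead byte alone is not valid UTF-8, so a strict decoder rejects it.
try:
    clef_utf8[:1].decode("utf-8")
    lone_lead_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_lead_is_valid_utf8 = False

print("clef UTF-8: ", clef_utf8.hex())
print("eth UTF-8:  ", eth_utf8.hex())
print("clef UTF-16:", clef_utf16.hex())
print("lead byte as Latin-1 is eth:", lead_as_latin1 == ETH)
print("lone lead byte valid UTF-8: ", lone_lead_is_valid_utf8)
```

If that assumption holds, the 'ð' is what you get from a truncated or mis-decoded byte sequence rather than from a grapheme boundary inside the character.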
,
Aug 30 2017
Sorry, I don't have any ideas about this either.
,
Feb 8 2018
I think msw@ discovered this was an encoding problem, as drott@ thought. Closing.
Comment 1 by mpear...@chromium.org, Aug 30 2017