We have various tests in Blink that exercise processing of emoji (and other non-BMP Unicode characters). Here is one of those tests as formatted by clang-format, compared with the same test with each emoji replaced by an ASCII character:
thakis@thakis:~/src/chrome/src$ cat test.cc
TEST_F(SymbolsIteratorTest, AllEmojiZWSSequences)
{
CHECK_RUNS({ { "abcdefghijklmnopqrstuvwxyzabcdefghij"
"klmnopqrstuvxyzabcdefghijklmnopqrstuvwxyzabcdefghilklmnopqrstuvwx"
"yzabcdefghijklmnopqrstuvwxyzabcdef",
FontFallbackPriority::EmojiEmoji } });
}
TEST_F(SymbolsIteratorTest, AllEmojiZWSSequences)
{
CHECK_RUNS({ { "๐๐ฉโโค๏ธโ๐โ๐จ๐จโโค๏ธโ๐โ๐จ๐ฉโโค๏ธโ๐โ๐ฉ๐๐ฉโโค๏ธโ๐จ๐จโโค๏ธโ๐จ๐ฉโโค๏ธ"
"โ๐ฉ๐ช๐จโ๐ฉโ๐ฆ๐จโ๐ฉโ๐ง๐จโ๐ฉโ๐งโ๐ฆ๐จโ๐ฉโ๐ฆโ๐ฆ๐จโ๐ฉโ๐งโ๐ง๐จโ๐จโ๐ฆ๐จโ๐จโ๐ง๐จโ๐จโ๐งโ๐ฆ๐จโ๐จโ๐ฆโ๐ฆ๐จโ๐จโ๐งโ๐ง"
"๐ฉโ๐ฉโ๐ฆ๐ฉโ๐ฉโ๐ง๐ฉโ๐ฉโ๐งโ๐ฆ๐ฉโ๐ฉโ๐ฆโ๐ฆ๐ฉโ๐ฉโ๐งโ๐ง๐โ๐จ",
FontFallbackPriority::EmojiEmoji } });
}
thakis@thakis:~/src/chrome/src$ buildtools/linux64/clang-format test.cc
TEST_F(SymbolsIteratorTest, AllEmojiZWSSequences) {
CHECK_RUNS(
{{"abcdefghijklmnopqrstuvwxyzabcdefghij"
"klmnopqrstuvxyzabcdefghijklmnopqrstuvwxyzabcdefghilklmnopqrstuvwx"
"yzabcdefghijklmnopqrstuvwxyzabcdef",
FontFallbackPriority::EmojiEmoji}});
}
TEST_F(SymbolsIteratorTest, AllEmojiZWSSequences) {
CHECK_RUNS(
{{"๐๐ฉโโค๏ธโ๐โ๐จ๐จโโค๏ธโ๐โ๐จ๐ฉโโค๏ธโ๐โ๐ฉ๐๐ฉโโค๏ธโ๐จ๐จโโค๏ธ"
"โ๐จ๐ฉโโค๏ธ"
"โ๐ฉ๐ช๐จโ๐ฉโ๐ฆ๐จโ๐ฉโ๐ง๐จโ๐ฉโ๐งโ๐ฆ๐จโ๐ฉโ๐ฆโ๐ฆ๐จโ๐ฉโ๐งโ๐ง๐จโ๐จโ"
"๐ฆ๐จโ๐จโ๐ง๐จโ๐จโ๐งโ๐ฆ๐จโ๐จโ๐ฆโ๐ฆ๐จโ๐จโ๐งโ"
"๐ง"
"๐ฉโ๐ฉโ๐ฆ๐ฉโ๐ฉโ๐ง๐ฉโ๐ฉโ๐งโ๐ฆ๐ฉโ๐ฉโ๐ฆโ๐ฆ๐ฉโ๐ฉโ๐งโ๐ง๐โ"
"๐จ",
FontFallbackPriority::EmojiEmoji}});
}
Note all the pointless line breaks in the emoji version.
What's worse, if I run `clang-format -i test.cc` repeatedly, each run moves a single character from the end of the first string onto its own line, so that I eventually end up with:
TEST_F(SymbolsIteratorTest, AllEmojiZWSSequences) {
CHECK_RUNS(
{{"๐๐ฉโโค๏ธโ๐โ๐จ๐จโโค๏ธโ๐โ๐จ๐ฉโโค"
"๏ธ"
"โ"
"๐"
"โ"
"๐ฉ"
"๐"
"๐ฉ"
"โ"
"โค"
"๏ธ"
"โ"
"๐จ"
"๐จ"
"โ"
"โค"
"๏ธ"
"โ๐จ๐ฉโโค๏ธ"
"โ๐ฉ๐ช๐จโ๐ฉโ๐ฆ๐จโ๐ฉโ๐ง๐จโ๐ฉโ๐งโ"
"๐ฆ"
"๐จ"
"โ"
"๐ฉ"
"โ"
"๐ฆ"
"โ"
"๐ฆ"
"๐จ"
"โ"
"๐ฉ"
"โ"
"๐ง"
"โ"
"๐ง"
"๐จ"
"โ"
"๐จ"
"โ"
"๐ฆ๐จโ๐จโ๐ง๐จโ๐จโ๐งโ๐ฆ๐จโ๐จโ๐ฆโ"
"๐ฆ"
"๐จ"
"โ"
"๐จ"
"โ"
"๐ง"
"โ"
"๐ง"
"๐ฉโ๐ฉโ๐ฆ๐ฉโ๐ฉโ๐ง๐ฉโ๐ฉโ๐งโ๐ฆ๐ฉโ"
"๐ฉ"
"โ"
"๐ฆ"
"โ"
"๐ฆ"
"๐ฉ"
"โ"
"๐ฉ"
"โ"
"๐ง"
"โ"
"๐ง"
"๐"
"โ"
"๐จ",
FontFallbackPriority::EmojiEmoji}});
}
That seems to be the fixpoint: once the file is in that form, clang-format stops changing it.
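To count how many runs it takes to reach a stable form, the repeated runs can be modeled as a loop that applies a formatter until its output stops changing. The following is a minimal sketch under stated assumptions, not clang-format's code: `FixpointResult`, `FormatToFixpoint`, and `PeelOneChar` are hypothetical names, and `PeelOneChar` is only a toy stand-in that mimics the one-character-per-run behavior on plain ASCII text.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Outcome of repeatedly applying a formatter to a piece of text.
struct FixpointResult {
  std::string text;    // last text produced
  std::size_t passes;  // number of passes that changed the text
  bool converged;      // true if a pass left the text unchanged
};

// Applies `format` until its output equals its input (a fixpoint) or until
// `max_passes` changing passes have been made.
FixpointResult FormatToFixpoint(
    std::string text,
    const std::function<std::string(const std::string&)>& format,
    std::size_t max_passes) {
  std::size_t passes = 0;
  while (passes < max_passes) {
    std::string next = format(text);
    if (next == text)
      return {std::move(text), passes, true};
    text = std::move(next);
    ++passes;
  }
  return {std::move(text), passes, false};
}

// Hypothetical toy formatter (an assumption for illustration only): each
// pass moves the last character of the first line onto its own line, and
// stops once the first line is three characters long or shorter.
std::string PeelOneChar(const std::string& text) {
  const std::size_t newline = text.find('\n');
  const std::size_t first_len =
      newline == std::string::npos ? text.size() : newline;
  if (first_len <= 3)
    return text;
  std::string out = text.substr(0, first_len - 1);
  out += '\n';
  out += text[first_len - 1];
  out += text.substr(first_len);
  return out;
}
```

With the toy formatter, `FormatToFixpoint("abcdef", PeelOneChar, 100)` converges after three passes that changed the text. For the real case, `format` would have to wrap an invocation of clang-format on the file contents, which this sketch does not attempt.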