New issue
Advanced search Search tips

Issue 906369 link

Starred by 4 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Sign in to add a comment

Chrome incorrectly capitalizes if there's an abspos between two inline nodes.

Project Member Reported by emilio@chromium.org, Nov 17

Issue description

What steps will reproduce the problem?
(1) Open the attached example.

What is the expected result?

"blog" isn't capitalized, since it's part of the same inline run and there's no whitespace between them.

What happens instead?

Blog is capitalized.

Remove the pseudo-element for the correct rendering. Also, Blink fails to handle the removal of the pseudo-element correctly, and doesn't uncapitalize the word.
 
t.html
190 bytes View Download
Labels: Needs-Feedback
NextAction: 2018-11-26
Please provide a link to the spec that suggests this behavior. 

https://drafts.csswg.org/css-text-3/#text-transform-property states that "For capitalize, what constitutes a “word“ is UA-dependent"
The NextAction date has arrived: 2018-11-26
NextAction: 2018-12-03
@emilio - your feedback is needed.
Labels: -Needs-Feedback
Ah, really sorry, had missed this.

> https://drafts.csswg.org/css-text-3/#text-transform-property states that "For capitalize, what constitutes a “word“ is UA-dependent"

I suspect that's meant to decide which algorithm is used for word-breaking, but I don't think it should be affected by content out of flow. In any case,  Blink is inconsistent here as I mentioned in the report. It's really weird that it considers:

  <a href="#">home</a><span style="position: absolute"></span><a href="#">blog</a>

To be two words, but only one if you remove the position: absolute declaration.

Plus as mentioned above Blink doesn't handle dynamically moving the abspos into the flow again correctly (un-capitalizing the word), so I suspect this is not really intended.
Cc: mstensho@chromium.org e...@chromium.org
NextAction: ----
Status: Available (was: Untriaged)
Thanks for your response! Yes, you're right. This is a plain bug. An even weirder thing is that if you add some content (and not just '') inside the abspos pseudo, the second A element doesn't get capitalized - as long as the content doesn't end with whitespace.

We don't need to use pseudo elements to reproduce the problem, though. Attaching something slightly simpler, that just puts an abspos with no content between two text nodes not separated by any kind of whitespace. If we add something to the abspos (that doesn't end with whitespace), the second text node won't get capitalized anymore.

I suspect LayoutText::PreviousCharacter(), added in 2007: c7a6874e4d8212e6df186eab2749441463a5da72

Also fails in LayoutNG.
tc.html
204 bytes View Download
Yeah, this cat get fixed by fixing PreviousCharacter().

We could land a fix like this: https://chromium-review.googlesource.com/c/chromium/src/+/1350920

But I'm sure I'll break something. Walking backwards in the tree like this is tricky. It seems that also LayoutNG uses this code at the moment. However, we'll have to rewrite it for NG at some point, presumably in a way that walks forwards in the tree, so that we can easily tell if something should be capitalized or not.

I think I'd prefer to keep PreviousCharacter() as it is, unless someone is willing to deal with any fallout of modifying it. Let's do it right for NG instead.

Are there no WPT test for this? Looks like my CL didn't make anything additional pass.
Cc: kojii@chromium.org
Interesting, thank you. Yeah, NG relies on legacy for text-transform atm, so we have the same behavior today.

When we re-write text-transform in NG, however, I'm not sure what the expected behavior is.

<div>12345<span style="position: absolute"></span>67890<div>

has a break opportunity between "5" and "6" on Blink and Gecko. NG implements the same behavior; we handle abspos as an object replacement character for the text processing purpose.

So...is this a word boundary for 'text-transform: capitalize' or not? I don't know, I might try what ICU would do for the ORC.
I don't know either, but what's *inside* the abspos surely shouldn't make a difference, at least.
Looking at our legacy logic of text-transform, we:
1. Transform nbsp to ' '.
2. Then use word break iterator in ICU.
so I guess ORC should be a word separator?
> what's *inside* the abspos surely shouldn't make a difference

Agree.
Oh, I submitted https://github.com/web-platform-tests/wpt/pull/14258 before reading the latest comments.

Feel free to hold off on landing that, or I could tweak the test to add some invisible non-whitespace text in the abspos to test that the content inside the abspos makes no difference.
Owner: mstensho@chromium.org
That test is great anyway - thank you! It fails in Chrome.

I'm temporarily assigning this one to myself, to make sure that I remember to catch the test failure (and do some paperwork in TestExpectations) when the new test gets imported.

       Break Capitalize
Edge     N        N
Gecko    Y        N
WebKit   Y        Y
Blink    Y        Y

I personally prefer Capitalize and break opportunities match, but looks like you two are strong believer. I'm ok to add a hack to NG.
Oh, I'm not. I do realize that Emilio's incoming test will expect us to ignore the abspos as far as capitalization is a concerned, though.

Does the spec need any clarification here? Is Edge wrong, BTW? It pretends that the abspos isn't there at all, both for capitalization and line break opportunities? That does make sense to me. If two adjacent text nodes with no whitespace don't produce a break opportunity, inserting an out-of-flow element between them shouldn't make any difference. It would be like inserting a display:none node there, in my mind. But my mind is no line breaking expert. :)
I agree with that, I think Edge is Right.

I was recently doing some Gecko work related to floats in:

  https://bugzilla.mozilla.org/show_bug.cgi?id=488725

And I stumbled with OOFs being break opportunities in Gecko, see the discussion in:

  https://bugzilla.mozilla.org/show_bug.cgi?id=1485764

I don't think we would oppose to change whether OOFs constitute breaking opportunities, I think it's basically an accident, and we have some exceptions for that, like for floating first-letter already, if I'm reading the code right.
I see, thank you for clarifying, that helps.

I agree with that too, Edge makes the most sense. NG tried not to emit ORC but could not pass existing tests without doing so. Maybe we can try if this change is web compat once NG is stabilized.
While I was on train back to home, I was thinking why author did that (I suppose this is from a real site failure?)

```
  a::after {
    position: absolute;
    content: '';
  }
```
This is really no-op if abspos does not create break/capitalize boundaries, so I guess the intention is this a technique the author developed to create the boundary without changing HTML. Probably what he should have done in more interoperable and well-defined way is:
```
  a {
    display: inline-block;
  }
```
but he chose this one. I don't know how common this kind of technique is, but web-compat challenge is probably not zero.
I think you may be right, that it's likely that someone is using this technique to create an artificial word boundary or break opportunity. But Edge seems to get away with not "supporting" it, right?

If the specs aren't clear on this topic, it should be clarified.
FWIW the pseudo-element was there for some borders and backgrounds, not for the sake of creating a word boundary. I just reduced it.

The original report is https://bugzilla.mozilla.org/show_bug.cgi?id=1508066, which was from a developer who has fixed the website already, fwiw.
Thank you for the info, it is helpful, Emilio. I'm not against, but just mixed and putting my opinion on hold for now.

I wish the consistency between break opportunities and capitalize word boundaries, and break opportunities is a bit unknown until we challenge the web compat. If we succeed, that's good. If we fail, we have a choice; we give up the consistency, or we give up fixing this.
One thing I remember while reviewing someone's patch today; we needed ORC for abspos because the static position needs an anchor object to represent it while bidi-reordering. Maybe we can do that without ORC, haven't really gave enough thoughts on it, but it's probably related.
Cc: robertma@chromium.org
 Issue 909598  has been merged into this issue.
https://github.com/w3c/csswg-drafts/issues/3518

“Out-of-flow elements and inline element boundaries must not introduce a text-transform word boundary and must be ignored when determining such word boundaries.”
Project Member

Comment 25 by bugdroid1@chromium.org, Jan 15

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3c7813554abb55e4f2df533a828220f09f4983aa

commit 3c7813554abb55e4f2df533a828220f09f4983aa
Author: Morten Stenshorne <mstensho@chromium.org>
Date: Tue Jan 15 10:29:56 2019

text-transform-capitalize-033.html has a dedicated bug number.

TBR=eae@chromium.org,kojii@chromium.org

Bug: 906369
Change-Id: Ie49608c17017d3db22ce525eb14609d9a79f7247
Reviewed-on: https://chromium-review.googlesource.com/c/1411335
Reviewed-by: Morten Stenshorne <mstensho@chromium.org>
Commit-Queue: Morten Stenshorne <mstensho@chromium.org>
Cr-Commit-Position: refs/heads/master@{#622798}
[modify] https://crrev.com/3c7813554abb55e4f2df533a828220f09f4983aa/third_party/blink/web_tests/TestExpectations

Owner: ----
I've now (finally) done what I promised in #c13 - so unassigning.

Sign in to add a comment