
Issue 611970

Starred by 1 user

Issue metadata

Status: WontFix
Owner: ----
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug




[webfonts] Ligatures missing when viewing generated html file

Reported by rajasuba...@gmail.com, May 14 2016

Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36

Steps to reproduce the problem:
1. Open the attached HTML page.
2. Character pairs such as `ti` and `tt` are missing from the rendered text.
3. The characters are present in the page source (please check with Inspect Element).

What is the expected behavior?
The characters should appear when the HTML is viewed.

What went wrong?
Character pairs such as `ti`, `tt` and `ff` are skipped when the file is rendered. This works in Firefox, whereas in Safari and Chrome the characters are missing.

Did this work before? No 

Chrome version: 50.0.2661.94  Channel: n/a
OS Version: OS X 10.11.0
Flash Version: Shockwave Flash 21.0 r0

Please help me sort this out.
 
missing_font_issue.html
551 KB
Labels: Needs-Feedback
Thanks for your report. Do you have a smaller test case? This one is over 500KB of CSS + javascript, so it's difficult to tell if this is a problem with Chrome or the page itself.
I am converting my PDF files to HTML with pdf2htmlEX so that I can render them on my site, and that is where I get this missing-character issue.

I use the following command to convert the PDF to HTML: `pdf2htmlEX --split-pages 1 --zoom 3 --fit-width 920 --dest-dir $1 $2`
quarta_parte_colori.pdf
3.5 MB
Project Member

Comment 3 by sheriffbot@chromium.org, May 20 2016

Labels: -Needs-Feedback Needs-Review
Owner: ellyjo...@chromium.org
Thank you for providing more feedback. Adding requester "ellyjones@chromium.org" for another review and adding "Needs-Review" label for tracking.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 4 Deleted

Thanks a lot sheriffbot for your prompt response. Would like to hear from you soon.
Labels: -Needs-Review Needs-Feedback
Owner: ----
When I try loading missing_font_issue.html I get a blank page with a rotating PDF icon.

Please try narrowing the problem down in the html file to a smaller test case, and please see if the same problem occurs in Safari and Firefox.

Components: -UI Infra>Client>Pdfium
Is this particular to PDF files?

I tested quarta parte colori, and it works on my machine (I can find "ff", etc), M53. Was this recently fixed in pdfium?

I suspect that this has to do with the way ligatures are handled (see https://en.wikipedia.org/wiki/Typographic_ligature).

Hi,

Thanks for writing to me. Yes, I think the issue is due to ligatures, so I
tried turning ligatures off by default. [Please refer to my post on
Stack Overflow:
http://stackoverflow.com/questions/37226949/characters-like-tt-ti-ff-in-my-html-page-disappears-while-viewing-with
].

But it would be more helpful if you could explain why Chrome behaves
this way.

1. Character pairs like `tt`, `tf` and `ti` are treated as ligatures only
some of the time. I couldn't work out the exact scenario in which they are
read as ligatures. Is this font specific?
2. How does Chrome identify these character sequences as ligatures?

Please help me understand this.

Thanks and Regards,
Rajasuba S.
Project Member

Comment 9 by sheriffbot@chromium.org, Jul 2 2016

Labels: -Needs-Feedback Needs-Review
Owner: shrike@chromium.org
Thank you for providing more feedback. Adding requester "shrike@chromium.org" for another review and adding "Needs-Review" label for tracking.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: shrike@chromium.org
Components: -Infra>Client>Pdfium Blink>Fonts Blink>Layout
Labels: -Needs-Review Needs-Feedback
Owner: ----
rajasuba.suba@ - can you send a reproducible case, or steps to reproduce? When opening your missing_font_issue.html file in Chrome and Safari it never gets to the point of displaying rendered text, only a spinning PDF logo.
Hi,

I am sorry for the inconvenience. The HTML page is just an index of the
pages; because the `.page` files were missing, the HTML did not render
properly. I've shared the following folder
<https://drive.google.com/open?id=0B1N73zJVTNRnTnFITHNENExOZkE>, which
contains the PDF converted to HTML together with the pages I render on my
site, and in which the issue reproduces.

The reported issue can be reproduced by opening the HTML file in that
folder in Chrome or Safari.

*OS Details*: OS X El Capitan, version 10.11
*Browser Details:*  Google Chrome (Version 51.0.2704.103 (64-bit))

Please contact me if you have any further problems reproducing the
issue.

PS: If you're still seeing a spinning PDF logo, please try serving the HTML
page from a local server (for example by running `python -m SimpleHTTPServer`
in that folder from your terminal). For security reasons, Chrome and Safari
do not load the files a local page references, even when they are in the
same folder.
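
Note that `SimpleHTTPServer` is the Python 2 module name; on Python 3 the equivalent command is `python3 -m http.server`. A minimal sketch of the same thing from a script, in case that is easier to run (the `make_server` helper name and the default port 8000 are my own choices, not anything from the converter):

```python
import functools
import http.server


def make_server(directory, port=8000):
    """Build a localhost-only HTTP server that serves files from `directory`.

    The caller starts it with serve_forever() and stops it with Ctrl+C.
    Passing port=0 lets the OS pick any free port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server(".").serve_forever()` from the shared folder and then opening http://127.0.0.1:8000/ should let the page load its `.page` files.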

Thanks and Regards,
Rajasuba S.
Project Member

Comment 12 by sheriffbot@chromium.org, Jul 8 2016

Labels: -Needs-Feedback Needs-Review
Owner: shrike@chromium.org
Thank you for providing more feedback. Adding requester "shrike@chromium.org" for another review and adding "Needs-Review" label for tracking.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Needs-Review Needs-Feedback
I downloaded the folder and opened the HTML file but still get the spinning PDF logo, in both Chrome and Safari. No text actually renders, so these materials do not reproduce the problem.

Hi,

The HTML page is just an index of the .page files contained in that
folder. Have you tried serving the HTML file from a local server, such as
the Python one?

Kindly try serving it with Python by running the above command. If you
still face the issue, I'm ready to help you with debugging.

Thanks and Regards,
Rajasuba S.
Hello,

Rather than having us try to host the file from a server, it would be better if you:

1. Host the HTML yourself
2. Save the page (File -> Save Page As...)
3. Confirm that the saved page reproduces the problem
4. Attach the saved page to this bug report


Yes, sure! I have attached the saved page.

Thanks and Regards,
Rajasuba S.
Project Member

Comment 17 by sheriffbot@chromium.org, Jul 10 2016

Labels: -Needs-Feedback Needs-Review
Thank you for providing more feedback. Adding "Needs-Review" label for tracking.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Needs-Review Needs-Feedback
Did you intend to attach it but forgot? I don't see an attachment.

I have attached the file. Please check.
NewlySavedHTMLFile.htm
741 KB
I had attached it in the mail thread, but it was not listed in the portal.
Please try now.
Project Member

Comment 21 by sheriffbot@chromium.org, Jul 11 2016

Labels: -Needs-Feedback Needs-Review
Thank you for providing more feedback. Adding "Needs-Review" label for tracking.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Components: -Blink>Layout
Labels: -Needs-Review -OS-Mac OS-All
Status: Untriaged (was: Unconfirmed)
Thank you very much for this.

Steps to reproduce:

1. Open the html file attached to comment 19 in a browser window.

What is the expected behavior?
The second line of text should read, "Please fill in the fields below to begin DocuSign API Certification." 

What went wrong?
Instead the text reads, "Please ll in the elds below to begin DocuSign API Cercaon." The ligatures "fi" and "ti" are missing.

Owner: ----

Comment 24 by e...@chromium.org, Jul 12 2016

Cc: behdad@google.com drott@chromium.org e...@chromium.org
Status: Available (was: Untriaged)
Summary: [webfonts] Ligatures missing when viewing generated html file (was: Characters like tt ti fi are missing while viewing my html file)
It appears that the webfont has a corrupt ligature table. With any other font, the text renders correctly.

Behdad, is this something we could detect, and perhaps disable ligatures in these cases?
Hi,

I have made a temporary fix by disabling ligatures, and yes, it works well with ligatures disabled.

But I would like to know the following:

1. What is a ligature? I have read the Wikipedia article a thousand times but still couldn't get a clear idea.
2. Why do fonts have ligatures? What is the relationship between the two?
3. What is the advantage of having ligatures?
4. What is a ligature table?
5. Is this font dependent?
6. How and why does the browser detect these character sequences as ligatures?
7. How could a ligature table become corrupted? How can I get it fixed?

Moreover, you said that it works fine with other fonts. I have come across cases using the same fonts where the issue does reproduce.

Kindly clarify.

Thanks and Regards,
Rajasuba S.


Ligatures are a way for font creators to tweak the appearance of certain character sequences to make them more visually pleasing. For example, for a certain font the "fi" character sequence may be most attractive by placing the i so close to the f that the dot over the i overlaps the curly tip of the f. You can't get that spacing with the normal f and i glyphs so the font includes a special "fi" glyph. Text engines, when they encounter the "fi" character sequence, can display the single "fi" glyph instead of two separate glyphs.
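
To make the substitution step concrete, here is a toy Python sketch. It is not how Blink or HarfBuzz actually implement shaping; the `LIGATURES` mapping and glyph names such as `f_i` are made up for illustration, and a real font stores this information in binary tables.

```python
# Toy model of ligature substitution; the glyph names are hypothetical.
LIGATURES = {
    ("f", "f", "i"): "f_f_i",
    ("f", "i"): "f_i",
    ("f", "f"): "f_f",
    ("t", "i"): "t_i",
    ("t", "t"): "t_t",
}


def shape(text, ligatures=LIGATURES):
    """Turn characters into glyph names, preferring the longest ligature."""
    longest = max((len(seq) for seq in ligatures), default=1)
    glyphs = []
    i = 0
    while i < len(text):
        # Try the longest candidate sequence first, down to pairs.
        for size in range(longest, 1, -1):
            seq = tuple(text[i:i + size])
            if len(seq) == size and seq in ligatures:
                glyphs.append(ligatures[seq])
                i += size
                break
        else:
            # No ligature starts here: use the plain glyph for this character.
            glyphs.append(text[i])
            i += 1
    return glyphs
```

With this mapping, `shape("fill")` yields one `f_i` glyph followed by two `l` glyphs, which is the single-glyph replacement described above.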

I assume a ligature table is a table within the font that tells users of the font what ligatures it has and the character sequences the ligatures map to. I don't know how a table could get corrupted, but I can imagine it being encoded improperly. You would only notice on a system that supports ligatures (and has ligatures turned on).

Comment 27 by behdad@google.com, Jul 17 2016

Looks like a bad font subsetter... This might be the one and only case I've seen so far where OTS in Firefox is catching a broken GSUB table and discarding it, whereas we decided to disable OTS for GSUB/GPOS (since it got in the way more often).  Again, if it's a font bug, I'm fine saying WAI / Garbage-In-Garbage-Out.
Where can I learn about the glyph substitution (GSUB) table? I would like to be able to detect when the table appears corrupted, so that I can skip ligatures in that case.
Status: ExternalDependency (was: Available)
The PDF rajasuba.suba@ provided on Drive in #11 contains a broken subsetted Calibri Bold font. Extracting it straight from the PDF file using FontForge and trying to shape text with HarfBuzz's hb-view results in empty output, with no .notdef glyphs or anything else.

What I assume is happening: the font seems to have been subsetted down to the glyphs actually used without pruning the GSUB lookups, so ligature substitutions point to glyphs that are no longer present in the font's glyf table.

It could probably be detected in HarfBuzz by doing additional glyf table lookups to see whether ligature target glyphs are actually present in the font file, otherwise flagging GSUB as broken.
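
As a rough illustration of that check, here is a toy Python model. It does not use HarfBuzz or read a real font file; `SUBSET_GLYPHS`, `GSUB_LIGATURES` and both function names are assumptions made up for this sketch, and dropping a missing glyph silently is just my way of mimicking the symptom seen in the report.

```python
# Toy model only: real fonts store this data in binary GSUB and glyph tables.
# Glyphs the (hypothetical) subsetter kept: one per character actually used.
SUBSET_GLYPHS = set("Please fill in the fields")
# Ligature rules left behind by the subsetter; their targets were not kept.
GSUB_LIGATURES = {("f", "i"): "f_i", ("t", "i"): "t_i"}


def dangling_ligatures(ligatures, glyphs):
    """Return the ligature rules whose target glyph is absent from the font."""
    return {seq: target for seq, target in ligatures.items()
            if target not in glyphs}


def render(text, ligatures, glyphs):
    """Apply two-character ligatures, then drop any glyph the font lacks."""
    out = []
    i = 0
    while i < len(text):
        glyph = ligatures.get(tuple(text[i:i + 2]))
        if glyph is not None:
            i += 2
        else:
            glyph = text[i]
            i += 1
        if glyph in glyphs:
            out.append(glyph)
    return "".join(out)
```

In this model, rendering "Please fill in the fields" with the leftover rules gives "Please ll in the elds", while rendering with the dangling rules removed gives the full text, so flagging a table where `dangling_ligatures` is non-empty would be enough to catch this case.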

I am attaching the broken subsetted Calibri file.

I am going to file a HarfBuzz issue for it. On the Chrome side, this is a WontFix, or ExternalDependency if Behdad considers it worthwhile catching this case.

Alternatively, pdf2htmlEX should probably output "font-feature-settings: 'liga' 0;" for PDF with subsetted font files so that no additional ligature processing is done.

Example hb-view call:
$ hb-view Calibri-Bold.otf "fifiti"
produces empty output,

whereas:
$ hb-view --features=-liga Calibri-Bold.otf "fifiti"
produces the correctly rendered string.

Similarly, adding font-feature-settings: 'liga' 0; to disable ligature lookups in the generated HTML makes the characters appear.
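
For anyone who needs to patch already generated pages, a small Python sketch of that workaround follows. The `disable_ligatures` helper and the blanket `*` selector are assumptions for illustration; pdf2htmlEX output may need a more specific selector if its own rules set font features.

```python
# Style rule that switches off standard ligatures for every element.
LIGA_OFF_STYLE = "<style>* { font-feature-settings: 'liga' 0; }</style>"


def disable_ligatures(html):
    """Return `html` with the ligature-disabling rule added to its head."""
    index = html.lower().find("</head>")
    if index == -1:
        # No head section found: prepend the rule, which browsers tolerate.
        return LIGA_OFF_STYLE + html
    return html[:index] + LIGA_OFF_STYLE + html[index:]
```

Reading the generated file, passing its contents through `disable_ligatures`, and writing the result back should be enough to try this on one page.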

Calibri-Bold_broken_subset.otf
261 KB
Notified the PDF converter project as well: https://github.com/coolwanglu/pdf2htmlEX/issues/710
The PDF contains the line
/Producer (PDFKit.NET 4.0.40.0)
from https://www.tallcomponents.com/contact - this generator may be what performs the incorrect subsetting.

Comment 33 by e...@chromium.org, May 21 2018

Status: WontFix (was: ExternalDependency)
Closing as infeasible as per harfbuzz issue in comment 30.
