[webfonts] Ligatures missing when viewing generated html file
Reported by
rajasuba...@gmail.com,
May 14 2016
Issue description
UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36

Steps to reproduce the problem:
1. Open the html page.
2. Character pairs such as `ti` and `tt` are missing from the rendered text.
3. The characters are present in the page itself (please check with Inspect Element).

What is the expected behavior?
The characters should appear when viewing the html.

What went wrong?
Character pairs such as `ti`, `tt` and `ff` are skipped when viewing the file. This works well in Firefox, whereas in Safari and Chrome the characters are missing.

Did this work before? No

Chrome version: 50.0.2661.94 Channel: n/a
OS Version: OS X 10.11.0
Flash Version: Shockwave Flash 21.0 r0

Please help me sort this out.
,
May 19 2016
Actually, I am converting my PDF files to HTML with pdf2htmlEX so that I can render them on my site, and that is where the missing characters show up. I use the following command for the conversion: `pdf2htmlEX --split-pages 1 --zoom 3 --fit-width 920 --dest-dir $1 $2`
,
May 20 2016
Thank you for providing more feedback. Adding requester "ellyjones@chromium.org" for another review and adding "Needs-Review" label for tracking. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
May 21 2016
Thanks a lot sheriffbot for your prompt response. Would like to hear from you soon.
,
Jun 4 2016
When I try loading missing_font_issue.html I get a blank page with a rotating PDF icon. Please try narrowing the problem down in the html file to a smaller test case, and please see if the same problem occurs in Safari and Firefox.
,
Jul 1 2016
Is this particular to PDF files? I tested quarta parte colori, and it works on my machine (I can find "ff", etc), M53. Was this recently fixed in pdfium? I suspect that this has to do with the way that ligature is handled (see https://en.wikipedia.org/wiki/Typographic_ligature).
,
Jul 2 2016
Hi, Thanks for writing to me. Yes, I think the issue is due to ligatures, so I tried turning ligatures off by default. [Please refer to my post on Stack Overflow: http://stackoverflow.com/questions/37226949/characters-like-tt-ti-ff-in-my-html-page-disappears-while-viewing-with ]. It would help me more if you could explain why Chrome behaves this way. 1. Character pairs like `tt`, `tf` and `ti` are treated as ligatures only some of the time. I couldn't work out the exact scenario in which they are read as ligatures. Is this font specific? 2. How does Chrome identify these character pairs as ligatures? Please help me understand this. Thanks and Regards, Rajasuba S.
,
Jul 2 2016
Thank you for providing more feedback. Adding requester "shrike@chromium.org" for another review and adding "Needs-Review" label for tracking. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 6 2016
rajasuba.suba@ - can you send a reproducible case, or steps to reproduce? When opening your missing_font_issue.html file in Chrome and Safari it never gets to the point of displaying rendered text, only a spinning PDF logo.
,
Jul 8 2016
Hi, I am sorry for the inconvenience caused. The HTML page is just the index of the pages. Since the `.page` files were missing, the HTML was not rendered properly. I've shared the following folder <https://drive.google.com/open?id=0B1N73zJVTNRnTnFITHNENExOZkE> containing the PDF, the HTML it was converted into, and the pages that I render on my site, in which the above issue reproduces. The reported issue can be reproduced by just opening the html file from that folder in Chrome/Safari. *OS Details*: OS X El Capitan Version 10.11 *Browser Details:* Google Chrome (Version 51.0.2704.103 (64-bit)) Kindly contact me if you face any further problem reproducing the issue. PS: If you're still seeing a spinning PDF logo, please try hosting the html page on a server (for example a Python server, by running `python -m SimpleHTTPServer` from your terminal), since Chrome and Safari do not read the files referenced by the page, even though they are in the same folder, for security reasons. Thanks and Regards, Rajasuba S.
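Note that `python -m SimpleHTTPServer` is the Python 2 spelling; on Python 3 the corresponding module is `http.server`. The following is a minimal sketch of serving the converted folder locally with Python 3.7 or later. The helper names and the choice of binding to 127.0.0.1 are illustrative assumptions, not part of the report.

```python
import functools
import http.server
import threading


def make_server(directory, port=0):
    """Build an HTTP server for `directory`; port 0 lets the OS pick a free port."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


def serve_in_background(directory, port=0):
    """Start serving `directory` on a daemon thread and return the server object."""
    server = make_server(directory, port)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `serve_in_background("path/to/converted/folder")` and reading `server.server_address[1]` gives the port to open in the browser; `server.shutdown()` stops it again.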
,
Jul 8 2016
Thank you for providing more feedback. Adding requester "shrike@chromium.org" for another review and adding "Needs-Review" label for tracking. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 8 2016
I downloaded the folder and opened the HTML file but still get the spinning PDF logo, in both Chrome and Safari. No text actually renders, so these materials do not reproduce the problem.
,
Jul 9 2016
Hi, The html page is just the index of the .page files contained in that folder. Have you tried hosting the html file on a local server, such as the Python one? Kindly try hosting it by running the above command. If you still face the issue, I'm ready to help you with debugging. Thanks and Regards, Rajasuba S.
,
Jul 9 2016
Hello, Rather than having us try to host the file from a server, it would be better if you: 1. Host the HTML yourself 2. Save the page (File -> Save Page As...) 3. Confirm that the saved page reproduces the problem 4. Attach the saved page to this bug report
,
Jul 9 2016
Yeah, sure !! Have attached the same. Thanks and Regards, Rajasuba S.
,
Jul 10 2016
Thank you for providing more feedback. Adding "Needs-Review" label for tracking. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 11 2016
Did you intend to attach it but forgot? I don't see an attachment.
,
Jul 11 2016
Have attached file. Please check the same.
,
Jul 11 2016
I've attached in the mail thread. But it was not listed in the portal. Please try now.
,
Jul 11 2016
Thank you for providing more feedback. Adding "Needs-Review" label for tracking. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 11 2016
Thank you very much for this. Steps to reproduce: 1. Open the html file attached to # 19 in a browser window. What is the expected behavior? The second line of text should read, "Please fill in the fields below to begin DocuSign API Certification." What went wrong? Instead the text reads, "Please ll in the elds below to begin DocuSign API Cercaon." The ligatures "fi" and "ti" are missing.
,
Jul 11 2016
,
Jul 12 2016
It appears that the webfont has a corrupt ligature table. With any other font it renders correctly. Behdad, is this something we could detect, and perhaps disable ligatures in these cases?
,
Jul 13 2016
Hi, I have made a temporary fix by disabling ligatures, and yes, it works well with ligatures disabled. But I would like to know the following:
1. What is a ligature? I have read the Wikipedia article many times but still couldn't get a clear idea of what a ligature is.
2. Why do fonts have ligatures? What is the relationship between the two?
3. What is the advantage of having ligatures?
4. What is a ligature table?
5. Is this font dependent?
6. How and why does the browser detect these character pairs as ligatures?
7. How could a ligature table become corrupted? How can I get it fixed?
Moreover, you said that it works fine with other fonts. I have come across cases with the same fonts where the issue does reproduce. Kindly clarify this for me. Thanks and Regards, Rajasuba S.
,
Jul 13 2016
Ligatures are a way for font creators to tweak the appearance of certain character sequences to make them more visually pleasing. For example, for a certain font the "fi" character sequence may be most attractive by placing the i so close to the f that the dot over the i overlaps the curly tip of the f. You can't get that spacing with the normal f and i glyphs so the font includes a special "fi" glyph. Text engines, when they encounter the "fi" character sequence, can display the single "fi" glyph instead of two separate glyphs. I assume a ligature table is a table within the font that tells users of the font what ligatures it has and the character sequences the ligatures map to. I don't know how a table could get corrupted, but I can imagine it being encoded improperly. You would only notice on a system that supports ligatures (and has ligatures turned on).
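To make the substitution step concrete, here is a toy sketch in Python. It is a simplified model only: a real text engine works on glyph IDs through the font's substitution tables rather than on character strings, and the table contents and glyph names below are hypothetical.

```python
def shape(text, ligatures, use_ligatures=True):
    """Map characters to glyph names, replacing known sequences with ligature glyphs.

    `ligatures` maps a character sequence (e.g. "fi") to a glyph name.
    Longer sequences are tried first so that "ffi" wins over "ff" or "fi".
    """
    sequences = sorted(ligatures, key=len, reverse=True) if use_ligatures else []
    glyphs = []
    i = 0
    while i < len(text):
        for seq in sequences:
            if text.startswith(seq, i):
                glyphs.append(ligatures[seq])  # one glyph replaces the sequence
                i += len(seq)
                break
        else:
            glyphs.append(text[i])  # no ligature here: one glyph per character
            i += 1
    return glyphs


# Hypothetical ligature table for illustration.
LIGATURES = {"fi": "f_i", "ff": "f_f", "ffi": "f_f_i"}

print(shape("office", LIGATURES))
print(shape("office", LIGATURES, use_ligatures=False))
```

With ligatures on, "ffi" in "office" becomes the single `f_f_i` glyph; with them off, every character keeps its own glyph.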
,
Jul 17 2016
Looks like a bad font subsetter... This might be the one and only case I've seen so far where OTS in Firefox is catching a broken GSUB table and discarding it, whereas we decided to disable OTS for GSUB/GPOS (since it got in the way more often). Again, if it's a font bug, I'm fine saying WAI / Garbage-In-Garbage-Out.
,
Jul 19 2016
Is there somewhere I can learn about the glyph substitution table, so that I can work out when to skip ligatures because the table appears to be corrupted?
,
Mar 9 2017
The PDF rajasuba.suba@ provided on Drive in #11 contains a broken subsetted Calibri Bold font. Extracting it straight from the PDF file using FontForge, and trying to shape with HarfBuzz' hb-view results in an empty output, no .notdefs or anything.

What I assume is happening: The font seems to have been subsetted to used glyphs without reducing lookups from GSUB so that ligatures are formed pointing to glyphs that are not present in the font's glyf table. It could probably be detected in HarfBuzz by doing additional glyf table lookups to see whether ligature target glyphs are actually present in the font file, otherwise flagging GSUB as broken.

I am attaching the broken subsetted Calibri file. I am going to file a HarfBuzz issue for it. On the Chrome side, this is a WontFix, or ExternalDependency if Behdad considers it worthwhile catching this case. Alternatively, pdf2htmlEX should probably output "font-feature-settings: 'liga' 0;" for PDF with subsetted font files so that no additional ligature processing is done.

Example hb-view call:
$ hb-view Calibri-Bold.otf "fifiti"
produces empty output, whereas:
$ hb-view --features=-liga Calibri-Bold.otf "fifiti"
produces the correctly rendered string. Similarly, adding font-feature-settings: 'liga' 0; to disable ligature lookups in the generated HTML makes the characters appear.
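The assumed failure mode can be illustrated with a toy model in Python. This is only a sketch of the hypothesis above, not HarfBuzz or Chrome code: the ligature table, the glyph names, and the rule that a glyph missing from the font draws nothing are all assumptions made for illustration.

```python
def dangling_ligatures(ligatures, glyphs_in_font):
    """Return ligature entries whose target glyph is not present in the font."""
    return {seq: g for seq, g in ligatures.items() if g not in glyphs_in_font}


def render(text, ligatures, glyphs_in_font):
    """Toy renderer: substitute ligatures, then draw nothing for absent glyphs."""
    sequences = sorted(ligatures, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        seq = next((s for s in sequences if text.startswith(s, i)), None)
        if seq is not None:
            glyph, shown, i = ligatures[seq], seq, i + len(seq)
        else:
            glyph, shown, i = text[i], text[i], i + 1
        if glyph in glyphs_in_font:  # assumption: absent glyphs leave no trace
            out.append(shown)
    return "".join(out)


TEXT = "Please fill in the fields below to begin DocuSign API Certification."
LIGATURES = {"fi": "f_i", "ti": "t_i"}  # hypothetical leftover GSUB entries
SUBSET = set(TEXT)  # subset keeps the base glyphs but not the ligature glyphs

print(sorted(dangling_ligatures(LIGATURES, SUBSET)))
print(render(TEXT, LIGATURES, SUBSET))
print(render(TEXT, {}, SUBSET))  # ligature lookups disabled
```

Under these assumptions the second line printed matches the broken text quoted earlier in this report, and the run with an empty ligature table prints the sentence intact, which mirrors the effect of disabling `liga`.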
,
Mar 9 2017
,
Mar 9 2017
Notified the PDF converter project as well: https://github.com/coolwanglu/pdf2htmlEX/issues/710
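Until the converter emits the rule itself, the `font-feature-settings: 'liga' 0;` workaround can be applied to already generated HTML as a post-processing step. A minimal sketch in Python follows; the function name, the `body` selector, and the choice to insert before `</head>` are assumptions for illustration and are not something pdf2htmlEX provides.

```python
# Rule taken from the workaround in this report; the selector is an assumption.
STYLE = "<style>body { font-feature-settings: 'liga' 0; }</style>"


def disable_ligatures(html):
    """Insert the ligature-disabling rule before </head>, or prepend it if absent."""
    if STYLE in html:
        return html  # already patched, keep the function idempotent
    idx = html.lower().find("</head>")
    if idx == -1:
        return STYLE + html
    return html[:idx] + STYLE + html[idx:]
```

If the generated stylesheet sets `font-feature-settings` on more specific elements, the inherited `body` rule would be overridden there and a broader selector may be needed.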
,
Mar 9 2017
The PDF contains a line /Producer (PDFKit.NET 4.0.40.0) from https://www.tallcomponents.com/contact - this generator might do the incorrect subsetting.
,
May 21 2018
Closing as infeasible as per harfbuzz issue in comment 30.
Comment 1 by ellyjo...@chromium.org
, May 19 2016