Issue 838168

Starred by 10 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug




Chrome Headless large pdf generation

Reported by travi...@gmail.com, Apr 30 2018

Issue description

I was referred to you by puppeteer's maintainers here: https://github.com/GoogleChrome/puppeteer/issues/2440

Chrome Version       : 65.0.3325.181
OS Version: 10.0
URLs (if applicable) : https://travistx.github.io/example-html/example.html
Other browsers tested:
  Add OK or FAIL after other browsers where you have tested this issue:
     PhantomJS: OK

What steps will reproduce the problem?
Puppeteer Repro Steps
    1. Execute npm install puppeteer@1.3.0-next.1524503013067
    2. Place the attached ExamplePuppeteer.js into the current working directory
    3. Execute node ExamplePuppeteer.js
    4. A pdf file will be generated in the current working directory

PhantomJS Repro Steps:
    1. Download the latest PhantomJS release from http://phantomjs.org/download.html
    2. Extract the archive to a directory, and also add the attached ExamplePhantomJs.js
    3. Execute phantomjs.exe ExamplePhantomJs.js
    4. A pdf file will be generated in the current working directory


What is the expected result?
I expect the PDF generated by Puppeteer to be about as small as the one generated by PhantomJS.


What happens instead of that?
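The PDF generated by Puppeteer is much larger: 45.6 KB, versus 16.6 KB for the PhantomJS output (both files are attached).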


Please provide any additional information below. Attach a screenshot if
possible.

When analyzing the files with Adobe Acrobat's "Audit Space Usage" tool, I noticed that the PDF generated by Puppeteer stores much more font data than the one generated by PhantomJS.
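
For anyone without Acrobat, the font-data comparison can be roughly approximated with a short Node.js script. This is only a heuristic sketch, not a PDF parser and not what "Audit Space Usage" does internally: the function name `embeddedFontBytes` is made up for illustration, and it assumes the PDF's objects are not packed into compressed object streams and that each embedded font stream declares a direct integer /Length.

```javascript
// Heuristic sketch: estimate the bytes of embedded font programs in a PDF.
// Assumptions (not verified against the attached files):
//   - objects are written in plain form, not inside compressed object streams
//   - each font stream dictionary has a direct integer /Length (not "N 0 R")
function embeddedFontBytes(pdfBuffer) {
  // latin1 maps each byte to exactly one character, so nothing is lost.
  const text = pdfBuffer.toString('latin1');

  // Collect object numbers referenced as /FontFile, /FontFile2 or /FontFile3.
  const fontObjects = new Set();
  const refPattern = /\/FontFile[23]?\s+(\d+)\s+\d+\s+R/g;
  let match;
  while ((match = refPattern.exec(text)) !== null) {
    fontObjects.add(Number(match[1]));
  }

  // Sum the declared (stored, possibly compressed) /Length of each font stream.
  let total = 0;
  for (const objectNumber of fontObjects) {
    const objPattern = new RegExp(
      '(?:^|\\s)' + objectNumber + '\\s+\\d+\\s+obj\\b([\\s\\S]*?)stream');
    const obj = objPattern.exec(text);
    if (!obj) continue; // object not found in plain form; skip it
    // \b keeps "/Length 12 0 R" (an indirect length) from matching as "1".
    const length = /\/Length\s+(\d+)\b(?!\s+\d+\s+R)/.exec(obj[1]);
    if (length) total += Number(length[1]);
  }
  return total;
}

// Usage (file names taken from the attachments on this issue):
// const fs = require('fs');
// console.log(embeddedFontBytes(fs.readFileSync('example-puppeteer.pdf')));
// console.log(embeddedFontBytes(fs.readFileSync('example-phantomjs.pdf')));
```

Because the script sums stored stream lengths, it reflects what the font data costs on disk, which is the quantity that matters for the file-size difference reported here.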

 
Attachments:
pdf-space-usage.png (18.8 KB)
example-phantomjs.pdf (16.6 KB)
example-puppeteer.pdf (45.6 KB)
ExamplePhantomJs.js (227 bytes)
ExamplePuppeteer.js (319 bytes)
Labels: Needs-Triage-M65
Components: Internals>Headless Internals>Plugins>PDF
Cc: skyos...@chromium.org halcanary@chromium.org
I am not exactly sure how PDF generation in headless works, i.e. whether it goes through the printing stack or Skia.

halcanary@ or skyostil@, are you familiar with this area?
Looks like font subsetting is more efficient in PhantomJS.
Cc: behdad@chromium.org
Components: -Internals>Plugins>PDF Internals>Skia>PDF
Labels: -Needs-Triage-M65
Components: Internals>Printing
Headless uses the printing pipeline for generating PDFs, so +Internals>Printing.
Status: Available (was: Unconfirmed)
halcanary: Any interest in looking at this?
behdad: You mentioned a while back that we'll eventually switch away from sfntly. Will that improve the situation here?
Cc: bunge...@chromium.org
This is not a priority for me.  I'd have to do a lot of research into sfntly and the opentype standard before I could even begin.  behdad@ or bungeman@ have a lot more background in this area; maybe they could say how difficult it would be to fix sfntly or write a better library.
I believe the plan for typeface subsetting is to use libharfbuzz-subset.so when it is ready. It's developed in the harfbuzz tree (see https://github.com/harfbuzz/harfbuzz/blob/master/src/hb-subset.h for its interface).
I'm glad to hear that there is a plan!

For reference, behdad@, I have a SkStreamAsset, a SkBitSet, and a TTC index to pass to the subsetter.  I'd love to have an interface that doesn't involve extra unnecessary memory allocation.
We'll get there eventually with the harfbuzz subsetter. There is no exact timeline right now. The focus is NOT printing-subsetting, but that's definitely feasible to add after we finish our first implementation phase (which should be done by late this year).
Ideally, I'd like a "subsetter" that takes *any* SkTypeface object and produces an OpenType font with *only* the glyph shapes for only some given glyph indices.
That would work as long as the SkTypeface is backed by an OpenType font.  If Type1 is also required, we can figure out how to synthesize OpenType fonts out of them.
Synthesizing OpenType fonts from a sequence of SkPaths was what I had in mind.
