New issue
Advanced search Search tips

Issue 666665 link

Starred by 2 users

Issue metadata

Status: WontFix
Owner: ----
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows , Chrome , Mac
Pri: 3
Type: Bug



Sign in to add a comment

Loading large PDFs off of disk is slower than before

Project Member Reported by thestig@chromium.org, Nov 18 2016

Issue description

What steps will reproduce the problem?
(1) Put a 250 MB PDF in /tmp
(2) Load the PDF

With a test PDF, I ran /path/to/debug_build/chrome /tmp/foo.pdf on my workstation and timed the entire process until the PDF finishes loading with a stopwatch. So the times here include browser launch time, PDFium parsing time, and a wee bit of human reaction time.

Before r429514: 50 seconds
At r429514: It hangs - the loading was slow enough that we found another bug...
At r429514 + r433051 (kReadDelayMs = 2): 65 seconds
At r429514 + r433051 + kReadDelayMs = 1: 58 seconds
At r429514 + r433051 + kReadDelayMs = 0: 51 seconds

Given the 2 ms delay and a 64 KB buffer size, that means when we read off of disk, we are artificially constrained to 31 MB/s. Maybe that's an ok speed for 10 years ago when loading off a HDD, but now we have faster SSDs, LANs, and more RAM to cache files.
 
The r429514, allow us to show pdf content as fast as posible in case when we have file size and can send the range requset.
This is possible bacause we can load few part of document just for visible pages (and show its), and continue loading other data in background (and for this case is not necessary how long we will load full document (In a reasonable time, of course)).
The kReadDelayMs>0 allow user to comfortably works with partialy loaded document without hangs (if kReadDelayMs == 0, the UrlLoader hang the UI thread, by processing new received data.).

But for local documents:
1) The Pdf plugin is not received the file size (In "ContentLength" field), and as result, we should load full document always.
(Fix for this was tried in https://codereview.chromium.org/1585403004/ )
2) Local files unsupports the Range Request (like random data access). (May be its work already, but i don't try, because 1) )

I think that the fixes for 1) and 2) are more prefered.
What about increasing the buffer size to something like 256 KB? Or dynamically growing the buffer size for each request (e.g. growing the buffer by 50% for each request, up to some reasonable maximum)?

I set the buffer size to 64 KB because that seemed like the lowest number that didn't have horrible performance, but the vast majority of desktop connections are probably fast enough that a 256 KB buffer wouldn't lower performance significantly.
What effects would a big buffer have on (slow) network connections? Should I or someone else try reviving https://codereview.chromium.org/1585403004/ instead?
Using a bigger buffer means that the minimum request size will be larger. So even if the renderer only needs say 5KB somewhere in the document, the plugin will make the request big enough to fill the buffer. This means the rendering of linearized PDFs may be slower, especially for slow connections.

I think 1) and 2) would be nice to have, but won't fix the problem if the PDF is not linearized.

Another option is to use a bigger buffer when not using partial loading. Unfortunately, the buffer size is currently a compile-time constant, so it might require a bit more work than just always using a bigger buffer.
Project Member

Comment 5 by bugdroid1@chromium.org, Nov 24 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9c7322baa752411c1f983d6305f06842d0670373

commit 9c7322baa752411c1f983d6305f06842d0670373
Author: art-snake <art-snake@yandex-team.ru>
Date: Thu Nov 24 05:16:22 2016

[PDF Plugin] Increase read buffer size

BUG= 666665 

R=spelchat@chromium.org, thestig@chromium.org

Review-Url: https://codereview.chromium.org/2525903002
Cr-Commit-Position: refs/heads/master@{#434306}

[modify] https://crrev.com/9c7322baa752411c1f983d6305f06842d0670373/pdf/document_loader.h

Note, the r429514 CL has been reverted. We should look into the performance issues before relanding.
Status: Available (was: Untriaged)
Labels: -Pri-2 -M-56 -Type-Bug-Regression Pri-3 Type-Bug
Status: WontFix (was: Available)
With all the work from the past year, the loading time is now ~32 seconds here with kReadDelayMs set to 2 or 1. So I'm happy with the loading time.

Sign in to add a comment