Issue 720578

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: May 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug



Chrome PDF Viewer doesn't identify links correctly where ";" follows the link

Reported by whitney....@gmail.com, May 10 2017

Issue description

Chrome Version       : 58.0.3029.96
OS Version: OS X 10.12.3
URLs (if applicable) :
Other browsers tested:
  Add OK or FAIL after other browsers where you have tested this issue:
     Safari 5:
  Firefox 4.x:
     IE 7/8/9:

What steps will reproduce the problem?
1. Download a PDF in which a link is written as https://www.link.com; (the trailing semicolon follows the link but is not part of the address).
2. Open the PDF in Preview or Adobe on my Mac. The link opens correctly and doesn't include the ";".
3. Open the PDF in Chrome PDF Viewer. The link fails because the viewer treats ";" as part of the link.

What is the expected result?
That the link in the PDF works as it does in Preview or Adobe, since the web address doesn't include the ";".

What happens instead of that?
The web address link in the PDF fails because it erroneously includes the ";", which isn't part of the web link.

Please provide any additional information below. Attach a screenshot if
possible.


UserAgentString: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.96 Safari/537.36



 
Cc: ligim...@chromium.org
Components: Internals>Plugins>PDF
Labels: Needs-Triage-M58

Comment 2 by npm@chromium.org, May 10 2017

Cc: weili@chromium.org
Do you have a sample PDF? Otherwise we can create one.

I do, but it's confidential. Sorry!

Comment 4 by sdy@chromium.org, May 10 2017

Labels: -Pri-3 -Needs-Triage-M58 Pri-2
Owner: thestig@chromium.org
Status: Assigned (was: Unconfirmed)
thestig@: Thoughts (or feel free to reassign)? I can reproduce this problem by visiting this URI:

    data:text/html,http://example.com/;

…then using the Print dialog. I can click the link in the preview, or save a PDF to the Desktop and open that. Weirdly, it only seems to get linkified on Mac.

Comment 5 by weili@chromium.org, May 15 2017

Looks like we need to validate URLs more carefully. @thestig, I can take this one if you prefer.

Comment 6 by weili@chromium.org, May 18 2017

Cc: -weili@chromium.org
Owner: weili@chromium.org
Status: Started (was: Assigned)

Comment 7 by bugdroid1@chromium.org, May 20 2017

The following revision refers to this bug:
  https://pdfium.googlesource.com/pdfium/+/6c8ed646d1fcb8cce5a01c843c5149d989e6d5f0

commit 6c8ed646d1fcb8cce5a01c843c5149d989e6d5f0
Author: Wei Li <weili@chromium.org>
Date: Sat May 20 05:30:10 2017

Better identify web links by trimming irrelevant chars

Sometimes web links are written next to other text, such as punctuation,
which makes the extracted web links invalid. We improve this by trimming
invalid chars at the end of host-name-only URLs. For example, host names
never end with ';' or ','.

BUG= chromium:720578 

Change-Id: Id619025b2153531376d268a69a3a89c3d49fce08
Reviewed-on: https://pdfium-review.googlesource.com/5692
Commit-Queue: Wei Li <weili@chromium.org>
Reviewed-by: Lei Zhang <thestig@chromium.org>

[modify] https://crrev.com/6c8ed646d1fcb8cce5a01c843c5149d989e6d5f0/core/fpdftext/fpdf_text_int_unittest.cpp
[modify] https://crrev.com/6c8ed646d1fcb8cce5a01c843c5149d989e6d5f0/core/fpdftext/cpdf_linkextract.cpp
[modify] https://crrev.com/6c8ed646d1fcb8cce5a01c843c5149d989e6d5f0/fpdfsdk/fpdftext_embeddertest.cpp
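
To illustrate the idea in the commit message above, here is a minimal Python sketch of trimming trailing characters from a host-name-only URL. It is not the PDFium code (that change is in C++ in cpdf_linkextract.cpp). The function name trim_host_only_url and the rule "a host name ends with a letter or digit" are assumptions made for this example.

```python
def trim_host_only_url(url: str) -> str:
    """Hypothetical helper: drop trailing chars that cannot end a host name.

    Assumption: only URLs with nothing after the host (no path, query,
    or fragment) are trimmed; anything else is returned unchanged.
    """
    scheme_sep = url.find("://")
    if scheme_sep == -1:
        return url  # no scheme, so this sketch does not touch it
    host_start = scheme_sep + 3

    # A '/', '?' or '#' after the scheme means the URL is not host-only.
    if any(ch in url[host_start:] for ch in "/?#"):
        return url

    # Assumption: a host name ends with a letter or digit, so walk back
    # over trailing punctuation such as ';' or ','.
    end = len(url)
    while end > host_start and not url[end - 1].isalnum():
        end -= 1
    return url[:end]


print(trim_host_only_url("https://www.link.com;"))
print(trim_host_only_url("http://example.com/path;"))
```

With the link from the original report, the first call returns https://www.link.com, while the second URL is left alone because it has a path. The sketch does not handle IPv6 literals or a trailing dot in a fully qualified host name; both would be trimmed incorrectly.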

Comment 8 by bugdroid1@chromium.org, May 20 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0900889fa54802cd6eb33481c4c774300715a8ed

commit 0900889fa54802cd6eb33481c4c774300715a8ed
Author: pdfium-deps-roller@chromium.org <pdfium-deps-roller@chromium.org>
Date: Sat May 20 07:16:28 2017

Roll src/third_party/pdfium/ d15ce4c1e..6c8ed646d (1 commit)

https://pdfium.googlesource.com/pdfium.git/+log/d15ce4c1e088..6c8ed646d1fc

$ git log d15ce4c1e..6c8ed646d --date=short --no-merges --format='%ad %ae %s'
2017-05-19 weili Better identify web links by trimming irrelevant chars

Created with:
  roll-dep src/third_party/pdfium
BUG= 720578 


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


TBR=dsinclair@chromium.org

Change-Id: Ib1b36144a82c69fc153fad4271742f82338ec983
Reviewed-on: https://chromium-review.googlesource.com/510062
Reviewed-by: <pdfium-deps-roller@chromium.org>
Commit-Queue: <pdfium-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#473423}
[modify] https://crrev.com/0900889fa54802cd6eb33481c4c774300715a8ed/DEPS

Comment 9 by weili@chromium.org, May 22 2017

Status: Fixed (was: Started)
