New issue
Advanced search Search tips

Issue 788103 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Nov 2017
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows , Chrome , Mac
Pri: 1
Type: Bug-Regression

Blocking:
issue 654578



Sign in to add a comment

Find in page in a PDF does not work

Project Member Reported by thestig@chromium.org, Nov 23 2017

Issue description

Chrome Version: 62
OS: Win / Linux

What steps will reproduce the problem?

Almost the same as  bug 654578 :
(1) Open http://www.acrotex.net/blog/wp-content/uploads/2011/07/pdfblog_24.pdf
(2) Press ctrl+f, search for "round"

What is the expected result?

The only occurrence of is the word snippet "been around since" on page 1.

Find in page should some text near the word "round" highlighted.

What happens instead?

Find in page does not find the word "round" at all.

This is because the fix for  bug 761770  and  bug 761626 , e.g. r499732 and possibly PDFium side changes, are not 100% correct. This already impacts M62 stable. Let's try to fix it before M64 branches.
 
Oh, I forgot to mention I bisected this to r497540. I really think the PDFium APIs were working fine before, and so was the Chrome PDF Viewer code. String refactoring in PDFium is the only thing that changed there.

r499732 and other CLs that tried to fix the problems went in the wrong direction w.r.t. FPDFText_GetText().
Status: Started (was: Assigned)
I agree that there needs to be adjustments to how get text handles length of string calculations as you mentioned. There is also issues with this file, since the number of characters and the length of the text are different, specifically there are two non-printing control characters. This leads to the char count being 2 larger then the text length. Since the string code has been made more strict about invalid inputs, this is leading to some operations failing (and a DCHECK being hit in come cases).

I think I will have to write 2 fixes for this bug, 1) fix the string length termination issue, 2) fix issues with char count != text length.
Project Member

Comment 4 by bugdroid1@chromium.org, Nov 27 2017

The following revision refers to this bug:
  https://pdfium.googlesource.com/pdfium/+/3fc7fe5e4d8fa257e35e6ae86fc6cf4d6b5016a2

commit 3fc7fe5e4d8fa257e35e6ae86fc6cf4d6b5016a2
Author: Ryan Harrison <rharrison@chromium.org>
Date: Mon Nov 27 19:30:17 2017

Change FPDF_GetText to return "" when asked to get 0 characters

BUG= chromium:788103 

Change-Id: I8ebdbc78eb14c358d7ac019b96de4828e6071b79
Reviewed-on: https://pdfium-review.googlesource.com/19350
Commit-Queue: Ryan Harrison <rharrison@chromium.org>
Reviewed-by: dsinclair <dsinclair@chromium.org>

[modify] https://crrev.com/3fc7fe5e4d8fa257e35e6ae86fc6cf4d6b5016a2/fpdfsdk/fpdftext_embeddertest.cpp
[modify] https://crrev.com/3fc7fe5e4d8fa257e35e6ae86fc6cf4d6b5016a2/core/fxcrt/widestring.h
[modify] https://crrev.com/3fc7fe5e4d8fa257e35e6ae86fc6cf4d6b5016a2/fpdfsdk/fpdftext.cpp

Project Member

Comment 5 by bugdroid1@chromium.org, Nov 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/82bae22f124ff4287e25d2aa8608d0942f2b5462

commit 82bae22f124ff4287e25d2aa8608d0942f2b5462
Author: pdfium-deps-roller@chromium.org <pdfium-deps-roller@chromium.org>
Date: Mon Nov 27 21:53:46 2017

Roll src/third_party/pdfium/ f473672fd..9b93815ed (13 commits)

https://pdfium.googlesource.com/pdfium.git/+log/f473672fd630..9b93815edae6

$ git log f473672fd..9b93815ed --date=short --no-merges --format='%ad %ae %s'
2017-11-27 dsinclair Add helpers to get attribute information
2017-11-27 dsinclair Fold XFA_GetAttributeByName into CXFA_Node::NameToAttribute
2017-11-27 dsinclair Convert CPDF_StructTree to size_t
2017-11-27 rharrison Change FPDF_GetText to return "" when asked to get 0 characters
2017-11-27 npm Prepend ++ in CFX_DIBSource
2017-11-27 dsinclair Convert CPDF_StructElement::CountKids to size_t
2017-11-27 dsinclair Remove CollectionSize from CPDF_RenderStatus
2017-11-27 dsinclair Remove CollectionSize from CPDF_CharPosList
2017-11-27 dsinclair Remove CollectionSize from FPDF_DataDecode
2017-11-27 dsinclair Convert CPDF_TextObject to not use CollectionSize
2017-11-27 dsinclair Convert CPDF_ClipPath::Get{Path|Text}Count to size_t
2017-11-27 dsinclair Add some helpers for attribute lookup
2017-11-27 dsinclair Remove use of CollectionSize from CPDF_CMapParser

Created with:
  roll-dep src/third_party/pdfium
BUG= 788103 


The AutoRoll server is located here: https://pdfium-roll.skia.org

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.


TBR=dsinclair@chromium.org

Change-Id: I835d32e34297c5d5b40ed36d9b55e2e555f8a4b0
Reviewed-on: https://chromium-review.googlesource.com/791651
Reviewed-by: <pdfium-deps-roller@chromium.org>
Commit-Queue: <pdfium-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#519418}
[modify] https://crrev.com/82bae22f124ff4287e25d2aa8608d0942f2b5462/DEPS

Project Member

Comment 6 by bugdroid1@chromium.org, Nov 28 2017

The following revision refers to this bug:
  https://pdfium.googlesource.com/pdfium/+/0f1ca17e45bee4f1cb04d33437eba010b1afcd72

commit 0f1ca17e45bee4f1cb04d33437eba010b1afcd72
Author: Ryan Harrison <rharrison@chromium.org>
Date: Tue Nov 28 18:19:57 2017

Convert from character stream to visible text indices in GetText

Most of the API methods FPDFText operate on indices in terms of the
underlying stream of characters. This stream includes non-printing
control characters, which are not part of the visible text. The
majority of files do not appear to have these hidden characters so
there is a 1:1 correspondence between them. When they are present
conversion needs to occur to make sure that GetText doesn't attempt to
retrieve for a span that is out of range.

BUG= chromium:788103 , chromium:788220 

Change-Id: I4c9fa403ea65754ba94e3f15ded49fe0641e9db5
Reviewed-on: https://pdfium-review.googlesource.com/19550
Reviewed-by: dsinclair <dsinclair@chromium.org>
Commit-Queue: dsinclair <dsinclair@chromium.org>

[modify] https://crrev.com/0f1ca17e45bee4f1cb04d33437eba010b1afcd72/fpdfsdk/fpdftext.cpp

Status: Fixed (was: Started)
This should be fixed now. There are still issues with the highlights showing up in slightly wrong location, and continuing a search by changing the string not always working. But in general searching for a string in the doc works.

The remaining issues are being address as part of  crbug.com/788811 .
Project Member

Comment 8 by bugdroid1@chromium.org, Nov 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e18360c36e1e9e8ea9e6d7cbba548b2d862c18d1

commit e18360c36e1e9e8ea9e6d7cbba548b2d862c18d1
Author: pdfium-deps-roller@chromium.org <pdfium-deps-roller@chromium.org>
Date: Tue Nov 28 22:54:34 2017

Roll src/third_party/pdfium/ 99acb1c81..1ff9b7ffa (5 commits)

https://pdfium.googlesource.com/pdfium.git/+log/99acb1c816dc..1ff9b7ffa653

$ git log 99acb1c81..1ff9b7ffa --date=short --no-merges --format='%ad %ae %s'
2017-11-28 dsinclair Remove CollectionSize from fpdfppo.
2017-11-28 dsinclair Remove CollectionSize from fpdf_flatten.
2017-11-28 dsinclair Remove CollectionSize from string_view_template
2017-11-28 dsinclair Remove CollectionSize from JBig2_Context
2017-11-28 rharrison Convert from character stream to visible text indices in GetText

Created with:
  roll-dep src/third_party/pdfium
BUG= 788103 , 788220 


The AutoRoll server is located here: https://pdfium-roll.skia.org

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.


TBR=dsinclair@chromium.org

Change-Id: Iefdff68f09558ee706d0702bf47b8241a7141f4c
Reviewed-on: https://chromium-review.googlesource.com/794026
Reviewed-by: <pdfium-deps-roller@chromium.org>
Commit-Queue: <pdfium-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#519878}
[modify] https://crrev.com/e18360c36e1e9e8ea9e6d7cbba548b2d862c18d1/DEPS

Sign in to add a comment