Find in page in a PDF does not work |
|||
Issue descriptionChrome Version: 62 OS: Win / Linux What steps will reproduce the problem? Almost the same as bug 654578 : (1) Open http://www.acrotex.net/blog/wp-content/uploads/2011/07/pdfblog_24.pdf (2) Press ctrl+f, search for "round" What is the expected result? The only occurrence of is the word snippet "been around since" on page 1. Find in page should some text near the word "round" highlighted. What happens instead? Find in page does not find the word "round" at all. This is because the fix for bug 761770 and bug 761626 , e.g. r499732 and possibly PDFium side changes, are not 100% correct. This already impacts M62 stable. Let's try to fix it before M64 branches.
,
Nov 23 2017
,
Nov 27 2017
I agree that there needs to be adjustments to how get text handles length of string calculations as you mentioned. There is also issues with this file, since the number of characters and the length of the text are different, specifically there are two non-printing control characters. This leads to the char count being 2 larger then the text length. Since the string code has been made more strict about invalid inputs, this is leading to some operations failing (and a DCHECK being hit in come cases). I think I will have to write 2 fixes for this bug, 1) fix the string length termination issue, 2) fix issues with char count != text length.
,
Nov 27 2017
The following revision refers to this bug: https://pdfium.googlesource.com/pdfium/+/3fc7fe5e4d8fa257e35e6ae86fc6cf4d6b5016a2 commit 3fc7fe5e4d8fa257e35e6ae86fc6cf4d6b5016a2 Author: Ryan Harrison <rharrison@chromium.org> Date: Mon Nov 27 19:30:17 2017 Change FPDF_GetText to return "" when asked to get 0 characters BUG= chromium:788103 Change-Id: I8ebdbc78eb14c358d7ac019b96de4828e6071b79 Reviewed-on: https://pdfium-review.googlesource.com/19350 Commit-Queue: Ryan Harrison <rharrison@chromium.org> Reviewed-by: dsinclair <dsinclair@chromium.org> [modify] https://crrev.com/3fc7fe5e4d8fa257e35e6ae86fc6cf4d6b5016a2/fpdfsdk/fpdftext_embeddertest.cpp [modify] https://crrev.com/3fc7fe5e4d8fa257e35e6ae86fc6cf4d6b5016a2/core/fxcrt/widestring.h [modify] https://crrev.com/3fc7fe5e4d8fa257e35e6ae86fc6cf4d6b5016a2/fpdfsdk/fpdftext.cpp
,
Nov 27 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/82bae22f124ff4287e25d2aa8608d0942f2b5462 commit 82bae22f124ff4287e25d2aa8608d0942f2b5462 Author: pdfium-deps-roller@chromium.org <pdfium-deps-roller@chromium.org> Date: Mon Nov 27 21:53:46 2017 Roll src/third_party/pdfium/ f473672fd..9b93815ed (13 commits) https://pdfium.googlesource.com/pdfium.git/+log/f473672fd630..9b93815edae6 $ git log f473672fd..9b93815ed --date=short --no-merges --format='%ad %ae %s' 2017-11-27 dsinclair Add helpers to get attribute information 2017-11-27 dsinclair Fold XFA_GetAttributeByName into CXFA_Node::NameToAttribute 2017-11-27 dsinclair Convert CPDF_StructTree to size_t 2017-11-27 rharrison Change FPDF_GetText to return "" when asked to get 0 characters 2017-11-27 npm Prepend ++ in CFX_DIBSource 2017-11-27 dsinclair Convert CPDF_StructElement::CountKids to size_t 2017-11-27 dsinclair Remove CollectionSize from CPDF_RenderStatus 2017-11-27 dsinclair Remove CollectionSize from CPDF_CharPosList 2017-11-27 dsinclair Remove CollectionSize from FPDF_DataDecode 2017-11-27 dsinclair Convert CPDF_TextObject to not use CollectionSize 2017-11-27 dsinclair Convert CPDF_ClipPath::Get{Path|Text}Count to size_t 2017-11-27 dsinclair Add some helpers for attribute lookup 2017-11-27 dsinclair Remove use of CollectionSize from CPDF_CMapParser Created with: roll-dep src/third_party/pdfium BUG= 788103 The AutoRoll server is located here: https://pdfium-roll.skia.org Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary. TBR=dsinclair@chromium.org Change-Id: I835d32e34297c5d5b40ed36d9b55e2e555f8a4b0 Reviewed-on: https://chromium-review.googlesource.com/791651 Reviewed-by: <pdfium-deps-roller@chromium.org> Commit-Queue: <pdfium-deps-roller@chromium.org> Cr-Commit-Position: refs/heads/master@{#519418} [modify] https://crrev.com/82bae22f124ff4287e25d2aa8608d0942f2b5462/DEPS
,
Nov 28 2017
The following revision refers to this bug: https://pdfium.googlesource.com/pdfium/+/0f1ca17e45bee4f1cb04d33437eba010b1afcd72 commit 0f1ca17e45bee4f1cb04d33437eba010b1afcd72 Author: Ryan Harrison <rharrison@chromium.org> Date: Tue Nov 28 18:19:57 2017 Convert from character stream to visible text indices in GetText Most of the API methods FPDFText operate on indices in terms of the underlying stream of characters. This stream includes non-printing control characters, which are not part of the visible text. The majority of files do not appear to have these hidden characters so there is a 1:1 correspondence between them. When they are present conversion needs to occur to make sure that GetText doesn't attempt to retrieve for a span that is out of range. BUG= chromium:788103 , chromium:788220 Change-Id: I4c9fa403ea65754ba94e3f15ded49fe0641e9db5 Reviewed-on: https://pdfium-review.googlesource.com/19550 Reviewed-by: dsinclair <dsinclair@chromium.org> Commit-Queue: dsinclair <dsinclair@chromium.org> [modify] https://crrev.com/0f1ca17e45bee4f1cb04d33437eba010b1afcd72/fpdfsdk/fpdftext.cpp
,
Nov 28 2017
This should be fixed now. There are still issues with the highlights showing up in slightly wrong location, and continuing a search by changing the string not always working. But in general searching for a string in the doc works. The remaining issues are being address as part of crbug.com/788811 .
,
Nov 28 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e18360c36e1e9e8ea9e6d7cbba548b2d862c18d1 commit e18360c36e1e9e8ea9e6d7cbba548b2d862c18d1 Author: pdfium-deps-roller@chromium.org <pdfium-deps-roller@chromium.org> Date: Tue Nov 28 22:54:34 2017 Roll src/third_party/pdfium/ 99acb1c81..1ff9b7ffa (5 commits) https://pdfium.googlesource.com/pdfium.git/+log/99acb1c816dc..1ff9b7ffa653 $ git log 99acb1c81..1ff9b7ffa --date=short --no-merges --format='%ad %ae %s' 2017-11-28 dsinclair Remove CollectionSize from fpdfppo. 2017-11-28 dsinclair Remove CollectionSize from fpdf_flatten. 2017-11-28 dsinclair Remove CollectionSize from string_view_template 2017-11-28 dsinclair Remove CollectionSize from JBig2_Context 2017-11-28 rharrison Convert from character stream to visible text indices in GetText Created with: roll-dep src/third_party/pdfium BUG= 788103 , 788220 The AutoRoll server is located here: https://pdfium-roll.skia.org Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary. TBR=dsinclair@chromium.org Change-Id: Iefdff68f09558ee706d0702bf47b8241a7141f4c Reviewed-on: https://chromium-review.googlesource.com/794026 Reviewed-by: <pdfium-deps-roller@chromium.org> Commit-Queue: <pdfium-deps-roller@chromium.org> Cr-Commit-Position: refs/heads/master@{#519878} [modify] https://crrev.com/e18360c36e1e9e8ea9e6d7cbba548b2d862c18d1/DEPS |
|||
►
Sign in to add a comment |
|||
Comment 1 by thestig@chromium.org
, Nov 23 2017