Find in pdf should find text broken across lines
Reported by mr.ber...@gmail.com, Jun 28 2016
Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36

Example URL: http://www.dfg.de/formulare/54_01/54_01_de.pdf

Steps to reproduce the problem:
1. Open http://www.dfg.de/formulare/54_01/54_01_de.pdf
2. Search for "Ba-sismodul" or "Basismodul" (both without quotation marks)

What is the expected behavior?
At least one is found (ideally, both).

What went wrong?
Neither is found. Adobe Reader DC, however, does find "Ba-sismodul".

Does it occur on multiple sites: N/A
Is it a problem with a plugin? Yes pdf
Did this work before? No
Does this work in other browsers? Yes

Chrome version: 51.0.2704.106 Channel: stable
OS Version: 10.0
Flash Version: Shockwave Flash 22.0 r0

The correct word is "Basismodul", but in the PDF it is hyphenated across a line break.

For context, Adobe Reader DC and Chrome PDF behave differently when copying text with a hyphen at the end of the line:

Adobe copies all hyphens:
- "Ba-sismodul" stays "Ba-sismodul" (incorrect)
- "Noether-Programms" (page 2, line 1) stays "Noether-Programms" (correct)

Chrome removes all hyphens:
- "Ba-sismodul" becomes "Basismodul" (nice!)
- "Noether-Programms" becomes "NoetherProgramms" (booh)

Why am I posting this context? Because Adobe Reader at least finds "Ba-sismodul", which is consistent with the text you obtain when you copy from Adobe Reader. That is: copy "Ba-sismodul" from the PDF, Ctrl-F, Ctrl-V, Enter, and something is found. The same is not true in Chrome: copying turns "Ba-sismodul" into "Basismodul", but searching for "Basismodul" finds nothing in the PDF. This is what I propose to fix. Of course, "Noether-Programms" should also be found, so the hyphenated variant should be findable as well (and so should "Ba-sismodul").
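The behavior requested above can be sketched in a few lines. This is only an illustration under stated assumptions, not how Chrome's PDF viewer is implemented: the page text is assumed to be available as a list of line strings, and the names `candidate_texts` and `find_in_lines` are hypothetical. The idea is to search two joined views of the text, one that keeps line-end hyphens and one that drops them, so both spellings are found.

```python
def candidate_texts(lines):
    """Join lines into two views: line-end hyphens kept, and line-end hyphens dropped.

    Hypothetical helper; assumes the page text is given as a list of lines.
    """
    kept = []
    dropped = []
    for line in lines:
        stripped = line.rstrip()
        if stripped.endswith("-"):
            # Word continues on the next line: join without a space.
            kept.append(stripped)          # "Ba-" + "sismodul" -> "Ba-sismodul"
            dropped.append(stripped[:-1])  # "Ba"  + "sismodul" -> "Basismodul"
        else:
            # Ordinary line break: treat it as a single space.
            kept.append(stripped + " ")
            dropped.append(stripped + " ")
    return "".join(kept), "".join(dropped)


def find_in_lines(lines, query):
    """Return True if the query matches either view of the text."""
    kept, dropped = candidate_texts(lines)
    return query in kept or query in dropped


page = ["Das Ba-", "sismodul ist", "Teil des Noether-", "Programms"]
print(find_in_lines(page, "Basismodul"))
print(find_in_lines(page, "Ba-sismodul"))
print(find_in_lines(page, "Noether-Programms"))
```

Searching both views sidesteps the question of whether a given hyphen is "real": the search does not need to decide, it only needs to accept either form.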
Jun 29 2016
Moving this nonessential bug to the next milestone. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Jul 1 2016
This issue has been moved once and is lower than Pri-1. Removing the milestone. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Oct 25 2016
Oct 19 2017
First, regarding "Noether-Programms" becoming "NoetherProgramms": there isn't much we can do to distinguish this case from "Ba-sismodul" without becoming language-specific. As for search not finding occurrences split across lines, this is related to bug 58402. However, this report should not be marked as a duplicate, because it includes the added complication of a hyphen in the middle of the word.
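To illustrate why the two cases cannot be told apart without language knowledge, here is a minimal sketch of what a language-specific approach might look like. It is an assumption for illustration only: the `vocabulary` word list and the function `join_line_end_hyphen` are hypothetical and are not part of Chrome or PDFium. The hyphen is dropped only when the joined word is a known word; otherwise it is kept.

```python
def join_line_end_hyphen(left, right, vocabulary):
    """Join a line-end fragment like "Ba-" with the next line's fragment.

    Hypothetical sketch: `vocabulary` is an assumed set of lowercase known
    words for the document's language.
    """
    joined = left[:-1] + right  # drop the trailing hyphen
    if joined.lower() in vocabulary:
        return joined           # soft hyphen inserted by line breaking
    return left + right         # hyphen is part of the word


vocabulary = {"basismodul"}  # assumed German word list, for illustration
print(join_line_end_hyphen("Ba-", "sismodul", vocabulary))
print(join_line_end_hyphen("Noether-", "Programms", vocabulary))
```

Without such a word list, "Ba-" + "sismodul" and "Noether-" + "Programms" look structurally identical, which is the limitation described above.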
Sep 13
Archiving old bugs that haven't been actively assigned in over 180 days. If you feel this issue should still be addressed, feel free to reopen it or to file a new issue. Thanks!
Comment 1 by rnimmagadda@chromium.org, Jun 29 2016
Components: Internals>Plugins>PDF UI>Browser>FindInPage
Labels: -Type-Compat M-52 OS-Linux OS-Mac Type-Bug
Status: Untriaged (was: Unconfirmed)