Allow pagination_algo = "none" in DOM distiller
Issue description
We don't support multipage articles in iOS Reading List (RL), and the regular expressions used to find the next page rely on English words. If a link is detected as a next-page link, we distill the second page and then just throw it away. Allow setting pagination_algo to null (or default to "none" instead of "next" when the option is not set) to avoid the costly regex processing.
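To make the request concrete, here is a minimal Python sketch of the proposed behavior. It is not the DOM distiller's actual code: the function name `find_next_page`, the `NEXT_HINT` pattern, and the `(text, href)` link representation are all hypothetical. Only the option name pagination_algo and the values "next" and "none" come from this issue.

```python
import re

# Hypothetical English-only hint; a stand-in for the distiller's real heuristics.
NEXT_HINT = re.compile(r"\b(next|more|continue)\b", re.IGNORECASE)


def find_next_page(links, pagination_algo="none"):
    """Return the href of a likely next-page link, or None.

    `links` is a list of (link_text, href) tuples (an assumed representation).
    """
    if pagination_algo == "none":
        # Pagination disabled: return before any regex is evaluated.
        return None
    if pagination_algo != "next":
        raise ValueError(f"unsupported pagination_algo: {pagination_algo!r}")
    for text, href in links:
        if NEXT_HINT.search(text):
            return href
    return None
```

With "none" as the default, a caller such as iOS Reading List that never uses the second page pays no regex cost unless it explicitly opts in with pagination_algo="next".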
,
Feb 27 2017
Independently of the solution we will use for iOS, I think an option to disable pagination would be really useful (especially as pagination involves regexps that can be heavy). For iOS, I am a little reluctant to distill pages that were not added by the user. Are there restrictions on pages that can be considered as page 2 (same origin as page 1?)
,
Feb 27 2017
I agree that being able to skip pagination detection could be useful if the result is not used. I'd say there are few false positives in our next page detection, given that the page contains an article. They do indeed need to be in the same origin as page 1. Our algorithm does rely on language-dependent hints, but also on some language-neutral ones like numeric patterns. In many cases, even if the page is intended for non-English readers, the HTML id and class names are still in English, so the language-dependent hints are applicable more often than we'd think.
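As a rough illustration of the same-origin restriction mentioned above, the following Python sketch filters candidate links by origin. It is an assumption-laden example, not the distiller's implementation: the helper names `origin` and `same_origin_candidates` are hypothetical, and the comparison is simplified (an explicit default port such as `:443` is treated as different from an omitted one).

```python
from urllib.parse import urljoin, urlsplit


def origin(url):
    """Return a simplified (scheme, host, port) origin tuple for a URL."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def same_origin_candidates(page_url, hrefs):
    """Resolve hrefs against page_url and keep only same-origin results."""
    base = origin(page_url)
    kept = []
    for href in hrefs:
        absolute = urljoin(page_url, href)  # resolves relative links
        if origin(absolute) == base:
            kept.append(absolute)
    return kept
```

Under this filter, a link to another host is never considered as page 2, regardless of how strongly its text or class name hints at "next".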
,
Feb 27 2017
The false positive rate could be high if the page doesn't contain an article. This symptom was mostly suppressed by this CL "Stop fetching the next page if the first page has no content": https://codereview.chromium.org/1891103002
,
Mar 6 2017
I am hitting a DCHECK when pagination is detected. For example, while distilling: https://ar.m.wikipedia.org/wiki/%D8%A5%D8%B3%D8%AD%D8%A7%D9%82_%D9%86%D9%8A%D9%88%D8%AA%D9%86
,
Mar 6 2017
Thanks for reporting this next page bug!
,
Mar 8 2017
Note that the DCHECK is a check on the MIME type, and it does not stop or alter the distillation.
,
Mar 16 2017
Allowing pagination detection to be skipped should be fairly easy, so labeling this Good First Bug. Is the usefulness of page stitching in iOS Reading List still unclear?
,
Mar 16 2017
I personally think it's pretty useful to distill and save the whole article if it was divided over multiple pages, and it would help us achieve feature parity with other browsers that support this functionality.
,
Mar 16 2017
The false positive next page link in #c5 has been split out into a separate issue: https://bugs.chromium.org/p/chromium/issues/detail?id=702424
Comment 1 by wychen@chromium.org
, Feb 24 2017