Repeated "No data found" in distilled content
Issue description

Version: M51

What steps will reproduce the problem?
(1) Enable Reader Mode on Clank and set triggering heuristic to "Always", or use "Simplify page" in print preview
(2) Distill http://danbooru.donmai.us/posts

What is the expected output?
Just one "No data found" and stop. If the distilled first page contains nothing, it's highly likely that this is not an article, and the detected "next page" is moot.

What do you see instead?
The loading indicator keeps showing, and "No data found" keeps being appended.

Other examples:
http://www.gsp.ro/sporturi/tenis/galerie-foto-explozie-de-fericire-bucuria-fetei-care-n-a-incetat-niciodata-sa-creada-in-ea-insasi-458289-galerie-foto-pic-682968.html
https://raleigh.craigslist.org/search/cto
http://m.ria.ru/index_3.html
http://m.vav.kr/page/2
http://best-muzon.com/shanson
http://m.falabella.com/mt/www.falabella.com/falabella-cl/category/CMR/Oportunidades-CMR
http://m.cafe.daum.net/ssaumjil/LnOm
http://m.ntvspor.net/f/News.aspx
Apr 15 2016
For http://m.falabella.com/mt/www.falabella.com/falabella-cl/category/CMR/Oportunidades-CMR the first page is not empty, but too short to be useful.
Apr 16 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/6d72911b0cb47f5519288325fc502470815b9e7a

commit 6d72911b0cb47f5519288325fc502470815b9e7a
Author: wychen <wychen@chromium.org>
Date: Sat Apr 16 00:09:16 2016

Stop fetching the next page if the first page has no content

In DOM distiller, if the distilled content of the first page is empty, it's very likely that the page is not good for distillation, so fetching the next page doesn't make sense.

BUG= 602139

Review URL: https://codereview.chromium.org/1891103002
Cr-Commit-Position: refs/heads/master@{#387759}

[modify] https://crrev.com/6d72911b0cb47f5519288325fc502470815b9e7a/components/dom_distiller/core/distiller.cc
[modify] https://crrev.com/6d72911b0cb47f5519288325fc502470815b9e7a/components/dom_distiller/core/distiller_unittest.cc
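The control flow described in the commit message can be illustrated with a minimal sketch. This is not the actual change to distiller.cc. The names `distill_all_pages`, `distill_page`, `make_fake_distiller`, and the `max_pages` guard are all hypothetical, introduced only for illustration. It shows the intended behavior: when the first page distills to nothing, the detected "next page" link is not followed.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical result of distilling one page:
# (distilled_text, next_page_url or None if no next page was detected).
PageResult = Tuple[str, Optional[str]]


def distill_all_pages(
    first_url: str,
    distill_page: Callable[[str], PageResult],
    max_pages: int = 10,  # assumed safety limit, not taken from the bug report
) -> List[str]:
    """Distill a multi-page document by following "next page" links.

    If the first page has no distilled content, stop immediately
    instead of fetching the detected next page.
    """
    pages: List[str] = []
    url: Optional[str] = first_url
    while url is not None and len(pages) < max_pages:
        text, next_url = distill_page(url)
        pages.append(text)
        if len(pages) == 1 and not text.strip():
            # Empty first page: probably not an article, so the
            # next-page link is not worth following.
            break
        url = next_url
    return pages


def make_fake_distiller(site: Dict[str, PageResult]):
    """Build a stand-in distiller over an in-memory site; records fetched URLs."""
    calls: List[str] = []

    def distill_page(url: str) -> PageResult:
        calls.append(url)
        return site[url]

    return distill_page, calls
```

With a fake site whose pages are all empty and chained by next-page links, only the first URL is fetched and a single empty result is returned. A site with non-empty pages is still followed to the end.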
Apr 20 2016
Your change meets the bar and is auto-approved for M51 (branch: 2704)
Apr 20 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/686bd1741e6921f5e200aa2cffd8822c436f93b5

commit 686bd1741e6921f5e200aa2cffd8822c436f93b5
Author: Wei-Yin Chen (陳威尹) <wychen@chromium.org>
Date: Wed Apr 20 19:19:28 2016

Stop fetching the next page if the first page has no content

In DOM distiller, if the distilled content of the first page is empty, it's very likely that the page is not good for distillation, so fetching the next page doesn't make sense.

BUG= 602139

Review URL: https://codereview.chromium.org/1891103002
Cr-Commit-Position: refs/heads/master@{#387759}
(cherry picked from commit 6d72911b0cb47f5519288325fc502470815b9e7a)

Review URL: https://codereview.chromium.org/1903853002 .
Cr-Commit-Position: refs/branch-heads/2704@{#150}
Cr-Branched-From: 6e53600def8f60d8c632fadc70d7c1939ccea347-refs/heads/master@{#386251}

[modify] https://crrev.com/686bd1741e6921f5e200aa2cffd8822c436f93b5/components/dom_distiller/core/distiller.cc
[modify] https://crrev.com/686bd1741e6921f5e200aa2cffd8822c436f93b5/components/dom_distiller/core/distiller_unittest.cc
Comment 1 by wychen@chromium.org, Apr 11 2016