Main content missing because layout table misclassified as data
Issue description

Version: M55

What steps will reproduce the problem?
(1) Distill https://www.grc.nasa.gov/www/k-12/airplane/bga.html

What is the expected output?
Main content extracted.

What do you see instead?
"No data found"

User feedback: https://feedback.corp.google.com/product/282/neutron?lView=rd&lReport=18197533560

Quick diagnosis:
The table containing the main content doesn't fit any of the classification rules, and the default is data table.

Possible fix:
- If the table contains <hr>, treat as layout.
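
The possible fix above could be sketched roughly as follows. This is only an illustration in Python using the standard library `html.parser`, not the distiller's actual classifier; `TableHrScanner` and `classify_tables` are hypothetical names, and the "data" fallback simply mirrors the default described in the diagnosis. Nested tables are attributed to their outermost table, which is an assumption made to keep the sketch short.

```python
from html.parser import HTMLParser


class TableHrScanner(HTMLParser):
    """Records, for each outermost <table>, whether it contains an <hr>."""

    def __init__(self):
        super().__init__()
        self.depth = 0     # current <table> nesting depth
        self.results = []  # one bool per outermost table: contains <hr>?

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth == 0:
                self.results.append(False)
            self.depth += 1
        elif tag == "hr" and self.depth > 0:
            # A horizontal rule inside a table suggests page layout,
            # not tabular data.
            self.results[-1] = True

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1


def classify_tables(html):
    """Return "layout" or "data" for each outermost table in the markup.

    Hypothetical rule: a table containing <hr> is layout; anything else
    falls through to the default, "data".
    """
    scanner = TableHrScanner()
    scanner.feed(html)
    scanner.close()
    return ["layout" if has_hr else "data" for has_hr in scanner.results]
```

A real classifier would apply this check alongside its other rules rather than in place of them; the sketch isolates only the `<hr>` heuristic.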
Comment 1 by benhenry@chromium.org, Aug 1