CORB: confirmation sniffing is confused by html/js polyglots using <!-- --> <script> |
|||||||||||
Issue descriptionREPRO STEPS: 1. Visit http://www.pythonforbeginners.com/files/reading-and-writing-files-in-python 2. Observe some CORB message in the DevTools console. Among them is a message about blocking https://clarium.global.ssl.fastly.net/?wrapper=bbdvOAJnqH-Idffgn_02C2Cyx_E&tpid=YmJkdk9BSm5xSC1JZGZmZ25fMDJDMkN5eF9FL2Fkc3BhcmM6MzAweDYwMA%3D%3D&d=eyJ3aCI6IlltSmtkazlCU201eFNDMUpaR1ptWjI1Zk1ESkRNa041ZUY5RkwyRmtjM0JoY21NNk16QXdlRFl3TUE9PSIsIndkIjp7ImsiOnsiaGJfYmlkZGVyIjpbImFkc3BhcmMiXSwiaGJfc2l6ZSI6WyIzMDB4NjAwIl19fSwid3IiOjB9 NOTES: - https://clarium.global.ssl.fastly.net/?wrapper responds with: - Content-Length: 11496 - Content-Type: text/html; charset=UTF-8 - no X-Content-Type-Options header - body: <!-- confiant --> <script type='text/javascript'> //<![CDATA[ ...... valid javascript goes here ...... //]]>--></script> - I've tested that this resource works fine 1) when sourced by a script tag and 2) when opened as a top-level html doc.
,
May 4 2018
https://www.ecma-international.org/ecma-262/8.0/index.html#sec-html-like-comments includes the following possible BNF rules/expansions: <Comment> ::= <SingleLineHTMLOpenComment> <SingleLineHTMLOpenComment> ::= "<!--" [<SingleLineCommentChars>] <SingleLineCommentChars> ::= <SingleLineCommentChar> [<SingleLineCommentChars>] <SingleLineCommentChar> ::= <SourceCharacter but not LineTerminator> <SourceCharacter> ::= any Unicode code point <LineTerminator> ::= <LF> | <CR> | <LS> | <PS> AFAIU this means that the following 2 comments single lines are parsed in a similar way <!-- this --> is a comment x = 123; // this is a comment x = 123; There is also: <SingleLineHTMLCloseComment> ::= <LineTerminatorSequence> <HTMLCloseComment> <HTMLCloseComment> ::= [<WhiteSpaceSequence>] [<SingleLineDelimitedCommentSequence>] "-->" [<SingleLineCommentChars>] I am not yet sure what this means for CORB's HTML sniffer. In particular - I am not sure when skipping of HTML comments should end after a single line VS after encountering "-->" characters. Some examples to consider: Should not sniff as HTML: <!-- blah --> <script> var x = "I am a html/javascript polyglot" // </script> Should sniff as HTML (this javascript contains only a comment): <!-- blah --> <html> <!-- single-line HTML / no javascript here --> Should sniff as HTML ("<script>" is not a valid javascript statement): <!-- blah // --> <script> ???: <!-- this line is a valid comment in javascript -> SingleLineHTMLOpenComment rule // We can have javascript here (inside HTML comment) delimited by <! -- and -- > var we_are_inside = "html comment"; // here. /* --> this line is a valid javascript comment -> SingleLineHTMLCloseComment rule <!-- we can have html here (inside javascript comment) delimited by / * ... * / --> <p>Blah<p> <!-- */ -->
,
May 4 2018
It looks like we need to ignore not just leading HTML comments, but anything on the line after the close HTML comment. We can only block it if the first line following the leading HTML comments sniffs as HTML. For example, we could not block this: <!-- a --> <!-- b --><something> code(); But we could block this: <!-- a -->anything <html>...</html> Does that sound right?
,
May 4 2018
Commenting on your examples from comment 2:
1) Should not sniff as HTML:
<!-- blah --> <script>
var x = "I am a html/javascript polyglot"
// </script>
Agreed. This is script and we can't block it.
2) Should sniff as HTML (this javascript contains only a comment):
<!-- blah --> <html> <!-- single-line HTML / no javascript here -->
It's probably ok if we don't block this. It's not really possible to identify if it actually contains JavaScript. It's a valid prefix to a JavaScript file, so I don't think we can block it.
3) Should sniff as HTML ("<script>" is not a valid javascript statement):
<!-- blah // -->
<script>
Agreed. To the extent that we can sniff line 2 (<script>) as HTML, we should block this, since it's not parseable as valid JavaScript.
4) ???:
<!-- this line is a valid comment in javascript -> SingleLineHTMLOpenComment rule
// We can have javascript here (inside HTML comment) delimited by <! -- and -- >
var we_are_inside = "html comment"; // here.
/*
--> this line is a valid javascript comment -> SingleLineHTMLCloseComment rule
<!-- we can have html here (inside javascript comment) delimited by / * ... * / -->
<p>Blah<p>
<!--
*/
-->
I don't think we would block this. It looks like several HTML comments, and thus could be JavaScript.
,
May 4 2018
I think I like the current proposal to drop everything until the end of the line after "-->" characters (when skipping a HTML comment). OTOH, this would mean that the following wouldn't sniff as HTML - I assume this is acceptable?
<!-- blah --> <p>
Paragraph <!-- <- "Paragraph" doesn't sniff as a html tag. -->
</p>
,
May 4 2018
Correct. It does cut into our coverage of HTML documents in a non-trivial way, but it was always kind of best effort there. We're trying to encourage people to use nosniff (or perhaps From-Origin) to get more effective coverage in general. I think it's more important not to block things that could be functional JavaScript, as long as we can keep the HTML definition reasonably large. Would you be able to help update the sniffer for this?
,
May 7 2018
Is this M67 stable blocker? If yes, pls apply "RBS" label.
,
May 8 2018
RE: #c7 Yes - this is a blocker for shipping site-per-process to 100% of M67 stable. WIP CL @ https://chromium-review.googlesource.com/c/chromium/src/+/1047851
,
May 9 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ebd882cf95df2b1a3376eefba28a410c9e76c41f commit ebd882cf95df2b1a3376eefba28a410c9e76c41f Author: Lukasz Anforowicz <lukasza@chromium.org> Date: Wed May 09 00:04:02 2018 Handle more html/js polyglots in CORB confirmation sniffing. Cross-Origin Read Blocking (CORB) tries to protect certain resource types (e.g. text/html). To be resilient against HTTP responses mislabeled with an incorrect Content-Type, CORB sniffs the response body to confirm if it truly is the protected type. Before this CL the confirmation sniffing logic blocked resources that are both a valid html and a valid javascript. Blocking of such resources is undesirable, because it is disruptive to existing websites that use such polyglot responses in <script> tags. After this CL, CORB sniffer takes into account the https://www.ecma-international.org/ecma-262/8.0/index.html#prod-annexB-SingleLineHTMLCloseComment rule which means that the sniffing doesn't resume immediately after "-->" characters, but instead also consumes all the characters until the first line terminator. Bug: 839945 Cq-Include-Trybots: master.tryserver.chromium.linux:linux_mojo Change-Id: I7c8221acc2013adffe8095d188ae22e1c6a2fdab Reviewed-on: https://chromium-review.googlesource.com/1047851 Commit-Queue: Charlie Reis <creis@chromium.org> Reviewed-by: Charlie Reis <creis@chromium.org> Cr-Commit-Position: refs/heads/master@{#557018} [modify] https://crrev.com/ebd882cf95df2b1a3376eefba28a410c9e76c41f/content/browser/loader/cross_site_document_blocking_browsertest.cc [modify] https://crrev.com/ebd882cf95df2b1a3376eefba28a410c9e76c41f/content/browser/loader/cross_site_document_resource_handler_unittest.cc [add] https://crrev.com/ebd882cf95df2b1a3376eefba28a410c9e76c41f/content/test/data/site_isolation/js-html-polyglot2.html [modify] https://crrev.com/ebd882cf95df2b1a3376eefba28a410c9e76c41f/services/network/cross_origin_read_blocking.cc [modify] https://crrev.com/ebd882cf95df2b1a3376eefba28a410c9e76c41f/services/network/cross_origin_read_blocking_explainer.md [modify] https://crrev.com/ebd882cf95df2b1a3376eefba28a410c9e76c41f/services/network/cross_origin_read_blocking_unittest.cc [modify] https://crrev.com/ebd882cf95df2b1a3376eefba28a410c9e76c41f/third_party/WebKit/LayoutTests/external/wpt/fetch/corb/resources/html-js-polyglot.js [add] https://crrev.com/ebd882cf95df2b1a3376eefba28a410c9e76c41f/third_party/WebKit/LayoutTests/external/wpt/fetch/corb/resources/html-js-polyglot2.js [add] https://crrev.com/ebd882cf95df2b1a3376eefba28a410c9e76c41f/third_party/WebKit/LayoutTests/external/wpt/fetch/corb/resources/html-js-polyglot2.js.headers [modify] https://crrev.com/ebd882cf95df2b1a3376eefba28a410c9e76c41f/third_party/WebKit/LayoutTests/external/wpt/fetch/corb/script-html-js-polyglot.sub.html
,
May 9 2018
Thank you lukasza@, pls update the bug with canary result on Thursday and request a merge to M67 if canary result looks good.
,
May 9 2018
Commit ebd882cf... initially landed in 68.0.3425.0. I've tried this Canary version on a Mac and while http://www.pythonforbeginners.com/files/reading-and-writing-files-in-python still hit quite a few "Blocked current origin from receiving cross-site document" messages, I didn't see any html/js polyglots after extracting the URLs of the affected resources, downloading them via wget and manually inspecting the response headers and bodies. Note that some requests were rejected when done by the wget client (like www.atdmt.com one), but looking at the response manually shows that it has content-length: 0. Let's give this a bit more time until Thursday and request the merge tomorrow if everything looks good.
,
May 9 2018
[Auto-generated comment by a script] We noticed that this issue is targeted for M-67; it appears the fix may have landed after branch point, meaning a merge might be required. Please confirm if a merge is required here - if so add Merge-Request-67 label, otherwise remove Merge-TBD label. Thanks.
,
May 10 2018
The NextAction date has arrived: 2018-05-10
,
May 10 2018
Things are still looking good (e.g. no magic signature looks related when I look at crashes from 68.0.3425.0). I've already confirmed yesterday that the fix works as intended on Canary - see #c11 above. So - let me request a merge to M67.
,
May 10 2018
This bug requires manual review: M67 has already been promoted to the beta branch, so this requires manual review Please contact the milestone owner if you have questions. Owners: cmasso@(Android), cmasso@(iOS), kbleicher@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
May 10 2018
Approving merge to M67 branch 3396 based on comment #8, #11 and #14. Please merge ASAP. Thank you.
,
May 10 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668 commit 54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668 Author: Lukasz Anforowicz <lukasza@chromium.org> Date: Thu May 10 18:35:34 2018 Handle more html/js polyglots in CORB confirmation sniffing. Cross-Origin Read Blocking (CORB) tries to protect certain resource types (e.g. text/html). To be resilient against HTTP responses mislabeled with an incorrect Content-Type, CORB sniffs the response body to confirm if it truly is the protected type. Before this CL the confirmation sniffing logic blocked resources that are both a valid html and a valid javascript. Blocking of such resources is undesirable, because it is disruptive to existing websites that use such polyglot responses in <script> tags. After this CL, CORB sniffer takes into account the https://www.ecma-international.org/ecma-262/8.0/index.html#prod-annexB-SingleLineHTMLCloseComment rule which means that the sniffing doesn't resume immediately after "-->" characters, but instead also consumes all the characters until the first line terminator. Bug: 839945 Cq-Include-Trybots: master.tryserver.chromium.linux:linux_mojo Change-Id: I7c8221acc2013adffe8095d188ae22e1c6a2fdab Reviewed-on: https://chromium-review.googlesource.com/1047851 Commit-Queue: Charlie Reis <creis@chromium.org> Reviewed-by: Charlie Reis <creis@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#557018} Reviewed-on: https://chromium-review.googlesource.com/1054192 Cr-Commit-Position: refs/branch-heads/3396@{#549} Cr-Branched-From: 9ef2aa869bc7bc0c089e255d698cca6e47d6b038-refs/heads/master@{#550428} [modify] https://crrev.com/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668/content/browser/loader/cross_site_document_blocking_browsertest.cc [modify] https://crrev.com/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668/content/browser/loader/cross_site_document_resource_handler_unittest.cc [add] https://crrev.com/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668/content/test/data/site_isolation/js-html-polyglot2.html [modify] https://crrev.com/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668/services/network/cross_origin_read_blocking.cc [modify] https://crrev.com/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668/services/network/cross_origin_read_blocking_explainer.md [modify] https://crrev.com/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668/services/network/cross_origin_read_blocking_unittest.cc [modify] https://crrev.com/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668/third_party/WebKit/LayoutTests/external/wpt/fetch/corb/resources/html-js-polyglot.js [add] https://crrev.com/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668/third_party/WebKit/LayoutTests/external/wpt/fetch/corb/resources/html-js-polyglot2.js [add] https://crrev.com/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668/third_party/WebKit/LayoutTests/external/wpt/fetch/corb/resources/html-js-polyglot2.js.headers [modify] https://crrev.com/54eb5b35cd06d8092d0c9a91a6f6c0479e1f1668/third_party/WebKit/LayoutTests/external/wpt/fetch/corb/script-html-js-polyglot.sub.html |
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by creis@chromium.org
, May 4 2018