Security: UTF-16BE charset sniffing facilitates XSS attacks
Reported by
ar...@rawsec.net,
Jun 19 2017
Issue description
VULNERABILITY DETAILS
If a site doesn't explicitly specify a charset, Chrome can be forced to parse the document as UTF-16BE (variants with UTF-32 seem possible too). This method can be employed to evade common anti-XSS measures (and it facilitates bypassing the XSS auditor).
A byte sequence of the form \x05x\x06x\x07x\x08x\x10x\x11x\x12x\x13x\x14 triggers UTF-16BE detection if it occurs in any text node of the document, whereas other browsers (e.g. Firefox) correctly fall back to UTF-8.
(I used this technique to bypass the XSS filter in a challenge of this year's Google CTF ("Geokitties v2").)
VERSION
Chrome Version: 59.0.3071.86 (Official Build) (64-bit) stable
Operating System: Linux 4.11.3-1-ARCH
REPRODUCTION CASE
The <textarea> tag is supposed to contain the sequence, but UTF-16BE garbles the HTML:
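<?php
// Deliberately send an empty charset so that Chrome has to sniff the encoding.
header('Content-Type: text/html;charset=');
// Control-byte/"x" pairs inside a text node: this is the sequence that triggers UTF-16BE.
echo "<textarea>\x05x\x06x\x07x\x08x\x10x\x11x\x12x\x13x\x14</textarea>";
// Trailing bytes that read as "UTF" once the page is decoded as UTF-16BE.
echo "\x00U\x00T\x00F";
?>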
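To illustrate what "garbles the HTML" means here, the following is a minimal Python sketch that decodes the same bytes both ways. It uses Python's codecs as a stand-in for the browser's decoders, which is an assumption, and the variable names are purely illustrative; only the byte strings are taken from the PoC.

```python
# Sketch: how the PoC bytes read under UTF-8 versus UTF-16BE.
# Variable names are illustrative; the byte strings are copied from the PoC.

TRIGGER = b"\x05x\x06x\x07x\x08x\x10x\x11x\x12x\x13x\x14"
BODY = b"<textarea>" + TRIGGER + b"</textarea>" + b"\x00U\x00T\x00F"

# Every byte is below 0x80, so the body is valid UTF-8 (and ASCII).
as_utf8 = BODY.decode("utf-8")

# UTF-16BE pairs the bytes up: "<t" becomes U+3C74, "ex" becomes U+6578, ...
as_utf16be = BODY.decode("utf-16-be")

print("UTF-8 keeps the markup:   ", "<textarea>" in as_utf8)
print("UTF-16BE keeps the markup:", "<" in as_utf16be)
print("UTF-16BE tail:            ", repr(as_utf16be[-3:]))
```

Under UTF-16BE the textarea tags no longer exist as markup, because each pair of ASCII bytes collapses into a single non-ASCII character, while the trailing \x00-prefixed bytes come out as plain ASCII text.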
If the security implications of this behavior are unclear, I'm happy to provide more details.
,
Jun 20 2017
Thanks for the report. Per https://dev.chromium.org/Home/chromium-security/security-faq#TOC-Are-XSS-filter-bypasses-considered-security-bugs-, we don't consider this a security bug, so I'm removing the security labels.
,
Jun 20 2017
My apologies, I understood the FAQ section to be about Chrome's built-in XSS auditor while this appeared to me as a more generic problem with charset sniffing. Anyways, thanks for handling this!
,
Jul 19 2017
,
Aug 25 2017
Seems like auto-detecting as UTF-16 is probably not desirable - see issue 691985 for other cases where non-ASCII encoding detection is problematic. jinsukkim@ - any thoughts here? (We removed blink support for UTF-32 so that avenue should be gone, too)
,
Sep 1 2017
Thank you very much for the report and sorry for the late response - this somehow evaded my radar. jsbell@ I agree that we had better not autodetect UTF-16, given potential side effects such as the reported case. Will work on this.
,
Sep 1 2017
The problem is indeed not limited to UTF-16BE as my title suggested. Just for demonstration, here are two more examples of Chrome being over-zealous.
This is sniffed as "Big5": data:text/html,%AA%BB
And this is sniffed as "UTF-16LE": data:text/html,%00%BB
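A rough way to see why such short inputs are ambiguous is to check which encodings accept them at all. The sketch below again assumes Python's codecs as a stand-in for the browser's decoders; the candidate list and the helper name valid_encodings are hypothetical and are not the detector's actual list or API.

```python
# Sketch: the same two bytes are well-formed in several encodings at once.
# CANDIDATES is illustrative and is not the detector's actual list.

SAMPLES = {"%AA%BB": b"\xaa\xbb", "%00%BB": b"\x00\xbb"}
CANDIDATES = ["utf-8", "big5", "utf-16-le", "utf-16-be", "cp1252"]


def valid_encodings(data: bytes) -> list:
    """Return the candidate encodings under which `data` decodes cleanly."""
    valid = []
    for name in CANDIDATES:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        valid.append(name)
    return valid


for label, data in SAMPLES.items():
    print(label, valid_encodings(data))
```

Neither sample is valid UTF-8, yet each decodes without error under more than one of the other candidates, so any guess based on two bytes alone rests on very little evidence.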
,
Sep 11 2017
Feeding just 2 bytes to the autodetector and expecting the right result would be too much to ask. I don't intend to handle all of these edge cases here. Let me stop autodetecting UTF-16BE/LE only, for which attack PoCs have been found.
,
Sep 13 2017
r501528 rolled in the change that excludes UTF-16LE/BE from encoding autodetection. Let me close this bug.
,
Sep 13 2017
Wrong link... r501423
Comment 1 by elawrence@chromium.org
, Jun 19 2017