Undefined-shift in ucnv_UTF8FromUTF8 |
|||||||
Issue descriptionDetailed report: https://clusterfuzz.com/testcase?key=5525936635379712 Fuzzer: libfuzzer_libxml_xml_read_memory_fuzzer Job Type: libfuzzer_chrome_ubsan Platform Id: linux Crash Type: Undefined-shift Crash Address: Crash State: ucnv_UTF8FromUTF8 ucnv_convertEx_59 xmlUconvWrapper Sanitizer: undefined (UBSAN) Regressed: https://clusterfuzz.com/revisions?job=libfuzzer_chrome_ubsan&range=395640:395746 Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5525936635379712 Issue filed automatically. See https://chromium.googlesource.com/chromium/src/+/master/testing/libfuzzer/reproducing.md for more information.
,
Jun 13 2017
A minimized test has an offending byte sequence of:
f1 91 93 96 91 94
The first four bytes represent U+514D6, but the last two are invalid.
Stack:
../../third_party/icu/source/common/ucnv_u8.cpp:918:29: runtime error: left shift of negative value -213035503
#0 0x4ee7d3 in ucnv_UTF8FromUTF8(UConverterFromUnicodeArgs*, UConverterToUnicodeArgs*, UErrorCode*)
third_party/icu/source/common/ucnv_u8.cpp:918:29
#1 0x4cbb3a in ucnv_convertEx_59 third_party/icu/source/common/ucnv.cpp:2286:13
#2 0x4c30ec in xmlUconvWrapper third_party/libxml/src/encoding.c:0:9
---------------------------
while(toULength<toULimit) {
if(source<sourceLimit) {
b=*source;
if(U8_IS_TRAIL(b)) {
++source;
++toULength;
c=(c<<6)+b;
Line 918 is |c=(c<<6)+b;|. |c| is UChar32 (typedef'd to int32_t) and |b| is uint8. UBsan complains:
left-shift of negative value -213035503 ( 11110011 01001101 01010110 00010001 )
,
Oct 1 2017
Automatically applying components based on information from OWNERS files. If this seems incorrect, please apply the Test-Predator-Wrong-Components label.
,
Oct 1 2017
I don't see any way that c can become negative with the example sequence. I do see the possibility for some other inputs. toULimit=U8_COUNT_TRAIL_BYTES(0xf1)+1=4 so worst case we are at toULength=3 and are reading the last byte of the 4-byte sequence. In this case, c is ((0xf1<<6)+0x91)<<6)+0x93=0xF34D3 before we shift it left by 6 once more and then add 0x96, getting 0x3CD3553. No overflow, only the lower 26 bits are used. For the illegal trail bytes, toULimit=1 so we don't enter the loop. I can see potential for overflow only for lead bytes F8..FD for the old 5- and 6-byte UTF-8 sequences that ICU 59 still reads as single errors. These lead bytes would be shifted left by 24 or 30 bits. This does not happen any more in the upcoming ICU 60 because it does not recognize bytes over F4 as lead bytes at all any more. This would also not have been a regression in ICU 59, because this code had not changed for many years.
,
Oct 13 2017
Hex dump of the character sequence in question taken from the minimized test file: $ xxd ~/Downloads/clusterfuzz-testcase-minimized-5525936635379712 000000d0: 3d22 656e 223e f191 9893 9691 948c 3d22 ="en">........=" Things to note: 1/ sequence to decode: f1 91 98 93 96 91 94 8c (slightly different to what is given above in #c2) 2/ This sequence starts at offset 214 libxml reads 180 bytes of chars at the start of a file to decode the xmldecl. https://cs.chromium.org/chromium/src/third_party/libxml/src/encoding.c?l=2124&rcl=adb61db19020ed8ecee5e91b1a0ea4c924ae2988 For some reason, when this call gets through to ucnv_u8 to do the utf8 decoding, it is starting at offset 36 (rather than the expected 0) and it reads 180 chars which takes it to an incomplete read of 'f1 91'. https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv_u8.cpp?l=771&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd The parser seems to recognize that it has read truncated input and there is code that appears that it should handle this OK and pick up correctly when it next starts reading input. This code saves the fact that it should be reading a 4-byte sequence (toULimit=4) https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv_u8.cpp?l=990&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd This code sets an error code as TRUNCATED_CHAR_FOUND https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv.cpp?l=1457&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd When libxml next calls for utf8 decoding, it restores some of the state, but gets one field incorrect. https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv_u8.cpp?l=794&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd It correctly retrieves the previous value for c and for toULimit, but it sets toULength=0 rather than 2. This means that rather than reading only 2 more bytes (98 93), it reads 4 more bytes (98 93 96 91) and blows up. I believe the place where toULength is being set to zero in error is at: https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv.cpp?l=1533&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd Commenting out that line fixes this error, but I don't yet understand what other consequences it may have. I will investigate some more and maybe see if I can contact original author or any maintainer.
,
Oct 13 2017
Markus, are you are the maintainer of this code at ICU? I apologise for showing my ignorance. Do you think removing the line of code that I suggested would work as a fix? When I trace through the calls from cnv->fromCharErrorBehaviour using UCNV_TO_U_DEFAULT_CALLBACK, I don't see anywhere that the toULength field is used. Or would it work to save the value before calling fromCharErrorBehaviour, and then set the value back? https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv.cpp?l=1533&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd /* save/set the converter state to deal with the next character */ (+) int oldToULength = cnv->toULength; cnv->toULength=0; /* call the callback function */ if(cnv->toUCallbackReason==UCNV_ILLEGAL && *err==U_INVALID_CHAR_FOUND) { cnv->toUCallbackReason = UCNV_UNASSIGNED; } cnv->fromCharErrorBehaviour(cnv->toUContext, pArgs, cnv->invalidCharBuffer, errorInputLength, cnv->toUCallbackReason, err); (+) cnv->toULength = oldToULength; /* reset to old value */ cnv->toUCallbackReason = UCNV_ILLEGAL; /* reset to default value */
,
Oct 20 2017
,
Oct 24 2017
For more information, please see https://chromium.googlesource.com/chromium/src/+/master/testing/libfuzzer/reference.md. The link referenced in the description is no longer valid. (bulk edit)
,
Oct 24 2017
This error is caused by libxml setting flush=TRUE on all calls. flush should only be set on the last read from an input buffer. A pivot buffer is also required for calls to ucnv_convertEx. See patch at http://crosreview.com729616 I will try and get this patch submitted into libxml.
,
Nov 1 2017
Patch sent to libxml mailing list: https://mail.gnome.org/archives/xml/2017-October/msg00012.html
,
Nov 7 2017
,
Nov 10 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1932201570802d524f80c5055644bfaac65a5a86 commit 1932201570802d524f80c5055644bfaac65a5a86 Author: Joel Hockey <joelhockey@chromium.org> Date: Fri Nov 10 04:53:59 2017 Roll libxml to 72182550926d31ad17357bd3ed69e49d7e69df02 Fixes ICU flush and pivot buffer causing fuzzing error. BUILD.gn updated to not set WIN32_LEAN_AND_MEAN which is defined in libxml. Bug: 722420 Change-Id: Ib4ff2f870f99fb78e02f9841e33f6ad7e72e059e Reviewed-on: https://chromium-review.googlesource.com/759919 Commit-Queue: Joel Hockey <joelhockey@chromium.org> Reviewed-by: Scott Graham <scottmg@chromium.org> Cr-Commit-Position: refs/heads/master@{#515451} [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/BUILD.gn [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/README.chromium [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/linux/config.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/linux/include/libxml/xmlversion.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/linux/xml2-config [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/mac/config.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/mac/include/libxml/xmlversion.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/Makefile.am [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/Makefile.in [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/SAX2.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/config.h.in [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/configure [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/configure.ac [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/dict.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/encoding.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/include/libxml/encoding.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/include/libxml/nanoftp.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/include/libxml/threads.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/include/win32config.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/include/wsockcompat.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/libxml.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/libxml2.spec [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/nanoftp.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/nanohttp.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/parser.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/parserInternals.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/runtest.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/testrecurse.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/threads.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/timsort.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/tree.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/uri.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/valid.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/win32/Makefile.mingw [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/win32/Makefile.msvc [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/xmlIO.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/xmlmemory.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/xmlwriter.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/xpath.c [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/win32/config.h [modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/win32/include/libxml/xmlversion.h
,
Nov 10 2017
ClusterFuzz has detected this issue as fixed in range 515450:515459. Detailed report: https://clusterfuzz.com/testcase?key=5525936635379712 Fuzzer: libFuzzer_libxml_xml_read_memory_fuzzer Job Type: libfuzzer_chrome_ubsan Platform Id: linux Crash Type: Undefined-shift Crash Address: Crash State: ucnv_UTF8FromUTF8 ucnv_convertEx_59 xmlUconvWrapper Sanitizer: undefined (UBSAN) Regressed: https://clusterfuzz.com/revisions?job=libfuzzer_chrome_ubsan&range=395640:395746 Fixed: https://clusterfuzz.com/revisions?job=libfuzzer_chrome_ubsan&range=515450:515459 Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5525936635379712 See https://chromium.googlesource.com/chromium/src/+/master/testing/libfuzzer/reference.md for more information. If you suspect that the result above is incorrect, try re-doing that job on the test case report page.
,
Nov 10 2017
ClusterFuzz testcase 5525936635379712 is verified as fixed, so closing issue as verified. If this is incorrect, please add ClusterFuzz-Wrong label and re-open the issue. |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by msrchandra@chromium.org
, May 16 2017Labels: M-60 Test-Predator-Wrong-CLs
Owner: js...@chromium.org
Status: Assigned (was: Untriaged)