New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 722420 link

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows , Chrome , Mac , Fuchsia
Pri: 2
Type: Bug



Sign in to add a comment

Undefined-shift in ucnv_UTF8FromUTF8

Project Member Reported by ClusterFuzz, May 15 2017

Issue description

Detailed report: https://clusterfuzz.com/testcase?key=5525936635379712

Fuzzer: libfuzzer_libxml_xml_read_memory_fuzzer
Job Type: libfuzzer_chrome_ubsan
Platform Id: linux

Crash Type: Undefined-shift
Crash Address: 
Crash State:
  ucnv_UTF8FromUTF8
  ucnv_convertEx_59
  xmlUconvWrapper
  
Sanitizer: undefined (UBSAN)

Regressed: https://clusterfuzz.com/revisions?job=libfuzzer_chrome_ubsan&range=395640:395746

Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5525936635379712


Issue filed automatically.

See https://chromium.googlesource.com/chromium/src/+/master/testing/libfuzzer/reproducing.md for more information.
 
Cc: msrchandra@chromium.org
Labels: M-60 Test-Predator-Wrong-CLs
Owner: js...@chromium.org
Status: Assigned (was: Untriaged)
Predator and CL did not provide any possible suspects.
Using Code Search for the file, "ucnv_u8.cpp" assigning to the concern file.
Suspecting Commit#
https://chromium.googlesource.com/chromium/deps/icu.git/+/87232d8d763692d4b8303f0194472a82c829a6a9

@jshin -- Could you please look into the issue, kindly re-assign if this is not related to your changes.
Thank You.

Comment 2 by js...@chromium.org, Jun 13 2017

Cc: mscherer@google.com
A minimized test has an offending byte sequence of:

f1 91 93 96 91 94

The first four bytes represent U+514D6, but the last two are invalid. 

Stack:

../../third_party/icu/source/common/ucnv_u8.cpp:918:29: runtime error: left shift of negative value -213035503
#0 0x4ee7d3 in ucnv_UTF8FromUTF8(UConverterFromUnicodeArgs*, UConverterToUnicodeArgs*, UErrorCode*) 

third_party/icu/source/common/ucnv_u8.cpp:918:29
#1 0x4cbb3a in ucnv_convertEx_59 third_party/icu/source/common/ucnv.cpp:2286:13
#2 0x4c30ec in xmlUconvWrapper third_party/libxml/src/encoding.c:0:9

---------------------------

            while(toULength<toULimit) {
                if(source<sourceLimit) {
                    b=*source;
                    if(U8_IS_TRAIL(b)) {
                        ++source;
                        ++toULength;
                        c=(c<<6)+b;

Line 918 is |c=(c<<6)+b;|.  |c| is UChar32 (typedef'd to int32_t) and |b| is uint8. UBsan complains: 

left-shift of negative value -213035503  ( 11110011 01001101 01010110 00010001 )


Project Member

Comment 3 by ClusterFuzz, Oct 1 2017

Components: Blink>XML
Labels: Test-Predator-AutoComponents
Automatically applying components based on information from OWNERS files. If this seems incorrect, please apply the Test-Predator-Wrong-Components label.
I don't see any way that c can become negative with the example sequence. I do see the possibility for some other inputs.

toULimit=U8_COUNT_TRAIL_BYTES(0xf1)+1=4

so worst case we are at toULength=3 and are reading the last byte of the 4-byte sequence. In this case, c is ((0xf1<<6)+0x91)<<6)+0x93=0xF34D3 before we shift it left by 6 once more and then add 0x96, getting 0x3CD3553. No overflow, only the lower 26 bits are used.

For the illegal trail bytes, toULimit=1 so we don't enter the loop.

I can see potential for overflow only for lead bytes F8..FD for the old 5- and 6-byte UTF-8 sequences that ICU 59 still reads as single errors. These lead bytes would be shifted left by 24 or 30 bits. This does not happen any more in the upcoming ICU 60 because it does not recognize bytes over F4 as lead bytes at all any more.

This would also not have been a regression in ICU 59, because this code had not changed for many years.
Owner: joelhockey@chromium.org
Hex dump of the character sequence in question taken from the minimized test file:
$ xxd ~/Downloads/clusterfuzz-testcase-minimized-5525936635379712 

000000d0: 3d22 656e 223e f191 9893 9691 948c 3d22  ="en">........="

Things to note:
1/ sequence to decode: f1 91 98 93 96 91 94 8c (slightly different to what is given above in #c2)
2/ This sequence starts at offset 214

libxml reads 180 bytes of chars at the start of a file to decode the xmldecl.
https://cs.chromium.org/chromium/src/third_party/libxml/src/encoding.c?l=2124&rcl=adb61db19020ed8ecee5e91b1a0ea4c924ae2988

For some reason, when this call gets through to ucnv_u8 to do the utf8 decoding, it is starting at offset 36 (rather than the expected 0) and it reads 180 chars which takes it to an incomplete read of 'f1 91'.
https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv_u8.cpp?l=771&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd

The parser seems to recognize that it has read truncated input and there is code that appears that it should handle this OK and pick up correctly when it next starts reading input.

This code saves the fact that it should be reading a 4-byte sequence (toULimit=4)
https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv_u8.cpp?l=990&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd

This code sets an error code as TRUNCATED_CHAR_FOUND
https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv.cpp?l=1457&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd

When libxml next calls for utf8 decoding, it restores some of the state, but gets one field incorrect.
https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv_u8.cpp?l=794&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd

It correctly retrieves the previous value for c and for toULimit, but it sets toULength=0 rather than 2.  This means that rather than reading only 2 more bytes (98 93), it reads 4 more bytes (98 93 96 91) and blows up.

I believe the place where toULength is being set to zero in error is at:
https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv.cpp?l=1533&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd

Commenting out that line fixes this error, but I don't yet understand what other consequences it may have.  I will investigate some more and maybe see if I can contact original author or any maintainer.
Markus, are you are the maintainer of this code at ICU?  I apologise for showing my ignorance.

Do you think removing the line of code that I suggested would work as a fix?

When I trace through the calls from cnv->fromCharErrorBehaviour using UCNV_TO_U_DEFAULT_CALLBACK, I don't see anywhere that the toULength field is used.

Or would it work to save the value before calling fromCharErrorBehaviour, and then set the value back?
https://cs.chromium.org/chromium/src/third_party/icu/source/common/ucnv.cpp?l=1533&rcl=21d33b1a09a77f033478ea4ffffb61e6970f83bd

    /* save/set the converter state to deal with the next character */
(+) int oldToULength = cnv->toULength;
    cnv->toULength=0;

    /* call the callback function */
    if(cnv->toUCallbackReason==UCNV_ILLEGAL && *err==U_INVALID_CHAR_FOUND) {
        cnv->toUCallbackReason = UCNV_UNASSIGNED;
    }
    cnv->fromCharErrorBehaviour(cnv->toUContext, pArgs,
    cnv->invalidCharBuffer, errorInputLength,
        cnv->toUCallbackReason,
        err);
(+) cnv->toULength = oldToULength; /* reset to old value */
    cnv->toUCallbackReason = UCNV_ILLEGAL; /* reset to default value */



Comment 7 by js...@chromium.org, Oct 20 2017

Cc: js...@chromium.org
Labels: -M-60 M-61 OS-Android OS-Chrome OS-Fuchsia OS-Mac OS-Windows

Comment 8 by mmoroz@chromium.org, Oct 24 2017

For more information, please see https://chromium.googlesource.com/chromium/src/+/master/testing/libfuzzer/reference.md.

The link referenced in the description is no longer valid.

(bulk edit)
This error is caused by libxml setting flush=TRUE on all calls.  flush should only be set on the last read from an input buffer.  A pivot buffer is also required for calls to ucnv_convertEx.

See patch at http://crosreview.com729616

I will try and get this patch submitted into libxml.
Patch sent to libxml mailing list:
https://mail.gnome.org/archives/xml/2017-October/msg00012.html
Labels: -Test-Predator-AutoComponents Test-Predator-Auto-Components
Project Member

Comment 12 by bugdroid1@chromium.org, Nov 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1932201570802d524f80c5055644bfaac65a5a86

commit 1932201570802d524f80c5055644bfaac65a5a86
Author: Joel Hockey <joelhockey@chromium.org>
Date: Fri Nov 10 04:53:59 2017

Roll libxml to 72182550926d31ad17357bd3ed69e49d7e69df02

Fixes ICU flush and pivot buffer causing fuzzing error.

BUILD.gn updated to not set WIN32_LEAN_AND_MEAN which is
defined in libxml.

Bug:  722420 
Change-Id: Ib4ff2f870f99fb78e02f9841e33f6ad7e72e059e
Reviewed-on: https://chromium-review.googlesource.com/759919
Commit-Queue: Joel Hockey <joelhockey@chromium.org>
Reviewed-by: Scott Graham <scottmg@chromium.org>
Cr-Commit-Position: refs/heads/master@{#515451}
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/BUILD.gn
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/README.chromium
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/linux/config.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/linux/include/libxml/xmlversion.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/linux/xml2-config
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/mac/config.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/mac/include/libxml/xmlversion.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/Makefile.am
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/Makefile.in
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/SAX2.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/config.h.in
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/configure
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/configure.ac
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/dict.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/encoding.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/include/libxml/encoding.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/include/libxml/nanoftp.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/include/libxml/threads.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/include/win32config.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/include/wsockcompat.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/libxml.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/libxml2.spec
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/nanoftp.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/nanohttp.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/parser.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/parserInternals.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/runtest.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/testrecurse.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/threads.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/timsort.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/tree.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/uri.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/valid.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/win32/Makefile.mingw
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/win32/Makefile.msvc
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/xmlIO.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/xmlmemory.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/xmlwriter.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/src/xpath.c
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/win32/config.h
[modify] https://crrev.com/1932201570802d524f80c5055644bfaac65a5a86/third_party/libxml/win32/include/libxml/xmlversion.h

Project Member

Comment 13 by ClusterFuzz, Nov 10 2017

ClusterFuzz has detected this issue as fixed in range 515450:515459.

Detailed report: https://clusterfuzz.com/testcase?key=5525936635379712

Fuzzer: libFuzzer_libxml_xml_read_memory_fuzzer
Job Type: libfuzzer_chrome_ubsan
Platform Id: linux

Crash Type: Undefined-shift
Crash Address: 
Crash State:
  ucnv_UTF8FromUTF8
  ucnv_convertEx_59
  xmlUconvWrapper
  
Sanitizer: undefined (UBSAN)

Regressed: https://clusterfuzz.com/revisions?job=libfuzzer_chrome_ubsan&range=395640:395746
Fixed: https://clusterfuzz.com/revisions?job=libfuzzer_chrome_ubsan&range=515450:515459

Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5525936635379712

See https://chromium.googlesource.com/chromium/src/+/master/testing/libfuzzer/reference.md for more information.

If you suspect that the result above is incorrect, try re-doing that job on the test case report page.
Project Member

Comment 14 by ClusterFuzz, Nov 10 2017

Labels: ClusterFuzz-Verified
Status: Verified (was: Assigned)
ClusterFuzz testcase 5525936635379712 is verified as fixed, so closing issue as verified.

If this is incorrect, please add ClusterFuzz-Wrong label and re-open the issue.

Sign in to add a comment