New issue
Advanced search Search tips
Starred by 7 users

Issue metadata

Status: Fixed
Owner: ----
Closed: Apr 2015
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug



Sign in to add a comment

Security: OSX: TCP OOB data will hang main process event loop

Reported by ken...@corp.sandstorm.io, Nov 30 2014 Back to list

Issue description

Note: This is "merely" a DoS attack (that happens to take down the whole browser with no more than a single, perfectly-valid TCP packet delivered on any connection regardless of protocol), so maybe not considered a critical security problem for Chrome. However, as this affects many other OSX apps -- and may be a serious problem for servers in particular -- the details may be sensitive. I am reporting the problem directly to Apple as well.

VULNERABILITY DETAILS
Please provide a brief explanation of the security issue.

If Chrome on Mac OSX receives, over any network connection, a single byte of out-of-band data, the main browser process will go into a busy loop and hang, locking up the UI and requiring a force-quit.

The problem is that OSX's kqueue interface raises EVFILT_READ events on OOB data. However, Chrome does not expect OOB data, therefore it just does a normal recv(), gets zero bytes back, and calls it a day, without ever checking for OOB. Chrome uses kqueue in level-triggered mode, so when it returns to the event queue, the same event is returned again. This leads to a busy loop and starvation of other events.

The fix is (I think) to check for the EV_OOBAND flag on EVFILT_READ events and, if present, perform a recv() with MSG_OOB to clear the OOB buffer (which is, I think, never allowed to contain more than one byte). Or maybe just close the connection, because it's evil.

This problem does not affect any other platform because all other platforms' event-handling APIs treat OOB data as a whole different event type, not as a read event. For instance, the classic poll() system call has POLLIN for read and POLLPRI for OOB data. Even FreeBSD does not raise EVFILT_READ on OOB data; this seems to be a "feature" of OSX alone.

VERSION
Chrome Version: 39.0.2171.71 stable
Operating System: Mac OSX

REPRODUCTION CASE

Connect to oob.sandstorm.io port 25436. The server will return to you a single byte of OOB data, locking up Chrome on Mac.

I have attached the source code for this server, if you want to test locally.

FOR CRASHES, PLEASE INCLUDE THE FOLLOWING ADDITIONAL INFORMATION
Type of crash: browser
Crash State: infinite loop trying to handle I/O event
Client ID (if relevant): N/A

 
sendoobserver.c
1.6 KB Download

Comment 1 by kenrb@chromium.org, Dec 1 2014

Cc: rsesek@chromium.org
Labels: Security_Impact-Stable Security_Severity-Low Pri-2
Owner: byungchul@chromium.org
Status: Assigned (was: NULL)
Thanks for the report. rsesek@ has verified the issue.

byungchul@, can you please have a look at this or help reassign? It needs an owner who is familiar with the sockets code in base/.
Here's a sample from the process when this happens. It looks like this may be a bug in libevent itself or in how  MessagePumpLibevent uses it.

    1990 Thread_3021428: Chrome_IOThread
    + 1927 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff8dfa6fc9]
    + ! 1927 _pthread_start  (in libsystem_pthread.dylib) + 137  [0x7fff8dfa272a]
    + !   1927 _pthread_body  (in libsystem_pthread.dylib) + 138  [0x7fff8dfa2899]
    + !     1927 base::(anonymous namespace)::ThreadFunc(void*)  (in Chromium Framework) + 171  [0x10c732e3b]  platform_thread_posix.cc:80
    + !       1927 base::Thread::ThreadMain()  (in Chromium Framework) + 211  [0x10c736e43]  thread.cc:228
    + !         1927 content::BrowserThreadImpl::IOThreadRun(base::MessageLoop*)  (in Chromium Framework) + 24  [0x10f0064a8]  browser_thread_impl.cc:218
    + !           1927 base::MessageLoop::Run()  (in Chromium Framework) + 29  [0x10c7074bd]  message_loop.cc:310
    + !             1927 base::RunLoop::Run()  (in Chromium Framework) + 99  [0x10c71cd03]  run_loop.cc:55
    + !               1927 base::MessagePumpLibevent::Run(base::MessagePump::Delegate*)  (in Chromium Framework) + 129  [0x10c6c3931]  message_pump_libevent.cc:236
    + !                 986 event_base_loop  (in Chromium Framework) + 669  [0x10c74780d]  event.c:504
    + !                 : 922 kq_dispatch  (in Chromium Framework) + 83  [0x10c749d13]  kqueue.c:232
    + !                 : | 922 kevent  (in libsystem_kernel.dylib) + 10,20,...  [0x7fff88c5f64a,0x7fff88c5f654,...]
    + !                 : 18 kq_dispatch  (in Chromium Framework) + 260  [0x10c749dc4]  kqueue.c:293
    + !                 : | 9 event_active  (in Chromium Framework) + 84,103,...  [0x10c747f64,0x10c747f77,...]  event.c:961
    + !                 : | 3 event_active  (in Chromium Framework) + 17  [0x10c747f21]  event.c:810
    + !                 : | 3 event_active  (in Chromium Framework) + 119  [0x10c747f87]  event.c:814
    + !                 : | 3 event_active  (in Chromium Framework) + 44  [0x10c747f3c]  event.c:954
    + !                 : 15 kq_dispatch  (in Chromium Framework) + 83,78  [0x10c749d13,0x10c749d0e]  kqueue.c:232
    + !                 : 5 kq_dispatch  (in Chromium Framework) + 0,1  [0x10c749cc0,0x10c749cc1]  kqueue.c:219
    + !                 : 5 kq_dispatch  (in Chromium Framework) + 85  [0x10c749d15]  kqueue.c:234
    + !                 : 4 kq_dispatch  (in Chromium Framework) + 244  [0x10c749db4]  kqueue.c:290
    + !                 : 3 kq_dispatch  (in Chromium Framework) + 320,111  [0x10c749e00,0x10c749d2f]  kqueue.c:246
    + !                 : 3 kq_dispatch  (in Chromium Framework) + 141,131  [0x10c749d4d,0x10c749d43]  kqueue.c:249
    + !                 : 3 kq_dispatch  (in Chromium Framework) + 222,230  [0x10c749d9e,0x10c749da6]  kqueue.c:270
    + !                 : 3 kq_dispatch  (in Chromium Framework) + 260  [0x10c749dc4]  kqueue.c:293
    + !                 : 3 kq_dispatch  (in Chromium Framework) + 398,391  [0x10c749e4e,0x10c749e47]  kqueue.c:298
    + !                 : 1 kq_dispatch  (in Chromium Framework) + 51  [0x10c749cf3]  kqueue.c:228
    + !                 : 1 kq_dispatch  (in Chromium Framework) + 92  [0x10c749d1c]  kqueue.c:235

bug-437642.txt
149 KB View Download
Indeed, the bug is partly in libevent, but it's unclear how libevent could fully solve it without breaking abstractions.

On OSX, libevent needs to look for the (undocumented) EV_OOBAND flag when it gets an EVFILT_READ event from kqueue(2). What to do next is unclear. Seemingly the "right" thing to do would be for it to inform the application through the libevent API that OOB data is available, but this is an API change, and it would then be up to the app to handle that OOB data (you can lead a horse to water...). Another option would be for libevent to automatically read and discard the OOB data -- more of a hack, but probably fine for all existing users. Probably.

You might be able to fix the problem without a change to libevent with this heuristic: If you receive a "read" event from libevent, but read() immediately fails with EAGAIN (i.e. there is no data available), then additionally do a MSG_OOB recv() to check for OOB data. This could waste a syscall in some cases, but should be rare enough not to affect performance.

-Kenton
Owner: mmenke@chromium.org
I believe Matt is better person to handle this.
Cc: mmenke@chromium.org
Labels: Cr-Internals-Network
Owner: ----
I'm not familiar with OOB data, OSX, or the net code in base, and just don't have any time to work on this.

Comment 6 by rsesek@chromium.org, Dec 19 2014

I believe we can set SO_OOBINLINE to have the normal read() return OOB data.

mmenke: Do you think that'd be reasonable to do that in //net's POSIX socket implementation? If you're not the right decision maker, can you CC that person?

I've attached a simple client that complements the sendoobserver.c demonstrating that the SO_OOBINLINE can clear the OOB data.
recvoob.c
1.2 KB Download
Status: Available (was: NULL)
I think it's reasonable to handle OOB data in net/, and it seems like ignoring it is more consistent with behavior on other platforms than returning an error when we get it.

I do think this extra logic should be OSX/iOS-specific, since it's not needed on other POSIX platforms.

HOWEVER, I'm not sure setting SO_OOBINLINE is safe to do.  When set, we can't determine if what we're reading from the socket is inline data or not.  Is is guaranteed on OSX if we get a read notification, recv fails with EAGAIN, that 1) we have SO_OOBINLINE, and 2), the next byte of data we read with SO_OOBINLINE set will be OOB data?
I would argue that simply setting SO_OOBINLINE is fine on all platforms:

No protocol which Chrome speaks calls for OOB data or (I'm guessing) even specifies what to do with OOB data. Therefore, any time Chrome receives OOB data, the request is outside of spec and Chrome can respond any way it wants. So, interpreting OOB bytes as if they were sent as part of the main stream should be fine.

Meanwhile, if the sender is in fact malicious (which they almost certainly are, since it's pretty hard to "accidentally" send an OOB packet), inlining those OOB bytes doesn't give them any additional power to do damage, since they obviously could have sent arbitrary data in the main stream anyway.
That is changing current behavior pretty significantly.  Without some idea of impact, I'd be very reluctant to just arbitrarily change web-visible behavior, given the huge number of servers that do very weird, very broken stuff, that just happens to work given the current state of browsers.
I know this sounds crazy, but I actually think it would be reasonably safe to assume that no one in the world intentionally sends OOB data on HTTP connections. In addition to being obscure, the implemented behavior of OOB varies wildly -- many systems intentionally do not follow the spec. For example, on Linux, the OOB buffer is only 1 byte; if a second byte is received, the first byte is inlined at an arbitrary point. Generally, anyone who is touching OOB is asking for trouble.

Moreover, given that Mac Chrome has apparently for its entire history totally locked up any time it received OOB data, yet no one noticed, presumably it is not that common. :)

That said, reading and ignoring the bytes shouldn't be that hard either... but will probably require modifying libevent. So, up to you.
Re: #8: I can't find any guarantees about how the kernel is supposed to place data in the user queue if SO_OOBINLINE is set. It looks like that as OOB data are received, they are placed in the queue along-side normal inline data (so presumably FIFO irrespective of inline vs OOB).

Re: #11: I think I agree that it should be safe to set SO_OOBINLINE, because if servers were sending OOB data, then I think we would have heard about Chrome locking up like this long ago.

But if we don't want to set SO_OOBINLINE, I don't think this would require a change to libevent. Instead, when libevent signals the MessagePump to handle the socket, if the return value of read() is -1 w/ EAGAIN, it could then try and to a recv(…, MSG_OOB) to read and discard the the OOB data. It'd also be good to figure out a way to short-circuit this infinite EAGAIN loop that causes the IO thread to lock up, but I don't know enough about //net architecture to figure that out. E.g., if both the read() and recv() return EAGAIN, consider closing the socket after the MessagePump activates the read callback more than once or twice.

Changing libevent could make that a little cleaner, if it could report the OOB (or if Apple didn't raise EVFILT_READ for OOB), but I don't think it's strictly necessary.

On OS X, I also noticed that there's an SO_WANTOOBFLAG, which looks like it just always ensures that receive operations (even just read()) are called with MSG_OOB. I don't know how/why this exists or how (if at all) it differs from SO_OOBINLINE—it's not documented. It looks like this will cause it to always try reading OOB data first, and then inline if no OOB data was present. Contrast that to SO_OOBINLINE, which places the data in a different queue as it arrives.

FYI: This is the kernel source I was looking at http://opensource.apple.com/source/xnu/xnu-2422.115.4/ (specifically bsd/kern/uipc_socket.c:soreceive). I'm not very familiar with the BSD socket stuff in the kernel, so take this with a grain of salt :).
Thanks for the info, guys.  Given what you both said, my main concerns about just setting SO_OOBINLINE on posix:

1)  Differing behaviors across platforms.  The behavior of our socket classes should vary as little as possible across platforms.

2)  Imagine a server that sends OOB data...And then we have site that works with other browsers, but doesn't with chrome (Admittedly, would presumably hang other browsers on OSX, if they do what we're doing now).  I really don't think we want to anyone to get stuck investigating such a bug.  Note that if we have special smarts to add the SO_OOBINLINE flag only in the case it seems to be needed, emit a NetLog event, and then continue to read the data, there's actually a reasonable way to diagnose the issue, but if we just always apply the flag, the issue becomes much harder to investigate.

That having been said, I certainly do agree we want to resolve this fairly promptly.
A different question: are consumers of a net::SocketLibevent ever interested in OOB data, including TCP Urgent Data (RFC 6093 provides some color)? If yes, then it seems like SocketLibevent should have a mechanism for reading that data or notifying the client of the OOB data, rather than just treating it like normal inline data. E.g., SocketLibevent::Read could return ERR_OOB_DATA or somesuch to let the client know it needs to make a call to ::ReadOOBData() to handle it.

These semantics could be accomplished by having DoRead() first do a read(), and if that returned -1 EAGAIN, issue a recv(…, MSG_OOB|MSG_PEEK), and have that return ERR_OOB_DATA if rv >= 0. Then a call to ::ReadOOBData() would issue the same recv() without MSG_PEEK. If the recv() also returned EAGAIN, close the socket because something went weird.

For Chrome's HTTP stack, I think any server sending OOB data could safely be considered an error, and the connection could be closed with an error. For non-HTTP sockets, it would give the client/application an opportunity to handle the OOB data as it sees fit. That seems like a more correct thing to do, rather than trying to force OOB data into the inline data stream for clients of SocketLibevent. It seems impossible to do that correctly, since the class has no concept of how the higher-level application wants OOB data handled.
I'm fine with returning an error if we get OOB data.  As we don't currently support it, presumably no consumers are currently interested in OOB data, and we're most likely safe just erroring out.  Could see this breaking some sort of platform-dependent Android/iOS/ChromeOS thing where current embadders expect us just to ignore OOB data, so if we go this route, definitely want to leave plent of lead time to notice any such issues.
Labels: -Restrict-View-SecurityTeam -Type-Bug-Security -Security_Impact-Stable -Security_Severity-Low Restrict-View-EditIssue Type-Bug
Browser-level DoS is not considered a security vulnerability, so I'm removing this from that queue.
OK, as long as the issue remains restricted-access -- Considering that this issue affects a wide range of OSX apps, including servers, Apple might feel that it is a sensitive security issue for OSX.
FTR, I did throw together a patch for this a while ago: https://codereview.chromium.org/879723002/. But I couldn't figure out the right thing to do on other OSes (maybe just #ifdef this to OS_MACOSX?), so I didn't push it forward.
FYI, over the weekend I reported this problem to Node.js since it affects them too, and they rapidly put together a patch which they'd like to ship with a CVE ASAP. They decided to update libuv to set OOBINLINE and have already committed the change: https://github.com/libuv/libuv/commit/e19089f7b14800d3f9bd74d9699e841e4b61a36d

I guess this means that the problem is now public, for some definition of public.
According to Apple-SA-2015-04-08-2, this should be fixed in OS X 10.10.3:

Kernel
Available for:  OS X Yosemite v10.10 to v10.10.2
Impact:  A remote attacker may be able to cause a denial of service
Description:  A state inconsistency issue existed in the handling of
TCP out of band data. This issue was addressed through improved state
management.
CVE-ID
CVE-2015-1105 : Kenton Varda of Sandstorm.io
Indeed. Though they don't seem to have documented anywhere what they did to fix it (the short description they gave makes no sense). I'll experiment with it this afternoon.
I tested Chrome on 10.10.3, as well as your demo program, and this appears to be fixed.

I think the change occurred in soo_kqfilter. It's hard to figure out the fix without spending more time to recover the constants and struct offsets, but it looks like it avoids raising EVFILT_READ in a certain case (maybe unless SO_OOBINLINE). When Apple release the 10.10.3 source code, this should be easier to verify.

Attaching what I think is the relevant portion of the change.
soo_kqfilter-10.10-.2.png
80.6 KB View Download
soo_kqfilter-10.10-.3.png
96.5 KB View Download
Yep, looks fixed.

Here's my writeup: https://blog.sandstorm.io/news/2015-04-08-osx-security-bug.html

I guess this bug can be un-restricted now?
Labels: -Restrict-View-EditIssue CVE-2015-1105
Status: Fixed (was: NULL)
Yup. I wanted to wait for your confirmation.
Labels: OS-Mac
Labels: CVE_description-submitted

Sign in to add a comment