New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 636558 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on:
issue 637866

Blocking:
issue 634965



Sign in to add a comment

Windows ASan crashing all over the place due to StringImpl ref count issues

Project Member Reported by och...@chromium.org, Aug 10 2016

Issue description

After  bug 635715 , it appears that we are getting a large volume of crash reports on CF that weren't found on other platforms/configurations, even though the majority of these look like they shouldn't be specific to Windows. Some examples that I filed:  bug 636457 ,  bug 636456 ,  bug 636453 . 

I can reproduce these with builds pulled from https://commondatastorage.googleapis.com/chromium-browser-asan/index.html?prefix=win32-release. However, none of these reproduce on Linux ASan builds (32 or 64-bit) or a locally built Windows Chrome without any sanitizers (with full pageheap turned on).

Reid, could you please take a look into these to see if anything is going wrong with the ASan instrumentation on Windows?
 

Comment 1 by och...@chromium.org, Aug 10 2016

Labels: Stability-Memory-AddressSanitizer

Comment 2 by aarya@google.com, Aug 10 2016

Blocking: 634965
Cc: kcc@chromium.org aizatsky@chromium.org etienneb@chromium.org
This is really bad, it is good we only filed a few reports as otherwise this would creating even more developer confusion due to false reports.

e.g. bad reports
https://cluster-fuzz.appspot.com/testcase?key=4930301359030272
https://cluster-fuzz.appspot.com/testcase?key=6093690592559104
https://cluster-fuzz.appspot.com/testcase?key=4556624876535808
https://cluster-fuzz.appspot.com/testcase?key=5618336430030848

Comment 3 by r...@chromium.org, Aug 10 2016

I can reliably repro locally with my local asan build, so this is not a goma thing.

Comment 4 by aarya@google.com, Aug 10 2016

It could be related to the roll, since we didn't see these crashes before. but yes, any of these should reliably reproduce, and i also tried with blank asan_options (unlike CF), i still see same bad crash.

Comment 5 by r...@chromium.org, Aug 11 2016

I spent a while debugging this, and if you disable optimizations on SelectorChecker::checkPseudoElement with '#pragma clang optimize off', the report makes a little more sense:

    #0 0x1ad8f194 in WTF::AtomicString::isEmpty C:\src\chromium\src\third_party\WebKit\Source\wtf\text\WTFString.h:107
    #1 0x1ad89c28 in blink::SelectorChecker::checkPseudoClass C:\src\chromium\src\third_party\WebKit\Source\core\css\SelectorChecker.cpp:657

Now we see that we really are accessing a string that asan thinks has been freed. It is accessing offset 4 of a StringImpl, to pull out m_length, which is where we UAF.

I don't know why the debug location is off.

As for why it doesn't reproduce on other platforms... I don't know. It could either be a compiler bug in clang, or an environmental thing that we share with MSVC, like right-to-left argument evaluation order.

Comment 6 by r...@chromium.org, Aug 11 2016

Yeah, more and more I'm becoming convinced that this is a refcount getting dropped on the floor, maybe because of some compiler or platform-specific handle passing bug. These bytes make for a very very plausible freed StringImpl:

3:029> db 0x056228a0
056228a0  a7 00 80 58 54 00 00 00-00 00 00 ba 0a 41 41 41  ...XT........AAA
056228b0  41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA
056228c0  41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA
056228d0  41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA
056228e0  41 41 41 41 41 41 41 41-41 41 41 41 41 41 41 41  AAAAAAAAAAAAAAAA
056228f0  41 41 41 41 41 41 41 41-41 41 41 41 41 20 63 0a  AAAAAAAAAAAAA c.

There's 12 bytes of StringImpl, with a screwy refcount, and 84 bytes of plausible ASCII data.

Comment 7 by aarya@google.com, Aug 11 2016

These kind of bugs are showing up with one particular fuzzer bj_broddelwerk which is known to create recursive trees in editing. Could this be an indication of oom somehow which ASAN is incorrectly processing. Based on even more ton of reports today, it is hard to believe this as legit.
Cc: yoichio@chromium.org tkent@chromium.org yosin@chromium.org
free stack is common in many reports, could be legit too.

freed by thread T0 here:
    #0 0xe224ea8 in free e:\b\build\slave\win_upload_clang\build\src\third_party\llvm\projects\compiler-rt\lib\asan\asan_malloc_win.cc:44
    #1 0x793471f in WTF::StringImpl::destroyIfNotStatic third_party/WebKit/Source/wtf/text/StringImpl.cpp:286
    #2 0xac7d6fb in blink::BreakingContext::handleText third_party/WebKit/Source/core/layout/line/BreakingContextInlineHeaders.h:915
    #3 0xac6fa4f in blink::LineBreaker::nextLineBreak third_party/WebKit/Source/core/layout/line/LineBreaker.cpp:86

+cc some editing folks as well.
Labels: Restrict-View-SecurityTeam

Comment 10 by r...@chromium.org, Aug 11 2016

I think it's a legit UAF, and there are one of two possibilities:

1. This is a real, Windows-specific UAF that affects Chrome's official build with MSVC. This is what we set out to find in the first place. :)

2. This UAF is only present when using clang as the compiler, either because the code is relying on some undefined or implementation defined behavior, or because there is a bug in clang.

I can easily imagine that the refcounting smart pointers used by webkit have some hidden gotchas.

Comment 11 by aarya@google.com, Aug 11 2016

Summary: Windows ASan crashing all over the place due to StringImpl ref count issues (editing). (was: Windows ASan builds seemingly giving bogus reports on CF.)
Cc: esprehn@chromium.org

Comment 13 by aarya@google.com, Aug 11 2016

Cc: e...@chromium.org dgro...@chromium.org och...@chromium.org msten...@opera.com
 Issue 635047  has been merged into this issue.

Comment 14 by r...@chromium.org, Aug 11 2016

I've tried a lot of different things here. I've manually instrumented chrome so I can track the bad allocation refcount operations, I have a log of the refcount operations, and I'm diffing that log against MSVC and looking at the points in the log where things start to differ. It's slow going, though.
Components: Blink>Internals>WTF
Summary: Windows ASan crashing all over the place due to StringImpl ref count issues (was: Windows ASan crashing all over the place due to StringImpl ref count issues (editing).)
At first I thought this might be the threaded HTML parser since I see that in a bunch of the stacks, but aarya@ has some that make no sense:
https://cluster-fuzz.appspot.com/testcase?key=6025319368884224

which makes me think maybe this is a compiler/optimizer bug?

Comment 16 by r...@chromium.org, Aug 12 2016

I'm not sure it's the compiler's fault, I think it might be some subtle corner case in copy/move semantics.

Clang is picking a different constructor overload to do something somewhere either in LazyLineBreakIterator or some password form autofill code (sorry, I have it in the debugger at work). It has something to do with WebPrivatePtr in WebString. BTW that code is a heinous ODR violation, I wouldn't be surprised if that's the issue.

Comment 17 by r...@chromium.org, Aug 12 2016

My evidence for these claims is that I have a log of all the refcount operations on the WTF::StringImpl that is used after free, and the refcount logs diverge between MSVC and Clang somewhere in the code that I mentioned. I'll have to dig more tomorrow.

Comment 18 by msten...@opera.com, Aug 12 2016

Cc: -msten...@opera.com

Comment 19 by r...@chromium.org, Aug 12 2016

I'm pretty sure the bug is inside blink::BreakingContext::handleText. In an asan build, we come out of that function with one less refcount than we do with MSVC.

Comment 20 by r...@chromium.org, Aug 12 2016

Now I'm pretty sure this is a clang bug. My bet is it's a bad interaction between ASan alloca instrumentation and Windows pass-by-value conventions.

Comment 21 by r...@chromium.org, Aug 12 2016

This looks like a good reduction:

struct Foo {
  Foo();
  Foo(const Foo &o);
  ~Foo();
  int a;
};
__forceinline void h(Foo o) {}
__forceinline void g() { h(Foo()); }
void f() { g(); }

Clang miscompiles it. This comes up in BreakingContext::handleText because it calls lots of functions marked ALWAYS_INLINE, and one of them happens to contain an inalloca call (lastBreakablePositionForBreakAll).

Removing ALWAYS_INLINE makes the problem go away.

Comment 22 by r...@chromium.org, Aug 12 2016

Bug in the inliner, fixed in LLVM r278571.

Comment 23 by aarya@google.com, Aug 13 2016

Cc: h...@chromium.org
Thanks a lot Reid. Lets roll forward clang if possible early next week.

Comment 24 by r...@chromium.org, Aug 15 2016

Blockedon: 637866
Project Member

Comment 26 by bugdroid1@chromium.org, Aug 18 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/92c4889905b14456f84a4631eafb168328b5f8a5

commit 92c4889905b14456f84a4631eafb168328b5f8a5
Author: guidou <guidou@chromium.org>
Date: Thu Aug 18 10:14:35 2016

Revert of Roll clang 277962:278861 (patchset #5 id:80001 of https://codereview.chromium.org/2241413003/ )

Reason for revert:
This is suspect of breaking Mac 10.9 builder.

See
https://build.chromium.org/p/chromium.mac/builders/Mac10.9%20Tests%20%28dbg%29/builds/28850

Sample logs:
Retrying 1 test (retry #1)
[0817/224954:ERROR:kill_posix.cc(84)] Unable to terminate process group 8687: No such process
[ RUN      ] ThreadTest.StartWithOptions_StackSize
[2060/2060] ThreadTest.StartWithOptions_StackSize (CRASHED)
Retrying 1 test (retry #2)
[0817/224955:ERROR:kill_posix.cc(84)] Unable to terminate process group 8688: No such process
[ RUN      ] ThreadTest.StartWithOptions_StackSize
[2061/2061] ThreadTest.StartWithOptions_StackSize (CRASHED)
Retrying 1 test (retry #3)
[0817/224955:ERROR:kill_posix.cc(84)] Unable to terminate process group 8689: No such process
[ RUN      ] ThreadTest.StartWithOptions_StackSize
[2062/2062] ThreadTest.StartWithOptions_StackSize (CRASHED)
1 test crashed:
    ThreadTest.StartWithOptions_StackSize (../../base/threading/thread_unittest.cc:135)
Tests took 21 seconds.
Additional test environment:
    CHROME_DEVEL_SANDBOX=/opt/chromium/chrome_sandbox
    LANG=en_US.UTF-8
Command: ./base_unittests --brave-new-test-launcher --test-launcher-bot-mode --test-launcher-summary-output=/b/swarm_slave/w/ionk6XHB/output.json

Original issue's description:
> Roll clang 277962:278861
>
> BUG= 636558 , 637866 
>
> Committed: https://crrev.com/29ecb32ba7b5ca183782c5f5d2b7a1ae23309edb
> Cr-Commit-Position: refs/heads/master@{#412745}

TBR=rnk@chromium.org,dcheng@chromium.org,dpranke@chromium.org,inferno@chromium.org,hans@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= 636558 , 637866 

Review-Url: https://codereview.chromium.org/2249163007
Cr-Commit-Position: refs/heads/master@{#412785}

[modify] https://crrev.com/92c4889905b14456f84a4631eafb168328b5f8a5/base/threading/thread_unittest.cc
[modify] https://crrev.com/92c4889905b14456f84a4631eafb168328b5f8a5/build/config/compiler/BUILD.gn
[modify] https://crrev.com/92c4889905b14456f84a4631eafb168328b5f8a5/tools/clang/scripts/update.py

Project Member

Comment 28 by bugdroid1@chromium.org, Aug 19 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b557e12b23a45aaff07428fc164934020e10c4ff

commit b557e12b23a45aaff07428fc164934020e10c4ff
Author: hans <hans@chromium.org>
Date: Fri Aug 19 01:24:02 2016

Revert of Roll clang 277962:278861 (patchset #6 id:100001 of https://codereview.chromium.org/2241413003/ )

Reason for revert:
This broke the build on a certain Mac builder:
https://build.chromium.org/p/chromium/builders/Mac/builds/18669

It looks like the build is timing out maybe? All other Mac
builders seem happy though, which makes this confusing, but
the buildbot logs very clearly implicate the roll: the same
kind of error happened last time it landed, and it went away
on the previous revert.

Original issue's description:
> Roll clang 277962:278861
>
> BUG= 636558 , 637866 
>
> Committed: https://crrev.com/df7510f96322254211068a52c2b459c0fe0b2090
> Cr-Commit-Position: refs/heads/master@{#412943}

TBR=rnk@chromium.org,dcheng@chromium.org,dpranke@chromium.org,inferno@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= 636558 , 637866 

Review-Url: https://codereview.chromium.org/2257333002
Cr-Commit-Position: refs/heads/master@{#412999}

[modify] https://crrev.com/b557e12b23a45aaff07428fc164934020e10c4ff/base/threading/thread_unittest.cc
[modify] https://crrev.com/b557e12b23a45aaff07428fc164934020e10c4ff/build/config/compiler/BUILD.gn
[modify] https://crrev.com/b557e12b23a45aaff07428fc164934020e10c4ff/tools/clang/scripts/update.py

Status: Fixed (was: Assigned)
Project Member

Comment 31 by sheriffbot@chromium.org, Aug 21 2016

Labels: -Restrict-View-SecurityTeam Restrict-View-SecurityNotify
Project Member

Comment 32 by sheriffbot@chromium.org, Nov 27 2016

Labels: -Restrict-View-SecurityNotify allpublic
This bug has been closed for more than 14 weeks. Removing security view restrictions.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Sign in to add a comment