New issue
Advanced search Search tips

Issue 893950 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Oct 16
Cc:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 2
Type: Bug



Sign in to add a comment

CFI failures with unbundled libxml

Reported by evange...@foutrelis.com, Oct 10

Issue description

UserAgent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.45 Safari/537.36

Steps to reproduce the problem:
1. Build Chromium 70 with:

   - Clang 7 (for some reason Clang 6 works fine!)
   - Unbundled libxml
   - GN args: is_cfi=true use_cfi_icall=true

2. Visit https://sourceware.org/git/

What is the expected behavior?

What went wrong?
Tab crashes immediately with "Aw, Snap!".

Crashed report ID: (Not available in Chromium builds.)

How much crashed? Just one tab

Is it a problem with a plugin? No 

Did this work before? N/A 

Chrome version: 70.0.3538.45  Channel: beta
OS Version: 
Flash Version: 

To be clear, this doesn't affect Chrome; only Chromium built with Clang 7, unbundled libxml and cfi-icall enabled is affected.

The crash appears to be caused by CFI violations by calls to xmlMalloc and xmlFree. (See attached failure log above.)

vtsyrklevich@ and/or pcc@ might know why this only happens with Clang 7 (and not Clang 6) and can possibly suggest a way to make this work for unofficial Chromium builds with unbundled libxml.
 
libxml-cfi-failures.txt
1.7 KB View Download
Owner: vtsyrklevich@chromium.org
Can you explain how to perform an 'unbundled libxml' build?
Try these commands:

  $ cp build/linux/unbundle/libxml.gn third_party/libxml/BUILD.gn
  $ rm -r third_party/libxml/{src,linux}

(unbundle/libxml.gn copied over third_party/libxml/BUILD.gn will make it link to the system libxml. Removing the bundled libxml sources is a safeguard to be certain that the system libxml headers are used.)
Cc: p...@chromium.org
Getting this build to work required also making libxslt unbundled (otherwise the mismatched system libxml/chromium libxslt APIs did not line up) and adding a new public_config in libxml to add an include_dir to pull in the ICU headers.

The error is happening because chromium is trying to call function pointers in libxml that can't be verified as they are in another DSO. The only solution that I can think of is adding additional blacklist entries in your build to account for the call sites that might call those function pointers (xmlFree/xmlMalloc), e.g. by adding the following to tools/cfi/blacklist.txt.
[cfi-icall]
src:*third_party/libxslt/*
src:*third_party/libxml/*
src:*third_party/blink/renderer/core/xml/*

This is not something that I think we should upstream; however, I'm keeping this ticket open to take a look at the clang 6/7 discrepancy you noted.
I'm having difficulty building M70 with clang6 because of numerous flags in use not being supported and old build issues like https://bugs.chromium.org/p/chromium/issues/detail?id=834474 that have since been fixed. Are you sure you actually managed to build it with clang 6 and cfi-icall?
Probably not relevant anymore (see minimal xmlFree example bellow), but to build Chromium with Clang 6, "treat_warnings_as_errors=false" is likely needed. I was not building/running any tests, only the "chrome" and "chrome_sandbox" targets.

As for Chromium with system libxml, perhaps I could try adding a blacklist for the xmlMalloc/xmlFree functions similar to https://chromium-review.googlesource.com/966907 (not merged, a different fix was committed). I'm not sure however if xmlMalloc and xmlFree are the only affected functions.

I came up with a minimal example showing the difference in cfi-icall behavior between Clang 6/7. I have defined a custom free() which gets called by xmlFree. Chromium does something similar I believe, by providing a shim for malloc, free, etc. (If we had kept glibc's symbol, Clang 6 would actually recognize the CFI violation.)

Tested the following on Ubuntu 18.04 (to rule out this being an issue affecting only my Arch packages):

  $ clang-6.0 -flto -fsanitize=cfi-icall -fno-sanitize-trap=cfi -fsanitize-recover=cfi xml-free-test.c $(pkg-config --cflags --libs libxml-2.0)
  $ ./a.out

(No CFI error with Clang 6!)

Same program compiled with Clang 7 on Arch:

  $ clang-7 -flto -fsanitize=cfi-icall -fno-sanitize-trap=cfi -fsanitize-recover=cfi xml-free-test.c $(pkg-config --cflags --libs libxml-2.0)
  $ ./a.out
  xml-free-test.c:7:2: runtime error: control flow integrity check for type 'void (void *)' failed during indirect function call
(/../a.out+0x29980): note: (unknown) defined here

Hopefully you can figure out why Clang 6 behaves differently.
xml-free-test.c
120 bytes View Download
Come to think of it, it could also be a bug in LLVM. Both in Chromium's CFI reports and in the previous example, the (unknown) symbols are in the same executable but somehow lack type (?) information. 🤔
Bisecting points to: https://reviews.llvm.org/D47652 [1]

Is this something that can be fixed so the "xml-free-test.c" example above doesn't trigger a CFI failure?

[1] https://github.com/llvm-mirror/llvm/commit/7dc602e516
Now I understand what the difference is, clang 6 lacked an optimization that allowed the current libxml behavior to work correctly even though it's not supported. Once the optimization was applied libxml no longer saved a CFI-valid pointer and chromium code failed to validate it. This is not a CFI bug so unfortunately this isn't something we can address in the compiler. My best suggestion would be to maintain a patch to the blacklist like what I've mentioned above to disable cfi-icall checking for code that calls that function pointer.
Fair enough, thanks for looking into it. Feel free to close as WontFix. For now, replacing the calls to xmlMalloc/xmlFree with malloc/free seems to work for me. I'll also look into blacklisting like you suggested.

[I'm somewhat considering just bundling everything in Arch's Chromium (like Chrome does)... using system libraries is nice, but these CFI issues keep popping up!]
Status: WontFix (was: Unconfirmed)

Sign in to add a comment