New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 889967 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 11
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug
Build-Toolchain

Blocking:
issue 862442



Sign in to add a comment

LLD: security_SandboxStatus failed on binaries linked with LLD

Project Member Reported by yunlian@chromium.org, Sep 27

Issue description

/tmp/test_that_results_FfpUNn/results-1-security_SandboxStatus                        [  FAILED  ]
/tmp/test_that_results_FfpUNn/results-1-security_SandboxStatus                          FAIL: Unhandled DevtoolsTargetCrashException: Devtools target crashed
/tmp/test_that_results_FfpUNn/results-1-security_SandboxStatus/security_SandboxStatus [  FAILED  ]
/tmp/test_that_results_FfpUNn/results-1-security_SandboxStatus/security_SandboxStatus   FAIL: Unhandled DevtoolsTargetCrashException: Devtools target crashed
/tmp/test_that_results_FfpUNn/results-2-security_SandboxStatus                        [  FAILED  ]

It says chrome crashed with SIGILL.


 
Indeed, chrome crashes when I try to login Chrome.
From the core file, it looks like it triggers a stack protection error
Below is the end part of assembler code for function mojo::StructTraits<mojo_base::mojom::FileDataView, base::File>::Read(mojo_base::mojom::FileDataView, base::File*):


   0x00005f0ce13d8df3 <+307>:	mov    %fs:0x28,%rcx
   0x00005f0ce13d8dfc <+316>:	cmp    -0x18(%rbp),%rcx
   0x00005f0ce13d8e00 <+320>:	jne    0x5f0ce13d8e0b <mojo::StructTraits<mojo_base::mojom::FileDataView, base::File>::Read(mojo_base::mojom::FileDataView, base::File*)+331>
   0x00005f0ce13d8e02 <+322>:	add    $0x40,%rsp
   0x00005f0ce13d8e06 <+326>:	pop    %rbx
   0x00005f0ce13d8e07 <+327>:	pop    %r14
   0x00005f0ce13d8e09 <+329>:	pop    %rbp
   0x00005f0ce13d8e0a <+330>:	retq   
   0x00005f0ce13d8e0b <+331>:	callq  0x5f0ce6f4b4d0 <__stack_chk_fail@plt>
=> 0x00005f0ce13d8e10 <+336>:	ud2    
   0x00005f0ce13d8e12 <+338>:	int3   
   0x00005f0ce13d8e13 <+339>:	ud2    
   0x00005f0ce13d8e15 <+341>:	pushq  $0x3a
   0x00005f0ce13d8e17 <+343>:	callq  0x5f0ce6f4b510 <abort@plt>
End of assembler dump.

Owner: cmt...@chromium.org
Cc: yunlian@chromium.org llozano@chromium.org
Updates/additional information for this bug:

- If you build without CFI enabled, the image works.
- If you build with CFI enabled, and turn on CFI diagnostics, the image works.
- The mojo subtree (and file_mojo_traits.cc) is built with CFI checks, but there are no actual CFI checks in file_mojo_traits.cc, where the crash occurs.
- 'ud2', the instruction at the crash, is also used by LLVM to indicate unreachable code.
- The source code for mojo::StructTraits<mojo_base::mojom::FileDataView, base::File>::Read(mojo_base::mojom::FileDataView, base::File*), where the crash occurs looks like this:

bool StructTraits<mojo_base::mojom::FileDataView, base::File>::Read(
    mojo_base::mojom::FileDataView data,
    base::File* file) {
  base::PlatformFile platform_handle = base::kInvalidPlatformFile;
  if (mojo::UnwrapPlatformFile(data.TakeFd(), &platform_handle) !=
      MOJO_RESULT_OK) {
    return false;
  }
  *file = base::File(platform_handle, data.async());
  return true;
}

- I've successfully attached to the crashing program with gdb/gdbserver, and stepped through it up to/through the crash.  It crashes when trying to call base::File.  At that time, platform_handle is -1 and data contains a bunch of empty or zero-filled structures.
- I've also stepped through the same code, when built with gold.  platform_handle is still -1; I'm not sure what data is ("value is optimized out"), and I don't think it ever calls base::File.
- I've built the image with --icf=none.  It still fails.
- I've built the image with a linker script (as suggested in https://bugs.chromium.org/p/chromium/issues/detail?id=884989).  It still fails.
- I'm trying to build Chrome with --no-gc-sections, but that causes linking errors in libassistant.so (undefined symbols).
- I'm trying to build Chrome with a later version of llvm/lld/clang 
(llvm r344140, lld r344124).  nacl fails to build, with undefined symbols.

That's the status to date.
Status: Assigned (was: Untriaged)
Cc: manojgupta@chromium.org
Cc: yolanda....@intel.com
Caroline,

When you try build Chrome with a later version of llvm/lld/clang, what's the configuration you choose? 
Is it USE="chrome_internal"?
USE="afdo_use autotest buildcheck chrome_internal cups evdev_gestures lld opengles vaapi xkbcommon -gold verbose" emerge-samus chromeos-base/chromeos-chrome
I just set USE="afdo_use lld -gold". The build can complete successfully with the upgraded llvm/lld 8.0.0 using CL:https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/1246304

Not sure if the undefined symbol error relates to any other specific use option.

It looks like we've finally found the root cause for our issue:  There's bug in the interaction of the thinLto cache with virtual function calls.

Below is George's summary of the issue:

- We hash summaries of things that ThinLTO is going to import into a given module. We combine this with a few other hashes that describe the module/environment/etc. The resultant hash is a key in a lookup (literally thinlto-cache/llvmcache-${hash}). If that lookup hits, we skip all codegen/etc.
  - Importantly, this hash includes virtual functions that we call. So, if any devirtualization decisions change (due to more classes being added, or whatever), the hash will change
- In C++, there are multiple kinds of constructors. Ones that match /.*C1Ev$/ are "full object" constructors, and ones that match /.*C2Ev$/ are "base object" constructors. In some cases, both of these can be identical. So, clang is allowed to alias one to another.
- Clang happens to make C1Ev an alias to C2Ev, which is an actual function.
- All outer calls to constructors will call C1Ev
- When we were hashing these summaries, we weren't properly "looking through" aliases. So, the hashing logic saw C1Ev was an alias and moved on. Had we called C2Ev, we'd have picked up on the virtual calls.

The current plan is to enable LLD in Chrome OS, with ThinLTO caching disabled until we get a fix for this issue into LLVM.

Project Member

Comment 11 by bugdroid1@chromium.org, Nov 22

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8cb0b4c9041a865650d06ba67c46f66c310f9fe1

commit 8cb0b4c9041a865650d06ba67c46f66c310f9fe1
Author: Caroline Tice <cmtice@google.com>
Date: Thu Nov 22 06:07:27 2018

Update compiler BUILD.gn to disable caching with ThinLTO in ChromeOS.

ThinLTO caching currently has a bug the shows up in Chrome OS and
causes bad images to be generated with LLD.  This CL disables
ThinLTO caching with LLD for Chrome OS, until we can get the
ThinLTO caching bug fixed.

BUG= chromium:889967 
TEST=Tested this patch, build Chrome with LLD in Chrome OS with
ThinLTO enabled, and the resulting images were good.

Change-Id: I549ee133f0a40f9296ab4d57e8054bcb84e801a8
Reviewed-on: https://chromium-review.googlesource.com/c/1345537
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Reviewed-by: Peter Collingbourne <pcc@chromium.org>
Commit-Queue: Caroline Tice <cmtice@chromium.org>
Cr-Commit-Position: refs/heads/master@{#610315}
[modify] https://crrev.com/8cb0b4c9041a865650d06ba67c46f66c310f9fe1/build/config/compiler/BUILD.gn

Cc: g...@chromium.org
https://reviews.llvm.org/D55060
Landed as r348216. Up to y'all if we want to try to cherrypick that, or just pick it up when we roll past it.

Should be low-risk if we do choose to cherrypick it, as long as we let it sit for a few days to see if it causes headaches for anyone else. :)
Status: Fixed (was: Assigned)
I believe this is fixed now.

Sign in to add a comment