New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 918662 link

Starred by 3 users

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug
Build-Toolchain



Sign in to add a comment

llvm-next: new fuzzer failure for chromeos-ec unittest.

Project Member Reported by llozano@chromium.org, Jan 2

Issue description

It looks like this is only happening for llvm-next compiler. 


Happened for snappy, link, Nautilus, and for the amd64-llvm-next-toolchain (https://cros-goldeneye.corp.google.com/chromeos/legoland/builderHistory?buildConfig=amd64-llvm-next-toolchain&buildBranch=master)


https://luci-logdog.appspot.com/logs/chromeos/buildbucket/cr-buildbucket.appspot.com/8925962940292182720/+/steps/UnitTest/0/stdout
chromeos-ec-0.0.1-r5282:  * ASAN error detected:
1h 4m
 chromeos-ec-0.0.1-r5282:  *     #0 0x7f375554f8c7 in __sanitizer_print_stack_trace /var/tmp/portage/sys-devel/llvm-8.0_pre339409_p20180926-r8/work/llvm-8.0_pre339409_p20180926/projects/compiler-rt/lib/asan/asan_stack.cc:38:3
1h 4m
 chromeos-ec-0.0.1-r5282:  *     #1 0x7f375549e6c8 in fuzzer::PrintStackTrace() /var/tmp/portage/sys-devel/llvm-8.0_pre339409_p20180926-r8/work/llvm-8.0_pre339409_p20180926/projects/compiler-rt/lib/fuzzer/FuzzerUtil.cpp:206:5
1h 4m
 chromeos-ec-0.0.1-r5282:  *     #2 0x7f375547c109 in fuzzer::Fuzzer::AlarmCallback() /var/tmp/portage/sys-devel/llvm-8.0_pre339409_p20180926-r8/work/llvm-8.0_pre339409_p20180926/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:301:5
1h 4m

assigning to Manoj since he is the current "Mage" for the next compiler.
 
Cc: allenwebb@google.com
Owner: allenwebb@chromium.org
Allen, can you check on this?

To repro, please build llvm with llvm-next USE flag:
$ USE=llvm-next sudo emerge llvm
Cc: manojgupta@chromium.org
How urgent is this? I am in the middle of trying to uprev protobuf.
Not very urgent since the fail is limited to toolchain builders but is a blocker for toolchain upgrade. Triaging it by early next week should be fine.
Branch point is January 11th. 
We need to do the toolchain roll before that day. Otherwise we miss the window.

Cc: mascasa@google.com euge...@chromium.org
I bisected it to the following CL from mascasa@:

52d9c5f7ba72aa7ab7ea2e6afa72764e41749176 is the first bad commit
commit 52d9c5f7ba72aa7ab7ea2e6afa72764e41749176
Author: Matt Morehouse <mascasa@google.com>
Date:   Thu Sep 13 21:45:55 2018 +0000

    [SanitizerCoverage] Create comdat for global arrays.
    
    Summary:
    Place global arrays in comdat sections with their associated functions.
    This makes sure they are stripped along with the functions they
    reference, even on the BFD linker.
    
    Reviewers: eugenis
    
    Reviewed By: eugenis
    
    Subscribers: eraman, hiraditya, llvm-commits
    
    Differential Revision: https://reviews.llvm.org/D51902
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342186 91177308-0d34-0410-b5e6-96231b3b80d8

:040000 040000 011567923f708d189e9692721648a7682f3cb45c 72aacf32d48ce11b92cf731431beaf10f2d84859 M      lib
:040000 040000 d80ee636ce7690d8f4c4e4dec77381ed4965c3d9 e427e83a459a94480d06f1332341d2fbdbce6b40 M      test

Allen, can you check with mascasa@ about why this CL causes a problem with chromeos-ec fuzzers. 
meanwhile, will it be possible for you to take it out from unit tests so that next compiler roll is not affected.
Does amd64-generic-fuzzer still build with llvm-next?

I am concerned turning off the unit test will just hide the root problem and may require turning off the ec fuzzers entirely.
What is the full ASan report?  And how do I repro?
Allen, is there a way to build just the EC fuzzer binaries outside the chroot?

The standalone toolchains then can be used to hopefully repro the underlying problem https://pantheon.corp.google.com/storage/browser/chromeos-image-archive/chromiumos-sdk-tryjob/R73-11519.0.0-b3308593
Allen or Manoj can provide the reproducible. 

I just wanted to point out that the log (from luci) also has hundreds of warnings like this:

chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_pcs[test_reset]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_cntrs[test_pass]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_pcs[test_pass]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_cntrs[test_fail]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_pcs[test_fail]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_cntrs[test_print_result]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_pcs[test_print_result]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_cntrs[test_get_error_count]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_pcs[test_get_error_count]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_cntrs[test_get_state]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_pcs[test_get_state]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_cntrs[test_clean_up]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_pcs[test_clean_up]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_cntrs[test_reboot_to_next_step]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_pcs[test_reboot_to_next_step]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_cntrs[test_run_step]'
  chromeos-ec-0.0.1-r5282: /usr/x86_64-pc-linux-gnu/x86_64-cros-linux-gnu/binutils-bin/2.27.0/ld.bfd.real: build/host/usb_pd_fuzz/libec.a(libec.a.2.o): warning: sh_link not set for section `__sancov_pcs[test_run_step]'


Is it possible libec.a has not been rebuilt with the new compiler?

Those warnings are targets other than the one that failed.

libec.a is a workaround for conflicting symbols that are present in both the ec code and cstdlib. These symbols are linked internally in an extra step so that outside code that depends on cstdlib can be used.
Here is a backtrace with all threads included.
all thread backtrace.txt
4.8 KB View Download
Cc: drinkcat@google.com
+drinkcat because he wrote the fuzzer that is locking up.
Cc: -drinkcat@google.com drinkcat@chromium.org
For repro in chroot, build  llvm as:

USE=llvm-next sudo emerge llvm

After that build chromeos-ec fuzzers that would show the problem.

The locking up problem seems to be a pre-existing issue based on clusterfuzz data where it seems to be hitting this issue but in a flaky way:
https://clusterfuzz.com/testcase-detail/6270664419573760
Project Member

Comment 17 by bugdroid1@chromium.org, Jan 8

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/ec/+/1d9dbd408f7d837cb2725ed0e37d775967f6bb21

commit 1d9dbd408f7d837cb2725ed0e37d775967f6bb21
Author: Allen Webb <allenwebb@google.com>
Date: Tue Jan 08 03:40:51 2019

Remove fuzzer test runs from buildall.

This removes the test runs for fuzzer targets temporarily until
crbug.com/918662 is resolved.

BRANCH=None
BUG=chromium:918662
TEST=make -j buildall

Change-Id: I80b9c4cd403924e41704462277da6d288796abc8
Signed-off-by: Allen Webb <allenwebb@google.com>
Reviewed-on: https://chromium-review.googlesource.com/1399201
Reviewed-by: Manoj Gupta <manojgupta@chromium.org>

[modify] https://crrev.com/1d9dbd408f7d837cb2725ed0e37d775967f6bb21/Makefile.rules

I'd like to look into this, but I have 0 experience with ChromeOS and am not sure how to even setup the chroot.

I do suspect that the sh_link warnings are related to the timeouts.
Owner: drinkcat@chromium.org
It looks like this might be a target specific bug, so I am changing the owner. drinkcat@ do you mind taking a look at this?
Status: Started (was: Assigned)
Status: Available (was: Started)
Commenting out common/test_util.c fixes the issue:
test_mockable void run_test(void) { }

We have:
#define test_mockable __attribute__((weak))

Basically the compiler chooses to run the mock (weak) run_test in common/test_util.c instead of the one in fuzz/host_command_fuzz.c .

Sounds like a compiler bug to me (or maybe something wrong with our hacks to prevent stdlib symbols from being picked up)...

You can debug the issue easily by adding a print in the mock function:

diff --git a/common/test_util.c b/common/test_util.c
index 92c41d52d..20cae9a0e 100644
--- a/common/test_util.c
+++ b/common/test_util.c
@@ -28,7 +28,7 @@ struct test_util_tag {
 int __test_error_count;
 
 /* Weak reference function as an entry point for unit test */
-test_mockable void run_test(void) { }
+test_mockable void run_test(void) { ccprintf("run_test mock\n"); }
 
 /* Default dummy test init */
 test_mockable void test_init(void) { }

If you see the line on start, then something is wrong.

mascasa@ Any ideas about #21.
The repro steps outside the chroot are at https://docs.google.com/document/d/1nyNcRDGBDptCP-qdyEcJcC2TuDwr7MdflcRLMeP-xJU/edit?usp=sharing 

I guess it would enough to understand why the weak decl is being picked by the linker.
My guess is that it's related to the warning:

  chromeos-ec-0.0.1-r5282: x86_64-cros-linux-gnu-objcopy: build/host/host_command_fuzz/libec.a.1.o: warning: sh_link not set for section `__sancov_cntrs[run_test]'
  chromeos-ec-0.0.1-r5282: x86_64-cros-linux-gnu-objcopy: build/host/host_command_fuzz/libec.a.1.o: warning: sh_link not set for section `__sancov_pcs[run_test]'

The linker doesn't like the run_test provided in libec, so it chooses the weak one.  Has anyone looked into rebuilding libec with the latest clang?
libec is built by the same makefile as a dependency.
FYI, this appears to be an issue in the compiler.  I've created https://reviews.llvm.org/D56516 (still needs review) to hopefully resolve.
Owner: mascasa@google.com
Status: Started (was: Available)
Owner: allenwebb@chromium.org
My patch landed in r351247.  Once that makes it to ChromeOS, you should be able to re-enable the fuzzer tests.
Project Member

Comment 28 by bugdroid1@chromium.org, Jan 17 (5 days ago)

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/d10e4285c32225aa12a209b0ef445840a39dfc63

commit d10e4285c32225aa12a209b0ef445840a39dfc63
Author: Manoj Gupta <manojgupta@google.com>
Date: Thu Jan 17 13:16:30 2019

llvm: Cherry-pick r351247

Upstream CL description:
Author: Matt Morehouse <mascasa@google.com>
Date:   Tue Jan 15 21:21:01 2019 +0000

    [SanitizerCoverage] Don't create comdat for interposable functions.

    Summary:
    Comdat groups override weak symbol behavior, allowing the linker to keep
    the comdats for weak symbols in favor of comdats for strong symbols.

    Fixes the issue described in:
    https://bugs.chromium.org/p/chromium/issues/detail?id=918662

BUG=chromium:918662
TEST=sudo emerge llvm works.

Change-Id: Ia83dfd9d9cb3e3ddb823714f8d37bfac159a8bbe
Reviewed-on: https://chromium-review.googlesource.com/1413779
Commit-Ready: Manoj Gupta <manojgupta@chromium.org>
Tested-by: Manoj Gupta <manojgupta@chromium.org>
Reviewed-by: Caroline Tice <cmtice@chromium.org>
Reviewed-by: Allen Webb <allenwebb@google.com>

[add] https://crrev.com/d10e4285c32225aa12a209b0ef445840a39dfc63/sys-devel/llvm/files/cherry/b2800c91d94c27c35d6f371819f6e4d6fb3e5404.patch
[modify] https://crrev.com/d10e4285c32225aa12a209b0ef445840a39dfc63/sys-devel/llvm/llvm-8.0_pre349610_p20190109.ebuild
[rename] https://crrev.com/d10e4285c32225aa12a209b0ef445840a39dfc63/sys-devel/llvm/llvm-8.0_pre349610_p20190109-r3.ebuild

Project Member

Comment 29 by bugdroid1@chromium.org, Jan 19 (3 days ago)

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/ec/+/683d56b24f45db76f198197f5c2687fcc80cdd58

commit 683d56b24f45db76f198197f5c2687fcc80cdd58
Author: Allen Webb <allenwebb@google.com>
Date: Sat Jan 19 08:14:05 2019

Revert "Remove fuzzer test runs from buildall."

This reverts commit 1d9dbd408f7d837cb2725ed0e37d775967f6bb21.

Reason for revert: There is a compiler fix in llvm r351247.

Original change's description:
> Remove fuzzer test runs from buildall.
>
> This removes the test runs for fuzzer targets temporarily until
> crbug.com/918662 is resolved.
>
> BRANCH=None
> BUG=chromium:918662
> TEST=make -j buildall
>
> Change-Id: I80b9c4cd403924e41704462277da6d288796abc8
> Signed-off-by: Allen Webb <allenwebb@google.com>
> Reviewed-on: https://chromium-review.googlesource.com/1399201
> Reviewed-by: Manoj Gupta <manojgupta@chromium.org>

Bug: chromium:918662
Change-Id: I002046822af9550312f6a88828331637c83e4682
Reviewed-on: https://chromium-review.googlesource.com/1418250
Commit-Ready: Allen Webb <allenwebb@google.com>
Tested-by: Allen Webb <allenwebb@google.com>
Reviewed-by: Manoj Gupta <manojgupta@chromium.org>
Reviewed-by: Allen Webb <allenwebb@google.com>
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>

[modify] https://crrev.com/683d56b24f45db76f198197f5c2687fcc80cdd58/Makefile.rules

Sign in to add a comment