New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 770889 link

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Closed: Dec 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

Blocking:
issue 769527



Sign in to add a comment

clang: undefined reference to `__compiletime_assert_xyz'

Project Member Reported by mka@chromium.org, Oct 2 2017

Issue description

Some kernel drivers that make use of the compiletime_assert() macros fail to build with clang.

One example is the NFP ethernet driver:

drivers/built-in.o: In function `__nfp_eth_set_aneg':
(.text+0x2f7b17): undefined reference to `__compiletime_assert_474'
drivers/built-in.o: In function `__nfp_eth_set_aneg':
(.text+0x2f7b41): undefined reference to `__compiletime_assert_478'
...

http://elixir.free-electrons.com/linux/v4.12.14/source/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c

The driver doesn't use compiletime_assert() directly, but through:

FIELD_GET (http://elixir.free-electrons.com/linux/v4.12.14/source/include/linux/bitfield.h#L100)
  BUILD_BUG_ON_MSG (http://elixir.free-electrons.com/linux/v4.12.14/source/include/linux/bug.h#L54)
    compiletime_assert (http://elixir.free-electrons.com/linux/v4.12.14/source/include/linux/compiler.h#L528)

http://elixir.free-electrons.com/linux/v4.12.14/source/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c

Steps to reproduce:

cd src/third_party/kernel/v4.12
make ARCH=x86 defconfig
edit .config, set CONFIG_NFP to 'y' and add '# CONFIG_NFP_DEBUG is not set'
make ARCH=x86 -j 48 CC=clang
 
__compiletime_assert is a macro. Most likely an incorrect append operation somewhere expanding the name to __compiletime_assert_474 instead of __compiletime_assert(something)

Comment 2 by mka@chromium.org, Oct 2 2017

474 is the line number of the compiletime_assert() statement. It is passed as suffix to __compiletime_assert, which expands it to __compiletime_assert_<line_no>:

#define __compiletime_assert(condition, msg, prefix, suffix)		\
	do {								\
		bool __cond = !(condition);				\
		extern void prefix ## suffix(void) __compiletime_error(msg); \
		if (__cond)						\
			prefix ## suffix();				\
		__compiletime_error_fallback(__cond);			\
	} while (0)


#define _compiletime_assert(condition, msg, prefix, suffix) \
	__compiletime_assert(condition, msg, prefix, suffix)


#define compiletime_assert(condition, msg) \
	_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)


mka@ Do you have a preprocessed file that I can check?
Ok, I was able to repro it (partially). The root cause is this code relies on compiler optimizations to get rid of it.


$ cat checker.cpp 
# define __compiletime_error(message) __attribute__((error(message)))

#ifndef __compiletime_error_fallback
#define __compiletime_error_fallback(condition) do { } while (0)
#endif

#define __compiletime_assert(condition, msg, prefix, suffix)		\
	do {								\
		bool __cond = !(condition);				\
		extern void prefix ## suffix(void) __compiletime_error(msg); \
		if (__cond)						\
			prefix ## suffix();				\
		__compiletime_error_fallback(__cond);			\
	} while (0)

#define _compiletime_assert(condition, msg, prefix, suffix) \
	__compiletime_assert(condition, msg, prefix, suffix)

#define compiletime_assert(condition, msg) \
	_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)


int main()
{
    int a = 0;
    bool b = (a == 0);
    compiletime_assert(b, "Passing true");
}

$ cat checker.ii
# 1 "checker.cpp"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 346 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "checker.cpp" 2
# 23 "checker.cpp"
int main()
{
    int a = 0;
    bool b = (a == 0);
    do { bool __cond = !(b); extern void __compiletime_assert_27(void) __attribute__((error("Passing true"))); if (__cond) __compiletime_assert_27(); do { } while (0); } while (0);
}

$ x86_64-cros-linux-gnu-clang++ -o main checker.cpp -Wno-unknown-attributes -O2 // This passes


 $ x86_64-cros-linux-gnu-clang++ -o main checker.cpp -Wno-unknown-attributes -O0
/tmp/checker-368a9b.o:checker.cpp:function main: error: undefined reference to '__compiletime_assert_27()'
clang-5.0: error: linker command failed with exit code 1 (use -v to see invocation) // This failed since -O0 didn't remove the call to __compiletime_assert_27.

Now depending on the source code, clang optimizations might not remove the call to the __compiletime_assert_@@@@ functions.
TBH, relying on compiler optimizations to make the macro work does not look like portable behavior to me.

Comment 5 by mka@chromium.org, Oct 2 2017

The a preprocessed file is a attached. (Note to future self: print KBUILD_CFLAGS from kernel Makefile and use this + '-I include -I arch/x86/include/ -E' to generate the preprocessed version).

I agree that relying on optimization behavior of the compiler isn't really portable, but it's what we have for now unless we come up with a better solution :(

btw, CFLAGS of the kernel include -O2.
nfp_net_ethtool.c.pp
1.2 MB Download
clang is not able to clean up the function nfp_eth_set_bit_config called by __nfp_eth_set_aneg. 

Checking why it does not happen.
Root cause is use of __builtin_constant_p. Clang is returning 0 for some of its uses while gcc returns 1.

https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Other-Builtins.html
You can use the built-in function __builtin_constant_p to determine if a value is known to be constant at compile time and hence that GCC can perform constant-folding on expressions involving that value. The argument of the function is the value to test. The function returns the integer 1 if the argument is known to be a compile-time constant and 0 if it is not known to be a compile-time constant. A return of 0 does not indicate that the value is not a constant, but merely that GCC cannot prove it is a constant with the specified value of the -O option.

You may use this built-in function in either a macro or an inline function. However, if you use it in an inlined function and pass an argument of the function as the argument to the built-in, GCC never returns 1 when you call the inline function with a string constant or compound literal (see Compound Literals) and does not return 1 when you pass a constant numeric value to the inline function unless you specify the -O option.


static __always_inline int
nfp_eth_set_bit_config(struct nfp_nsp *nsp, unsigned int raw_idx,
                       const u64 mask, unsigned int val, const u64 ctrl_bit)

...
...
        if (val == FIELD_GET(mask, reg))
...
...

Function nfp_eth_set_bit_config (marked always inline) is called by __nfp_eth_set_aneg with immediate arguments. However, clang cannot prove that the value is constant and returns while gcc returns 1.


#define __BF_FIELD_CHECK(_mask, _reg, _val, _pfx)			\
	({								\
		BUILD_BUG_ON_MSG(!__builtin_constant_p(_mask),		\    /// clang nnot prove this to be constant while gcc can. 
				 _pfx "mask is not constant");		\
		BUILD_BUG_ON_MSG(!(_mask), _pfx "mask is zero");	\
		BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ?		\
				 ~((_mask) >> __bf_shf(_mask)) & (_val) : 0, \
				 _pfx "value too large for the field"); \
		BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull,		\
				 _pfx "type of reg too small for mask"); \
		__BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +			\
					      (1ULL << __bf_shf(_mask))); \
	})


So the issue is gcc by design  does inlining first and can prove the value as constant when -O is specified.
Clang however does not take inlining into account and hence cannot prove the constant condition to be true.

I will check how easy is to add the support for constant evaluation  with inlining in clang. 
There is an existing upstream bug for same builtin_constant_p issue at https://bugs.llvm.org/show_bug.cgi?id=4898
mka@ It might not be easy to get a fix in clang in short term. Can you talk to the authors of nfp driver to work out a (interim?) solution?

Comment 10 by mka@chromium.org, Oct 3 2017

Thanks for the investigation and involving upstream LLVM in the discussion.

I'll try to get this addressed in the kernel.

Comment 11 by mka@chromium.org, Oct 13 2017

It looks like the issue in the NFP driver will be fixed in the kernel.

There is another instance of __compiletime_assert_N not being optimized away, which doesn't involve inline functions:

drivers/media/platform/qcom/camss-8x16/camss-csid.o: In function `msm_csid_subdev_init':
drivers/media/platform/qcom/camss-8x16/camss-csid.c:(.text+0x32): undefined reference to `__compiletime_assert_804'
(and many others in the same driver)

The above instance stems from the to_device_index 'call' here:

http://elixir.free-electrons.com/linux/v4.14-rc4/source/drivers/media/platform/qcom/camss-8x16/camss-csid.c#L230

These are the definitions of to_device_index and the underlying macros:

#define to_device_index(ptr_module, index)      \
        (to_camss_index(ptr_module, index)->dev)
        
#define to_camss_index(ptr_module, index)       \
        container_of(module_pointer(ptr_module, index), \
                     struct camss, ptr_module

#define module_pointer(ptr_module, index)       \
        ((const struct ptr_module##_device (*)[]) &(ptr_module[-(index)]))

#define container_of(ptr, type, member) ({				\
	void *__mptr = (void *)(ptr);					\
	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
			 !__same_type(*(ptr), void),			\
			 "pointer type mismatch in container_of()");	\
	((type *)(__mptr - offsetof(type, member))); })


From the earlier discussion I deduce that clang cannot prove that conditions in  BUILD_BUG_ON_MSG are constant.

A pre-processed version of the file is attached.

(updated note to self for pre-processed output with some additional include paths: -I include/ -I include/uapi/ -I arch/x86/include/ -I arch/x86/include/generated -E)
camss-csid.c.pp
987 KB Download
Cc: g...@chromium.org
George, Can you take a look at the pre-processed file?
From a quick glance, clang might not be evaluating __builtin_types_compatible_p in same way as gcc does.

https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

Comment 13 by g...@chromium.org, Oct 16 2017

Reduced test-case:

```
int main() {
  return __builtin_types_compatible_p(
     typeof(*((const int (*)[])0)),
     typeof(*((int (*)[])0)));
}
```

Compiling with clang trunk, main returns 0. Compiling with gcc 6.3, main returns 1. Moreover, clang believes that `const int[]` and `int[]` are incompatible; gcc does not.

Interestingly, if we tweak the test-case slightly to use pointers instead of arrays:
```
int main() {
  return __builtin_types_compatible_p(
     typeof(*((const int **)0)),
     typeof(*((int **)0)));
}
```

...Both clang and gcc agree that these types are *not* compatible.

My mental model for foo[] has always been "it's syntactic sugar for foo*," though I'm unsure how well that aligns with reality. If it's not too far off, this seems like a GCC bug.

In any case, glancing through the camss code, I don't understand why there's a `const` in `(const struct ptr_module##_device (*)[])` to begin with. All the callsites I saw seemed to be passing in a vanilla `struct csid_device *`, and the `const` just gets casted away by container_of anyway. I don't have the source to check if that breaks anything, but locally removing the `const`s from the test-case made __compiletime_assert_804 disappear for me.
Thanks George,

According to the GCC builtin Doc:
This built-in function ignores top level qualifiers (e.g., const, volatile). For example, int is equivalent to const int.


Comment 15 by g...@chromium.org, Oct 16 2017

Yes, but we're dealing with a `const int[]`, so I'd be surprised if that rule is meant to apply here (it doesn't in the case of an `int*` vs `const int*`, as detailed above).

Again, it could totally be that my mental model of what `int[]` means is broken; I plan to bug a language lawyer tomorrow. :)
I have another test case for you then :)

$ cat type.c 
#include <stdio.h>

int main() {
   int a = __builtin_types_compatible_p(
        typeof(int* const),
	     typeof(int*));
   int b = __builtin_types_compatible_p(
        typeof(int const*),
	     typeof(int*));

   printf("Are int* const and int* compatible: %d\n", a);
   printf("Are int const* and int* compatible: %d\n", b);
}

$ gcc -o type type.c // clang also produces same result.
$ ./type 
Are int* const and int* compatible: 1
Are int const* and int* compatible: 0

Comment 17 by mka@chromium.org, Oct 16 2017

Thanks George,

Removing the const attribute fixes some instances, but not all of them:

ERROR: "__compiletime_assert_1462" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
ERROR: "__compiletime_assert_236" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
ERROR: "__compiletime_assert_622" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
ERROR: "__compiletime_assert_1336" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
ERROR: "__compiletime_assert_1526" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
ERROR: "__compiletime_assert_1354" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
ERROR: "__compiletime_assert_2154" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
ERROR: "__compiletime_assert_1580" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
ERROR: "__compiletime_assert_2186" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
ERROR: "__compiletime_assert_2865" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
ERROR: "__compiletime_assert_1292" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!

mka@ I suspect they are coming from another file. Can you identify which file it is?

Comment 19 by g...@chromium.org, Oct 16 2017

I talked with Richard Smith offline. He said that while this ignoring-const behavior is mildly quirky, we should ideally match it in clang. I have some perf results I'm waiting for, so I'm happy to look at that.

> My mental model for foo[] has always been "it's syntactic sugar for foo*,"

^ Also, I learned that this is incorrect. foo[] and foo* are apparently very different types, even though one decays to the other quite easily.

Comment 20 by mka@chromium.org, Oct 16 2017

Yes, the remaining instances are from other files of the driver. They stem from similar macros with const arrays that are used with container_of: ispif_line_array, vfe_line_array.

The driver builds with the const specifiers removed.


Comment 21 by g...@chromium.org, Oct 16 2017

r315951 in clang fixes the const vs non-const issue for me. Please let me know if I missed anything. :)
Thanks a lot George, this was far more quicker than I had expected :).
I'll cherry pick the CL to current LLVM.

Comment 23 by lloz...@google.com, Oct 17 2017

thanks for the quick help George.
Project Member

Comment 24 by bugdroid1@chromium.org, Oct 17 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/aeba29d85bad2c36b91b301a6de38824885a825f

commit aeba29d85bad2c36b91b301a6de38824885a825f
Author: Manoj Gupta <manojgupta@google.com>
Date: Tue Oct 17 20:02:51 2017

llvm: Pick an upstream CL to fix a kernel build issue.

Cherry pick CL r315951 to fix a Linux kernel build issue.

BUG= chromium:770889 
TEST=int[] and const int[] are reported compatible in clang when
using __builtin_types_compatible_p.

commit 73c1500cc3b3a4cd39a7c59753a7d0e63887a839
Author: George Burgess IV <george.burgess.iv@gmail.com>
Date:   Mon Oct 16 22:58:37 2017 +0000

    Make __builtin_types_compatible_p more like GCC's

    GCC ignore qualifiers on array types. Since we seem to have this
    function primarily for GCC compatibility, we should try to match that
    behavior.

Change-Id: Ie8ec3d65d654d36d81e2ae0fb37b7016c73f264c
Reviewed-on: https://chromium-review.googlesource.com/722455
Commit-Ready: Manoj Gupta <manojgupta@chromium.org>
Tested-by: Manoj Gupta <manojgupta@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[rename] https://crrev.com/aeba29d85bad2c36b91b301a6de38824885a825f/sys-devel/llvm/llvm-5.0_pre305632_p20170806-r9.ebuild
[add] https://crrev.com/aeba29d85bad2c36b91b301a6de38824885a825f/sys-devel/llvm/files/cherry/73c1500cc3b3a4cd39a7c59753a7d0e63887a839.patch

Comment 25 by mka@chromium.org, Nov 22 2017

Could we try to get r315951 merged into the clang 5.x branch to make it available in the next clang 5 release?
I believe this has already made it into the latest clang release in ChromeOS.

Comment 27 by mka@chromium.org, Nov 22 2017

#26: right, I was referring to upstream clang.

Besides our primary goal to use clang for Chrome OS we are also interested in enabling others to build the kernel with clang. This is way easier if we can say: "just download this version of clang", rather than "build clang yourself from master" or "apply these hacks to your kernel".

Comment 28 by g...@chromium.org, Nov 28 2017

Sorry for the delay; I was on vacation.

llvm.org says that the merge request deadline for 5.0.1 was 15-Nov, and the merge deadline was 22-Nov. Since this isn't a release blocking issue, I believe the 5.0.1 ship has sailed.

If the community decides to do a *.2 release (which is rare), I'll try to get this merged into that.
Just came back from vacation. Unfortunately, llvm 5.0.1 is no longer an option. So most likely, 6.0 will be the official version with all the fixes in place.

mka@ Is the upstream kernel fixed regarding the original bug related to NFP driver?

Comment 30 by mka@chromium.org, Dec 6 2017

Status: Verified
yes, the NFP driver is fixed:

commit 4e59532541c865c85c92d42be4edf2ba6aa4af64
Author: Jakub Kicinski <jakub.kicinski@netronome.com>
Date:   Sat Nov 4 16:48:54 2017 +0100

    nfp: don't depend on compiler constant propagation

Sign in to add a comment