clang: undefined reference to `__compiletime_assert_xyz' |
|||
Issue descriptionSome kernel drivers that make use of the compiletime_assert() macros fail to build with clang. One example is the NFP ethernet driver: drivers/built-in.o: In function `__nfp_eth_set_aneg': (.text+0x2f7b17): undefined reference to `__compiletime_assert_474' drivers/built-in.o: In function `__nfp_eth_set_aneg': (.text+0x2f7b41): undefined reference to `__compiletime_assert_478' ... http://elixir.free-electrons.com/linux/v4.12.14/source/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c The driver doesn't use compiletime_assert() directly, but through: FIELD_GET (http://elixir.free-electrons.com/linux/v4.12.14/source/include/linux/bitfield.h#L100) BUILD_BUG_ON_MSG (http://elixir.free-electrons.com/linux/v4.12.14/source/include/linux/bug.h#L54) compiletime_assert (http://elixir.free-electrons.com/linux/v4.12.14/source/include/linux/compiler.h#L528) http://elixir.free-electrons.com/linux/v4.12.14/source/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c Steps to reproduce: cd src/third_party/kernel/v4.12 make ARCH=x86 defconfig edit .config, set CONFIG_NFP to 'y' and add '# CONFIG_NFP_DEBUG is not set' make ARCH=x86 -j 48 CC=clang
,
Oct 2 2017
474 is the line number of the compiletime_assert() statement. It is passed as suffix to __compiletime_assert, which expands it to __compiletime_assert_<line_no>:
#define __compiletime_assert(condition, msg, prefix, suffix) \
do { \
bool __cond = !(condition); \
extern void prefix ## suffix(void) __compiletime_error(msg); \
if (__cond) \
prefix ## suffix(); \
__compiletime_error_fallback(__cond); \
} while (0)
#define _compiletime_assert(condition, msg, prefix, suffix) \
__compiletime_assert(condition, msg, prefix, suffix)
#define compiletime_assert(condition, msg) \
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
,
Oct 2 2017
mka@ Do you have a preprocessed file that I can check?
,
Oct 2 2017
Ok, I was able to repro it (partially). The root cause is this code relies on compiler optimizations to get rid of it.
$ cat checker.cpp
# define __compiletime_error(message) __attribute__((error(message)))
#ifndef __compiletime_error_fallback
#define __compiletime_error_fallback(condition) do { } while (0)
#endif
#define __compiletime_assert(condition, msg, prefix, suffix) \
do { \
bool __cond = !(condition); \
extern void prefix ## suffix(void) __compiletime_error(msg); \
if (__cond) \
prefix ## suffix(); \
__compiletime_error_fallback(__cond); \
} while (0)
#define _compiletime_assert(condition, msg, prefix, suffix) \
__compiletime_assert(condition, msg, prefix, suffix)
#define compiletime_assert(condition, msg) \
_compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
int main()
{
int a = 0;
bool b = (a == 0);
compiletime_assert(b, "Passing true");
}
$ cat checker.ii
# 1 "checker.cpp"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 346 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "checker.cpp" 2
# 23 "checker.cpp"
int main()
{
int a = 0;
bool b = (a == 0);
do { bool __cond = !(b); extern void __compiletime_assert_27(void) __attribute__((error("Passing true"))); if (__cond) __compiletime_assert_27(); do { } while (0); } while (0);
}
$ x86_64-cros-linux-gnu-clang++ -o main checker.cpp -Wno-unknown-attributes -O2 // This passes
$ x86_64-cros-linux-gnu-clang++ -o main checker.cpp -Wno-unknown-attributes -O0
/tmp/checker-368a9b.o:checker.cpp:function main: error: undefined reference to '__compiletime_assert_27()'
clang-5.0: error: linker command failed with exit code 1 (use -v to see invocation) // This failed since -O0 didn't remove the call to __compiletime_assert_27.
Now depending on the source code, clang optimizations might not remove the call to the __compiletime_assert_@@@@ functions.
TBH, relying on compiler optimizations to make the macro work does not look like portable behavior to me.
,
Oct 2 2017
The a preprocessed file is a attached. (Note to future self: print KBUILD_CFLAGS from kernel Makefile and use this + '-I include -I arch/x86/include/ -E' to generate the preprocessed version). I agree that relying on optimization behavior of the compiler isn't really portable, but it's what we have for now unless we come up with a better solution :( btw, CFLAGS of the kernel include -O2.
,
Oct 3 2017
clang is not able to clean up the function nfp_eth_set_bit_config called by __nfp_eth_set_aneg. Checking why it does not happen.
,
Oct 3 2017
Root cause is use of __builtin_constant_p. Clang is returning 0 for some of its uses while gcc returns 1. https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Other-Builtins.html You can use the built-in function __builtin_constant_p to determine if a value is known to be constant at compile time and hence that GCC can perform constant-folding on expressions involving that value. The argument of the function is the value to test. The function returns the integer 1 if the argument is known to be a compile-time constant and 0 if it is not known to be a compile-time constant. A return of 0 does not indicate that the value is not a constant, but merely that GCC cannot prove it is a constant with the specified value of the -O option. You may use this built-in function in either a macro or an inline function. However, if you use it in an inlined function and pass an argument of the function as the argument to the built-in, GCC never returns 1 when you call the inline function with a string constant or compound literal (see Compound Literals) and does not return 1 when you pass a constant numeric value to the inline function unless you specify the -O option. static __always_inline int nfp_eth_set_bit_config(struct nfp_nsp *nsp, unsigned int raw_idx, const u64 mask, unsigned int val, const u64 ctrl_bit) ... ... if (val == FIELD_GET(mask, reg)) ... ... Function nfp_eth_set_bit_config (marked always inline) is called by __nfp_eth_set_aneg with immediate arguments. However, clang cannot prove that the value is constant and returns while gcc returns 1. #define __BF_FIELD_CHECK(_mask, _reg, _val, _pfx) \ ({ \ BUILD_BUG_ON_MSG(!__builtin_constant_p(_mask), \ /// clang nnot prove this to be constant while gcc can. _pfx "mask is not constant"); \ BUILD_BUG_ON_MSG(!(_mask), _pfx "mask is zero"); \ BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ? \ ~((_mask) >> __bf_shf(_mask)) & (_val) : 0, \ _pfx "value too large for the field"); \ BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull, \ _pfx "type of reg too small for mask"); \ __BUILD_BUG_ON_NOT_POWER_OF_2((_mask) + \ (1ULL << __bf_shf(_mask))); \ }) So the issue is gcc by design does inlining first and can prove the value as constant when -O is specified. Clang however does not take inlining into account and hence cannot prove the constant condition to be true. I will check how easy is to add the support for constant evaluation with inlining in clang.
,
Oct 3 2017
There is an existing upstream bug for same builtin_constant_p issue at https://bugs.llvm.org/show_bug.cgi?id=4898
,
Oct 3 2017
mka@ It might not be easy to get a fix in clang in short term. Can you talk to the authors of nfp driver to work out a (interim?) solution?
,
Oct 3 2017
Thanks for the investigation and involving upstream LLVM in the discussion. I'll try to get this addressed in the kernel.
,
Oct 13 2017
It looks like the issue in the NFP driver will be fixed in the kernel. There is another instance of __compiletime_assert_N not being optimized away, which doesn't involve inline functions: drivers/media/platform/qcom/camss-8x16/camss-csid.o: In function `msm_csid_subdev_init': drivers/media/platform/qcom/camss-8x16/camss-csid.c:(.text+0x32): undefined reference to `__compiletime_assert_804' (and many others in the same driver) The above instance stems from the to_device_index 'call' here: http://elixir.free-electrons.com/linux/v4.14-rc4/source/drivers/media/platform/qcom/camss-8x16/camss-csid.c#L230 These are the definitions of to_device_index and the underlying macros: #define to_device_index(ptr_module, index) \ (to_camss_index(ptr_module, index)->dev) #define to_camss_index(ptr_module, index) \ container_of(module_pointer(ptr_module, index), \ struct camss, ptr_module #define module_pointer(ptr_module, index) \ ((const struct ptr_module##_device (*)[]) &(ptr_module[-(index)])) #define container_of(ptr, type, member) ({ \ void *__mptr = (void *)(ptr); \ BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \ !__same_type(*(ptr), void), \ "pointer type mismatch in container_of()"); \ ((type *)(__mptr - offsetof(type, member))); }) From the earlier discussion I deduce that clang cannot prove that conditions in BUILD_BUG_ON_MSG are constant. A pre-processed version of the file is attached. (updated note to self for pre-processed output with some additional include paths: -I include/ -I include/uapi/ -I arch/x86/include/ -I arch/x86/include/generated -E)
,
Oct 13 2017
George, Can you take a look at the pre-processed file? From a quick glance, clang might not be evaluating __builtin_types_compatible_p in same way as gcc does. https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
,
Oct 16 2017
Reduced test-case:
```
int main() {
return __builtin_types_compatible_p(
typeof(*((const int (*)[])0)),
typeof(*((int (*)[])0)));
}
```
Compiling with clang trunk, main returns 0. Compiling with gcc 6.3, main returns 1. Moreover, clang believes that `const int[]` and `int[]` are incompatible; gcc does not.
Interestingly, if we tweak the test-case slightly to use pointers instead of arrays:
```
int main() {
return __builtin_types_compatible_p(
typeof(*((const int **)0)),
typeof(*((int **)0)));
}
```
...Both clang and gcc agree that these types are *not* compatible.
My mental model for foo[] has always been "it's syntactic sugar for foo*," though I'm unsure how well that aligns with reality. If it's not too far off, this seems like a GCC bug.
In any case, glancing through the camss code, I don't understand why there's a `const` in `(const struct ptr_module##_device (*)[])` to begin with. All the callsites I saw seemed to be passing in a vanilla `struct csid_device *`, and the `const` just gets casted away by container_of anyway. I don't have the source to check if that breaks anything, but locally removing the `const`s from the test-case made __compiletime_assert_804 disappear for me.
,
Oct 16 2017
Thanks George, According to the GCC builtin Doc: This built-in function ignores top level qualifiers (e.g., const, volatile). For example, int is equivalent to const int.
,
Oct 16 2017
Yes, but we're dealing with a `const int[]`, so I'd be surprised if that rule is meant to apply here (it doesn't in the case of an `int*` vs `const int*`, as detailed above). Again, it could totally be that my mental model of what `int[]` means is broken; I plan to bug a language lawyer tomorrow. :)
,
Oct 16 2017
I have another test case for you then :)
$ cat type.c
#include <stdio.h>
int main() {
int a = __builtin_types_compatible_p(
typeof(int* const),
typeof(int*));
int b = __builtin_types_compatible_p(
typeof(int const*),
typeof(int*));
printf("Are int* const and int* compatible: %d\n", a);
printf("Are int const* and int* compatible: %d\n", b);
}
$ gcc -o type type.c // clang also produces same result.
$ ./type
Are int* const and int* compatible: 1
Are int const* and int* compatible: 0
,
Oct 16 2017
Thanks George, Removing the const attribute fixes some instances, but not all of them: ERROR: "__compiletime_assert_1462" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined! ERROR: "__compiletime_assert_236" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined! ERROR: "__compiletime_assert_622" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined! ERROR: "__compiletime_assert_1336" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined! ERROR: "__compiletime_assert_1526" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined! ERROR: "__compiletime_assert_1354" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined! ERROR: "__compiletime_assert_2154" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined! ERROR: "__compiletime_assert_1580" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined! ERROR: "__compiletime_assert_2186" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined! ERROR: "__compiletime_assert_2865" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined! ERROR: "__compiletime_assert_1292" [drivers/media/platform/qcom/camss-8x16/qcom-camss.ko] undefined!
,
Oct 16 2017
mka@ I suspect they are coming from another file. Can you identify which file it is?
,
Oct 16 2017
I talked with Richard Smith offline. He said that while this ignoring-const behavior is mildly quirky, we should ideally match it in clang. I have some perf results I'm waiting for, so I'm happy to look at that. > My mental model for foo[] has always been "it's syntactic sugar for foo*," ^ Also, I learned that this is incorrect. foo[] and foo* are apparently very different types, even though one decays to the other quite easily.
,
Oct 16 2017
Yes, the remaining instances are from other files of the driver. They stem from similar macros with const arrays that are used with container_of: ispif_line_array, vfe_line_array. The driver builds with the const specifiers removed.
,
Oct 16 2017
r315951 in clang fixes the const vs non-const issue for me. Please let me know if I missed anything. :)
,
Oct 17 2017
Thanks a lot George, this was far more quicker than I had expected :). I'll cherry pick the CL to current LLVM.
,
Oct 17 2017
thanks for the quick help George.
,
Oct 17 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/aeba29d85bad2c36b91b301a6de38824885a825f commit aeba29d85bad2c36b91b301a6de38824885a825f Author: Manoj Gupta <manojgupta@google.com> Date: Tue Oct 17 20:02:51 2017 llvm: Pick an upstream CL to fix a kernel build issue. Cherry pick CL r315951 to fix a Linux kernel build issue. BUG= chromium:770889 TEST=int[] and const int[] are reported compatible in clang when using __builtin_types_compatible_p. commit 73c1500cc3b3a4cd39a7c59753a7d0e63887a839 Author: George Burgess IV <george.burgess.iv@gmail.com> Date: Mon Oct 16 22:58:37 2017 +0000 Make __builtin_types_compatible_p more like GCC's GCC ignore qualifiers on array types. Since we seem to have this function primarily for GCC compatibility, we should try to match that behavior. Change-Id: Ie8ec3d65d654d36d81e2ae0fb37b7016c73f264c Reviewed-on: https://chromium-review.googlesource.com/722455 Commit-Ready: Manoj Gupta <manojgupta@chromium.org> Tested-by: Manoj Gupta <manojgupta@chromium.org> Reviewed-by: Mike Frysinger <vapier@chromium.org> [rename] https://crrev.com/aeba29d85bad2c36b91b301a6de38824885a825f/sys-devel/llvm/llvm-5.0_pre305632_p20170806-r9.ebuild [add] https://crrev.com/aeba29d85bad2c36b91b301a6de38824885a825f/sys-devel/llvm/files/cherry/73c1500cc3b3a4cd39a7c59753a7d0e63887a839.patch
,
Nov 22 2017
Could we try to get r315951 merged into the clang 5.x branch to make it available in the next clang 5 release?
,
Nov 22 2017
I believe this has already made it into the latest clang release in ChromeOS.
,
Nov 22 2017
#26: right, I was referring to upstream clang. Besides our primary goal to use clang for Chrome OS we are also interested in enabling others to build the kernel with clang. This is way easier if we can say: "just download this version of clang", rather than "build clang yourself from master" or "apply these hacks to your kernel".
,
Nov 28 2017
Sorry for the delay; I was on vacation. llvm.org says that the merge request deadline for 5.0.1 was 15-Nov, and the merge deadline was 22-Nov. Since this isn't a release blocking issue, I believe the 5.0.1 ship has sailed. If the community decides to do a *.2 release (which is rare), I'll try to get this merged into that.
,
Dec 6 2017
Just came back from vacation. Unfortunately, llvm 5.0.1 is no longer an option. So most likely, 6.0 will be the official version with all the fixes in place. mka@ Is the upstream kernel fixed regarding the original bug related to NFP driver?
,
Dec 6 2017
yes, the NFP driver is fixed: commit 4e59532541c865c85c92d42be4edf2ba6aa4af64 Author: Jakub Kicinski <jakub.kicinski@netronome.com> Date: Sat Nov 4 16:48:54 2017 +0100 nfp: don't depend on compiler constant propagation |
|||
►
Sign in to add a comment |
|||
Comment 1 by manojgupta@chromium.org
, Oct 2 2017