kallsyms fails building emerge-arm64-generic chromeos-kernel-4_12
Issue description
"emerge-arm64-generic chromeos-kernel-4_12" fails with the following error.
kallsyms failure: relative symbol value 0xffffff8008081000 out of range in relative mode
Analysis shows bad symbols in the symbol table (System.map).
000000000000000e n __efistub_$d
000000000000000e n __efistub_$d
000000000000000e n __efistub_$d
000000000000000e n __efistub_$d
...
This is how the symbols should look like (generated with a working toolchain):
ffffff9009185bf0 t __efistub_$d
ffffff9009185e90 t __efistub_$d
ffffff9009185f38 t __efistub_$d
ffffff90091860f8 t __efistub_$d
ffffff9009186230 t __efistub_$d
ffffff90091863e8 t __efistub_$d
...
Due to the bad symbol table values, the relative address base is set to 0x0e (instead of something like 0xffffff90080bc000, ie _head), and all real symbols are considered to be out of range.
Toolchain versions:
x86_64-cros-linux-gnu-gcc.real (4.9.2_cos_gg_4.9.2-r164-0c5a656a1322e137fa4a251f2ccc6c4022918c0a_4.9.2-r164) 4.9.x 20150123 (prerelease)
GNU gold (binutils-2.27.0-r3-85fafaf039799ebc8053bf36ce1c6e6df7adbbec_cos_gg 2.27.0.20170315) 1.12
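To illustrate why a single bogus low address breaks everything, here is a simplified model of the relative-mode check. It assumes the base is the lowest address in the table and that each offset must fit in an unsigned 32-bit value; the real scripts/kallsyms.c handles more cases, and the function names below are made up for this sketch.

```python
UINT32_MAX = 0xFFFFFFFF  # assumption: offsets are stored as unsigned 32-bit values


def relative_base(addresses):
    """Lowest address in the table, used as the base for relative offsets."""
    return min(addresses)


def out_of_range(addresses):
    """Return the addresses whose offset from the base does not fit in 32 bits."""
    base = relative_base(addresses)
    return [addr for addr in addresses if addr - base > UINT32_MAX]


# One bogus 'n' symbol at 0x0e drags the base down to 0x0e.
bad_table = [0x0E, 0xFFFFFF8008081000]
# Addresses taken from the working System.map excerpt and the _head example.
good_table = [0xFFFFFF90080BC000, 0xFFFFFF9009185BF0]

print([hex(a) for a in out_of_range(bad_table)])
print(out_of_range(good_table))
```

With the bogus entry present, every real kernel address is more than 4 GiB away from the base, which matches the "out of range in relative mode" failure.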
,
Sep 28 2017
gold should not be used for the kernel. Is that the issue here?
,
Sep 28 2017
,
Sep 28 2017
FWIW, gold _is_ used for kernel builds. I had to explicitly disable it for my x86_64 test builds because the version in our toolchain has the bug described in https://bugzilla.kernel.org/show_bug.cgi?id=187841. That is not the issue here, though. I tried to disable gold for arm64, but the result was the same.
,
Sep 28 2017
I don't think we have had any major toolchain changes recently. Does this issue repro with clang as well?
,
Sep 28 2017
"USE=clang emerge-arm64-generic chromeos-kernel-4_12" tells me:
die "Clang is not yet supported for ${ARCH}"
The problem may have existed before; symbol handling was updated in kernel 4.12 and now uses relative addressing to reduce image size. This triggered a number of problems with binutils, such as the gold issue referenced in #4.
For reference, the "working" toolchain is:
aarch64-linux-gcc.br_real (Buildroot 2015.11.1-00010-g77c236c) 5.2.0
GNU ld (GNU Binutils) 2.25.1
,
Sep 28 2017
This looks similar to another bug I saw (and fixed) back in May; let me take a closer look at this bug.
,
Sep 28 2017
Yes, this is pretty much the same bug. The fix is very easy. The file to be patched is ~/trunk/src/third_party/kernel/v4.12/scripts/kallsyms.c.
The fix is:
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 5d554419170b..dc91c2e31f00 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -158,7 +158,7 @@ static int read_symbol(FILE *in, struct sym_entry *s)
 	else if (str[0] == '$')
 		return -1;
 	/* exclude debugging symbols */
-	else if (stype == 'N')
+	else if (toupper(stype) == 'N')
 		return -1;
 
 	/* include the type field in the symbol name, so that it
Someone on the kernel team should probably make this change.
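As a rough illustration of what the proposed check changes, the sketch below filters System.map-style lines the same way. It is a Python model, not the kernel code: the helper names are invented, and the `_head` line (including its type letter) is a made-up sample.

```python
def parse_map_line(line):
    """Split an 'address type name' line into (int address, type, name)."""
    addr, stype, name = line.split(None, 2)
    return int(addr, 16), stype, name.strip()


def keep_symbol(stype):
    """Model of the patched check: drop debugging symbols of type 'N' or 'n'."""
    return stype.upper() != "N"


# Sample input; the _head entry is illustrative only.
system_map = """\
000000000000000e n __efistub_$d
ffffff90080bc000 T _head
ffffff9009185bf0 t __efistub_$d
"""

symbols = [parse_map_line(line) for line in system_map.splitlines()]
kept = [sym for sym in symbols if keep_symbol(sym[1])]

for addr, stype, name in kept:
    print(f"{addr:016x} {stype} {name}")
```

Once the lowercase 'n' entry is dropped, the lowest remaining address is a real kernel address again.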
,
Sep 28 2017
#8: Please provide background. This patch is not in the upstream kernel, suggesting that it either was not submitted or that it is not really a kernel problem. I am especially curious to understand why no one outside Google seems to experience this problem.
,
Sep 28 2017
This was encountered by a ChromeOS Partner trying to build the 4.9 kernel using the ChromeOS toolchain (aarch64) (see https://b.corp.google.com/issues/36661449). The problem appears to be that the toolchain sets the symbol type for "__efistub_$d" to "n" instead of "N" (which is what kallsyms appears to expect). In the bug referenced above, an alternate fix was to change the compilation from using the -frecord-gcc-switches flag to -fno-record-gcc-switches. I tried to do that with this bug, but I could not make that work this time around.
,
Sep 28 2017
#9. We add compiler options in our compiler wrappers that are not enabled by default. -frecord-gcc-switches is one of them.
,
Sep 28 2017
This CL was supposed to remove the -frecord-gcc-switches flag, but it looks like it didn't. https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/536233
,
Sep 28 2017
I may have made an error earlier when testing the flags solution; I am trying it again to see if it will really work in this case or not.
,
Sep 29 2017
-frecord-gcc-switches is a valid flag, so if things need updating to support it, that should happen as well.
,
Sep 29 2017
#11: Guess that explains why I could not disable it on the command line.
#14: Agreed. -frecord-gcc-switches generates a section named .GCC.command.line which is ignored by modpost. I had, however, observed some other errors in association with .GCC.command.line when trying to build x86_64 images without "-g"; see chromium:769037 for details. Also, using my own toolchain, adding "-frecord-gcc-switches" to the build flags works just fine. So there must be something else. I don't mind adding the patch from #8 to the kernel if necessary, i.e. if the problem can be encountered by others as well. However, at this point I am concerned that it just paints over some other problem.
,
Sep 29 2017
so, maybe it is the interaction with other flags we add. (need more than one flag added to reproduce the problem)
From here: https://cs.corp.google.com/chromeos_public/src/third_party/chromiumos-overlay/sys-devel/gcc/files/sysroot_wrapper.hardened.body
You can see we add:
FLAGS_TO_ADD = set(['-fstack-protector-strong',
                    '-fPIE',
                    '-pie',
                    '-D_FORTIFY_SOURCE=2',
                    '-fno-omit-frame-pointer',
                    ])
gcc_flags = ['-frecord-gcc-switches',
             '-fno-reorder-blocks-and-partition',
             '-Wno-unused-local-typedefs',
             '-Wno-maybe-uninitialized',
             ]
so, it may be the combination with -fPIE?
you can see the exact command line if you add the option -print-cmdline (which is a wrapper option).
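For readers unfamiliar with compiler wrappers, here is a purely hypothetical sketch of how such flag lists could end up on the command line. Only the two flag lists come from the comment above; `wrapped_cmdline` and its merge logic are assumptions and do not reflect the actual sysroot_wrapper code.

```python
# Flag lists copied from the comment above.
FLAGS_TO_ADD = {
    "-fstack-protector-strong",
    "-fPIE",
    "-pie",
    "-D_FORTIFY_SOURCE=2",
    "-fno-omit-frame-pointer",
}
GCC_FLAGS = [
    "-frecord-gcc-switches",
    "-fno-reorder-blocks-and-partition",
    "-Wno-unused-local-typedefs",
    "-Wno-maybe-uninitialized",
]


def wrapped_cmdline(argv):
    """Hypothetical wrapper step: append every wrapper flag not already present."""
    extra = [flag for flag in sorted(FLAGS_TO_ADD) + GCC_FLAGS if flag not in argv]
    return list(argv) + extra


# The build system only sees the first four arguments; the rest is added silently.
cmd = wrapped_cmdline(["gcc", "-O2", "-c", "foo.c"])
print(" ".join(cmd))
```

The point of the sketch is that flags added at this layer never appear in KCFLAGS, so the kernel build cannot see or filter them unless the Makefiles strip them explicitly.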
,
Sep 29 2017
Ah! and for ARM we add -mthumb. Maybe that is the one ...
FLAGS_TO_ADD.add('-mthumb')
,
Sep 29 2017
#17: This was arm64, which does not know about -mthumb. I'll do some experiments; problem is that the flags are added through the backdoor, meaning I can't just use KCFLAGS when I use another toolchain to get the same results. This only happens with efi symbols, meaning there is something different in that directory / build. I'll try to create some front-end for use with my self-built toolchain and try to replicate this way.
,
Sep 29 2017
I finished re-testing with -fno-record-gcc-switches, and the package built just fine (for arm64-generic).
,
Sep 29 2017
-frecord-gcc-switches alone is sufficient to trigger the problem. Reproduced with gcc 5.2.0/binutils 2.25.1 and gcc 7.2.0/binutils 2.9 (both built using buildroot) by adding -frecord-gcc-switches to KCFLAGS. I thought I had tested that before, and it didn't fail, but I must have done something wrong.
This is only seen on little endian arm64 images with CONFIG_EFI enabled. It appears that some directory specific combination of compiler flags in efi/libstub triggers the problem.
The following kernel change "fixes" the problem for me.
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index e078390ba477..a67429940c19 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -10,7 +10,7 @@ cflags-$(CONFIG_X86)		+= -m$(BITS) -D__KERNEL__ -O2 \
 				   -fPIC -fno-strict-aliasing -mno-red-zone \
 				   -mno-mmx -mno-sse
 
-cflags-$(CONFIG_ARM64)		:= $(subst -pg,,$(KBUILD_CFLAGS)) -fpie
+cflags-$(CONFIG_ARM64)		:= $(subst -pg,,$(subst -frecord-gcc-switches,,$(KBUILD_CFLAGS))) -fpie
 cflags-$(CONFIG_ARM)		:= $(subst -pg,,$(KBUILD_CFLAGS)) \
 				   -fno-builtin -fpic -mno-single-pic-base
 
I'll discuss with upstream.
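The nested $(subst ...) in the Makefile change is plain textual replacement. A small Python model of what it does to the flag string is shown below; `make_subst` and `libstub_arm64_cflags` are names invented for this sketch, and the sample KBUILD_CFLAGS value is made up.

```python
def make_subst(old, new, text):
    """Model of GNU make's $(subst from,to,text): replace every occurrence."""
    return text.replace(old, new)


def libstub_arm64_cflags(kbuild_cflags):
    """Strip -frecord-gcc-switches and -pg, then append -fpie, as in the diff."""
    stripped = make_subst("-frecord-gcc-switches", "", kbuild_cflags)
    stripped = make_subst("-pg", "", stripped)
    return stripped + " -fpie"


# Made-up sample value for KBUILD_CFLAGS.
sample = "-O2 -pg -frecord-gcc-switches -Wall"
print(libstub_arm64_cflags(sample).split())
```

Because the replacement is textual, leftover whitespace remains in the string; that is harmless on a compiler command line.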
,
Sep 29 2017
,
Oct 4 2017
arm64 builds with "-frecord-gcc-switches" generate symbols named "$d". Those are filtered by kallsyms. The problem is that efi is built with "--prefix-symbols=__efistub_". As a result, all $d symbols are converted to __efistub_$d. kallsyms does not recognize "__efistub_$d" and does not filter them. This means that the combination of "--prefix-symbols" and "-frecord-gcc-switches" is toxic for arm64 kernel builds. Either "-frecord-gcc-switches" will have to be filtered out if "--prefix-symbols" is enabled, or kallsyms will need to handle symbols of type 'n'. We'll see what upstream has to say.
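The interaction described above can be sketched in a few lines. The name test below is my reading of the ARM mapping-symbol check in scripts/kallsyms.c (a leading '$', one of a/x/t/d, then end of name or a '.'); treat the exact rule as an assumption. `prefix_symbol` is a hypothetical stand-in for what --prefix-symbols does to each name.

```python
def is_arm_mapping_symbol(name):
    """Assumed rule: '$' + one of 'axtd', followed by end of name or '.'."""
    return (
        len(name) >= 2
        and name[0] == "$"
        and name[1] in "axtd"
        and (len(name) == 2 or name[2] == ".")
    )


def prefix_symbol(name, prefix="__efistub_"):
    """Hypothetical model of --prefix-symbols: prepend the prefix to the name."""
    return prefix + name


plain = "$d"
prefixed = prefix_symbol(plain)

print(plain, is_arm_mapping_symbol(plain))
print(prefixed, is_arm_mapping_symbol(prefixed))
```

Since the filter looks only at the start of the name, the prefixed symbol slips through, and its type letter is the only remaining hint that it is not a real kernel symbol.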
,
Oct 4 2017
Follow-up on #4: For some reason, it appears that gold is no longer used for kernel builds, and I no longer have to disable it. No idea what changed.
,
Oct 4 2017
I think the confusion is because we have a different default linker for aarch64:
$ i686-pc-linux-gnu-ld -v
GNU gold (binutils-2.27.0-r4-53dd00a1a34ebf5251f6210d778768b4157c5e11_cos_gg 2.27.0.20170315) 1.12
$ x86_64-cros-linux-gnu-ld -v
GNU gold (binutils-2.27.0-r4-53dd00a1a34ebf5251f6210d778768b4157c5e11_cos_gg 2.27.0.20170315) 1.12
$ armv7a-cros-linux-gnueabi-ld -v
GNU gold (binutils-2.27.0-r4-53dd00a1a34ebf5251f6210d778768b4157c5e11_cos_gg 2.27.0.20170315) 1.12
$ aarch64-cros-linux-gnu-ld -v
GNU ld (binutils-2.27.0-r4-53dd00a1a34ebf5251f6210d778768b4157c5e11_cos_gg) 2.27.0.20170315
i.e. default for aarch64 is ld.bfd, and for i686/x86_64/armv7a it is ld.gold. Unless the kernel builds are selecting a linker explicitly with -fuse-ld flag, the default linker will get used.
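A quick way to tell the two linkers apart from the `-v` output quoted above is to look at the leading words of the version line. The helper below is a hypothetical sketch that only knows the two banner prefixes seen in this thread.

```python
def linker_flavor(version_line):
    """Classify an 'ld -v' banner as gold or bfd based on its prefix."""
    if version_line.startswith("GNU gold"):
        return "gold"
    if version_line.startswith("GNU ld"):
        return "bfd"
    return "unknown"


# Banners copied from the comment above.
x86_64_banner = (
    "GNU gold (binutils-2.27.0-r4-53dd00a1a34ebf5251f6210d778768b4157c5e11_cos_gg "
    "2.27.0.20170315) 1.12"
)
aarch64_banner = (
    "GNU ld (binutils-2.27.0-r4-53dd00a1a34ebf5251f6210d778768b4157c5e11_cos_gg) "
    "2.27.0.20170315"
)

print(linker_flavor(x86_64_banner))
print(linker_flavor(aarch64_banner))
```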
,
Oct 5 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/a8369d79f01483c4e7634f814c7bbc6b018e78ff
commit a8369d79f01483c4e7634f814c7bbc6b018e78ff
Author: Guenter Roeck <linux@roeck-us.net>
Date: Thu Oct 05 12:17:03 2017
FROMLIST: scripts/kallsyms: Ignore symbol type 'n'
gcc on aarch64 may emit symbols of type 'n' if the kernel is built with '-frecord-gcc-switches'. In most cases, those symbols are reported with nm as
000000000000000e n $d
and with objdump as
0000000000000000 l d .GCC.command.line 0000000000000000 .GCC.command.line
000000000000000e l .GCC.command.line 0000000000000000 $d
Those symbols are detected in is_arm_mapping_symbol() and ignored. However, if "--prefix-symbols=<prefix>" is configured as well, the situation is different. For example, in efi/libstub, arm64 images are built with '--prefix-alloc-sections=.init --prefix-symbols=__efistub_'. In combination with '-frecord-gcc-switches', the symbols are now reported by nm as:
000000000000000e n __efistub_$d
and by objdump as:
0000000000000000 l d .GCC.command.line 0000000000000000 .GCC.command.line
000000000000000e l .GCC.command.line 0000000000000000 __efistub_$d
Those symbols are no longer ignored and included in the base address calculation. This results in a base address of 000000000000000e, which in turn causes kallsyms to abort with
kallsyms failure: relative symbol value 0xffffff900800a000 out of range in relative mode
The problem is seen in little endian arm64 builds with CONFIG_EFI enabled and with '-frecord-gcc-switches' set in KCFLAGS. Explicitly ignore symbols of type 'n' since those are clearly debug symbols.
BUG=chromium:722580, chromium:769824
TEST=Little endian arm64 build with efi enabled
Change-Id: Icd624f8f27d8ab11f9393c8fe477be587a83b61b
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(am from https://patchwork.kernel.org/patch/9985205/)
Reviewed-on: https://chromium-review.googlesource.com/699661
Reviewed-by: Caroline Tice <cmtice@chromium.org>
[modify] https://crrev.com/a8369d79f01483c4e7634f814c7bbc6b018e78ff/scripts/kallsyms.c
,
Oct 5 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/d52ea05f93e2853d7430763b35ecb70b6db224d2
commit d52ea05f93e2853d7430763b35ecb70b6db224d2
Author: Guenter Roeck <linux@roeck-us.net>
Date: Thu Oct 05 12:17:00 2017
FROMLIST: scripts/kallsyms: Ignore symbol type 'n'
gcc on aarch64 may emit symbols of type 'n' if the kernel is built with '-frecord-gcc-switches'. In most cases, those symbols are reported with nm as
000000000000000e n $d
and with objdump as
0000000000000000 l d .GCC.command.line 0000000000000000 .GCC.command.line
000000000000000e l .GCC.command.line 0000000000000000 $d
Those symbols are detected in is_arm_mapping_symbol() and ignored. However, if "--prefix-symbols=<prefix>" is configured as well, the situation is different. For example, in efi/libstub, arm64 images are built with '--prefix-alloc-sections=.init --prefix-symbols=__efistub_'. In combination with '-frecord-gcc-switches', the symbols are now reported by nm as:
000000000000000e n __efistub_$d
and by objdump as:
0000000000000000 l d .GCC.command.line 0000000000000000 .GCC.command.line
000000000000000e l .GCC.command.line 0000000000000000 __efistub_$d
Those symbols are no longer ignored and included in the base address calculation. This results in a base address of 000000000000000e, which in turn causes kallsyms to abort with
kallsyms failure: relative symbol value 0xffffff900800a000 out of range in relative mode
The problem is seen in little endian arm64 builds with CONFIG_EFI enabled and with '-frecord-gcc-switches' set in KCFLAGS. Explicitly ignore symbols of type 'n' since those are clearly debug symbols.
BUG=chromium:722580, chromium:769824
TEST=Little endian arm64 build with efi enabled
Change-Id: Icd624f8f27d8ab11f9393c8fe477be587a83b61b
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(am from https://patchwork.kernel.org/patch/9985205/)
Reviewed-on: https://chromium-review.googlesource.com/701496
[modify] https://crrev.com/d52ea05f93e2853d7430763b35ecb70b6db224d2/scripts/kallsyms.c
,
Oct 5 2017
Can't really blame the toolchain for this problem.
,
Jan 22 2018
,
Jan 23 2018
Comment 1 by groeck@chromium.org
, Sep 28 2017