_msize link error in ARM64 Win32 builds of scoped_handle_test_dll |
|||
Issue descriptionOS: Windows When doing release arm64 builds for Win32 (requires the new toolchain or a local install of VS 2017 15.9): ninja -C out\release_arm base_unittests ninja: Entering directory `out\release_arm' [116 processes, 595/711 @ 6.9/s : 86.473s ] LINK_MODULE(DLL) scoped_handle_test_dll.dll scoped_handle_test_dll.dll.pdb FAILED: scoped_handle_test_dll.dll scoped_handle_test_dll.dll.pdb c:/src/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.arm64 False link.exe /nologo /DLL /OUT:./scoped_handle_test_dll.dll /PDB:./scoped_handle_test_dll.dll.pdb @./scoped_handle_test_dll.dll.rsp libucrt.lib(msize.obj) : error LNK2005: _msize already defined in base.lib(allocator_shim.obj) ./scoped_handle_test_dll.dll : fatal error LNK1169: one or more multiply defined symbols found [1 processes, 710/711 @ 4.1/s : 172.771s ] CXX obj/base/base_unittests/safe_numerics_unittest.obj ninja: build stopped: subcommand failed. This is related to a recent change where we started defining _msize - https://chromium-review.googlesource.com/c/chromium/src/+/1354219 - but you already know about that change. Presumably the ARM64 msize.obj file in libucrt.lib contains another symbol that we are also pulling in. I have some tools that can help us find that (using /verbose linking) if help is needed. Verbose linking shows that _msize_base is what pulls in msize.obj, so we need to define this function as well: Found _msize_base Referenced in libucrt.lib(recalloc.obj) Loaded libucrt.lib(msize.obj)
,
Dec 13
After some discussion, it appears that the best fix is to supply _recalloc_base. This should break the chain to _msize_base and will fix the link errors.
This was determined by applying this diff:
diff --git a/base/BUILD.gn b/base/BUILD.gn
index 87ff0c857091..7b11b0a00664 100644
--- a/base/BUILD.gn
+++ b/base/BUILD.gn
@@ -2155,6 +2155,7 @@ if (is_win) {
":base",
"//base/win:base_win_buildflags",
]
+ ldflags = [ "/VERBOSE" ]
}
}
Then building scoped_handle_test_dll and running tools\win\linker_verbose_tracking.py on the verbose output, as documented here:
https://www.chromium.org/developers/windows-binary-sizes
The build flags used are:
is_component_build = false
is_debug = false
target_cpu = "x86"
enable_nacl = false
,
Dec 14
Thanks Bruce for the investigation. As your build config shows, this is not Windows ARM64 specific issue. It seems the link error is caused by the different behavior of COMDAT linking between MSVC link.exe and lld-link.exe. _msize() is compiled as COMDAT (/Gy) for both definition from libucrt.lib(msize.obj) and allocator_shim.obj. MSVC link.exe reports symbol redefinition error when it saw 2 COMDATs for _msize(), but lld-link.exe doesn't report error for multiple COMDAT with the same symbol, seems it picks the first one it saw and we are lucky here that it picks the right version (allocator_shim.obj).
,
Dec 14
CCing Rui as this seems like perhaps it should also be an error with lld-link.exe?
,
Dec 14
I actually mis-specified the build flags - note that use_lld=false was missing. But, otherwise the conclusion is correct. Note that the build failures only happen with the new SDK (currently available in crrev.com/c/1377510) when linking with link.exe. That is, the new SDK adds the calls to _msize_base and _recalloc_base which trigger the problem but lld-link ignores the duplicate symbols. The simplest repro is to patch in crrev.com/c/1377510, run "gclient runhooks", and then build scoped_handle_test_dll with a minimal args.gn of: use_lld = false is_debug = false A more minimal repro could be created by having other duplicate functions, presumably, if that is needed, but this will do for now. So... We need to fix the link error by providing overrides for _msize_base and/or _recalloc_base. We also need an llvm bug filed to request that lld-link diagnose this error.
,
Dec 15
As to lld-link, yes, I think lld-link should report an error for duplicate COMDAT symbols that are different in size or whatever. That's I believe just a missing feature.
,
Dec 18
Some brief testing suggests that lld-link does detect and fail on duplicate symbols when they are explicit inputs. However when they are on-demand inputs from .lib files then it is willing to ignore the duplication. Given the /Gy switch that puts _msize and _msize_base in different COMDATs it is not clear that this is really an lld-link deficiency. One could reasonably argue that ignoring the definition of _msize is entirely reasonable. After all, link.exe ignores it if it doesn't have any need to examine that .obj file, lld-link just ignores it using a slightly different heuristic. link.exe's habit of pulling in all of an .obj file when just one symbol is pulled in has some value, but it has also been the cause of many problems over the years. Either way, if we want to support link.exe (which seems like a worthy goal, and currently needed for ARM64) then we need to override _msize_base to prevent this error. > it picks the first one it saw and we are lucky here that it picks the right version (allocator_shim.obj) This is not true. lld-link will prefer to grab symbols from .obj files that are explicitly listed on the command line. .obj files that are in .lib files are always treated as "weaker" symbols that should only be pulled in if needed, so there is no ambiguity.
,
Dec 18
> Either way, if we want to support link.exe (which seems like a worthy goal, and currently needed for ARM64) then we need to override _msize_base to prevent this error. When we spoke last time you mentioned that _msize_base was being pulled in from a call to _realloc_base, I think what we should actually do is provide a _realloc_base allocator shim. The point of the allocator shims after all is to replace all uses of the allocator, so the _msize_base dependency is really just the canary in the coal mine that there's a new function we're not shimming.
,
Dec 18
I thought they were both being pulled in, but I think you are correct that _realloc_base (actually _recalloc_base) was pulling in _msize_base. It may be one of those cases where multiple functions are pulling in _msize_base. Override one, see if things are still broken, analyze the verbose linker output with tools\win\linker_verbose_tracking.py (see above), rinse, repeat. Or, wait until lld-link can handle ARM64 and don't worry about it, but I don't think that is ideal, and it might be actively broken.
,
Dec 19
This is a fixed version of the previous comment:
Apparently the lld-link behavorial difference is not quite as I characterized and the clang-cl team is interested in fixing it. So, repro steps for lld-link.exe versus link.exe are:
git cl patch -b new_toolchain 1377510
gclient runhooks
gn gen out\Default --args="use_lld=false is_debug=false use_goma=true"
ninja -C out\Default scoped_handle_test_dll
This will fail with:
libucrt.lib(msize.obj) : error LNK2005: _msize already defined in base.lib(allocator_shim.obj)
./scoped_handle_test_dll.dll : fatal error LNK1169: one or more multiply defined symbols found
If you rebuild with use_lld=true then the link succeeds. If you modify the build.gn file to add the ldflags line shown below (create the c:\src\temp\linkrepro directory first) then you will get a repro.tar file which contains all of the linker inputs:
loadable_module("scoped_handle_test_dll") {
+ ldflags = [ "/linkrepro:c:\src\temp\linkrepro" ]
,
Jan 4
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/9116f1578a6f5b4eb6be7acd689267daa325bafc commit 9116f1578a6f5b4eb6be7acd689267daa325bafc Author: Tom Tan <Tom.Tan@microsoft.com> Date: Fri Jan 04 19:56:26 2019 Shim _recalloc() and _recalloc_base() to avoid link error of duplicated symbol definition msize.obj from libucrt.lib defines 2 functions, _msize() and _msize_base(). _msize_base() is called in _recalloc_base() from UCRT. Shim both _recalloc() and _recalloc_base() to not call _msize_base() from CRT, instead, call the overridden _msize() in the same source file to avoid pulling in msize.obj from UCRT library. Bug: 915016 Change-Id: I8487e3162a01e8c17ddbcef3128b4838e5c521d3 Reviewed-on: https://chromium-review.googlesource.com/c/1394877 Reviewed-by: Vlad Tsyrklevich <vtsyrklevich@chromium.org> Reviewed-by: Bruce Dawson <brucedawson@chromium.org> Reviewed-by: Primiano Tucci <primiano@chromium.org> Commit-Queue: Bruce Dawson <brucedawson@chromium.org> Cr-Commit-Position: refs/heads/master@{#620037} [modify] https://crrev.com/9116f1578a6f5b4eb6be7acd689267daa325bafc/base/allocator/allocator_shim_override_ucrt_symbols_win.h
,
Jan 10
Fixed by the crrev.com/c/1394877 according to my testing. Unfortunately I can't tag the fixer as owner at the moment. |
|||
►
Sign in to add a comment |
|||
Comment 1 by brucedaw...@chromium.org
, Dec 13Going through the full chain using the script I have for this: python tools\win\linker_verbose_tracking.py verbose_linker_output.txt msize.obj Database loaded - 527 xrefs found Searching for libucrt.lib(msize.obj) libucrt.lib(msize.obj).obj pulled in for symbol "_msize_base" by libucrt.lib(recalloc.obj) libucrt.lib(recalloc.obj).obj pulled in for symbol "_recalloc_base" by libucrt.lib(onexit.obj) libucrt.lib(argv_wildcards.obj) libucrt.lib(setenv.obj) libucrt.lib(onexit.obj).obj pulled in for symbol "_initialize_onexit_table" by libcmt.lib(utility.obj) libucrt.lib(setenv.obj).obj pulled in for symbol "__dcrt_set_variable_in_narrow_environment_nolock" by libucrt.lib(environment_initialization.obj) libucrt.lib(argv_wildcards.obj).obj pulled in for symbol "__acrt_expand_narrow_argv_wildcards" by libucrt.lib(argv_parsing.obj) libcmt.lib(utility.obj).obj pulled in for symbol "__scrt_is_nonwritable_in_current_image" by libcmt.lib(dll_dllmain.obj) libucrt.lib(argv_parsing.obj).obj pulled in for symbol "_configure_narrow_argv" by libcmt.lib(utility.obj) libucrt.lib(environment_initialization.obj).obj pulled in for symbol "_initialize_narrow_environment" by libcmt.lib(utility.obj) libcmt.lib(dll_dllmain.obj).obj pulled in for symbol "_DllMainCRTStartup" by So, _msize_base/msize.obj is pulled in by _recalloc_base/recalloc.obj which is pulled in by various functions. So we need to either supply _msize_base or _recalloc_base.