clang-cl can't divide __uint128_t |
||
Issue description
The following program doesn't build with clang-cl, but it builds with normal clang:
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
// Silliness to prevent the compiler from optimizing stuff away.
__uint128_t a;
memcpy(&a, argv[1], sizeof(a));
uint64_t b;
memcpy(&b, argv[2], sizeof(b));
__uint128_t c = a % b;
printf("%" PRIu64, (uint64_t)c);
return 0;
}
clang-cl.exe u128.cc
u128-100481.obj : error LNK2019: unresolved external symbol __umodti3 referenced in function main
u128.exe : fatal error LNK1120: 1 unresolved externals
clang-cl.exe: error: linker command failed with exit code 1120 (use -v to see invocation)
The reason I'm interested in this is we have some code in BoringSSL which uses uint128_t to get at the CPU's 64x64->128 multipliers and some other operations. We've had to disable that on Windows Chrome because MSVC doesn't support it, but with clang-cl, it looks like we can finally close that performance gap.
Except it turns out not all of our __uint128_t code works with clang-cl. The stuff we particularly care about works, but it'd be nice to not need to hack around having only partial __uint128_t support.
Relatedly, I'm not sure why there's an intrinsic used here at all. Can't the x64 div/idiv instructions do an RDX:RAX / Rxx division? But both GCC and Clang seem not to use it:
https://godbolt.org/g/r14N3g
,
Oct 10
,
Oct 10
When I tried to build BoringSSL without assembler, I saw following error at link time. e:/b/s/w/ir/cipd_bin_packages/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False ../../third_party/llvm-build/Release+Asserts/bin/lld-link.exe /nologo /OUT:./foo.exe /PDB:./foo.exe.pdb @./foo.exe.rsp lld-link: error: undefined symbol: __udivti3 >>> referenced by e:\b\s\w\ir\cache\builder\client\third_party\boringssl\src\crypto\fipsmodule\bn\div.c:324 >>> boringssl.lib(bcm.obj):(BN_div) >>> referenced by e:\b\s\w\ir\cache\builder\client\third_party\boringssl\src\crypto\fipsmodule\bn\div.c:764 >>> boringssl.lib(bcm.obj):(BN_div_word) I believe it related to this issue.
,
Oct 10
Ah, hrm, I think we forgot to add another BN_CAN_DIVIDE_ULLONG condition, but you're the right person to build with clang-cl and no assembly. :-) Does this patch fix it for you?
Does applying this patch fix it for you?
diff --git a/crypto/fipsmodule/bn/div.c b/crypto/fipsmodule/bn/div.c
index 57485bd1..28bba799 100644
--- a/crypto/fipsmodule/bn/div.c
+++ b/crypto/fipsmodule/bn/div.c
@@ -64,10 +64,14 @@
#include "internal.h"
-#if !defined(BN_ULLONG)
+#if !defined(BN_CAN_DIVIDE_ULLONG)
// bn_div_words divides a double-width |h|,|l| by |d| and returns the result,
// which must fit in a |BN_ULONG|.
-static BN_ULONG bn_div_words(BN_ULONG h, BN_ULONG l, BN_ULONG d) {
+//
+// This function is used in complex conditions (see |bn_div_rem_words|), so just
+// mark it unused to silence compiler warnings.
+static OPENSSL_UNUSED BN_ULONG bn_div_words(BN_ULONG h, BN_ULONG l,
+ BN_ULONG d) {
BN_ULONG dh, dl, q, ret = 0, th, tl, t;
int i, count = 2;
@@ -168,7 +172,7 @@ static inline void bn_div_rem_words(BN_ULONG *quotient_out, BN_ULONG *rem_out,
: "a"(n1), "d"(n0), "rm"(d0)
: "cc");
#else
-#if defined(BN_ULLONG)
+#if defined(BN_CAN_DIVIDE_ULLONG)
BN_ULLONG n = (((BN_ULLONG)n0) << BN_BITS2) | n1;
*quotient_out = (BN_ULONG)(n / d0);
#else
(It would be nice for the clang-cl bug to get fixed though. It's weird to have an operation available most but not all of the time and the fallback code is undoubtedly slower than a single divq instruction.)
,
Oct 10
> but you're the right person to build with clang-cl and no assembly. :-) Sorry, I meant "but you're the *first* person to build with clang-cl and no assembly".
,
Oct 10
The following revision refers to this bug: https://boringssl.googlesource.com/boringssl/+/e3418028023f6399bee43d78db1ee7f374f52dc2 commit e3418028023f6399bee43d78db1ee7f374f52dc2 Author: Yoshisato Yanagisawa <yyanagisawa@google.com> Date: Wed Oct 10 15:33:35 2018 Fix div.c to divide BN_ULLONG only if BN_CAN_DIVIDE_ULLONG defined. Since clang-cl uses __udivti3 for __uint128_t division, linking div.obj fails. Let me make div.c use BN_CAN_DIVIDE_ULLONG to decide using __uint128_t division instead of BN_ULLONG. Bug: https://bugs.chromium.org/p/chromium/issues/detail?id=787617 Change-Id: I3ebe245f6b8917d59409591992efbabddea08187 Reviewed-on: https://boringssl-review.googlesource.com/c/32404 Commit-Queue: David Benjamin <davidben@google.com> Reviewed-by: David Benjamin <davidben@google.com> CQ-Verified: CQ bot account: commit-bot@chromium.org <commit-bot@chromium.org> [modify] https://crrev.com/e3418028023f6399bee43d78db1ee7f374f52dc2/crypto/fipsmodule/bn/internal.h [modify] https://crrev.com/e3418028023f6399bee43d78db1ee7f374f52dc2/crypto/fipsmodule/bn/div.c |
||
►
Sign in to add a comment |
||
Comment 1 by bugdroid1@chromium.org
, Dec 11 2017