
Issue 593344


Issue metadata

Status: Duplicate
Merged: issue 592903
Owner: ----
Closed: Mar 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




clang emits full fences around base::subtle::NoBarrier_Load when using the sysroot

Project Member Reported by primiano@chromium.org, Mar 9 2016

Issue description

OS: Linux Desktop, x86_64
Was doing some microbenchmarks today, and discovered this fun thing.

Looks like clang is emitting full fences (mfence) before and after loads when compiling base::subtle::NoBarrier_Load. GCC doesn't.
I tried comparing a couple of targets that use NoBarrier_Load, see below.

When using clang: gn args
  is_debug = false
  use_goma = true
  goma_dir = "/usr/local/google/home/primiano/tools/goma"
  

$ objdump -C -d out/gnrel/obj/net/net/net_log.o | grep -a10 "net::NetLog::IsCapturing() const"
0000000000000000 <net::NetLog::IsCapturing() const>:
   0:   0f ae f0                mfence                   <-----------------------
   3:   83 7f 34 00             cmpl   $0x0,0x34(%rdi)
   7:   0f ae f0                mfence                   <----------------------- 
   a:   0f 95 c0                setne  %al
   d:   c3                      retq   


$ objdump -C -d out/gnrel/obj/base/base/trace_log.o | grep -a12 "TraceLog::UpdateCategoryGroupEnabledFlags()>"
Disassembly of section .text._ZN4base11trace_event8TraceLog31UpdateCategoryGroupEnabledFlagsEv:

0000000000000000 <base::trace_event::TraceLog::UpdateCategoryGroupEnabledFlags()>:
   0:   55                      push   %rbp
   1:   41 57                   push   %r15
   3:   41 56                   push   %r14
   5:   41 55                   push   %r13
   7:   41 54                   push   %r12
   9:   53                      push   %rbx
   a:   48 83 ec 18             sub    $0x18,%rsp
   e:   49 89 fc                mov    %rdi,%r12
  11:   0f ae f0                mfence                    <-----------------------
  14:   4c 8b 2d 00 00 00 00    mov    0x0(%rip),%r13        # 1b <base::trace_event::TraceLog::UpdateCategoryGroupEnabledFlags()+0x1b>
  1b:   0f ae f0                mfence                    <-----------------------
  1e:   4d 85 ed                test   %r13,%r13



Instead, when using GCC:
  is_debug = false
  use_goma = true
  is_clang=false
  use_sysroot=false


$ objdump -C -d out/gnrel_gcc/obj/net/net/net_log.o | grep -a10 "net::NetLog::IsCapturing() const"
0000000000000000 <net::NetLog::IsCapturing() const>:
   0:   8b 47 34                mov    0x34(%rdi),%eax
   3:   85 c0                   test   %eax,%eax
   5:   0f 95 c0                setne  %al
   8:   c3                      retq   


$ objdump -C -d out/gnrel_gcc/obj/base/base/trace_log.o | grep -a12 "TraceLog::UpdateCategoryGroupEnabledFlags()>"
0000000000000000 <base::trace_event::TraceLog::UpdateCategoryGroupEnabledFlags()>:
   0:   41 54                   push   %r12
   2:   49 89 fc                mov    %rdi,%r12
   5:   55                      push   %rbp
   6:   53                      push   %rbx
   7:   48 8b 2d 00 00 00 00    mov    0x0(%rip),%rbp        # e <base::trace_event::TraceLog::UpdateCategoryGroupEnabledFlags()+0xe>
   e:   31 db                   xor    %ebx,%ebx
  10:   48 85 ed                test   %rbp,%rbp
  
 
Summary: clang emits full fences around base::subtle::NoBarrier_Load, GCC doesn't (was: clang emitting full fences around base::subtle::NoBarrier_Load, GCC doesn't)
Summary: clang emits full fences around base::subtle::NoBarrier_Load when using the sysroot (was: clang emits full fences around base::subtle::NoBarrier_Load, GCC doesn't)
AHA, I think I found the culprit. It's not clang itself, it's the combination of clang + the sysroot.

Minified repro case:
test.cc http://pastebin.com/AGYDHqPM
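
The pastebin contents are not reproduced in this report. As a rough illustration only, a repro along these lines might look like the sketch below. It assumes that NoBarrier_Load boils down to a relaxed std::atomic load of a pointer-sized global; that is an assumption, not something stated above. The names g_flag and get() are hypothetical, with get() chosen only to match the symbol in the disassembly.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for test.cc; NOT the original pastebin file.
// A pointer-sized global, so that with -fPIC the load goes through the GOT.
std::atomic<intptr_t> g_flag(0);

// Relaxed load with no ordering requested. On x86-64 one would expect a
// plain mov here and no mfence; whether fences appear depends on how the
// <atomic> header picked up by the compiler implements load().
bool get() {
  return g_flag.load(std::memory_order_relaxed) != 0;
}
```

Compiling a file like this with and without --sysroot and grepping the objdump output for mfence, as in the commands below, is one way to tell whether the fences come from the headers in the sysroot rather than from the compiler.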

$ clang++ -fPIC -m64 -march=x86-64 -g0 -std=gnu++11 -fno-rtti -fno-exceptions -c -o test-nosysroot.o test.cc -O2 
$ clang++ -fPIC -m64 -march=x86-64 -g0 -std=gnu++11 -fno-rtti -fno-exceptions -c -o test.o test.cc -O2 --sysroot=/s/chrome/src/build/linux/debian_wheezy_amd64-sysroot

/s/clang_volatile  primiano 15:28:37 09/03
$ objdump -d -C test-nosysroot.o | grep -A10 "<get()>"
0000000000000000 <get()>:
   0:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 7 <get()+0x7>
   7:   48 8b 00                mov    (%rax),%rax
   a:   48 85 c0                test   %rax,%rax
   d:   0f 95 c0                setne  %al
  10:   c3                      retq   

/s/clang_volatile  primiano 15:28:39 09/03
$ objdump -d -C test.o | grep -A10 "<get()>"
0000000000000000 <get()>:
   0:   0f ae f0                mfence 
   3:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # a <get()+0xa>
   a:   48 83 38 00             cmpq   $0x0,(%rax)
   e:   0f ae f0                mfence 
  11:   0f 95 c0                setne  %al
  14:   c3                      retq 


In comparison, GCC emits code similar to clang without the sysroot:
$ g++ -fPIC -m64 -march=x86-64 -g0 -std=gnu++11 -fno-rtti -fno-exceptions -c -o test-gcc.o test.cc -O2
$ objdump -d -C test-gcc.o | grep -A10 "<get()>"
0000000000000000 <get()>:
   0:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 7 <get()+0x7>
   7:   48 8b 00                mov    (%rax),%rax
   a:   48 85 c0                test   %rax,%rax
   d:   0f 95 c0                setne  %al
  10:   c3                      retq   


Mergedinto: 592903
Status: Duplicate (was: Untriaged)
Hmm, less exciting: this seems to be a known issue. Duping against issue 592903.
