
Issue 889932

Starred by 4 users

Issue metadata

Status: Started
Owner:
Cc:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug




Android: Use ELF symbol aliases to export selected symbols from libchrome_base.so

Project Member Reported by digit@google.com, Sep 27

Issue description

For the transition to android app bundles, the Chrome native library (e.g. libchrome.so or libmonochrome.so) is going to be split up:

- libchrome_base.so, which will be part of the base bundle module,
  and always installed with Chrome.

- libvr_ui.so, which will be part of a feature module, i.e. a split APK
  that will only be installed on demand.

Because libvr_ui.so depends on libchrome_base.so, this requires that the latter exports a small number of symbols from 10 different GN component targets (e.g. base, skia, cc, ui).

Doing so requires non-trivial modifications to the build system. cjgrant@ has developed a prototype that essentially does the following:

- Ensure that the 10 GN component targets are compiled with _all_
  symbols exported. Note that these are still source sets / static
  libraries since this is not a component build.

- Generate a list of symbols that libvr_ui.so actually depends on
  (a minimal sketch of this step follows the list). Without describing
  too many details, this requires building an intermediate version of
  libvr_ui.so, recording its required symbols, and effectively
  discarding it.

- Generate a linker version script that is used when linking
  the final libchrome_base.so, to ensure that the symbols of
  interest, and only them, are exported by the ELF library.

- Generate the final libvr_ui.so linked against the final
  libchrome_base.so
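
For reference, the symbol-recording step in the prototype could boil down to a
small script that reads the intermediate library's dynamic symbol table. This
is only a minimal sketch (the script name is hypothetical, and it does not try
to filter out symbols that are satisfied by libc or other system libraries):

  # record_required_symbols.py (hypothetical name)
  # Prints the dynamic symbols that an intermediate libvr_ui.so still needs
  # some other library to provide, one mangled name per line.
  import subprocess
  import sys

  def undefined_dynamic_symbols(library_path):
      """Return the set of undefined dynamic symbols of an ELF library."""
      # 'nm --dynamic --undefined-only' prints one 'U <mangled name>' line
      # per symbol that must be resolved by another shared library.
      output = subprocess.check_output(
          ['nm', '--dynamic', '--undefined-only', library_path])
      symbols = set()
      for line in output.decode().splitlines():
          fields = line.split()
          if len(fields) == 2 and fields[0] == 'U':
              symbols.add(fields[1])
      return symbols

  if __name__ == '__main__':
      for name in sorted(undefined_dynamic_symbols(sys.argv[1])):
          print(name)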

This works but has the following drawbacks:

- Exports _all_ symbols from the 10 GN components, even those that
  are not used by libvr_ui.so at all. Even with -fvisibility=protected
  this has consequences for code size and performance (e.g.
  it prevents inlining). For more information about this, see this
  article: https://www.airs.com/blog/archives/307

- Worse, this affects the regular libchrome.so generated for APKs,
  because it is built by linking the same GN dependencies (base, skia,
  etc.), making the library larger and potentially slower even though
  none of this is needed in that case.

  NOTE: Good news, benchmarking shows that this doesn't seem to affect
  performance significantly (at least startup time seems unaffected).

- Requires double compilation and linking of libraries, and thus slows
  down the build. This also makes the build rules required to handle
  all this very complex, in a way that may not scale neatly if we want
  to support more feature modules / separate native libraries in the
  future.
  
- Currently requires manually changing the _export.h headers of
  GN components of interest, to recognize that they may need to export
  symbols, even when not in a component build (the original code
  checks for the COMPONENT_BUILD macro definition).

After discussing the issue with cjgrant@, I believe there is a way to
achieve a similar result with less trouble. This entry is to document
and experiment with the idea, and eventually implement it. The main
ideas are the following:

1) Have a predefined list of required symbols for libvr_ui.so, for
   example in a text file that can be easily processed by Python
   scripts.

  Benefits:
    - Avoid the double-link steps and associated complexities.
    - Can be used to auto-generate the linker version script and
      also perform checks after the build to verify that no extra
      symbol was exported by mistake (a sketch follows after this list).

  Drawbacks:
    - Symbol list may need manual updates whenever the libvr_ui.so
      sources change and require new symbols (which would probably only
      show up at build time). However, that would only affect people
      working on the libvr_ui.so sources.
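
  As a sketch of the Python processing mentioned above: given the predefined
  symbol list (one mangled name per line), a small script could emit the
  linker version script and double-check the built library afterwards. The
  file names and formats here are hypothetical:

    # generate_version_script.py (hypothetical name)
    import subprocess

    def read_symbol_list(path):
        """Read one mangled symbol name per line, skipping blanks and # comments."""
        with open(path) as f:
            return sorted(line.strip() for line in f
                          if line.strip() and not line.startswith('#'))

    def write_version_script(symbols, path):
        """Write a version script that exports only |symbols| and hides the rest."""
        with open(path, 'w') as f:
            f.write('{\n  global:\n')
            for name in symbols:
                f.write('    %s;\n' % name)
            f.write('  local:\n    *;\n};\n')

    def exported_dynamic_symbols(library_path):
        """Return the set of symbols the built library actually exports."""
        output = subprocess.check_output(
            ['nm', '--dynamic', '--defined-only', library_path])
        return set(line.split()[-1]
                   for line in output.decode().splitlines() if line.strip())

    # Intended use (illustrative):
    #   symbols = read_symbol_list('vr_ui_required_symbols.txt')
    #   write_version_script(symbols, 'libchrome_base.so.lds')
    #   ... link libchrome_base.so with -Wl,--version-script=libchrome_base.so.lds ...
    #   extra = exported_dynamic_symbols('out/libchrome_base.so') - set(symbols)
    #   if extra:
    #       raise Exception('Unexpected exports: %s' % ', '.join(sorted(extra)))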

2) Use ELF symbol aliases to export only symbols of interest.

  Let's say libvr_ui.so needs a symbol like 'base::Foo' from the 'base'
  component. In reality, it will import the C++ mangled version, which
  would look like '_ZN4base3FooEv' or something similar.

  It is possible to compile the whole 'base' GN target as a usual
  source set, which uses -fvisibility=hidden, just like regular builds,
  and then link an extra object file that contains an alias definition
  for 'base::Foo'.

  There are two tricks to achieve this:

    1) Use macros to rename the symbol names internally when building
       'base', e.g.:

        defines += [ "Foo=__renamed_Foo" ]

       This will ensure that the code compiled into 'base' defines and
       references |base::__renamed_Foo| instead of |base::Foo|.

    2) Use compiler annotations like the following, when building
       the auto-generated exported_symbol_aliases.cc:

        namespace base {
        void Foo()  __attribute__((alias("__renamed_Foo")));
        }

  What this does is the following:

   - |base::Foo()| is still compiled with hidden visibility, allowing
     inlining and other performance optimizations. However, its
     internal symbol name will be |base::__renamed_Foo| instead.

   - The final libchrome_base.so will link exported_symbol_aliases.o
     and will thus contain a GOT/PLT entry pointing to one of the
     |base::__renamed_Foo| instances (the function may be duplicated
     due to inlining or constant propagation / specialization), and
     will export it as "base::Foo" (i.e. its mangled name).


 
Is there a way to script around that drawback? Seems kinda bad for folks working on the feature module.
I made a minor correction to the description:  The prototype doesn't actually build libchrome.so twice - only the module library needs to be built twice.  But either way, this approach would certainly simplify things.
On the drawback, I don't see a way to script this.

At build time, we need the exported symbol list before building base.  But, we need base to build the dummy VR module.  That leaves a chicken-and-egg problem that we can't fix, unless we went so far as to build base source twice (non-starter).

The only indication we'd get of missing symbols is after the main library's been built, and we're trying to link the feature module lib.  That link would fail with a list of missing symbols.

Technically we could instrument the build to capture the linker's error output, and funnel that into a local edit of the symbol whitelist, but that's not pretty (things like telling the linker to dump all errors, not just the first N...).

digit@, ideas?



Thought:  What happens to decoded stacks that include exported symbols?  Would we now see something like:

Base::TimeInternal
Base::__renamed_GetTicks
Base::SomeCallingMethod

I wonder if that'd have consequences with crash analysis tools.

This would work if we inverted the naming too, right?  I.e., everything in base libs stays the same, we generate aliases like "__exported_Foo", and apply the rename defines to the module source instead?  I guess that approach exposes the renaming across library boundaries though, which might not be great.

About #5: you are correct, we can avoid renaming the symbols in libchrome_base.so after all. I.e.:

  1) Compile the libchrome_base.so symbols as usual (no renaming through macros).

  2) Use an __alias_ prefix for our exported ELF aliases, e.g. 
     libchrome_base.so will export '_Z4Base12__alias_TimeZ' which will
     point to the internal _Z4Base4TimeZ symbol (matching 'base::Time' here).

  3) When compiling the libvr_ui.so, use macro renaming to ensure that its
     code actually imports the aliases instead of the original symbol names.
     I could imagine an auto-generated header that would have lines like:

     #define Time  __alias_Time   // for base::Time
     ....

     And could be included by each feature module component.

This would ensure that libvr_ui.so imports the __alias_ prefixed symbols only.
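
A rough sketch of how that header could be auto-generated from a list of
(unqualified name, qualified name) pairs; the script name, the input format
and the example names are purely illustrative:

  # generate_alias_header.py (hypothetical name)
  def generate_alias_header(pairs):
      """Return header text with one '#define Name __alias_Name' line per entry."""
      lines = ['// Auto-generated; included by each feature module component.']
      for short_name, qualified_name in pairs:
          lines.append('#define %s __alias_%s   // for %s'
                       % (short_name, short_name, qualified_name))
      return '\n'.join(lines) + '\n'

  # Example:
  #   generate_alias_header([('Time', 'base::Time')])
  #   returns a header containing:
  #     #define Time __alias_Time   // for base::Time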

So, I've performed a little experimentation, and so far, I've found the following:

1) The alias name must be the C++ mangled version, i.e. you would use something like:

  void Foo()  __attribute__((alias("_Z3Barv")))         // Foo() -> Bar()
              __attribute__((visibility("protected"))); // Make Foo() public.

  In practice, this should not be a problem if we auto-generate these.

2) This only works if the symbol being aliased is already defined in the current translation unit / object file. I.e. the following works:

foo_1.cc:

  namespace base {
  struct TimeTicks {
    static TimeTicks Now();
    long ticks_;
  };
  }  // namespace base

  base::TimeTicks base::TimeTicks::Now() { return {42}; }

  base::TimeTicks GetTimeTicks()
    __attribute__((alias("_ZN4base9TimeTicks3NowEv")))
    __attribute__((visibility("protected")));

This works because everything is defined in the same object file. However, the following fails:

foo_2.cc:

  namespace base {
  struct TimeTicks {
    static TimeTicks Now() { return {42}; }
    long ticks_;
  };
  }  // namespace base

  base::TimeTicks GetTimeTicks()
    __attribute__((alias("_ZN4base9TimeTicks3NowEv")))
    __attribute__((visibility("protected")));

Because:

  1) base::TimeTicks::Now() is now an inlined method, and isn't compiled into the object file (even with -O0 -g) since nothing references it.

  2) __attribute__((alias(...))) only works if the referenced symbol is already
     defined.

This also prevents "extern" aliases as in:

foo_exports_only.cc:

  extern void Bar();
  void Foo() __attribute__((alias("_Z3Barv")));   // Foo() -> Bar()

The compiler will complain with something like:

  error: 'void Foo()' aliased to undefined symbol '_Z3Barv'.

This is a larger problem, because it would mean we would have to modify each source file that contains exported symbols, which doesn't sound like a good idea at all :-/

I even found a related article on the subject: it looks like extern aliases are possible on OS X and Windows, but not on ELF systems:

  http://blog.omega-prime.co.uk/2011/07/06/the-sad-state-of-symbol-aliases/


Alternatives to consider for now:

  1) Use linker version scripts to add exported symbols to the local ones.
     Is such a thing possible? Link to script documentation:

       https://sourceware.org/binutils/docs/ld/Scripts.html#Scripts

  2) Play clever tricks in the compiler wrapper script, e.g. recognize
     white-listed source files (e.g. base/time/time.cc) at compile time and
     generate a temporary version that appends alias/export lines to the
     source before compiling it to an object file. We might want to do that
     on the pre-processed files to avoid messing with the source file path
     in the debug info / stack traces. (A rough sketch follows below.)

  3) Something else?
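
To make option 2) a bit more concrete, here is a very rough sketch of the
wrapper trick (the whitelist contents, file names and the appended alias text
are all hypothetical, and this works on the original source rather than the
pre-processed output, so it doesn't address the debug info path concern above):

  # compile_wrapper.py (hypothetical name)
  # Usage: compile_wrapper.py <real compiler> <compiler args...>
  import subprocess
  import sys
  import tempfile

  # Alias/export lines to append to white-listed source files. In reality
  # these would be auto-generated from the symbol list.
  ALIAS_LINES = {
      'base/time/time.cc': [
          'base::TimeTicks GetTimeTicksAlias()',
          '    __attribute__((alias("_ZN4base9TimeTicks3NowEv"),',
          '                   visibility("protected")));',
      ],
  }

  def main(argv):
      compiler, args = argv[1], argv[2:]
      sources = [a for a in args if a in ALIAS_LINES]
      if sources:
          src = sources[0]
          with open(src) as f:
              text = f.read() + '\n' + '\n'.join(ALIAS_LINES[src]) + '\n'
          tmp = tempfile.NamedTemporaryFile(suffix='.cc', mode='w', delete=False)
          tmp.write(text)
          tmp.close()
          # Compile the augmented copy instead of the original source file.
          args = [tmp.name if a == src else a for a in args]
      return subprocess.call([compiler] + args)

  if __name__ == '__main__':
      sys.exit(main(sys.argv))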

Those are really interesting findings.

It's probably worth restating now that, in terms of any symbol exporting, the primary concern of base owners is complexity.  That makes the latest option (2) sound non-ideal, although it's certainly an option.
Simple assembly-based jump stubs seem to work well; here's an example for x86_64:

foo_exports_x86_64.S:

  .type Foo,STT_FUNC

  .global Foo_exported    # Make "Foo_exported" exported.

  Foo_exported:
    jmp Foo               # Direct PC32-relative jump relocation.


The same for ARM:

  .type Foo,STT_FUNC
  .global Foo_exported
  Foo_exported:
    ldr ip,=Foo      @ |ip| is a scratch register during function calls,
    bx ip            @ it is also used by PLT entries to perform jumps.

The result links properly with an implementation of Foo that is hidden due to -fvisibility=hidden, and the resulting shared libraries export the Foo_exported symbol.

I've looked at the final disassembly and it seems perfectly correct.

We should be able to generate libchrome_base_exports.S automatically for the target CPU architecture from a simple list of symbols (mangled C++ ones, to be exact), and just link the resulting object file into libchrome_base.so.

I'll provide more info later, I need to leave the office. This seems much simpler than using alias symbols after all :)
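
As a sketch, generating libchrome_base_exports.S from the symbol list could
look roughly like the following (script and file names are hypothetical; only
the x86_64 stub form from above is shown, and every other architecture would
get its own stub template):

  # generate_exports_asm.py (hypothetical name)
  # Emits one jump stub per mangled symbol name, using the x86_64 stub
  # form shown above.
  import sys

  X86_64_STUB = '''\
  .type {name}_exported,STT_FUNC
  .global {name}_exported
  {name}_exported:
    jmp {name}               # Direct PC32-relative jump to the hidden symbol.

  '''

  STUB_TEMPLATES = {'x86_64': X86_64_STUB}

  def generate_exports_asm(symbols, cpu_arch):
      """Return the text of the exports assembly file for |cpu_arch|."""
      template = STUB_TEMPLATES[cpu_arch]
      return ''.join(template.format(name=name) for name in symbols)

  if __name__ == '__main__':
      # Usage: generate_exports_asm.py <symbol list file> <cpu arch>
      with open(sys.argv[1]) as f:
          symbols = [line.strip() for line in f if line.strip()]
      sys.stdout.write(generate_exports_asm(symbols, sys.argv[2]))
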
A few more remarks about #9:

This is equivalent to writing a wrapper function, e.g.

  __attribute__((visibility("protected")))
  void Foo_exported() {
    Foo();
  }

Except that:

  1) This does not alter the call stack or registers during the call, so
     it is completely transparent in the call stack. This is due to the
     use of a direct jump instruction, which cannot be performed from C / C++.

  2) The C or C++ version would need to replicate the exact signature of each
     symbol, including its return type, and this is simply not available
     easily (e.g. the mangled C++ name contains the types of the parameters
     but not that of the return value).

The drawback is that this requires platform-specific assembly stubs, but that shouldn't be a great issue in practice.

Also, we should be able to generate reverse jump stubs for libvr_ui.so to avoid playing stupid macro tricks when compiling its object files, i.e.:

  1/ Compile a libvr_ui.so source file that includes something from 
     libchrome_base.so normally. E.g. its object file imports 'Foo'

  2/ Generate the following stub that would be linked into the final
     libvr_ui.so:

         .type Foo_exported,STT_FUNC
         Foo:                      # Local definition of Foo.
           jmp Foo_exported        # Jumps to the symbol exported by libchrome_base.so.
         
In practice, this means that each call to Foo from libvr_ui.so will result in a triple indirection:

     1) libvr_ui.so: the local Foo stub jumps to Foo_exported through the PLT.
     2) The PLT entry jumps to Foo_exported in libchrome_base.so.
     3) libchrome_base.so: Foo_exported jumps to the local (hidden) Foo.

I don't think it will be very noticeable in practice. There are ways to turn this into a simple indirection by essentially re-implementing a custom PLT scheme between libchrome_base.so and libvr_ui.so, but I won't detail it here for now.

*HOWEVER*, one thing this method doesn't handle is accessing global data symbols exported by libchrome_base.so from libvr_ui.so, because the latter cannot simply "jump" to the address. I'm currently looking into it.

More experimentation shows that it is not possible to stub access to global objects (data) from one shared library to the other. To briefly describe the details, consider that libvr_ui.so wants to access the |gData| global variable at some point.

  1/ The generated machine code to do that in libvr_ui.so would do something
     similar to:

        r1 = pc + kPcRelativeOffsetToGOT     # Get GOT address into |r1|.
        r2 = [r1 + kGDataAddressIndex * 4]   # Get address of gData into |r2|.

     where kPcRelativeOffsetToGOT and kGDataAddressIndex are hard-coded
     constants in the read-only machine code stream.

     kGDataAddressIndex is a slot index into the GOT. That slot contains the
     address of gData; its value is determined at load time by the dynamic
     linker through a symbolic relocation against the "gData" imported symbol.

  2/ There is no simple way to get the address of GOT[kGDataAddressIndex],
     either from assembly or C++, so we cannot just patch the address at
     runtime.

  3/ Moreover, the symbolic relocation would fail unless libchrome_base.so
     exported the symbol as "weak", which is essentially not possible
     without changing the sources where |gData| is defined, and we want
     to avoid that.

I think the simplest solution is simply to prevent libvr_ui.so from accessing
any global variable. If it really needs the address of some data from libchrome_base.so, the latter should provide a function to return it, as in:

   static GData& GetGData() { return gData; }

One important thing though is that to be able to create a jump stub for it, the GetGData() function must first exist in libchrome_base.so, which may or may not happen for simple inlined functions.

By default, the compiler decides whether an inline function/method declaration should generate an instance of the function or not. For very short ones like the one above, it could decide to never instantiate it, and rely exclusively on inlining the code into the call site. If this happens, we will get a data symbol relocation in libvr_ui.so which will simply fail to link, either statically or at runtime.

So, we just need to ensure that these functions are not always completely inlined. There are two ways to do that:

  A) Make the function non-inlined, which is easy, i.e. in a header file:

        static GData& GetGData();

     The implementation then lives in a separate C++ source file linked into
     libchrome_base.so.

  B) Ensure a copy of the function is always instantiated in libchrome_base.so,
     and none exists in libvr_ui.so. This could be done using function
     attributes like [[gnu::gnu_inline]]. For example:

       class Foo {
        public:
         // Always generate a visible static method Get() instantiation
         // even though its content is still available to caller site inlining.
         [[gnu::gnu_inline]] [[gnu::visibility("default")]]
         static int& Get() { return x_; }

        private:
         static int x_;
       };

  Note that in this case, Foo::Get() would not need a jump stub at all.    

  For more information about gnu_inline:

       https://clang.llvm.org/docs/AttributeReference.html#gnu-inline

In the VR context, lack of direct access to data affects use of vector icons (example below).  That's a case where we'd need to employ the getter method approach (probably along with an enum of icons that the VR module needs access to).

https://cs.chromium.org/chromium/src/chrome/browser/vr/ui_scene_creator.cc?type=cs&sq=package:chromium&q=ui_scene_crea&g=0&l=2615
Cc: -cjgrant@chromium.org digit@chromium.org
Labels: -Pri-3 Pri-2
Owner: cjgrant@chromium.org
Status: Started (was: Available)
Just a quick correction: on ARM, the assembly should use relative offsets, so it would look like:

  .type Foo,STT_FUNC
  .global Foo_exported

  Foo_exported:
    ldr ip, Foo_exported_offset
    add pc, ip

  Foo_exported_offset:
    .word Foo - (Foo_exported + 8)

The reason for the +8 is the following: on ARM, the value of |pc| will be the current instruction address + 4, so when executing the |add pc, ip| line, the value of |pc| will really be |Foo_exported + 8|, and you want to add a relative offset that makes the sum point to Foo.

Actually, I may be wrong about the offset, so let me check further, but you have the general idea. This should result in an R_ARM_REL32 relocation being used to define the value of the offset, which will be resolved properly by the static linker.


This is linking!  I'm re-attaching the VR module now (pulling it out of the prototype CL), and will plumb in the inverse of what's above to adapt the module's code to these interface shims.

Thanks for doing the heavy lifting in terms of the assembly code, David!
Thanks, it links but I suspect this won't work at runtime.
I've created an experimental CL that uses this technique, but it results in a runtime crash [1]. It looks like ThinLTO is actually doing weird things to the final code.

Looking into it. I suspect we need more GNU assembler directives to ensure the linker does the right thing in the end (e.g. placing each function in its own section, things like that).

[1] https://chromium-review.googlesource.com/c/chromium/src/+/1339819
Cc: p...@chromium.org
As I mentioned over GVC to cjgrant@ a few months ago, this is likely to make the rollout of CFI significantly more complicated unless the list of exported symbols is controlled *very* closely. Please try not to expose any interfaces between libvr_ui.so and libchrome_base.so that pass objects with virtual functions between the two modules. To help avoid regressions, please make sure that the code in libvr_ui.so is tested by the "Android CFI" bot.

>  Foo_exported:
>    ldr ip, Foo_exported_offset
>    add pc, ip
>
>  Foo_exported_offset:
>    .word Foo - (Foo_exported + 8)

This can just be "b Foo", no?
Regarding #18, do you have more information about this CFI rollout and why it would restrict passing interfaces between libvr_ui.so and libchrome_base.so (which would essentially kill this whole project)?

Regarding "b Foo":

- We would really need the "bx" instruction to not worry about thumb/arm
  interwork, and this one doesn't have an immediate offset encoding.

  And for the record, writing to the |pc| register directly already deals
  with thumb/arm state changes automatically on ARMv6+.

- "b Foo" only works if the destination is sufficiently close to the branch
  instruction, which cannot always be guaranteed. Otherwise this would result
  in a static link-time error. Using a hard-coded 32-bit relative offset
  avoids the problem.

Answering my own question with this link: https://clang.llvm.org/docs/ControlFlowIntegrityDesign.html

Still trying to understand the consequences of this though :-/
"b Foo" causes the assembler to produce an R_ARM_JUMP24 (or R_ARM_THM_JUMP24 on Thumb) relocation pointing to Foo. Linkers are programmed to create ARM/Thumb interworking thunks and range-extension thunks for these types of relocations automatically. This is necessary otherwise most large programs (including libmonochrome.so) wouldn't link. See for example in lld: http://llvm-cs.pcc.me.uk/tools/lld/ELF/Arch/ARM.cpp#259 (this is the code where we decide whether we need to create a thunk for a relocation).
