
Issue 902557

Starred by 3 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




Investigate potential binary size reduction approaches for generated Mojo code

Project Member Reported by oksamyt@chromium.org, Nov 6

Issue description

As a follow-up to the Mojo-related binary size increases described in https://crbug.com/597125, this issue covers measuring the potential savings from sharing the same serialization code for the same sets of data fields (the same approach as in the new JS bindings, https://crbug.com/894376).
 
Labels: -Pri-3 Pri-2
Labels: Performance-Size
See also bug 689690
FYI, we currently have over 2MB of Mojo-generated code in Chrome for Android: https://storage.googleapis.com/chrome-supersize/viewer.html?load_url=milestones%2Farm%2FMonochrome.apk%2Freport_71.0.3578.20.ndjson&include=mojom&exclude=assets%2Funwind_cfi

It would be great to prioritize this.
Cc: roc...@chromium.org jam@chromium.org dougt@chromium.org
Labels: -Pri-2 Pri-1
Actually, that's an overestimate, because it also includes symbols whose names contain "mojom" but which may come from actual client code calling into the generated code. The amount of purely generated code seems closer to 1.2MB, so the real figure is somewhere between those two numbers.

Note that it has grown by ~600KB (i.e. doubled) since M61: https://storage.googleapis.com/chrome-supersize/viewer.html?load_url=milestones%2Farm%2FMonochrome.apk%2Freport_61.0.3163.98_71.0.3578.20.ndjson&include=mojom.cc&exclude=assets%2Funwind_cfi&diff_mode=on
(presumably as things get mojo-ified)
Cc: haraken@chromium.org
One thing that's not measured is that a lot of onion-souping has happened, which removed code in content/ and made Blink call directly into the browser. So the growth in Mojo bindings since M61 shown above doesn't account for the amount of deleted code.

Kentaro: do we have stats on how many LOC have been deleted so far because of Onion Soup?
More specifically, what oksamyt@ and I were discussing the other day was to experiment with a rewrite of the C++ bindings in a fashion similar to the recent JS bindings rewrite: rather than generating serialization and deserialization code, we generate only declarative descriptions of structures and methods. Then the generated Proxy/Stub types call into shared code in mojo/public, passing their inputs along with whatever relevant structure descriptions are needed to interpret them.

I think we can get away with something like:

  // Base case: no arguments left to serialize.
  inline bool SerializeFields(MessageEncoder* m,
                              const StructDescription& params_description,
                              size_t field_index) {
    return true;
  }

  // Serializes the next argument, then recurses on the remaining ones.
  template <typename NextArg, typename... Rest>
  bool SerializeFields(MessageEncoder* m,
                       const StructDescription& params_description,
                       size_t field_index,
                       NextArg&& next_arg,
                       Rest&&... rest) {
    return SerializeField(m, params_description[field_index],
                          std::forward<NextArg>(next_arg)) &&
        SerializeFields(m, params_description, field_index + 1,
                        std::forward<Rest>(rest)...);
  }

  // Entry point called by generated Proxy/Stub code.
  template <typename... Args>
  bool SerializeMessage(MessageEncoder* m,
                        const StructDescription& params_description,
                        Args&&... args) {
    return SerializeFields(m, params_description, 0,
                           std::forward<Args>(args)...);
  }

Then we'd have specializations of SerializeField for every serializable type T, and generic helpers for using mojom traits and stuff like that. Note that because MessageEncoder and StructDescription would be concrete types, this will result in unique instantiations of SerializeFields only for unique sequences of types as they appear in method or struct signatures. So like:

  struct Point {
    int32 x;
    int32 y;
  };

  interface Foo {
    Erase(int32 x, int32 y);
    Write(string s, int32 x, int32 y);
  };

The bulk of compiler-generated serialization code supporting Erase and Point will be exactly the same, rooted in the specialization SerializeFields<int32_t, int32_t>. And for Write, we would have SerializeFields<std::string, int32_t, int32_t>, and that would merely be implemented in terms of SerializeField<std::string> and SerializeFields<int32_t, int32_t>.
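
To make the sharing concrete, here is a minimal self-contained sketch of the idea. Everything in it is an assumption made for illustration: FieldType, FieldDescription, the vector-based StructDescription, this toy MessageEncoder, and the SerializePoint/SerializeEraseParams/SerializeWriteParams wrappers are hypothetical and not real mojo/public types. It also uses plain overloads of SerializeField rather than template specializations, and takes arguments by const reference so that the instantiations are named exactly as above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not real mojo/public types.
enum class FieldType { kInt32, kString };
struct FieldDescription { FieldType type; };
using StructDescription = std::vector<FieldDescription>;

class MessageEncoder {
 public:
  void WriteBytes(const void* data, size_t size) {
    const auto* p = static_cast<const uint8_t*>(data);
    buffer_.insert(buffer_.end(), p, p + size);
  }
  const std::vector<uint8_t>& buffer() const { return buffer_; }

 private:
  std::vector<uint8_t> buffer_;
};

// One SerializeField overload per serializable type. Each one checks the
// value's type against the description before writing anything.
inline bool SerializeField(MessageEncoder* m, const FieldDescription& d,
                           int32_t value) {
  if (d.type != FieldType::kInt32) return false;
  m->WriteBytes(&value, sizeof(value));
  return true;
}

inline bool SerializeField(MessageEncoder* m, const FieldDescription& d,
                           const std::string& value) {
  if (d.type != FieldType::kString) return false;
  const uint32_t length = static_cast<uint32_t>(value.size());
  m->WriteBytes(&length, sizeof(length));  // Toy length-prefixed encoding.
  m->WriteBytes(value.data(), value.size());
  return true;
}

// Base case: succeeds only if every described field was consumed.
inline bool SerializeFields(MessageEncoder*, const StructDescription& d,
                            size_t field_index) {
  return field_index == d.size();
}

// Instantiated once per distinct sequence of argument types.
template <typename Next, typename... Rest>
bool SerializeFields(MessageEncoder* m, const StructDescription& d,
                     size_t field_index, const Next& next,
                     const Rest&... rest) {
  return field_index < d.size() &&
         SerializeField(m, d[field_index], next) &&
         SerializeFields(m, d, field_index + 1, rest...);
}

// What generated code might shrink to: a description plus a single call.
inline bool SerializePoint(MessageEncoder* m, int32_t x, int32_t y) {
  static const StructDescription kPoint = {
      FieldDescription{FieldType::kInt32}, FieldDescription{FieldType::kInt32}};
  return SerializeFields(m, kPoint, 0, x, y);  // <int32_t, int32_t>
}

inline bool SerializeEraseParams(MessageEncoder* m, int32_t x, int32_t y) {
  static const StructDescription kErase = {
      FieldDescription{FieldType::kInt32}, FieldDescription{FieldType::kInt32}};
  return SerializeFields(m, kErase, 0, x, y);  // Same instantiation as Point.
}

inline bool SerializeWriteParams(MessageEncoder* m, const std::string& s,
                                 int32_t x, int32_t y) {
  static const StructDescription kWrite = {
      FieldDescription{FieldType::kString}, FieldDescription{FieldType::kInt32},
      FieldDescription{FieldType::kInt32}};
  // <std::string, int32_t, int32_t>, which recurses into <int32_t, int32_t>.
  return SerializeFields(m, kWrite, 0, s, x, y);
}
```

In this sketch the only per-struct or per-method code is the static description and the one-line wrapper; the field_index is a runtime value, so the tail of Write reuses the same SerializeFields<int32_t, int32_t> instantiation that Point and Erase call.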

So the net result *should* be:

  - Lots of de-duplication of what is currently redundant inlined serialization code
  - Probably a measurable performance degradation, but only affecting serialization
    and probably eclipsed by cost of IPC (e.g. context switching)

One *major* win would also be that lazy serialization would no longer have any code size cost whatsoever, so we could just turn it on everywhere. That would relegate the performance degradation to strictly the inter-process case.
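
As a rough illustration of why lazy serialization could become nearly free in code size under this scheme, here is a small sketch. It assumes a hypothetical LazyMessage wrapper and hypothetical EncodeField/EncodeTuple helpers (none of these are existing Mojo APIs, and only int32 fields are handled): the message simply keeps its arguments, and the shared encoder runs only if the bytes are actually requested.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <utility>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical shared field encoder; only int32 is shown, in host byte order.
inline void EncodeField(Bytes* out, int32_t value) {
  const auto* p = reinterpret_cast<const uint8_t*>(&value);
  out->insert(out->end(), p, p + sizeof(value));
}

// Calls EncodeField once per tuple element, in order.
template <typename Tuple, size_t... I>
void EncodeTuple(Bytes* out, const Tuple& t, std::index_sequence<I...>) {
  int unused[] = {0, (EncodeField(out, std::get<I>(t)), 0)...};
  (void)unused;
}

// Keeps the call arguments unserialized until bytes are actually required.
template <typename... Args>
class LazyMessage {
 public:
  explicit LazyMessage(Args... args) : args_(std::move(args)...) {}

  // Same-process receivers read the arguments directly; nothing is encoded.
  const std::tuple<Args...>& args() const { return args_; }

  // Intended to be called only when the message must leave the process.
  const Bytes& Serialize() {
    if (!serialized_) {
      EncodeTuple(&bytes_, args_, std::index_sequence_for<Args...>{});
      serialized_ = true;
    }
    return bytes_;
  }

  bool serialized() const { return serialized_; }

 private:
  std::tuple<Args...> args_;
  Bytes bytes_;
  bool serialized_ = false;
};
```

For example, a LazyMessage<int32_t, int32_t> built from (3, 4) reports serialized() == false until Serialize() is first called, and repeated calls return the same cached 8 bytes. The per-message cost is the tuple storage; the encoding logic itself is shared.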

Finally, it's also worth noting that this effort would coincide nicely with an opportunity to add a new Mojom IDL (in addition to the core Mojo IDL we have) for Blink, with an API surface that takes structure descriptions and arbitrary args and does de/serialization in native C++. This would significantly benefit JS code size and performance, and reduce tons of redundancy (C++ and JS bindings would share all serialization code and most bindings machinery like associated interfaces) and allow for some nice features like sync calls from JS, which would be especially nice in test code.
Re #9: I haven't counted deleted LOC recently. When I counted 6 months ago, it was 40+k LOC (though this number includes code removed from WTF and blink/web/).


We will prioritize the investigation of the approach from #10 and follow up with plans depending on the result.
