Change StructTraits deserializer to return an Optional<T> instead of a bool + T* in an output param |
||||||
Issue descriptionRight now, I see this written in a number of different ways. 1. Deserialize into a temporary and copy into the output var if good: std::string x; if (!view.ReadField(&x)) return false; out->x = x; return true; 2. Deserialize directly into the field (if a friend or public): return view.ReadField(&out->x); 3. Indirect through std::unique_ptr and construct into the result: std::string x; if (!view.ReadField(&x)) return false; out->reset(new Out(x)); return true; and others. Now that we have Optional, it would be nice to combine the error signalling and return value into one. - This avoids confusion around whether or not output params are used on failure (they aren't, but it's not clear from the current design) - No more "default ctor for IPC/Mojo" The main question here is how this would affect perf/binary size.
,
Feb 14 2017
RE original bug report:
There are a few things that need to consider:
- (minor issue though) It is slightly more verbose because now we need to explicitly provide the type for deserialization.
out->foo = view.ReadField<FooType>();
if (!out->foo)
return false;
V.S.
if (view.ReadField(&out->foo))
return false;
- It is a little confusing when the field is "nullable" and is meant to be deserialized as a base::Optional<FooType>:
// The outer base::Optional<> indicates whether the deserialization succeeds; the inner base::Optional<>
// indicates whether the object is null.
base::Optional<base::Optional<FooType>> foo = view.ReadField<base::Optional<FooType>>();
if (!foo) return false;
out->foo = std::move(foo.value());
Users may mistakenly do:
base::Optional<FooType> foo = view.ReadField<FooType>();
out->foo = std::move(foo);
> - No more "default ctor for IPC/Mojo"
I think this is a nice thing.
RE #1:
Sorry I don't understand your idea. Would you please explain a little bit and give examples? Thanks!
,
Feb 14 2017
Read(T*) {} takes an already constructed T. So you can't initialize it by calling a constructor. This makes it easy to build a T that is not a valid state.
class {
T(int)
T(flat)
private:
T(); // for mojo
friend StructTraits<T>;
int i_;
float f_;
}
In this example itd be very easy to write mojo that would decode a message with both a float and an int, even tho T is only ever actually able to be constructed with one or the other.
It also prevents fields in T from being const.
,
Feb 14 2017
,
Feb 14 2017
RE #3: I see what you meant. I assume Daniel's idea of "No more 'default ctor for IPC/Mojo'" means the same thing. I agree that it is a nice thing. As I described in #2, the one major thing I am not very satisfied with is that base::Optional<> is used for both nullable and deserialization failure, which could be confusing in some cases. Maybe we could introduce some different type, say, DeserializationResult, which has similar behavior as base::Optional<> but is an incompatible type? WDYT?
,
Feb 14 2017
You'd still have to step thru multiple levels of indirection and check for valid or not along the way.. fwiw the code you wrote above won't work unless the object stores an optional. base::Optional<base::Optional<FooType>> foo = view.ReadField<FooType>(); if (!foo) return false; if (*foo) out->foo = std::move(**foo);
,
Feb 14 2017
Optional<Optional<T>> is kind of obnoxious. However, I think nullable types are easy to overuse, which this would discourage =P It's also a little unusual to have what is essentially a duplicate of Optional in everything but in name. However, all that said... Optional<Optional<T>> is pretty unreadable though and the meaning would be pretty ambiguous... so perhaps it's better to have a distinct type =)
,
Feb 14 2017
RE #6: > fwiw the code you wrote above won't work unless the object stores an optional. Yeah. My example in #2 about nullable type assume that out->foo is base::Optional<FooType>, which is pretty common for optional things.
,
Mar 9 2017
I've been thinking about this, and one thing I'm concerned about with this proposal is the extra temporaries and copies involved. With the current deserialization mechanism, we deserialize in place so it's possible to minimize the number of copies. However, this approach requires deserializing all the arguments before invoking the object's constructor. This is extra stack space to hold the temporaries, as well as extra code to create/destroy the objects in question. In addition, if we have structs B and C nested in A, we'll need to deserialize B and C first--and then we construct A, we will (likely) end up copying B and C into A when we pass it into the constructor.
,
Mar 9 2017
RE #9: Yeah. It seems possible to result in more copies. Although the current approach has the requires-default-constructor issue, maybe that is a relatively minor issue comparing with extra potential copies? WDYT?
,
Mar 9 2017
To provide the opposite POV.. bool deserialize(T*) can be used to deserialize in place, but I am finding cases where not calling the constructor is problematic, because constructors if not trivial can impose restrictions on the state of the object. Setting fields directly is like memcpying into the object from a malicious memory range. So when going thru the constructor now you have to build a T, and call a copy/move constructor to assign it to pointer. With an optional you'd get some RVO instead.
,
Mar 9 2017
> bool deserialize(T*) can be used to deserialize in place, but I am finding > cases where not calling the constructor is problematic I don't think we suggest users to use a not-yet-constructed T*. They should pass in a valid T object, which is usually default constructed. The deserialize() doesn't take responsibility of constructing the object.
,
Mar 9 2017
I think the problem is that for some Ts, there is no sensible default-constructed state. It would be nice not to have a "temporarily invalid default constructed object" just for IPC. At the same time, it would be good to make sure we still have good codegen. And deserialization performance is pretty important in many areas. It's true that with an optional you get RVO if you don't have to go through the assignment, but many types also don't do that and just assign fields through setters, so there is no copy/move involved at all.
,
Mar 9 2017
Ya I think there is an important distinction that is missed by mojo overall. For some types any internal state is valid. This includes primitive types, or types with public members. In this case it seems that we have very high validation costs from mojo right now for no benefits. For some types there are invalid states. Simple example is gfx::Size can't have a negative width or height. But if you friended mojo and deserialized into it (even if default constructed to 0), you could easily receive negative values and place them there, unless you duplicate the logic of the constructor in mojo, which is not intuitive. In this case the validation is better done by the constructor than by mojo. There's probably some middle ground too.
,
Mar 9 2017
> For some types any internal state is valid. This includes primitive types, or > types with public members. In this case it seems that we have very high > validation costs from mojo right now for no benefits. I don't think I understand what you mean by very high validation cost. Would you please explain?
,
Mar 9 2017
> For some types there are invalid states. Simple example is gfx::Size can't have a negative width or height. > But if you friended mojo and deserialized into it (even if default constructed to 0), you could easily receive > negative values and place them there, unless you duplicate the logic of the constructor in mojo, which is not > intuitive. In this case the validation is better done by the constructor than by mojo. (In this example, the constructor silently replaces negative values with 0. That is not exactly what mojo validation should do -- drop this message and close the pipe.) But it makes sense to me to share validation logic somehow.
,
Mar 9 2017
> I don't think I understand what you mean by very high validation cost. Would you please explain? weiliangc@ has been investigating the cost of serializing cc::CompositorFrame and has been finding a lot of time spent on validation of POD. She's still looking into things but you could reach out to her if you wanna know more right away (could be very productive).
,
Mar 9 2017
Validation happens as a totally separate step, prior to deserialization (where StructTraits are used). So that seems to be a topic unrelated to our current discussion. Validation is done on serialized mojo objects (as bytes in mojo messages). So they are not exactly the same PODs as in-memory objects. For example, struct has a header, nested structs are not in-lined, etc. I would be happy to learn about opportunities to improve the validation perf.
,
Mar 14 2017
I was looking at IPC perf for compositor frames, and thinking about copying bulk memory of POD rather than iterating through individual structs as a way to improve. IPC system doesn't have explicit validation, other than some checks while reading. I am still exploring the possibilities. Also I have no idea whether this idea is even valid in Mojo. I'll start email thread offline about investigating this idea. For perf of validation, over half of the time in deserialization is in Validation.
,
Mar 14 2017
Sounds good. I would be happy to learn more about the idea. Using bulk memory of POD is possible, although that means the user takes full responsibility how to interpret that block of memory. If it is used by multiple environments (C++/Java/JavaScript), the users need to make sure they interpret the block in the same way, for example. You could for example, use an array<uint8> for it in your mojo interfaces or structs. I also looked at the cc serialization performance tests before, I think validation took maybe one-third of the deserailization time. But that really depends on the objects being tested. So half of the time in validation is also possible.
,
Mar 15 2017
weiliangc@: can you clarify what you mean by the IPC system doesn't have explicit validation? AFAIK, it does (when you read an int, it makes sure there's enough bytes left in the IPC pickle for a 4-byte integer). Do you mean that Mojo is doing a lot of additional checks for POD types? AFAIK, we strongly discourage anything from going over IPC as a block of bytes that's just memcpy'ed around.
,
Apr 6 2018
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Oct 17
,
Nov 15
Triaging. This still seems like a nice improvement. I think a good compromise would be to support [no_default_constructor] as a type-mapping option for specific types. This would result in generated code which expects that type's traits' Read() method to take a base::Optional<T>* instead of a T*. Then we can still avoid base::Optional<base::Optional<T>> in the most common nullable type cases. |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by danakj@chromium.org
, Feb 14 2017