mojo strings should require valid utf-8/utf-16
Issue description
We don't really do this today, but things like WTF::String do require valid UTF-8. We should try enabling this check universally, and encourage all binary data to be passed using something like array<uint8> instead. The current alternative is for individual struct traits to check for UTF-8 themselves, which is ad hoc and easy to get wrong or forget.
Alternatively, there's been a longstanding proposal to make it possible to enforce simple length constraints in mojom. String encoding validation seems like it would be a natural fit there as well.
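For reference, a minimal sketch of the kind of check being proposed. IsValidUtf8 is a hypothetical standalone helper written for illustration, not an existing Mojo or WTF API; it assumes the usual definition of well-formed UTF-8 (no overlong encodings, no surrogates, nothing above U+10FFFF).

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper (not a Mojo API): returns true if `s` is well-formed
// UTF-8. Rejects truncated sequences, stray continuation bytes, overlong
// encodings, UTF-16 surrogates, and code points above U+10FFFF.
bool IsValidUtf8(std::string_view s) {
  const std::size_t n = s.size();
  std::size_t i = 0;
  while (i < n) {
    const auto b0 = static_cast<std::uint8_t>(s[i]);
    if (b0 < 0x80) {  // ASCII, single byte
      ++i;
      continue;
    }
    std::size_t len = 0;    // total bytes in this sequence
    std::uint32_t cp = 0;   // decoded code point
    std::uint32_t min = 0;  // smallest code point allowed for this length
    if ((b0 & 0xE0) == 0xC0) {
      len = 2; cp = b0 & 0x1F; min = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {
      len = 3; cp = b0 & 0x0F; min = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {
      len = 4; cp = b0 & 0x07; min = 0x10000;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (n - i < len) return false;  // sequence truncated at end of input
    for (std::size_t k = 1; k < len; ++k) {
      const auto b = static_cast<std::uint8_t>(s[i + k]);
      if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
      cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min) return false;                      // overlong encoding
    if (cp > 0x10FFFF) return false;                 // beyond Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
    i += len;
  }
  return true;
}
```

The loop touches every byte, so the cost of a check like this grows linearly with the string length.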
Mar 8 2017
I wrote a perf test to compare string deserialization with and without the UTF-8 check: https://codereview.chromium.org/2738643004

The following numbers were obtained with these settings: Z620; Linux; non-component release build.

Command line: mojo_public_bindings_perftests --gtest_filter=*String*

DeserializeString_NoUtf8Check/8       2.03549e+07 times/second
DeserializeString_Utf8Check/8         1.55403e+07 times/second
DeserializeString_NoUtf8Check/128     2.0278e+07 times/second
DeserializeString_Utf8Check/128       3.01196e+06 times/second
DeserializeString_NoUtf8Check/1024    1.64714e+07 times/second
DeserializeString_Utf8Check/1024      446277 times/second

The check has quite a big impact as the length increases: throughput drops by about 1.3x at /8, about 6.7x at /128, and about 37x at /1024. But I think long strings are typically used to transfer binary data, and those should be converted to array<uint8>. I will send a mail to chromium-mojo@.
Nov 15
Issue 900747 has been merged into this issue.
Nov 15
Triage refresh. Still seems like a good idea. We can also add a DCHECK-guarded validation step on send to catch mistakes earlier.
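A rough sketch of what a debug-only send-side check could look like. Everything here is hypothetical: PrepareStringForSend and Utf8Validator are illustrative names rather than existing bindings hooks, and the standard assert() stands in for DCHECK so that the example compiles outside Chromium.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical signature for whatever UTF-8 validator the bindings would use.
using Utf8Validator = bool (*)(std::string_view);

// Hypothetical send-side hook: validates the string in debug builds only and
// then hands it on unchanged. assert() stands in for DCHECK here; like DCHECK
// in typical release builds, it compiles to nothing when NDEBUG is defined,
// so release builds pay no validation cost on send.
std::string PrepareStringForSend(std::string value,
                                 Utf8Validator is_valid_utf8) {
  assert(is_valid_utf8(value) && "non-UTF-8 data passed as a mojo string");
  (void)is_valid_utf8;  // avoid an unused-parameter warning under NDEBUG
  return value;
}
```

The point of checking on send is that the failure shows up in the process that produced the bad string, rather than as a deserialization error on the receiving side.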
Jan 19
A parallel that might be helpful: protobuf has a "string" type for UTF-8 text and a separate "bytes" type for binary data. https://developers.google.com/protocol-buffers/docs/proto?csw=1#scalar
Comment 1 by yzshen@chromium.org, Mar 8 2017