'Bad inspector message' errors when running URL web-platform-tests via webdriver
Reported by
sim...@opera.com,
May 17 2017
|
|||||||
Issue descriptionConversation from #testing on irc.w3.org: <zcorpan> https://travis-ci.org/w3c/web-platform-tests/jobs/232845351 looks like something weird happened <jgraham> zcorpan: That's pretty funny; I think it's a bug in Chromedriver I don't know what the bug is, so I guess this needs some investigation...
,
May 26 2017
,
May 26 2017
My investigation shows that this is not a ChromeDriver issue, but is a combination of at least two issues inside Chrome. 1. The URL parsing code in Chrome is not standard compliant, according to tests at https://github.com/w3c/web-platform-tests/blob/master/url/urltestdata.json. For example, one test case has an input URL of "non-special://test:@test/x", and expects the browser to change it to "non-special://test@test/x" (note the removal of a ':'), but Chrome keeps the URL unchanged. (I tried this on some other browsers, and found Safari handles it correctly, while Edge makes the same mistake as Chrome.) Overall, about a third of these test cases fail with Chrome, resulting in a large amount of error messages. 2. A JSON encoding/decoding issue caused the "Bad inspector message" error reported at https://travis-ci.org/w3c/web-platform-tests/jobs/232845351. Part of the error message from part 1 contains an invalid Unicode character \uFDD0 (from https://github.com/w3c/web-platform-tests/blob/34435a4/url/urltestdata.json#L3596). The JSON encoder inside Chrome didn't detect such error, and passed it through in the JSON blob sent to ChromeDriver. ChromeDriver uses base/json/json_parser.cc to parse the JSON string. This parser does a more thorough error detection, notices that \uFDD0 is an invalid character, and reports an error. I think our JSON encoder and decoder should have exactly the same amount of error checking. It's problematic that the encoder can create a blob that is rejected by the decoder.
,
Jul 4 2017
Thanks John. This was just hit again here: https://travis-ci.org/w3c/web-platform-tests/jobs/249949077. I can reproduce the failure whenever I run any of the URL tests with wptrun (eg. (./wptrun chrome url/url-constructor.html - see https://github.com/w3c/web-platform-tests#running-tests-automatically). The high URL test failures are a known problem, see issue 660384, issue 682149. What JSON encoder is used here? The JavaScript JSON API is allowed to have invalid UTF-16 strings (though there's some debate about changing this here: https://github.com/tc39/ecma262/issues/944). I verified that I can round-trip this invalid-string via JSON.stringify/parse on Chrome and Firefox: http://jsbin.com/detoyuq/1. But from a naive reading the JSON spec (http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf) seems to require valid UTF-16 code points. The test in question here is intentionally using an invalid unicode character, so I don't think we should be changing the test to not generate that character in it's output. Perhaps the encoding path should be escaping/stripping invalid Unicode characters? John, where does the encoding occur? domenic - any advice on the right place/technique to prevent invalid Unicode characters from getting into the system? Presumably there's some point where we go from an Ecmascript string to UTF-16 where we're missing some error checking or escaping?
,
Jul 5 2017
> But from a naive reading the JSON spec (http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf) seems to require valid UTF-16 code points. Not quite. It requires "Unicode code points", with no restriction on avoiding isolated surrogates. > domenic - any advice on the right place/technique to prevent invalid Unicode characters from getting into the system? Presumably there's some point where we go from an Ecmascript string to UTF-16 where we're missing some error checking or escaping? My advice would be to use a type that can represent JavaScript strings [1] all the way through. I'm unsure why UTF-16 would be involved here at all. It seems like Chrome should be able to do this given that the inspector outputs JavaScript strings all the time. I don't know what other parts of Chrome use base/json/json_parser.cc but it seems like as it stands right now, it's unable to be used for web content, which is bad. [1]: https://infra.spec.whatwg.org/#javascript-string
,
Jul 6 2017
Thanks Domenic. I think the problem with using JavaScript strings end-to-end is that the strings are sent over the WebDriver protocol (not sure we can influence the charset of that) and then parsed in chromedriver C++ code that deals only with UTF-8. So we'd need some way to escape or otherwise encode these strings as UTF-8 I think (without rewriting the entire chromedriver codebase to work with a new character set). to kereliuk@ who is going to look into this in more detail. At a minimum there's probably a short-term hack we can safely do to filter out isolated surrogates in chromedriver code rather than fail at JSON decode time.
,
Jul 7 2017
> So we'd need some way to escape or otherwise encode these strings as UTF-8 I think (without rewriting the entire chromedriver codebase to work with a new character set). It's best not to try to represent unpaired surrogates at the character encoding layer. Since the ECMA JSON layer can represent them, it makes the most sense to make the JSON serializer you use emit \u escapes for lone surrogates so that the character encoding layer sees them as ASCII.
,
Jul 7 2017
Of course, using /u escapes makes perfect sense - thanks!
,
Jul 7 2017
The JSON encoding happens in protocol layout of DevTools, just before the result is sent back to ChromeDriver. The relevant code is in https://cs.chromium.org/chromium/src/out/Debug/gen/v8/src/inspector/protocol/Protocol.cpp. In particular, escapeStringForJSON function is responsible for encoding strings. It's actually quite conservative. Anything above 126 is encoded in \uXXXX format. (Note that Protocol.cpp is a generated file. The real source is https://cs.chromium.org/chromium/src/v8/third_party/inspector_protocol/lib/Values_cpp.template.) The error occurs in the JSON parser used by ChromeDriver. The decoding of \uXXXX sequence happens at https://cs.chromium.org/chromium/src/base/json/json_parser.cc?l=564 and https://cs.chromium.org/chromium/src/base/json/json_parser.cc?l=670. After decoding an escape sequence, the decoder rejects anything that's not a valid Unicode character. I noticed that there was a recent change to prevent a JSON encoder from emitting invalid Unicode code point: https://crrev.com/478900. Unfortunately it's not the JSON encoder used by the code involved in this bug, so it doesn't help us directly, but it's an indication that we're not the only ones affected by this type of issue.
,
Jul 17 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a31c717dfee004fb600546086207be85b4f78486 commit a31c717dfee004fb600546086207be85b4f78486 Author: Jonathon Kereliuk <kereliuk@chromium.org> Date: Mon Jul 17 17:55:33 2017 Replace invalid UTF-16 escape sequences when decoding invalid UTF strings in chromedriver. Web platform tests use ECMAScript strings which aren't necessarily utf-16 characters. This change will help us deal with that case. More info: https://bugs.chromium.org/p/chromium/issues/detail?id=723592 To reproduce the error message from above, run: ./wptrun chrome url/url-constructor.html on web platform tests. Bug: 723592 Change-Id: Icf4b7de968ace81abba9f3020b9c123ab0791318 Reviewed-on: https://chromium-review.googlesource.com/565461 Commit-Queue: Jonathon Kereliuk <kereliuk@chromium.org> Reviewed-by: John Budorick <jbudorick@chromium.org> Reviewed-by: Robert Sesek <rsesek@chromium.org> Reviewed-by: Daniel Cheng <dcheng@chromium.org> Cr-Commit-Position: refs/heads/master@{#487152} [modify] https://crrev.com/a31c717dfee004fb600546086207be85b4f78486/base/json/json_parser.cc [modify] https://crrev.com/a31c717dfee004fb600546086207be85b4f78486/base/json/json_parser.h [modify] https://crrev.com/a31c717dfee004fb600546086207be85b4f78486/base/json/json_parser_unittest.cc [modify] https://crrev.com/a31c717dfee004fb600546086207be85b4f78486/chrome/test/chromedriver/chrome/devtools_client_impl.cc
,
Jul 17 2017
|
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by rbyers@chromium.org
, May 26 2017Status: WontFix (was: Untriaged)
Summary: "bad Bug in ChromeDriver causes problems in web-platform-tests' stability checker (was: Bug in ChromeDriver causes problems in web-platform-tests' stability checker)