Parent encoding (not default) used for XML resources without encoding specification |
||||||
Issue descriptionIf an XHTML file loaded into an iframe does not specify an encoding, the parent document's encoding is used. This differs from Firefox, despite protestations in our code that we're matching behavior here and using UTF-8 as the default. Repro: 1. Serve out attached files 2. load parent.html Expected: reports "child encoding: UTF-8" Actual: reports "child encoding: windows-1252" Works as expected in Firefox ... In BuildTextResourceDecoderFor() there's a clause "Disable autodetection for XML/JSON to honor the default encoding (UTF-8) for unlabelled documents" ... but then at the bottom of the function, the encoding source is set to TextResourceDecoder::kEncodingFromParentFrame which causes this behavior to be ignored for subresources. This manifests in loading an iframe. This seems to be the root cause of Firefox/Chrome inconsistency noted in https://github.com/web-platform-tests/wpt/pull/13082 A fix might be as simple as setting `use_hint_encoding` to false for XML (and JSON?) MIME types.
,
Nov 12
FYI, this is captured in third_party/WebKit/LayoutTests/external/wpt/encoding/utf-32-from-win1252.html This is a testharness.js-based test. PASS Expect resources/utf-32-big-endian-bom.html to parse as windows-1252 FAIL Expect resources/utf-32-big-endian-bom.xml to parse as UTF-8 assert_equals: expected "UTF-8" but got "windows-1252" PASS Expect resources/utf-32-big-endian-nobom.html to parse as windows-1252 FAIL Expect resources/utf-32-big-endian-nobom.xml to parse as UTF-8 assert_equals: expected "UTF-8" but got "windows-1252" PASS Expect resources/utf-32-little-endian-bom.html to parse as UTF-16LE PASS Expect resources/utf-32-little-endian-bom.xml to parse as UTF-16LE PASS Expect resources/utf-32-little-endian-nobom.html to parse as windows-1252 FAIL Expect resources/utf-32-little-endian-nobom.xml to parse as UTF-8 assert_equals: expected "UTF-8" but got "windows-1252" Harness: the test ran to completion. Passes in Firefox.
,
Nov 12
jsbell@ I can take it if you haven't already started. Feel free to take it back if you have.
,
Nov 13
Looks like you already did.
,
Nov 13
I uploaded a CL just to see if anything would fail with a trivial fix: https://chromium-review.googlesource.com/c/chromium/src/+/1331953 Result: No failures. So we're missing tests for this behavior. jinsukkim@ - you probably have more context, so you're welcome to take it further. We probably need to find the right Blink>Loader folks to weigh in, too.
,
Nov 14
,
Nov 15
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3d1e2e08c8e1118ef9a53d8c97dca95d422411dd commit 3d1e2e08c8e1118ef9a53d8c97dca95d422411dd Author: Joshua Bell <jsbell@chromium.org> Date: Thu Nov 15 20:02:01 2018 Loader: use default encoding not parent encoding for XML/JSON If unspecified (in headers or content), resources like HTML in iframes fall back to the parent's encoding. This should not be the case for XML/JSON which have well-defined defaults. Bug: 904017 Change-Id: I42f8950c5c2ef63c98bcec58a1f4b0bd0b190c33 Reviewed-on: https://chromium-review.googlesource.com/c/1331953 Reviewed-by: Nate Chapin <japhet@chromium.org> Reviewed-by: Jinsuk Kim <jinsukkim@chromium.org> Commit-Queue: Joshua Bell <jsbell@chromium.org> Cr-Commit-Position: refs/heads/master@{#608483} [delete] https://crrev.com/d503ca09233b6a5e72c2816ae4d84c4cff7aa4cd/third_party/WebKit/LayoutTests/external/wpt/encoding/utf-32-from-win1252-expected.txt [add] https://crrev.com/3d1e2e08c8e1118ef9a53d8c97dca95d422411dd/third_party/WebKit/LayoutTests/fast/encoding/resource-default-encoding.html [add] https://crrev.com/3d1e2e08c8e1118ef9a53d8c97dca95d422411dd/third_party/WebKit/LayoutTests/fast/encoding/resources/undeclared-encoding.json [add] https://crrev.com/3d1e2e08c8e1118ef9a53d8c97dca95d422411dd/third_party/WebKit/LayoutTests/fast/encoding/resources/undeclared-encoding.xhtml [modify] https://crrev.com/3d1e2e08c8e1118ef9a53d8c97dca95d422411dd/third_party/blink/renderer/core/loader/text_resource_decoder_builder.cc
,
Nov 16
|
||||||
►
Sign in to add a comment |
||||||
Comment 1 by jsb...@chromium.org
, Nov 12